Infrequent words are difficult to comprehend.


Introduction:
Parameters such as saccade length, fixation duration, and regression frequency reveal a great deal about a text during silent reading: for example, whether the text is conceptually difficult or contains difficult words.
This is in accordance with the view that the eye's saccadic movements reflect the online cognitive processing of the text being read (the processing view in the processing versus oculomotor debate). Confusion in parsing information is reflected in a reduction in saccade length and an increase in both fixation duration and regression frequency.
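As a minimal sketch of how these three measures could be derived from a recorded fixation sequence, consider the Python below. The `Fixation` type, the `reading_measures` function, and the use of character units for horizontal position are assumptions made for illustration, not part of any particular eye tracker's output format. The numbers in `trace` are placeholders, not measured data.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float            # horizontal gaze position, in character units (assumed)
    duration_ms: float  # how long the eye rested at this position


def reading_measures(fixations):
    """Summarise one line of left-to-right text from its fixation sequence."""
    if len(fixations) < 2:
        raise ValueError("need at least two fixations to form a saccade")
    # A saccade is the jump between two consecutive fixations.
    jumps = [b.x - a.x for a, b in zip(fixations, fixations[1:])]
    forward = [j for j in jumps if j > 0]
    regressions = [j for j in jumps if j < 0]  # leftward jumps
    return {
        "mean_fixation_ms": sum(f.duration_ms for f in fixations) / len(fixations),
        "mean_forward_saccade": sum(forward) / len(forward) if forward else 0.0,
        "regression_rate": len(regressions) / len(jumps),
    }


# Illustrative values only, not measured data.
trace = [Fixation(2, 210), Fixation(9, 230), Fixation(6, 300), Fixation(14, 220)]
measures = reading_measures(trace)
```

Under the view described above, a harder text would be expected to push `mean_fixation_ms` and `regression_rate` up and `mean_forward_saccade` down.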

Experiments have shown that as the frequency of a word increases, the fixation time on that word decreases. A short within-word regression indicates difficulty in processing that word, so word frequency may also play a role here; this measure can be useful for comparing the processing difficulty of long frequent words with that of long infrequent words. A longer regression (spanning more than one word) signifies a failure to understand the sentence as a whole, and is therefore a measure of perplexity in sentence comprehension.
Moreover, parafoveal preview (slight focus on the first few letters of the word to the right of the currently fixated word) triggers a preliminary analysis of the parafoveal word. Here too, the preview is more effective when the parafoveal word has a higher frequency. The richer the analysis, the faster the reading rate.
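The distinction between within-word and longer regressions can be sketched in Python as follows. The input format, a list of `(word_index, char_offset)` pairs in temporal order, and the function name `classify_regressions` are assumptions for illustration. Treating any return to an earlier word as a "longer" regression is one possible operationalisation, not a definition taken from the cited literature.

```python
def classify_regressions(fixations):
    """Count within-word and between-word regressions.

    `fixations` is a list of (word_index, char_offset) pairs in temporal order.
    """
    counts = {"within_word": 0, "between_word": 0}
    for (w0, c0), (w1, c1) in zip(fixations, fixations[1:]):
        if w1 < w0:
            # The eye returned to an earlier word in the sentence.
            counts["between_word"] += 1
        elif w1 == w0 and c1 < c0:
            # The eye moved backwards but stayed inside the same word.
            counts["within_word"] += 1
    return counts


# Illustrative scan path only, not measured data.
path = [(0, 1), (1, 4), (1, 1), (2, 2), (3, 3), (1, 2), (4, 1)]
regression_counts = classify_regressions(path)
```

In this scan path the reader refixates word 1 once (a within-word regression) and later jumps back from word 3 to word 1 (a between-word regression).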

Proposal:
We propose an experiment to test the hypothesis that infrequent words are difficult to comprehend, using words drawn from the Hindi language. The dataset will likely be IIT Bombay's Hindi WordNet.
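One simple way such data might be summarised is to compare mean fixation time between frequent and infrequent words, as in the Python sketch below. Everything in it is hypothetical: the record layout, the function name `mean_fixation_by_band`, the frequency threshold, and the placeholder words and numbers. The proposal does not specify where word frequency counts would come from, so the sketch simply assumes they are available per word.

```python
from statistics import mean


def mean_fixation_by_band(records, threshold):
    """Average total fixation time for frequent vs infrequent words.

    `records` holds (word, corpus_frequency, total_fixation_ms) tuples;
    words with corpus_frequency >= threshold count as frequent.
    """
    bands = {"infrequent": [], "frequent": []}
    for _word, freq, fixation_ms in records:
        band = "frequent" if freq >= threshold else "infrequent"
        bands[band].append(fixation_ms)
    # Skip a band if no word fell into it.
    return {band: mean(values) for band, values in bands.items() if values}


# Placeholder records with made-up numbers, standing in for real measurements.
records = [
    ("w1", 5, 320.0),
    ("w2", 8, 300.0),
    ("w3", 900, 210.0),
    ("w4", 1200, 190.0),
]
summary = mean_fixation_by_band(records, threshold=100)
```

If the hypothesis holds, the "infrequent" mean would be expected to exceed the "frequent" mean on real data; a proper analysis would also need to control for word length and test the difference statistically.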

References:
[1] K. Rayner. Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124:372-422, 1998.
[2] M. S. Starr and K. Rayner. Eye movements during reading: some current controversies. Trends in Cognitive Sciences, 5:156-163, 2001.
[3] A. W. Inhoff and K. Rayner. Parafoveal word processing during eye fixations in reading: effects of word frequency. Perception & Psychophysics, 40:431-439, 1986.