Surprisal in Garden-Path Sentence Processing: An Empirical Study


Guide : Prof. Amitabha Mukerjee

Alankrita Bhatt
{alankb@iitk.ac.in}

Sharbatanu Chatterjee
{sharbatc@iitk.ac.in}



Motivation :

One of the best-known phenomena in psycholinguistics is the garden-path sentence, in which a local ambiguity biases the comprehender’s incremental syntactic interpretation so strongly that, upon encountering disambiguating input, the correct interpretation can only be recovered with great effort, if at all [1]. "Garden path" refers to the saying "to be led down the garden path", meaning "to be misled".

Locally ambiguous sentences have therefore been used as test cases for investigating how a number of different factors influence human sentence processing, for example to illustrate that human readers process language incrementally, one word at a time. More recently, researchers such as Levy have attempted to model the processing of garden-path sentences computationally within a statistical framework, in which the cognitive effort required is quantified in terms of “surprisal”, i.e. the Shannon information content [5][2]. Such probabilistic accounts have achieved considerable success because their predictions correlate highly with empirical results.

Related Work :

Hale (2001) introduced surprisal theory, which expresses the cognitive load at a given point in a sentence in terms of the total probability of the structural options that are disconfirmed at that point [2]. He used an Earley parser for probabilistic context-free phrase structure grammars (PCFGs), in which each rewrite rule carries a probability. For example,

S -> NP VP Probability = 1.0

NP -> Det. N Probability = 0.5

and so on.

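Under such a grammar, the probability of a derivation of a sentence is the product of the probabilities of all the “rules” used to generate it. Garden-pathing therefore occurs when a word forces the comprehender to discard many structures that together carried a high prior probability. The cognitive effort at the i-th word w_i is taken to be proportional to its surprisal, the negative log of its conditional probability:

effort(w_i) ∝ -log P(w_i | w_0, w_1, ..., w_{i-1}, context)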
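
The two quantities above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not Hale's parser: the first two rule probabilities are the ones quoted in the text, the `VP -> V` rule and its probability are made up so that a complete derivation can be scored, and the function names are our own. Surprisal is reported in bits (log base 2); a different base only rescales the values.

```python
import math

# Toy rule probabilities. The first two are the examples quoted in the text;
# the VP rule is a hypothetical value added only for illustration.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "N")): 0.5,
    ("VP", ("V",)): 0.4,  # assumed, not from the source
}


def derivation_probability(rules):
    """Product of the probabilities of the rules used in one derivation."""
    prob = 1.0
    for rule in rules:
        prob *= RULE_PROB[rule]
    return prob


def surprisal(cond_prob):
    """Surprisal in bits of a word whose conditional probability is cond_prob."""
    if not 0.0 < cond_prob <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(cond_prob)


def surprisals_from_prefix_probs(prefix_probs):
    """Per-word surprisal from prefix probabilities P(w_0 .. w_i).

    The conditional probability of word i is the ratio of consecutive
    prefix probabilities; the empty prefix has probability 1.
    """
    result = []
    previous = 1.0
    for current in prefix_probs:
        result.append(surprisal(current / previous))
        previous = current
    return result
```

A word that halves the probability mass of the surviving analyses costs one bit, and a word that leaves only a quarter of it costs two bits, which is the sense in which discarding high-probability structures is expensive.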

Levy (2008) built on this work and introduced an information-theoretic characterization of processing difficulty [3]. To allow for a noisy linguistic signal, he also introduced a Levenshtein-distance kernel that models the effect of noise in the input [4].
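
To make the idea of a Levenshtein-distance kernel concrete, the sketch below computes the word-level edit distance between two strings of words and turns it into a similarity weight. The exponential form `exp(-lam * distance)` and the parameter `lam` are our own assumptions, chosen because they are a common way to convert a distance into a weight; they are not taken from [4], and the example word strings in the usage note are purely illustrative.

```python
import math


def levenshtein(a, b):
    """Edit distance between two sequences.

    Insertions, deletions and substitutions each cost 1.
    """
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def kernel_weight(a, b, lam=1.0):
    """Assumed kernel: weight decays exponentially with edit distance."""
    return math.exp(-lam * levenshtein(a, b))
```

For instance, `levenshtein("the coach smiled at the player".split(), "the coach smiled as the player".split())` is 1, so under this assumed kernel the two word strings remain close neighbours, whereas strings that differ in many words receive a much smaller weight.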

Most recently, Levy (2011) proposed that linguistic information is used both proactively and retroactively: the current input is used not only to predict upcoming input but also to revise beliefs about previous input [1]. He goes on to suggest possible “hallucinations”, that is, distortions in the comprehender's belief about what the input was, which arise because the distorted input has an extremely high prior probability.

In the context of garden-path sentences, when faced with sufficiently biasing input, comprehenders might under some circumstances adopt a grammatical analysis that is inconsistent with the true raw input of the sentence they are presented with, but consistent with a slightly perturbed version of the input that has higher prior probability. If this is the case, then subsequent input that strongly disconfirms this “hallucinated” garden-path analysis might be expected to induce the same effects as those seen in the classic cases of garden-path disambiguation traditionally studied in the psycholinguistic literature. Levy thus integrated the uncertain-input theory of [4] with the surprisal theory of [2], and reported an experiment whose data were used to support the proposed theory.

Proposed Methodology :

We aim to replicate the experiment conducted in [1]:

  • A self-paced reading study is conducted. Participants read by pressing a button to reveal each successive word in a sentence. The time between successive button presses is taken as an indicator of incremental processing difficulty.
  • Participants read a large number of sentences, some of which are the garden-path sentences whose processing time is to be studied. These are interspersed with “filler” sentences, so that no two experimental items are adjacent.
  • Each sentence is followed by a yes/no comprehension question.
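
The ordering constraint in the list above (experimental items interspersed with fillers, no two experimental items adjacent) can be met with a simple slot-based shuffle. The sketch below is one possible way to build such a list, not the procedure used in [1]; the function name `build_trial_order` and the labels `"experimental"` and `"filler"` are our own.

```python
import random


def build_trial_order(experimental, fillers, rng=None):
    """Randomly order trials so that no two experimental items are adjacent.

    The shuffled fillers define len(fillers) + 1 slots (before the first
    filler, between fillers, after the last filler). Each experimental item
    is placed in a distinct slot, so at least one filler separates any two.
    """
    rng = rng or random.Random()
    if len(experimental) > len(fillers) + 1:
        raise ValueError("not enough fillers to separate experimental items")

    exp_items = list(experimental)
    filler_items = list(fillers)
    rng.shuffle(exp_items)
    rng.shuffle(filler_items)

    # Choose which slots receive an experimental item.
    chosen_slots = set(rng.sample(range(len(filler_items) + 1), len(exp_items)))

    order = []
    next_exp = iter(exp_items)
    for slot in range(len(filler_items) + 1):
        if slot in chosen_slots:
            order.append(("experimental", next(next_exp)))
        if slot < len(filler_items):
            order.append(("filler", filler_items[slot]))
    return order
```

Passing a seeded generator, for example `build_trial_order(items, fillers, random.Random(7))`, makes the order reproducible for a given participant while still differing across participants if each is given a different seed.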

Following this, we aim to supplement the self-paced reading study with a gaze-tracking experiment along the lines of the one described in [5]. Using the eye-tracking equipment available to us, we shall track eye movements on both “hallucinated” (input-unfaithful) garden-path sentences and traditional garden-path sentences, in order to measure the reading time, and hence the cognitive load, required to process them. We plan to begin each trial with a gaze trigger, presented as a black square at the position of the first character of the text. Once a stable fixation has been detected on the gaze trigger, the sentence is presented in full. The participant shall press a button on the button box to indicate that he/she has finished reading the sentence. At this point the sentence shall disappear and, in 50% of the trials, a yes/no comprehension question will be presented, which the participant answers by pressing the appropriate button on the button box. Sentences shall be presented in random order, intermixed with grammatical filler sentences of varying structures, some of which contain no garden path because their verbs are used transitively.

Levy's study [1] did not include such an eye-tracking experiment on garden-path sentences. We also plan to construct garden-path-like sentences in Hindi or Bengali and to examine whether their comprehension differs in any way from that of the English sentences.

Data Analysis :

  • We plan to compute the surprisal of each word, defined as its Shannon information content, and then measure the correlation between reading times and surprisal.
  • We also plan to compute the prior probability of the various sentence structures used from the PCFG rule probabilities, and to plot these against surprisal in order to find out which sentence structure induces the highest surprisal.
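
As a sketch of the first analysis step, the correlation between per-word surprisal and reading time could be computed as below. We assume here that Pearson's product-moment coefficient is the measure used, which the plan above does not specify; the function name `pearson_r` is our own, and the numbers in the usage lines are placeholders chosen to lie on a straight line, not experimental data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Placeholder values only (NOT data): a perfectly linear relation.
surprisal_bits = [1.0, 2.0, 3.0, 4.0]
reading_time_ms = [300.0, 320.0, 340.0, 360.0]
r = pearson_r(surprisal_bits, reading_time_ms)
```

With real data the two lists would hold one entry per word region, aligned so that each reading time is paired with the surprisal of the word it was recorded on.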


References :

  1. Levy, R. (2011). Integrating surprisal and uncertain-input models in online sentence comprehension: Formal techniques and empirical results. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics.
  2. Hale, J. (2001). A probabilistic Earley parser as a psycholinguistic model. In Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics, pages 159-166.
  3. Levy, R. (2008). Expectation-based syntactic comprehension. Cognition, 106:1126-1177.
  4. Levy, R. (2008). A noisy-channel model of rational human sentence comprehension under uncertain input. In Proceedings of the 13th Conference on Empirical Methods in Natural Language Processing, pages 234-243.
  5. Slattery, T. J., et al. (2013). Lingering misinterpretations of garden path sentences arise from competing syntactic representations. Journal of Memory and Language, pages 104-120.