Gaze-tracking Studies to Identify Structure of Natural Language

 

Guide: Prof. Amitabha Mukerjee

Chirag Gupta (chiragvg@iitk.ac.in)

 

Motivation:

Linguists and psychologists have a long tradition of using gaze-tracking data from participants performing sentence-reading tasks to study topics in natural language such as anaphora disambiguation, word sense disambiguation and context disambiguation.

 

We seek to use gaze-tracking studies to understand the underlying structure of language, using ease of resolution as our foremost parameter. Our conjecture is that sentences which are difficult to resolve are in essence 'unnatural', and that sentences resolved more easily than their more difficult counterparts are 'natural'.

 

Related Work:

Previous studies that have employed gaze-tracking to understand natural language have examined context disambiguation, anaphora disambiguation, word sense disambiguation and so on. Traxler [5] employed gaze-tracking to study the effect of plausible and implausible context elements in a sentence and how easily each was resolved. Subjects experienced greater processing difficulty in implausible sentences than in plausible ones immediately after encountering the verb that was incongruent with the context; they did not wait until the purported gap location to form unbounded dependencies.

 

Traxler also considered sentences that locally appear to contain an unbounded dependency which later turns out to be incorrect. Data from this experiment revealed that readers formed unbounded dependencies immediately, despite having to reanalyze them later. Other studies have examined the effect of plausibility and of different modes of priming, i.e. identity, schemas and so on. Several theories that model priming currently exist, and the differences between them have not yet been resolved. Taken together, they give a holistic description of how priming affects context disambiguation and, in turn, makes sentences easier or more difficult to resolve and process, depending on the kind of priming employed in the sentence.


Details of experiment:

Core to our idea is the research by Sturt (2003) [2], which showed that resolving anaphoric references in sentences (typically pronouns) becomes more difficult when the stereotypes or expectations associated with the natural grammatical binds of these references contradict the properties implied by the anaphora. We wish to extend this study from expectations concerning particular objects in the sentence to expectations regarding the structure of the sentence itself. As a simple example, consider the following sentences:

 

  1. The ball was kicked by Bhutia into the goal.
  2. The ball was kicked into the goal by Bhutia.
  3. Bhutia kicked the ball into the goal.

 

Grammatically, all of the above sentences are correct. However, only the third (active voice) seems to be 'natural'. We propose that this feeling of 'unnaturalness' can be translated into empirical evidence in the form of longer gaze fixations on sentences 1 and 2, especially on those parts that seem particularly unnatural. For example, the first sentence has an added adverbial phrase at the end that could have been avoided, which makes it unexpected. We expect participants to fixate longer on that part.

 

Proposed Methodology:

We will first identify candidate sentences for our experiment. The sentences must satisfy a number of constraints; primarily, the sentences in each pair need to be semantically equivalent or close. Pairs of sentences would differ on parameters including, but not limited to, voice and structure.

 

Data would be collected through gaze-tracking studies as in [4]. The number of subjects would ideally be 30-35. Each subject will be shown the candidate sentences and asked to read them. The calibration of the apparatus would be checked between pairs to ensure that the experiment runs smoothly. We will collect the fixation time data of each subject for all candidate sentences.
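
As an illustration of how the collected fixation data might be organised, the following is a minimal Python sketch, not the actual pipeline of the study. It assumes (hypothetically) that the tracker output has already been reduced to one record per fixation, giving the character offset in the sentence on which the fixation landed and its duration. The names Fixation, split_regions and dwell_times are introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixation:
    """One fixation record (hypothetical format, assumed for this sketch)."""
    subject: str
    sentence_id: int
    char_pos: int        # character offset in the sentence that was fixated
    duration_ms: float   # fixation duration in milliseconds


def split_regions(sentence, phrases):
    """Return (phrase, start, end) character spans for each phrase.

    Phrases must occur in the sentence in the given order; `end` is exclusive.
    """
    spans = []
    cursor = 0
    for phrase in phrases:
        start = sentence.index(phrase, cursor)  # raises ValueError if absent
        end = start + len(phrase)
        spans.append((phrase, start, end))
        cursor = end
    return spans


def dwell_times(fixations, spans):
    """Sum fixation durations falling inside each region."""
    totals = {label: 0.0 for label, _, _ in spans}
    for fx in fixations:
        for label, start, end in spans:
            if start <= fx.char_pos < end:
                totals[label] += fx.duration_ms
                break  # regions do not overlap, so stop at the first match
    return totals
```

For sentence 1 above, the regions could be "The ball", "was kicked", "by Bhutia" and "into the goal"; the dwell time on the final region is then the quantity we expect to be elevated.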

 

We aim to analyze the data thus obtained in terms of total fixation times and the fixation times on individual parts of sentences, and to find correlations between the various choices of structure and their relative ease of resolution ('naturalness').
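
One possible way to compare two variants of a sentence is a paired comparison of per-subject total fixation times. The sketch below is an assumption about how such an analysis could look, not a commitment to a particular statistical test; it computes the mean per-subject difference and the corresponding paired t statistic using only the Python standard library. The function name paired_t is hypothetical.

```python
import math
import statistics


def paired_t(times_a, times_b):
    """Paired comparison of total fixation times for two sentence variants.

    times_a[i] and times_b[i] must come from the same subject.
    Returns (mean difference, t statistic) with n - 1 degrees of freedom.
    """
    if len(times_a) != len(times_b):
        raise ValueError("both variants need one value per subject")
    if len(times_a) < 2:
        raise ValueError("at least two subjects are required")

    diffs = [a - b for a, b in zip(times_a, times_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("differences are constant; t statistic is undefined")

    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, t_stat
```

A positive mean difference with times_a taken from the passive variant and times_b from the active variant would be consistent with the conjecture that the passive form is harder to resolve. The same function can be applied to fixation times on an individual part of the sentence instead of the total.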

 

Goals / What we hope to achieve through this study:

Through this study, we hope to formulate an empirical method of identifying structures in language that have arisen through general interaction, as distinct from others that are grammatically equivalent but less common. In the long run, this study could offer an empirical way of resolving debates related to Universal Grammar (Chomsky).

 

References:

  1. http://www.amazon.com/The-On-line-Study-Sentence-Comprehension/dp/1841694002
  2. http://web.stanford.edu/~jurafsky/sturtjml.pdf
  3. http://psycnet.apa.org/psycinfo/1998-11174-004
  4. http://download.springer.com/static/pdf/819/art%253A10.1023%252FA%253A1026416225168.pdf?auth66=1411997422_f3ce847c87cefbf2df23fb31629efb51&ext=.pdf
  5. http://ling.umd.edu/~ellenlau/courses/ling440/Traxler_1996.pdf