Learning to Parse Natural Language Commands to a Robot Control System (for the Hindi Language)

Motivation:
As robots become more advanced and capable of performing complex tasks, the importance of enabling untrained users to interact with them has increased. Hence, unconstrained natural-language interaction between humans and robots has emerged as a significant research area. I will discuss the problem of parsing natural language (in our case, Hindi) into actions and control structures that can be implemented in a robot execution system. My approach will be to make a robot learn natural language instructions in a navigation system, which in my case will be a conditional random field. I will evaluate the approach based on whether my robot is able not only to follow the route instructions but also to preserve the semantic intent of statements involving complex structures.

Problem Statement:
This project is about grounding natural language: interpreting human (natural) language into a semantically informed structure in the context of robot simulation on a grid-type navigation system. Route instructions given in natural language will be converted into a robot control language, and the accuracy of the conversion will be estimated. My goal is primarily to understand whether it is possible to learn a parser that produces correct, robot-executable commands for such instructions. My primary measure of success will be how well the parser handles instructions with complex structure.

Previous Work:
All previous work in this area (excluding the first paper in the references, [1]) is based on mapping navigation-related language structure directly to commands rather than learning that mapping.

Task:
1. Going from NL to robot control. First, the natural language command is parsed into a formal, procedural description representing the intent of the person.
The robot control commands are then used by the executor, along with the local state of the world, to control the robot, thereby grounding the NL commands into actions while exploring the environment.
2. The specification of RCL (robot control language) will be taken from [1].
3. Natural language will be parsed into an expressive formal representation such as lambda-calculus. For this work, parsing is performed using an extended version of the Unification Based Learner, UBL [6]. The grammatical formalism used by UBL is a probabilistic version of combinatory categorial grammars, or CCGs [7], a type of phrase structure grammar. CCGs model both the syntax (language constructs such as NP for noun phrase) and the semantics (expressions in lambda-calculus) of a sentence. UBL creates a parser by inducing a probabilistic CCG (PCCG) from a set of training examples.

DataSets and Maps:
1. Maps are labeled according to area type (room, hallway, pathway, junction, etc.), but the maps are not known to the robot in advance. Instead, the robot explores the map while following an RCL program. These experiments are performed in simulation.
2. Four maps will be used, the same ones mentioned in [1]. Maps A and B will be generated by a CRF (conditional random field), so they are simple maps. Maps C and D will be manually generated and will have complex structures.
3. The base training set, S_base, will contain 150 unique sentences in Hindi generated by non-experts (e.g., my IITK friends and wingmates). These 150 sentences will contain only simple instructions, with only 2 or 3 NL instructions per sentence. Adding 20 more route instructions generated by non-experts for the complex maps C and D gives a new data set called S_enriched.
4. For each test, S_enriched will be split into training and test data. All Hindi instructions from some number of non-expert instruction givers (five in my case) will be withheld and used to create test data named S_test.
So the training set will be S_enriched - S_test, that is, the set of all sentences collected from the remaining participants.
5. For each of 10 different sets S_test 1-10 (every combination of 5 different participants), 1200 (50 * 24) paths through map D will be generated: 1000 short paths of only 1 NL instruction, and the remaining complex paths of 5 NL instructions on average.

References:
1. C. Matuszek, E. Herbst, L. Zettlemoyer, and D. Fox. Learning to parse natural language commands to a robot control system.
2. Y. Artzi and L.S. Zettlemoyer. Bootstrapping semantic parsers from conversations. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, 2011.
3. A. Ferrein and G. Lakemeyer. Logic-based robot control in highly dynamic domains. Robotics and Autonomous Systems, 56(11), 2008.
4. T. Kwiatkowski, L.S. Zettlemoyer, S. Goldwater, and M. Steedman. Inducing probabilistic CCG grammars from logical form with higher-order unification. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, 2010.
5. P. Lison and G-J. M. Kruijff. An integrated approach to robust processing of situated spoken dialogue. In Proc. of SRSL 2009, the 2nd Workshop on Semantic Representation of Spoken Language, pages 58–65, Athens, Greece, March 2009. Association for Computational Linguistics.
6. L. Zettlemoyer and Y. Artzi. Learning to recover meaning from unannotated conversational interactions.
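
Sketch of the participant-based split:
The split described in items 4 and 5 of DataSets and Maps (withhold every sentence from a group of instruction givers as S_test, train on the rest) can be illustrated with a small Python sketch. This is only a minimal illustration under stated assumptions, not the project's actual pipeline: the function names, the participant ids, and the romanized Hindi sentences are all hypothetical, and the toy example withholds groups of 2 participants where the proposal uses 5.

```python
from itertools import combinations


def split_by_participant(sentences, held_out):
    """Split (participant_id, sentence) pairs into train and test lists.

    Every sentence from a held-out participant goes to the test list,
    so the training list is S_enriched minus S_test.
    """
    held_out = set(held_out)
    train = [pair for pair in sentences if pair[0] not in held_out]
    test = [pair for pair in sentences if pair[0] in held_out]
    return train, test


def held_out_groups(participants, group_size):
    """Enumerate every combination of `group_size` participants."""
    return list(combinations(sorted(participants), group_size))


# Hypothetical toy data: (participant id, romanized Hindi instruction).
S_enriched = [
    ("p1", "seedhe jao"),             # go straight
    ("p1", "baen mudo"),              # turn left
    ("p2", "doosre kamre tak jao"),   # go to the second room
    ("p3", "galiyare ke ant tak jao"),  # go to the end of the hallway
    ("p3", "daen mudo"),              # turn right
]

# Assumed group size of 2 for the toy data (the proposal withholds 5).
groups = held_out_groups({pid for pid, _ in S_enriched}, group_size=2)
for group in groups:
    S_train, S_test = split_by_participant(S_enriched, group)
    print(group, "train:", len(S_train), "test:", len(S_test))
```

Splitting by participant rather than by sentence keeps all of one person's phrasing out of the training data, so the test measures how the parser copes with instruction givers it has never seen.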