Seminar by Dr. Rajeev Sangal
Shakti Machine Translation System
Prof. Rajeev Sangal
Language Technologies Research Centre
International Institute of Information Technology
Hyderabad
Date: Wed, Sep 15, 2004
Time: 3:15 PM
Venue: CS-101
Abstract
This talk focuses on the Shakti Machine Translation (MT) system, which translates from English to different languages of the world. The talk will highlight some interesting problems in the machine processing of human language, particularly those related to machine learning. Shakti uses statistical as well as rule-based approaches for processing language. It uses constituent structure at the chunk level, and dependency relations at the sentence level of analysis. It has specialized components for word sense disambiguation, parsing, preposition attachment, phrasal verb identification, transfer grammar, sentence and word generation, and many others.
The architecture of Shakti is highly modular. The complex problem of MT has been broken into tens of smaller sub-problems. Every sub-problem corresponds to a task which is handled by an independent module (and a research student). The modules are connected through a common, extensible representation based on trees and feature structures, called the Shakti Standard Format (SSF). Modules are pipelined, so the output of each module becomes the input of the next.
Special care for robustness has been taken in the design of Shakti. At the architectural level, if a module fails to perform an operation, the common format ensures that there is no immediate breakdown. Virtually every module is also designed to deal with failure.
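The pipelined, failure-tolerant design described above can be illustrated with a minimal Python sketch. It is not Shakti's code and does not reproduce the actual SSF syntax: the `Node` class, the `run_pipeline` function, and the three example modules are hypothetical names introduced only for illustration. The sketch assumes that each module receives a tree with feature structures and returns a tree, and that a failing module is simply skipped so the next module still receives well-formed input.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """A tree node carrying a feature structure (attribute-value pairs).

    Hypothetical stand-in for a shared tree representation; not real SSF.
    """
    label: str
    features: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


Module = Callable[[Node], Node]


def run_pipeline(tree: Node, modules: List[Module]) -> Node:
    """Feed the output of each module to the next one.

    Each module works on a copy, so if it raises an exception its partial
    changes are discarded and the last good tree is passed on unchanged.
    """
    for module in modules:
        try:
            tree = module(copy.deepcopy(tree))
        except Exception:
            continue  # assumed policy: skip the failed module, keep going
    return tree


# Hypothetical example modules.
def add_pos(tree: Node) -> Node:
    for child in tree.children:
        child.features["pos"] = "NN" if child.label == "book" else "UNK"
    return tree


def broken_module(tree: Node) -> Node:
    tree.features["half_done"] = "yes"
    raise RuntimeError("simulated failure")


def add_chunk_label(tree: Node) -> Node:
    tree.features["chunk"] = "NP"
    return tree


sentence = Node("S", children=[Node("book"), Node("xyz")])
result = run_pipeline(sentence, [add_pos, broken_module, add_chunk_label])
```

In this sketch the failing module leaves no trace in `result`, while the annotations from the modules before and after it are both present. Copying the tree before each module is one simple way to get that behaviour; a real system may handle failures differently.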
The Shakti system currently works for the following language pairs: English-Hindi, English-Telugu and English-Marathi. A proof-of-concept system has also been built for English-Amharic (an Ethiopian language). The systems are accessible through http://shakti.iiit.net.
About the Speaker
Prof. Rajeev Sangal is the Director of the International Institute of Information Technology, Hyderabad.