CS671: Natural Language Processing
Semester I, 2015-16
Ekansh Gupta, 12252
egupta@iitk.ac.in
Finding Syllables in Indic languages
Language 1 - Hindi in Devanagari script
Corpus - Munshi Premchand's novel Karmabhoomi (कर्मभूमि). File size - 1.5 MB
Language 2 - Sanskrit in Latin Extended script
Corpus - Bhagavad Gita. File size - 90 KB
If a corpus does not display properly, change your browser's text encoding.
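As an illustration of what finding syllables in the Devanagari corpus can look like, here is a minimal sketch that splits text into orthographic syllables (aksharas) with a regular expression. It is an assumption, not the method used for this project: the function name `split_aksharas` and the character ranges chosen are hypothetical, and the sketch ignores Hindi schwa deletion, so the units are written syllables rather than spoken ones.

```python
import re

# Assumed character classes, taken from the Unicode Devanagari block.
CONSONANT = "[\u0915-\u0939\u0958-\u095F]\u093C?"  # consonant, optional nukta
VIRAMA = "\u094D"                                  # halant, joins conjuncts
MATRA = "[\u093E-\u094C]"                          # dependent vowel signs
MODIFIER = "[\u0900-\u0903]"                       # candrabindu, anusvara, visarga
INDEPENDENT_VOWEL = "[\u0904-\u0914]"

# One akshara is either a consonant cluster with an optional vowel sign,
# or an independent vowel; both may carry trailing modifiers.
AKSHARA = re.compile(
    "(?:" + CONSONANT + VIRAMA + ")*" + CONSONANT
    + MATRA + "?" + MODIFIER + "*" + VIRAMA + "?"
    + "|" + INDEPENDENT_VOWEL + MODIFIER + "*"
)


def split_aksharas(text):
    """Return the orthographic syllables found in text, in order."""
    return AKSHARA.findall(text)


print(split_aksharas("कर्मभूमि"))
```

Under these assumptions the novel's title splits into four units: क, र्म, भू and मि. Characters outside the listed ranges (digits, punctuation, Latin letters) are skipped.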
Top syllables in Language 1 (Devanagari Hindi)
Top syllable bigrams in Language 1 (Devanagari Hindi)
Top syllables in Language 2 (Latin transliteration of Sanskrit)
Top syllable bigrams in Language 2 (Latin transliteration of Sanskrit)
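Lists like the ones above can be produced by counting syllables and adjacent syllable pairs. The following is a minimal sketch under the assumption that the corpus has already been split into a flat list of syllables; the function names are hypothetical, and whether the original bigrams cross word boundaries is not stated here.

```python
from collections import Counter


def top_syllables(syllables, k=10):
    """Return the k most frequent syllables as (syllable, count) pairs."""
    return Counter(syllables).most_common(k)


def top_bigrams(syllables, k=10):
    """Return the k most frequent adjacent syllable pairs."""
    # zip pairs each syllable with its successor in the sequence.
    return Counter(zip(syllables, syllables[1:])).most_common(k)


sample = ["ka", "ra", "ka", "ra", "ma"]
print(top_syllables(sample, 2))
print(top_bigrams(sample, 1))
```

The same two functions apply to both corpora, since they only depend on the syllable sequence and not on the script.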
Log frequency plot of top syllables in Language 1 (Devanagari Hindi)
Log frequency plot of top syllable bigrams in Language 1 (Devanagari Hindi)
Log frequency plot of top syllables in Language 2 (Latin transliteration of Sanskrit)
Log frequency plot of top syllable bigrams in Language 2 (Latin transliteration of Sanskrit)
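The data behind a log frequency plot can be sketched as follows: sort the counts in descending order and take the logarithm of each. This is only an assumed reconstruction; the function name `log_frequencies` is hypothetical, and the choice of base 10 is an assumption, since the base used for the plots is not stated.

```python
import math
from collections import Counter


def log_frequencies(counts, k=None):
    """Return (rank, log10(count)) pairs for the k most frequent items."""
    # Sort counts from most to least frequent; k=None keeps all of them.
    ranked = sorted(counts.values(), reverse=True)[:k]
    return [(rank, math.log10(freq)) for rank, freq in enumerate(ranked, start=1)]


counts = Counter({"a": 100, "b": 10, "c": 1})
for rank, log_freq in log_frequencies(counts):
    print(rank, log_freq)
```

The resulting pairs can be passed to any plotting library, with rank on the horizontal axis and log frequency on the vertical axis.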