AU-KBC RESEARCH CENTRE
Development of Lexical Resources in Tamil
1. TransLexGram

TransLexGram is the name of a set of electronic lexical resources being developed for use in machine-aided translation from English to Indian languages. The name is an abbreviation of "Transfer Lexicon and Grammar". It is a project of LERIL (Lexical Resources for Indian Languages), an open-source, collaborative initiative of several groups and individuals to create shareable resources for Indian languages. The initiative was launched at the "Workshop on Lexical Resources for Natural Language Processing", 5-8 January 2001, IIIT Hyderabad.

The purpose of this effort is to fill the gap in such resources for Indian languages. TransLexGram will help in the development of machine translation systems from English to Indian languages. It is a collaborative effort among individuals and institutions. We are working on TransLexGram for Tamil.

2. AnnCorra

AnnCorra, short for "Annotated Corpora", is the name of an electronic lexical resource of annotated corpora. It will be an important resource for the development of Indian language parsers, machine learning of grammars, lakshancharts (discrimination nets for sense disambiguation) and a host of other tools.

The AnnCorra effort is being started on the basis of the electronic corpora freely available for various Indian languages. One such resource is the English-Hindi Electronic Dictionary, developed through a voluntary collaborative effort coordinated by the Language Technologies Research Centre, Indian Institute of Information Technology, Hyderabad. Another is an electronic corpus of Hindi developed by the Ministry of Information Technology, Government of India.

S. Arulmozi