Advancing machine translation through discourse integration
Building a Speech-to-Speech Machine Translation (SSMT) system across Indian languages is essential for overcoming language barriers in India's multilingual context. This project creates an environment for applying modern language computing technologies to translation by developing Machine Translation (MT) systems capable of translating video or speech transcripts from English to Indian languages.
Because India encompasses many language families, it is crucial to connect these families and develop translation systems across them. The Dravidian Language to Dravidian Language translation system (DL-DiscoMT) links Hindi to Tamil and Tamil to the other Dravidian languages, forming a vital component of the Indian Language to Indian Language system and of the larger Speech-to-Speech Machine Translation system.
Texts have properties that go beyond individual sentences, manifested in the frequency and distribution of words, word senses, referential forms, and syntactic structures.
Therefore, this project incorporates discourse information at the cross-sentential level to enhance translation quality.
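As a rough illustration of what cross-sentential information can look like at the input side, the sketch below prepends the preceding sentences of a document to each source sentence before it is passed to a translation model. This is a generic concatenation-style approach used in document-level MT, not necessarily the method adopted by this project; the function name `add_context` and the separator token `<brk>` are hypothetical.

```python
from typing import List

# Hypothetical separator token; a real system would reserve it in its vocabulary.
SEP = "<brk>"


def add_context(sentences: List[str], k: int = 1, sep: str = SEP) -> List[str]:
    """Prefix each sentence with up to k preceding sentences from the same document."""
    if k < 0:
        raise ValueError("k must be non-negative")
    augmented = []
    for i, current in enumerate(sentences):
        # Slice out at most k sentences immediately before the current one.
        context = sentences[max(0, i - k):i]
        augmented.append(f" {sep} ".join(context + [current]))
    return augmented


if __name__ == "__main__":
    doc = [
        "The minister announced a new scheme.",
        "It will support small farmers.",
        "However, funding details are pending.",
    ]
    for line in add_context(doc, k=1):
        print(line)
```

With `k=1`, the second input carries the sentence containing the antecedent of "It", and the third carries the sentence that the discourse marker "However" contrasts with, which is the kind of information a sentence-by-sentence system never sees.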
Government policies and administration
Biology, Chemistry, Physics, Computer Science, Engineering
FinTech and financial services
Insurance and eMedical charts
Farmer-related information
A comprehensive platform for handling discourse analysis and conversational context
Bi-directional text-to-text MT between Hindi and Tamil, and between Tamil and Kannada, Malayalam, and Telugu, incorporating discourse information in NMT and Sampark systems
Platform for continuous evaluation and benchmarking of translation quality
Machine Translation solutions as APIs for integration with SSMT systems and end-user applications
Develop and deploy MT systems as services from Hindi to Tamil and from Tamil to other Dravidian languages, playing a critical role in Indian language speech-to-speech (SSMT) and text-to-text (TTMT) machine translation systems
Integrate discourse analysis into MT to improve translation by identifying discourse markers and preserving coherence and cohesion in the translated text
Create and nurture an ecosystem involving startups and Central/State Government institutions to develop and deploy innovative products and translation services
Increase content in Dravidian languages on the Internet across domains of Governance, Science & Technology, Education, Health, and Agriculture
3 Years
1 Lakh+ Sentences
8 Systems
7 Organizations