Finalists at DataVerse Challenge - ITVerse 2023

Date:

The goal of this competition was to build models that transcribe Bengali text into IPA (International Phonetic Alphabet), and to advance Bengali computational linguistics and NLP research using the first Bengali sentence-level IPA transcription dataset from bengali.ai. Bengali text-to-IPA transcription has seen relatively limited development compared to other languages, despite Bengali being one of the world’s most widely spoken native languages. Given the large number of speakers and the applications in linguistics, language learning, and phonetic research, there is a growing need for automated systems that can accurately convert Bengali text into IPA notation.

Our submission was among the first open-source IPA transcription methods for Bengali. We built a model trained on a linguist-validated dataset containing Bengali text from different domains. To make the task harder, the test set contained numbers, loanwords, and domain-specific words.

My team consisted of me, Abdullah Arean, and Md Fahim bhaia. Our team was among the top 10 finalists in this Kaggle competition, which involved 62 teams from Bangladesh. The competition was organized by the IIT Software Engineers’ Community (IITSEC) in partnership with bengali.ai to advance research in Bengali text-to-IPA transcription.

More information about the competition here

Some photos from the event

dataverse-image-1

dataverse-image-2

dataverse-image-3

dataverse-image-4