Researchers at the Institute of Smart Systems and Artificial Intelligence at Nazarbayev University (NU ISSAI) have developed an automatic speech recognition model that can recognize Turkic languages, QazMonitor reports.
The voice recognition system, which converts human speech into text, can recognize Azerbaijani, Bashkir, Chuvash, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Uyghur, and Uzbek, along with English and Russian.
“Our aim was to develop an ASR model for Turkic languages, for most of which very little publicly available speech data exists on the Internet,” NU ISSAI data scientist Saida Mussakhojayeva says.
“By utilizing the common features of Turkic languages in terms of lexis, phonology, and morphology, we sought to develop a robust joint model in which the ten Turkic languages in our study would reciprocally benefit from each other.”
Kaisar Dauletbek, a fourth-year NU student and an ISSAI research assistant, says the model makes few errors because it takes advantage of the similarities among the Turkic languages.
“For Bashkir, Kazakh, Tatar, Turkish, Uyghur, and Uzbek, the percentage of errors in characters made by our model is below 5%,” he says. “These results would not have been possible to achieve had we created separate models for each language.”
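The "percentage of errors in characters" is commonly reported as a character error rate (CER): the number of character insertions, deletions, and substitutions needed to turn the model's output into the reference transcript, divided by the length of the reference. The article does not describe how ISSAI computed its figures, so the following is only a minimal sketch of the standard metric; the function names and the sample strings are illustrative assumptions, not part of the project's code.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong character in a 13-character reference.
cer = character_error_rate("merhaba dünya", "merhaba dunya")
print(f"CER: {cer:.1%}")
```

Under this definition, a CER below 5% means fewer than one character error per twenty characters of reference text.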
The multilingual ASR model developed by NU ISSAI can be tested free of charge on ISSAI’s website. In addition, the models, datasets, and code from the research project are publicly available for download.
“We believe that the most important outcome of these projects is the training of highly qualified technical experts who will not only drive the technological development of Kazakhstan but also willingly share and apply their professional knowledge and know-how to contribute to the advancement of technologies in other countries, thus creating a better world for future generations,” says Prof. Huseyin Atakan Varol, NU ISSAI Founding Director.
The Institute’s researchers have previously created the first open-source Kazakh speech corpora (KSC and KSC2), large-scale open-source Kazakh text-to-speech corpora (KazakhTTS and KazakhTTS2), and the largest publicly available Kazakh named entity recognition dataset (KazNERD).
Prof. Varol adds: “The Institute has put constant and considerable effort into promoting the Kazakh language in the digital world. However, our Institute’s interest in language and speech technologies also extends to other Turkic languages. In this way, our Institute will emerge as one of the scientific centers for artificial intelligence and data science in the Turkic world and Eurasia.”