21. A Kazakh Language Dataset of Lip Movements for Command Recognition // Scientific Data.
Kenzheakhmetov B., Amankos A., Amirgaliyev B., Zhanibekova Zh., Zhalgas A., Yedilkhan D.
A Kazakh Language Dataset of Lip Movements for Command Recognition // Scientific Data. — 2025. — DOI: 10.1038/s41597-025-06193-0.
Abstract:
Lip reading systems infer the content of speech by visually tracking the speaker's lips, and so offer a communication channel when acoustic information is unavailable. Training robust lip reading models requires specialised corpora that capture both linguistic and visual articulatory variability. In this work, we introduce a new visual corpus specific to Kazakh lip reading. The corpus covers 102 nouns that occur most frequently in the Kazakh language and have distinctive articulatory patterns, recorded from 26 participants spanning a wide age range. The resulting collection comprises about 34,000 short video clips, amounting to about 1.2 million individual frames. The dataset is extensively annotated and is therefore a useful resource for improving Kazakh lip reading technologies. It also opens prospects for further research in this area and in multimodal speech recognition.
Link / DOI: https://doi.org/10.1038/s41597-025-06193-0