Dr. Ekapol Chuangsuwanich
Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University
We are developing a large-scale Thai automatic speech recognition (ASR) model trained on a combination of Thai-language speech datasets. The experiment takes an approach similar to the SpeechStew model (https://arxiv.org/abs/2104.02133), but our model is tuned specifically for the Thai language. The combined dataset contains more than 3000 hours of speech (over one billion input frames), spanning read speech, lectures, and conversational speech. The model is based on NVIDIA's Conformer-CTC model and has roughly 30 million parameters. Our preliminary results show that the model (which is still training on the cluster) can outperform models trained on any single domain. It also outperforms a domain-specific model on a held-out domain.
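As a rough sanity check on the "over one billion input frames" figure, the sketch below converts hours of audio into a frame count. It assumes a 10 ms frame shift, which is a common choice for ASR front ends but is not stated in the project description; the function name `frames_for_hours` is hypothetical and used only for illustration.

```python
def frames_for_hours(hours: float, frame_shift_ms: float = 10.0) -> int:
    """Return the approximate number of feature frames in `hours` of audio.

    frame_shift_ms is an assumption (10 ms is typical for ASR features);
    the project description does not state the actual value.
    """
    frames_per_second = 1000.0 / frame_shift_ms
    seconds = hours * 3600
    return int(seconds * frames_per_second)


# 3000 hours is the lower bound given for the combined dataset.
total_frames = frames_for_hours(3000)
print(f"{total_frames:,} frames")
```

Under that assumption, 3000 hours corresponds to slightly more than one billion frames, which is consistent with the figure quoted above. A different frame shift would scale the count proportionally.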
In collaboration with other universities and research teams across the country, these are some AI research projects being powered by the Apex AI Infrastructure.