The Vietnamese Language and Speech Processing (VLSP) competition is part of the annual international conference on Vietnamese Language and Speech Processing, organized by the VLSP club, a branch of the Vietnam Association of Information Technology. VLSP 2023 held 10 competitions on speech and text processing, bringing together leading researchers, experts and technology development units. At VLSP 2023, Viettel AI took First Prize in 2 categories: Speech Recognition and Speech Emotion Recognition, and Vietnamese-Lao Machine Translation.

In particular, Automatic Speech Recognition, which converts input speech signals into the corresponding text, is one of the important problems in speech processing. This year, the competition restructured its categories so that teams had to solve two problems at the same time: speech recognition and speech emotion recognition. Viettel AI not only overcame this challenge to win first prize, but also impressed with an outstanding score of 89.18% (the next two teams scored 83.40% and 78.45%, respectively).

According to a Viettel AI representative, the key to this accuracy is that Viettel AI mastered the technology early. Instead of using models from available research results, Viettel AI developed a model specifically for processing Vietnamese speech from scratch and has continuously updated and optimized it. Combined with a training process that can handle data of varying quality, this allowed the engineers to build a model that recognizes both the text and the emotion of a sentence with high accuracy, even with limited data.

Viettel AI Virtual Assistant Platform engineers participate in the Speech Recognition and Speech Emotion Recognition categories

Advanced speech processing technology has brought significant results to Viettel AI products such as virtual assistant systems and virtual switchboards, which can recognize voice with up to 95% accuracy and identify customer intentions with up to 96% accuracy. In particular, the research on voice and emotion recognition from the competition will open up new applications in customer care, such as extracting information from switchboard calls. Complaints and negative calls account for only a small share of the hundreds of thousands of calls the support switchboard receives every day, but they have a great impact on service quality. Instead of paying people to listen to and mark these calls as before, the Viettel Cyberbot virtual switchboard will be able to automatically identify and handle customer complaints as soon as the call is received.

Through the competition, Viettel AI affirms its determination to pioneer the development and application of the most advanced speech processing technologies to improve product and service quality.

Quoc Tuan