Acoustic Analysis of the Speakers’ Variability for Regional Accent-Affected Pronunciation in Bangladeshi Bangla: A Study on Sylheti Accent

Citation: S. Kibria, M. S. Rahman, M. R. Selim and M. Z. Iqbal, “Acoustic Analysis of the Speakers’ Variability for Regional Accent-Affected Pronunciation in Bangladeshi Bangla: A Study on Sylheti Accent,” in IEEE Access, vol. 8, pp. 35200-35221, 2020, doi: 10.1109/ACCESS.2020.2974799.
URL: https://ieeexplore.ieee.org/document/9001082

Abstract: Accented pronunciation variability is one of the key factors that degrade the accuracy of automatic speech recognition (ASR). This article reports an acoustic analysis of the variability that regional accents cause between two groups of speakers of Bangladeshi Bangla. The analysis considers the seven monophthongal and four diphthongal vowels of Bangla to investigate the acoustic characteristics of two groups of single-accent speakers and how those characteristics relate to the articulation of Standard Colloquial Bangladeshi Bangla (SCBB). An accent is a speaker’s regional signature and is shaped by the speaker’s community and educational background. This study examines both male and female speakers from the Sylhet region, which has one of the most deviant dialects of Bangla, and comparatively less deviant speakers from different districts of the north-west and middle part of Bangladesh. Accent-related acoustic features such as pitch slope, formant frequencies, and vowel duration are used to examine the prominent characteristics of the accents and to classify the accents from these features. The two gender groups are analyzed separately. The analysis finds significant deviations in formant frequencies and varying steepness of the rise/fall in pitch slope within the accents of both gender groups. The study also observes that accent-related changes in speech affect ASR performance. These findings underscore the need for accent-specific acoustic models to handle speakers of highly deviant dialects, and for accent-affected speaker variability to be considered when developing corpora for a robust ASR system in Bangladeshi Bangla.
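
As an illustration of two of the features named in the abstract, the sketch below computes a pitch slope as the least-squares slope of an F0 contour over time, and a vowel duration from segment boundaries. This is a minimal sketch under stated assumptions, not the authors' method: the abstract does not say how pitch slope or duration were measured, the function names `pitch_slope` and `vowel_duration_ms` are hypothetical, and the sample values are invented for demonstration only.

```python
from typing import Sequence


def pitch_slope(times_s: Sequence[float], f0_hz: Sequence[float]) -> float:
    """Return the least-squares slope of an F0 contour, in Hz per second.

    Assumption: pitch slope is taken as the slope of a straight line fitted
    to (time, F0) samples; the paper may define it differently.
    """
    if len(times_s) != len(f0_hz) or len(times_s) < 2:
        raise ValueError("need two or more paired (time, F0) samples")
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_f = sum(f0_hz) / n
    # Sum of squared time deviations (denominator of the slope estimate).
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time samples must not all be identical")
    # Sum of cross deviations (numerator of the slope estimate).
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, f0_hz))
    return sxy / sxx


def vowel_duration_ms(start_s: float, end_s: float) -> float:
    """Return a vowel segment's duration in milliseconds from its boundaries."""
    if end_s < start_s:
        raise ValueError("segment end must not precede its start")
    return (end_s - start_s) * 1000.0


# Hypothetical F0 samples for one vowel (illustrative values, not study data).
times = [0.00, 0.01, 0.02, 0.03, 0.04]
f0 = [200.0, 204.0, 208.0, 212.0, 216.0]

slope = pitch_slope(times, f0)            # positive value: rising pitch
duration = vowel_duration_ms(0.00, 0.04)  # segment length in ms
print(f"pitch slope: {slope:.1f} Hz/s, duration: {duration:.1f} ms")
```

A positive slope indicates a rising contour and a negative slope a falling one; comparing slope magnitudes across speaker groups is one simple way to quantify the "steepness of the rise/fall" the abstract refers to.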