My Writings

My Writings, Research Papers in ESL

DIALECT-DRIVEN ASR ERRORS: PHONETIC MISMATCH IN SOUTH ASIAN AMERICAN ENGLISH SPEECH

1 Muhammad Ansar
MA Data and Discourse Studies, Department of History and Social Sciences, Technische Universität Darmstadt, Germany
Email: muhammad.ansar@stud.tu-darmstadt.de
ORCID: https://orcid.org/0009-0007-0649-2033

1* Anosh Rehman
Department of English Linguistics and Language Studies, University of Sargodha
Email: anoshhamza338@gmail.com
ORCID: https://orcid.org/0009-0001-0412-0160

2 Hamza Nawaz Chaudhary
Department of CS, University of Sargodha
Email: hamnaw66@gmail.com

Official Link: https://jalt.com.pk/index.php/jalt/article/view/2124

Abstract

ASR systems have reached near-human accuracy on Mainstream American English (MAE) but still make systematic errors on non-mainstream varieties. This paper examines how ASR errors arise in South Asian American English (SAAE). It argues that the errors stem from a systematic discrepancy between the phonetic realizations of SAAE speakers and the acoustic-phonetic distributions encoded in MAE-trained models: the Phonetic Mismatch Hypothesis. A convergent mixed-methods design was used, combining controlled speech elicitation with quantitative error analysis. Forty SAAE speech samples were compiled into a corpus that reflects major segmental and suprasegmental features, such as variation in vowel quality, consonant cluster reduction, epenthesis, and prosodic transfer. A pretrained Whisper ASR model was evaluated against reference transcriptions, and Word Error Rate (WER) was calculated. A total of 170 errors were identified and classified as substitutions (82; 48.2%), deletions (52; 30.6%), and insertions (36; 21.2%). SAAE speech produced a WER of about 43%, compared with about 6% for MAE speech, and the gap narrowed only partially under a fine-tuned adaptation condition (WER ≈ 18%).
Error types were not randomly distributed across phonetic features: substitution errors were driven by vowel changes and consonant replacements; deletions were explained by consonant clusters; and most insertions were due to prosodic and rhythmic variation, specifically syllable-timed rhythm and epenthesis. These findings support the Phonetic Mismatch Hypothesis, which attributes ASR errors to linguistic behavior and not to failures in the system. This study contributes a phonologically grounded description of ASR bias and proposes that training and evaluation models factor in dialect-specific phonetic knowledge.

Keywords: Asian American English, ASR bias, phonetic variation, speech recognition errors, dialect mismatch, word error rate, linguistic equity, corpus, computational linguistics.

1. Introduction

ASR systems are now central to human-computer interaction and underpin applications such as virtual assistants, transcription systems, and voice-controlled interfaces. Deep learning has advanced rapidly, but these advances are not equally effective across groups of speakers. Speakers of non-mainstream dialects tend to experience higher error rates, which raises questions about the fairness, accessibility, and linguistic bias of speech technologies (Koenecke et al., 2020). Asian American English (AAE) is a varied group of English varieties shaped by multilingualism and exposure to the mother tongue. Although ASR systems are typically trained on large amounts of Mainstream American English (MAE), they tend not to generalize to other dialects, including AAE (Errattahi et al., 2018). The existing literature has treated this issue predominantly as a computational limitation, focusing on model adjustment and the scaling up of datasets. Less emphasis, however, has been placed on the phonetic processes underlying recognition errors.
Recent research on Automatic Speech Recognition has emphasized that errors in English speech recognition are not randomly distributed but are shaped by linguistic variation and the composition of the training data. Even state-of-the-art end-to-end ASR systems, including those based on deep neural architectures, lose performance on speech that departs from the norms of typical training data (Zhang et al., 2020; Chan et al., 2022). In particular, differences in pronunciation, phonotactics, and prosody lead to predictable recognition errors, especially in spontaneous and accented speech. These findings suggest that ASR error depends heavily on the probabilistic nature of the training data: models favor frequently represented linguistic patterns and are less effective with rarely represented ones. As a result, recognition systems are more likely to miss non-standard phonetic realizations and assign them to the closest acoustic category, which further supports the presence of systematic bias in English ASR performance.

Similarly, research on varieties of South Asian English has revealed challenges for ASR systems arising from phonetics, accent, and multilingual influence. For example, ASR models trained on standard English corpora show significantly higher error rates on South Asian-accented speech because of differences in vowel realisation, consonant production, and prosodic patterns (Psanadi, 2022). These investigations also show that transfer learning and model adaptation can increase recognition accuracy, although they still fail to eliminate the underlying gap between speech input and training data.
This reinforces the view that ASR errors are not technical limitations but rest on linguistic diversity and unequal representation in data. Consequently, it is increasingly clear that improving ASR performance on global Englishes requires more than larger datasets; it requires a broader approach that builds phonetic and sociolinguistic variability into the model structure.

This study shifts the analysis from system architecture to phonetic mismatch. It argues that ASR errors arise because the acoustic-phonetic patterns of AAE are systematically underrepresented in models trained on MAE. The study offers a linguistically based explanation of ASR bias by determining the relationship between specific phonetic properties and error types.

1.2 Research Objectives

The research aims to meet the following objectives:

1.3 Research Questions

The research aims to answer the following questions:

RQ1. Which phonetic characteristics of South Asian American English are systematically related to particular types of ASR errors (substitution, deletion, and insertion), and in what proportions?
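
The Word Error Rate and the error-type shares reported in the abstract can be illustrated with a small sketch. The paper does not say which scoring tool was used, so what follows is only an assumed, minimal word-level edit-distance implementation; the names `ErrorCounts`, `align_counts`, and `error_shares` are hypothetical and are not taken from the study.

```python
from typing import NamedTuple


class ErrorCounts(NamedTuple):
    """Word-level error counts for one reference/hypothesis pair (hypothetical helper)."""
    substitutions: int
    deletions: int
    insertions: int
    reference_length: int

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / N, where N is the number of reference words.
        if self.reference_length == 0:
            raise ValueError("reference must contain at least one word")
        total = self.substitutions + self.deletions + self.insertions
        return total / self.reference_length


def align_counts(reference: str, hypothesis: str) -> ErrorCounts:
    """Count substitutions, deletions, and insertions via Levenshtein alignment.

    Assumption: words are lowercased and split on whitespace; the study's own
    text normalization is not described, so this is illustrative only.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Each cell holds (edits, subs, dels, ins) for ref[:i] against hyp[:j].
    prev = [(j, 0, 0, j) for j in range(len(hyp) + 1)]
    for i in range(1, len(ref) + 1):
        cur = [(i, 0, i, 0)]
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                cur.append(prev[j - 1])  # match: no new error
                continue
            e, s, d, n = prev[j - 1]
            sub = (e + 1, s + 1, d, n)
            e, s, d, n = prev[j]
            dele = (e + 1, s, d + 1, n)
            e, s, d, n = cur[j - 1]
            ins = (e + 1, s, d, n + 1)
            # On ties, min() keeps the first candidate: substitution, then
            # deletion, then insertion. This tie-break is an arbitrary choice.
            cur.append(min(sub, dele, ins, key=lambda cell: cell[0]))
        prev = cur
    _, s, d, n = prev[-1]
    return ErrorCounts(s, d, n, len(ref))


def error_shares(substitutions: int, deletions: int, insertions: int) -> dict:
    """Percentage share of each error type, rounded to one decimal place."""
    total = substitutions + deletions + insertions
    if total == 0:
        raise ValueError("no errors to break down")
    return {
        "substitutions": round(100 * substitutions / total, 1),
        "deletions": round(100 * deletions / total, 1),
        "insertions": round(100 * insertions / total, 1),
    }
```

Calling `error_shares(82, 52, 36)` with the counts from the abstract reproduces the reported 48.2%, 30.6%, and 21.2%. `align_counts` scores a single utterance; a corpus-level WER would sum the counts over all utterances before dividing by the total number of reference words.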

My Poetry

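What Wonder It Would Have Been, Had We Taken Pride in Our Fate

What wonder it would have been, had we taken pride in our fate,
rather than burning and grieving from morning till evening.

Had we lengthened our days with laughter and play,
rather than protesting out of regret.

Had we borne all their cruelties with a bowed and devoted head,
rather than making the rival a keeper of our secrets.

Had we gathered up the shards of the heart,
rather than purifying the soul with tears.

Had we left this story a tale with its door still open,
rather than putting it into words like this.

What wonder it would have been, had we taken pride in our fate,
rather than burning and grieving from morning till evening.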

Aesthetic Love

Chapter 1

Wandering through the woods, she slid her fingers over the wet moss as she tracked a squirrel through the rustling leaves, watching the beautiful birds chirp among trees as green as they could be. “Ahhh! It is a fine spring morning,” she murmured. There was a breath of spring in the air, the primroses were out, and the lake was calm, as though the air had been holding its breath, waiting until she came. From some distance came the creak of a gate, interrupting the silence. Touching the fresh flowers and gazing at the clean water of the lake, she noticed that one blossom had already floated away on the water, a fragment of beauty turned to ash in its prime.

Suddenly, she heard a familiar voice moving towards her. She closed her eyes, and he appeared, placing his hands over them.

“Oh, babe! You woke up? I hope you slept peacefully last night,” she said, removing his hands from her eyes and holding them in hers.

He smiled, kissed her, and started walking by her side. “How can I not sleep well when I’ve got such a pretty girl as my wife, who loves me so much that she does everything I need?” he asked.

A soothing, gorgeous smile lit up her face like the shining sun. Her dark eyes sparkled whenever he held her hand and praised her. Dark red-brown eyes, perfect thin little lips, a round face and long neck, a fair complexion, and hazel-brown hair: indeed a perfect beauty, one that could enchant Hummain.

“You know I can’t sleep without you. Without seeing your face, my day feels incomplete, and I’m thankful to Allah that I got my love as my husband,” Aniyah replied.

He laughed, and they walked towards home, holding hands. Hummain was a perfect boy, yet a flawed one; he had drowsy eyes that could make anyone fall for him, luscious lips, a long neck, perfect height, and a healthy physique. Indeed, he was a beauty who could charm anyone with his handsome features.
Hummain smiled in that ill-humored way of his, but it took him a long time to rub his eyes open. The morning sunlight fell on her hair, hazel-brown like autumn wheat, and a look seemed to come into his eyes, as though he might be losing someone or something.

After reaching home, she prepared breakfast and served him. “How long are we gonna live alone, without even your parents? Why don’t we invite them here to live with us?” she asked.

“We will call ’em soon, honey! Don’t you worry,” he said.

“But you know I’m afraid to live alone,” she replied.

Seeing her in despair, he said, “Don’t you dare to think that you’re alone! Don’t you dare to think that you’re alone. Keep thinking about me, Lauv! I’ll be back soon.” He kissed her head, said goodbye, and went towards his car.

There was more to the quiet of the lake now: clouds drifted in behind it one at a time, dark and slowly deliberate, as though they foreknew that the fond glee of the season would not last long. The wind was sharp as a breath, though it had been warm all morning; then it died once more, and the woods were very still. Once a thrush in the hedge called but stopped mid-song, as though it had caught sight of something invisible. In the distance, a gate creaked and closed without a hand on it. Aniyah shivered, though it was still sunny, and she could name no reason. Petals had fallen from the primrose she had touched that morning, strewing a path on the ground without a sound. She saw them, and with a certain sense of troubled astonishment, she gripped her hand tighter in Hummain’s.

Chapter Two

And when he left, Aniyah fell into her world of reverie, into which memories and imaginings sank like threads of a dream. She remembered how beautiful their union was, a union that had no definite beginning. Never could I have dreamed that I would ever marry you! Who would have thought you would cross cities, bidding farewell to all that was known, all that was dear, just for me?
Who would have believed that you’d come so far, to a distant city, just for me? Just for my love and care? Oh! I can’t imagine my life without you, dear husband. I miss the beauty of our meeting, the first blush of the rose that bloomed without wind or shade, before life took its ashes and scattered them.

Love at first sight? She thought of the first time their eyes met: not with the thunderclap that poets call love at first sight, but with a quiet sense of recognition that begins, and knows not that it begins, to nurture itself into devotion. His face, so very ordinary to her at first, had by slow degrees become the face she liked best of all. She recalled how he had once asked her name, with a note of such interest in his voice that she almost took it for a command, and how, though she was always cautious, she had shown her face to him without thinking who he might be or where he had come from.

But, she said to herself, it was not love in the ordinary sense; love, if understood in its completeness only afterward, is so easily lost in its own fulfilment. Instead, it was the silent worship of a soul who had found her own, even in the months when he had nothing to say to her, and when he was away. But at that time, there was torment at hand: nights of sleeplessness, doubts she dared not voice. Does he love me? Will he ever visit me?
