“Sorry, I Didn’t Get That!” Problems in Voice-Recognition Technology for Speakers with Dysarthria

Published: October 25th, 2017

Category: UAD Student Blog

By Carly Gurick and Carolyn Nguyen

Over the past decade, communication technology has grown by leaps and bounds: from messaging on a simple keypad, to typing on an on-screen keyboard, and now to speaking through voice-to-text software. Many speech recognition devices are speaker-independent, meaning that instead of collecting data from the actual user, the device relies on speech samples drawn from many different speakers. This allows consumers to complete tasks through voice commands alone. Although these devices are offered to the public, not everyone is able to enjoy their convenience. How well does such a device recognize the speech of a client with a communication disorder?

Speech intelligibility plays a major role in communication, whether it is face-to-face with another person or through software. However, the margin of error is much larger in voice recognition programs than with a human listener. According to an article in Scientific American (Mullin, 2016), word-recognition rates for clients with dysarthria can be 30 to over 80 percent lower than those for the general population.
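To make the idea of a word-recognition rate concrete, here is a minimal Python sketch of one common way such a rate could be computed: count the word-level substitutions, insertions, and deletions between what was said and what the software transcribed, then divide by the number of words spoken. The cited article does not say how its figures were calculated, so this method is an assumption, and the function names and example sentences are hypothetical.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Count word-level edits (substitutions, insertions, deletions)
    needed to turn the transcript into what was actually said."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the edit distance between the reference words
    # processed so far (before the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_recognition_rate(reference: str, hypothesis: str) -> float:
    """Fraction of spoken words recognized correctly (assumed definition:
    1 minus the word error rate, floored at 0)."""
    n_words = len(reference.split())
    if n_words == 0:
        raise ValueError("reference must contain at least one word")
    return max(0.0, 1.0 - word_errors(reference, hypothesis) / n_words)


# Hypothetical example: what the speaker said vs. what the device heard.
spoken = "turn on the kitchen lights"
heard = "turn on the chicken light"
rate = word_recognition_rate(spoken, heard)
print(f"{rate:.0%} of words recognized")
```

In this made-up example two of the five words are misrecognized, so the rate is 60 percent. A speaker with reduced intelligibility would be expected to see more of these mismatches per sentence.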

Dysarthria is a motor speech disorder that results from impairments to the motor programming and execution of the muscle movements used for speech. The muscles involved include those of the lips, tongue, jaw, vocal folds, and diaphragm. Dysarthria can be caused by stroke, traumatic brain injury, cancer, Parkinson’s disease, Huntington’s disease, and ALS. The type of dysarthria and its degree of severity vary greatly; however, vocal quality and intelligibility can be severely affected across the board. People with dysarthria may present with some of the following speech characteristics: hoarseness, breathiness, slurred speech, monopitch, monoloudness, imprecise vowels and consonants, and voice stoppages. The general decrease in intelligibility and comprehensibility that a person with dysarthria exhibits can be very frustrating, not only for the patient but also for their communication partners.

Because dysarthria is so complex and unpredictable, it is difficult to create an ideal model for speech recognition devices. However, given the constant advances being made in technology, we hope that developers will be able to resolve this problem in the future.

References:

Dysarthria. (n.d.). Retrieved September 17, 2017, from http://www.asha.org/public/speech/disorders/dysarthria/

Mullin, E. (2016, May 27). Why Siri won’t listen to millions of people with disabilities. Retrieved September 17, 2017, from https://www.scientificamerican.com/article/why-siri-won-t-listen-to-millions-of-people-with-disabilities/

Young, V., & Mihailidis, A. (2010). Difficulties in automatic speech recognition of dysarthric speakers and implications for speech-based applications used by the elderly: A literature review. Assistive Technology, 22(2), 99-112.