Date: 2021-04-02

In the touchless economy accelerated by COVID-19, automatic speech recognition has seen a sharp uptick in use. As the world rapidly shifted to remote work and expanded online contact centers and storefronts, businesses turned quickly to virtual assistants, chatbots and automated transcription services.

Yet, even before COVID-19, enterprises were steadily moving towards ASR to augment their workflows.

ASR uses AI-based technologies, including machine learning and deep learning, to identify and process human speech and turn it into text. The technology can be used to power voice-based AI systems or virtual assistants, like Google Home or Amazon Alexa, or run voice-to-text software.

More ASR

Organizations have increasingly turned to ASR over the last couple of years, as advances in AI, particularly machine learning and deep learning, have greatly improved ASR systems' accuracy, said Hayley Sutherland, a senior research analyst for conversational AI and intelligent knowledge discovery at IDC. Right now, most systems have an accuracy of 75% to 85% off-the-shelf, but training can improve that, she noted.

COVID-19 further increased interest in ASR systems, as the pandemic drove a rapid shift to remote work and education and sparked a profusion of virtual meetings.

Scott Stephenson, CEO of ASR vendor Deepgram, acknowledged that, before the pandemic, organizations that hadn't started using ASR technology expected they would do so when they eventually upgraded their infrastructure.

"They would say, if you had talked to them a year prior to the pandemic, 'in the next three years, we're going to update our infrastructure,'" he said, adding that the same organization likely had been saying that for the past decade.

"Now when you talk to them," Stephenson continued, "they say, 'We have already upgraded our infrastructure; we had to because we wouldn't be able to operate if we didn't.'"

Deepgram, in partnership with Opus Research, recently surveyed 400 North American decision-makers in various industries to determine if and how respondents use ASR. About 99% of the respondents indicated they are currently using ASR in some form. Most, about 78%, are using ASR systems to transcribe and analyze voice data from consumer-facing devices -- largely voice assistants within mobile apps.

Common applications

Indeed, outside of broadcast subtitling, one of the most common use cases for ASR is within voice-enabled virtual assistants, most of which rely on speech-to-text software to first convert spoken word to text, Sutherland said.

"Once in text format, advanced natural language processing can be performed to help conversational AI systems 'understand' what users are saying and determine how to respond," she noted.

Other common applications include enterprise meeting transcription, class transcription and medical notes dictation, she said.

Deepgram's survey found that, after using ASR with consumer-facing devices, organizations are most commonly integrating ASR systems with their collaboration platforms (such as Zoom, Webex, Skype and Slack), with their customer-facing contact centers and with their internal help desks.

Still, despite respondents' intensive use of ASR, the survey showed that more than half of the respondents don't believe they are properly using their recorded audio. According to Stephenson, that's a silo problem.