Voice-activated digital assistants have already taken us to new frontiers on our mobile phones, computers, smartwatches, in our cars, our homes, and at work. But what’s next on the speech recognition frontier?
Star Trek premiered on September 8th, 1966—introducing a sophisticated speech recognition technology that spurred the imagination over the three seasons of its original run.
The show’s Library Computer Access/Retrieval System—better known as LCARS—provided audible responses to speech queries from various computers on Federation ships, giving viewers a glimpse of what could be in normal, everyday life.
In a 2016 New York Times article, Farhad Manjoo wrote “the Star Trek computer is no metaphor. (Google) believes that machine learning has advanced to the point that it is now possible to build a predictive, all-knowing, super helpful and conversational assistant of the sort that Captain Kirk relied on to navigate the stars.”
By now, we are all familiar with Alexa, Google Assistant, and Siri. They represent the impressive improvements in natural voice recognition and accuracy rates in recent years.
But these voice assistants don’t tell the whole story. Let’s take a look at some of the most innovative software and applications in use today.
Speech Recognition: Where Are We Using It?
Before we jump into specific and innovative programs that make use of this technology, let’s discuss where it’s being used.
Speech Recognition At Home
Typing with your voice allows you to speak emails and documents into existence by hitting the microphone option on your device’s keyboard.
Voice search is the most common use of this technology. In 2021, it’s believed that 5 billion people will use voice-activated search and assistants around the world, a number that could rise to 6.4 billion in 2022. The most common tools are Siri, Google Assistant, and Amazon Alexa.
Additionally, 30% of voice assistant users say they invested in an Amazon Echo or Google Home in order to use voice-powered smart home technology to control clocks, speakers, lights, doorbells, cameras, window blinds, and other household appliances.
The smart speaker is neat, but what is really starting to grab people is the smart display, essentially a smart speaker featuring a touch screen.
Smart displays like the Russian Sber portal or the Chinese smart screen Xiaodu, for example, come equipped with several AI-powered functions, including far-field voice interaction.
In 2020, sales of smart displays rose by 21% to 9.5 million units, while sales of basic smart speakers fell by 3%, and that trend is likely to continue.
The COVID-19 pandemic also increased our use of voice technology at home when it came to shopping.
How many of us ordered coffee, groceries, books, clothes, or meals for curbside delivery during various stay-at-home lockdown orders? All of this can be done using speech commands on your mobile device, straight from the couch, without moving a finger.
Speech Recognition at Work
COVID-19 also sped up the rise of remote workplaces and, as a result, video meetings and conference calls.
The ability to convert audio to text is very useful, especially if you’re operating in a constantly evolving and fast-paced work environment.
A reliable transcription service makes all the difference when it comes to implementing what is discussed in these meetings.
You need to be able to recall what was discussed and put into practice any important points that were raised.
Not having to type or write out your lengthy notes is a huge time saver, and speech recognition technology enables transcription capabilities that help boost productivity and profitability.
Speech Recognition in Our Cars
Many of the innovations in speech recognition technology have been driven by the auto industry.
Companies like Apple, Google, and Nuance have reshaped the way voice-activation is used in vehicles.
Apple’s CarPlay, for example, brings a more basic and safety-focused version of iOS to your car’s touch-screen display. Connect your iPhone and your car’s factory-installed entertainment system is replaced by Apple’s familiar icons.
Siri helps you switch between playlists, navigate to the nearest gas station, send text messages, and make calls, all hands free.
Nuance’s Dragon Drive platform can process natural voice patterns in a far more sophisticated way than its competitors.
For example, Dragon Drive “learns” your road behaviors and updates to meet your needs before you even have to ask. The feature is similar to Google’s Nest, which “learns” your habits and automatically adjusts for optimal temperature settings in your house.
Nuance has also been able to incorporate voice biometric capabilities to distinguish between people speaking inside the vehicle.
Voice Technology in Banking
Banks have seen the value in voice-based banking not only in added customer convenience, but also in reducing the need for human customer service representatives.
The Royal Bank of Canada (RBC), for example, offers customers the ability to pay their bills using voice commands.
USAA also lets members access information about account balances, transactions, and spending patterns through Amazon Alexa.
The U.S. Bank Smart Assistant now offers spending insights, which you can access by asking simple questions, such as:
- “How much did I spend at [store name] last month?”
- “How many times did I go to [type of business] last month?”
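To make the idea concrete, here is a minimal sketch of how a question like the first one might be answered once the speech has already been converted to text. It is not how any bank's assistant actually works: the transaction records, the `SPEND_PATTERN` regular expression, and the `answer_spend_question` function are all hypothetical names invented for illustration.

```python
import re
from datetime import date

# Hypothetical transaction records: (date, merchant, amount in dollars).
TRANSACTIONS = [
    (date(2021, 3, 4), "Corner Cafe", 4.50),
    (date(2021, 3, 18), "Corner Cafe", 6.25),
    (date(2021, 3, 20), "Book Nook", 18.00),
    (date(2021, 4, 2), "Corner Cafe", 5.00),
]

# Assumed phrasing of the question; a real assistant would handle many variants.
SPEND_PATTERN = re.compile(
    r"how much did i spend at (?P<store>.+?) last month\??$",
    re.IGNORECASE,
)


def previous_month(today):
    """Return (year, month) for the calendar month before `today`."""
    if today.month == 1:
        return today.year - 1, 12
    return today.year, today.month - 1


def answer_spend_question(utterance, today, transactions=TRANSACTIONS):
    """Total spent at the named store last month, or None if not understood."""
    match = SPEND_PATTERN.match(utterance.strip())
    if match is None:
        return None
    store = match.group("store").strip().lower()
    target = previous_month(today)
    return sum(
        amount
        for when, merchant, amount in transactions
        if merchant.lower() == store and (when.year, when.month) == target
    )
```

Calling `answer_spend_question("How much did I spend at Corner Cafe last month?", date(2021, 4, 10))` adds up the two March purchases at that merchant. Anything that does not match the assumed phrasing returns `None`, which is where a real assistant would ask a follow-up question.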
Other banking virtual assistants not only help you make transactions, but also help you keep track of your spending and show you advanced insights about where your money goes.
For example, Bank of America’s Erica provides several key insights into your finances, including:
- Weekly snapshot of spending
- Search for past transactions
- Monitoring recurring charges
- Notifications of changes to your FICO credit score
- Tracking account balance trends
Voice biometrics really come into play here as well. Your voice can be used as a unique identifier to securely access your accounts.
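One common way to think about voice matching is to reduce each voice sample to a list of numbers (a "voiceprint") and check how closely two voiceprints point in the same direction. The sketch below illustrates only that comparison step, using cosine similarity. The function names, the sample vectors, and the 0.8 threshold are assumptions made up for this example; production systems derive voiceprints from trained speaker models and tune thresholds carefully.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("voiceprints must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no information to compare
    return dot / (norm_a * norm_b)


def is_same_speaker(enrolled, attempt, threshold=0.8):
    """Accept the attempt if it is similar enough to the enrolled voiceprint.

    The threshold is a hypothetical value chosen for illustration.
    """
    return cosine_similarity(enrolled, attempt) >= threshold


# Made-up voiceprints: the second is close to the first, the third is not.
enrolled_voiceprint = [0.9, 0.1, 0.4]
same_person_attempt = [0.88, 0.12, 0.41]
different_person_attempt = [0.1, 0.9, 0.2]
```

With these made-up values, `is_same_speaker(enrolled_voiceprint, same_person_attempt)` accepts the attempt and `is_same_speaker(enrolled_voiceprint, different_person_attempt)` rejects it. Raising the threshold makes the check stricter at the cost of rejecting more genuine attempts.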
Voice to Text at the Doctor’s Office
Medical transcription software uses speech recognition to capture patient diagnosis notes.
Doctors can shorten the average appointment by using speech recognition instead of typing out their own notes, allowing them to see more patients.
Machines can also estimate a person’s mental state by analyzing their voice. A recent study concluded “speech processing technology could aid mental health assessments, but there are many obstacles to overcome” and more data is needed to make inroads here.
Speech Recognition Software Innovations
According to a Research and Markets report, the speech recognition market is expected to reach USD 27 billion in 2026, compared to USD 10.7 billion in 2020.
That’s due to increasing demand for voice-activated devices used in the retail, banking, smart home, healthcare, and automobile sectors.
Here’s a sample of speech recognition software innovations to be aware of in 2021.
Nuance Dragon software comes in several variations for personal and professional use, with specializations for the latter in the fields of medicine and law.
Dragon Professional instantly transcribes dictation with maximum accuracy. Dragon Anywhere is a cloud-based speech-to-text software that enables you to work from anywhere.
Nuance has become known as one of the best speech recognition developers, a reality that was recognized by Microsoft’s recent $19.7 billion acquisition of the company.
“There are lots of companies, and start-ups, doing voice recognition, but Nuance has been a leader in this area for years,” said Blair Pleasant, president and principal analyst for COMMfusion. “It has the technology, partnerships, customers, and expertise needed to further succeed and grow.”
That deal should bode well for one of Microsoft’s key voice recognition technologies.
Voiceitt was recognized in the Accessibility category for its first-of-its-kind app that provides a simple, accessible way for people with speech and motor impairments to communicate using their own voice.
Voiceitt’s app utilizes speech recognition technology to help people with speech disabilities – relating to stroke, degenerative disease, or developmental disorders – communicate and be understood, making speech recognition accessible to everyone.
The company’s advanced automatic speech recognition (ASR) technology identifies and adapts to a person’s unique impaired speech patterns like breathing pauses and non-verbal sounds.
This allows anyone with mild to severe speech impairments to communicate and control smart devices with their own voice.
Voiceitt recently integrated with Amazon Alexa, enabling people with speech impairments to use their own iPhone or iPad app to access and control Alexa.
Alexa for Business Partnerships
With Alexa for Business, employees can use Alexa as their intelligent assistant to be more productive in meeting rooms, at their desks, and even with the Alexa devices they already use at home or on the go.
This is a service that enables organizations and employees to use Alexa to get more work done, and a couple of key partnerships have made it easier for businesses to access it.
In late April 2021, Zoom announced that Amazon’s business-focused Alexa integration was now available for everyone in Zoom Rooms Appliances, a program Zoom launched back in 2019 to bring dedicated Zoom hardware to meeting rooms around the world.
As the world returned to a sense of normalcy following the COVID-19 pandemic, businesses continued to explore new contactless ways of running meetings. With Zoom Rooms and Alexa, a small team in one office can communicate with another team in another office without having to touch any buttons.
The Future of Voice Recognition is Ongoing
The industry is already saturated with hundreds (if not thousands) of companies experimenting with integrating their products and services with digital voice assistants.
If you’re interested in learning more, take a look at our complete guide to speech recognition technology.
Contact us today to find out how we can help grow your technology.