Voice Recognition Still Lacking Compared To Humans

If you’re trying to choose between transcription software and a human transcriber, this guide will help you make a clear decision.

According to App Annie, 194 billion apps were downloaded in the last year, up from 178 billion the year before. There’s an app for everything these days, and we rely on them just as much in the corporate world as in our home lives. WhatsApp Business was the sixth most downloaded app in 2018, and business apps are the second largest category in the App Store. Unfortunately, transcription apps, which turn audio files into text, still fall far short of the capabilities of a human transcriber. The issue is largely down to the limitations of voice recognition software.

Voice Recognition Advances 

Tech giants have long been developing voice recognition software, using humans to assist in their artificial intelligence projects. Siri, Google Home, Alexa and Cortana are all familiar names that many consumers rely on around the home to answer basic questions.

Recent tests conducted by Loup Ventures set out to measure how accurate these four voice recognition services are. Over the course of 800 questions, Google came out on top, answering 88% of the questions correctly, with Siri scoring 75%, Alexa 72.5% and Cortana 63%.

Clearly, there’s still some work to be done on how well these digital assistants recognise the voice and the question being asked of them before returning a relevant answer.

Human Transcribers Are Superior 

Although they are different products, transcription software and digital assistants both rely on voice recognition software, which is still somewhat lacking in comparison to real humans.

The UK is home to around 37 regional dialects, a figure that doesn’t take into account the accents of non-native speakers who live here. If you’re transcribing an audio file, the chances are that it will include dialogue from at least one person with an accent. Whilst human transcribers might be very familiar with Cockney or Brummie accents, for example, voice recognition software can struggle to interpret certain words or phrases correctly.

But accents aren’t the only difficulty that voice recognition software has. A UK transcription services firm explains that audio files vary widely in quality, with the type of technology used and the distance between the speaker and the microphone being two common issues. Whilst a transcriptionist can play back a muffled recording and make sense of it in the context of the rest of the discussion, a piece of software isn’t able to decode that section of the file with the same accuracy.

Industry-Specific Knowledge

Another area where voice recognition can’t claim to be perfect is turning industry-specific audio into meaningful content. Many transcriptionists choose to specialise in a particular area such as law or medicine. This means that they have the appropriate industry knowledge, understand the relevant terminology and acronyms, and can convey this information in a way that makes sense to the reader. Where they are a little unsure of the jargon, human transcribers can carry out quick research to ensure that they’re up to speed. Voice recognition software would not be able to go to the same lengths to ensure accurate copy, and would make a ‘best guess’ when turning the industry-specific audio into text.

Voice recognition software is undoubtedly impressive and will most likely improve in the coming years, but when it comes to turning audio into text, humans are still able to provide a far better transcription service!