According to estimates from the World Health Organization, 466 million people in the world have disabling hearing loss: 6.1% of the world's population. That is 466 million people who struggle to digest the information that increasingly comes our way by means of video. The amount of audio-visual content in the world is growing exponentially; everywhere we look, information is coming our way. At Scriptix, our mission is to make the spoken word accessible to everybody. To that end, we build speech recognition models that turn the spoken word into text, so we can offer automatic subtitles for all that content and everybody can follow what is being said.
The challenge for people who are deaf or have hearing loss
As mentioned, the amount of audio-visual content is increasing exponentially; in other words, we digest an ever larger share of information through audio and video. Without subtitles, however, people with hearing loss are unable to consume that information, which puts them at a serious disadvantage in a fast-changing world, especially during unprecedented situations such as the current COVID-19 pandemic. We also communicate through video at an increasing rate; think of vlogs and FaceTiming with family and friends. Adding subtitles is crucial to making these forms of communication more accessible. And because adding subtitles manually is a time-consuming and costly undertaking, applying speech to text offers a time-saving and cost-efficient way of dealing with this challenge.
Why accessibility and speech to text matters
The number of people with hearing loss is estimated to grow from 489 million in 2020 to 933 million in 2050, meaning an increasing number of people will struggle to understand what is being said. It is of the utmost importance that these people too are enabled to digest information properly. Speech to text is one of the tools that can help in that regard: by subtitling content automatically, we make sure that people with hearing loss can, for example, follow all relevant news as well.
Implementing the solution
The quality and ease of implementation of a service such as speech recognition have improved greatly over the last five years. With the rise of cloud computing and the continuous improvement of speech recognition platforms exposed through APIs, almost any organization can integrate such a solution into its existing workflows. A broadcaster that uses a Media Asset Management system from any supplier, for example, can now integrate such a service into its workflow in half a day. In other words, in half a day a broadcaster can enable automatic speech recognition to add subtitles to its content, making that content accessible to a much larger audience.
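To illustrate how little glue code such an integration needs: a speech-to-text API typically returns transcript segments with start and end times, and the main remaining step in a subtitle workflow is converting those into a format the Media Asset Management system accepts, such as SubRip (SRT). The sketch below is a minimal, hypothetical example of that conversion; the segment structure is an assumption and does not reflect Scriptix's actual API response format.

```python
def srt_time(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Turn (start_sec, end_sec, text) segments into an SRT document.

    Each SRT cue is a sequence number, a time range, and the text,
    with cues separated by a blank line.
    """
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(cues)


# Hypothetical transcript segments as an ASR service might return them:
transcript = [
    (0.0, 2.5, "Good evening, and welcome to the news."),
    (2.5, 5.0, "Tonight we look at digital accessibility."),
]
print(to_srt(transcript))
```

This kind of small adapter is usually all that sits between the recognition service and the existing workflow, which is why an integration can be done in hours rather than weeks.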
Improving accessibility for people with hearing loss: a worldwide trend
Worldwide, greater inclusion and digital accessibility are receiving more and more attention. In the U.S., the Americans with Disabilities Act has been around since 1990 and has gradually been amended with additional legislation to improve the lives of people with disabilities.
A few years ago, it became mandatory for content producers to provide captions, making content more accessible for people with hearing loss. Similar legislation has passed in Canada, Australia, and New Zealand. In the Philippines, Republic Act No. 10905, passed in 2016, requires “(…) all franchise holders or operators of television stations and producers of television programs to broadcast or present their programs with closed captions.”
And in September of last year, the new Web Content Accessibility Guidelines went into effect throughout the European Union, which meant that all content on government websites had to become more digitally accessible. For audio-visual content, this means subtitles are now required.
Feedback and customization are key
Gartner research states: “(…) Technology strategic planners must look beyond accuracy rates toward contextual awareness, domain specificity and custom training through machine learning to successfully increase market share.” Scriptix works closely with its partner Arbor Media to create automatic subtitles for municipalities in the Netherlands, enabling them to comply with the new Web Accessibility Directive. Moreover, we are in close contact with special interest groups that advocate greater inclusion and digital accessibility. They provide us with the necessary feedback from their stakeholders on how we can optimize our system's output to fit their needs.
For example, speech recognition models turn every word into text, but people sometimes stutter while talking. The resulting subtitle then contains the same word multiple times in a single sentence, which hurts readability. At Scriptix we are therefore working on algorithms that automatically detect such duplications and delete them.
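The core idea can be sketched in a few lines: collapse immediately repeated words, ignoring case and punctuation when comparing. This is a simplified illustration, not Scriptix's production algorithm, which has to handle legitimate repetitions and word fragments as well.

```python
def remove_stutters(text):
    """Collapse consecutive duplicate words, e.g. stutters in a transcript.

    Comparison is case-insensitive and ignores trailing punctuation,
    so "I I, I went" collapses to "I went".
    """
    cleaned = []
    previous = None
    for word in text.split():
        key = word.lower().strip(".,!?")  # normalize for comparison only
        if key != previous:
            cleaned.append(word)  # keep the original spelling
        previous = key
    return " ".join(cleaned)


print(remove_stutters("I I I went to to the store"))
```

A real system also has to decide when a repetition is intentional ("very, very good"), which is where the machine-learning side of such an algorithm comes in.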
Another example: transcribing an audio file word by word does not produce an actual subtitle, which is always a summarized or shortened version of what is being said. To tackle this, we are working on an algorithm that automatically summarizes what we transcribe.
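One common baseline for this kind of shortening is extractive summarization: score each sentence by how frequent its words are in the whole transcript and keep only the highest-scoring ones. The sketch below shows that baseline; it is an illustration of the general technique, not the algorithm Scriptix is building.

```python
import collections
import re


def summarize(text, keep=1):
    """Keep the `keep` highest-scoring sentences, in their original order.

    A sentence's score is the average corpus frequency of its words,
    so sentences full of recurring terms are considered most central.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = collections.Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"\w+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    top = sorted(sentences, key=score, reverse=True)[:keep]
    # Re-emit in original order so the summary still reads naturally.
    return " ".join(s for s in sentences if s in top)


transcript = "Cats are great. Dogs bark. Cats cats cats."
print(summarize(transcript, keep=1))
```

Abstractive approaches, which rephrase rather than select sentences, can produce tighter subtitles but are considerably harder to get right.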
By building such additional algorithms, the output increasingly resembles an actual subtitle, which in turn improves readability and usability for the actual target audience: people with hearing loss.