The following is an excerpt from The Sound of the Future: The Coming Age of Voice Technology (PublicAffairs, October 10, 2023), which discusses the ways voice technology is poised to drastically alter the way we live and how companies do business.
Inclusion, the challenge of making the business and its offerings equally accessible and welcoming for all its employees and customers, is on the agenda of companies around the world. Voice may be the most powerful technology we can use to move the dream of inclusion from aspiration to an integral part of everyday life.
Today’s digital world is one of intense, seemingly ubiquitous connections. Yet as the COVID-19 pandemic has reminded us, it’s also a world in which health risks, infirmities, and other circumstances can easily force people into painful forms of separation and isolation. At the same time, age-old social differences such as race, gender, ethnicity, language, and religion continue to separate people and groups from one another. Finding ways to bridge these divisions between individuals and between groups is one of the biggest social, political, and economic challenges humankind faces.
Voice technology can play an important role in the quest for a more inclusive society. As the most basic and universal human communications tool of all, voice can be accessed freely by people with a wide range of physical disabilities; it does not require literacy; and via increasingly powerful voice translation systems, it can make the English-dominated digital world accessible to all the world’s people.
Voice technology is already eliminating physical and social barriers that once seemed insurmountable. Future breakthroughs now in development will provide similar benefits to hundreds of millions of people around the world.
In 2020, Google released its Look to Speak app, which uses the camera and speaker built into an Android phone to let users select a phrase with their eye movements and have it spoken aloud. It’s great news for the many people who need this kind of support. And those suffering from paralysis are not the only people for whom voice tech tools can play a vital role.
The blind and vision-impaired, for example, stand to benefit in many ways from the growing availability of the voice interface as a way of accessing vital services and information. The World Health Organization estimates that some 2.2 billion people have some form of vision impairment—many of them with limited access to eye care.
One basic voice tool that is already helping millions of people with visual impairment is the screen reader, a tool that can read aloud the contents of text displayed on a smartphone or a computer screen. A number of screen reading apps are currently available, including VoiceOver, which is built into recent models of iPhones, and TalkBack, which provides the same service to users of Android phones. Screen readers are becoming increasingly sophisticated, offering options such as variable speaking rate, adaptability to a braille keyboard, and other useful variants.
Social media sites like TikTok are among the tech-based organizations that are using voice-based tools to make their services more accessible to people with limited vision. To support people who have difficulty reading the captions that appear in online videos, TikTok makes it simple for them to access a voice-based alternative. It’s super-easy to use: Just post a video, type a caption, and touch the “text-to-speech” button; anyone who views the video will hear the caption spoken aloud. Similarly, vision-impaired people can now access Instagram using an alternative text feature that offers rich spoken descriptions of photos as they pop up on screen.
Many other familiar digital tools now provide voice-based forms of support for those with limited vision. Voice tools are making it easier and safer for people with impaired vision to get around. Navigation systems like Google Maps can use voice to communicate the quickest, best route for a pedestrian, including subtle details like the precise location of an entry door. Moovit, a public transportation guide now owned by Intel, not only offers spoken guides to bus, train, and subway routes but also provides help such as the names of stops as they are reached during the journey. And Microsoft Soundscape offers the iPhone user an audible description of their surroundings, describing nearby landmarks and intersections, and allowing the user to create an aural “beacon” to guide movement toward a destination. Another app, called Nearby Explorer, provides similar support for Android users.
One of the most ingenious voice-based mobility tools is the WeWalk cane, launched through a crowdfunding campaign in 2018. At first glance, it resembles the standard white cane long used by the blind to help them travel independently. The big difference: WeWalk includes a touchpad, speaker, and sensory tools that probe for nearby obstacles, as well as a voice-activated smartphone app that responds to the user’s questions: “Where am I now? How far is the nearest bus stop? Which way to the supermarket?” For a sightless individual, having a WeWalk cane is almost like having a dedicated assistant available at all times to help you go wherever you want to go—comfortably and safely.
The Lookout app, created by Google, lets a smartphone user take a snapshot of a food container. The app then reads aloud the nutrition information on the label. Seeing AI is an ambitious app from Microsoft that works with the VoiceOver screen reader on your iPhone to provide a variety of handy services, from reading the text on a printed page held in front of your smartphone camera to describing images captured in photos. Vision-impaired people use Seeing AI every day for such mundane yet important tasks as checking the expiration date on a package of food, distinguishing a twenty-dollar bill from a single, and picking clothes whose colors harmonize rather than clash. Lookout offers similar support to users of Android phones.
Voice tech can play a major role in liberating and empowering individuals who have too long been excluded from mainstream society and its economic, educational, and cultural life. Whatever kind of organization you may help to lead, and no matter what kind of products or services you provide, you can probably use voice tech to make breakthroughs on behalf of a broader population of customers, as well as to cast a wider net when seeking candidates for employment. Integrating these hundreds of millions of people into the world economy will create tremendous opportunities for businesses eager to serve them.