You’ve just squeezed the last drop of shampoo from the bottle, the toothpaste tube is rolled
as tightly as possible and your razor just completed its last shave. In the not-too-distant future,
you’ll step out of the shower and, like Harry Potter without a wand, speak your wishes to the air:
“Vox, send me shampoo, toothpaste and razors.” And, lo, the universe will provide.
Except, of course, there’s no magic involved. “Vox” is what Forrester Research is calling the ubiquitous
voice interface it predicts we’ll soon use to interact with our digital devices. Rather than being tied to
a single device, like Siri on an iPhone, the interface will be a hardware-independent, voice-powered
platform, says analyst James McQuivey, fed by a microphone in every room of the home. And unlike
Siri and similar apps, he says, the voice-control platforms of tomorrow will always be listening,
making computing as simple as opening your mouth.
So far, to anyone who has used what passes for voice control these days, this vision feels far away.
Siri is a disappointment, and while speech-to-search in Google’s apps is more responsive and
smarter, it’s still more of a novelty than a true tool. What’s interesting about McQuivey’s take is his
argument that Amazon will drive both adoption and improvement of voice control by creating one
dead-simple, potentially addictive use for the spoken word: shopping.
The Forrester report clearly anticipates the release of Dash, Amazon’s startling new gadget, which combines
a barcode scanner and microphone in a dangerously easy-to-use device for shopping. Tied to
Amazon’s same-day delivery grocery service, Dash basically turns your entire house into a showroom.
Just scan a product to order it. But Dash’s voice option could be its real killer app: Tell it what you
want and Dash will put the item in your cart. In McQuivey’s vision, devices like Dash will give way to
networked mics throughout a home that will liberate voice control from any specific piece of hardware.
These mics, he says, would cost no more than $25 apiece and use off-the-shelf parts already widely
available: a microphone connected to an analog-to-digital converter, a Bluetooth radio, and a rechargeable
battery.
Always present and always on, the voice layer will come to offer much more than spoken-word
shopping. It will listen in, McQuivey says, and then funnel whatever information you request–when
is a bill due? check me into my hotel–to the relevant device, whether phone, watch, or laptop. It will
come to learn different family members’ voices and can send you a text, for instance, when your kid
comes home from school, or doesn’t. The least plausible flight of fancy in the report is the prediction
that a voice-control layer that’s always listening will come to know you better than you know yourself
and offer you custom self-help advice as a result.
Whatever the eventual applications may look like, the case for Amazon making money goes back to
retail. Shopping, McQuivey argues, will train people to turn to voice rather than screens because it
offers a compelling, accessible use that goes beyond not having to unlock your phone. Because
Amazon’s use case for voice is shopping first, it makes money as you use it–a claim neither Siri
nor Google Now can yet make. In the future, he says, an always-on version of the voice layer will
help Amazon profit even more by analyzing your everyday conversations and anticipating your wants.
McQuivey predicts that within a decade computers will be able to more or less understand the semantics
of human speech. But he doesn’t think we’ll have to wait that long for the voice-control layer to become
a significant part of our digital lives. Voice, he says, will score high on what he calls the three key
measures of future digital platforms’ success: frequency of use, emotional connection, and convenience.