David Pogue, September 2, 2013, NYT:
The usefulness of speech recognition on smartphones depends on how it's used, says David Pogue
“Your review was the dumbest thing I’ve ever read. It strains me to avoid profanity in describing how stupid you sound.” That’s the kind of email that brightened my day after I reviewed Google’s Moto X phone a few weeks ago.
My correspondents seemed especially unhappy with one sentence in that review: “Android’s voice commands are still no match for Siri.”
Man, I really was stupid. Who’d be dumb enough to take sides in a religious war? I’d have been better off writing, “Conservatives are better-looking than liberals” or “Pro-life people are worse drivers than pro-choice.”
But the superiority of cellphone speech-recognition technology is not an idle question. Once touch screens became the future of phones, voice recognition became desperately important. Without physical keys or buttons, entering text and manipulating software controls are fussy, multi-step procedures.
So I’ve just spent two weeks immersed in voice recognition. I carried an iPhone and a phone running Google’s Android operating system with me everywhere. I spoke to both phones simultaneously. I wanted to get to know the differences, the strengths, the weaknesses.
When people talk about speech recognition, they mean, and often confuse, three different functions. There’s dictation, where the phone converts speech to text; commands, where you operate the phone by talking; and Internet information searches. There are vast differences among the successes of the three.
Dictation, for example, is still fairly poor on both systems. Both Android phones and Siri, the iPhone’s speech feature, make many transcription errors. When you hear people bashing cellphone transcription, declaring, “I gave up on it,” they’re usually referring to dictation.
That’s forgivable, but come on. You’re asking your phone to understand varying accents at varying distances from its microphone, in rooms with varying background noise. It’s a wonder this feature works at all.
The latest Android version doesn’t require an Internet connection to do basic dictation. And in Android, the words appear on the screen as you utter them; Siri doesn’t transcribe until you stop talking.
On the other hand, Siri understands formatting controls like “capital,” “all caps” and “no space,” as well as all kinds of punctuation - “colon,” “dash,” “asterisk,” “ellipsis” and so on. Android understands only the basic symbols, like “period,” “comma” and “exclamation point.”
The second category, phone-control commands, is far more successful for far more people. This is when you say: “Call Mom,” “Text Emily,” “Wake me at 7:30,” “Play some Billy Joel,” “Remind me to feed the cat when I get home,” and so on.
Controlling your phone without touching it is important for safety, of course. If you must interact with your phone while driving, speaking to it certainly seems safer than looking at it.
But don’t forget the convenience factor. It’s much faster to say, “Open Angry Birds” than to flip through home screens full of icons. And “Set my alarm for 8 a.m.” is about 375 finger-taps quicker than using the clock app.
Here, Siri has the edge. As you’re driving along, for example, and you hear the incoming message sound, you can say, “Read my new messages,” and Siri reads them aloud. It even invites you to dictate a reply, without ever taking your eyes off the road. Android can’t do that.
Both systems can tap into some of the phone’s own apps. They recognise commands like “Make a meeting with Bob Barnett Thursday at noon” (a calendar interaction), “Make a note to pay back Harold” (notes), “Send an email to Danny Cooper” (mail) and “What’s Steve Alper’s home address?” (contacts).
Android blows away iOS, though, in Web searches. Both kinds of phones do an amazing job fetching weather updates. But Google’s bread and butter is Web searches, so Android responses are generally much, much faster.
Android is especially amazing at dialling places without having to look them up (“Call the Macy’s on 34th Street”) and directions (“Get me to La Guardia Airport by public transportation”), since its Map app is so unbelievably good.
Unfortunately, Android has an Achilles’ heel - actually, more like Achilles’ entire leg. To issue spoken commands, you have to tap the microphone icon on the Google search bar. And it’s only on the home screen or the Google Now screen (swipe up from the bottom).
So you can’t speak commands when your phone is locked, or when you’re in another app.
On the iPhone, you hold down the Home button or the clicker on your earbuds cord, so the voice command feature works when the phone is asleep or in any app.
In other words, to use an Android phone’s speech features, you frequently have to pick it up, and you always have to look at it, which defeats much of the purpose. The exception: Motorola’s new phones, like the Moto X, can be set to listen all the time.
Siri is better with restaurants and movies, too. Both phones understand, “Good Indian restaurants around here” or “Call the Olive Garden on Daleford Road.” But Siri can also book reservations, thanks to integration with OpenTable.com.
Similarly, Siri provides attractive, consolidated answer screens for, “What movies are opening this week?” “Give me the reviews for 'The Way, Way Back,'” or “What are today’s showtimes for 'The Smurfs 2’?” Android just shows you Google search results.
And then there’s the issue of personality: Siri has it, Android doesn’t.
We’re talking about wisecracks, jokes, attitude, addressing you by name. If you ask Siri, “Who’s your daddy?”, she replies: “You are. Can we get back to work now?” Say, “Beam me up, Siri,” and she says: “Please remove your belt, shoes and jacket, and empty your pockets.” Say, “Talk dirty to me,” and she replies, “Humus. Compost. Pumice. Silt. Gravel.”
Now, on the great battlefield of the Apple-Google fanboy war, humour is small potatoes. Apple haters practically claw their eyes out when you mention Siri’s personality. “It’s not useful! It’s a parlour trick! It strains me to avoid profanity in describing how stupid you sound!”
And that’s fine. That’s why there’s choice: two camps in this philosophical school. (Well, there’s also Windows Phone and BlackBerry, but their speech recognition is extremely rudimentary.)
And so: Put down your swords, fanboys. Both systems are exceedingly useful, once you spend the time to learn them. (Here’s a site with a good list of Android voice commands: j.mp/12kEFDo. And here’s one for Siri: j.mp/16Yy4yy.)
Though Siri has the edge, the gap has closed substantially, and both systems are rapidly improving. For example, until recently Android had no phone-control features at all - only Web searches. And in this fall’s iOS 7 update, Siri will gain a more pleasant speaking voice, faster searches and the ability to change settings by voice (“Turn on Airplane Mode,” “Turn up the brightness,” “Turn on Bluetooth”) - something neither phone can do now.
This much is clear: Cellphone speech recognition is getting better fast. Very soon, we’ll do less talking through our phones - and more talking to them.