June 1, 2012
Talking With Machines

Smithsonian.com
Posted By: Randy Rieland — Communication, In the News, Neuroscience, Personal


Siri is just the beginning of voice recognition. Photo courtesy of Flickr user AcidZero
Voice recognition software, most of us would probably agree, is a pretty cool thing. But the talking to
machines part–be it smartphone, TV screen or dashboard–well, not so much. Asking advice of a device?
Reeks of geek. Enunciating each word so you can be understood? How cool can you really be?

But Apple, true to form, has taken this head on by hiring three icons of cool to star in their latest ad campaign for Siri, the voice of the iPhone 4S. There’s Zooey Deschanel (Adorable Cool) and John Malkovich (Cerebral Cool) and Samuel L. Jackson (Ultimate Cool), and all make engaging in wordplay with a phone seem the sport of gods.

Critics, nonetheless, point out that in real life, Siri is neither as responsive nor all-knowing as she’s portrayed in commercials. You, too, I’m sure, are shocked to hear this. Others see the whole thing as ripe for parody–see Zooey’s brother Jooey do a Funny or Die version of Zooey’s and Siri’s rainy day together.

No matter. Siri has become a lead singer in the robot chorus, the “You’ve Got Mail” voice of a new generation.

It is fashionable in some circles to suggest that Siri isn’t Steve Jobs-worthy, that if he were still alive, Jobs would have pulled it off the market or, at the very least, never would have approved such a high-profile ad campaign for so flawed a product.

But as Jobs’ successor, Tim Cook, said earlier this week, iPhone 4S owners like Siri. According to a survey released in March, almost 90 percent say they use it at least once a month. And keep in mind that Siri, one of the very few Apple products said to be in beta when it was released, won’t celebrate her first birthday until October. She’s still learning language and, even more importantly, just beginning to tap the potential of artificial intelligence.

Siri will likely be a centerpiece of Apple TV, expected to make its debut in December. But chances are, the place where talking to machines will go mainstream is in our cars.

Drive, she said
Sure, that’s already happening, but you still have to switch to robot speak if you want to be understood. And even then there’s no guarantee. That will start to change this summer when some new models will come equipped with something called Dragon Drive!


It’s the invention of Nuance Communications, a Massachusetts-based company that’s become a
powerhouse in the voice recognition business. (It’s widely believed to be the brains behind Siri.) Nuance
and voice recognition in cars took a big leap forward last week when the firm announced that Dragon
Drive! will be able to tap into the cloud.

What this means is that the system will dramatically ramp up its computing power and memory capability. And that means that the voice in your dashboard will become more Siri-like and allow you to actually converse with it. No more monosyllabic shouting. The day is coming when you’ll be able to casually mention that you feel like some Allman Brothers and seconds later “Whipping Post” will come pumping through the speakers.

The key is how well we’re able to teach machines context and pragmatics–how language is used in social situations. And that’s tricky business. For starters, even the most sophisticated voice recognition device needs to wait for a human to finish speaking so it’s able to parse and interpret the whole sentence. Then there’s the “theory of mind,” the ability to understand that other people can have different beliefs and intentions than our own. As far as we know, only humans can do this.

A recent study by two Stanford psychologists can give you a sense of what’s involved in helping machines intuit. Researchers Michael Frank and Noah Goodman set up an online experiment in which participants were asked to look at a set of objects and then select which one was being referred to by a particular word.
For instance, one group of participants saw a blue square, a blue circle and a red square. The question for
that group was: Imagine you are talking to someone and you want to refer to the middle object. Which
word would you use, “blue” or “circle”?

The other group was asked: Imagine someone is talking to you and uses the word “blue” to refer to one of these objects. Which object are they talking about?

The responses helped the researchers get a clearer picture of how a listener understands a speaker and how a speaker decides what to say. From that, they developed the kind of mathematical model that can expand and refine a computer’s thought process.

Said Frank: “It will take years of work but the dream is of a computer that really is thinking about what you want and what you mean rather than just what you said.”
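For a rough sense of how a model like that reasons, here is a minimal sketch in Python of pragmatic inference over the experiment’s three objects. It is an illustration only, not Frank and Goodman’s actual model: the word list, the uniform priors and the simple informativeness scoring are assumptions made for the example.

```python
# A minimal, illustrative sketch of pragmatic inference over the article's
# example (a blue square, a blue circle and a red square). The word list,
# the uniform priors and the scoring rule are assumptions for illustration,
# not the researchers' actual model.

OBJECTS = ["blue square", "blue circle", "red square"]
WORDS = ["blue", "circle", "square", "red"]

def applies(word, obj):
    """Literal semantics: does the word truthfully describe the object?"""
    return word in obj

def literal_listener(word):
    """P(object | word) if the word is taken purely at face value."""
    consistent = [o for o in OBJECTS if applies(word, o)]
    return {o: (1 / len(consistent) if o in consistent else 0.0) for o in OBJECTS}

def speaker(obj):
    """P(word | object): the speaker favors words that best pick out the
    object for a literal listener (more informative words get more weight)."""
    scores = {w: literal_listener(w)[obj] for w in WORDS}
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}

def pragmatic_listener(word):
    """P(object | word): reason about which object a speaker would most
    plausibly have chosen this word to describe."""
    scores = {o: speaker(o)[word] for o in OBJECTS}
    total = sum(scores.values())
    return {o: s / total for o, s in scores.items()}

if __name__ == "__main__":
    print(pragmatic_listener("blue"))
    # -> roughly {'blue square': 0.6, 'blue circle': 0.4, 'red square': 0.0}
```

Run on the word “blue,” the sketch leans toward the blue square, on the reasoning that a speaker who meant the circle would more likely have just said “circle.”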
A manner of speech
Here are some more recent developments in voice recognition:
• Siri goes silent: IBM tends to be real nervous about corporate secrets getting out, so it now forbids its employees from using public file transfer sites, such as Dropbox. But it also has a ban on the use of Siri in the office because security execs worry that someone, while talking to their phone, could reveal sensitive info that ends up on Apple’s servers.
• Take that, Apple!: Samsung launched its new Galaxy S III smartphone in London this week, and while its big touchscreen is getting a lot of attention, it also features new voice and face recognition software.
• Do what I say, not what I do: And Samsung’s not stopping there. It recently filed a patent application for a robot that understands human speech. The robot would be able to adjust its “listening” capabilities to take into account ambient noise that might interrupt or disrupt commands it’s been given. It would also be able to recognize who’s speaking to it, even if the background noise is very loud.