Uploaded by spectacularscarecrow on 17 Nov 2013 (category: Artificial Intelligence and Robotics)

Spoken Dialogue Technology

Achievements and Challenges

Michael McTear

University of Ulster

Overview


Introduction: What is a spoken dialogue system?


Examples of spoken dialogue systems


Technical issues and challenges


Future Prospects

What is a spoken dialogue system?

A spoken dialogue system is an automated system that engages in a dialogue with a human user using spoken language as the medium of interaction.

Types of dialogue system

Two main types of spoken dialogue system:

Task-oriented: involves the use of dialogues to accomplish a task, e.g. making a hotel booking or planning a family holiday

Non-task-oriented: engaging in conversational interaction, but without necessarily being involved in a task that needs to be accomplished, e.g. a conversational companion for the elderly

Application Domains for SDS


Telephone-based services and transactions

Call routing, directory assistance, travel enquiries, bank balance, bank transactions, flight / hotel / car rental reservations

In-car interactive and entertainment systems

Automated troubleshooting

Smart home applications

Healthcare systems, e.g. patient monitoring

Educational, e.g. Intelligent Tutoring Systems, Foreign Language Learning


Computer games

Three generations of task-oriented spoken dialogue system

Informational: to retrieve information, e.g. flight times, football scores, …

Transactional: to assist the user to perform a transaction, e.g. book a flight, pay a bill

Problem-solving: to support the user in solving a problem, e.g. to troubleshoot a PC that is not working

Why is dialogue interesting?


Fundamental aspect of human behaviour


Model human conversational competence


Simulate human conversational behaviour


Provide tools for interacting with data, services and resources on computers

Research challenges

Applications in assistive and educational environments


Commercial opportunities

Commercial Systems


Focus on


Business opportunities, return on investment (ROI)


Benefits for end users


Benefits for providers


Human factors: performance, usability


Tools and languages for design and maintainability


Application areas: call centre, enquiries, transactions, healthcare, …

Academic Systems


Focus on

Technologies: speech recognition, spoken language understanding, dialogue management

AI-inspired: planning, reasoning, machine learning

Statistical vs. symbolic approaches

Advanced dialogue control, error handling, adaptivity, context representation


Example 1: Voice Menu

System: Hello and welcome …. Main menu. For customer service, say ‘service’. To enquire about an existing order, say ‘order’ …

User: Service

System: Customer service. Would you like to report a fault or enquire about an extended warranty?

User: Fault

System: Do you have a PC or a laptop?

User: Laptop

System: And the name of the manufacturer?

User: Sony

System: Thank you. Please hold while I transfer you to the Sony …

http://www.speechstorm.com/
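
A voice menu of this kind can be read as a finite-state machine: each state has a prompt and a small set of keywords that move the caller to the next state. The sketch below is only an illustration of that idea, not the SpeechStorm implementation. The state names, the `MENU` table and the re-prompt behaviour are assumptions, and the prompts are abbreviated from the dialogue above.

```python
from typing import Dict, List, Tuple

# Hypothetical menu graph: state -> (prompt, {recognised keyword: next state}).
MENU: Dict[str, Tuple[str, Dict[str, str]]] = {
    "main": ("Main menu. For customer service, say 'service'. "
             "To enquire about an existing order, say 'order'.",
             {"service": "service", "order": "order"}),
    "service": ("Customer service. Would you like to report a fault "
                "or enquire about an extended warranty?",
                {"fault": "device", "warranty": "warranty"}),
    "device": ("Do you have a PC or a laptop?",
               {"pc": "manufacturer", "laptop": "manufacturer"}),
    # The manufacturer name is free-form here, so no keyword transitions.
    "manufacturer": ("And the name of the manufacturer?", {}),
}


def next_state(state: str, utterance: str) -> str:
    """Follow a keyword transition, or stay in the same state to re-prompt."""
    _, transitions = MENU[state]
    return transitions.get(utterance.strip().lower(), state)


def run(utterances: List[str], state: str = "main") -> List[str]:
    """Feed user utterances through the menu and return the visited states."""
    visited = [state]
    for utterance in utterances:
        if state not in MENU:  # reached a branch this sketch does not model
            break
        state = next_state(state, utterance)
        visited.append(state)
    return visited
```

Calling `run(["Service", "Fault", "Laptop"])` walks the same path as the example dialogue; an unrecognised word leaves the caller in the current state.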

Example 2: Research System (Mercury: MIT)

Open-ended prompt

How may I help you?

Disfluencies in input

August twenty-first no August twelfth

I'd like to fly from Boston to Minneapolis on Tuesday no Wednesday November 21st

Inexact response

Prompt: Can you provide the approximate departure time or airline preference

User: Yeah I'd like to fly United and I'd like to leave in the afternoon

http://groups.csail.mit.edu/sls/research/mercury.shtml

Example 2: continued


Response generation

There are more than 3 flights. The earliest departure leaves at 1.45 pm.

Mixed initiative: user asks question

Do you have something leaving around 4.45?

Relative date reference

I’d like to return the following Tuesday
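
Resolving a relative date reference means anchoring it to a date already in the dialogue context. The sketch below is not Mercury's method. It assumes that "the following Tuesday" means the first Tuesday strictly after the outbound date, and it assumes the year 2001 for the outbound date only so that 21 November falls on a Wednesday as in the example.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def following_weekday(anchor: date, weekday_name: str) -> date:
    """Return the first date strictly after `anchor` that falls on `weekday_name`."""
    target = WEEKDAYS.index(weekday_name.lower())
    # Offset is always between 1 and 7, so the anchor day itself is never returned.
    days_ahead = (target - anchor.weekday() - 1) % 7 + 1
    return anchor + timedelta(days=days_ahead)


# Assumed year: 21 November 2001 was a Wednesday.
outbound = date(2001, 11, 21)
return_date = following_weekday(outbound, "Tuesday")
```

Under these assumptions `return_date` is 27 November 2001.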

Example 3: Voice Search (GOOG411)

GOOG-411 (or Google Voice Local Search) is Google's new 411 service.

With GOOG-411, you can find local business information completely free, directly from your phone.

You can access 1-800-GOOG-411 from any phone, anywhere, at any time.

http://www.google.com/goog411/

GOOG411: Prompts

What city and state?

What business name or category?

(Lists services) Number one, …..


Connects to requested service


GOOG411: What can you say?

At any point in the call:

To go back say "go back"

To start over say "start over" or press * (all phones)

When asked for a city and state:

Say the full name, for example, "Palo Alto California"

To enter a zip code say it or enter it with the keypad

When asked for business name or category:

Say the full name, for example, "Joe's Pizzaria" or "Pizza"

When given results:

To navigate between results say or press the listing number

To receive an SMS say "text message"

To receive a map say "map it"

To get more details say "details"


Architecture of a spoken dialogue system

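[Figure: architecture of a spoken dialogue system. Components: Speech Recognition (ASR), with an HMM acoustic model and an N-gram language model; Spoken Language Understanding (SLU); Dialogue Manager (DM), with Dialogue Control and a Dialogue Context Model; Back end; Response Generation; Text to Speech Synthesis (TTS). Data flow: a --> x_u (audio) into ASR; y_u, c (words) from ASR into SLU; ã, c (concepts) from SLU into the DM. Representation levels: Concepts, Words, Audio.]

Notation:

a: user dialogue act (intended)

ã: user dialogue act (interpreted)

c: confidence

x_u: user acoustic signal

y_u: speech recognition hypothesis (words)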
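
The data flow in the figure can be sketched as a chain of functions, one per component. This is only an illustrative toy under stated assumptions: ASR and TTS are left out (the input is already a recognition hypothesis y_u with confidence c), and the class names, the `CITIES` set, the slot name `destination` and the templates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

CITIES = {"boston", "minneapolis"}  # hypothetical vocabulary for the toy SLU


@dataclass
class Hypothesis:
    words: str         # y_u: recognised word string
    confidence: float  # c: recogniser confidence in [0, 1]


@dataclass
class DialogueAct:
    act_type: str          # e.g. "inform"
    slots: Dict[str, str]  # interpreted concepts (the content of ã)
    confidence: float      # c, carried forward from the recogniser


def understand(hyp: Hypothesis) -> DialogueAct:
    """Toy SLU: a known city directly after the word 'to' is the destination."""
    tokens = hyp.words.lower().split()
    slots: Dict[str, str] = {}
    for previous, token in zip(tokens, tokens[1:]):
        if previous == "to" and token in CITIES:
            slots["destination"] = token.title()
    return DialogueAct("inform", slots, hyp.confidence)


def manage(act: DialogueAct, context: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Toy DM: update the context model, then choose the next system act."""
    context = {**context, **act.slots}
    if "destination" not in context:
        return "request_destination", context
    return "confirm_destination", context


def generate(system_act: str, context: Dict[str, str]) -> str:
    """Toy response generation: fill a template for the chosen system act."""
    templates = {
        "request_destination": "Where would you like to fly to?",
        "confirm_destination": "Flying to {destination}, is that right?",
    }
    return templates[system_act].format(**context)


def turn(hyp: Hypothesis, context: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """One system turn: SLU, then DM, then response generation."""
    act = understand(hyp)
    system_act, context = manage(act, context)
    return generate(system_act, context), context
```

For example, `turn(Hypothesis("I'd like to fly from Boston to Minneapolis", 0.8), {})` yields a confirmation prompt and a context holding the destination.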

Component Technologies


Automatic Speech Recognition (ASR)


Spoken Language Understanding (SLU)


Response Generation (RG)


Text to speech synthesis (TTS)


Dialogue Management (DM)

Issues in ASR for Dialogue


recognising spontaneous speech in noisy environments

word accuracy does not have to be 100%

use of confidence scores in combination with other information to determine DM actions

use of additional information (ASR and parse probabilities, semantic and contextual features) to re-score recognition hypotheses
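
The last two points can be illustrated with a small sketch. It is not taken from any particular recogniser. The thresholds, the action names, the interpolation weight and the use of "words expected in the current dialogue context" as the contextual feature are all assumptions chosen for illustration.

```python
from typing import List, Set, Tuple


def choose_action(confidence: float, accept: float = 0.8, reject: float = 0.3) -> str:
    """Map a recognition confidence score to a dialogue manager action."""
    if confidence >= accept:
        return "accept"   # proceed without confirmation
    if confidence >= reject:
        return "confirm"  # ask the user to confirm the hypothesis
    return "reject"       # ask the user to repeat


def rescore(nbest: List[Tuple[str, float]],
            expected_words: Set[str],
            weight: float = 0.5) -> List[Tuple[str, float]]:
    """Re-rank (words, asr_score) pairs using a simple contextual feature.

    The contextual feature is the fraction of tokens that the dialogue
    context currently expects; it is interpolated with the ASR score.
    """
    def combined(item: Tuple[str, float]) -> float:
        words, asr_score = item
        tokens = words.lower().split()
        overlap = sum(t in expected_words for t in tokens) / max(len(tokens), 1)
        return (1 - weight) * asr_score + weight * overlap

    return sorted(nbest, key=combined, reverse=True)
```

With `nbest = [("austin", 0.55), ("boston", 0.50)]` and `expected_words = {"boston", "minneapolis"}`, the contextual feature moves "boston" to the top even though its ASR score is lower.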

Issues in SLU for Dialogue


grammars and parsers for spontaneous speech (disfluencies, errors)

robust understanding

problems with hand-crafted approaches

use of statistical / data-driven methods

combined approaches, e.g. TINA (MIT)

hand-crafted rules with trained probabilities

robust strategy: if the full sentence cannot be parsed, parse and combine fragments, else use word spotting
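
The robust strategy above is a three-level fallback. The sketch below shows only the control flow and is not TINA: regular expressions stand in for a real grammar, and the patterns, slot names and keyword table are hypothetical.

```python
import re
from typing import Dict, Tuple

CITY = r"(boston|minneapolis)"  # hypothetical city vocabulary

# Level 1: a pattern that must cover the whole sentence.
FULL = re.compile(rf"^i'd like to fly from {CITY} to {CITY} on (\w+)$")

# Level 2: phrase-level patterns that can match anywhere in the utterance.
FRAGMENTS = {
    "origin": re.compile(rf"\bfrom {CITY}\b"),
    "destination": re.compile(rf"\bto {CITY}\b"),
    "day": re.compile(r"\bon (\w+)"),
}

# Level 3: isolated keywords and the concept each one signals.
KEYWORDS = {"boston": "city", "minneapolis": "city", "united": "airline"}


def understand(utterance: str) -> Tuple[str, Dict[str, str]]:
    """Return the level that succeeded and the concepts it extracted."""
    text = utterance.lower().strip()

    match = FULL.match(text)
    if match:
        return "full", dict(zip(("origin", "destination", "day"), match.groups()))

    frame: Dict[str, str] = {}
    for slot, pattern in FRAGMENTS.items():
        found = pattern.search(text)
        if found:
            frame[slot] = found.group(1)
    if frame:
        return "fragments", frame

    spotted = {word: KEYWORDS[word] for word in text.split() if word in KEYWORDS}
    return "spotting", spotted
```

A well-formed request is handled by the full pattern, "uh from Boston um on Wednesday please" falls through to fragment combination, and "yeah United" is caught only by word spotting.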

Issues in Response Generation for Dialogue

Content selection: determining what to say, selecting and ranking options

Discourse planning

discourse relations, e.g. comparison, contrast

user-adapted information

Presentation ordering

Referring expression generation

Aggregation: grouping propositions into clauses and sentences

Use of discourse cues (e.g. firstly, finally, however, moreover, …)
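
Three of these steps (content selection, aggregation and discourse cues) can be shown together in a small template-based sketch. It is only one simple way to do it. The flight records, the airline name "Example Air" and the choice of cue words per position are assumptions, and departure times are assumed to be zero-padded 24-hour strings so that they sort correctly as text.

```python
from collections import defaultdict
from typing import Dict, List


def join_items(items: List[str]) -> str:
    """Join items as 'a', 'a and b' or 'a, b and c'."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def generate(flights: List[Dict[str, str]], max_options: int = 3) -> str:
    """Turn a list of flight records into a short spoken-style response."""
    # Content selection: rank by departure time and keep the earliest options.
    selected = sorted(flights, key=lambda f: f["departs"])[:max_options]

    # Aggregation: group the propositions about each airline into one clause.
    by_airline: Dict[str, List[str]] = defaultdict(list)
    for flight in selected:
        by_airline[flight["airline"]].append(flight["departs"])
    clauses = [f"{airline} departs at {join_items(times)}"
               for airline, times in by_airline.items()]

    # Discourse cues: mark the first, middle and last clauses.
    if len(clauses) == 1:
        return clauses[0] + "."
    cues = ["Firstly"] + ["Moreover"] * (len(clauses) - 2) + ["Finally"]
    return " ".join(f"{cue}, {clause}." for cue, clause in zip(cues, clauses))
```

Given two United flights and one flight from the hypothetical Example Air, the two United propositions are merged into a single clause before the cues are added.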

Issues in Dialogue Management


Dialogue Control


Scripts, frames, intelligent agents


Representations


Information State Theory


Error handling


Dialogue design


Traditional approaches


Statistical approaches


Reinforcement learning


Corpus / example based approaches
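
Of the dialogue control options listed, frame-based control is the easiest to show in a few lines: the system keeps a frame of slots and asks for whichever slot is still empty. The sketch below is a minimal illustration under assumptions; the slot names and prompts are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical slots and the prompt used to request each one, in asking order.
PROMPTS = {
    "origin": "Where are you flying from?",
    "destination": "Where are you flying to?",
    "date": "What date would you like to travel?",
}


def update(frame: Dict[str, str], new_values: Dict[str, str]) -> Dict[str, str]:
    """Merge newly understood values into the frame.

    A single user turn may fill several slots at once, which is what lets a
    frame-based system accept more than the slot it just asked for.
    """
    merged = dict(frame)
    merged.update({slot: value for slot, value in new_values.items()
                   if slot in PROMPTS and value})
    return merged


def next_prompt(frame: Dict[str, str]) -> Optional[str]:
    """Return the prompt for the first unfilled slot, or None when complete."""
    for slot, prompt in PROMPTS.items():
        if not frame.get(slot):
            return prompt
    return None
```

If the user supplies origin and destination in one turn, `next_prompt` skips straight to the date; a script-based controller would instead ask each question in a fixed order.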


A vision for the future

Develop systems that can interact intelligently and co-operatively across a range of environments using a range of appropriate modalities to support people in the activities of their daily lives.

Fundamental research topics


Modelling human conversational competence


Dialogue-related issues for ASR, SLU, NLG, TTS

Comparison of methods for dialogue management: rule-based vs. stochastic

Representation and use of contextual information

Integration and usage of modalities to complement and supplement speech


Incremental processing in dialogue

Areas of application


Voice search


Dialogue in vehicles


Mobile speech applications


Multimodal embodied and situated systems


Troubleshooting applications


Dialogue systems for ambient intelligence and as assistive technologies

Concluding remarks

Spoken Dialogue Technology


embraces a range of speech and language technologies

poses many theoretical as well as practical challenges

is interesting for commercial developers as well as academic researchers

has a wide range of potential applications

Recommended reading

McTear, M. (2004) Spoken Dialogue Technology. Springer.

Lopez Cozar, R. & Araki, M. (2005) Spoken, multilingual and multimodal dialogue systems. John Wiley & Sons.

Aghajan, H., Augusto, J.C., Lopez Cozar, R. (2009) Human-Centric Interfaces for Ambient Intelligence. Elsevier.

Jokinen, K. & McTear, M. (2010) Spoken Dialogue Systems. Morgan & Claypool Publishers.

Wilks, Y. (ed.) (2010) Close Engagements with Artificial Companions: Key social, psychological, ethical and design issues. John Benjamins Publishing Company.

Thank you

Questions?