Situated Dialogue Processing for Human-Robot Interaction - DFKI


Talking Robots
Language Technology, DFKI
Situated Dialogue Processing for
Human-Robot Interaction
Dr.ir. Geert-Jan M. Kruijff
Talking Robots
@the Language Technology Lab
DFKI GmbH, Saarbrücken
http://talkingrobots.dfki.de
Wednesday, June 19, 13
© 2013 Geert-Jan Kruijff
Robots
assist people
challenge
Bridge − Understand − Help
World − Other − Self
World
Understanding
Self
[Figure: dialogue system architecture. Components: Situated Beliefs & Intentions; External Processes; Dialogue Understanding; Dialogue Management; Dialogue Production; arrows labeled "in" and "out".]
Bridge = Mediation
Ontology-based mediation
Uncertainty
Reasoning with incompleteness
Structural uncertainty
Clarification
Information Fusion For Visual Reference Resolution In Dynamic Situated Dialogue

Geert-Jan M. Kruijff (1), John D. Kelleher (2), and Nick Hawes (3)

(1) Language Technology Lab, DFKI GmbH, gj@dfki.de, WWW home page: http://www.dfki.de/~gj
(2) Dublin Institute of Technology, John.Kelleher@comp.dit.ie, WWW home page: www.computing.dcu.ie/jkelleher/
(3) School of Computer Science, University of Birmingham, n.a.hawes@cs.bham.ac.uk, WWW home page: http://www.cs.bham.ac.uk/~nah

Abstract. Human-Robot Interaction (HRI) invariably involves dialogue about objects in the environment in which the agents are situated. The paper focuses on the issue of resolving discourse references to such visual objects. The paper addresses the problem using strategies for intra-modal fusion (identifying that different occurrences concern the same object), and inter-modal fusion (relating object references across different modalities). Core to these strategies are sensorimotoric coordination, and ontology-based mediation between content in different modalities. The approach has been fully implemented, and is illustrated with several working examples.

1 Introduction

The context of this work is the development of dialog systems for human-robot collaboration. The framework presented in this paper addresses a particular aspect of situated dialog, namely reference resolution. Reference resolution in situated dialog is a particular instance of the anchoring problem [Coradeschi and Saffiotti, 2003]: how can an artificial system create and maintain correspondences between the symbols and sensor data that refer to the same physical object?

In a dialog, human participants expect their partner to construct and maintain a model of the evolving linguistic context. Each referring expression used in the dialog introduces a representation into the semantics of its utterance. This representation must be bound to an element in the context model in order for the utterance's semantics to be fully resolved. Referring expressions that access a representation in the context are called anaphoric. In a situated dialog, human participants expect their partner to not only construct and maintain a model of the linguistic discourse, but also to have full perceptual knowledge of the environment. This introduces a form of reference, called exophoric reference. Exophoric references denote objects that have entered the dialog context through a non-linguistic modality (such as vision), but have not been previously evoked into the context. Consequently, for a robot to participate in a situated dialog,
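
The distinction between anaphoric and exophoric reference drawn in the excerpt can be illustrated with a small sketch. It is not the paper's implementation: the `ContextModel` class, its fields, and the rule of trying the discourse context before the visual context are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ContextModel:
    """Hypothetical context model: discourse referents plus perceived objects."""
    # (object id, category) pairs already evoked in the dialog, oldest first
    discourse: list = field(default_factory=list)
    # object id -> category, as delivered by a non-linguistic modality (e.g. vision)
    visual: dict = field(default_factory=dict)

    def resolve(self, category):
        """Resolve a referring expression of the given category.

        Returns a (kind, object_id) pair, where kind is "anaphoric",
        "exophoric", or "unresolved".
        """
        # Anaphoric: the referent is already part of the linguistic context.
        for obj_id, obj_category in reversed(self.discourse):
            if obj_category == category:
                return ("anaphoric", obj_id)
        # Exophoric: the referent is perceived but has not been mentioned yet.
        candidates = [o for o, c in self.visual.items() if c == category]
        if len(candidates) == 1:
            self.discourse.append((candidates[0], category))
            return ("exophoric", candidates[0])
        # Zero or several candidates: leave the reference unresolved.
        return ("unresolved", None)
```

In this sketch the first mention of a visible box resolves exophorically and evokes it into the discourse, so a second mention of the same box resolves anaphorically.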
Mapping
Linguistic meaning: @{i1:object}(box)
Conceptual meaning: concept(box) & instance(I1, box) & i1 [relation symbol missing] I1
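
As a minimal sketch of this mapping, the function below turns a linguistic form such as @{i1:object}(box) into the three conceptual facts shown on the slide. The function name `to_conceptual`, the dictionary layout, and the convention of upper-casing the index to name the instance (i1 to I1) are assumptions for illustration. The relation between i1 and I1 is stored as a plain pair because its symbol is not recoverable from the slide.

```python
import re

# Pattern for forms like "@{i1:object}(box)": index, sort, proposition.
_FORM = re.compile(r"@\{(\w+):(\w+)\}\(([\w-]+)\)")


def to_conceptual(linguistic):
    """Map a linguistic meaning representation to conceptual facts (sketch)."""
    match = _FORM.fullmatch(linguistic.strip())
    if match is None:
        raise ValueError(f"not a recognised linguistic form: {linguistic!r}")
    index, sort, proposition = match.groups()
    instance = index.upper()  # assumed naming convention: i1 -> I1
    return {
        "sort": sort,                          # e.g. "object"
        "concept": proposition,                # concept(box)
        "instance": (instance, proposition),   # instance(I1, box)
        "binding": (index, instance),          # i1 related to I1
    }
```

Calling `to_conceptual("@{i1:object}(box)")` yields the concept `box`, the instance pair `("I1", "box")`, and the binding pair `("i1", "I1")`.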
Example: Mapping
[Figure: multi-layered map example. Map layers: feature map, metric map, navigation graph, topological map, conceptual map. Areas: area1, area2. Ontology concepts: Area, Room, Corridor, Kitchen, Office, Object, KitchenObject, Coffeemachine; instance obj1. Relations: is-a, has-a. Knowledge sources: innate, asserted, acquired, inferred.]
International Journal of Advanced Robotic Systems, Vol. x, No. y (200z)
ISSN 1729-8806, pp. first-last (leave this section unchanged)

Situated Dialogue and Spatial Organization: What, Where... and Why?

Geert-Jan M. Kruijff (1); Hendrik Zender (1); Patric Jensfelt (2) & Henrik I. Christensen (2)
(1) Language Technology Lab, German Research Center for Artificial Intelligence (DFKI GmbH), Saarbrücken, Germany
(2) Centre for Autonomous Systems, Royal Institute of Technology (KTH), Stockholm, Sweden
gj@dfki.de

Abstract: The paper presents an HRI architecture for human augmented mapping, which has been implemented and tested on an autonomous mobile robotic platform. Through interaction with a human, the robot can augment its autonomously acquired metric map with qualitative information about locations and objects in the environment. The system implements various interaction strategies observed in independently performed Wizard of Oz studies. The paper discusses an ontology-based approach to multi-layered conceptual spatial mapping that provides a common ground for human-robot dialogue. This is achieved by combining acquired knowledge with innate conceptual commonsense knowledge in order to infer new knowledge. The architecture bridges the gap between the rich semantic representations of the meaning expressed by verbal utterances on the one hand and the robot's internal sensor-based world representation on the other. It is thus possible to establish reference to spatial areas in a situated dialogue between a human and a robot about their environment. The resulting conceptual descriptions represent qualitative knowledge about locations in the environment that can serve as a basis for achieving a notion of situational awareness.

Keywords: Human-Robot Interaction, Conceptual Spatial Mapping, Situated Dialogue

1. Introduction

More and more robots find their way into environments where their primary purpose is to interact with humans to help and solve a variety of service-oriented tasks. Particularly if such a service robot is mobile, it needs to have an understanding of the spatial and functional properties of the environment in which it operates. The problem we address is how a robot can acquire an understanding of the environment so that it can autonomously operate in it, and communicate about it with a human. We present an architecture that provides the robot with this ability through a combination of human-robot interaction and autonomous mapping techniques. It captures various functions that independently performed Wizard of Oz studies have observed to be necessary for such a system. Several case studies have been conducted to test and evaluate the resulting integrated system.

The main issue is how to establish a correspondence between how a human perceives spatial and functional aspects of an environment, and what the robot autonomously learns as a map. Most existing approaches to robot map building, or Simultaneous Localization And Mapping (SLAM), use a metric representation of space. Humans, though, have a more qualitative, topological perspective on spatial organization (McNamara, 1986). We adopt an approach in which we build a multi-layered representation of the environment, combining metric maps and topological graphs (as an abstraction over geometrical information), like (Kuipers, 2000). We extend these representations with conceptual descriptions that capture aspects of spatial and functional organization. The robot obtains these descriptions either through interaction with a human, or through inference combining its own observations ("I see a coffee machine") with ontological knowledge ("Coffee machines are usually found in kitchens, so this is likely to be a kitchen!"). We store objects in the spatial representations, and so associate the functionality of a location with that of the functions of the objects present there.

A core characteristic of our approach is that we analyze each utterance to obtain a representation of the meaning it expresses, and how it (syntactically) conveys that meaning, rather than just doing, for example, keyword spotting. This way, we can properly handle the variety of ways in which people may express assertions, questions, and commands. Furthermore, having a representation of the meaning of the utterance we can combine it with further inferences over ontologies to obtain a complete conceptual description of the location or object being talked about. This way we can ground situated dialogue in the situational awareness of the robot.

Following (Topp & Christensen, 2005) and (Topp et al., 2006), we talk about Human Augmented Mapping (HAM) to indicate the active role that human-robot interaction plays in the robot's acquisition of qualitative spatial knowledge. In §2 we discuss various observations that independently performed Wizard of Oz studies have made on typical interactions for HAM scenarios, and we
[Figure: the robot platform. Labeled parts: pan-tilt unit with stereo-vision camera; SICK laser range finder; balance caster wheel; drive wheels (left/right) with pneumatic tires and wheel encoders for odometry; forward/aft bump sensors; wireless ethernet.]
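
The introduction excerpt describes two ways an area can receive a conceptual description: a human asserts it, or the robot infers it from an observed object and ontological knowledge (a coffee machine suggests a kitchen). The following is a minimal sketch of that idea, not the authors' system. The table `TYPICALLY_FOUND_IN`, its "Printer" entry, and the function `describe_area` are hypothetical; only the concept names Coffeemachine, Kitchen, and Office and the source labels "asserted" and "inferred" come from the slides.

```python
# Hypothetical innate ontology fragment: object concept -> area concept in
# which such objects are usually found. "Printer" is an invented entry.
TYPICALLY_FOUND_IN = {
    "Coffeemachine": "Kitchen",
    "Printer": "Office",
}


def describe_area(asserted, observed_objects):
    """Return (area concept, knowledge source) pairs for one area.

    asserted: concept stated by a human in dialogue, or None.
    observed_objects: object concepts the robot has seen in the area.
    """
    descriptions = []
    if asserted is not None:
        descriptions.append((asserted, "asserted"))
    for obj in observed_objects:
        concept = TYPICALLY_FOUND_IN.get(obj)
        already_known = any(concept == c for c, _ in descriptions)
        if concept is not None and not already_known:
            # Default inference: the object hints at the kind of area.
            descriptions.append((concept, "inferred"))
    return descriptions
```

Keeping the source label next to each concept lets a later step prefer asserted knowledge over inferred knowledge, or ask the human for clarification when the two disagree.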
Example: Mapping
(Same figure as above, now with the utterance and its interpretation overlaid.)
Utterance: "Robot, this is the living room"
@{l1:area}(living-room)
concept(living-room) & instance(L1, living-room) & l1 [relation symbol missing] L1
area(area1) & area1 [relation symbol missing] L1
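
The chain in this example (linguistic index l1, conceptual instance L1, map area area1) can be sketched as a small store of bindings. This is an illustrative assumption rather than the system shown on the slides: the `Mediator` class, its method names, and the upper-casing convention for instance names are hypothetical.

```python
class Mediator:
    """Hypothetical store linking linguistic, conceptual, and map levels."""

    def __init__(self):
        self.instance_of_index = {}   # linguistic index -> conceptual instance
        self.concept_of = {}          # conceptual instance -> concept
        self.area_of_instance = {}    # conceptual instance -> map area

    def assert_area(self, index, concept, current_area):
        """Record an utterance such as "this is the living room".

        index: linguistic index from the utterance meaning, e.g. "l1".
        concept: asserted concept, e.g. "living-room".
        current_area: the map area the robot is in, e.g. "area1".
        """
        instance = index.upper()                        # assumed: l1 -> L1
        self.instance_of_index[index] = instance        # l1 related to L1
        self.concept_of[instance] = concept             # instance(L1, living-room)
        self.area_of_instance[instance] = current_area  # area1 related to L1
        return instance

    def areas_for(self, concept):
        """Return the map areas known to be instances of the given concept."""
        return sorted(
            area
            for instance, area in self.area_of_instance.items()
            if self.concept_of[instance] == concept
        )
```

After `assert_area("l1", "living-room", "area1")`, a later mention of the living room can be resolved to `area1` through `areas_for("living-room")`.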
Example: Mapping (2005)
“Uncertainty”
Uncertainty in Mediation
Ontology-based mediation
Objects, features, uncertainty
Reasoning with incompleteness
Structural uncertainty
Clarification
Information Fusion For Visual Reference Resolution In
Dynamic Situated Dialogue
Geert-Jan M.Kruijff
1
,John D.Kelleher
2
,and Nick Hawes
3
1
Language Technology Lab,DFKI GmbH
gj@dfki.de
,
WWWhome page:
http://www.dfki.de/˜gj
2
Dublin Institute of Technology
John.Kelleher@comp.dit.ie
,
WWWhome page:
www.computing.dcu.ie/jkelleher/
3
School of Computer Science,University of Birmingham
n.a.hawes@cs.bham.ac.uk
,
WWWhome page:
http://www.cs.bham.ac.uk/˜nah
Abstract.
Human-Robot Interaction (HRI) invariably involves dialogue about
objects in the environment in which the agents are situated.The paper focuses
on the issue of resolving discourse references to such visual objects.The paper
addresses the problem using strategies for
intra-modal fusion
(identifying that
different occurrences concern the same object),and
inter-modal fusion
,(relating
object references across different modalities).Core to these strategies are sensori-
motoric coordination,and ontology-based mediation between content in different
modalities.The approach has been fully implemented,and is illustrated with sev-
eral working examples.
1 Introduction
The context of this work is the development of dialog systems for human-robot collab-
oration.The framework presented in this paper addresses a particular aspect of situated
dialog,namely reference resolution.Reference resolution in situated dialog is a par-
ticular instance of the anchoring problem [Coradeschi and Saffiotti,2003]:how can an
artificial system create and maintain correspondences between the symbols and sensor
data that refer to the same physical object?
In a dialog, human participants expect their partner to construct and maintain a model of the evolving linguistic context. Each referring expression used in the dialog introduces a representation into the semantics of its utterance. This representation must be bound to an element in the context model in order for the utterance’s semantics to be fully resolved. Referring expressions that access a representation in the context are called anaphoric. In a situated dialog, human participants expect their partner to not only construct and maintain a model of the linguistic discourse, but also to have full perceptual knowledge of the environment. This introduces a form of reference, called exophoric reference. Exophoric references denote objects that have entered the dialog context through a non-linguistic modality (such as vision), but have not been previously evoked into the context. Consequently, for a robot to participate in a situated dialog,
Crossmodal Content Binding in Information-Processing Architectures
Henrik Jacobsson, henrikj@dfki.de
Nick Hawes, n.a.hawes@cs.bham.ac.uk
Geert-Jan Kruijff, Language Technology Lab, DFKI GmbH, Germany, gj@dfki.de
Jeremy Wyatt, School of Computer Science, University of Birmingham, UK, j.l.wyatt@cs.bham.ac.uk
ABSTRACT
Operating in a physical context, an intelligent robot faces two fundamental problems. First, it needs to combine information from its different sensors to form a representation of the environment that is more complete than any of its sensors on its own could provide. Second, it needs to combine high-level representations (such as those for planning and dialogue) with its sensory information, to ensure that the interpretations of these symbolic representations are grounded in the situated context. Previous approaches to this problem have used techniques such as (low-level) information fusion, ontological reasoning, and (high-level) concept learning. This paper presents a framework in which these, and other approaches, can be combined to form a shared representation of the current state of the robot in relation to its environment and other agents. Preliminary results from an implemented system are presented to illustrate how the framework supports behaviours commonly required of an intelligent robot.
Categories and Subject Descriptors
I.2.9 [Artificial Intelligence]: Robotics
General Terms
Algorithms, Design
This work was supported by the EU FP6 IST Cognitive Systems Integrated Project “CoSy” FP6-004250-IP.
1. INTRODUCTION
An information-processing architecture for robotics is typically composed of a large number of cooperating subsystems, such as natural language analysis and production, computer vision, motoric skills, and various deliberative processes such as symbolic planners. The challenge addressed in this paper is the production and maintenance of a model of the world for a robot situated in “everyday” scenarios involving human interaction. This requires a method for binding representations across the subsystems. This world model should adequately reflect the aspects of the world that are stable in the medium term, whilst incorporating more dynamic aspects.
Throughout this paper we will primarily consider a robot that can interact with a human and a set of objects on a tabletop. For example, when faced with a scene containing a red mug, a blue cup and a blue bowl, the robot may be asked to “put the blue things to the left of the red thing”. For a system to be able to perform such a task effectively, it must be able to build a representation that connects the (low-level and modality specific) information about the world and the (high-level and amodal) representations that can be used to interpret the utterance, determine the desired world state, and plan behaviour. As resulting actions must be executed in the world, the representation must allow the robot to ultimately access the low-level (i.e. metric) information from which its higher-level representations are derived.
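The tabletop example can be sketched in a few lines, as an illustration only. The sketch assumes a single bound record per object that carries both amodal features (colour, kind) and metric table coordinates, and it assumes that a smaller x value means further to the left. The class, the functions, the object names and the coordinates are all hypothetical and do not reproduce the binding framework described in the paper.

```python
from dataclasses import dataclass


@dataclass
class BoundObject:
    """One bound representation: amodal features plus metric data (hypothetical)."""
    name: str
    colour: str
    kind: str
    x: float  # table coordinates in metres; smaller x is assumed to be further left
    y: float


def resolve(scene, colour):
    """Interpret a description such as 'the blue things' against the scene."""
    return [obj for obj in scene if obj.colour == colour]


def is_left_of(a, b):
    """Spatial relation evaluated on the low-level metric information."""
    return a.x < b.x


def objects_to_move(scene, mover_colour, landmark_colour):
    """Names of objects that do not yet satisfy 'left of the landmark'."""
    landmarks = resolve(scene, landmark_colour)
    if len(landmarks) != 1:
        raise ValueError("landmark description does not pick out exactly one object")
    landmark = landmarks[0]
    return [
        obj.name
        for obj in resolve(scene, mover_colour)
        if not is_left_of(obj, landmark)
    ]


# The scene from the example: a red mug, a blue cup and a blue bowl.
scene = [
    BoundObject("mug1", "red", "mug", 0.40, 0.30),
    BoundObject("cup1", "blue", "cup", 0.20, 0.35),
    BoundObject("bowl1", "blue", "bowl", 0.65, 0.25),
]
to_move = objects_to_move(scene, "blue", "red")
```

With these made-up coordinates the blue cup already lies to the left of the red mug, so only the blue bowl remains to be moved. The point of the sketch is that the utterance is interpreted over amodal features while the desired world state is checked against metric positions held in the same record.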
Any design for a system to tackle the above task must focus on creating such a representation, and grounding it in the environment of the robot. In addition to this, the engineering effort of integrating the various information-processing subsystems with the representation must be considered. After all, since the robot is an engineered system, every component must be put there by means of human effort.
The grounding problem is entangled with the engineering problem of subsystem integration and cannot be considered in isolation. Grounding can generally be seen as the process of establishing the relation between a representation in one domain and a representation in another. One special case is when one of the domains is the external world, i.e. “reality”:
The term grounding [denotes] the processes by which an agent relates beliefs to external physical objects. Agents use grounding processes to construct models of, predict, and react to, their external environment. Language grounding refers to processes specialised for relating words and speech acts to a language user’s environment via grounded beliefs. [11] p. 8