
Oct 16, 2013

Embodied Learning of Qualitative Models

Jure Žabkar

Exploration and Curiosity in Robot Learning and Inference, Dagstuhl, March 2011

joint work with xpero partners

problem



“How should a robot choose its actions and experiences so as to maximize the effectiveness of its learning?”

goals



to learn comprehensible models

no extrinsic reward

intrinsic reward: an improved prediction model of the environment

our way



learning from scratch (no explicit background knowledge, but given a learning algorithm)

real robots, real-time learning

learning loop

1. observe the environment (collect data)
2. learn a model
3. use the model to predict the effect of each action
4. choose the best action (w.r.t. the active learning strategy)
5. observe the environment and check whether the predictions match the new observations
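The five steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not the system used in the talk: `ToyWorld`, the three action names, the majority-vote "model" and the "try the least-explored action" strategy are all hypothetical stand-ins.

```python
from collections import Counter, defaultdict


def sign(delta, eps=1e-9):
    """Map a numeric change to a qualitative one: '+', '-' or '0'."""
    if delta > eps:
        return "+"
    if delta < -eps:
        return "-"
    return "0"


class ToyWorld:
    """Hypothetical stand-in for the robot's environment (an assumption)."""

    def __init__(self):
        self.area = 50.0  # observed blob area, arbitrary units

    def observe(self):
        return self.area

    def act(self, action):
        step = {"forward": 5.0, "backward": -5.0, "stay": 0.0}[action]
        self.area = max(0.0, min(100.0, self.area + step))


def learn(data):
    """Step 2: the 'model' is the most frequent effect seen for each action."""
    counts = defaultdict(Counter)
    for action, effect in data:
        counts[action][effect] += 1
    return {a: c.most_common(1)[0][0] for a, c in counts.items()}


def choose_action(actions, data):
    """Step 4: a simple active-learning strategy, pick the least-tried action."""
    tried = Counter(a for a, _ in data)
    return min(actions, key=lambda a: tried[a])


def learning_loop(world, actions, steps=12):
    data, hits = [], 0
    before = world.observe()                              # 1. observe
    for _ in range(steps):
        model = learn(data)                               # 2. learn a model
        predictions = {a: model.get(a) for a in actions}  # 3. predict each action
        action = choose_action(actions, data)             # 4. choose an action
        world.act(action)
        after = world.observe()                           # 5. observe and compare
        effect = sign(after - before)
        hits += int(predictions[action] == effect)
        data.append((action, effect))
        before = after
    return learn(data), hits


model, hits = learning_loop(ToyWorld(), ["forward", "backward", "stay"])
```

In this toy setting the first pass over the actions has no prediction to check, so only the later steps can count as correct predictions.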

starting scenario

Q: how does the area of the ball (as observed by the robot) change w.r.t. the robot's actions?

area := #pixels of the red blob in the image from the robot's camera

actions: s_L, s_R (the distance travelled by the left/right wheel)

area = area(s_L, s_R)
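As a rough illustration of the area definition, the sketch below counts red pixels in an RGB image given as nested lists. The colour thresholds and the function names are assumptions made for this example, and it counts every red pixel rather than isolating a single connected blob.

```python
def is_red(pixel, min_red=150, max_other=100):
    """Hypothetical threshold test: strong red channel, weak green and blue."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def blob_area(image):
    """Area := number of pixels classified as red in the whole image."""
    return sum(1 for row in image for pixel in row if is_red(pixel))


RED, GREY = (220, 30, 40), (120, 120, 120)
image = [
    [GREY, RED, GREY],
    [RED, RED, GREY],
    [GREY, RED, GREY],
]
area = blob_area(image)
```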

task: find the appropriate model

equation discovery?

we tried several algorithms, without success

motivation

people most often reason qualitatively

AI: robots should mimic human intelligence

why learn qualitative relations?

the area problem, qualitatively

if action = forward then the area increases until it becomes constant (the blob occupies the whole image)

if orientation < 0 and action = left (increasing the absolute value of the angle) then the area decreases until it becomes constant (zero)


...
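The two rules stated above could be written down as a small predictor like the one below. The function name, the `image_size` parameter and the '+', '-', '0' encoding are assumptions for this sketch; any rule hidden behind the ellipsis is deliberately left out, so the function returns `None` when neither rule applies.

```python
def predict_area_change(action, orientation, area, image_size):
    """Qualitative prediction of the area change: '+', '-', '0' or None."""
    if action == "forward":
        # The area grows until the blob fills the whole image.
        return "0" if area >= image_size else "+"
    if action == "left" and orientation < 0:
        # Turning further away shrinks the blob until it disappears.
        return "0" if area <= 0 else "-"
    return None  # no rule given for this situation


image_size = 320 * 240  # hypothetical camera resolution
examples = [
    predict_area_change("forward", 0.0, 1200, image_size),
    predict_area_change("forward", 0.0, image_size, image_size),
    predict_area_change("left", -0.3, 500, image_size),
    predict_area_change("left", -1.2, 0, image_size),
    predict_area_change("right", 0.2, 500, image_size),
]
```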

qualitative rules



the prediction model gets much more accurate, but the predictions are not that precise.

methods


active learning + planning


learning methods:

Padé
Žabkar, Možina, Bratko, Demšar: Learning Qualitative Models from Numerical Data, AIJ, 2011

STRUDEL
Košmerlj, Bratko, Žabkar: Embodied Concept Discovery through Qualitative Action Models, IJUFKS, 2011

Qube
Žabkar et al.: Preference Learning from Qualitative Partial Derivatives, ECML Preference Learning Workshop, 2010

Hyper (with predicate invention mechanism)
Leban, Žabkar, Bratko: An experiment in robot discovery with ILP, Proc. ILP 2008


tested on simulated (billiards) and real data (medical application, robotics)

ceteris paribus


e.g. partial differentiation

observe a qualitative relation between two selected features, other features held constant


qualitative relations of 3 types:


x increases ⇒ f(x) increases (Padé)


preference relation: x ≻ y ⇒ f(x) ≻ f(y)


structural: on(A,B,t1), on(A,C,t2)

"all other things being equal"
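To make the ceteris paribus idea concrete, the sketch below takes the sign of a finite difference in one feature while every other feature is held constant. It is not Padé itself, which estimates such signs from sampled numerical data; `toy_area` and all parameter names are assumptions invented for this example.

```python
def qualitative_partial(f, point, feature, h=1e-3, eps=1e-9):
    """Sign of the change in f when only `feature` varies around `point`."""
    lower, upper = dict(point), dict(point)
    lower[feature] -= h  # all other features keep their values
    upper[feature] += h
    delta = f(**upper) - f(**lower)
    if delta > eps:
        return "+"
    if delta < -eps:
        return "-"
    return "0"


def toy_area(distance, angle):
    """Hypothetical blob area: shrinks with distance and with |angle|."""
    return 1000.0 / distance ** 2 * max(0.0, 1.0 - abs(angle))


point = {"distance": 2.0, "angle": -0.3}
d_distance = qualitative_partial(toy_area, point, "distance")
d_angle = qualitative_partial(toy_area, point, "angle")
```

Here the area decreases with distance and, because the angle is negative, increases as the angle moves towards zero.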

qualitative models

data → qualitative changes → qualitative models
(data to qualitative changes: Padé, Qube, STRUDEL; qualitative changes to qualitative models: machine learning, statistics)


learning with structured data

ILP with predicate invention too complex for real-time learning



we use ILP to learn smaller subtasks


structural qualitative changes

www.ailab.si/xpero

the concept
"movable"

the discovered condition which distinguishes different effects of actions:

p1(Obj) :-
    at(T1, Obj, Pos1),
    at(T2, Obj, Pos2),
    neq_pos(Pos1, Pos2).

p1 is true if the object was observed at two different positions

the discovered effects of actions:

move(T, Obj) :-
    p1(Obj),
    f1(T, Obj).

move(T, Obj) :-
    not p1(Obj),
    f2(T, Obj).

f1(T1, Obj) :-
    at(T1, Obj, Pos1),
    at(T2, Obj, Pos2),
    Pos1 \== Pos2,
    {T2 = T1+1}.

f2(T, Obj) :-
    not f1(T, Obj).
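For readers less familiar with Prolog, the learned clauses can be restated in Python roughly as below. This is an illustrative reading only: it assumes observations are `(time, object, position)` triples with one position per object and time step, and the sample data are invented.

```python
def p1(observations, obj):
    """True if obj was observed at two different positions (the 'movable' condition)."""
    positions = {pos for (_, o, pos) in observations if o == obj}
    return len(positions) >= 2


def f1(observations, t1, obj):
    """True if obj's position at t1 differs from its position at t1 + 1."""
    at = {(t, o): pos for (t, o, pos) in observations}
    now, later = at.get((t1, obj)), at.get((t1 + 1, obj))
    return now is not None and later is not None and now != later


def f2(observations, t, obj):
    """Negation of f1: no position change observed from t to t + 1."""
    return not f1(observations, t, obj)


def move(observations, t, obj):
    """Mirror of the two move clauses: f1 for movable objects, f2 otherwise."""
    if p1(observations, obj):
        return f1(observations, t, obj)
    return f2(observations, t, obj)


# Hypothetical data: the ball moves once, the box never does.
observations = [
    (0, "ball", (0, 0)), (1, "ball", (1, 0)), (2, "ball", (1, 0)),
    (0, "box", (5, 5)), (1, "box", (5, 5)), (2, "box", (5, 5)),
]
```

With these observations, `p1` holds for the ball but not for the box, so the ball's move is explained by a position change and the box's by the absence of one.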