The Organization of Embodied Behavior in Robotic Systems

Nov 13, 2013 (3 years and 11 months ago)

Laboratory for Perceptual Robotics


Department of Computer Science

Hierarchical Mechanisms for Robot Programming

Shiraj Sen, Stephen Hart, Rod Grupen

Laboratory for Perceptual Robotics

University of Massachusetts Amherst

May 30, 2008

NEMS ‘08


Outline

Hierarchical mechanisms for robot programming

Representation
  Action: potential functions, value functions
  State representation

Programming
  User defined
  Reinforcement learning: intrinsic, extrinsic

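Hierarchical Actions

(Figure: three nested feedback loops, each drawn with blocks labeled Σ, G, and H. Force and velocity references enter the loops; feedback signals return.)

Primitive actions: closed-loop controllers on potential fields (ϕ).

Programs: value functions (Φ); greedy traversal avoids local minima.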
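The lowest level of this hierarchy, a closed-loop primitive action that greedily descends a potential field, might be sketched as follows. This is a minimal illustration and not the authors' controller: the quadratic spring potential, the gain, the tolerance, and the function names are all assumptions chosen for the example.

```python
def spring_potential(x, x_ref, k=1.0):
    """Quadratic spring potential 0.5 * k * ||x - x_ref||^2 (assumed form)."""
    return 0.5 * k * sum((xi - ri) ** 2 for xi, ri in zip(x, x_ref))


def descend(x, x_ref, k=1.0, gain=0.2, tol=1e-6, max_steps=1000):
    """Greedy gradient descent on the spring potential.

    Returns the final state and the number of steps taken. The loop stops
    when the gradient norm drops below tol (hypothetical convergence test).
    """
    x = list(x)
    for step in range(max_steps):
        grad = [k * (xi - ri) for xi, ri in zip(x, x_ref)]
        if sum(g * g for g in grad) ** 0.5 < tol:
            return x, step
        # Closed-loop update: move against the gradient of the potential.
        x = [xi - gain * g for xi, g in zip(x, grad)]
    return x, max_steps
```

For example, `descend([1.0, -2.0], [0.0, 0.0])` drives the state toward the reference and returns once the gradient is negligible.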

Primitive Action Programming Interface

Sensory error
  Visual (u_ref)
  Tactile (f_ref)
  Configuration variables (θ_ref)
  Operational space (x_ref)

Potential functions (ϕ)
  Spring potential fields (ϕ_h)
  Collision-free motion fields (ϕ_c)
  Kinematic conditioning fields (ϕ_cond)

Motor variables
  Subsets of:
    Configuration variables
    Operational space variables

Primitive actions: each action a combines a potential function, a sensory error, and a set of motor variables.

Nullspace projection: composes primitive actions a_1 and a_2.
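Nullspace projection of one primitive action under another can be illustrated with a small sketch. The slide does not say which of a_1 and a_2 takes priority, so the roles "superior" and "inferior" below are assumptions, as are the single-row Jacobian, the gain, and every function name.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def nullspace_project(v, j):
    """Remove from v its component along the row Jacobian j.

    For a single-row Jacobian this equals (I - pinv(j) j) v.
    """
    jj = dot(j, j)
    if jj == 0.0:
        return list(v)
    scale = dot(j, v) / jj
    return [vi - scale * ji for vi, ji in zip(v, j)]


def composite_step(q, superior_grad, superior_jac, inferior_grad, gain=0.1):
    """One update in which the inferior action cannot disturb the superior one."""
    primary = [-gain * g for g in superior_grad]
    secondary = nullspace_project([-gain * g for g in inferior_grad], superior_jac)
    return [qi + p + s for qi, p, s in zip(q, primary, secondary)]


# Hypothetical example: the superior action regulates q[0] toward 1.0,
# the inferior action is a spring pulling q toward (5.0, 2.0).
q = [0.0, 0.0]
superior_grad = [q[0] - 1.0, 0.0]
superior_jac = [1.0, 0.0]
inferior_grad = [q[0] - 5.0, q[1] - 2.0]
q_next = composite_step(q, superior_grad, superior_jac, inferior_grad)
```

In this example the inferior action only moves `q[1]`, because its motion along `q[0]` is projected out.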

State Representation

Discrete abstraction of action dynamics.

4-level logic in control predicate p_i:

X  unknown
-  no reference
0  descending gradient
1  convergence
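The four-level predicate can be sketched as a function of a controller's observable dynamics. This is a sketch under assumptions: the threshold `epsilon` on the rate of change of the potential, the `activated` flag, and the function name are illustrative and do not come from the slides.

```python
def control_predicate(activated, has_reference, phi_dot, epsilon=1e-3):
    """Map controller dynamics to one of the four predicate values.

    "X": state unknown (controller has not been run; assumed meaning)
    "-": no reference signal is available
    "0": still descending the gradient of the potential
    "1": converged
    """
    if not activated:
        return "X"
    if not has_reference:
        return "-"
    # epsilon is a hypothetical convergence threshold on |d(phi)/dt|.
    return "1" if abs(phi_dot) <= epsilon else "0"
```

For instance, `control_predicate(True, True, -0.5)` yields `"0"` while the potential is still falling, and `control_predicate(True, True, 0.0)` yields `"1"`.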

Hierarchical Programming

A program is defined as an MDP over a vector of controller predicates:

S = (p_1 … p_N)

Absorbing states in the value function capture “convergence” of programs.

(Diagram: predicate values X, -, 1, 0.)

Learn value functions using reinforcement learning.
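Learning a value function over predicate-vector states can be sketched with tabular Q-learning. The slides do not name a specific reinforcement learning algorithm, so Q-learning is swapped in here for illustration. The two-controller transition table, the action names `a_1` and `a_2`, the reward of 1.0 on reaching the absorbing state, and all hyperparameters are hypothetical.

```python
import random

# Hypothetical deterministic transitions; states are (p_1, p_2).
TRANSITIONS = {
    (("X", "X"), "a_1"): ("1", "X"),
    (("X", "X"), "a_2"): ("X", "-"),
    (("X", "-"), "a_1"): ("1", "X"),
    (("X", "-"), "a_2"): ("X", "-"),
    (("1", "X"), "a_1"): ("1", "X"),
    (("1", "X"), "a_2"): ("X", "1"),
}
ABSORBING = {("X", "1")}  # assumed "program converged" state
ACTIONS = ("a_1", "a_2")


def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = ("X", "X")
        for _ in range(20):  # cap on episode length
            if state in ABSORBING:
                break
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            nxt = TRANSITIONS[(state, action)]
            reward = 1.0 if nxt in ABSORBING else 0.0
            best_next = 0.0 if nxt in ABSORBING else max(
                q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every visited state."""
    states = {s for (s, _) in q}
    return {s: max(ACTIONS, key=lambda a: q.get((s, a), 0.0)) for s in states}
```

Calling `greedy_policy(q_learning())` returns a mapping from predicate vectors to actions. The absorbing state never receives a Q-value because no action is taken from it.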

Intrinsic Reward

Goal: build deep control knowledge.

Reward controllable interaction with the world: convergence events of controllers with direct feedback from the external world.

(Diagram: behavior labels Track, Touch, Grasp, Insert, Stack, Catalog; predicate values X, -, 1, 0 with the convergence event marked.)
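The intrinsic reward rule can be sketched as a check on predicate transitions. This sketch assumes that the predicate value "1" denotes convergence, that a convergence event is a transition into "1" from any other value, and that the reward magnitude is 1.0; the function and parameter names are hypothetical.

```python
def intrinsic_reward(prev_predicate, predicate, has_external_feedback):
    """Return 1.0 when an externally referenced controller has just converged.

    A convergence event is taken to be a transition into "1" from any
    other predicate value (assumption). Controllers without direct
    feedback from the external world earn nothing.
    """
    converged_now = predicate == "1" and prev_predicate != "1"
    return 1.0 if (converged_now and has_external_feedback) else 0.0
```

Under these assumptions, a tracking controller whose predicate moves from `"0"` to `"1"` is rewarded once, and remaining converged earns no further reward.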

Experimental Demonstration

Robot: Dexter

Motor units
  Two 7-DOF Barrett WAMs
  Two 4-DOF Barrett Hands
  2-DOF pan/tilt stereo head

Sensory feedback
  Visual: hue, saturation, intensity, texture
  Tactile: 6-axis fingertip F/T sensors
  Proprioceptive

STAGE 1: SaccadeTrack - 25 Learning Episodes

S_st = (p_saccade p_track)

Actions: a_saccade, a_track

(State diagram: states X X, X -, 1 X, 0 X, X 0, X 1, linked by a_saccade and a_track transitions, with the rewarding action marked.)

Behavior labels: Track-saturation

STAGE 2: ReachGrab - 25 Learning Episodes

S_rg = (p_st p_reach p_grab)

(Diagram with the rewarding action marked.)

Behavior labels: Touch, Track-saturation
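The state S_rg reuses the earlier SaccadeTrack program as its first predicate, p_st. One way such a program-level predicate could be derived is sketched below. The slides only indicate that absorbing states capture a program's convergence, so the remaining mapping ("X" before the program starts, "0" while it is running), the absorbing state `("X", "1")`, and all names are assumptions.

```python
def program_predicate(state, absorbing_states, started=True):
    """Summarize a whole sub-program as a single predicate value.

    "1" when the sub-program sits in an absorbing state (converged);
    "X" before it has been started and "0" otherwise (assumed mapping).
    """
    if not started:
        return "X"
    return "1" if state in absorbing_states else "0"


def compose_state(sub_state, sub_absorbing, p_reach, p_grab):
    """Build S_rg = (p_st, p_reach, p_grab) from the SaccadeTrack state."""
    p_st = program_predicate(sub_state, sub_absorbing)
    return (p_st, p_reach, p_grab)


# Hypothetical absorbing state for SaccadeTrack: tracking has converged.
SACCADE_TRACK_ABSORBING = {("X", "1")}
s_rg = compose_state(("X", "1"), SACCADE_TRACK_ABSORBING, "X", "X")
```

Abstracting a sub-program to one predicate keeps the higher-level state vector small, which is what lets each stage build on the previous one.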

STAGE 3: VisualInspect - 25 Learning Episodes

S_vi = (p_rg p_cond p_track(blue))

(Diagram with the rewarding action marked.)

Behavior labels: Touch, Track-saturation, Track-blue

STAGE 4: Grasp - User Defined Reward

S_grasp = (p_rg p_moment p_force)

Actions: ReachGrab, a_moment, a_force

(State diagram: states X X X, X - -, 1 X X, X 0 0, X 0 1, X 1 0, X 1 1, with the rewarding action marked; predicate values X, -, 1, 0.)

Behavior labels: Touch, Track-saturation, Grasp, Track-blue

STAGE 5: PickAndPlace - User Defined Reward

S_pnp = (p_g p_transport p_moment)

Actions: Grasp, a_transport, a_moment

(State diagram: states X X X, X - -, 1 X X, X 0 -, X 0 0, X 1 0, X 1 1, with the rewarding action marked; predicate values X, -, 1, 0.)

Conclusions

Mechanisms for creating hierarchical programs.

Recursive formulation of potential functions and value functions.

Control-theoretic representation for action, state, and intrinsic reward.

Experimental demonstration of programming manipulation skills using staged learning episodes.

Intrinsic reward pushes out new behavior and models the affordances of objects.

Thank You