Machine Learning for Fast Quadrupedal Locomotion


15 Oct 2013






Andrew Fierro

CS 5331


Introduction


This paper presents a way for the robot to learn an optimal walk autonomously



Challenges


Sparse Training Data


Dynamical Complexity


Summary


Parameterized Walk



4 different learning approaches



Results



Parameterized Walk


12 parameters to modify



The front locus (3 parameters: height, x-pos., y-pos.)


The rear locus (3 parameters)


Locus length


Locus skew multiplier in the x-y plane (for turning)


The height of the front of the body


The height of the rear of the body


The time each foot takes to move through its locus


The fraction of time each foot spends on the ground
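As a rough illustration, the 12 parameters listed above could be held in a single record and flattened into a vector for the learning algorithms. This is only a sketch: the field names, their ordering, and the placeholder values are assumptions made for this example and do not come from the paper.

```python
from dataclasses import astuple, dataclass
from typing import List


@dataclass
class WalkParameters:
    """The 12 tunable walk parameters; all field names are illustrative."""
    front_locus_height: float
    front_locus_x: float
    front_locus_y: float
    rear_locus_height: float
    rear_locus_x: float
    rear_locus_y: float
    locus_length: float
    locus_skew: float          # skew multiplier in the x-y plane (turning)
    front_body_height: float
    rear_body_height: float
    foot_cycle_time: float     # time each foot takes to move through its locus
    ground_fraction: float     # fraction of time each foot spends on the ground

    def as_vector(self) -> List[float]:
        """Flatten the record into the parameter vector a learner would modify."""
        return list(astuple(self))


# Placeholder values only; they are not measurements from the paper.
example = WalkParameters(1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.5)
vector = example.as_vector()
```

Keeping the parameters in one flat vector lets the same search code treat every parameter uniformly.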


Learning Algorithm


Distribute different parameter sets to the robots

Have each Aibo time itself as it walks

Send the results back and repeat with a new set of parameters
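The loop above can be sketched as follows. The robots are replaced here by a hypothetical `timed_walk` function that simulates a timed run, and the track length, the number of robots, and the toy timing model are all assumptions made for illustration.

```python
import random

TRACK_LENGTH_M = 2.0  # assumed distance each robot walks while timing itself


def timed_walk(params, rng):
    """Hypothetical stand-in for a robot timing itself; returns seconds.

    The toy model is fastest when every parameter equals 1.0 and adds a
    little random noise, mimicking noisy real-world measurements.
    """
    error = sum((p - 1.0) ** 2 for p in params)
    return 8.0 + 5.0 * error + rng.uniform(0.0, 0.2)


def evaluate_batch(param_sets, num_robots, rng):
    """Hand out parameter sets in groups of num_robots and collect speeds."""
    speeds = [0.0] * len(param_sets)
    for start in range(0, len(param_sets), num_robots):
        stop = min(start + num_robots, len(param_sets))
        for index in range(start, stop):  # one robot per parameter set
            seconds = timed_walk(param_sets[index], rng)
            speeds[index] = TRACK_LENGTH_M / seconds  # metres per second
    return speeds


rng = random.Random(0)
speeds = evaluate_batch([[1.0] * 12, [2.0] * 12], num_robots=3, rng=rng)
```

The returned speeds play the role of the score that each learning algorithm tries to maximize.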



Hill-Climbing Algorithm


Parameter vector π (initial policy)

Generate t policies, each with its own modification of π

Each policy R is evaluated

The highest-scoring R becomes the new starting point
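A minimal sketch of these steps, assuming each policy is a list of floats and that every candidate perturbs each parameter by a fixed step, by zero, or by the negative step. The perturbation scheme, the function names, and the toy score are assumptions for illustration, not details taken from the slides.

```python
import random


def hill_climb(score, policy, step_sizes, num_candidates, iterations, rng):
    """Repeatedly move to the highest-scoring of t randomly perturbed policies."""
    current = list(policy)
    for _ in range(iterations):
        candidates = [
            [p + rng.choice((-step, 0.0, step)) for p, step in zip(current, step_sizes)]
            for _ in range(num_candidates)
        ]
        # The highest-scoring candidate becomes the new starting point,
        # even if it happens to score lower than the current policy.
        current = max(candidates, key=score)
    return current, score(current)


def toy_score(policy):
    """Illustrative objective with its maximum at (3, 3)."""
    return -sum((p - 3.0) ** 2 for p in policy)


rng = random.Random(1)
final_policy, final_score = hill_climb(
    toy_score, [0.0, 0.0], [0.5, 0.5], num_candidates=10, iterations=30, rng=rng
)
```

On a real robot, `score` would be a noisy speed measurement rather than a deterministic function.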


Amoeba Algorithm


Simplex


Figure with N+1 points in N-dimensional space (a triangle in 2D space)

Transformations are applied to the lowest-scoring point

Good reflection -> Expansion

Bad reflection -> Contraction
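The simplex moves above can be sketched as a maximizing variant of the downhill simplex (Nelder-Mead) method. The coefficients (reflect by 1, expand by 2, contract by one half) are the conventional textbook choices, and the shrink step used when contraction also fails is an addition not mentioned on the slide. Treat the details as assumptions.

```python
def point_on_line(centroid, worst, t):
    """Point on the line through worst and centroid; t=1 reflects worst."""
    return [c + t * (c - w) for c, w in zip(centroid, worst)]


def amoeba_maximize(score, simplex, iterations):
    """Maximize score starting from a simplex of N+1 points in N dimensions."""
    points = [list(p) for p in simplex]
    for _ in range(iterations):
        points.sort(key=score, reverse=True)  # best first, worst last
        best, second_worst, worst = points[0], points[-2], points[-1]
        others = points[:-1]
        centroid = [sum(axis) / len(others) for axis in zip(*others)]
        reflected = point_on_line(centroid, worst, 1.0)
        if score(reflected) > score(best):
            # Good reflection: try going further in the same direction.
            expanded = point_on_line(centroid, worst, 2.0)
            points[-1] = max(expanded, reflected, key=score)
        elif score(reflected) > score(second_worst):
            points[-1] = reflected
        else:
            # Bad reflection: pull the worst point toward the centroid.
            contracted = point_on_line(centroid, worst, -0.5)
            if score(contracted) > score(worst):
                points[-1] = contracted
            else:
                # Shrink every point halfway toward the best one.
                points = [best] + [
                    [(b + x) / 2.0 for b, x in zip(best, p)] for p in points[1:]
                ]
    return max(points, key=score)


def toy_score(point):
    """Illustrative objective with its maximum at (1, -2)."""
    x, y = point
    return -((x - 1.0) ** 2 + (y + 2.0) ** 2)


best = amoeba_maximize(toy_score, [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], 200)
```

Because only the worst point is ever replaced, the best score in the simplex never decreases on a noise-free objective.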




Genetic Algorithm


Policies that do well spread through the population

Policies that do poorly are removed

Each new generation is produced by mutations and crossovers
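One way to sketch this scheme is shown below. The slides do not say which selection, crossover, or mutation operators were used, so truncation selection, uniform crossover, and Gaussian mutation are assumptions chosen for brevity.

```python
import random


def evolve(score, population, generations, mutation_scale, rng):
    """Evolve a population of parameter vectors and return the best one."""
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        # Policies that do poorly are removed; the top half survives.
        survivors = ranked[: max(2, size // 2)]
        children = []
        while len(survivors) + len(children) < size:
            mother, father = rng.sample(survivors, 2)
            # Uniform crossover: each parameter comes from one parent.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Mutation: small Gaussian noise on every parameter.
            child = [gene + rng.gauss(0.0, mutation_scale) for gene in child]
            children.append(child)
        population = survivors + children
    return max(population, key=score)


def toy_score(policy):
    """Illustrative objective with its maximum at (2, 2, 2)."""
    return -sum((gene - 2.0) ** 2 for gene in policy)


rng = random.Random(2)
initial = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(10)]
best = evolve(toy_score, initial, generations=40, mutation_scale=0.2, rng=rng)
```

In this sketch each individual is simply one full parameter vector, and survivors carry over unchanged so the best score never gets worse between generations.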



Policy Gradient Algorithm


Builds on the hill-climbing algorithm

Instead of choosing the best-performing policy as the next starting point, estimate the gradient and follow it

Sample policies around π, evaluate them, and move π in the direction of the estimated gradient
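A minimal sketch of one such update is given below. It estimates the gradient per parameter by comparing the average score of samples that raised the parameter with the average of those that lowered it, then takes a fixed-length step. The grouping rule, the normalization, and all names are assumptions for illustration and may differ from the paper's exact procedure.

```python
import math
import random


def policy_gradient_step(score, policy, epsilons, num_samples, step_size, rng):
    """Estimate the gradient from random perturbations and step along it."""
    signs = [[rng.choice((-1, 0, 1)) for _ in policy] for _ in range(num_samples)]
    scores = [
        score([p + s * e for p, s, e in zip(policy, sample, epsilons)])
        for sample in signs
    ]

    def mean_for(dim, sign):
        """Average score of samples whose perturbation in dim had this sign."""
        values = [sc for sample, sc in zip(signs, scores) if sample[dim] == sign]
        return sum(values) / len(values) if values else None

    gradient = []
    for dim in range(len(policy)):
        plus, zero, minus = mean_for(dim, 1), mean_for(dim, 0), mean_for(dim, -1)
        if plus is None or minus is None:
            gradient.append(0.0)  # not enough samples to compare
        elif zero is not None and zero > plus and zero > minus:
            gradient.append(0.0)  # leaving this parameter alone looks best
        else:
            gradient.append(plus - minus)
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm == 0.0:
        return list(policy)
    return [p + step_size * g / norm for p, g in zip(policy, gradient)]


def toy_score(policy):
    """Illustrative objective with its maximum at (3, 3)."""
    return -sum((p - 3.0) ** 2 for p in policy)


rng = random.Random(3)
policy = [0.0, 0.0]
for _ in range(40):
    policy = policy_gradient_step(toy_score, policy, [0.1, 0.1], 15, 0.2, rng)
final_score = toy_score(policy)
```

Unlike hill climbing, the new policy here is generally not one of the sampled policies; every sample contributes to the direction of the step.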




Results


Other Tests


Questions


Once the walk is learned, are the parameters stored for the next time the robot is used? That is, does the robot have to relearn its walk every time or only once?

Do they use the same objective function for all four learning algorithms?

What is included in each chromosome in the genetic algorithm? What does each individual represent?
