Developing Control of a High-DOF Robot Using Reinforcement Learning, Genetic Algorithms, Scripting, and Simulation




William R. Hutchison, Betsy J. Constantine, Johann Borenstein, and Jerry Pratt

Abstract


Controlling high degree-of-freedom (DOF) mobile robots in complex natural environments is challenging and is arguably beyond the abilities of developers to solve analytically in an acceptable amount of time and cost. Automated control development methods such as learning and genetic algorithms (GAs) have been successfully applied to a number of difficult robot control problems. GAs and learning methods, however, each have limitations when used as the sole method for developing control. Although neural networks are effective in dealing with complexity on the sensory side, a high-DOF robot application has both complex inputs and multiple outputs, and traditional neural network learning methods suffer from the credit assignment problem. On the other hand, a purely GA approach is limited in the number of parameters that can be handled, and thus is inappropriate when sensory input is complex, such as from video or LIDAR input.

This paper describes the Seventh Generation (7G) Control System, a biologically-inspired software system that combines genetic algorithm methods with a reinforcement learning (RL) neural network system. A control agent, which accepts sensor data as input and outputs control actions, is based on a neural network implementing a reinforcement learning process. An integrated genetic algorithm system modifies fixed parameters of the control agent to evolve the best control agent based on fitness. Fitness is measured by the success of a control agent in learning to control behavior of a simulated model of the robot in selected simulated terrains.
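
The combination described above, an outer genetic algorithm that tunes the fixed parameters of an inner learning agent and scores each agent by how well it learns, can be sketched as follows. This is a minimal illustration, not the 7G implementation: the toy corridor task, the tabular Q-learning agent (swapped in for the neural network), and every name and constant (`learn_and_score`, `evolve`, `N_STATES`, the discount of 0.9) are assumptions made for the example.

```python
import random

N_STATES = 6          # hypothetical toy corridor: start at cell 0, goal at cell 5
ACTIONS = (-1, +1)    # step left or step right


def learn_and_score(alpha, epsilon, rng, episodes=30, max_steps=40):
    """Run tabular Q-learning with fixed parameters (alpha, epsilon).

    Returns the fitness: the number of episodes in which the agent
    reached the goal while it was learning.
    """
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    successes = 0
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            # Explore at random with probability epsilon, or when values tie.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
            r = 1.0 if s2 == N_STATES - 1 else 0.0
            # Standard Q-learning update with an assumed discount of 0.9.
            q[s][a] += alpha * (r + 0.9 * max(q[s2]) - q[s][a])
            s = s2
            if r > 0:
                successes += 1
                break
    return successes


def mutate(genome, rng, scale=0.1):
    """Add Gaussian noise to each gene and clamp the result to [0, 1]."""
    return tuple(min(1.0, max(0.0, g + rng.gauss(0.0, scale))) for g in genome)


def evolve(generations=5, pop_size=8, seed=0):
    """Evolve (alpha, epsilon) pairs; return (best_fitness, best_genome)."""
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        scored = sorted(
            ((learn_and_score(a, e, rng), (a, e)) for a, e in population),
            reverse=True,
        )
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        # Keep the fitter half as parents and refill with mutated copies.
        parents = [genome for _, genome in scored[: pop_size // 2]]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return best


if __name__ == "__main__":
    fitness, (alpha, epsilon) = evolve()
    print(f"best fitness={fitness}, alpha={alpha:.2f}, epsilon={epsilon:.2f}")
```

The point of the sketch is the division of labor: the GA searches only a handful of fixed parameters, while the learner handles the state-to-action mapping, so neither method has to solve the whole problem alone.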

An innovative feature of the 7G learning system allows developers to create a program, called a script, as a foundation from which to bootstrap 7G’s learning. The script is based on the developer’s analysis of the relations between sensory input and actions. In contrast to the learning-by-example method, the script greatly accelerates the development of control even if it does not always select correct actions, because it reduces the search space for the learning algorithm.
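
One way a script could bootstrap a learner without having to be correct every time is sketched below. The abstract does not say how 7G combines script and learning, so the rule used here (follow the script unless learned values strictly prefer another action) is an assumption, as are the two-value sensor reading, the action names, and the functions `script` and `choose_action`.

```python
import random

ACTIONS = ("left", "straight", "right")  # hypothetical action set


def script(sensors):
    """Hypothetical developer-written rule mapping sensor input to an action.

    `sensors` is an assumed (left, right) pair of readings; the rule steers
    toward the larger reading and goes straight when they are nearly equal.
    """
    left, right = sensors
    if abs(left - right) < 0.1:
        return "straight"
    return "left" if left > right else "right"


def choose_action(q_values, sensors, rng, epsilon=0.1):
    """Select an action using the script as the starting point for learning.

    q_values maps each action to its learned value for the current input.
    """
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)  # occasional random exploration
    suggestion = script(sensors)
    # Follow the script unless learning has found a strictly better action,
    # so a wrong suggestion can be overridden once experience accumulates.
    if q_values[suggestion] >= max(q_values.values()):
        return suggestion
    return max(q_values, key=q_values.get)


if __name__ == "__main__":
    rng = random.Random(0)
    untrained = {a: 0.0 for a in ACTIONS}
    print(choose_action(untrained, (0.9, 0.2), rng, epsilon=0.0))
```

With untrained values the agent simply does what the script says, which gives the learner sensible early experience; as values are learned, actions the script gets wrong lose out to better alternatives.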

Learning and evolutionary methods require very large numbers of control agents and/or training trials, so using a real robot to develop control is rarely feasible. It is far more efficient to conduct the control development process in a simulated environment where automated tools are available. Accordingly, the 7G system was integrated with the Yobotics Simulation Construction Set, a 3D physics-based simulation system that provided measures of effectiveness of the control and allowed the developer to visualize progress during the iterative development process.

While the 7G Control System has been used to develop control of several high-DOF robots, the example described in this paper is the development of a control system for the OmniTread OT-4, a high-DOF serpentine robot from the University of Michigan.


Authors’ Names and Contact Information

William R. Hutchison

Behavior Systems, LLC

5475 Tenino Avenue

Boulder, CO 80303

(720) 289-0737

whutchi@behaviorsys.com


Betsy J. Constantine

Context Systems

Carlisle, Massachusetts

constantine@contextsystems.net


Johann Borenstein

The University of Michigan Advanced Technologies Lab

Ann Arbor, Michigan

johannb@umich.edu


Jerry Pratt

Yobotics, Inc.

Cincinnati, Ohio

jpratt@yobotics.com


Presenting Author’s Brief Biography


Dr. William Hutchison earned an undergraduate degree in Mathematics and Psychology from Kansas University and a Ph.D. in Psychology from SUNY-Stony Brook. He taught in doctoral programs in Psychology at SUNY-Stony Brook and in Behavioral Systems Analysis at West Virginia University. In 1986 he cofounded BehavHeuristics, Inc., where he developed one of the first, and the world’s largest, commercial applications of modern neural networks, a system that interacted with 200 human analysts and earned $140 million for USAir in its first year. Dr. Hutchison has collaborated extensively with Dr. Betsy Constantine in robotics research focused on developing a wide range of complex behaviors in a variety of robots.