Constrained Learning in Neural Control
Master’s Thesis Defense
Laboratory for Intelligent Systems and Control
Department of Mechanical Engineering and Materials Science
Duke University
Mark A. Jensenius
Advisor: Dr. Silvia Ferrari
April 25, 2005
Typical Aircraft Missions
• Transport / Surveillance
– Steady, level flight
– Small maneuvers
• Combat
– Extreme maneuvers
• Exploratory
– Unmodeled dynamic effects
Reference Frames
[Figure: ground and business jet reference frames, with velocity, climb angle, sideslip angle, and roll angle labeled]
Business Jet Controls
[Figure: business jet control inputs: rudder deflection, aileron deflection, stabilator deflection, and thrust]
Design Approach
• Aircraft Model
– Decoupled dynamics
– Linearization
• Control law
– Linear control
– DHP neural network initialization
– Online training
• Performance comparison
– Linear controller
– Non-adapting neural controller
– Adapting neural controller
Flight Envelope
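Proportional-Integral Controller
[Block diagram: the command y_c(t) and the output y_s(t) pass through the blocks C_I, H_x, C_B, and C_F and summing junctions to produce the control u(t), which drives the BUSINESS JET block with state x(t)]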
Minimize performance metric
With respect to:
Subject to:
Linear Optimal Control Problem
• Value function
• Linear control gain matrix, C
• Riccati matrix, P
Linear Quadratic Regulator
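As a rough illustration of how the Riccati matrix P and the gain matrix C relate, the sketch below solves the scalar case in closed form. It assumes a one-state system dx/dt = a x + b u with cost J = ½∫(q x² + r u²) dt, so that the value function is V(x) = ½ P x². The function name `scalar_lqr` and the numbers are hypothetical and are not taken from the thesis, whose aircraft model is multivariable.

```python
import math

def scalar_lqr(a, b, q, r):
    """Scalar LQR for dx/dt = a*x + b*u with cost 0.5 * integral(q*x^2 + r*u^2).

    Returns (P, C): P is the positive root of the scalar algebraic Riccati
    equation 2*a*P - (b*P)**2 / r + q = 0, and C = b*P/r is the gain in u = -C*x.
    """
    disc = math.sqrt(a * a + b * b * q / r)
    P = r * (a + disc) / (b * b)
    C = b * P / r
    return P, C

# Hypothetical illustration values (not from the thesis).
a, b, q, r = -1.0, 1.0, 3.0, 1.0
P, C = scalar_lqr(a, b, q, r)

# The Riccati residual should vanish, and the closed loop a - b*C should be stable.
residual = 2.0 * a * P - (b * P) ** 2 / r + q
closed_loop_pole = a - b * C
```

For these values the closed-loop pole moves from -1 to -2, which shows how the weights q and r trade state error against control effort.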
Neural Network Controller
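[Block diagram: the command y_c(t) and the output y_s(t) pass through the blocks NN_A, H_x, and C_F and summing junctions to produce the control u(t), which drives the BUSINESS JET block with state x(t); a critic network, NN_C, is used for training]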
Dual Heuristic Dynamic Programming
• Co-state function
• Action Network
• Optimality Criteria, equations (1) and (2)
Action and Critic Neural Networks
[Network diagram; labels surviving extraction: M1, M2, ã1, ã2, a, W, A, R, V, x, u, b]
Neural Network Initialization
[Diagram: the weights b, A, and W are divided into constrained weights and unconstrained weights; labels shown: design points, construction functions, hyperspherical initialization, zero, randomized]
Neural Network Construction Functions
[Equations lost in extraction; surviving symbols: s, k, V, b, n]
Output constraints:
Gradient constraints:
Neural Network Training
[Diagram: the training algorithm takes the current weights, the error function, and the training sets (input/output/gradient) and produces new weights]
• Batch Training
– Offline initialization
– Minimize error over many training sets
• Incremental Training
– Online learning
– Minimize error for one training set
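The difference between the two training modes can be sketched with a toy one-weight model y = w·x trained by plain gradient descent. Everything here (the model, the learning rate, the samples, and the function names) is a hypothetical illustration, not the thesis's networks or training algorithm.

```python
def grad_mse(w, samples):
    """Gradient of the mean squared error of y = w * x over the given samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples) / len(samples)

def batch_train(w, samples, lr=0.1, epochs=100):
    """Batch training: every update minimizes the error over all training sets."""
    for _ in range(epochs):
        w -= lr * grad_mse(w, samples)
    return w

def incremental_step(w, sample, lr=0.1):
    """Incremental training: one update using only the newest training set."""
    return w - lr * grad_mse(w, [sample])

# Offline initialization on three samples that all satisfy y = 2 * x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w_batch = batch_train(0.0, samples)

# Online learning: a single new sample pulls the weight toward itself.
w_online = incremental_step(w_batch, (1.0, 2.5))
```

In this toy case the single incremental update shifts the weight away from the value that fit the offline data, which is the kind of side effect that motivates constraining the online update.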
Neural Network Online Training
Gradient-based training:
RPROP: If the sign of a weight's error derivative stays the same after an update, increase the corresponding component of the weight adjustment. Otherwise, decrease that component and change its sign.
RPROP with backtracking: When a weight's error derivative changes sign, restore its previous value and decrease its adjustment magnitude.
RPROP with scaling and backtracking: If the error increases by more than 10%, revert to the previous weights and reduce the adjustment. If the error decreases by less than 0.5%, revert to the previous weights and increase the adjustment.
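The sign-based update with backtracking can be sketched as follows. This is a minimal version of the commonly described Rprop scheme with weight backtracking, applied to a toy quadratic error; it is not necessarily the exact variant used in the thesis, it omits the scaling rule, and the growth and shrink factors, step limits, and test function are all assumed values.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def rprop_backtracking(grad, w, n_iter=200, step0=0.1,
                       eta_plus=1.2, eta_minus=0.5,
                       step_min=1e-9, step_max=1.0):
    """Minimize an error function given its gradient, using per-weight step sizes."""
    w = list(w)
    n = len(w)
    step = [step0] * n      # per-weight adjustment magnitude
    g_prev = [0.0] * n      # previous error derivative per weight
    dw_prev = [0.0] * n     # previous signed adjustment per weight
    for _ in range(n_iter):
        g = grad(w)
        for i in range(n):
            same = g[i] * g_prev[i]
            if same < 0:
                # Derivative changed sign: shrink the step and restore the old weight.
                step[i] = max(step[i] * eta_minus, step_min)
                w[i] -= dw_prev[i]
                g_prev[i] = 0.0  # suppress step adaptation on the next pass
            else:
                if same > 0:
                    # Derivative kept its sign: grow the step.
                    step[i] = min(step[i] * eta_plus, step_max)
                dw_prev[i] = -sign(g[i]) * step[i]
                w[i] += dw_prev[i]
                g_prev[i] = g[i]
    return w

# Toy quadratic error with its minimum at w = [3, -1] (illustrative only).
def error(w):
    return (w[0] - 3.0) ** 2 + 10.0 * (w[1] + 1.0) ** 2

def grad(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]

w_final = rprop_backtracking(grad, [0.0, 0.0])
```

Only the sign of each derivative is used, so the two weights adapt at the same pace even though their curvatures differ by a factor of ten.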
Gradient Transformation Function
E_ij is defined in Appendix B of the thesis.
Constrained RPROP with scaling and backtracking
• Online training of the action network during a flight maneuver
• Target obtained from optimality criteria
• Error tolerance (10^-4) suspended for this plot
Design Point
[Plots comparing the Linear, Non-adapting Neural, and Adapting Neural controllers]
Interpolation Point
[Plots comparing the Linear, Non-adapting Neural, and Adapting Neural controllers]
Action Neural Network        T=0             T=5             T=10
Constrained Output MSE       2.729 x 10^-7   2.404 x 10^-7   5.555 x 10^-7
Unconstrained Output MSE     2.729 x 10^-7   1.770 x 10^12   3.168 x 10^11
Constrained Gradient MSE     8.470 x 10^-28  7.545 x 10^-26  4.057 x 10^-27
Unconstrained Gradient MSE   8.470 x 10^-28  7.848 x 10^-4   1.373 x 10^-4
Satisfaction of constraints at design points (Mean Square Error)
Revisiting the Design Point
[Plots comparing the Linear, Unconstrained Neural, and Constrained Neural controllers]
Extrapolation Point
[Plots comparing the Linear, Non-adapting Neural, and Adapting Neural controllers]
Concluding Remarks
• Performs optimally at design points
• Significant performance improvement when faced with nonlinearities and unknown dynamics
• Recommendations for future work
– Replace aircraft model with a neural network
– Real-world implementation, e.g., RC aircraft, submersible, etc.