for Control of and Systems 03 March NEU talk.pdfFeedback Control Systems Aircraft autopilots Car...
Transcript of for Control of and Systems 03 March NEU talk.pdfFeedback Control Systems Aircraft autopilots Car...
Moncrief-O’Donnell Chair, UTA Research Institute (UTARI)The University of Texas at Arlington, USA
and
F.L. Lewis, NAI
Talk available online at http://www.UTA.edu/UTARI/acs
Neural Networks for Control of Nonlinear Processes and Systems
Supported by :China Qian Ren Program, NEUChina Education Ministry Project 111 (No.B08015)NSF, ONR
Qian Ren Consulting Professor, State Key Laboratory of Synthetical Automation for Process Industries,
Northeastern University, Shenyang, China
Y.D. Song
Bringing Together Control Theory and Computational Intelligence
He who exerts his mind to the utmost knows nature’s pattern.
The way of learning is none other than finding the lost mind.
Meng Tz500 BC
Man’s task is to understand patterns innature and society.
Mencius
Importance of Feedback Control
Darwin 1850- FB and natural selectionVito Volterra 1890- FB and fish population balanceAdam Smith 1760- FB and international economyJames Watt 1780- FB and the steam engineFB and cell homeostasis
The resources available to most species for their survival are meager and limited
Nature uses Optimal control
Feedback Control Systems
Aircraft autopilotsCar engine controlsShip controllersCompute Hard disk drive controllersIndustry process control – chemical, manufacturingRobot control
Industrial Revolution –Windmill control, British millwrights - 1600sSteam engine and prime movers
James Watt 1769SteamshipSteam Locomotive boiler control
Sputnik 1957Aerospace systems
Relevance- Machine Feedback Control
qd
qr1qr2
AzEl
barrel flexiblemodes qf
compliant coupling
moving tank platform
turret with backlashand compliant drive train
terrain andvehicle vibrationdisturbances d(t)
Barrel tipposition
qd
qr1qr2
AzEl
barrel flexiblemodes qf
compliant coupling
moving tank platform
turret with backlashand compliant drive train
terrain andvehicle vibrationdisturbances d(t)
Barrel tipposition
Vehicle mass m
ParallelDamper
mc
activedamping
uc(if used)
kc cc
vibratory modesqf(t)
forward speedy(t)
vertical motionz(t)
surface roughness(t)
k c
w(t)
Series Damper+
suspension+
wheel
Single-Wheel/Terrain System with Nonlinearities
High-Speed Precision Motion Control with unmodeled dynamics, vibration suppression, disturbance rejection, friction compensation, deadzone/backlash control
VehicleSuspension
IndustrialMachines
Satellite pointing Land Systems
Aerospace
Relevance- Industrial Process ControlPrecision Process Control with unmodeled dynamics, disturbance rejection, time-varying parameters, deadzone/backlash control
ChemicalVaporDeposition
IndustrialMachines
XY Table
Autoclave
Newton’s Law
v(t)
x(t)
F(t)m
)()( tum
tFx
xmmaF
Industrial Process and Motion Systems (Vehicles, Robots)
τqB)+τq+G(q)+F(q)q(q,+Vq)qM( dm )(
Coriolis/centripetalforce
gravity friction disturbances
Actuatorproblems
inertia
Control Input
Lagrange’s Eqs. Of Motion
Lagrange Dynamical systems
Dynamical System Models
)()()(
xhyuxgxfx
Nonlinear system
Continuous-Time Systems Discrete-Time Systems
)()()(1
kk
kkkk
xhyuxgxfx
Linear system
CxyBuAxx
kk
kkk
CxyBAxx
1
1/s
f(x)
h(x)g(x)
z-1
xx yu
Control Inputs Internal States Measured Outputs
Issues in Feedback Control
system
Feedbackcontroller
Feedforwardcontroller
Measured outputs
Control inputs
Desired trajectories
Sensornoise
Disturbances
StabilityTracking BoundednessRobustness
to disturbancesto unknown dynamics
Unknown Process dynamicsProcess NonlinearitiesUnknown Disturbances
plant
controlu(t)
outputy(t)
controller
systemidentifier estimated
output
)(ˆ ty
identificationerror
desiredoutput
)(tyd
plant
controlu(t)
outputy(t)
controllerdesiredoutput
)(tyd
trackingerror
plant
controlu(t)
outputy(t)
controller #1
controller #2
desiredoutput
)(tyd
trackingerror
Indirect Scheme
Controller Topologies
Direct Scheme
Feedback/FeedforwardScheme
Feedback LinearizationAdaptive ControlNeural Networks for ControlNeural-adaptive Control
6 US PatentsSponsored by:China Qian RenChina Project 111US NSF, ARO, ONR, AFOSR
Definitions of System Stability
xe
xe+B
xe-B
Const Bound B
tt0 t0+T
T
x(t)
x(t)
t
x(t)
t
Asymptotic StabilityMarginal or Bounded Stability- SISL
Uniform Ultimate Boundedness
)()(
1 kk xfxxfx
d
B(d)
For every B there exists a d
There exists bound B that cannot be prescribed
21 2
1y us a s a
Example 1. Linear System
1 2y a y a y u
desired to track a reference input ( )dy t
de y y Tracking error
1 2de y a y a y u
r e e Sliding variable
1 2dr e e y e a y a y u
du v y e Auxiliary input
1 2r a y a y v Error dynamics
Of the form ( )r f x v
1 2 1 2( ) ( )Tyf x a y a y a a W x
y
Unknown function
Feedback Linearization
Unknown parameters
Known Regression Vector
Example 2. Nonlinear Lagrange System( , ) ( )y d y y k y u
unknown nonlinear damping term unknown nonlinear friction
( )r f x v
( ) ( , ) ( )f x d y y k y with
Assume Linear in the Parameters (LIP)
1
1
( , )( ) ( )
( , )Td y y
f x D K W xk y y
Known possibly nonlinear regression function
Unknown parameters
desired to track a reference input ( )dy t
de y y Tracking error
( , ) ( )de y d y y k y u
r e e Sliding variable
Error dynamicsdu v y e Auxiliary input
Feedback Linearization
Lagrangian System Appears in:Process controlMechanical systemsRobots
plantr(t)
Feedback Linearization Controller
du v y e
u
yy d
d
yy e
e
I
r e e
e v?
controller
The equations give the FB controller structure
Feedforward terms
Tracking Loop
plantr(t)
Feedback Linearization Controller
du v y e
u
yy d
d
yy
ee
I
r e e
e
v
The equations give the FB controller structure
( ) vv f x K r
vK
( )f x
Tracking Loop
Feedback terms
( )r f x v Error dynamics
If f(x) is known, use
Adaptive Control
Error Dynamics
( )r f x v r(t)= control errorControl input
Unknown nonlinearities
( )Tr W x v
Error Dynamics
Assume: f(x) is known to be of the structure
( ) ( )Tf x W x
LINEAR-IN-THE-PARAMETERS (LIP)
Known basis set= regression vector – DEPENDS ON THE SYSTEM
Unknown parameter vector
Controller
ˆ ˆ( ) ( ) ( )Tv vv f x K r W t x K r
ˆ( ) ( )W t W W t Parameter estimation error
ˆ( ) ( ) ( )T T Tvr W x v W x W x K r
( )Tvr W x K r
closed-loop system becomes
Est. error drives the control error
ˆ ˆ ( ) TdW W F x rdt
Parameter estimate is updated (tuned) using the adaptive tuning law
Adaptive Control
Pos. def. control gain
ESTIMATE of unknown parameters
plantr(t)
Feedback Linearization Adaptive ControllerA dynamic controller
du v y e
u
yy d
d
yy
ee
I
r e e
e
v
The equations give the FB controller structure
ˆ ˆ( ) ( ) ( )Tv vv f x K r W t x K r
vK
ˆ ( ) TW F x rˆ ( ) ( )TW t x
ˆ ( )f x
Tunable inner loop
Tracking Loop
Feedback terms
Error dynamics
f(x)
Kv
^
Adaptive ControlTuning
Dynamics
r(t)
Adaptive Controller
Adaptive Controller
ˆ ( ) TW F x r
A dynamic controller
Adaptive ControlPerformance of Adaptive Controller: Using the adaptive controller, the closed-loop
system is asymptotically stable, i.e. the control error r(t) goes to zero.
If an additional Persistence of Excitation (PE) condition holds, the parameter estimates converge to the actual unknown parameters.
11 12 2 { }T TL r r tr W F W
Proof:
1{ }T TL r r tr W F W
1 1( ) { } { ( ) }T T T T T Tv vL r W x K r tr W F W r K r tr W F W x r
ˆ ( ) TW F x r ( ) TW F x r
( )Tvr W x K r Error dynamics
Parameter tuning law
TvL r K r
Therefore Lyapunov shows that
0
0
( ), ( )r t W t are bounded( ), ( )r t W t
or
Typical Behavior of Adaptive Controllers
Control errors go to zero and the parameter estimates converge.
This assumes that ( ) ( )Tf x W x holds exactly, and that there are no disturbances in the system.
( )r f x Error dynamics
Control input
Unknown nonlinearities
Assume
know a fixed nominal value or estimate ˆ ( )f x for unknown f(x),
ˆ( ) ( )f f x f x
( ) ( )f x F x
estimation error is bounded like
Known bounding function, maybe nonlinear
ˆ ( ) v rf x K r v
Controller
( ) ,
( ) ,r
F xr rr
vF xr r
Robust Control
Error dynamics
Robust ControlTerm vr(t)
f(x)
Kv
^
Robust Controller
FixedNominal ControlTerm
r(t)
Robust Controller
A NON Dynamic controller
Closed-loop Error dynamics
ˆ( ) ( ) ( ) v rr f x f x f x K r v
( ) v rr f x K r v
r
Performance of Robust Controller:With this control, the closed-loop system is bounded stable with
bounded with a magnitude near
Proof:12
TL r r
TL r r
( ) ( )T T Tv r v rL r f x K r v r K r r f x v
2min ( ) ( ) T
v rL K r r F x r v
2 2min
2min
( ) ( ) ( ) /
( )
v
v
L K r r F x r F x r
K r
r Case 1:
0
2 2min
2min
( ) ( ) ( ) /
( ) ( ) 1 /
v
v
L K r r F x r F x
K r r F x r
Case 2: r
indefinite
( ) ,
( ) ,r
F xr rr
vF xr r
Typical Behavior of Robust Controllers
Control error does not go to zero but does indeed stay small.
Robust ControlAdaptive Control
Robot System[I]
Robust ControlTerm
v(t)
qd
Tracking Loop
f(x)
rKv
Nonlinear Inner Loop
..
^
Multiloop Nonlinear Controller Structure
q = qq.e =
ee.
qd =qdqd.
Adaptive ControlTerm
Adaptive plus robust control
( ) ,
( ) ,r
F xr rr
vF xr r
ˆ ( ) TW F x r
Neural Networks for Control
F.L. Lewis, S. Jagannathan, and A.Yesildirek,Neural Network Control of RobotManipulators and Nonlinear Systems,Taylor and Francis, London, 1999.
NN control in Chapter 4
Neural Network Properties
Learning
Recall
Function approximation
Generalization
Classification
Association
Pattern recognition
Clustering
Robustness to single node failure
Repair and reconfiguration
Nervous system cell. http://www.sirinet.net/~jgjohnso/index.html
Two-layer feedforward static neural network (NN)
(.)
(.)
(.)
(.)
x1
x2
y1
y2
VT WT
Inputs
Hidden layer
Outputs
xn ym
1
2
3
L
(.)
(.)
(.)
Summation eqs Matrix eqs( )( )
T T
T
y W V xW x
K
ki
n
jkjkjiki wvxvwy
10
10
Extra freedom to select basis set by tuning first-layer weights V.
Neural Network Universal Approximation Property
Let f(x) be any smooth nonlinear function
Then f(x) can be approximated by
1
( ) ( ) ( )L
i ii
f x w x x
For appropriate choice of the basis functions ( ), 1,i x i L
Moreover, the approximation error goes uniformly to zero as ( )x L
This was shown by Weierstrass for polynomial bases functions (Taylor series)
Hornik and Stinchcomb, Sandberg showed that is bounded on a compact set For a large class of approximating functions
( )x
Need to find the unknown weights
Do this by NN learning – tuning the NN weights
iw
Common activation functions (.)
Nonlinear System[I]
Unity-Gain Tracking Loop
rKv
Industry Standard- PD Controller
)()()( tetetr Desiredtrajectory
Actualtrajectory
Easy to implement with COTS controllersFastCan be implemented with a few lines of code- e.g. MATLAB
But -- Cannot handle-High-order unmodeled dynamics Unknown disturbancesHigh performance specifications for nonlinear systemsActuator problems such as friction, deadzones, backlash
Controlinpute
e
d
d
yy y
y
u
plantr(t)
Feedback Linearization Adaptive ControllerA dynamic controller
du v y e
u
yy d
d
yy
ee
I
r e e
e
v
The equations give the FB controller structure
ˆ ˆ( ) ( ) ( )Tv vv f x K r W t x K r
vK
ˆ ( ) TW F x rˆ ( ) ( )TW t x
ˆ ( )f x
Control System Design Approach
( ) ( , ) ( ) ( )m dM q q V q q q F q G q
( ) ( ) ( )de t q t q t
r e e
dm xfrVrM )(
Robot dynamics
Tracking Error definition
Error dynamics
Siding variable
( ) M( ( ) ( ) )dMr M e e q t q t e
( ) ( )( ) ( , )( ) ( ) ( )d m df x M q q e V q q q e F q G q
Where the unknown function is
qdRobot System[I]
qe
PD Tracking Loop
r
dm qFqGqqqVqqM )()(),()( Robot dynamics
?controller
)()()( tqtqte d Tracking error
eer Sliding variable
The equations give the FB controller structure
qdRobot System[I]
qe
PD Tracking Loop
r
dm xfrVrM )(
( ) ( )( ) ( , )( ) ( ) ( )d m df x M q q e V q q q e F q G q Where
Error dynamics is
If f(x) is known use the control ( ) vf x K r
Then ( )m v dMr V K r
vK
( )compute f x
Computed Torque Control
Control System Design Approach
dm qFqGqqqVqqM )()(),()(
)()()( tqtqte d eer
dm xfrVrM )(
Robot dynamics
Tracking Error definition
Error dynamics
vrKxVW vTT )ˆ(ˆ Define control input
)()( xVWxf TTApprox. unknown function by NN
Universal Approximation Property UNKNOWN FN.
)()ˆ(ˆ)( tvxVWxVWrKrVrM dTTTT
vm
Closed-loop dynamics
)(~ tvfrKrVrM dvm
qdRobot System[I]
Robust ControlTerm
q
v(t)
e
PD Tracking Loop
rKv
Neural Network Robot Controller
^
qd
f(x)
Nonlinear Inner Loop
..
Feedforward Loop
Universal Approximation Property
Problem- Nonlinear in the NN weights sothat standard proof techniques do not work
Feedback linearization
Easy to implement with a few more lines of codeLearning feature allows for on-line updates to NN memory as dynamics changeHandles unmodelled dynamics, disturbances, actuator problems such as frictionNN universal basis property means no regression matrix is neededNonlinear controller allows faster & more precise motion
Theorem 1 (NN Weight Tuning for Stability)
Let the desired trajectory )(tqd and its derivatives be bounded. Let the initial tracking error bewithin a certain allowable set U . Let MZ be a known upper bound on the Frobenius norm of theunknown ideal weights Z . Take the control input as
vrKxVW vTT )ˆ(ˆ with rZZKtv MFZ )()( .
Let weight tuning be provided by
WrFxrVFrFW TTT ˆˆ'ˆˆˆ , VrGrWGxV TT ˆ)ˆ'ˆ(ˆ
with any constant matrices 0,0 TT GGFF , and scalar tuning parameter 0 . Initialize the weight estimates as randomVW ˆ,0ˆ .
Then the filtered tracking error )(tr and NN weight estimates VW ˆ,ˆ are uniformly ultimately bounded. Moreover, arbitrarily small tracking error may be achieved by selecting large controlgains vK . Backprop terms-
WerbosExtra robustifying terms-Narendra’s e-mod extended to NLIP systems
Adaptive part Robust part
Stability Proof based on Lyapunov Extension
Define a Lyapunov Energy Function
)~~()~~( 21
21
21 VVtrWWtrMrrL TTT
Differentiate
)()'ˆˆ~(~)ˆ'ˆˆ~(~
)2(21
vwrWxrVVtr
xrVrWWtr
rVMrrKrL
TTTT
TTTT
mT
vT
Using certain special tuning rules, one can show that the energy derivative is negative outside a compact set.
Lnegative
)(tr
)(~ tW This proves that all signals are bounded
Problems—1. How to characterize the NN weight errors as ‘small’?- use Frobenius Norm2. Nonlinearity in the parameters requires extra care in the proof
UUB- uniform ultimate boundednes
010
2030
4050
0
12
3
4
5
6
-20
-15
-10
-5
0
5
10
15
weights
W2 weights, x
d=[0.5sin(t) 0.5cos(t)]T
time
W2 w
eigh
ts
NN weights converge to the best learned values for the given system
τqB)+τq+G(q)+F(q)q(q,+Vq)qM( dm )(Structured Control NN
x
q,
2,, qq
q
q
1ˆ M
2ˆ mV
G
F
+)(ˆ xf
Partitioned neural network
Non
linea
r pre
proc
essi
ng
sin( ), cos( )i iq q
i jq q
2iq
coriolis
centripetal
sgn( )iqiqe
1s C
A
xx.Bu2
H(s)
u1
1s C
A
xx.Bu2
H(s)
u1
kkTT
kk uxVWAxx )(1
Chaos in Dynamic Neural Networks
c.f. Ron Chen
Recurrent or Dynamic NN
%MATLAB file for chaotic NN from Jun Wang's paper
function [ki,x,y,z]=tcnn(N);y(1)= rand; ki(1)=1; z(1)= 0.08;a=0.9; e= 1/250; Io=0.65;g= 0.0001; b=0.001;
for k=1: N-1;ki(k+1)= k+1;x(k)= 1/(1+exp(-y(k)/e));y(k+1)= a*y(k) + g -
z(k)*(x(k) - Io);z(k+1)= (1-b)*z(k);
endx(N)= 1/(1+exp(-y(N)/e));
Ie
zgyy
zz
kykkk
kk
/1
1
11
Jun Wang
Kung Tz 500 BCConfucius
ArcheryChariot driving
MusicRites and Rituals
PoetryMathematics
孔子Man’s relations to
FamilyFriendsSocietyNationEmperorAncestors
Tian xia da tongHarmony under heaven
124 BC - Han Imperial University in Chang-an
Actuator Dynamics
=D(u)
ud+
-d-
m+
m-
.
ud+
d-
mu
BacklashDeadzone
System
Feedbackcontroller
Outputs
ActualControl inputsActuator
nonlinearity
AppliedControl inputs
Actuator Nonlinearities
Friction
Friction Compensation
dm qFqGqqqVqqM )()(),()(
Lagrangian System Dynamics – e.g. Robot
FrictionUse the standard NN from Lecture 1
qdRobot System[I]
Robust ControlTerm
q
v(t)
e
PD Tracking Loop
rKv
^
qd
f(x)
Nonlinear Inner Loop
.
.Feedforward Loop
Friction compensation
0 2 4 6 8 10 12 14 16-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
Time(second)
(a)
0 2 4 6 8 10 12 14 16-0.04
-0.03
-0.02
-0.01
0
0.01
0.02
0.03
0.04
Time(second)
Leng
th(m
eter
)
(a)
0 2 4 6 8 10 12 14 16-0.2
-0.15
-0.1
-0.05
0
0.05
0.1
0.15
0.2
0.25
Time(second)
Leng
th(m
eter
)
(b)
NN Friction Compensator
Desired trajectory
Tracking errors- solid = fixed gain controller, dashed= NN controller
Trajectory Tracking Controller
Position Velocity
position
velocity
Fixed gain
NN
Deadzone
=D(u)
ud+
-d-
m+
m-
u
u
Sat(u)
d+-d-
u
=
-
Key fact
NN Approximation of Functions with Jumps
-3 -2 -1 0 1 2 3 4 50
0.2
0.4
0.6
0.8
1
-3 -2 -1 0 1 2 3 4 50
0.2
0.4
0.6
0.8
1
0(x) 1(x)
-3 -2 -1 0 1 2 3 4 50
0.2
0.4
0.6
0.8
1
-3 -2 -1 0 1 2 3 4 50
0.2
0.4
0.6
0.8
1
2(x) 3(x)
Jump Function Basis Set
1 1
x y
1
2
L
V1T
V2T
1cc
c
W1T
W2T
....
...
2
1
N
Augmented NN
Standard NN
Extra Nodes withJump Activation Functions
MechanicalSystem
Kv[
v
reqd
Estimateof NonlinearFunction
w
--
D(u)u
NN DeadzonePrecompensator
I
II
( )f x
q
dq
NN in Feedforward Loop- Deadzone Compensation
iiiTTTTT
iii WWrTkWrTkUuUWrwUTW ˆˆˆ)('ˆ)(ˆ21
WrSkrwUWUuUSW TTiii
TT ˆ)(ˆ)('ˆ1
Acts like a 2-layer NNWith enhanced backprop tuning !
little critic network
0 5 10 15-1
-0.5
0
0.5
1
x2(k)
time
0 5 10 15-1
-0.5
0
0.5
e2(k)
time
0 5 10 15-2
-1
0
1
2
x2(k)
time
0 5 10 15-1
-0.5
0
0.5
e2(k)
time
Performance Results
PD control-deadzone chops out the middle
NN control fixes the problem
ud+
d-
mu
Backlash
( , , )B u u
A dynamic nonlinearity
u
w
d+
d-
u
w
1/m
+=
Nominal feedforward path + Unknown dynamic nonlinearity
( , , )B u u
Nonlinear
SystemK
v[
v1
rexd
Estimate
of Nonlinear
Function
--
x
( )f x
des
[0
--
yd
(n)
-
Backlash
-
1/s
Filter v2
Backstepping loop
des
NN Compensator
-
dx r
Kb
u
nny
FZ
Dynamic Inversion NN compensator for system with Backlash
U.S. patent- Selmic, Lewis, Calise, McFarland
Performance Results
PD control-backlash chops off tops & bottoms
NN control fixes the problem
0 1 2 3 4 5 6 7 8 9 10-1.5
-1
-0.5
0
0.5
1
time
x 1(t)
PD controller with backlash
0 1 2 3 4 5 6 7 8 9 10-0.1
-0.05
0
0.05
0.1
0.15
time
e 1(t)
0 1 2 3 4 5 6 7 8 9 10-2
-1
0
1
2
time
x 2(t)
PD controller with backlash
0 1 2 3 4 5 6 7 8 9 10-1
-0.5
0
0.5
1
time
e 2(t)
position
velocity
error
0 1 2 3 4 5 6 7 8 9 10-1.5
-1
-0.5
0
0.5
1
time
x 1(t)
PD controller with NN backlash compensation
0 1 2 3 4 5 6 7 8 9 10-0.04
-0.03
-0.02
-0.01
0
0.01
timee 1(t)
position
Tracking error
~x 1
( , )h x xo 1 2
( , )h x xc 1 2
ROBOT
Kv
[vc
kD
KvkpM-1(.)
qxx
1
2
eee
qqqdd
d
x1
x2
z2
1x)(t)(ˆ tr
Neural Network Observer
Neural Network Controller
1~x
111
212
121
~)()()ˆ,ˆ(ˆˆ
~ˆˆ
xKxMxxWz
xxz
t
k
oTo
D
NN Observers Needed when all states are not measuredi.e. Output feedback
TooDo k 1
~)ˆ(ˆ xxFW
oooooo WFWxF ˆˆ~1
Tccc rxxFW ˆ)ˆ,ˆ(ˆ
21
ccc WrF ˆˆ
Tune NN observer -
Tune Action NN -
Recurrent NN Observer
Fuzzy-Neural Control
Neural Network Properties
Learning
Recall
Function approximation
Generalization
Classification
Association
Pattern recognition
Clustering
Robustness to single node failure
Repair and reconfiguration
Nervous system cell. http://www.sirinet.net/~jgjohnso/index.html
USED
???
Input Membership Fns. Output Membership Fns.
Fuzzy Logic Rule Base
NN
Input
NN
Output
Fuzzy Associative Memory (FAM) Neural Network (NN)
INTELLIGENT CONTROL TOOLS
Input x Output u
Input x Output u
Both FAM and NN define a function u= f(x) from inputs to outputs
FAM and NN can both be used for: 1. Classification and Decision-Making2. Control
(Includes Adaptive Control)
NN Includes Adaptive Control (Adaptive control is a 1-layer NN)
x 2
x1
FL Membership Functions for 2-D Input Vector x
1
0
1 0
X1i X1
i+1
X2j
X2j+
1x1
i x1i+1
x 2j
x 2j+
1
Relation Between Fuzzy Systems and Neural Networks
Separable Gaussian activation functions for RBF NN
Separable triangular activation functions for CMAC NN
Two-layer NN as FL System
(.)
(.)
(.)
(.)
x1
x2
y1
y2
VT WT
inputs
hidden layer
outputs
xn ym
1
2
3
L
(.)
(.)
(.)
Standard thresholds
n
FL system = NN with VECTOR thresholds
.e)b,a,z(2l
ii2l
i
li
)bz(ali
liiA
rWKkr)bBaAˆ(KW WWT
W
raKkrWAKa aaT
a
rbKkrWBKb bbT
b
Gaussian membership function
Tuning laws
ControlledPlantKv[ I]
r(t)
-
Input MembershipFunctions
Fuzzy Rule Base
Output MembershipFunctions
xd(t)
e(t)
-
-)x,x(g d
x(t)
Fuzzy Logic Controllers
Dynamic Focusing of Awareness
Initial MFs
Final MFs
Depend on desired reference trajectory
Effect of change of membership function spread "a"
Effect of change of membership function elasticities "c"
2cB )b,a,z()c,b,a,z(
2
22
2
1
c
)bz(a))bz(a(cos)c,b,a,z(
Elastic Fuzzy Logic- c.f. P. WerbosWeights importance of factors in the rules
ControlledPlantKv[ I]
r(t)
-
Input MembershipFunctions
Fuzzy Rule Base
Output MembershipFunctions
xd(t)
e(t)
-
-)x,x(g d
x(t)
)x,x(grK)t(u dv raKkrWAKa aa
Ta
rbKkrWBKb bbT
b
rWKkr)cCbBaAˆ(KW WWT
W rcKkrWCKc ccT
c
Elastic Fuzzy Logic ControlControl Tune Membership Functions
Tune Control Rep. Values
Better Performance
Initial MFs
Final MFs
UnknownPlant
PerformanceEvaluator
InstantaneousUtility
r(t)
DesiredTrajectory
Action Generating NN
x(t)u(t)
tuning
d(t)
R(t)
)(ˆ xfUnknown
Plant
PerformanceEvaluator
InstantaneousUtility
r(t)
DesiredTrajectory
Action Generating NN
x(t)u(t)
FL Critic
tuning
d(t)
R(t)
)(ˆ xf
Fuzzy Logic Critic NN controller
UnknownPlant
PerformanceEvaluator
d(t)
)(^ xg u(t) x(t)
.
R
r
+
v(t)
Action Generating NN
xd(t)Kv
Kv
( 6-15 )
( 6-14 )
+
+
11ˆ;ˆ VW
-(.)
(.)
(.)
(.)
x1
x2
y1
y2
VT WT
inputs
hidden layer
outputs
xn ym
1
2
3
L
REFERENCE
input membershipfunctions
fuzzy rulle base
output membershipfunctions
+
R
Learning FL Critic Controller
,ˆ)ˆ('ˆˆ1111 VrVWrHV TTT
,ˆ)ˆ(ˆ1111 WRrVW TT
2211122222ˆˆ)ˆ('ˆ)()(ˆ WVrVWRrW TTTTT
Tune Action generating NN (controller)
Tune Fuzzy Logic Critic
FL Critic
Action generating NN
Critic requires MEMORY
User input:Reference Signal Performance
MeasurementMechanism
Reinforcement Signal
r(t)
Action Generating Neural Net
PLANT
RobustTerm
Kv
q(t)u(t)
qd(t)
v(t)
g(x)-
-+
Utility
Critic Element
R(t)
d(t)fr(t)
Control Action
( )×
( )×
( )×
( )×
y1
ym-1
ym
Input Layer Hidden
Layer
Output Layer
z2
zN-1
zN
Inpu
t Pre
-pro
cess
ing W
x1
xn-1
xn
1z1=1q
d(t)
Reinforcement Learning NN Controller
High-Level NN Controllers Need Exotic Lyapunov Fns.
1))(sgn()( trtR
)~~(21)( 1
1WFWtrrtL T
n
ii
WFRxFW T )(ˆ
)~~(21)1ln()1ln()( 1)()( WFWtreetL Ttrtr
( ) ( ~ ~ )L trT T sgn +r r W F W1
)~~()(11
1)()(
WFW
T
trtrtrtr
eeL
Reinforcement NN control
Simplified critic signal
Lyapunov Fn
Lyap. Deriv. contains R(t) !!
Tuning Law only contains R(t)
Adaptive Reinforcement Learning
,)(ˆ11 TWR
Critic is output of NN #1
)(ˆ),(ˆ 22 Td Wxxg
Action is output of second NN
,ˆ)(ˆ111 WRW T
,ˆˆ)(')(ˆ211122 WRWVrW
TT
The tuning algorithm treats this as a SINGLE 2-layer NN
Energy-based control