Sofia University “St. Kl. Ohridski”
Faculty of Mathematics and Informatics
Department of Mechatronics, Robotics and Mechanics
Iterative learning control for nonlinear
time-variant systems describing the
dynamics of robot manipulators
Kamen Krastev Delchev
Assoc. Prof., Ph.D.
Habilitation thesis for promotion to Full Professor in
4.5. Mathematics
(applied mechanics, robotics)
Sofia, Bulgaria
(March), 2018
© 2017, 2018 Kamen Delchev, [email protected]
With a feeling of deep respect, I would like to express my great gratitude to my teachers: Prof. Konstantin Markov, supervisor of my master's thesis at Sofia University "St. Kliment Ohridski"; Prof. L. Lilov, my lecturer at Sofia University "St. Kliment Ohridski"; Acad. D. E. Okhotsimsky, supervisor of my doctoral work at the Keldysh Institute of Applied Mathematics of the Russian Academy of Sciences; and to my colleagues Prof. E. Manoach, Prof. E. Zahariev and Assoc. Prof. P. Kiriazov of the Institute of Mechanics, Bulgarian Academy of Sciences; Prof. G. Boyadzhiev and Prof. E. Krastev of Sofia University "St. Kliment Ohridski"; Prof. A. K. Platonov and leading researcher V. S. Yaroshevsky of the Keldysh Institute of Applied Mathematics, Russian Academy of Sciences; as well as to all the colleagues and friends who have contributed to my development in science and to the realization of my research.
I would like to thank my family, and especially my wife, for the support, patience, understanding and help that I have received and continue to receive in my scientific work.
P R E F A C E
This habilitation thesis consists of a "General Introduction", three chapters and a "List of References", and reflects my research and the results obtained over the last ten years in the field of "Iterative Learning Control" (ILC) for nonlinear and time-variant systems modelling the dynamics of robot manipulators [1-20]. The research is motivated by the fact that ILC is the best approach to the optimization of feedforward control [1, 2, 12] through minimization of the error of the output trajectory of the system, an error which arises from inaccuracies in the structure of the model, in the values of its parameters, and from deterministic disturbances. Since the optimization is achieved by the real plant executing a certain number of iterations of the assigned motion, the number of iterations and its minimization (fast convergence) is of essential interest for the practical application of ILC, and this motivated the initial studies. The most essential motivation for developing a new ILC method came from the need to solve the problem of "transient growth" of the intermediate error in learning control. In practice, this problem renders the application of ILC pointless, and no effective method for solving it for nonlinear systems was known [6, 17]. The "bounded-error ILC algorithm" proposed in Chapter 1 [17] solves the problem described above in a natural, easy-to-implement and safe way, and constitutes the main contribution of the present habilitation thesis. Chapter 2 presents an approach to the stabilization of nonlinear and time-variant systems with a discretized feedback signal, based on the use of ILC to compensate, through feedforward control, the inaccuracies due to the discretization [19]. Chapter 3 considers an application of the new ILC method proposed in Chapter 1 to the trajectory control of a virtual PUMA 560 robot, with the aim of verifying the "bounded-error ILC algorithm" and comparing its characteristics with those of the standard ILC procedure [20].
The results presented in this habilitation thesis are published in the papers [17, 19, 20, 15, 18].
The results summarized in Chapter 1 were reported at the international conference EUROMECH Colloquium 515, Advanced Applications and Perspectives of Multibody System Dynamics, Blagoevgrad, Bulgaria, July 13-16, 2010.
Contents
General Introduction . . . . . . . . . . . . . . . . . . . . . . . 8
1 Iterative learning control for nonlinear systems: bounded-error algorithm 15
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2 Iterative learning algorithm with bounded tracking error . 20
1.3 Simulation results . . . . . . . . . . . . . . . . . . . . 26
1.4 Discussion and conclusions . . . . . . . . . . . . . . . . 32
2 Iterative Learning Control with Sampled-Data Feedback 34
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 34
2.2 Formulation of the problem . . . . . . . . . . . . . . . . 36
2.3 Convergence and robustness of the learning control with a
sampled-data feedback . . . . . . . . . . . . . . . . . . 39
2.4 Simulation results . . . . . . . . . . . . . . . . . . . . 48
2.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . 55
3 Iterative learning control with bounded transient errors: an application for PUMA 560 robotic manipulator 57
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 57
3.2 Dynamic model of PUMA560 robot . . . . . . . . . . . 59
3.3 Iterative learning control for robotic manipulators. . . . . 61
3.4 Iterative learning algorithm with a bounded tracking error 66
3.5 Simulation results and discussions . . . . . . . . . . . . 70
3.5.1 Simulation results in cases of an unacceptable initial
tracking error . . . . . . . . . . . . . . . . . . . . . . 72
3.5.2 Simulation results in cases of an unacceptable transient
error . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.5.3 Simulation results in cases of both initial and transient
error . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 82
4 References 84
Appendix A 88
Appendix B 90
General Introduction
Survey of the state-of-the-art in the field and
motivation
Iterative learning control (ILC) improves the tracking accuracy of repetitive processes. ILC is based on the idea that information from the previous trial is used to update the control input in order to obtain a better performance of the assigned task on the next trial [1-17]. Some other definitions of ILC are quoted by Ahn et al. in [2]. A common feature among them is the "repetition" of the assigned-task performance and the usage of the information from previous trials (passes, iterations, cycles or repetitions).
Therefore, ILC is applicable to the optimization of repetitive processes, especially of industrial robot motions (motions of rigid-body robots in an industrial environment) [14-16, 19, 20].
In classical ILC the following postulates are required [2, 5], although recently some ILC algorithms have been reported in which these postulates are broken or relaxed (adopted from [2]):

• Every trial ends in a fixed time duration.

In this habilitation thesis (Chapter 1) we present a new bounded-error learning algorithm which does not satisfy this postulate.

• Repetition of the initial setting (initial state coordinates) is satisfied.

This postulate has also been weakened by Heinzinger et al. in [3, 4], where the influence of errors in the initial conditions on the ILC convergence is proven.

• Invariance of the system dynamics is ensured throughout the repetition.

• The system output is measured in a deterministic way.

According to the first postulate, the control input trajectory should be updated off line, and this is an advantageous way of updating.
The classical offline ILC scheme for a plant with a feedback controller attached is depicted in Fig. 1,
where P represents the plant; C is a feedback controller; L is a learning controller; the input trajectory $u_l$ is the feedforward term of the control law; $y_l$ is the actual output trajectory; $y_d$ is the desired output trajectory. The feedforward term $u_{l+1}$ improves trajectory tracking, and $l$ is the iteration number.
With respect to the learning controller L (the learning update law), ILC can be categorized as P-type (update proportional to the state-space error), D-type (update proportional to the error derivative), I-type (update proportional to the integral of the state-space error), PI-type, PD-type, and PID-type [3, 4, 11, 12].
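By way of illustration, these update types can be sketched as follows. This is a minimal Python sketch with hypothetical gain matrices; the error signal `e` is assumed to be the sampled tracking error $y_d - y_l$ stored as a (channels × samples) array, which is not a convention fixed by the cited works:

```python
import numpy as np

def p_type_update(u, e, Lp):
    # P-type: u_{l+1}(t) = u_l(t) + Lp e_l(t)
    return u + Lp @ e

def d_type_update(u, e, dt, Ld):
    # D-type: u_{l+1}(t) = u_l(t) + Ld d/dt e_l(t)
    return u + Ld @ np.gradient(e, dt, axis=1)

def pd_type_update(u, e, dt, Lp, Ld):
    # PD-type: combination of the proportional and derivative terms
    return u + Lp @ e + Ld @ np.gradient(e, dt, axis=1)
```

I-type and PID-type variants would add a cumulative (integrated) error term in the same fashion.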
In a similar way, a closed-loop ILC can be classified according to the type of the attached feedback controller (P, PD, PID, ...).
Thus, the resulting controller-based classification of a closed-loop ILC (based on the learning and feedback controllers) can be P-P, P-PD, PD-D, or any other combination of the types of the learning and feedback controllers, respectively [11].
On the other hand, with regard to the model of the plant, ILC can be classified as linear (LILC, an ILC for linear systems) [2, 7, 9] or nonlinear (NILC) [1-4, 11-13] iterative learning control.
Linear iterative learning control (LILC) is an ILC for linear systems or one based on a linear model of a nonlinear system [1, 2]. If the linear approximation of a nonlinear dynamics results in large uncertainties, the corresponding LILC may fail to ensure the admissible tracking accuracy. In this case, one should resort to nonlinear models and nonlinear iterative learning control (NILC) [1, 12].
Fig. 1. Learning control application for a plant with a feedback controller attached [3].
Moreover, the ILC update law can be classified as linear or nonlinear with respect to the learning operator [12]. The ILC update laws proposed in [1] for linear or nonlinear dynamic systems utilize learning operators which are constant gain matrices or do not involve the robot dynamic model. Another approach to learning-operator synthesis is based on dynamic models with estimated parameters [14].
Classical linear feedback control (PD or PID) has many applications in practice, but linear PD/PID control of nonlinear and uncertain systems is not adequate for precise tracking [11]. Thus, the combination of nonlinear feedback control and nonlinear iterative learning control is an effective approach to achieving good tracking performance [11].
The main advantages of the offline closed-loop ILC are as follows:

• The ILC minimizes trajectory-tracking errors that arise from unmodelled dynamics (uncertain systems). It is well known in the literature [1, 2, 12] that the best approach for improving the feedforward control of uncertain systems is iterative learning control.

• The ILC procedure can avoid closed-loop stability problems, because at the end of the ILC procedure the feedback terms of the control input are reduced, so the feedforward controller dominates the feedback controller [16].

• The calculation of the ILC control update is performed offline, so the update law can be complicated and all available tools for data processing (filtering, interpolation, etc.) can be utilized.
However, the applicability of ILC suffers from the following basic problems:

• An admissible number of iterations should produce the desired tracking accuracy. Therefore, the ILC procedure must be convergent with a high rate of convergence [1, 9, 11, 12, 14].

• Even when the initial tracking error is small enough, the ILC can fail due to non-monotonic convergence that results in a big transient error [2, 6, 11, 14, 15], as shown in Fig. 2. The learning transient (transient growth) is the major problem for the applicability of an ILC procedure [15].
Computer simulations presented in [6, 15, 17] reveal the non-monotonicity of the maximal tracking errors versus iterations, which leads to inapplicability of the corresponding ILC. Numerous publications are therefore available on conditions for monotonic convergence of the ILC, and most of them consider linear systems [2, 7, 9]. A monotonically convergent ILC for nonlinear systems (Hamiltonian systems) is proposed in [13], but an assumption of large enough feedback gains must hold. Similar ideas for a monotonically convergent switching-gain PD-PD type ILC, for a specific class of nonlinear time-varying systems, based on increasing feedback gains, are addressed in [11]. Following these works and the ideas for accelerating the ILC convergence reported in [2], a simulation-based design of a monotonically convergent PD-type ILC for nonlinear time-varying systems is proposed in [18]. It has to be mentioned that [18] presents an engineering approach to the practical synthesis of a monotonically convergent ILC law, while the general solution (the bounded-error algorithm) of the problem of transient errors (see Fig. 2) is presented in [17].

Fig. 2. Maximal tracking error of the TT3000 robot.
An NILC algorithm aimed at solving the problems of a big initial tracking error and/or the transient growth is proposed in [15]. The algorithm is based on dividing the time interval of the trajectory performance into N subintervals: the NILC is executed for the first subinterval, then for the first and second subintervals, and so forth. Considerable difficulties in the implementation of this algorithm arise from the necessity of determining N through computer simulations so that an acceptable maximal ILC tracking error is obtained.
The main purpose of this habilitation thesis is to present some new methods, ILC designs, results and applications of the ILC theory, motivated by the above-mentioned problems of ILC applicability (in particular, the transient growth in ILC) and by the stabilization problems of nonlinear time-variant systems with sampled-data feedback that arise from the discretization of the feedback signal.
The work is organized as follows:
In Chapter 1 [17] we propose a new approach (the bounded-error algorithm) motivated by the major problem for ILC implementation, the transient growth. It is not aimed at the monotonic convergence of the ILC procedure; instead, it limits the tracking (transient) error (Fig. 2) in a simple, efficient and safe way with respect to the application of the learning procedure. This algorithm ensures that the trajectory-tracking errors of the proposed NILC, when implemented, are bounded by a given error norm bound. A special feature of the algorithm is that the trial-time interval is finite but not fixed, as it is for other iterative learning algorithms. A sufficient condition for convergence and robustness of the bounded-error learning procedure is derived.
Chapter 2 [19] deals with the improvement of the stability of sampled-data (SD) feedback control for nonlinear multiple-input multiple-output time-varying systems, such as robotic manipulators, by incorporating an off-line model-based nonlinear iterative learning controller. The proposed scheme of nonlinear iterative learning control (NILC) with SD feedback is applicable to a large class of robots, because sampled-data feedback is required for model-based feedback controllers, especially for robotic manipulators with complicated dynamics (6 or 7 DOF, or more), while the feedforward control from the off-line iterative learning controller can be assumed to be continuous. The robustness and convergence of the proposed NILC law with SD feedback is proven, and the derived sufficient condition for convergence is the same as the condition for an NILC with continuous feedback control.
Chapter 3 [20] is concerned with a model-based nonlinear iterative learning control (NILC) for nonlinear multiple-input multiple-output mechanical systems of robotic manipulators. A new strategy for the NILC implementation, based on the new bounded-error algorithm, is proposed. This algorithm ensures that the trajectory-tracking errors of the proposed NILC, when implemented, are bounded by a given error norm bound. Both the standard and the bounded-error learning control laws with attached feedback controllers are considered. The NILC synthesis is based on a dynamic model of a 6-DOF robotic manipulator. The dynamic model includes viscous and Coulomb friction, and the input generalized torques are bounded. With respect to the bounded-error and standard learning processes applied to a virtual PUMA 560 robot, simulation results are presented in order to verify the maximal tracking errors, convergence and applicability of the proposed learning control.
Chapter 1
Iterative learning control for nonlinear systems:
bounded-error algorithm
1.1 Introduction
The problem of the increased transient error (transient growth) of an ILC system
is due to the non-monotonic convergence of the tracking error (Fig. 1.2). The
problem formulation is based on the convergence theorem presented by
Heinzinger et al. in [3, 4] concerning a learning scheme (Fig. 1.1):
Fig. 1.1. Learning control scheme with a feedback controller attached.

where P represents the plant; C is a feedback controller; M is memory; the input trajectory $u_l$ is the feedforward term of the control law; $y_l$ is the actual output trajectory; $y_d$ is the desired output trajectory. The feedforward term $u_{l+1}$ improves the trajectory-tracking performance according to the following standard update law ($l = 0, 1, 2, \dots$):

$$u_{l+1}(t) = u_l(t) + L(y_l(t), t)\,\bigl(\dot y_d(t) - \dot y_l(t)\bigr), \qquad (1.1)$$

where $t \in [0, T]$ is the tracking time and $[0, T]$ is the time interval of the trial duration; $L(y_l(t), t)$ is a nonlinear time-varying learning operator and $u_0$ is the initial control input.
The learning scheme in Fig. 1.1 is convenient for describing the following class of multi-input multi-output nonlinear time-varying state-space equations:

$$\dot x_l(t) = f(x_l(t), t) + B(x_l(t), t)\,u_l(t) + \omega_l(t), \qquad y_l(t) = g(x_l(t), t), \qquad (1.2)$$

where, for $l \in \{0, 1, \dots\}$ and all $t \in [0, T]$, $x_l(t) \in \mathbb{R}^n$, $y_l(t) \in \mathbb{R}^m$, $u_l(t) \in \mathbb{R}^r$ are not necessarily continuous, and $\omega_l(t) \in \mathbb{R}^n$ represents both deterministic and random disturbances. The functions $f: \mathbb{R}^n \times [0, T] \to \mathbb{R}^n$ and $B: \mathbb{R}^n \times [0, T] \to \mathbb{R}^{n \times r}$ are piecewise continuous in $t \in [0, T]$, and $g: \mathbb{R}^n \times [0, T] \to \mathbb{R}^m$ is differentiable in $x$ and $t$, with partial derivatives $g_x(\cdot,\cdot)$ and $g_t(\cdot,\cdot)$. In addition, the following assumptions hold (adopted from [3, 4]):

I. For each fixed initial state $x(0)$ with $\omega(\cdot) \equiv 0$, the output map $O: C([0,T], \mathbb{R}^r) \to C([0,T], \mathbb{R}^m)$ and the state map $S: C([0,T], \mathbb{R}^r) \to C([0,T], \mathbb{R}^n)$ are one-to-one. In this notation, $y_l(\cdot) = O(u_l(\cdot), x_l(0))$ and $x_l(\cdot) = S(u_l(\cdot), x_l(0))$.

II. The disturbance $\omega_l(\cdot)$ is bounded on $[0, T]$, i.e. $\|\omega(t)\| \le b_\omega$, where $\|\cdot\|$ is the Euclidean norm.

III. The functions $f(\cdot,\cdot)$, $B(\cdot,\cdot)$, $g_x(\cdot,\cdot)$, and $g_t(\cdot,\cdot)$ are uniformly globally Lipschitz in $x$ on the interval $[0, T]$.

IV. The operators $B(\cdot,\cdot)$ and $g_x(\cdot,\cdot)$ are bounded on $[0, T] \times \mathbb{R}^n$.

V. All functions are assumed measurable and integrable.
Lemma 1.1 [3]. If $\{a_l\}$, $l \in \{0, 1, \dots\}$, is a sequence of real numbers such that $a_{l+1} \le \rho\, a_l + \varepsilon$, $0 \le \rho < 1$, then $\limsup_{l \to \infty} a_l \le \varepsilon\,(1 - \rho)^{-1}$.

The proof of Lemma 1.1 is presented in [3].
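Lemma 1.1 can also be checked numerically. The sketch below iterates the extreme case $a_{l+1} = \rho\, a_l + \varepsilon$ of the inequality and compares the limit with $\varepsilon\,(1-\rho)^{-1}$; the values of $\rho$, $\varepsilon$ and the start value are arbitrary test choices:

```python
rho, eps = 0.8, 0.1     # contraction factor and offset (arbitrary test values)
a = 5.0                 # arbitrary initial term a_0
for _ in range(200):
    a = rho * a + eps   # a_{l+1} = rho * a_l + eps (equality: the worst case)
limit = eps / (1.0 - rho)
# the sequence settles at eps / (1 - rho), as Lemma 1.1 predicts
```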
Let us consider the following learning update law

$$u_{l+1}(t) = (1 - \lambda)\,u_l(t) + \lambda\,u_0(t) + L(y_l(t), t)\,\bigl(\dot y_d(t) - \dot y_l(t)\bigr), \qquad (1.3)$$

where $L: \mathbb{R}^m \times [0, T] \to \mathbb{R}^{r \times m}$ is a bounded learning operator and $\lambda \in [0, 1)$ allows the influence of a bias term [3, 4].
Theorem 1.1 [3, 4]. Let the system described by Eq. (1.2) satisfy assumptions I-V and use the update law (1.3). Given an attainable desired trajectory $y_d(t) = g(x_d(t), t)$, $t \in [0, T]$, $u_d(t)$ is the corresponding input and $x_d(t)$ is the corresponding state (according to assumption I). If

$$\bigl\| (1 - \lambda)\,I - L(g(x, t), t)\,g_x(x, t)\,B(x, t) \bigr\| \le \bar\rho < 1, \qquad \forall\,(x, t) \in \mathbb{R}^n \times [0, T], \qquad (1.4)$$

and the initial state error $\|x_d(0) - x_l(0)\|$ is bounded by $b_{x0}$, then, as $l \to \infty$, the error between $u_l(t)$ and $u_d(t)$ is bounded. In addition, the state and output asymptotic errors are bounded. These bounds depend continuously on the bound on the initial state error, the bound on the state disturbance, and $\lambda$; as $b_{x0}$, $b_\omega$, and $\lambda$ tend to zero, these bounds also tend to zero.
The proof of Theorem 1.1 is presented by Heinzinger et al. in [3, 4], where, in particular, it is proven that

$$\|\delta u_{l+1}(t)\|_\lambda \le \bar\rho\,\|\delta u_l(t)\|_\lambda + \varepsilon, \qquad 0 \le \bar\rho < 1, \qquad (1.5)$$

and

$$\|\delta x_l(t)\|_\lambda \le k\,\bigl(\|\delta x_l(0)\| + \|\delta u_l(t)\|_\lambda\bigr), \qquad \|\delta y_l(t)\|_\lambda \le k_g\,\|\delta x_l(t)\|_\lambda, \qquad (1.6)$$

where $\delta u_l(t) = u_d(t) - u_l(t)$, $\delta x_l(t) = x_d(t) - x_l(t)$, $\delta y_l(t) = y_d(t) - y_l(t)$, $k = k(\lambda, T) > 0$, $\bar\rho, \varepsilon \ge 0$, $\varepsilon$ combines $b_{x0}$, $b_\omega$ and $\lambda\,\|\delta u_0\|_\lambda$, and $k_g \in (0, \infty)$ is the Lipschitz constant for $g(x_l(t), t)$. The time-weighted norm ($\lambda$-norm) is defined by

$$\|u_l(t)\|_\lambda = \sup_{t \in [0, T]} e^{-\lambda t}\,\|u_l(t)\|. \qquad (1.7)$$

Applying Lemma 1.1 to inequality (1.5) and using Eq. (1.6) yields

$$\limsup_{l \to \infty} \|\delta u_l(t)\|_\lambda \le \varepsilon\,(1 - \bar\rho)^{-1}, \qquad \limsup_{l \to \infty} \|\delta y_l(t)\|_\lambda \le b_y, \quad b_y = k_g\,k\,\bigl(b_{x0} + \varepsilon\,(1 - \bar\rho)^{-1}\bigr). \qquad (1.8)$$

If $b_{x0} \to 0$, $b_\omega \to 0$, and $\lambda \to 0$, then $\varepsilon \to 0$ and $b_y \to 0$.

Because in all practical cases the input signal belongs to a compact subset of $\mathbb{R}^r$, further we will consider that $\|u_l(t)\|_s := \sup_{t \in [0, T]} \|u_l(t)\| \le b_u$. If the ILC update law (see Eq. (1.3)) produces a feedforward input that is outside the allowable compact set of inputs, and this set is convex, then projecting back into the allowable set preserves the validity of Theorem 1.1 [3, 4].
Thus, the robustness and monotonic convergence of the control-input error with respect to the corresponding time-weighted norm is proven, and the monotonic convergence of the state-error time-weighted norm $\|\delta x_l(t)\|_\lambda$ and of the output-error time-weighted norm $\|\delta y_l(t)\|_\lambda$ is proven as well (see Eq. (1.6)) [3, 4]. Moreover, the convergence of the PD-type and PID-type versions of the standard learning update law (Eq. (1.3)), with nonlinear proportional (P) and integral (I) learning gains, is proven in [3]. Unfortunately, the monotonic convergence of the time-weighted norm $\|\delta u_l(t)\|_\lambda$ is not a sufficient condition for monotonic convergence of the corresponding supremum norm $\|\delta u_l(t)\|_s = \sup_{t \in [0, T]} \|\delta u_l(t)\|$, because

$$\|\delta u_l(t)\|_\lambda \le \|\delta u_l(t)\|_s \le e^{\lambda T}\,\|\delta u_l(t)\|_\lambda, \qquad t \in [0, T]. \qquad (1.9)$$

Obviously, if $\|\delta u_l(t)\|_\lambda$, with $\lambda > 0$ or (and) $\lambda T \gg 0$, is convergent to zero with respect to $l$, then $\|\delta u_l(t)\|_s$ and $\|\delta y_l(t)\|_s$ are also zero-convergent (see Eq. (1.9)), but a large transient error $\|\delta y_l(t)\|_s$ is possible [2, 6, 12], as shown in Fig. 1.2.
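The gap between the $\lambda$-norm of Eq. (1.7) and the supremum norm in Eq. (1.9) can be illustrated numerically. The error profile below, with a large transient late in the trial, is a hypothetical example chosen only for illustration:

```python
import numpy as np

T, lam = 1.0, 8.0
t = np.linspace(0.0, T, 1001)
# hypothetical error profile: small at first, with a large transient near t = 0.9 T
err = 0.01 + 10.0 * np.exp(-100.0 * (t - 0.9 * T) ** 2)

sup_norm = np.max(np.abs(err))                     # supremum norm ||.||_s
lam_norm = np.max(np.exp(-lam * t) * np.abs(err))  # lambda-norm, Eq. (1.7)

# Eq. (1.9): ||.||_lambda <= ||.||_s <= e^{lam T} ||.||_lambda
assert lam_norm <= sup_norm <= np.exp(lam * T) * lam_norm
# the lambda-norm stays tiny while the supremum norm sees the full transient
```

Both norms satisfy Eq. (1.9), yet the $\lambda$-norm remains small while the supremum norm is dominated by the transient, which is exactly why $\lambda$-norm convergence alone does not prevent transient growth.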
Fig. 1.2. Maximal tracking error of the PUMA 560 robot.

Referring to Fig. 1.2, one can see that the corresponding learning control is not applicable, and an optimization of the ILC is necessary to reduce the transient error. However, the only approach reported in the literature [2, 12] for transient-error reduction is the achievement of a monotonic convergence of the tracking error by imposing conditions on the learning operator (learning gains) [2, 11, 13].

In the next section we present a new approach to the problem of the increased transient error of an ILC system. It is based on the natural idea of stopping the tracking of the desired trajectory if the output error in the time domain is so large that further tracking is impossible. For example, if a robot tracks a desired trajectory and the corresponding tracking error increases dramatically, then the robot should stop moving to prevent damage to the robot or its surroundings.
This requirement raises two basic questions:

• How can the feedforward control (Eq. (1.1)) be updated using only a part of the data $y_l(t)$, $t \in [0, T_l^s]$, where $T_l^s < T$, in the case of a premature termination of the trial performance due to the increased tracking error?

• If we have answered the first question and continue the learning procedure, is this procedure asymptotically convergent?

Chapter 1 is aimed at answering both questions.

This chapter is organized as follows. Section 1.2 presents the main result of this habilitation thesis: a new algorithm for NILC. The new algorithm is validated and examined by computer simulation in Section 1.3.
1.2 Iterative learning algorithm with bounded tracking error

In this section, we present the main result of this habilitation thesis: a new algorithm for nonlinear iterative learning control (for the system in Eq. (1.2)) with a bounded supremum norm of the tracking error, together with the proof of its robustness and convergence.

The main idea of the new algorithm is that the output trajectory at each iteration must lie inside a hypertube of width $2\varepsilon$ around the desired trajectory, i.e.:

$$y_l(t) \in Y(\varepsilon) = \bigl\{\, y(t) : \|y_d(t) - y(t)\| \le \varepsilon,\ t \in [0, T^s],\ T^s \in (0, T] \,\bigr\}, \qquad l = 0, 1, \dots, \qquad (1.10)$$

where $y_l(t)$, $t \in [0, T_l^s]$, $T_l^s \in (0, T]$, is the output trajectory in Eq. (1.2) and $\varepsilon > e^{\lambda T} b_y$ is a preliminarily given error norm bound. It has to be mentioned that, according to Theorem 1.1, the tracking accuracy of the output trajectory of the standard ILC process (update law (1.3)) is $\|\delta y_l(t)\|_s \le e^{\lambda T} b_y$, $l \to \infty$ (see Eqs. (1.8) and (1.9)); obviously, the radius $\varepsilon$ of the hypertube must be greater than this accuracy.

Given an attainable desired output trajectory and an error bound $\varepsilon$ ($> e^{\lambda T} b_y$), the new algorithm for the implementation of the NILC procedure can be formulated as follows:
i. Set the initial iteration number $l = 0$ and begin the iterative procedure.

ii. Starting from the initial position $y_l(0)$, the system tracks the desired trajectory under the control $u_l(t)$ until $\|y_d(T_l^s) - y_l(T_l^s)\| = \varepsilon$ or the end position $y_l(T)$ is reached. When $t = T_l^s$, $T_l^s \in (0, T]$, the tracking process stops.

iii. After the current tracking performance has finished, the learning controller updates the feedforward control term according to the following learning update law:

$$u_{l+1}(t, T_l^s) = (1 - \lambda)\,u_l(t, T_{l-1}^s) + \lambda\,u_0(t) + \Delta u_l^*(t, T_l^s), \qquad u_0(t, T_0^s) = u_0(t),$$

$$\Delta u_l^*(t, T_l^s) = \begin{cases} L(y_l(t), t)\,\bigl(\dot y_d(t) - \dot y_l(t)\bigr), & t \in [0, T_l^s],\ T_l^s \in (0, T]; \\ 0, & t \in (T_l^s, T]. \end{cases} \qquad (1.11)$$

iv. If the output error $\|\delta y_l(t)\|_s$ is less than or equal to an acceptable tracking accuracy, then exit from the learning procedure; else set $l = l + 1$ and go to step (ii).
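Steps (i)-(iv) can be sketched on a deliberately simple toy example. The memoryless plant, the gains, and the bound $\varepsilon$ below are hypothetical choices that only illustrate the control flow of the algorithm (early stopping at $T_l^s$, update on $[0, T_l^s]$ only), not the thesis model:

```python
import numpy as np

def run_trial(u, eps, y_d):
    # Step (ii): track y_d sample by sample and stop once the error reaches eps.
    # Toy memoryless plant with uncertain gain and offset: y_k = 0.8 u_k + 0.05.
    y = []
    for k in range(len(u)):
        y.append(0.8 * u[k] + 0.05)
        if abs(y_d[k] - y[-1]) >= eps:
            return np.array(y), k + 1      # premature stop: T_l^s < T
    return np.array(y), len(u)             # full trial: T_l^s = T

N, eps, L = 100, 0.3, 1.0
y_d = np.linspace(0.0, 1.0, N)             # desired output trajectory
u = np.zeros(N)                            # step (i): l = 0, u_0 = 0

for l in range(200):
    y, ks = run_trial(u, eps, y_d)
    e = y_d[:ks] - y                       # error is known only on [0, T_l^s]
    u[:ks] += L * e                        # step (iii): update on [0, T_l^s], zero beyond
    if ks == N and np.max(np.abs(e)) < 1e-3:
        break                              # step (iv): acceptable accuracy reached
```

Each premature stop extends the learned horizon; once the whole trajectory stays inside the hypertube, the iteration reduces to the standard update, mirroring the two phases of the learning process.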
Thus, we are in a position to prove the robustness and convergence of the new bounded-error learning algorithm.

Corollary 1.1. If the update law (1.3) is replaced with (1.11), then Theorem 1.1 still holds.

Proof. The main idea of the proof is to show that a sequence

$$\bigl\{\, \|\delta u_{l+1}(t, T_l^s)\|_\lambda \,\bigr\}, \qquad l \in \{0, 1, \dots\},\ t \in [0, T], \qquad (1.12)$$

exists and

$$\|\delta u_{l+1}(t, T_l^s)\|_\lambda \le \rho^*\,\|\delta u_l(t, T_{l-1}^s)\|_\lambda + \varepsilon^*, \qquad 0 \le \rho^* < 1, \qquad (1.13)$$

and therefore that Lemma 1.1 can be applied.

At first, we have to prove that the sequence defined by (1.12) exists and that inequality (1.13) holds when the update law is given by Eq. (1.11), $T_l^s \in (0, T)$, and the learning process runs in a hypertube of width $2\varepsilon$ around the desired trajectory. Then we shall prove that there exists $l^*$ such that, for all $l \ge l^*$, $T_l^s = T$ and $\|\delta y_l(T_l^s)\| < \varepsilon$.

Thus, the iterative learning process consists of two phases. During the first phase $l \in \{0, 1, \dots, l^*\}$, $t \in [0, T_l^s]$, and $T_l^s \in (0, T)$, and during the second one $l \ge l^*$, $t \in [0, T]$, and $T_l^s = T$. Moreover, the update laws (1.3) and (1.11) are equivalent when $T_l^s = T$, so we can directly apply Theorem 1.1 during the second phase, when $l \to \infty$, $t \in [0, T]$, and the entire output trajectory (from the beginning at $t = 0$ to the end at time $t = T$) lies inside the hypertube.
For simplification, we introduce the following notations:

$x_l = x_l(t)$, $x_d = x_d(t)$, $y_l = y_l(t)$, $y_d = y_d(t)$, $u_l = u_l(t)$, $u_d = u_d(t)$, $\omega_l = \omega_l(t)$, $f_l = f(x_l(t), t)$, $f_d = f(x_d(t), t)$, $B_l = B(x_l(t), t)$, $B_d = B(x_d(t), t)$,

$g_{xl} = \partial g(x, t)/\partial x\,\big|_{x = x_l(t)}$, $\quad g_{xd} = \partial g(x, t)/\partial x\,\big|_{x = x_d(t)}$, $\quad g_{tl} = \partial g(x, t)/\partial t\,\big|_{x = x_l(t)}$, $\quad g_{td} = \partial g(x, t)/\partial t\,\big|_{x = x_d(t)}$,

and $k_{gx}$, $k_{gt}$, $k_f$, $k_B$ are the Lipschitz constants for $g_x(\cdot,\cdot)$, $g_t(\cdot,\cdot)$, $f(\cdot,\cdot)$, and $B(\cdot,\cdot)$, respectively, and $b_{gx}$, $b_{ud}$, and $b_B$ are the corresponding norm bounds on $g_x(\cdot,\cdot)$, $u_d(\cdot)$, and $B(\cdot,\cdot)$, and $b_d = \sup_{t \in [0, T]} \|f_d + B_d u_d\|$.
In the case of $T_l^s \in (0, T)$ we consider the time interval $[0, T_l^s]$ and $\delta y_l(t) = y_d(t) - y_l(t)$. Using the above notations, from Eq. (1.2) we obtain:

$$\delta \dot y_l = g_{xd}\,(f_d + B_d u_d) + g_{td} - g_{xl}\,(f_l + B_l u_l + \omega_l) - g_{tl}. \qquad (1.14)$$

Combining Eq. (1.14) with

$$g_{xl}\,(f_d + B_d u_d) - g_{xl}\,(f_l + B_l u_l) = g_{xl}\,(f_d - f_l) + g_{xl}\,(B_d - B_l)\,u_d + g_{xl}\,B_l\,(u_d - u_l),$$

taking norms, and using the bounds and the Lipschitz conditions (see assumptions II-IV) yields:

$$\|\delta \dot y_l\| \le \bigl(k_{gx} b_d + k_{gt} + b_{gx}\,(k_f + k_B b_{ud})\bigr)\,\|\delta x_l\| + b_{gx} b_B\,\|\delta u_l\| + b_{gx} b_\omega. \qquad (1.15)$$
Noticing that $\|\delta u_l\| \le b_u + b_{ud}$, from inequality (1.15) and inequality (1.6), using Eq. (1.9), we obtain that $\|\delta \dot y_l\|$ is bounded, and consequently $\delta y_l(t)$ is Lipschitz on $t \in [0, T_l^s]$. Thus, $\delta y_l(t)$ is continuous on $[0, T_l^s]$ with $\|\delta y_l(0)\| \le k_{y0}$ and $\|\delta y_l(T_l^s)\| = \varepsilon > k_{y0}$, where $k_{y0} = e^{\lambda T} k_g k\, b_{x0}$. Therefore, there exists $t^* \in (0, T_l^s)$ such that

$$\|\delta y_l(t^*)\| = 0.5\,(\varepsilon + k_{y0}).$$

On the other hand, $\|\delta y_l(T_l^s)\| - \|\delta y_l(t^*)\| \le k_y\,(T_l^s - t^*)$, where $k_y \in (0, \infty)$ is the Lipschitz constant for $\delta y_l(t)$, and $\|\delta y_l(T_l^s)\| - \|\delta y_l(t^*)\| = 0.5\,(\varepsilon - k_{y0}) =: b_t$. Consequently,

$$T_l^s - t^* \ge \frac{b_t}{k_y} > 0. \qquad (1.16)$$

We can apply Theorem 1.1 for $t \in [0, t^*]$, so that from Eq. (1.6), using Eq. (1.9), we have:

$$\|\delta y_l(t^*)\| \le \sup_{t \in [0, t^*]} \|\delta y_l(t)\| \le k_{y0} + k_u \sup_{t \in [0, t^*]} \|\delta u_l(t)\|, \qquad k_u = e^{\lambda T} k_g k, \qquad (1.17)$$

and consequently

$$\sup_{t \in [0, t^*]} \|\delta u_l(t)\| \ge \frac{b_t}{k_u} > 0. \qquad (1.18)$$
In the case of $T_l^s \in (0, T)$, from (1.11) and (1.7), for $\|\delta u_{l+1}(t, T_l^s)\|_\lambda$ and $\|\delta u_l(t, T_{l-1}^s)\|_\lambda$ it follows that

$$\|\delta u_{l+1}(t, T_l^s)\|_\lambda = \sup_{t \in [0, T]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| = \max\Bigl\{ \sup_{t \in [0, T_l^s]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\|,\ \sup_{t \in [T_l^s, T]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| \Bigr\} \qquad (1.19)$$

and

$$\|\delta u_l(t, T_{l-1}^s)\|_\lambda = \sup_{t \in [0, T]} e^{-\lambda t}\,\|\delta u_l(t, T_{l-1}^s)\| = \max\Bigl\{ \sup_{t \in [0, T_l^s]} e^{-\lambda t}\,\|\delta u_l(t, T_{l-1}^s)\|,\ \sup_{t \in [T_l^s, T]} e^{-\lambda t}\,\|\delta u_l(t, T_{l-1}^s)\| \Bigr\}. \qquad (1.20)$$

We consider both possible cases for $\|\delta u_{l+1}(t, T_l^s)\|_\lambda$: when

$$\sup_{t \in [0, T]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| = \sup_{t \in [T_l^s, T]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| \qquad (1.21)$$

and when

$$\sup_{t \in [0, T]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| = \sup_{t \in [0, T_l^s]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\|. \qquad (1.22)$$
In the case when equality (1.21) holds, choose $\lambda^*$ such that

$$e^{-\lambda^*\,(b_t/k_y)}\,(b_u + b_{ud}) \le \frac{b_t}{2\,k_u}.$$

Then from inequalities (1.16) and (1.18) it follows that

$$e^{-\lambda^* T_l^s}\,(b_u + b_{ud}) \le e^{-\lambda^* t^*}\,\frac{b_t}{2\,k_u} \le \tfrac{1}{2}\,\sup_{t \in [0, t^*]} e^{-\lambda^* t}\,\|\delta u_l(t)\|,$$

and, using $\sup_{t \in [T_l^s, T]} \|\delta u_{l+1}(t, T_l^s)\| \le b_u + b_{ud}$, Eq. (1.9) for $t \in [0, t^*]$, and the inequality $\sup_{t \in [0, t^*]} e^{-\lambda^* t}\,\|\delta u_l(t, T_{l-1}^s)\| \le \sup_{t \in [0, T_l^s]} e^{-\lambda^* t}\,\|\delta u_l(t, T_{l-1}^s)\|$, we obtain

$$\sup_{t \in [T_l^s, T]} e^{-\lambda^* t}\,\|\delta u_{l+1}(t, T_l^s)\| \le \tfrac{1}{2}\,\sup_{t \in [0, T_l^s]} e^{-\lambda^* t}\,\|\delta u_l(t, T_{l-1}^s)\|. \qquad (1.23)$$

Noticing that $0 \le \tfrac{1}{2} < 1$, from Eq. (1.19) and Eq. (1.23) we have that

$$\sup_{t \in [0, T]} e^{-\lambda^* t}\,\|\delta u_{l+1}(t, T_l^s)\| \le \sup_{t \in [0, T_l^s]} e^{-\lambda^* t}\,\|\delta u_l(t, T_{l-1}^s)\|. \qquad (1.24)$$

Taking into account that $\varepsilon^* \ge 0$, inequality (1.23) combined with Eq. (1.24) and Eq. (1.21), by using Eq. (1.7), yields (1.12) and (1.13) for $\rho^* = \max(\bar\rho, \tfrac{1}{2})$.
In the case when equality (1.22) holds, we can apply Theorem 1.1 for $t \in [0, T_l^s]$, because the update laws (1.3) and (1.11) are equivalent on $[0, T_l^s]$; from inequality (1.5), using Eq. (1.7), we obtain

$$\sup_{t \in [0, T_l^s]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| \le \bar\rho\,\sup_{t \in [0, T_l^s]} e^{-\lambda t}\,\|\delta u_l(t, T_{l-1}^s)\| + \varepsilon. \qquad (1.25)$$

Combining equalities (1.22) and (1.24) with inequality (1.25) for $t \in [0, T]$, and with $\rho^* = \max(\bar\rho, \tfrac{1}{2})$, yields (1.12) and inequality (1.13).
Now, assuming that $l \to \infty$ while $T_l^s \in (0, T)$ and $t \in [0, T]$, we are in a position to apply Lemma 1.1. Thus, from inequalities (1.8) we have that

$$\limsup_{l \to \infty} \|\delta y_{l+1}(t, T_l^s)\|_\lambda \le b_y^*, \qquad (1.26)$$

and consequently $\|\delta y_{l+1}(T_l^s, T_l^s)\| \le e^{\lambda T} b_y^*$ (see Eq. (1.9)). On the other hand, from the algorithm description (see step (ii)), $\|\delta y_{l+1}(T_l^s, T_l^s)\| = \varepsilon$, implying that $\varepsilon \le e^{\lambda T} b_y^*$; but $\varepsilon > e^{\lambda T} b_y^*$ by definition. Therefore, the above assumption is not valid, and there exists $l^*$ such that, for all $l \ge l^*$, $T_l^s = T$ and $\|\delta y_l(T_l^s)\| < \varepsilon$. Thus, the results of Theorem 1.1 are obtained during the second phase of the iterative learning process, when $l \ge l^*$. Consequently, Theorem 1.1 still holds.
1.3 Simulation results
In this section, we present the simulation results from the implementation of the bounded-error NILC algorithm described in the previous section. These results are compared with the simulation results from the standard NILC algorithm. We consider a dynamic model of the TT-3000 SCARA-type robot of SEIKO Instruments Inc. [21].

The Lagrange formulation of the equations of motion for this robot is [21]:

$$A(q)\,\ddot q + b(q, \dot q) + D\,\dot q + f_c(\dot q) = u, \qquad (1.27)$$
$$A = \begin{bmatrix} a_{11}(q_2) & a_{12}(q_2) \\ a_{12}(q_2) & a_{22} \end{bmatrix}, \qquad b = \begin{bmatrix} -\,m_2 l_1 s_2 \sin q_2\,(2\,\dot q_1 \dot q_2 + \dot q_2^2) \\ m_2 l_1 s_2 \sin q_2\,\dot q_1^2 \end{bmatrix},$$

$$D\,\dot q = \begin{bmatrix} d_1 \dot q_1 \\ d_2 \dot q_2 \end{bmatrix}, \qquad f_c(\dot q) = \begin{bmatrix} f_{c1}\,\mathrm{sign}(\dot q_1) \\ f_{c2}\,\mathrm{sign}(\dot q_2) \end{bmatrix}, \qquad u = [u_1, u_2]^{\mathrm T}, \quad q = [q_1, q_2]^{\mathrm T}, \quad \dot q = [\dot q_1, \dot q_2]^{\mathrm T}, \qquad (1.28)$$

$$a_{11}(q_2) = I_1 + m_1 s_1^2 + I_2 + m_2\,(l_1^2 + s_2^2 + 2\,l_1 s_2 \cos q_2), \qquad a_{12}(q_2) = I_2 + m_2\,(s_2^2 + l_1 s_2 \cos q_2), \qquad a_{22} = I_2 + m_2 s_2^2, \qquad (1.29)$$
and $q_i$, $i = 1, 2$, is the corresponding joint angle, $d_i$ is the viscous friction coefficient, $f_{ci}$ is the Coulomb friction coefficient, $l_i$ is the length of link $i$, $i = 1, 2$, given in the specifications of the robot, $s_i$ is the position of the centre of mass, $m_i$ is the total mass of the link, and $I_i$ is the inertia of the link about its centre of mass; $\mathbf{u}$ is the vector of generalized torques, $\mathbf{u} = \mathbf{u}_l + \mathbf{u}_c$, where $\mathbf{u}_l, \mathbf{u}_c$ are the feedforward and feedback terms described below.
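For reference, the dynamic terms of Eqs. (1.27)-(1.29) can be sketched in Python. This is a minimal sketch, not the authors' code: the lumped inertia and friction values follow Table 1.1 (with an assumed ordering of the three inertia parameters), and the Coriolis coefficient `gamma` (standing in for $\beta = m_2 l_1 s_2$) is a placeholder value, since it is not listed separately in the table.

```python
import numpy as np

# Lumped parameters of the two-link SCARA arm (Table 1.1, "Standard
# parameters"; the x1e-2 scaling of the inertia values is applied here).
A11, A12, A22 = 0.65962, 0.15056, 0.14534   # [Nm^2], assumed ordering
D1, D2 = 17.29, 12.49                        # viscous friction [Nms]
FC1, FC2 = 6.46, 3.44                        # Coulomb friction [Nm]

def dynamics_terms(q, dq, gamma=0.05):
    """Return A(q), b(q, dq), D, f_c of Eqs. (1.27)-(1.29).

    gamma = m2*l1*s2 is a placeholder value; it is not given in Table 1.1.
    """
    c2, s2 = np.cos(q[1]), np.sin(q[1])
    # Inertia matrix: constant lumped terms plus the cos(q2) coupling.
    A = np.array([[A11 + 2.0 * gamma * c2, A12 + gamma * c2],
                  [A12 + gamma * c2,       A22]])
    # Coriolis/centrifugal vector.
    b = np.array([-gamma * s2 * (2.0 * dq[0] * dq[1] + dq[1] ** 2),
                   gamma * s2 * dq[0] ** 2])
    D = np.diag([D1, D2])
    f_c = np.array([FC1 * np.sign(dq[0]), FC2 * np.sign(dq[1])])
    return A, b, D, f_c

A, b, D, f_c = dynamics_terms(np.array([0.1, -0.2]), np.array([0.5, 0.3]))
```

At any configuration the returned inertia matrix should be symmetric and positive definite, which is the property the text relies on when inverting $\mathbf{A}(\mathbf{q})$.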
We propose the following standard PD update control law, which is similar to the update law described in equation (1.3) when $\alpha = 0$:

$$\mathbf{u}_{l+1} = \mathbf{u}_l + \hat{\mathbf{A}}(\mathbf{q}_l(t))[L_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + L_p(\mathbf{q}_d - \mathbf{q}_l)], \quad (1.30)$$

where: $\mathbf{q}_d$ is an attainable and desired trajectory, $q^1_d(t) = 0.25\cos(2\pi t) - 0.25$, $q^2_d(t) = 0.5\cos(\pi t) - 0.5$, $t \in [0, T]$, and $\mathbf{q}_l$ is the output trajectory at the $l$th iteration; $\hat{\mathbf{A}}(\mathbf{q}_l(t))$, $l = 0, 1, \ldots, N$, is the learning operator; $\mathbf{u}_0(t) \equiv \mathbf{0}$ is the initial feedforward control input; $L_v$ and $L_p$ are learning control gains. Using equation (1.11), the corresponding error-bounded learning update law is given by:

$$\mathbf{u}^s_{l+1}(t, T^s_l) = \mathbf{u}^s_l(t, T^s_l) + \Delta\mathbf{u}^s_l(t, T^s_l), \quad \mathbf{u}^s_0(t, T) \equiv \mathbf{0}, \quad (1.31)$$

$$\Delta\mathbf{u}^s_l = \begin{cases} \hat{\mathbf{A}}(\mathbf{q}_l(t))[L_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + L_p(\mathbf{q}_d - \mathbf{q}_l)], & t \in [0, T^s_l],\ T^s_l \in (0, T],\ \|\Delta\mathbf{y}^s_l(T^s_l, T^s_l)\| = \varepsilon; \\ \mathbf{0}, & t \in (T^s_l, T]. \end{cases}$$
Then, we consider the following stabilizing P-type feedback controller proposed in [22]:

$$\mathbf{u}_c = \hat{\mathbf{A}}(\mathbf{q}_l)[K_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + K_p(\mathbf{q}_d - \mathbf{q}_l)] + \hat{\mathbf{b}} + \hat{D}\dot{\mathbf{q}}_l + \hat{\mathbf{f}}_c, \quad (1.32)$$

where $K_v$ and $K_p$ are the feedback gains, and $\hat{\mathbf{A}}, \hat{\mathbf{b}}, \hat{D}$, and $\hat{\mathbf{f}}_c$ are the corresponding estimates of $\mathbf{A}, \mathbf{b}, D$, and $\mathbf{f}_c$ in equation (1.27).
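The interplay of the update law (1.30), the truncation rule (1.31), and the error bound $\varepsilon$ can be illustrated on a toy scalar plant. This is a sketch under stated assumptions: the first-order lag plant, the gain values, and all function names are ours, not the robot model or the gains of the text; only the truncate-at-$T^s_l$ logic and the zero initial feedforward follow the algorithm above.

```python
import numpy as np

def bounded_error_ilc(plant, q_d, dq_d, Lv, Lp, eps, iters):
    """Sketch of the bounded-error learning loop of Eqs. (1.30)-(1.31):
    a PD-type feedforward update that is switched off after the first
    time instant T_l^s at which the tracking error reaches the bound eps."""
    u = np.zeros_like(q_d)                  # u_0(t) = 0
    for _ in range(iters):
        q, dq = plant(u)
        hit = np.flatnonzero(np.abs(q_d - q) >= eps)
        k_stop = hit[0] if hit.size else len(q_d) - 1   # index of T_l^s
        du = Lv * (dq_d - dq) + Lp * (q_d - q)
        du[k_stop + 1:] = 0.0               # Eq. (1.31): no update on (T_l^s, T]
        u = u + du
    return u

# Toy first-order plant q' = -q + u (a stand-in, not the model (1.27)).
dt, T = 0.01, 2.0
t = np.arange(0.0, T + dt / 2, dt)

def plant(u):
    q = np.zeros_like(u)
    for k in range(len(u) - 1):             # forward-Euler simulation
        q[k + 1] = q[k] + dt * (-q[k] + u[k])
    return q, np.gradient(q, dt)

q_d = 0.25 * np.cos(2 * np.pi * t) - 0.25   # desired trajectory, q_d(0) = 0
dq_d = -0.5 * np.pi * np.sin(2 * np.pi * t)
u = bounded_error_ilc(plant, q_d, dq_d, Lv=0.8, Lp=0.5, eps=0.3, iters=60)
q_final, _ = plant(u)
```

As in Fig. 1.4 of the text, the early trials are truncated (the initial error exceeds the bound) and the tracked interval $[0, T^s_l]$ grows with the iteration number until the whole trajectory is tracked within the bound.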
In order to simulate system uncertainties (Eqs. (1.27)-(1.32)) we use two sets
of model parameters which are described in Table 1.1.
Table 1.1. The standard and estimated model parameters.

Standard parameters:
  inertia parameters (x10^-2):  65.962 [Nm^2]   15.056 [Nm^2]   14.534 [Nm^2]
  d_1 = 17.29 [Nms]   d_2 = 12.49 [Nms]   f_c1 = 6.46 [Nm]   f_c2 = 3.44 [Nm]

Experimental estimations:
  inertia parameters (x10^-2):  65.553 [Nm^2]   14.718 [Nm^2]   18.478 [Nm^2]
  d_1 = 17.29 [Nms]   d_2 = 12.49 [Nms]   \hat{f}_c1 = 8.13 [Nm]   \hat{f}_c2 = 5.51 [Nm]
28
Let assume (0) (0) 0l d q q , 0.3[ ],rad 2 [sec]T . The feedback
gains are 2vK and 0pK , and the learning gains are 0vL and 3pL .
As a result of the implementation of the bounded-error NILC, for $l = 0$ we obtain the output trajectory at the first iteration, depicted in Fig. 1.3 by a dotted line together with the desired trajectory (solid line).

The maximum of the initial tracking error (Fig. 1.3) is 0.2297 [rad], which is less than $\varepsilon = 0.3$ [rad], and therefore the end time of the tracking process is $T^s_0 = T = 2$ [sec].
Further, the bounded-error learning process is illustrated in detail in Fig. 1.4.
Fig. 1.3. The desired trajectory (solid line) and the initial
trajectory (dotted line).
Fig. 1.4. The bounded-error learning process.
Fig. 1.5. Max. error of the standard (dotted line) and the
bounded-error (solid line) ILC.
Trajectories $q^2_l(t)$, $l = 3, 4, 5, 6$, and 7, are shown, and the corresponding supremum norms of the tracking errors are bounded, i.e., $\|\mathbf{q}_d(t) - \mathbf{q}_l(t)\| < \varepsilon = 0.3$ [rad] for $t \in [0, T^s_l)$, $\|\mathbf{q}_d(T^s_l) - \mathbf{q}_l(T^s_l)\| = \varepsilon$, and $T^s_l \in (0, T]$. This is shown for the whole bounded-error learning process by the profile of the maximum tracking error vs. iteration number, which is depicted by a solid line in Fig. 1.5, where the corresponding profile for the standard NILC is plotted by a dotted line.
Referring to Fig. 1.5, we can see that the maximum transient error of the
standard NILC (Eq. (1.30)) shown by a dotted line is successfully truncated
(bounded) as a result (the solid line) of the new bounded-error learning algorithm
(Eq. (1.31)).
Fig. 1.6. Tracking times of the bounded-error learning
process.
Moreover, the profile of the values of $T^s_l \in (0, T]$, $0 \le l \le 15$, plotted in Fig. 1.6, reveals that, in this particular case, the bounded-error learning process is faster than the standard one ($\sum_{l=2}^{6} T^s_l + 10T < 15T$).
1.4 Discussion and conclusions
The main feature of the presented approach is that it solves, in the time
domain, the problem of the transient growth of the tracking error, (Fig. 1.5), while
the other methods concerning this problem provide solutions in the iteration
domain. The implementation of the proposed bounded-error learning update law,
(Eq. (1.11)), results in a premature termination of the tracking process in the time
domain if the tracking error norm is greater than the given value. Consequently
this learning algorithm does not satisfy the postulate which states that the time
interval of duration of each trial must be fixed. If we weaken this postulate, i.e., the time duration of each trial is finite but not fixed and is less than or equal to the time duration of the desired trajectory, then the bounded-error algorithm satisfies this relaxed version.
To some extent, the proposed algorithm could be compared to the Newton-
type ILC [12] because the Newton-type ILC changes the learning update law
from linear to nonlinear depending on the supremum norm of the tracking error.
But the great difference between these methods is that the bounded-error
algorithm terminates the tracking process in the time domain (time interval [0, ]T
) in order to bound the increase in tracking error, while the Newton-type ILC
changes the update law in the iteration domain and therefore it satisfies the first
postulate about the fixed trial duration.
We came to the following conclusions:
A nonlinear iterative learning control for a class of nonlinear time-varying
systems is proposed. Using a new learning strategy (bounded-error
algorithm) it ensures that the transient growth of tracking error is bounded
by terminating the tracking process in the time domain.
The robustness and convergence of the bounded-error learning algorithm
is proven with respect to the time-weighted norm of the input error.
The applicability and efficiency of the bounded-error learning algorithm
are tested by a numerical simulation of the proposed learning control for
TT-3000 SCARA-type robot of SEIKO Instruments Inc. The simulation
result reveals that the increase in transient tracking error is successfully
bounded and the convergence rate in this particular case is similar to the
convergence rate of the standard learning algorithm. Moreover, the new
learning process is faster than the standard one and this is of considerable
interest for further research.
Chapter 2
Iterative Learning Control with Sampled-Data
Feedback
2.1 Introduction
Sampled-data systems are a class of control systems, where a continuous-time
plant is controlled by a discrete-time controller [23-25]. The stabilization of
sampled-data systems (SD stabilization problem) is motivated by the use of
digital computers in most recent controllers. In particular, a robot arm with
rotational joints controlled by a programmable industrial controller is a typical
nonlinear time-varying sampled-data (SD) system. The difference between the
discrete system and the SD one is shown in Fig. 2.1(a) and Fig. 2.1(b),
correspondingly,
Fig. 2.1. (a) Control of a discrete system [23]; (b) Control of a sampled-data
system [23].
where P is a plant and C is a feedback controller, solid lines represent continuous
signals and dotted lines represent discrete signals, and S and H are a sampler
(Analog to Digital Converter) and a holder (Digital to Analog Converter),
correspondingly, and $t \in [0, T]$, $\{t_k\}_{k=0}^{K}$: $t_k = k t_s \in [0, T]$, $K = \lfloor T / t_s \rfloor$, where $\lfloor \cdot \rfloor$ denotes the integer part of a real number, and $t_s$ is the sampling interval.
Three basic approaches for SD controller design are discussed by Petrew et al. [23]: 1) continuous-time emulation design, 2) discrete-time discretization design, and 3) sampled-data direct design. Unfortunately, the efficiency of these methods for SD system stabilization requires a sufficiently high sampling rate. Therefore, in [26] a feedforward gain was proposed to significantly improve control efficiency. It is well known in the literature [1, 2, 12] that the best approach for an improvement of the feedforward control for uncertain systems is Iterative Learning Control (ILC).
A classical off-line sampled-data ILC scheme for a plant with a sampled-data feedback controller attached is depicted in Fig. 2.2(a) [24], where sampled signals (dotted lines) from a continuous plant P are processed by the digital learning and feedback controllers L and C, and where the inputs $\mathbf{u}_l$ and $\mathbf{u}_c$ (solid lines) are obtained by using a hold device H on the discrete-time signals generated by both controllers. So, the feedback and feedforward channels have equal sampling rates (Fig. 2.2(a)), and therefore SD ILC is a particular case of the multirate ILC [24, 27] shown in Fig. 2.2(b), where the sampling rate for the feedback loop (dotted lines in Fig. 2.2(b)) is different from the sampling rate for the feedforward loop (dashed lines in Fig. 2.2(b)).
Fig. 2.2. (a) Sampled-data ILC with a sampled-data feedback; the sampled signals (dotted lines) have the same sampling rate. (b) Multirate ILC – the sampling rate for the feedback loop (dotted lines) is different from the sampling rate for the feedforward loop (dashed lines).
In this chapter we consider the special case of a continuous ILC for a continuous nonlinear time-varying system (plant P in Fig. 2.3) with a sampled-data feedback controller attached. This case is applicable to ILC for robot manipulators with multi-joint arms with 4-6 degrees of freedom (DOF) [14] or a redundant kinematic structure [28]. This study aims to present a model-based nonlinear ILC with SD feedback that improves SD system stabilization (compensates for the uncertainties of the sampled-data feedback), to prove the stability (uniform asymptotic convergence) of the new ILC law for robot manipulators, and to verify the proposed ILC algorithm by computer simulations of the PUMA 560 robot manipulator.
Chapter 2 is organized as follows. Section 2.2 presents the design of a
nonlinear ILC with a SD feedback for robot manipulators. Section 2.3 introduces
the robustness and convergence analysis of the presented learning algorithm. The
proposed ILC procedure with the SD feedback is validated and examined by
computer simulation in Section 2.4.
2.2 Formulation of the problem
We consider a robotic manipulator with $n$ DOF. The nonlinear multiple-input multiple-output (MIMO) dynamic model of the robot is based on the Lagrange formulation of the equations of motion in the space of generalized coordinates:

$$\mathbf{A}(\mathbf{q}_l)\ddot{\mathbf{q}}_l + \mathbf{b}(\mathbf{q}_l, \dot{\mathbf{q}}_l) + D\dot{\mathbf{q}}_l + \mathbf{g}(\mathbf{q}_l) + \mathbf{f} = \mathbf{u}, \quad (2.1)$$

where: $l = 0, 1, \ldots, N$ is the iteration number; $\mathbf{q}_l$ is the $n \times 1$ vector of generalized coordinates (joint angles), $q^i_l \in [Q^*_i, Q^{**}_i]$, $i = 1, \ldots, n$, where $Q^{**}_i$ is the upper joint limit and $Q^*_i$ is the lower joint limit of $q^i_l$; $\mathbf{A}(\mathbf{q}_l)$ is the $n \times n$ symmetric positive-definite inertia matrix; the $n \times 1$ vector $\mathbf{b}(\mathbf{q}_l, \dot{\mathbf{q}}_l)$ takes into account the Coriolis and centrifugal torques; $D = \mathrm{diag}\{\delta_1, \ldots, \delta_n\}$ denotes the diagonal $n \times n$ matrix of the coefficients of viscous friction; $\mathbf{g}(\mathbf{q}_l)$ is the $n \times 1$ vector representing gravity torques; $\mathbf{f} = [f^1 \mathrm{sign}(\dot{q}^1_l) \ \ldots \ f^n \mathrm{sign}(\dot{q}^n_l)]^{\mathrm{T}}$ is the vector of Coulomb friction torques; and $\mathbf{u} = \mathbf{u}_l + \mathbf{u}_c$ is the $n \times 1$ vector of generalized torques, where $\mathbf{u}_l$ and $\mathbf{u}_c$ are feedforward and feedback terms, respectively. The allowable set of generalized torques is a rectangular hyper-parallelepiped: $u^i \in [-U^i_{\max}, U^i_{\max}]$, where $U^i_{\max}$ and $-U^i_{\max}$ are the upper and lower limits of the control signal $u^i$.
The synthesis of an ILC with SD feedback consists of three steps: first, synthesis of an update control law, then synthesis of a SD-feedback control law, and finally specification of a learning operator.

We assume the following notations: $h \equiv h(\cdot)$ and $H \equiv H(\cdot, \cdot)$, and $\|\cdot\|$ denotes the Euclidean norm.

For the nonlinear MIMO dynamic model in equation (2.1), we propose the following feedforward update law:

$$\mathbf{u}_{l+1} = \mathbf{u}_l + \mathbf{L}(\mathbf{q}_l)[L_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + L_p(\mathbf{q}_d - \mathbf{q}_l)], \quad (2.2)$$

where: $l = 0, 1, \ldots, N$ is the iteration number of the ILC procedure, $\mathbf{q}_d$ is an attainable and desired trajectory and $\mathbf{q}_l$ is the output trajectory at the $l$th iteration; $\mathbf{L} \equiv \mathbf{L}(\mathbf{q}_l(t))$ is a learning operator; $\mathbf{u}_0 \equiv \mathbf{u}_0(t)$ is the initial feedforward control input; $t \in [0, T]$ is the tracking time and $[0, T]$ is the robot tracking time interval; $L_v$ and $L_p$ are learning control gains.
It has to be mentioned that the calculation of $\mathbf{u}_{l+1}$ in Eq. (2.2) is performed offline, and in the case of noisy measurements of $\mathbf{q}_l$ or $\dot{\mathbf{q}}_l$, a commonly used method is to low-pass filter the measured data and then differentiate the resulting signal.
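The filter-then-differentiate step mentioned above can be sketched as follows. The text does not specify a particular filter, so a plain moving average stands in for the low-pass stage; the signal, noise level, and window length are illustrative choices of ours.

```python
import numpy as np

def filtered_derivative(q_meas, dt, window=11):
    """Low-pass filter a sampled joint trajectory (moving average, a
    simple stand-in for the unspecified filter) and then differentiate
    the smoothed signal by central differences."""
    kernel = np.ones(window) / window
    # 'same'-mode convolution keeps the array length; edges are biased.
    q_smooth = np.convolve(q_meas, kernel, mode="same")
    return np.gradient(q_smooth, dt)

dt = 0.001
t = np.arange(0.0, 2.0, dt)
rng = np.random.default_rng(0)
q_true = 0.5 * np.cos(np.pi * t) - 0.5          # noiseless joint angle
q_meas = q_true + 0.001 * rng.standard_normal(t.size)

dq_est = filtered_derivative(q_meas, dt)         # filter, then differentiate
dq_raw = np.gradient(q_meas, dt)                 # differentiating raw noise
dq_true = -0.5 * np.pi * np.sin(np.pi * t)
```

Differentiating the raw measurement amplifies the noise by roughly $1/t_s$, while filtering first keeps the velocity estimate close to the true derivative, which is why this preprocessing matters for the $\dot{\mathbf{q}}_l$ term of the update law.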
Let us assume the following notations: $h^k = h(t_k)$, $H^k = H(t_k, \cdot)$, $\{t_k\}_{k=0}^{K}$: $t_k = k t_s \in [0, T]$, $K = \lfloor T / t_s \rfloor$, and

$$h^{*,t} \equiv h^{*,t}(t) = \begin{cases} h^k, & t \in [t_k, t_{k+1}),\ k = 0, 1, \ldots, K-1, \\ h^K, & t \in [t_K, T], \end{cases}$$

is the corresponding piecewise-constant function on $[0, T]$, and $t_s$ is the sampling interval.
$$\mathbf{u}^{*,t}_c(t) \equiv \mathbf{u}^k_c, \quad \mathbf{u}^k_c := \hat{\mathbf{A}}^k[K_v(\dot{\mathbf{q}}^k_d - \dot{\mathbf{q}}^k_l) + K_p(\mathbf{q}^k_d - \mathbf{q}^k_l)] + \hat{\mathbf{b}}^k + \hat{D}\dot{\mathbf{q}}^k_l + \hat{\mathbf{g}}^k + \hat{\mathbf{f}}^k, \quad (2.3)$$

where: $\hat{\mathbf{A}}^k \equiv \hat{\mathbf{A}}(\mathbf{q}_l(t_k))$ and $\hat{\mathbf{A}}$ is an estimate of the inertia matrix $\mathbf{A}$, obtained by a parameter identification technique, $\hat{\mathbf{b}}^k, \hat{D}, \hat{\mathbf{g}}^k$, and $\hat{\mathbf{f}}^k$ are the corresponding estimates of $\mathbf{b}^k, D, \mathbf{g}^k$, and $\mathbf{f}^k$, and $K_v$ and $K_p$ are the feedback gains.
Usually, the learning operator should be selected to satisfy a sufficient
condition for convergence of the ILC algorithm [1,14]. Therefore, the third step
(learning operator selection) of the ILC synthesis will be completed after the
proof of the learning convergence.
Fig. 2.3. Continuous-time ILC scheme with sampled-data feedback.
Equations (2.1), (2.2) and (2.3) describe the following classical off-line ILC scheme with SD feedback, depicted in Fig. 2.3, where P represents the robot arm; C and L are the feedback and feedforward controllers, respectively; M is the memory of the control system; the input trajectory $\mathbf{u}_l$ is the feedforward term of the control law $\mathbf{u} = \mathbf{u}_l + \mathbf{u}_c$ and $l = 0, 1, \ldots, N$ is the current iteration number; $\mathbf{q}_l$ is the actual output trajectory; $\mathbf{q}_d$ is the desired output trajectory; S and H are the sampler and the holder, respectively. The off-line computed feedforward term $\mathbf{u}_{l+1}$ improves the tracking performance of the robot on the next iteration.
In contrast to the SD ILC and multirate ILC schemes shown in Fig. 2.2(a) and
Fig. 2.2 (b), we assume that the proposed ILC (Fig. 2.3) is continuous because of
the following reasons.
The digital ILC controller (computer) calculates the feedforward control term off-line. Therefore, the discretization frequency (sampling rate) of the output of the learning controller can be as high as needed. For example, if the robot actuators are controlled by a PWM (Pulse Width Modulation) signal, any content of the feedforward control output faster than the duty-cycle command rate of the PWM will not be reflected in the PWM output [29], and the feedforward output can be assumed to be continuous.
The ILC systems with input and output signals that are transmitted
through a communication network (networked ILC systems [25])
require a sampled-data or a multirate ILC, but this is not the typical
case of industrial robot applications. That is why we do not consider networked ILC systems in this chapter.
Recently, several multirate ILC schemes have been proposed to
guarantee good learning transient (acceptable transient error) [27].
As the learning transient (transient growth) is the major problem for
the applicability of a nonlinear ILC [6], we use the bounded error
ILC algorithm [17] in order to solve this problem in a safe, fast and
simple manner and consequently multirate ILC schemes are not
needed to solve the learning transient problem.
It has to be mentioned that the output signal of the feedback controller, depicted by a dotted line in Fig. 2.3, cannot be assumed to be continuous, because it has to be calculated online in real time. Moreover, the calculation time of the feedback, generated by the proposed model-based (dynamics-based) controller in Eq. (2.3), cannot be neglected, especially for a multi-joint robotic arm, and a zero-order holder, which generates a piecewise-constant (staircase) signal, must be used.
2.3 Convergence and robustness of the learning
control with a sampled-data feedback
In this section we present the main result of this work, that is, the proof of the
robustness and convergence of the proposed ILC scheme with a sampled-data
feedback.
We consider a class of multi-input-multi-output nonlinear time-varying systems described by the following state-space equations [3, 4]:

$$\dot{\mathbf{x}}_l(t) = f(\mathbf{x}_l(t), t) + B(\mathbf{x}_l(t), t)\mathbf{u}_l(t) + \boldsymbol{\omega}_l(t), \qquad \mathbf{y}_l(t) = g(\mathbf{x}_l(t), t), \quad (2.4)$$

where, for $l \in \{0, 1, \ldots\}$ and all $t \in [0, T]$, $\mathbf{x}_l(t) \in \mathbb{R}^n$, $\mathbf{y}_l(t) \in \mathbb{R}^m$, $\mathbf{u}_l(t) \in \mathbb{R}^r$ are not necessarily continuous, and $\boldsymbol{\omega}_l(t) \in \mathbb{R}^n$ represents both deterministic and random disturbances. The functions $f: \mathbb{R}^n \times [0, T] \to \mathbb{R}^n$ and $B: \mathbb{R}^n \times [0, T] \to \mathbb{R}^{n \times r}$ are piecewise continuous in $t \in [0, T]$, and $g: \mathbb{R}^n \times [0, T] \to \mathbb{R}^m$ is differentiable in $\mathbf{x}$ and $t$, with partial derivatives $g_x(\cdot, \cdot)$ and $g_t(\cdot, \cdot)$. In addition, the following assumptions hold [3, 4]:

I. For each fixed initial state $\mathbf{x}(0)$ with $\boldsymbol{\omega}(\cdot) \equiv 0$, the output map $O: C([0, T], \mathbb{R}^r) \to C([0, T], \mathbb{R}^m)$ and the state map $S: C([0, T], \mathbb{R}^r) \to C([0, T], \mathbb{R}^n)$ are one-to-one. In this notation $\mathbf{y}_l(\cdot) = O(\mathbf{u}_l(\cdot), \mathbf{x}_l(0))$ and $\mathbf{x}_l(\cdot) = S(\mathbf{u}_l(\cdot), \mathbf{x}_l(0))$.

II. The disturbance $\boldsymbol{\omega}_l(\cdot)$ is bounded on $[0, T]$, i.e. $\|\boldsymbol{\omega}_l(t)\| \le b_\omega$.

III. The functions $f(\cdot, \cdot)$, $B(\cdot, \cdot)$, $g_x(\cdot, \cdot)$, and $g_t(\cdot, \cdot)$ are uniformly globally Lipschitz in $\mathbf{x}$ on the interval $[0, T]$.

IV. The operators $B(\cdot, \cdot)$ and $g_x(\cdot, \cdot)$ are bounded on $[0, T] \times \mathbb{R}^n$.

V. All functions are assumed measurable and integrable.

Let us consider the following learning update law

$$\mathbf{u}_{l+1}(t) = (1 - \alpha)\mathbf{u}_l + \alpha\mathbf{u}_0 + L(\mathbf{y}_l, t)(\dot{\mathbf{y}}_d - \dot{\mathbf{y}}_l), \quad (2.5)$$

where $L: \mathbb{R}^m \times [0, T] \to \mathbb{R}^{r \times m}$ is a bounded learning operator, and $\alpha \in [0, 1)$ allows the influence of a bias term $\mathbf{y}_d(\cdot) - \mathbf{y}_0(\cdot)$ [3]. Equation (2.5) describes a standard D-type NILC for a class of nonlinear time-varying systems (Eq. (2.4)), with a nonlinear time-varying learning operator.

Lemma 2.1 [3].
If $\{a_l\}$, $l \in \{0, 1, \ldots\}$, is a sequence of real numbers such that $a_{l+1} \le \rho a_l + \varepsilon$, $0 \le \rho < 1$, then $\limsup_{l \to \infty} a_l \le (1 - \rho)^{-1}\varepsilon$.

The proof of Lemma 2.1 is presented in [25].
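A quick numerical illustration of Lemma 2.1 (the constants here are arbitrary choices of ours): iterating the worst case, where the inequality holds with equality, drives the sequence to the bound $\varepsilon/(1-\rho)$.

```python
# Lemma 2.1: for a_{l+1} <= rho * a_l + eps with 0 <= rho < 1, the
# sequence satisfies limsup a_l <= eps / (1 - rho).  We iterate the
# equality (worst) case from an arbitrary large starting value.
rho, eps = 0.6, 0.2
a = 10.0
for _ in range(200):
    a = rho * a + eps          # contraction plus constant perturbation
limit = eps / (1.0 - rho)      # the bound of the lemma (= 0.5 here)
```

This is the mechanism behind the convergence proofs below: each iteration contracts the input error by $\bar{\rho} < 1$, so the residual settles at a level proportional to the perturbation $\varepsilon$.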
We define the time-weighted norm ($\lambda$-norm) for a function $h(t)$, $t \in [0, T]$, by

$$\|h(\cdot)\|_\lambda = \sup_{t \in [0, T]} e^{-\lambda t}\|h(t)\|.$$

Lemma 2.2
If $h^k$ is a discrete form of $h(t)$, $t \in [0, T]$ ($h^k = h(t_k)$, $\{t_k\}_{k=0}^{K}$: $t_k = k t_s \in [0, T]$, $K = \lfloor T/t_s \rfloor$), and $h^{*,t}$ is the corresponding piecewise-constant function on $[0, T]$, then $\|h^{*,t}(\cdot)\|_\lambda \le \|h(\cdot)\|_\lambda$.

The proof of Lemma 2.2 is presented in Appendix 1.

Using equations (2.2) and (2.3), and taking into account that $\mathbf{A}(\mathbf{q})$ is an $n \times n$ symmetric positive-definite matrix, equation (2.1) can be written as:

$$\begin{bmatrix} \dot{\mathbf{q}}_l \\ \ddot{\mathbf{q}}_l \end{bmatrix} = \begin{bmatrix} \dot{\mathbf{q}}_l \\ -\mathbf{A}^{-1}(\mathbf{b} + D\dot{\mathbf{q}}_l + \mathbf{g}) \end{bmatrix} + \begin{bmatrix} \mathbf{0} \\ \mathbf{A}^{-1} \end{bmatrix}\mathbf{u}^{*,t}_c + \begin{bmatrix} \mathbf{0} \\ \mathbf{A}^{-1} \end{bmatrix}(\mathbf{u}_l - \mathbf{f}). \quad (2.6)$$

Let us define

$$\mathbf{x}_l = \begin{bmatrix} \mathbf{q}_l \\ \dot{\mathbf{q}}_l \end{bmatrix}, \quad f(\mathbf{x}_l, t) = f_1(\mathbf{q}_l, \dot{\mathbf{q}}_l) + B f^{*,t}_2(\mathbf{q}^k_l, \dot{\mathbf{q}}^k_l), \quad f_1 = \begin{bmatrix} \dot{\mathbf{q}}_l \\ -\mathbf{A}^{-1}(\mathbf{b} + D\dot{\mathbf{q}}_l + \mathbf{g}) \end{bmatrix},$$

and

$$f^{*,t}_2 \equiv f^k_2 := \hat{\mathbf{A}}^k[K_v(\dot{\mathbf{q}}^k_d - \dot{\mathbf{q}}^k_l) + K_p(\mathbf{q}^k_d - \mathbf{q}^k_l)] + \hat{\mathbf{b}}^k + \hat{D}\dot{\mathbf{q}}^k_l + \hat{\mathbf{g}}^k, \quad (2.7)$$

$$B = \begin{bmatrix} \mathbf{0} \\ \mathbf{A}^{-1} \end{bmatrix}, \quad \text{and} \quad \mathbf{y}_l = \mathbf{x}_l.$$

Thus, treating the Coulomb friction as a disturbance, $\boldsymbol{\omega}_l = \begin{bmatrix} \mathbf{0} \\ \mathbf{A}^{-1}(\hat{\mathbf{f}} - \mathbf{f}) \end{bmatrix}$, from equation (2.6) we obtain equation (2.4) for $r = m = n$.
Let us consider the continuous feedback control law proposed in [22], which corresponds to the discrete feedback law defined by Eq. (2.3):

$$\mathbf{u}_c(\mathbf{q}_l, \dot{\mathbf{q}}_l, t) = \hat{\mathbf{A}}[K_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + K_p(\mathbf{q}_d - \mathbf{q}_l)] + \hat{\mathbf{b}} + \hat{D}\dot{\mathbf{q}}_l + \hat{\mathbf{g}} + \hat{\mathbf{f}}. \quad (2.8)$$

Using Eq. (2.7) and Eq. (2.8), we obtain the continuous form of $f^{*,t}_2$:

$$f_2 = \hat{\mathbf{A}}[K_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + K_p(\mathbf{q}_d - \mathbf{q}_l)] + \hat{\mathbf{b}} + \hat{D}\dot{\mathbf{q}}_l + \hat{\mathbf{g}}. \quad (2.9)$$

Let us define $\Delta\mathbf{u}_l(t) = \mathbf{u}_d(t) - \mathbf{u}_l(t)$ and $\Delta\mathbf{y}_l(t) = \mathbf{y}_d(t) - \mathbf{y}_l(t)$. Because in all practical cases the input signal belongs to a compact subset of $\mathbb{R}^r$, further we will consider that $\forall \mathbf{u}_l(t)$, $t \in [0, T]$: $\|\mathbf{u}_l(\cdot)\|_\infty \le b_u$, where $\|\mathbf{u}_l(\cdot)\|_\infty = \sup_{t \in [0, T]}\|\mathbf{u}_l(t)\|$ is the supremum norm.

For simplification, we introduce the following notations:

$\mathbf{x}_l \equiv \mathbf{x}_l(t)$, $\mathbf{x}_d \equiv \mathbf{x}_d(t)$, $\mathbf{y}_l \equiv \mathbf{y}_l(t)$, $\mathbf{y}_d \equiv \mathbf{y}_d(t)$, $\mathbf{u}_l \equiv \mathbf{u}_l(t)$, $\mathbf{u}_d \equiv \mathbf{u}_d(t)$, $\boldsymbol{\omega}_l \equiv \boldsymbol{\omega}_l(t)$, $f_l \equiv f(\mathbf{x}_l(t), t)$, $f_d \equiv f(\mathbf{x}_d(t), t)$, $B_l \equiv B(\mathbf{x}_l(t), t)$, $B_d \equiv B(\mathbf{x}_d(t), t)$, $g_{xl} \equiv g_x(\mathbf{x}_l(t), t)$, $g_{xd} \equiv g_x(\mathbf{x}_d(t), t)$, $g_{tl} \equiv g_t(\mathbf{x}_l(t), t)$, $g_{td} \equiv g_t(\mathbf{x}_d(t), t)$,

and $k_{gx}$, $k_{gt}$, $k_f$, $k_{f1}$, $k_{f2}$, and $k_B$ are Lipschitz constants for $g_x(\cdot, \cdot)$, $g_t(\cdot, \cdot)$, $f(\cdot, \cdot)$, $f_1(\cdot, \cdot)$, $f_2(\cdot, \cdot)$, and $B(\cdot, \cdot)$, respectively, and $b_{gx}$, $b_{ud}$, $b_{f2d}$, $b_L$, and $b_B$ are the corresponding norm bounds on $g_x(\cdot, \cdot)$, $\mathbf{u}_d(\cdot)$, $f^{*,t}_2(\cdot, \cdot)$, $L(\cdot, \cdot)$, and $B(\cdot, \cdot)$, and $b_d$ is the bound on $f_d + B_d\mathbf{u}_d = \dot{\mathbf{x}}_d$.
Now we proceed to prove a sufficient condition for convergence and
robustness of the ILC with the SD feedback for robot manipulators described in
Eqs. (2.6), (2.5) and (2.3).
Theorem 2.2
Let the system of robot dynamics with a SD feedback controller attached be described by the state-space equations in (2.6), where the feedback control law is given by Eq. (2.7) with the piecewise-constant function $f^{*,t}_2$, and the function $f_2$ in Eq. (2.9) is the continuous form of $f^{*,t}_2$. Suppose the system in (2.6) with $f(\mathbf{x}_l, t) = f_1(\mathbf{q}_l, \dot{\mathbf{q}}_l) + B f_2(\mathbf{q}_l, \dot{\mathbf{q}}_l)$, i.e. with the feedback control given by $f_2$ in Eq. (2.9), satisfies Assumptions I-V and uses the update law in Eq. (2.5), and the sufficient condition (2.10) for robustness and convergence of the ILC is satisfied:

$$\|(1 - \alpha)\mathbf{I} - L(g(\mathbf{x}, t), t)\,g_x(\mathbf{x}, t)\,B(\mathbf{x}, t)\| \le \rho < 1, \quad \forall (\mathbf{x}, t) \in \mathbb{R}^n \times [0, T]. \quad (2.10)$$

Given an attainable desired trajectory $\mathbf{y}_d(t) = g(\mathbf{x}_d(t), t)$, $t \in [0, T]$, where $\mathbf{u}_d(t)$ is the corresponding input and $\mathbf{x}_d(t)$ is the corresponding state (according to I), and the initial state error is bounded by $\|\mathbf{x}_d(0) - \mathbf{x}_l(0)\| \le b_{x0}$, then, as $l \to \infty$, the error between $\mathbf{u}_l(t)$ and $\mathbf{u}_d(t)$ is bounded. In addition, the state and output asymptotic errors are bounded. These bounds depend continuously on the bound $b_{x0}$ on the initial state error, the bound $b_\omega$ on the state disturbance, and $\alpha$; as $b_{x0}$, $b_\omega$, and $\alpha$ tend to zero, these bounds also tend to zero.

Proof
From Eq. (2.5) it follows that

$$\Delta\mathbf{u}_{l+1} = (1 - \alpha)\Delta\mathbf{u}_l + \alpha\Delta\mathbf{u}_0 - L\,\Delta\dot{\mathbf{y}}_l. \quad (2.11)$$

Using the above notations, for $\Delta\dot{\mathbf{y}}_l$ from Eq. (2.4) we obtain:

$$\Delta\dot{\mathbf{y}}_l = g_{xd}(f_d + B_d\mathbf{u}_d) + g_{td} - g_{xl}(f_l + B_l\mathbf{u}_l + \boldsymbol{\omega}_l) - g_{tl}. \quad (2.12)$$

Combining Eq. (2.12) with the terms $g_{xl}(f_d + B_d\mathbf{u}_d)$ and $g_{xl}B_l f^{*,t}_{2d}$ and $g_{xl}B_l\mathbf{u}_d$ added and subtracted, and $f_l = f_{1l} + B_l f^{*,t}_{2l}$, and $f_d = f_{1d} + B_d f^{*,t}_{2d}$, yields

$$\Delta\dot{\mathbf{y}}_l = (g_{xd} - g_{xl})(f_d + B_d\mathbf{u}_d) + g_{xl}(f_{1d} - f_{1l}) + g_{xl}B_l(f^{*,t}_{2d} - f^{*,t}_{2l}) + g_{xl}(B_d - B_l)f^{*,t}_{2d} + g_{xl}B_l\Delta\mathbf{u}_l + g_{xl}(B_d - B_l)\mathbf{u}_d + g_{td} - g_{tl} - g_{xl}\boldsymbol{\omega}_l. \quad (2.13)$$
Combining Eq. (2.11) with Eq. (2.13) yields

$$\Delta\mathbf{u}_{l+1} = [(1 - \alpha)\mathbf{I} - L g_{xl}B_l]\Delta\mathbf{u}_l + \alpha\Delta\mathbf{u}_0 - L\big[(g_{xd} - g_{xl})(f_d + B_d\mathbf{u}_d) + g_{xl}(f_{1d} - f_{1l}) + g_{xl}B_l(f^{*,t}_{2d} - f^{*,t}_{2l}) + g_{xl}(B_d - B_l)f^{*,t}_{2d} + g_{xl}(B_d - B_l)\mathbf{u}_d + g_{td} - g_{tl} - g_{xl}\boldsymbol{\omega}_l\big]. \quad (2.14)$$

Taking norms, using the bounds, and using the Lipschitz conditions (see Assumptions II-IV), from Eq. (2.14) we obtain:
$$\|\Delta\mathbf{u}_{l+1}\| \le \|(1 - \alpha)\mathbf{I} - L g_{xl}B_l\|\,\|\Delta\mathbf{u}_l\| + \alpha\|\Delta\mathbf{u}_0\| + b_L\big(k_{gx}b_d + k_{gt} + b_{gx}(k_f + k_B(b_{ud} + b_{f2d}))\big)\|\Delta\mathbf{x}_l\| + b_L b_{gx}k_{f2}b_B\|\Delta\mathbf{x}^{*,t}_l\| + b_L b_{gx}\|\boldsymbol{\omega}_l\|. \quad (2.15)$$

Multiplying (2.15) by $e^{-\lambda t}$, defining

$$k_1 = b_L\big(k_{gx}b_d + k_{gt} + b_{gx}(k_f + k_B(b_{ud} + b_{f2d}))\big), \quad k_2 = b_L b_{gx}k_{f2}b_B, \quad k_3 = b_L b_{gx},$$

and assuming the sufficient condition for convergence (2.10), the inequality (2.15) simplifies to

$$e^{-\lambda t}\|\Delta\mathbf{u}_{l+1}\| \le \rho e^{-\lambda t}\|\Delta\mathbf{u}_l\| + \alpha e^{-\lambda t}\|\Delta\mathbf{u}_0\| + k_1 e^{-\lambda t}\|\Delta\mathbf{x}_l\| + k_2 e^{-\lambda t}\|\Delta\mathbf{x}^{*,t}_l\| + k_3 b_\omega. \quad (2.16)$$

Taking the supremum of both sides of (2.16) and applying Lemma 2.2 to $\Delta\mathbf{x}^{*,t}_l$ yields

$$\sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{u}_{l+1}\| \le \rho\sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{u}_l\| + \alpha\sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{u}_0\| + (k_1 + k_2)\sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{x}_l\| + k_3 b_\omega. \quad (2.17)$$

Using the time-weighted norm definition and defining $k = k_1 + k_2$, from (2.17) we obtain

$$\|\Delta\mathbf{u}_{l+1}\|_\lambda \le \rho\|\Delta\mathbf{u}_l\|_\lambda + \alpha\|\Delta\mathbf{u}_0\|_\lambda + k\|\Delta\mathbf{x}_l\|_\lambda + k_3 b_\omega. \quad (2.18)$$
On the other hand, using the integral expression

$$\mathbf{x}_l(t) = \mathbf{x}_l(0) + \int_0^t \big[f(\mathbf{x}_l(\tau), \tau) + B(\mathbf{x}_l(\tau), \tau)\mathbf{u}_l(\tau) + \boldsymbol{\omega}_l(\tau)\big]\,d\tau,$$

where $f(\mathbf{x}(t), t) = f_1(\mathbf{x}(t), t) + B(\mathbf{x}(t), t)f^{*,t}_2(\mathbf{x}(t), t)$ and $f^{*,t}_2(\mathbf{x}(t), t)$, $t \in [0, T]$, is a piecewise-constant function, and taking norms, for $\Delta\mathbf{x}_l$ we have

$$\|\Delta\mathbf{x}_l\| \le \|\Delta\mathbf{x}_l(0)\| + \int_0^t\|B_d\mathbf{u}_d - B_l\mathbf{u}_l\|\,d\tau + \int_0^t\|f_{1d} - f_{1l}\|\,d\tau + \int_0^t\|B_d f^{*,t}_{2d} - B_l f^{*,t}_{2l}\|\,d\tau + \int_0^t\|\boldsymbol{\omega}_l\|\,d\tau. \quad (2.19)$$

Adding and subtracting $B_l f^{*,t}_{2d}$ and $B_l\mathbf{u}_d$ in Eq. (2.19), and using the bounds and the Lipschitz conditions, yields

$$\|\Delta\mathbf{x}_l(t)\| \le b_{x0} + b_B\int_0^t\|\Delta\mathbf{u}_l\|\,d\tau + k_4\int_0^t\|\Delta\mathbf{x}_l\|\,d\tau + k_5\int_0^t\|\Delta\mathbf{x}^{*,t}_l\|\,d\tau + b_\omega t, \quad (2.20)$$

where $k_4 = k_B(b_{ud} + b_{f2d}) + k_{f1}$ and $k_5 = b_B k_{f2}$.
Multiplying (2.20) by $e^{-\lambda t}$, we have that

$$e^{-\lambda t}\|\Delta\mathbf{x}_l\| \le b_{x0}e^{-\lambda t} + b_B e^{-\lambda t}\int_0^t e^{\lambda\tau}e^{-\lambda\tau}\|\Delta\mathbf{u}_l\|\,d\tau + k_4 e^{-\lambda t}\int_0^t e^{\lambda\tau}e^{-\lambda\tau}\|\Delta\mathbf{x}_l\|\,d\tau + k_5 e^{-\lambda t}\int_0^t e^{\lambda\tau}e^{-\lambda\tau}\|\Delta\mathbf{x}^{*,t}_l\|\,d\tau + b_\omega t e^{-\lambda t}. \quad (2.21)$$

Noticing that $e^{\lambda\tau}$ is always of the same sign and $\sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{x}^{*,t}_l(t)\| \le \sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{x}_l(t)\|$ (see Lemma 2.2), and applying the Second Mean Value Theorem for Integrals [30] to the integral expressions in inequality (2.21), yields:
$$e^{-\lambda t}\|\Delta\mathbf{x}_l\| \le b_{x0}e^{-\lambda t} + \Big(b_B\sup_{t \in [0, T]}\big(e^{-\lambda t}\|\Delta\mathbf{u}_l\|\big) + (k_4 + k_5)\sup_{t \in [0, T]}\big(e^{-\lambda t}\|\Delta\mathbf{x}_l\|\big)\Big)e^{-\lambda t}\int_0^t e^{\lambda\tau}\,d\tau + b_\omega t e^{-\lambda t}. \quad (2.22)$$

Taking into account that $e^{-\lambda t}\int_0^t e^{\lambda\tau}\,d\tau = \lambda^{-1}(1 - e^{-\lambda t}) \le \lambda^{-1}(1 - e^{-\lambda T})$ for $t \in [0, T]$, and taking the supremum of both sides of (2.22), we have that

$$\|\Delta\mathbf{x}_l\|_\lambda \le b_{x0} + b_B\lambda^{-1}(1 - e^{-\lambda T})\|\Delta\mathbf{u}_l\|_\lambda + k_6\lambda^{-1}(1 - e^{-\lambda T})\|\Delta\mathbf{x}_l\|_\lambda + b_\omega T, \quad (2.23)$$

where $k_6 = k_4 + k_5$. Thus, from inequality (2.23) we obtain

$$\|\Delta\mathbf{x}_l\|_\lambda \le \big(1 - k_6\lambda^{-1}(1 - e^{-\lambda T})\big)^{-1}\Big(b_{x0} + b_B\lambda^{-1}(1 - e^{-\lambda T})\|\Delta\mathbf{u}_l\|_\lambda + b_\omega T\Big). \quad (2.24)$$
Combining inequalities (2.18) and (2.24) yields

$$\|\Delta\mathbf{u}_{l+1}\|_\lambda \le \left(\rho + \frac{k\,b_B\lambda^{-1}(1 - e^{-\lambda T})}{1 - k_6\lambda^{-1}(1 - e^{-\lambda T})}\right)\|\Delta\mathbf{u}_l\|_\lambda + \alpha\|\Delta\mathbf{u}_0\|_\lambda + \frac{k(b_{x0} + b_\omega T)}{1 - k_6\lambda^{-1}(1 - e^{-\lambda T})} + k_3 b_\omega. \quad (2.25)$$

Defining

$$\bar{\rho} = \rho + k\,b_B\lambda^{-1}(1 - e^{-\lambda T})\big(1 - k_6\lambda^{-1}(1 - e^{-\lambda T})\big)^{-1}, \quad k_7 = k\big(1 - k_6\lambda^{-1}(1 - e^{-\lambda T})\big)^{-1},$$
$$k_8 = k_3 + kT\big(1 - k_6\lambda^{-1}(1 - e^{-\lambda T})\big)^{-1}, \quad \varepsilon = k_7\|\Delta\mathbf{x}_l(0)\| + k_8 b_\omega + \alpha\|\Delta\mathbf{u}_0\|_\lambda,$$

we have that

$$\|\Delta\mathbf{u}_{l+1}\|_\lambda \le \bar{\rho}\|\Delta\mathbf{u}_l\|_\lambda + \varepsilon. \quad (2.26)$$

Since $\rho < 1$, we can find $\lambda$ sufficiently large so that $\bar{\rho} = \rho + k\,b_B\lambda^{-1}(1 - e^{-\lambda T})\big(1 - k_6\lambda^{-1}(1 - e^{-\lambda T})\big)^{-1} < 1$. Thus we can apply Lemma 2.1, so that

$$\limsup_{l \to \infty}\|\Delta\mathbf{u}_l\|_\lambda \le (1 - \bar{\rho})^{-1}\varepsilon, \quad (2.27)$$

implying that $\mathbf{u}_l$ converges to $\mathbf{u}_d$ when $l \to \infty$ and $\Delta\mathbf{x}_l(0) = \mathbf{0}$, $b_\omega = 0$, and $\alpha = 0$, implying that $\varepsilon = 0$.
Corollary 2.1
If the update law (2.5) is replaced with

$$\mathbf{u}_{l+1}(t) = (1 - \alpha)\mathbf{u}_l + \alpha\mathbf{u}_0 + L(\mathbf{y}_l, t)(\dot{\mathbf{y}}_d - \dot{\mathbf{y}}_l) + K(\mathbf{y}_l, t)(\mathbf{y}_d - \mathbf{y}_l), \quad (2.28)$$

with $K(\cdot, \cdot)$ bounded, then Theorem 2.2 still holds. Equation (2.28) describes a PD-type ILC.
It has to be mentioned that all corollaries of Theorem 1.1 in Chapter 1, proven in [3] for continuous-time systems, are also valid for Theorem 2.2 (for nonlinear time-varying systems with sampled-data feedback), implying that Corollary 2.1 holds and, consequently, the proposed PD-type ILC update law (2.2) is robust and convergent if the sufficient condition for convergence (2.10) is satisfied.

Moreover, for robust convergence the update laws (2.2), (2.5) and (2.28) must contain derivatives of the output [3].
Thus, applying Theorem 2.2 to prove the convergence of the considered ILC algorithm with SD feedback for the nonlinear system in Eq. (2.6), one can obtain from Eq. (2.10) the simplest ($\alpha = 0$, $\mathbf{y}_l = \mathbf{x}_l$) sufficient condition for convergence of the proposed ILC design for robotic manipulators [3]:

$$\|\mathbf{I} - \mathbf{L}\mathbf{A}^{-1}\| \le \rho < 1, \quad (2.29)$$
where I is the identity matrix of size n.
Consequently, the learning operator remains to be specified so that the sufficient condition for robustness and convergence, Eq. (2.29), holds.
2.4 Simulation results
In this section, we present the simulation results from the implementation of the ILC with SD feedback described in the previous sections. We consider the dynamic model of the PUMA 560 robot (6 DOF) reported in [31] and given by Eq. (2.1).
Table 2.1. Model parameters of the virtual robot arm.

ξ       Links:                 1       2       3        4        5       6
Mass values [kg] (without wrist)
ξ_i1    m_i, i = 1,...,6       -       17.4    4.8      0.82     0.35    0.09
Link centre of gravity (COG) [mm]
ξ_i2    S_xi                   -       68      0        0        0       0
ξ_i3    S_yi                   -       6       -70      0        0       0
ξ_i4    S_zi                   -       -16     14       -19      0       32
Moments of inertia about COG [kgm^2]
ξ_i5    I_xxi                  -       0.130   66.0e-3  1.8e-3   300e-6  150e-6
ξ_i6    I_yyi                  -       0.524   12.5e-3  1.8e-3   300e-6  150e-6
ξ_i7    I_zzi                  0.35    0.539   86.0e-3  1.3e-3   400e-6  40e-6
Effective motor inertia [kgm^2]
ξ_i8    J_mi                   0.784   2.305   0.576    0.1057   0.0946  0.1074
Kinematic constants
ξ_i9    α_{i-1} [deg.]         0       -90     0        90       -90     90
ξ_i10   a_{i-1} [m]            0       0       0.4318   -0.0203  0       0
ξ_i11   d_i [m]                0       0.2435  -0.0934  0.4331   0       0
Viscous friction coefficients [Nmsec/rad]
ξ_i12   δ_i, dq_i > 0          3.45    8.53    3.02     -        -       -
        δ_i, dq_i < 0          4.94    7.67    3.27     -        -       -
Coulomb friction coefficients [Nm]
ξ_i13   f_i, dq_i > 0          8.26    11.34   5.57     -        -       -
        f_i, dq_i < 0          8.43    12.77   5.93     -        -       -
For a realistic computer simulation of parameter uncertainty, we are going to use two sets of model parameters. We assume the explicit dynamic model of the PUMA 560 robot arm (left-hand side of Eq. (2.1)) with parameters reported by Armstrong et al. [31], Corke, and Armstrong-Helouvry [32]. This first set of model parameters is described in Table 2.1, and we consider them as virtual parameters $\xi_i = (\xi_{i1}, \ldots, \xi_{i13})$, $i = 1, \ldots, 6$, where $i$ corresponds to the link number.
Table 2.2. Model parameter estimates of the virtual robot arm.

ξ̂       Links:                 1       2       3        4        5       6
Mass values [kg] (without wrist)
ξ̂_i1    m̂_i, i = 1,...,6       -       22.4    5        1.2      0.62    0.16
Link centre of gravity (COG) [mm]
ξ̂_i2    Ŝ_xi                   -       103     20       0        0       0
ξ̂_i3    Ŝ_yi                   -       5       -4       -3       -1      0
ξ̂_i4    Ŝ_zi                   -       -40     14       -86      -10     3
Moments of inertia about COG [kgm^2]
ξ̂_i5    Î_xxi                  -       0.403   74.8e-3  5.32e-3  487e-6  123e-6
ξ̂_i6    Î_yyi                  -       0.969   7.3e-3   5.20e-3  482e-6  123e-6
ξ̂_i7    Î_zzi                  0.177   0.965   75.6e-3  3.37e-3  572e-6  58e-6
Effective motor inertia [kgm^2]
ξ̂_i8    Ĵ_mi                   0.776   2.34    0.5823   0.1057   0.0946  0.1074
Kinematic constants
ξ̂_i9    α_{i-1} [deg.]         0       -90     0        90       -90     90
ξ̂_i10   a_{i-1} [m]            0       0       0.4318   -0.0203  0       0
ξ̂_i11   d_i [m]                0       0.2435  -0.0934  0.4331   0       0
Viscous friction coefficients [Nmsec/rad]
ξ̂_i12   δ̂_i, dq_i > 0          3.85    8.89    5.31     -        -       -
        δ̂_i, dq_i < 0          3.20    11.7    2.91     -        -       -
Coulomb friction coefficients [Nm]
ξ̂_i13   f̂_i, dq_i > 0          6.74    13.0    5.87     -        -       -
        f̂_i, dq_i < 0          7.24    14.7    4.19     -        -       -
Torque limits [Nm]
ξ̂_i14   Û_imax                 100     180     90       25       25      25
For the learning-control-input synthesis, we assume the PUMA 560 model parameters estimated by Tarn et al. [33]. This second set of parameters, denoted $\hat{\xi}_i = (\hat{\xi}_{i1}, \ldots, \hat{\xi}_{i14})$, $i = 1, \ldots, 6$, is shown in Table 2.2, and we consider it as the set of identification estimates of the virtual parameters.

From Eq. (2.29), following Arimoto's ideas in [1] for a better convergence rate, we consider a learning operator as close as possible to the inertia matrix $\mathbf{A}$. Therefore, we propose the learning operator to be identically equal to $\hat{\mathbf{A}}$, i.e. $\mathbf{L} \equiv \hat{\mathbf{A}}$ [14]. Thus, the ILC synthesis for robotic manipulators is completed by the specification of the learning operator.
Let the desired trajectory be given in generalized (joint) coordinates by:

$$q^1_d(t) = 0; \quad q^2_d(t) = 2\cos(4t) - 2; \quad q^3_d(t) = \cos(4t) - 1;$$
$$q^4_d(t) = 3\cos(4t) - 3; \quad q^5_d(t) = 1.5\cos(4t) - 1.5; \quad q^6_d(t) = 0, \quad (2.30)$$

and $t \in [0, 0.25\pi]$.
Thus, for $q^i_l \in [-2\pi, 2\pi]$, $i = 1, 2, \ldots, 6$, and $\mathbf{L} \equiv \hat{\mathbf{A}}$, combining Eq. (2.1) with Eq. (2.2) and Eq. (2.3) yields the control law of the proposed NILC with SD feedback:

$$\mathbf{A}\ddot{\mathbf{q}}_{l+1} + \mathbf{b} + D\dot{\mathbf{q}}_{l+1} + \mathbf{g} + \mathbf{f} = \mathbf{u}_l + \hat{\mathbf{A}}(\mathbf{q}_l)[L_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + L_p(\mathbf{q}_d - \mathbf{q}_l)] + \hat{\mathbf{A}}^k[K_v(\dot{\mathbf{q}}^k_d - \dot{\mathbf{q}}^k_{l+1}) + K_p(\mathbf{q}^k_d - \mathbf{q}^k_{l+1})] + \hat{\mathbf{b}}^k + \hat{D}\dot{\mathbf{q}}^k_{l+1} + \hat{\mathbf{g}}^k + \hat{\mathbf{f}}^k, \quad (2.31)$$
where $l \in \{0, 1, \ldots, N\}$ and $\hat{\mathbf{A}}, \hat{\mathbf{b}}, \hat{D}, \hat{\mathbf{g}}$, and $\hat{\mathbf{f}}$, calculated by using the parameters in Table 2.2, are the corresponding estimates of $\mathbf{A}, \mathbf{b}, D, \mathbf{g}$, and $\mathbf{f}$ in Eq. (2.1), calculated by using the parameters in Table 2.1.
We are able to validate inequality (2.29) for $q^i_l \in [-2\pi, 2\pi]$, $i = 1, \ldots, 6$. A computation of the maximum of the left-hand side of (2.29) results in:

$$\max_{q^i_l}\|\mathbf{I} - \hat{\mathbf{A}}(\mathbf{q}_l, \hat{\boldsymbol{\xi}})\mathbf{A}^{-1}(\mathbf{q}_l, \boldsymbol{\xi})\| = 0.6013,$$

with a tolerance of 0.0018, and consequently the sufficient condition for convergence (2.29) holds with respect to $q^i_l \in [-2\pi, 2\pi]$, $i = 1, \ldots, 6$, and the NILC procedure proposed in Eq. (2.31) is convergent.
The maximal error of the iterative learning procedure is given by:

$$e^{\max} = \max_l(e^{\max}_l), \quad l = 0, 1, \ldots, N, \quad (2.32)$$

where:

$$e^{\max}_l = \max_t\|\mathbf{q}_l(t) - \mathbf{q}_d(t)\|, \quad t \in [0, T]. \quad (2.33)$$
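Eqs. (2.32)-(2.33) translate directly into code; the sketch below operates on hypothetical trajectory arrays (the data and names are ours, not the simulation output of the text).

```python
import numpy as np

def max_errors(q_iters, q_d):
    """Implement Eqs. (2.32)-(2.33).

    q_iters: list of (n_time, n_joints) output trajectories, one per
    iteration; q_d: the desired trajectory on the same time grid.
    Returns (e_l_max per iteration, overall e_max)."""
    e_l = [np.max(np.linalg.norm(q - q_d, axis=1)) for q in q_iters]
    return e_l, max(e_l)

# Hypothetical two-iteration example on a 3-sample, 2-joint grid.
q_d = np.zeros((3, 2))
e_l, e_max = max_errors([np.full((3, 2), 0.2), np.full((3, 2), 0.1)], q_d)
```

Plotting `e_l` against the iteration number gives exactly the convergence profiles shown in Fig. 2.5.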
In fact, there are great differences between the values of $\boldsymbol{\xi}$ and $\hat{\boldsymbol{\xi}}$ reported in Table 2.1 and Table 2.2, respectively. Consequently, a significant maximal initial tracking error $e^{\max}_0$ is expected. Therefore, solving the nonlinear differential equation (2.1) for $l = 0$ and $K_v = 6$ and $K_p = 3$, with $\mathbf{u}_0(t) \equiv \mathbf{0}$ and $\mathbf{u}_c = \mathbf{u}^{*,t}_c(\mathbf{q}_0, \hat{\boldsymbol{\xi}}, t)$ in Eq. (2.3), for a sampling interval $t_s = 0.1$ [sec], and with initial conditions $\mathbf{q}_0(0) = \mathbf{0}$ and $\dot{\mathbf{q}}_0(0) = \mathbf{0}$, results in trajectories $\mathbf{q}_0(t)$ ($q^i_0(t)$), which are depicted in Fig. 2.4 together with the desired trajectories ($q^i_d(t)$).
Fig. 2.4. The initial and desired trajectories: (a) $q^i_0$, $q^i_d$, $i = 1, 2, 3$; (b) $q^i_0$, $q^i_d$, $i = 4, 5, 6$.
Simulation results for the first iteration, shown in Fig. 2.4(a), reveal significant initial tracking errors of the second and third joints. It has to be mentioned that the on-line computation time of the feedback control (the computed torque calculated by MATLAB software) is between 0.0620 sec and 0.0940 sec; consequently, the sampling interval of 0.1 sec is adequate for the simulation. The time of the off-line computation (update) of the feedforward control is approximately 30 sec, depending on the solution of the dynamic equations; in a real implementation this calculation time could be reduced by the use of an optimal spline approximation and a powerful computer system. The initial tracking error e_0^max given by Eq. (2.33) for l = 0 arises not only from the parameter uncertainties but also from the large sampling interval Δt_s = 0.1 [sec]. For example, e_0^max = 3.1944 [rad] for Δt_s = 0.1 [sec] and e_0^max = 2.8745 [rad] for Δt_s = 0.0 [sec]. The calculation of the maximal tracking errors e_l^max for l = 0,1,…,20 by solving the nonlinear differential equation (2.31) (K_v = 6, K_p = 3, L_v = 0, L_p = 0, u_0(t) = 0, Δt_s = 0.1 [sec], q_l(0) = 0, q̇_l(0) = 0) reveals the convergence behavior and the convergence rate of the proposed NILC with SD feedback. The profile of e_l^max versus the iteration number is depicted (continuous line) in Fig. 2.5.
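The convergence profile of such a scheme can be reproduced on a minimal one-degree-of-freedom stand-in: a true inertia, an inaccurate model estimate, a zero-order-hold (sampled-data) computed-torque feedback, and an iteratively learned feedforward term. Every number below is illustrative; this is not the PUMA560 model.

```python
import numpy as np

# Toy NILC with sampled-data feedback on the plant m*q'' = u_ff + u_c.
m, m_hat = 2.0, 1.5                      # true inertia vs. model estimate
Kv, Kp, Lv, Lp = 6.0, 3.0, 6.0, 3.0
T, h, dt_s = 2 * np.pi, 1e-3, 0.1        # horizon, solver step, sampling
t = np.arange(0.0, T, h)
qd, dqd, ddqd = 1 - np.cos(t), np.sin(t), np.cos(t)
hold = round(dt_s / h)                   # solver steps per feedback sample

def run_iteration(u_ff):
    """Integrate the plant with the feedback held between samples (ZOH)."""
    q, v = np.zeros_like(t), np.zeros_like(t)
    u_c = 0.0
    for k in range(len(t) - 1):
        if k % hold == 0:                # refresh the ZOH feedback
            u_c = m_hat * (ddqd[k] + Kv * (dqd[k] - v[k]) + Kp * (qd[k] - q[k]))
        v[k + 1] = v[k] + h * (u_ff[k] + u_c) / m
        q[k + 1] = q[k] + h * v[k]
    return q, v

u, errors = np.zeros_like(t), []
for l in range(8):
    q, v = run_iteration(u)
    errors.append(np.max(np.abs(qd - q)))
    ddq = np.gradient(v, h)              # measured acceleration
    u = u + m_hat * ((ddqd - ddq) + Lv * (dqd - v) + Lp * (qd - q))
print(["%.4f" % e for e in errors])      # e_l^max should shrink with l
```

Here |1 − m̂/m| = 0.25 < 1, so the sufficient condition of the contraction type holds and the per-iteration maximal error decreases.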
Fig. 2.5. e_l^max obtained by the standard (continuous line) and bounded-error (dotted line) algorithms.
Referring to Fig. 2.5, one can see a fast and monotonic convergence of the iterative procedure; obviously, in this case the maximal learning error e^max given by Eq. (2.32) equals the initial tracking error e_0^max. However, a problem for the practical application of the considered NILC arises because of the large initial error e_0^max; this problem can be solved by using the "Bounded-error algorithm" for NILC presented in Chapter 1.

The profile of e_l^max obtained by the "Bounded-error algorithm" (with a 0.5 [rad] error-norm bound) is shown (dotted line) in Fig. 2.5.
In order to illustrate how the feedforward control term u_l(t), t ∈ [0, 0.25π], compensates for the uncertainties of the sampled-data feedback control term u_c^{*,Δt}, we investigate by simulation the control input at the final, 20th iteration, which results in the profiles of u_c^{*,Δt}, u_20 and u shown in Fig. 2.6.

The graphs in Fig. 2.6(a), Fig. 2.6(b) and Fig. 2.6(c) show the feedback (u_c^i, u_c = u_c^{*,Δt}), feedforward (u_20^i) and resultant (u^i = u_c^i + u_20^i, i = 1,2,3) control inputs for the first, second and third joints, respectively, while Fig. 2.6(d), Fig. 2.6(e) and Fig. 2.6(f) illustrate the same control terms for the fourth, fifth and sixth joints of the robot arm.
Fig. 2.6. Feedback u_c^i: (a) i = 1–3, (d) i = 4–6; feedforward u_20^i: (b) i = 1–3, (e) i = 4–6; u^i = u_c^i + u_20^i: (c) i = 1–3, (f) i = 4–6. Continuous line: i = 1,4; dashed line: i = 2,5; dotted line: i = 3,6.
Finally, the way the piecewise-constant feedback control (dotted line) and the piecewise-continuous feedforward control (dashed line) add up to the continuous control input (continuous line) of the second joint is schematically represented in Fig. 2.7.
The simulation result shows that the piecewise-continuous feedforward control compensates for the inaccuracies (instability) that arise from the piecewise-constant feedback with a sampling interval of 0.1 sec. This sampling interval is relatively long and, given the capabilities of modern CPUs, makes real-time implementation (real-time computation) of complex dynamics-based feedback control possible.
Fig. 2.7. Feedback u_c^2 (dotted line), feedforward u_20^2 (dashed line) and u^2 = u_c^2 + u_20^2 (continuous line).
2.5 Conclusions

The continuous nonlinear iterative learning control with sampled-data feedback proposed in this paper for continuous nonlinear multiple-input multiple-output time-varying systems, such as robotic manipulators, consists of two controllers (see Eq. (2.31)): a continuous-time learning controller (see Eq. (2.2)) and a sampled-data (SD) feedback controller (see Eq. (2.3)). An off-line ILC scheme with SD feedback is presented in Fig. 2.3. The SD feedback is required for feedback controllers based on complicated dynamic models, for instance, dynamic models of 6- or 7-DOF robotic manipulators (see Eq. (2.31)). In this case, the nonlinear model-based feedback control requires many on-line calculations, which cause a time delay and a corresponding sampling interval that should not be neglected, while the learning controller requires only off-line calculations; therefore, the nonlinear model-based feedforward control can be assumed to be continuous.
We came to the following conclusions for the proposed NILC with SD
feedback for robotic manipulators:
- The robustness and convergence of the proposed NILC with SD feedback are proven (by Theorem 2.2) with respect to the time-weighted norm (λ-norm) of the control input error.
- The sufficient condition for the robustness and convergence of the proposed NILC with SD feedback is the same as the corresponding condition of a standard NILC with a continuous feedback controller attached, implying that the convergence does not depend on the sampling interval, as is the case for other types of Sampled-data ILC (Fig. 2(a)) and Multirate ILC (Fig. 2(b)) controllers.
- The proposed NILC with SD feedback for the PUMA560 robotic manipulator is robust and convergent.
- The analysis of the iterative learning control simulations on the PUMA560 robot reveals that at the final iteration the piecewise feedforward control term successfully compensates for the uncertainties of the piecewise-constant feedback control term, so that the sum of both terms is a continuous control input.
- The proposed solution to the instability problem in the Sampled-Data feedback is the main contribution of this paper to the ILC implementation in practice.
In all practical cases of robotic manipulators the input signal belongs to a compact subset and the output trajectory belongs to an allowable subset (q_l^i ∈ [−2π, 2π], i = 1,2,…,6). If the ILC update law (see Eq. (2.2)) produces a feedforward input that lies outside the allowable compact set of inputs, and this set is convex, then projecting it back into the allowable set ensures the validity of Theorem 2.1. If the output trajectories are outside the corresponding allowable set due to trajectory errors, then an application of the "Bounded-error algorithm" presented in Chapter 1 can solve this problem for the NILC.
The main contribution of Theorem 2.2 is that it proves the asymptotic convergence (e_l^max → 0 as l → ∞) of the ILC with a Sampled-Data feedback controller, because only the asymptotic convergence of an ILC can assure its practical implementation.
Chapter 3
Iterative learning control with bounded transient
errors: an application for PUMA 560 robotic
manipulator
3.1 Introduction
Iterative learning control (ILC) for robotic manipulators is a class of self-tuning algorithms that repeatedly execute assigned robot-motion tasks in order to minimize positioning or trajectory-tracking errors, implementation time, or power costs, or to improve other motion characteristics. Thus, the ILC is applicable to industrial robots, which work in a repeatable manner in a deterministic environment.
The main advantages of the ILC for robotic manipulators are as follows. The ILC minimizes the trajectory-tracking errors that arise from the unmodeled dynamics of an actual robot. Theoretically, using only a feedback controller and increasing the corresponding feedback gains also leads to smaller tracking errors [36,37,38], but in practice it may result in an unstable robot motion [37,38]. That is why a precise feedforward control, obtained by an ILC procedure, can improve tracking accuracy and avoid closed-loop stability problems. Moreover, at the end of the ILC procedure the feedback terms of the control input are reduced, so the feed-forward controller dominates the feedback controller [16], which reduces the power costs of the robot motion.
Nevertheless, the practical applicability of the ILC for robotic manipulators suffers from the following basic problems. The number of iterations necessary for a successful ILC procedure is very important for the ILC implementation. Therefore, first, the ILC procedure must converge to a desired trajectory-tracking error, and then an admissible number of iterations should produce the preliminarily given tracking accuracy. Unfortunately, the conditions for ILC convergence proposed in [1,3,35,36] are only sufficient but not necessary. Consequently, an ILC procedure could have a very good rate of convergence even if the corresponding sufficient condition for convergence does not hold [14]. In addition, when these conditions apply, the tracking error converges to zero as the iteration number tends to infinity. So the ILC convergence with respect to different learning operators or learning gains can be examined by computer simulation [1,34-36,14]. If the computer simulation of an ILC procedure reveals a fast convergence for a particular learning operator and desired trajectory, then it is reasonable to expect an acceptable convergence rate during the implementation of this ILC on the actual robot. The validation of the sufficient conditions for ILC convergence may also cause problems due to the necessity of knowing the real robot dynamics. It should be pointed out that only estimates of the dynamic-model parameters are available through identification techniques [31,33,32]. Therefore, to avoid the above-mentioned problems, two conditions for ILC convergence are proposed in [14]. Both conditions concern only the estimates of the dynamic parameters and the accuracy of the applied identification method. An inadmissible error of the initial trajectory tracking can also spoil the practical applicability of the corresponding ILC procedure. As was mentioned, properly tuned feedback gains of a closed-loop ILC can reduce the initial tracking error, but even with a small enough initial tracking error the ILC could fail due to a non-monotonic convergence of the iterative procedure. The ILC computer simulations presented in [35,36,14,15] reveal the non-monotony of the maximal tracking errors versus iterations, which leads to tracking errors of some trials of the ILC procedure that exceed the maximal initial error. An NILC algorithm aimed at solving the problems of the initial tracking error and non-monotonic convergence is proposed in [15]. The algorithm is based on dividing the time interval of the trajectory performance into N subintervals: the NILC is executed for the first subinterval, then for the first and second subintervals, and so forth. Considerable difficulties in the implementation of this algorithm arise from the necessity of determining N through computer simulations so that an acceptable maximal ILC tracking error is obtained.
The objective of this paper is to present and examine, by computer simulation, a new NILC algorithm for robotic manipulators. This algorithm avoids the problems of the practical implementation of the NILC that arise from an inadmissible initial tracking error or from an inadmissible maximal tracking error of the whole NILC procedure; i.e., the new ILC algorithm solves the problem that arises from a possible large transient growth (an increasing transient error). Transient growth is a major problem for a practical ILC application [6], because at the beginning of the iterative procedure the tracking error can temporarily grow very large before converging to a small value. This problem is generally avoided by using some criteria for monotonic convergence [2], but this imposes fundamental performance limitations [39]. The new bounded-error learning algorithm (Chapter 1) does not rely on monotonic convergence to overcome the transient-growth problem; only a classical condition for ILC convergence (Eq. (1.4)) is required.
The chapter is organized as follows. Section 3.2 presents the full dynamic model of the PUMA560, a six-degrees-of-freedom robot [31,32]. Section 3.3 presents the considered NILC for robotic manipulators. The new algorithm for the performance of the NILC procedure is proposed in Section 3.4. The proposed NILC procedure and the new algorithm are validated and examined by computer simulation in Section 3.5.
3.2 Dynamic model of PUMA560 robot
We consider the PUMA560 robotic manipulator with 6 degrees of freedom (6-DOF). The nonlinear MIMO dynamic model of the robot is based on the Lagrange formulation of the equations of motion in the space of generalized coordinates:

\[ A(q)\ddot q + b(q,\dot q) + D\dot q + g(q) + f = u, \tag{3.1} \]

where: q is the 6×1 vector of generalized coordinates (joint angles), q^i ∈ [−2π, 2π], i = 1,…,6; A(q) is the 6×6 symmetric positive-definite inertia matrix; the 6×1 vector b(q, q̇) = B(q)[q̇ q̇] + C(q)[q̇²] takes into account the Coriolis and centrifugal torques, where B(q) and C(q) are the corresponding 6×15 and 6×6 matrices [40]; D = diag{δ_1,…,δ_6} denotes the diagonal 6×6 matrix of the coefficients of viscous friction; g(q) is the 6×1 vector representing the gravity torques; f = [f_1 sign(q̇_1) … f_6 sign(q̇_6)]^T is the vector of the Coulomb friction torques; and u = u_l + u_c is the 6×1 vector of generalized torques, where u_l and u_c are the feed-forward and feedback terms, respectively. The allowable set of generalized torques is a rectangular hyper-parallelepiped: u^i ∈ [−U_i^max, U_i^max]. The symbols [q̇ q̇] and [q̇²] are the 15×1 vector of velocity products and the 6×1 vector of squared velocities, respectively. They are given as follows:

\[ [\dot q\,\dot q] = [\dot q_1\dot q_2,\ \dot q_1\dot q_3,\ \ldots,\ \dot q_{n-1}\dot q_n]^T,\ n = 6; \qquad [\dot q^2] = [\dot q_1^2,\ \ldots,\ \dot q_6^2]^T. \]
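The vectors [q̇ q̇] and [q̇²] entering b(q, q̇) can be assembled mechanically. A minimal sketch follows; the lexicographic ordering of the velocity pairs is an assumption here, since [31] fixes the actual column order of B(q).

```python
import numpy as np
from itertools import combinations

def velocity_products(qdot):
    """Build [q.q.], the 15x1 vector of pairwise velocity products
    q_i' q_j', i < j, and [q.^2], the 6x1 vector of squared joint
    velocities, as used in b(q, q') = B(q)[q.q.] + C(q)[q.^2]."""
    qdot = np.asarray(qdot, dtype=float)
    pairs = np.array([qdot[i] * qdot[j]
                      for i, j in combinations(range(len(qdot)), 2)])
    return pairs, qdot**2

qq, qsq = velocity_products([1, 2, 3, 4, 5, 6])
print(len(qq), len(qsq))   # 15 and 6 for a 6-DOF arm
```

For n joints the pair vector has n(n−1)/2 entries, which gives exactly the 15 columns of B(q) for n = 6.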
This dynamic model is appropriate for computer simulation of a learning control, since the matrices A(q), B(q), C(q) and the vector g(q) are expressed in explicit form in [31].
The coordinate frames attached to the PUMA560 links are shown in Fig. 3.1
and are assigned by the modified Denavit-Hartenberg method [41].
For a realistic computer simulation of the NILC procedure proposed in the next section, we use two sets of model parameters. We assume the explicit dynamic model of the PUMA560 robot arm (the left-hand side of Eq. (3.1)) with the parameters reported by Armstrong et al. [31] and by Corke and Armstrong-Helouvry [32]. This first set of model parameters is described in Table 2.1, and we consider them as virtual parameters ξ_i = (ξ_{1i}, …, ξ_{13i}), i = 1,…,6, where i corresponds to the link number.

Fig. 3.1. Kinematic and design scheme of the PUMA 560 robot arm in its initial position with zero values of the generalized coordinates [14].

For the learning-control-input synthesis, we assume the PUMA 560 model parameters estimated by Tarn et al. [33]. This second set of parameters, denoted ξ̂_i = (ξ̂_{1i}, …, ξ̂_{14i}), i = 1,…,6, is shown in Table 2.2, and we consider it as the set of identification estimates of the virtual parameters.
The joint-torque limits are described only in Table 2.2 because ξ̂ concerns the control input u. These limits are calculated using the parameter estimates of the PUMA560 motors and drives reported in [32].
Another advantage of using the PUMA560 dynamic model is the availability of more than two sets of parameter estimates [32], so that one can select appropriate model parameter values for an accurate computer simulation of the robot motion. However, it is necessary to mention that there are considerable differences between some values of the corresponding parameters reported in Table 2.1 and Table 2.2, respectively, and an inaccurate tracking of the prescribed robot trajectories should be expected. Thus, using inaccurate parameter estimates, we are able to validate, through computer simulation, the efficiency of the new NILC algorithm proposed in Section 3.4.
3.3 Iterative learning control for robotic manipulators
The control system of an industrial robotic manipulator is typically based on a
feedback controller. An improved ILC scheme for a robotic manipulator with a
feedback controller attached is depicted in Fig. 3.2:
Fig. 3.2. Learning control scheme with a feedback controller attached.
where: P represents the plant; C is a feedback controller; the vector of the input trajectory u_l is the feedforward term of the control law; q_l is the actual vector of the output trajectory; and q_d is the vector of the desired output trajectory. The learning controller {L, K} improves u_l by using the tracking error of the output trajectory (q_d − q_l) multiplied by a learning-gain matrix K (P-type learning) and/or the error derivative (q̇_d − q̇_l) multiplied by a learning-operator matrix L (D-type learning). If the learning operator is based on the dynamic model of the plant, then the ILC can be specified as model-based.
The realization of the considered ILC scheme (Fig. 3.2) requires the specification of the feed-forward and feedback controllers, respectively. The feed-forward controller is based on the update control law that improves the feed-forward control term u_l.
For the nonlinear MIMO dynamic model (3.1), we propose the following
update control law:
\[ u_{l+1}(t) = u_l(t) + L(q_l(t))\,[(\ddot q_d(t) - \ddot q_l(t)) + L_v(\dot q_d(t) - \dot q_l(t)) + L_p(q_d(t) - q_l(t))], \tag{3.2} \]

where: L(q_l(t)), l = 0,1,…,N, is a learning operator; u_0(t) is the initial feed-forward control input; t ∈ [0,T] is the tracking time and [0,T] is the robot tracking-time interval; L_v and L_p are positive constants.
The convergence of the iterative procedure (3.2) is necessary for its successful implementation. A sufficient condition for the robustness and convergence of the proposed update control law (3.2) is proven in [3] for the considered dynamic model of robotic manipulators (3.1), i.e.:

\[ \| I - L^{-1}A \| < 1, \tag{3.3} \]

where I is the identity matrix of size 6; A is the inertia matrix of the dynamic equations of motion (3.1); L is the learning operator that is to be specified; and ‖·‖ is the Euclidean matrix norm.
Unfortunately, the sufficient condition (3.3) is not applicable in practice, because the inertia matrix A of an actual robot is not exactly known and only an estimate of A may be available through identification [31,33]. Therefore, we have to specify a learning operator that satisfies (3.3), but we are not able to obtain the learning operator directly from (3.3). That is why, following ideas from [1] for a better convergence rate, we consider a learning operator that is as close as possible to the inertia matrix; in particular, it is taken to be identically equal to an estimate of the inertia matrix, L(q, ξ̂) = Â(q, ξ̂) [14,15], where ξ̂ is the vector of estimated parameters stated in Table 2.2. Thus, for a virtual PUMA560 robot arm based on the dynamic model (3.1) with the parameters ξ stated in Table 2.1, from (3.3) we obtain the sufficient condition for convergence in the case L(q, ξ̂) = Â(q, ξ̂), i.e.:

\[ \| I - \hat A^{-1}(q,\hat\xi)\, A(q,\xi) \| < 1. \tag{3.4} \]
This condition is also not applicable to a real robotic manipulator, but it is applicable to a computer simulation of the virtual robot. So, using the explicit formulation of A reported in [31] and the parameter values of ξ and ξ̂ from Table 2.1 and Table 2.2, respectively, we are able to validate inequality (3.4) for q^i ∈ [0, 2π], i = 1,…,6. A computation of the maximum of the left-hand side of (3.4) results in:

\[ \max_q \| I - \hat A^{-1}(q,\hat\xi)\, A(q,\xi) \| = 0.6013, \quad q^i \in [0, 2\pi], \tag{3.5} \]

with a tolerance of 0.0018. From (3.5) it follows that the sufficient condition for convergence (3.4) holds with respect to q^i ∈ [0, 2π], i = 1,…,6, and consequently the proposed NILC procedure (3.2) is convergent. Moreover, since the inertia matrix A is a periodic function of q with period 2π, i.e. A(q) = A(sin(q_i), cos(q_i)), i = 1,…,6 [31], the learning procedure (3.2) is convergent in the whole space of generalized coordinates q^i ∈ (−∞, ∞), i = 1,…,6.
Before specifying u_0(t), L_v and L_p, it is reasonable to formulate the feedback term u_c of the control law u = u_l + u_c. Therefore, we consider the following stabilizing feedback term [22]:

\[ u_c(q_l,\hat\xi) = \hat A(q_l,\hat\xi)\,[\ddot q_d + K_v(\dot q_d - \dot q_l) + K_p(q_d - q_l)] + \hat b(q_l,\dot q_l,\hat\xi) + \hat D(\hat\xi)\dot q_l + \hat g(q_l,\hat\xi) + \hat f(\hat\xi), \quad l = 1,\ldots,N, \tag{3.6} \]

where, according to Fig. 3.2, q_l and q̇_l are the feedback generalized coordinates and velocities of the output trajectory before it is memorized at the current iteration l, and Â, b̂, D̂, ĝ and f̂, obtained by means of the parameters from Table 2.2, are the corresponding estimates of A, b, D, g and f of Eq. (3.1).
Using the parameters from Table 2.1 and Table 2.2 and combining Eqs. (3.1), (3.2) and (3.6) yields the following NILC law for robotic manipulators in the case L(q, ξ̂) = Â(q, ξ̂):

\[
\begin{aligned}
& A(q_{l+1},\xi)\ddot q_{l+1} + b(q_{l+1},\dot q_{l+1},\xi) + D(\xi)\dot q_{l+1} + g(q_{l+1},\xi) + f(\xi) = u_{l+1}(t,\hat\xi) + u_c(q_{l+1},\hat\xi),\\
& u_{l+1}(t,\hat\xi) = u_l(t,\hat\xi) + \hat A(q_l(t),\hat\xi)\,(\ddot q_d(t) - \ddot q_l(t))\\
& \qquad\qquad + \hat A(q_l(t),\hat\xi)\,[L_v(\dot q_d(t) - \dot q_l(t)) + L_p(q_d(t) - q_l(t))],\\
& u_c(q_{l+1},\hat\xi) = \hat A(q_{l+1},\hat\xi)\,[\ddot q_d + K_v(\dot q_d - \dot q_{l+1}) + K_p(q_d - q_{l+1})] + \hat b(q_{l+1},\dot q_{l+1},\hat\xi)\\
& \qquad\qquad + \hat D(\hat\xi)\dot q_{l+1} + \hat g(q_{l+1},\hat\xi) + \hat f(\hat\xi),\\
& q_{l+1}(0) = q_d(0),\quad \dot q_{l+1}(0) = \dot q_d(0),\quad t \in [0,T],\quad l = 0,\ldots,N.
\end{aligned}
\tag{3.7}
\]
Assuming u_0(t) = 0, from (3.7) for l = 0 we obtain the initial robot control for the proposed iterative process as follows:

\[
\begin{aligned}
& A(q_1,\xi)\ddot q_1 + b(q_1,\dot q_1,\xi) + D(\xi)\dot q_1 + g(q_1,\xi) + f(\xi)\\
& \quad = \hat A(q_1,\hat\xi)\,[\ddot q_d + K_v(\dot q_d - \dot q_1) + K_p(q_d - q_1)] + \hat b(q_1,\dot q_1,\hat\xi) + \hat D(\hat\xi)\dot q_1 + \hat g(q_1,\hat\xi) + \hat f(\hat\xi).
\end{aligned}
\tag{3.8}
\]
If ξ = ξ̂, and consequently Â(q) = A(q), b̂(q, q̇) = b(q, q̇), D̂ = D, ĝ(q) = g(q) and f̂ = f, then from (3.8) we obtain the initial-error dynamics:

\[ A(q_1,\xi)\,(\ddot e_1 + K_v\dot e_1 + K_p e_1) = 0, \tag{3.9} \]

where e_l := q_d − q_l, l = 1. In this case, taking into account that A(q, ·) is positive-definite, the initial feedback control, Eq. (3.8), is asymptotically stable if K_v and K_p are Hurwitz matrices [22]. Therefore, for simplicity, we assume K_v = K_v I and K_p = K_p I, where K_v and K_p are positive constants.
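With scalar gains, each joint of (3.9) reduces to ë + K_v ė + K_p e = 0, whose error decays for any K_v, K_p > 0. A minimal numerical sketch with the gains used in the simulations below:

```python
# Per-joint error dynamics of Eq. (3.9) with Kv = 6, Kp = 3:
# e'' + Kv e' + Kp e = 0, integrated by explicit Euler from a
# unit initial error (illustrative check of asymptotic stability).
Kv, Kp = 6.0, 3.0
h, steps = 1e-3, 20000          # step size, 20 s horizon
e, de = 1.0, 0.0
for _ in range(steps):
    dde = -Kv * de - Kp * e
    e, de = e + h * de, de + h * dde
print(abs(e))                   # decays toward zero for positive gains
```

The characteristic roots here are real and negative (about −0.55 and −5.45), which is why the slower mode dominates the decay rate of the initial tracking transient.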
In fact, there are great differences between the values of ξ and ξ̂ reported in Table 2.1 and Table 2.2, respectively. Consequently, if the values of K_v and K_p are not specified properly, a significant maximal initial tracking error

\[ e_1^{\max} = \max_t \| e_1(t) \|, \quad t \in [0,T], \tag{3.10} \]

is expected because of the initial control (3.8). On the other hand, increasing the feedback gains K_v and K_p leads to a reduction of e_1^max [36-38]. That is why, by varying the values of K_v and K_p, we are able to examine, through computer simulation, the new NILC algorithm proposed in the next section in the case of a large or small initial error e_1^max.
The maximal error of the iterative learning procedure:

\[ e^{\max} = \max_l (e_l^{\max}), \quad l = 1,\ldots,N, \tag{3.11} \]

where:

\[ e_l^{\max} = \max_t \| q_l(t) - q_d(t) \|, \quad t \in [0,T],\ l = 1,\ldots,N, \tag{3.12} \]

is of great importance for the applicability of the proposed NILC (3.7). Therefore, we are also in a position to vary the learning gains L_v and L_p in order to examine the new NILC algorithm in the case of a large transient error e^max.
3.4 Iterative learning algorithm with a bounded
tracking error
In this section, we present the main result of this paper, consisting of an implementation of the new bounded-error algorithm of Chapter 1 for the proposed NILC for robot manipulators (Eq. (3.7)). The algorithm solves problems of the practical application of the NILC arising from inadmissible maximal transient errors e_l^max (Eq. (3.12)).
Our previous simulation results presented in Chapter 1 and Chapter 2 [14,15], and similar results reported in [35,36], reveal the non-monotony of the maximal tracking errors e_l^max, l = 1,…,N (Eq. (3.11)) versus iterations, as shown in Fig. 3.3.
Fig. 3.3. Convergence of the maximal trajectory error e_l^max [rad] of the PUMA560 tracking performance [14].
The maximal transient error e^max (Eq. (3.12)) of the iterative procedure depicted in Fig. 3.3 equals the corresponding initial error e_1^max (Eq. (3.10)), and consequently increasing the feedback gains K_v and K_p (Eq. (3.9)) could reduce it [36-38]; in practice, however, for a digital controller with a time delay, increasing the feedback gains may cause tracking instability [37,38]. On the other hand, a small initial error e_1^max does not guarantee a correspondingly smaller maximal transient error e^max, as shown in Fig. 2, Fig. 1.2 and Fig. 3.4.

For instance, if the initial error e_1^max is admissible for an assigned task and the transient error e^max is larger than e_1^max (Fig. 3.4), then the ILC procedure could fail due to a collision between the robot arm and some obstacle in the environment. That is why an algorithm for the ILC performance with a bounded maximal transient error e^max is of great importance for the ILC implementation in practice.
The main idea of the new algorithm is that at each iteration the robot motion trajectory must stay inside a hyper-tube of width 2ε around the desired trajectory in the space of generalized coordinates, i.e.:

Fig. 3.4. Maximal tracking error [rad] vs. iterations.

\[ q_l(t) \in \{\, q(t) : \| q(t) - q_d(t) \| \le \varepsilon,\ t \in [0, T_l^f],\ T_l^f \in (0, T] \,\}, \quad l = 1,\ldots,N, \tag{3.13} \]
where ε > 0 is a preliminarily given error bound, and q_l(t), t ∈ [0, T_l^f], T_l^f ∈ (0, T], is the solution of the differential equation (3.7) with the initial conditions q_l(0) = q_d(0) and q̇_l(0) = q̇_d(0) for the corresponding iteration l. Thus, starting from the initial position q_l(0) with the initial velocity q̇_l(0), the trajectory-tracking motion of the robot continues while the tracking-error norm is less than ε, and the robot stops if this norm becomes equal to ε (inequality (3.13)). Obviously, an immediate stop of the robot motion is impossible. Consequently, the condition for stopping the robot has to be more restrictive than (3.13), so that the robot arm starts decelerating in advance in order for inequality (3.13) to remain valid when it stops. Therefore, the robot should start decelerating when the following condition holds:

\[ \| q_l(T_l^s) - q_d(T_l^s) \| = \hat\varepsilon, \quad \hat\varepsilon > 0,\ T_l^s \in (0, T],\ l = 1,\ldots,N, \tag{3.14} \]

where ε̂ = ε − Δq with Δq ≥ Δq_i, i = 1,…,6, and Δq_i is the joint-angle displacement necessary for the corresponding joint to come to a stop from maximum velocity at maximum deceleration when (3.14) holds. Thus, condition (3.14) guarantees that inequality (3.13) holds and the tracking error e_l^max, l = 1,…,N, is bounded by ε.
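The braking displacement needed in (3.14) follows from uniform deceleration: a joint braking from its maximum velocity at constant maximum deceleration travels Δq_i = (q̇_i^max)² / (2 q̈_i^max). The sketch below uses illustrative velocity and deceleration limits, not the PUMA560 drive data of [32], and the way ε̂ is tightened from ε is an assumption consistent with the text above.

```python
import numpy as np

def stop_displacement(v_max, a_max):
    """Joint-angle displacement traveled while braking from maximum
    velocity v_max at constant maximum deceleration a_max:
    dq = v_max**2 / (2 * a_max), used to tighten the stop threshold
    in condition (3.14)."""
    v_max = np.asarray(v_max, dtype=float)
    a_max = np.asarray(a_max, dtype=float)
    return v_max**2 / (2.0 * a_max)

# illustrative per-joint limits [rad/s] and [rad/s^2] (hypothetical)
v_lim = np.array([3.0, 3.0, 3.0, 4.0, 4.0, 4.0])
a_lim = np.array([10.0, 10.0, 10.0, 20.0, 20.0, 20.0])
dq = stop_displacement(v_lim, a_lim)
eps = 0.7
eps_hat = eps - dq.max()        # one way to pick eps_hat (assumption)
print(dq.max(), eps_hat)        # dq.max() = 0.45 with these limits
```

The worst-case joint dictates how much earlier than the ε boundary the deceleration must begin.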
Given an attainable desired trajectory q_d(t), q_d^i(t) ∈ [−2π, 2π], i = 1,…,6, t ∈ [0, T], the feedback gains K_v and K_p, the learning gains L_v and L_p, and an error-norm bound ε̂ > 0, the new algorithm for the implementation of the NILC procedure (Eq. (3.7)) can be formulated as follows:

I. Set the initial iteration number l = 0 and begin the iterative procedure.

II. Starting from the initial position q_l(0) = q_d(0) with the initial velocity q̇_l(0) = q̇_d(0), the robot arm follows the desired trajectory under the NILC law, described on the right-hand side of Eq. (3.7), until ||q_l(t) − q_d(t)|| = ε̂, t ∈ [0, T_l^s), or until the end tracking position q_l(T) is reached. If during the current tracking performance condition (3.14) becomes valid at the very time t = T_l^s, T_l^s ∈ (0, T], then the robot arm comes to a stop as fast as possible at the maximum deceleration.
III. After the current tracking performance has finished, the learning controller updates the feed-forward control term according to the following learning update law:

\[
\begin{aligned}
& u_{l+1}(t, T_l^s, \hat\xi) = u_l(t, T_l^s, \hat\xi) + u_l^*(t, T_l^s, \hat\xi), \qquad u_0(t, T_0^s, \hat\xi) = 0,\\
& u_l^*(t, T_l^s, \hat\xi) = \hat A(q_l(t),\hat\xi)\,[\ddot q_d(t) - \ddot q_l(t) + L_v(\dot q_d(t) - \dot q_l(t))]\\
& \qquad\qquad + \hat A(q_l(t),\hat\xi)\,L_p(q_d(t) - q_l(t)), \quad t \in [0, T_l^s],\ T_l^s \in (0, T];\\
& u_l^*(t, T_l^s, \hat\xi) = 0, \quad t \in (T_l^s, T],
\end{aligned}
\tag{3.15}
\]

which is obtained from (3.2) for the resultant trajectory q_l(t), t ∈ [0, T_l^s], T_l^s ∈ (0, T].
IV. If the maximal tracking error e_l^max for q_l(t), t ∈ [0, T_l^s], is less than or equal to an acceptable tracking accuracy, then exit the learning procedure; else set l = l + 1 and go to step II.
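Steps I-IV can be sketched on a toy one-degree-of-freedom plant (not the PUMA560 model; all numbers are illustrative): each trial is cut short as soon as the error norm reaches ε̂, and the feedforward term is updated only on the completed portion [0, T_l^s].

```python
import numpy as np

# Bounded-error NILC (steps I-IV) on the toy plant m q'' = u.
m, m_hat = 2.0, 1.5
Kv, Kp, Lv, Lp = 6.0, 3.0, 6.0, 3.0
eps_hat = 0.5                              # stop threshold of step II
T, h = 2 * np.pi, 1e-3
t = np.arange(0.0, T, h)
qd, dqd, ddqd = 1 - np.cos(t), np.sin(t), np.cos(t)

def trial(u_ff):
    """One tracking trial; stops early once |q - q_d| reaches eps_hat
    (step II) and returns the stop index k_s, i.e. T_l^s = t[k_s]."""
    q, v = np.zeros_like(t), np.zeros_like(t)
    for k in range(len(t) - 1):
        if abs(q[k] - qd[k]) >= eps_hat:
            return q, v, k                 # early stop
        u_c = m_hat * (ddqd[k] + Kv * (dqd[k] - v[k]) + Kp * (qd[k] - q[k]))
        v[k + 1] = v[k] + h * (u_ff[k] + u_c) / m
        q[k + 1] = q[k] + h * v[k]
    return q, v, len(t) - 1

u, max_errors = np.zeros_like(t), []
for l in range(6):
    q, v, ks = trial(u)
    max_errors.append(np.max(np.abs(q[:ks + 1] - qd[:ks + 1])))
    corr = m_hat * ((ddqd - np.gradient(v, h)) + Lv * (dqd - v) + Lp * (qd - q))
    u[:ks + 1] += corr[:ks + 1]            # step III: update only on [0, T_s]
print(["%.3f" % e for e in max_errors])    # trial errors stay bounded by eps_hat
```

Even when a trial is truncated, the measured maximal error never exceeds the bound (up to one solver step), which is exactly the guarantee that condition (3.14) provides for the real robot.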
Therefore, from (3.1), (3.6) and (3.15), for K_v = K_v I, K_v > 0, and K_p = K_p I, K_p > 0, it follows that the NILC law of the new algorithm is:

\[
\begin{aligned}
& A(q_{l+1},\xi)\ddot q_{l+1} + b(q_{l+1},\dot q_{l+1},\xi) + D(\xi)\dot q_{l+1} + g(q_{l+1},\xi) + f(\xi) = u_{l+1}(t, T_l^s, \hat\xi) + u_c(q_{l+1},\hat\xi),\\
& u_{l+1}(t, T_l^s, \hat\xi) = u_l(t, T_l^s, \hat\xi) + u_l^*(t, T_l^s, \hat\xi), \quad t \in [0, T],\ T_l^s \in (0, T],\\
& T_l^s:\ \| q_l(T_l^s) - q_d(T_l^s) \| = \hat\varepsilon,\\
& u_c(q_{l+1},\hat\xi) = \hat A(q_{l+1},\hat\xi)\,[\ddot q_d + K_v(\dot q_d - \dot q_{l+1}) + K_p(q_d - q_{l+1})] + \hat b(q_{l+1},\dot q_{l+1},\hat\xi)\\
& \qquad\qquad + \hat D(\hat\xi)\dot q_{l+1} + \hat g(q_{l+1},\hat\xi) + \hat f(\hat\xi),\\
& q_{l+1}(0) = q_d(0),\quad \dot q_{l+1}(0) = \dot q_d(0),\quad u_0(t, T_0^s, \hat\xi) = 0,\quad l = 0,\ldots,N,
\end{aligned}
\tag{3.16}
\]
where ε̂ is the preliminarily given tracking-error norm bound, which fixes T_l^s as described in step II. In this way T_l^s determines the current output trajectory q_l(t), t ∈ [0, T_l^s], which is used for the off-line update of the feed-forward term u_{l+1}(t) according to (3.16).
The NILC procedure described above (Eq. (3.16)) is robust and convergent if the sufficient condition (3.4) holds. The proof is presented in Appendix B.
3.5 Simulation results and discussions
In this section, we present the simulation results from the implementation of the NILC algorithm (Eq. (3.16)) described in the previous section. These results are compared with the simulation results of the standard NILC algorithm (Eq. (3.2)) in cases of inadmissible initial tracking errors e_1^max (Eq. (3.10)), of inadmissible transient errors e_l^max, l > 1 (Eq. (3.12)), and of both inadmissible initial and transient errors. The convergence rate of both NILC procedures (Eq. (3.7) and Eq. (3.16)) is also investigated through a simulation study of the corresponding maximal tracking errors e_l^max, l = 1,…,N (Eq. (3.11)) versus iterations.
Using the model parameters stated in Table 2.1 and Table 2.2, we are in a position to perform computer simulations of both the standard and the new (Eq. (3.2) and Eq. (3.15)) learning control procedures for the virtual robot PUMA 560, carrying out the following steps:

I. Define a desired trajectory in the space of generalized coordinates: q_d(t) = (q_d^1(t), …, q_d^6(t))^T, q_d^i(t) ∈ [−2π, 2π], t ∈ [0, T], i = 1,…,6;

II. Define the matrices A(q, ξ), B(q, ξ), C(q, ξ), D and the vectors g(q, ξ) and f using the model parameters of the virtual robot arm (Table 2.1) and the formulas reported in [31];

III. Define the matrices Â(q, ξ̂), B̂(q, ξ̂), Ĉ(q, ξ̂), D̂ and the vectors ĝ(q, ξ̂) and f̂ using the parameter estimates (Table 2.2) and the same formulas [31] in order to synthesize the learning operator L(q, ξ̂) = Â(q, ξ̂) and the feedback control term u_c(q, ξ̂) (Eq. (3.7) or Eq. (3.16)).

IV. Implement by computer simulation the iterative learning procedure, which consists of:
- solution of the differential equation of motion (Eq. (3.7) or Eq. (3.16)) to obtain the resultant output trajectory, taking into account the constraints on the control inputs u^i ∈ [−Û_i^max, Û_i^max], i = 1,…,6 (Table 2.2), i.e.: if u^i > Û_i^max or u^i < −Û_i^max, then u^i = Û_i^max or u^i = −Û_i^max, respectively;
- update of the control input according to the learning update law (Eq. (3.2) or Eq. (3.15)),
until an acceptable accuracy is reached for the tracking error e_l^max, l ≤ N.
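The input-constraint handling in step IV is a componentwise clipping of the generalized torques; a minimal sketch (the numerical limits below are illustrative, not the values of Table 2.2):

```python
import numpy as np

def saturate(u, u_max):
    """Clip each generalized torque to [-U_i^max, U_i^max], as in
    step IV; u_max holds the per-joint torque limits."""
    u_max = np.asarray(u_max, dtype=float)
    return np.clip(u, -u_max, u_max)

u_max = [97.6, 186.4, 89.4, 24.2, 20.1, 21.3]   # illustrative limits [N m]
u = np.array([120.0, -50.0, 100.0, 10.0, -30.0, 21.3])
print(saturate(u, u_max))
```

Inside the simulation loop the clipping is applied at every evaluation of the right-hand side, so the solver always integrates with admissible torques.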
Let the desired trajectory be given in generalized coordinates by:

\[
\begin{aligned}
& q_d^1(t) = 0; \quad q_d^2(t) = 2\cos(t) - 2; \quad q_d^3(t) = \cos(t) - 1;\\
& q_d^4(t) = 3\cos(t) - 3; \quad q_d^5(t) = 1.5\cos(t) - 1.5; \quad q_d^6(t) = 0,
\end{aligned}
\tag{3.17}
\]

and t ∈ [0, 2π]. The specification of q_d^1(t) = 0 and q_d^6(t) = 0 does not simplify the calculation of the learning operator L(q, ξ̂) = Â(q, ξ̂) (Eq. (3.2)), because according to the Full Explicit PUMA 560 Model [31] the formulation of the inertia matrix A(q) does not depend on q_d^1(t) and q_d^6(t). From Eq. (3.17), one can obtain the trajectories of the desired generalized velocities q̇_d^i, i = 1,…,6, i.e.:

\[
\begin{aligned}
& \dot q_d^1(t) = 0; \quad \dot q_d^2(t) = -2\sin(t); \quad \dot q_d^3(t) = -\sin(t);\\
& \dot q_d^4(t) = -3\sin(t); \quad \dot q_d^5(t) = -1.5\sin(t); \quad \dot q_d^6(t) = 0,
\end{aligned}
\tag{3.18}
\]

and t ∈ [0, 2π].
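A trajectory pair of this kind is straightforward to tabulate. In the sketch below the profiles are taken in the form q_d^i(t) = a_i (cos t − 1) with amplitudes (0, 2, 1, 3, 1.5, 0), which matches the zero initial positions and velocities used in the simulations; the number of samples is arbitrary.

```python
import numpy as np

t = np.linspace(0.0, 2 * np.pi, 629)
amps = np.array([0.0, 2.0, 1.0, 3.0, 1.5, 0.0])   # joint amplitudes a_i
qd = amps[:, None] * (np.cos(t) - 1.0)            # desired positions
dqd = -amps[:, None] * np.sin(t)                  # desired velocities
print(qd.shape, float(np.max(np.abs(qd[:, 0]))))  # (6, 629); starts at 0
```

Tabulating both q_d and q̇_d on the same time grid is what the learning update needs, since the correction term is evaluated sample by sample.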
Profiles of q_d^i(t) and q̇_d^i(t), t ∈ [0, 2π], i = 1,…,6, computed with MATLAB, are shown in Fig. 3.5, where the number of each curve corresponds to the coordinate number i = 1,…,6.
3.5.1 Simulation results in cases of an unacceptable initial
tracking error
Solving the nonlinear differential equation (3.8) with different values of the feedback gains K_v and K_p of the initial control u_c(q_1, ξ̂) and with the initial conditions q_1(0) = 0 and q̇_1(0) = 0 (Eq. (3.17)), one can obtain different trajectories q_1(t) and, consequently, different corresponding initial maximal errors e_1^max (Eq. (3.10)). Thus, as a result of varying the values of K_v and K_p, for K_v = 6 and K_p = 3 we have an attainable trajectory q_1(t), where q_1^i(t) ∈ [−2π, 2π], t ∈ [0, 2π], i = 1,…,6, depicted in Fig. 3.6.
a) Graphs of q_{di}(t);  b) Graphs of q̇_{di}(t).
Fig. 3.5. Desired trajectories of generalized coordinates and the corresponding velocities; the number of each curve is the coordinate number i (axes: t [s], q [rad]).
a) Graphs of q_1^1(t), q_1^25(t) and q_{d1}(t);  b) Graphs of q_2^1(t), q_2^25(t) and q_{d2}(t);
c) Graphs of q_3^1(t), q_3^25(t) and q_{d3}(t);  d) Graphs of q_4^1(t), q_4^25(t) and q_{d4}(t);
e) Graphs of q_5^1(t), q_5^25(t) and q_{d5}(t);  f) Graphs of q_6^1(t), q_6^25(t) and q_{d6}(t).
Fig. 3.6. Trajectories q_i^1(t) (dashed lines), q_i^25(t) (solid lines) and q_{di}(t), i = 1,…,6 (dotted lines), in the case of the standard NILC control (Eq. (3.7)).
The maximal initial tracking error e_1^max, calculated from (3.10), is 1.5843 [rad], and we assume that it is unacceptable. Nevertheless, despite this error, assuming learning gains (Eq. (3.2)) equal to the feedback gains, L_v = K_v and L_p = K_p, and continuing the computer simulation of the standard NILC procedure (3.7), at the 25th iteration we obtain an acceptable tracking error e_25^max = 0.0025 [rad]. The prescribed accuracy of the MATLAB solver of ordinary differential equations, based on the Runge-Kutta method, is 10^{-4}. The corresponding trajectories q_i^25(t), t ∈ [0, 2π], together with the desired trajectories of the robot links in the space of generalized coordinates, are also shown in Fig. 3.6.
The profile of the maximal tracking errors e_l^max, l = 1,…,25, versus iterations provides the best information about the convergence of the standard NILC procedure. This profile is depicted in Fig. 3.7.
Referring to Fig. 3.7, the profile of e_l^max, l = 1,…,25 (dashed line), is monotonically decreasing and the rate of convergence is high; but in this case the standard NILC procedure (3.7) is not applicable in practice because of the large initial error.
Assuming K_v = 6, L_v = 6 and K_p = 3, L_p = 3, and ε = 0.7 [rad], ε̂ = 0.5 [rad], the computer simulation of the iterative learning procedure (3.16) results in the profile of the corresponding tracking errors e_l^max vs. iterations l = 1,…,25 given by the solid curve in Fig. 3.7. The accuracy of the Runge-Kutta method applied by MATLAB for solving the differential equation (3.16) is 10^{-4}, and the corresponding maximal tracking error at the last iteration is e_25^max = 0.0018 [rad]. Thus, the accuracy of both NILC procedures (Eq. (3.7) and Eq. (3.16)) is similar, but the maximal transient error e^max of the NILC (3.16) is less than 0.6 [rad] and, consequently, the problem considered above concerning the initial error of the standard NILC is solved.
The graph of the error e_l^max of the iterative process (3.16) (Fig. 3.7) shows a lower rate of convergence than the corresponding rate for (3.7). For the particular case considered, the standard NILC reaches an acceptable accuracy at the 15th iteration, while the bounded-error NILC (3.16) requires 20 iterations for the same result. Actually, the last iteration number N is of great importance for the applicability of the learning control algorithm, because the total elapsed time of the standard learning process equals the number of iterations N multiplied by the iteration time T (NT). As mentioned above, the standard procedure requires at least 15 iterations, and the corresponding elapsed time is 30π [sec]. At least 20 iterations are required for implementation of the new learning procedure (Eq. (3.16)), but the elapsed time is less than 40π [sec], because for each of the iterations l = 1,…,8 the tracking time T_l^s < T, so that the elapsed time of the whole process is Σ_{l=1}^{8} T_l^s + 12T. Its exact value can be calculated using the profile of T_l^s, l = 1,…,25, vs. iterations shown in Fig. 3.8.
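The elapsed-time bookkeeping described above can be sketched as follows; the eight shortened tracking times in this example are invented numbers, not the simulated values of Fig. 3.8:

```python
import math

def elapsed_time(tracking_times):
    """Total duration of a learning process whose l-th trial runs for
    tracking_times[l] seconds (T_l^s <= T, with T_l^s = T once the
    error stays within the bound for the whole trial)."""
    return sum(tracking_times)

T = 2.0 * math.pi                 # iteration (trial) time
# hypothetical example: 8 early trials stopped before T, then 12 full trials
short = [0.5, 0.7, 1.0, 1.4, 1.9, 2.5, 3.3, 4.2]   # invented T_l^s values
times = short + [T] * 12
total = elapsed_time(times)       # = sum of the short trials + 12*T
standard = 20 * T                 # the standard process always runs N*T
```

Because the first trials are cut short, the bounded-error process is shorter than the 20-iteration standard process even though it needs more iterations.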
Fig. 3.8. Tracking times of the error-bounded learning for K_v = 6, L_v = 6 and K_p = 3, L_p = 3.
Fig. 3.9. Tracking trajectories of the third joint: q_3^l(t), t ∈ (0, T_l^s], l = 1, 4, 6, 7, 10, 20, for K_v = 6, L_v = 6 and K_p = 3, L_p = 3.
Consequently, the total duration of the new procedure (3.16) is less than the duration of the standard one, despite the greater number of iterations minimally required for its successful implementation.
Fig. 3.9 illustrates in detail the new learning process for the third link of the robot arm. The trajectories q_3^l(t), t ∈ (0, T_l^s], l = 1, 4, 6, 7, 10 and 20, are shown, and the corresponding maximal tracking errors e_3^l = max_t |q_{d3}(t) − q_3^l(t)|, t ∈ (0, T_l^s], are bounded, i.e. e_3^l < 0.8 [rad].
3.5.2 Simulation results in cases of an unacceptable transient error
The results presented here concern both the NILCs (3.7) and (3.16) in the case of an acceptable initial error e_1^max and an inadmissible transient error e^max (Eq. (3.11)).
Fig. 3.10. Maximal tracking error of the standard and the error-bounded learning for K_v = 16, L_v = 0 and K_p = 48, L_p = 16.
Increasing the values of the feedback gains K_v and K_p reduces the initial tracking errors [36-38], but one can find learning gains L_v and L_p which cause bigger errors e_l^max, l = 2,…,N. Thus, for K_v = 16, L_v = 0 and K_p = 48, L_p = 16, the resultant graph of the tracking errors e_l^max vs. iterations l = 1,…,50, obtained from the computer simulation of the iterative learning procedure (3.7), is given by the dashed curve in Fig. 3.10.
The initial tracking error is e_1^max = 0.4835 [rad], while the transient error is e^max = 3.0805 [rad] and the final tracking error is e_50^max = 0.007 [rad]. For ε = 0.7 [rad], ε̂ = 0.5 [rad] and the same feedback and learning gains, the computer simulation of the NILC (3.16) gives the profile of the tracking errors e_l^max shown by the solid curve in Fig. 3.10. The values of the corresponding errors are e_1^max = 0.4835 [rad], e^max = 0.5101 [rad] and e_50^max = 0.0077 [rad]. Again, the accuracy of the Runge-Kutta method for both computer simulations is 10^{-4}.
Fig. 3.11. Tracking times of the error-bounded learning for K_v = 16, L_v = 0 and K_p = 48, L_p = 16.
Consequently, the iterative procedures (3.7) and (3.16) have a similar rate of convergence and reach an acceptable accuracy at the 50th iteration; but the transient error e^max of the standard NILC is unacceptable, while the corresponding error of the NILC (3.16) is bounded, i.e. e^max < ε, and thus the problem arising from the big transient error of the standard procedure is solved.
The duration of the standard learning process is 100π [sec]. Referring to Fig. 3.11, where the profile of the tracking times T_l^s, l = 1,…,50, is plotted, the elapsed time of the error-bounded learning process (3.16), equal to the sum of these tracking times, is again shorter.
3.5.3 Simulation results in cases of both initial and transient errors
As results of the computer simulations of the NILCs (3.7) and (3.16) for K_v = 6, L_v = 3 and K_p = 3, L_p = 6, and ε = 0.7 [rad], ε̂ = 0.5 [rad], two graphs of the corresponding tracking errors e_l^max, l = 2,…,20, are shown in Fig. 3.12.
In this case, the initial error of the standard NILC (3.7) is the same (e_1^max = 1.5843 [rad]) as for the first graph depicted by the dashed line in Fig. 3.7, because the feedback gains are also the same. The corresponding transient and final tracking errors of the dashed and solid curves shown in Fig. 3.12 are e^max = 2.8762 [rad] and e_20^max = 0.0037 [rad] for the dashed curve, and e^max = 0.5218 [rad] and e_20^max = 0.0012 [rad] for the solid curve. Analogously to Fig. 3.7, the standard learning procedure has a better rate of convergence than the error-bounded one. The dashed curve reaches an acceptable error at the 10th iteration (e_10^max = 0.0062 [rad]), while the solid one does the same at the 12th iteration (e_12^max = 0.0054 [rad]); however, the corresponding elapsed times (20π [sec] for the standard procedure, against the smaller sum of the shortened tracking times for the bounded one) show that, in this particular case, the learning algorithm proposed in Section 3.4 is faster than the standard one described in Section 3.3.
Fig. 3.12. Maximal tracking error of the standard and the bounded-error learning for K_v = 6, L_v = 3 and K_p = 3, L_p = 6.
Fig. 3.13. Tracking times of the error-bounded learning for K_v = 6, L_v = 3 and K_p = 3, L_p = 6.
The profile of the tracking times T_l^s, l = 1,…,20, is depicted in Fig. 3.13. The simulation results presented above verify the practical applicability of the proposed learning algorithm (Section 3.4).
With regard to the possible reduction of the transient error of the standard NILC (3.7), it has to be mentioned that similar or better results could be obtained by specification of appropriate feedback and learning gains; but, in a practical setting, such a specification with respect to an actual industrial robot is difficult, complicated and time-consuming. Therefore, computer simulations could be used to find proper values of these gains. However, due to the unmodeled dynamics, neither simulations nor analytical methods fully guarantee a successful real implementation of the standard NILC (3.7). That is why, by ensuring in a natural way that a robot will track a desired trajectory within a given tracking error, the proposed NILC (3.16) is useful in practical applications without any need for preliminary computer simulations. In any case, a computer simulation of a learning procedure saves effort, time and energy if the resultant feed-forward control is going to be applied as the initial feed-forward term of an ILC implemented in practice. The resultant feed-forward control trajectories u_i^20(t), t ∈ [0, 2π], i = 1,…,6, obtained from the last simulation of the error-bounded learning procedure (3.16), are shown in Fig. 3.14.
Fig. 3.14. Generalized torques of the feed-forward control input at the 20th iteration of the error-bounded learning for K_v = 6, L_v = 3 and K_p = 3, L_p = 6.
a) Graphs of u_i^20(t), i = 1, 2, 3;  b) Graphs of u_i^20(t), i = 4, 5, 6.
In this chapter, we consider a constant norm bound on the maximal tracking error in order to simplify the description of the error-bounded learning algorithm. The trajectory-tracking errors in the space of generalized coordinates can be bounded more precisely if the condition (3.14) is replaced by
|q_i^l(T_l^s) − q_{di}(T_l^s)| ≤ ε̂_i, ε̂_i > 0, i ∈ {1,…,6}, T_l^s ∈ (0, T], l = 1,…,N.
This will increase the applicability of the error-bounded learning algorithm. Moreover, an error bound could be a function of time, defined by ε̂_i(t) > 0, i = 1,…,6, t ∈ [0, T], because the proof of the convergence of the NILC (3.16) does not depend on ε̂_i(t). Thus, in case of obstacles in the robot's working space, more complicated conditions on the trajectory-tracking errors can be stated in this 3D space. Further, from these conditions one can obtain the corresponding conditions in generalized coordinates by solving the inverse kinematic problems that arise [42], and consequently the proposed learning control (3.16) can be used.
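A sketch of this relaxed, per-joint (and possibly time-varying) error condition is given below; the helper `within_bounds` is a hypothetical illustration written for this text, not part of the thesis software:

```python
import numpy as np

def within_bounds(q, q_d, eps_hat):
    """Check the per-joint, possibly time-varying error condition
    |q_i(t_k) - q_di(t_k)| <= eps_hat_i(t_k) at every sample t_k.

    q, q_d:  arrays of shape (6, K) -- actual and desired joint samples
    eps_hat: a scalar bound, a per-joint bound of shape (6, 1), or a
             time-varying bound of shape (6, K); NumPy broadcasting
             handles all three cases uniformly.
    """
    return bool(np.all(np.abs(q - q_d) <= eps_hat))
```

For example, a constant scalar reproduces the condition (3.14), while an array `eps_hat[i, k]` can encode a tighter tube near an obstacle.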
Finally, the accuracy (the maximal tracking error at the last iteration) of the
presented simulation results improves when a higher accuracy is set for the
MATLAB solver of ordinary differential equations, but the corresponding
computation time also increases.
3.6 Conclusions
A new algorithm for the implementation of a model-based NILC of robotic manipulators is proposed. The algorithm ensures that at each iteration the tracking error norm is bounded by a given error bound ε > 0. Thus, during the implementation of the learning process, all output trajectories in the space of generalized coordinates lie in a hyper-tube of radius ε around the desired trajectory; consequently, this learning strategy solves the problem arising from the unacceptable tracking errors of standard strategies for NILC implementation. The uniform convergence and robustness of the proposed error-bounded learning algorithm are proven.
With regard to the PUMA 560 6-DOF robot, computer simulations are presented in order to verify the practical applicability of the new algorithm. The comparative analysis of the corresponding results of the standard and the error-bounded learning procedures reveals:
- High rates of convergence and achievement of acceptable tracking errors within reasonable numbers of iterations (from 10 to 50);
- A shorter duration of the error-bounded learning process (Eq. (3.16)), compared to the standard one (Eq. (3.7)), in all considered cases;
- Tracking errors of the new learning algorithm that are bounded, in contrast to the corresponding errors of the conventional one.
Chapter 4
References
1. Arimoto, S., S. Kawamura and F. Miyazaki, “Bettering Operation of dynamic
systems by learning: A new control theory for servomechanism of
mechatronics systems”, Proc. of 23rd Conference on Decision and Control,
Las Vegas, NV, pp. 1064-1068, (1984).
2. Ahn, H.-S., K. L. Moore and Y. Chen, Iterative Learning Control: Robustness and Monotonic Convergence for Interval Systems, Springer-Verlag London Limited, London, (2007).
3. Heinzinger, D., B. Fenwick, B. Paden and F. Miyazaki, “Robust Learning
Control”, Proc. of 28th Conference on Decision and Control, Tampa, FL, pp.
436-440, (1989).
4. Heinzinger, D., B. Fenwick, B. Paden and F. Miyazaki, “Stability of Learning
Control with Disturbances and Uncertain Initial Conditions”, IEEE
Transactions on Automatic Control, Vol. 37, No. 1, pp. 110-114, (1992).
5. Ahn, H.-S., and D. A. Bristow, “Special Issue on Iterative Learning Control”,
Asian Journal of Control, Vol. 13, No. 1, pp. 1-2, (2011).
6. Longman, R.W., “Iterative learning control and repetitive control for
engineering practice”, International Journal of Control, Vol. 73, No. 10, pp.
930-954, (2000).
7. Madady, A. and H.-R. Reza-Alikhani, “A guaranteed monotonically
convergent iterative learning control”, Asian Journal of Control, (2011), doi:
10.1002/asjc.397.
8. Schoellig, A. P., J. Alonso-Mora, and R. D'Andrea, “Limited benefit of joint
estimation in multi-agent iterative learning”, Asian Journal of Control,
(2011), doi: 10.1002/asjc.398.
9. Ruan, X., Z. Z. Bien, and Q. Wang, “Convergence properties of iterative
learning control processes in the sense of the Lebesgue-P norm”, Asian
Journal of Control, (2011), doi: 10.1002/asjc.425.
10. Hahn, B. and K. Oldham, “On-off iterative adaptive controller for low-power
micro-robotic step regulation”, Asian Journal of Control, (2011), doi:
10.1002/asjc.410.
11. Ouyang P.R., B.A. Petz and F.F. Xi, “Iterative Learning Control with
Switching Gain Feedback for Nonlinear Systems”, Transactions of the
ASME, Journal of Computational and Nonlinear Dynamics, Vol. 6, No. 1,
pp. 011020-1 –7, (2011).
12. Xu, J.-X. and Y. Tan, Linear and Nonlinear Iterative Learning Control,
Lecture Notes in Control and Information Sciences, Springer, New York,
(2003).
13. Fujimoto, K. and T. Sugie, “Iterative learning control of Hamiltonian
systems: I/O based optimal control approach”, IEEE Trans. on Automatic
Control, 48(10), pp. 1756–1761, (2003).
14. Delchev, K. and E. Zahariev, “Computer simulation based synthesis of
learning control law of robots”, Mechanics Based Design of Structures and
Machines Vol. 36 No 3, pp. 225-248, (2008).
15. Delchev, K. and G. Boyadjiev, “Synthesis and computer simulation of
learning control of horizontal robot arm”, Proc. of 9th IFAC Symposium on
Robot Control “SYROCO’09”, Gifu, Japan, September 9-12, pp. 301-306,
(2009).
16. S. Gopinath and I. Kar: Iterative learning control scheme for manipulators
including actuator dynamics. Mechanism and Machine Theory, 39, (2004),
1367-1384.
17. Delchev K., Iterative learning control for nonlinear systems: A bounded-
error algorithm, Asian Journal of Control 15 (2) , pp. 453-460 (2013).
18. Delchev K., Simulation-based design of monotonically convergent iterative
learning control for nonlinear systems, Archives of Control Sciences,
Volume 22(LVIII) No. 4, (2012) pp. 371–384.
19. Delchev K., Boiadjiev G., Kawasaki, H., Mouri T., Iterative learning
control with sampled-data feedback for robot manipulators, Archives of
Control Sciences, Volume 24(LX) No. 3, (2014) pp. 257–277.
20. Delchev K., Iterative learning control for robotic manipulators: A bounded-error algorithm, International Journal of Adaptive Control and Signal Processing, 28, pp. 1454–1473, (2014). DOI: 10.1002/acs.2454.
21. Shinji, W. and T. Mita, “A Parameter Identification Method of Horizontal
Robot Arms”, Advanced Robotics, Vol. 4, No 4, pp. 337-352, (1990).
22. Kozlov, V., V. Makarychev, A. Timofeev and E. Yurevich, Dynamics of
Robot Control. Moscow, Science, (1984), (in Russian).
23. Pertew, H. Marquez and Q. Zhao, A Direct Sampled-data Design Approach
for Robot Stabilization, in Proc.2008 American Control Conference, Seattle,
Washington, (2008), 369-374.
24. T. Oomen, van de Wijdeven and Q. Zhao, Suppressing intersample behavior
in iterative learning control, Automatica 45, (2009), 981–988.
25. Y.-J. Pan, H. Marquez and T. Chen, Sampled-data Iterative Learning Control
for a Class of Nonlinear Networked Control Systems, in Proc. 2006 American
Control Conference, Minneapolis, Minnesota, (2006), 3494-3499.
26. T.-W. Ma and H. Yang, Sampled Data Feedback-Feedforward Control of
Structures with Time Delays, Journal of Structural Engineering, 132(7),
(2006), 1129–1138. DOI: 10.1061/(ASCE)0733-9445(2006)132:7(1129)
27. B. Zhang, D. Wang, Y. Wang, Y. Ye and K. Zhou, Comparison Studies on
Anti-Aliasing/Anti-Imaging Filtering and Signal Extension in Multi-rate
ILC, in Proc.17th World Congress The International Federation of Automatic
Control, Seoul, (2008), 12468-12473.
28. D. Vassileva, G. Boiadjiev, H. Kawasaki and T. Mouri, Force Compensating
Trajectories for Redundant Robots: Experimental Results, Journal of
Robotics and Mechatronics, 21(1), (2009), 104-112.
29. J. Zhang, Advanced Pulse Width Modulation Controller ICs for Buck DC-
DC Converters, Ph.D. Thesis, University of California, Berkeley, 2006.
30. R. James and G. James, Mathematics dictionary, 5th ed., Chapman & Hall,
New York, 1992.
31. B. Armstrong, O. Khatib and J. Burdick, The explicit dynamic model and
inertial parameters of the PUMA 560 arm, in Proc. IEEE Int. Conf. Robotics
and Automation, vol. 1, San Francisco, USA, (1986), 510-518.
32. P. Corke and B. Armstrong-Helouvry, A search for consensus among model
parameters reported for the PUMA 560 robot, in Proc. IEEE Int. Conf.
Robotics and Automation, vol. 1, San Diego, USA, (1994), 1608-1613.
33. T. Tarn, K. Bejczy, S. Han and X. Yun, Inertia parameters of Puma 560 robot
arm. Tech. Rep. SSM-RL-85-01, Washington University, St. Louis, MO,
1985.
34. Smolders K, Volckaert M, Swevers J. Tracking control of nonlinear lumped
continuous-time systems: A model-based iterative learning approach.
Mechanical Systems and Signal Processing 2008; 22 (8): 1896-1916. doi:
10.1016/j.ymssp.2008.03.004.
35. Park HJ, Cho HS. On the realization of an accurate hydraulic servo through
an iterative learning control. Mechatronics 1992; 2(1): 75-88.
36. Kawamura S, Svinin M. Advances in Robot Control: From Everyday Physics
to Human-Like Movements. Springer, Berlin, 2006.
37. Insperger T, Stepan G. Optimization of digital control with delay by periodic
variation of the gain parameters. Proc. of IFAC Workshop on Adapt. and
Learn. in Control and Signal Processes, Yokohama, Japan, 2004; 145-150.
38. Al-Salem N, Fanni M. Design of digital high-gain PD control systems using
analytical approach. JSME international journal. Series C, Mechanical
systems, machine elements and manufacturing, 2004; 47(3). 792-802.
39. Bristow DA, Singler JR. Analysis of Transient Growth in Iterative Learning
Control Using Pseudospectra. Proc. of Symposium on Learning Control,
Shanghai, China, 2009.
40. Liegois A, Khalil W, Dumas J, Renaud M. Mathematical and computer
models of interconnected mechanical systems. Proc. of Theory and Practice
of Robots and Manipulators, Warsaw, 1977; 5-17.
41. Craig JJ. Introduction to Robotics: Mechanics and Control, Addison-Wesley,
Reading, 1985.
42. Le Boudec B, Saad M, Nerguizian V. Modeling and adaptive control of
redundant robots. Mathematics and Computers in Simulation 2006; 71(4-6):
395-403.
Appendix A
Lemma 2.2
If h_k = h(t_k) is a discrete form of h(t), t ∈ [0, T], where t_k = k Δt_s, k = 0, 1, …, K, Δt_s > 0 is the sampling interval, K = T/Δt_s, and h*(t) = h*(t, h_k) is the corresponding continuous piecewise-constant function on [0, T], then ||h*||_λ ≤ ||h||_λ.
Proof
The λ-norm of h(t), t ∈ [0, T], is defined by
||h||_λ = sup_{t∈[0,T]} e^{−λt} ||h(t)||.   (A1)
For the continuous piecewise-constant function it has to be mentioned that
h*(t, h_k) = h_k, t ∈ [t_k, t_{k+1}), k = 0, 1, …, K−1, and h*(t, h_K) = h_K, t ∈ [t_K, T].
Using h*(t, h_k) = h_k in the right-hand side of the definition (A1), it follows that:
sup_{t∈[t_k, t_{k+1})} e^{−λt} ||h*(t, h_k)|| = ||h_k|| sup_{t∈[t_k, t_{k+1})} e^{−λt} = e^{−λ t_k} ||h_k||, k = 0, 1, …, K−1;
sup_{t∈[t_K, T]} e^{−λt} ||h*(t, h_K)|| = e^{−λ t_K} ||h_K||.   (A2)
On the other hand, for t ∈ [0, T],
sup_{t∈[0,T]} e^{−λt} ||h*(t, h_k)|| = max{ sup_{t∈[t_0, t_1)} e^{−λt} ||h*(t, h_0)||, …, sup_{t∈[t_K, T]} e^{−λt} ||h*(t, h_K)|| }.   (A3)
Thus, from Eq. (A2) and Eq. (A3) it follows:
sup_{t∈[0,T]} e^{−λt} ||h*(t, h_k)|| = max_k e^{−λ t_k} ||h_k||, k = 0, 1, …, K,   (A4)
but h_k = h(t_k), k = 0, 1, …, K, and
max_k e^{−λ t_k} ||h(t_k)|| ≤ sup_{t∈[0,T]} e^{−λt} ||h(t)||.   (A5)
From Eq. (A4) and inequality (A5) we have
sup_{t∈[0,T]} e^{−λt} ||h*(t, h_k)|| ≤ sup_{t∈[0,T]} e^{−λt} ||h(t)||,   (A6)
and, using the definition (A1), we obtain from inequality (A6) that ||h*||_λ ≤ ||h||_λ. ∎
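Lemma 2.2 can be checked numerically; the sketch below builds the piecewise-constant interpolant of an arbitrary test signal and compares the two λ-norms on a fine grid (the signal and all constants are invented for the illustration):

```python
import numpy as np

lam, T, K = 2.0, 1.0, 10
t_fine = np.linspace(0.0, T, 2001)
h = np.sin(7.0 * t_fine) + 0.5 * t_fine          # an arbitrary test signal h(t)

def lam_norm(values, times, lam):
    """Discrete evaluation of the lambda-norm (A1): max of exp(-lam*t)*|h(t)|."""
    return np.max(np.exp(-lam * times) * np.abs(values))

# piecewise-constant interpolant: h*(t) = h(t_k) for t in [t_k, t_{k+1})
t_k = np.linspace(0.0, T, K + 1)
h_k = np.sin(7.0 * t_k) + 0.5 * t_k              # the sampled values h_k = h(t_k)
idx = np.searchsorted(t_k, t_fine, side='right') - 1
h_star = h_k[np.clip(idx, 0, K)]

# the lemma: ||h*||_lambda <= ||h||_lambda
assert lam_norm(h_star, t_fine, lam) <= lam_norm(h, t_fine, lam) + 1e-12
```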
Appendix B
This appendix presents a proof of the robustness and convergence of the learning algorithm proposed in Section 3.4. The proof is based on the convergence Theorem 1.1, presented in Chapter 1 and [3,4], concerning a learning scheme for the following class of multi-input-multi-output nonlinear time-varying state-space equations:
ẋ_l(t) = f(x_l(t), t) + B(x_l(t), t) u_l(t) + ω_l(t),
y_l(t) = g(x_l(t), t),   (B1)
where, for l ∈ {0, 1, …} and all t ∈ [0, T], x_l(t) ∈ R^n, y_l(t) ∈ R^m and u_l(t) ∈ R^r are not necessarily continuous, and ω_l(t) ∈ R^n represents both deterministic and random disturbances. The functions f: R^n × [0, T] → R^n and B: R^n × [0, T] → R^{n×r} are piecewise continuous in t ∈ [0, T], and g: R^n × [0, T] → R^m is differentiable in x and t, with partial derivatives g_x(·,·) and g_t(·,·). In addition, the following assumptions hold [3,4]:
(1) For each fixed initial state x(0) with ω(·) ≡ 0, the output map O: C([0, T], R^r) × R^n → C([0, T], R^m) and the state map S: C([0, T], R^r) × R^n → C([0, T], R^n) are one-to-one. In this notation y_l(·) = O(u_l(·), x_l(0)) and x_l(·) = S(u_l(·), x_l(0)).
(2) The disturbance ω_l(·) is bounded on [0, T], i.e. ||ω(t)|| ≤ b_ω.
(3) The functions f(·,·), B(·,·), g_x(·,·) and g_t(·,·) are uniformly globally Lipschitz in x on the interval [0, T].
(4) The operators B(·,·) and g_x(·,·) are bounded on [0, T] × R^n.
(5) All functions are assumed measurable and integrable.
Let us consider the following learning update law
u_{l+1}(t) = (1 − ε) u_l(t) + ε u_0(t) + L(y_l(t), t)(ẏ_d(t) − ẏ_l(t)),   (B2)
where L: R^m × [0, T] → R^{r×m} is a bounded learning operator and ε ∈ [0, 1) allows the influence of a bias term [3,4].
Theorem B1 (Theorem 1.1) [3,4]
Let the system described by equation (B1) satisfy assumptions (1)-(5) and use the update law, Equation (B2). Given an attainable desired trajectory y_d(t) = g(x_d(t), t), t ∈ [0, T], u_d(t) is the corresponding input and x_d(t) is the corresponding state (according to assumption (1)). If
||(1 − ε)I − L(g(x, t), t) g_x(x, t) B(x, t)|| ≤ ρ < 1, ∀x ∈ R^n, t ∈ [0, T],   (B3)
and the initial state error ||x_d(0) − x_l(0)|| is bounded by b_{x0}, then, as l → ∞, the error between u_l(t) and u_d(t) is bounded. In addition, the state and output asymptotic errors are bounded. These bounds depend continuously on the bound on the initial state error, the bound on the state disturbance, and ε; as b_{x0}, b_ω and ε tend to zero, these bounds also tend to zero.
The proof of Theorem B1 is presented in [3,4], where, in particular, it is proven that
||Δu_{l+1}(t)||_λ ≤ ρ̄ ||Δu_l(t)||_λ + ε̄, 0 ≤ ρ̄ < 1,   (B4)
where Δu_l(t) = u_d(t) − u_l(t), ρ̄ = const, and the λ-norm is defined by
||u_l(t)||_λ = sup_{t∈[0,T]} e^{−λt} ||u_l(t)||,   (B5)
and
limsup_{l→∞} ||Δu_l(t)||_λ ≤ ε̄ (1 − ρ̄)^{−1},
limsup_{l→∞} ||Δy_l(t)||_λ ≤ b_y,  (b_y = k b_{x0} + k_g ε̄ (1 − ρ̄)^{−1}),   (B6)
where Δy_l(t) = y_d(t) − y_l(t), k = k(ρ̄, T) > 0, ε̄ combines b_{x0}, b_ω and ε ||Δu_0||_λ, and k_g is the Lipschitz constant of g(x_l(t), t).
If b_{x0} = 0, b_ω = 0 and ε = 0, then ε̄ = 0 and b_y = 0.
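The asymptotic bound in (B6) follows from iterating the contraction (B4); the following sketch checks numerically that the worst-case recurrence d_{l+1} = ρ̄ d_l + ε̄ settles at the fixed point ε̄(1 − ρ̄)^{−1} (the numbers are arbitrary):

```python
# Iterating  d_{l+1} = rho*d_l + eps_bar,  the worst case of (B4)
rho, eps_bar = 0.6, 0.02
d = 5.0                         # initial lambda-norm of the input error
for _ in range(200):
    d = rho * d + eps_bar
limit = eps_bar / (1.0 - rho)   # the asymptotic bound of (B6)
# with eps_bar = 0 (no disturbance, no initial-state error, eps = 0)
# the same recurrence drives the input error to zero, as stated after (B6)
```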
Corollary B1 (Corollary 1.1) [3]
If the update law (B2) is replaced with
u_{l+1}(t) = (1 − ε) u_l(t) + ε u_0(t) + L(y_l(t), t)(ẏ_d(t) − ẏ_l(t)) + K(y_l(t), t)(y_d(t) − y_l(t)),   (B7)
with K(·,·) bounded, then Theorem B1 still holds.
The proof of Corollary B1 is presented in [3].
For y_l(t) = x_l(t), t ∈ [0, T], x_l(t) = (q_l(t), q̇_l(t))^T, L(y_l(t)) = (0, L(q_l(t))) and K(y_l(t)) = L(q_l(t))(L_p, L_v), from equation (B7) we obtain the update law, Eq. (3.2), considered in Section 3.3.
Applying the proposed algorithm, Eq. (3.15), to the dynamic model described by equation (B1), from equation (B7) we obtain the following update law:
u_{l+1}(t, T_l^s) = (1 − ε) u_l(t, T_l^s) + ε u_0(t) + u*_l(t, T_l^s), u_1(t, T_0^s) = u_0(t),   (B8)
u*_l(t, T_l^s) = L(y_l(t), t)(ẏ_d(t) − ẏ_l(t)) + K(y_l(t), t)(y_d(t) − y_l(t)), t ∈ [0, T_l^s], T_l^s ∈ (0, T];
u*_l(t, T_l^s) = 0, t ∈ (T_l^s, T],
where T_l^s is defined by ||y_d(T_l^s) − y_l(T_l^s)|| = ε̂, l ∈ [0, 1, …).
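The truncated correction u*_l of (B8) and the stopping time T_l^s can be sketched as follows (hypothetical helper functions, exercised with an artificial, linearly growing error signal):

```python
import numpy as np

def stopping_time(t, err, eps_hat, T):
    """T_l^s: the first sample time at which the output error reaches
    eps_hat, or the full trial time T if it never does."""
    hit = np.flatnonzero(np.abs(err) >= eps_hat)
    return t[hit[0]] if hit.size else T

def truncated_correction(t, Ts, correction):
    """u*_l(t, T_l^s) of (B8): the learning correction acts on [0, T_l^s]
    and is zero on (T_l^s, T], so samples recorded after the trial was
    stopped never enter the update."""
    return np.where(t <= Ts, correction, 0.0)

# usage: a linearly growing error hits the bound 0.5 halfway through the trial
t = np.linspace(0.0, 1.0, 101)
Ts = stopping_time(t, err=t, eps_hat=0.5, T=1.0)
u_star = truncated_correction(t, Ts, np.ones_like(t))
```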
Corollary B2 (Corollary 1.2 in Chapter 1)
If ε̂ > e^{λT} b_y (λ ≥ 0) and the update law (B2) is replaced with (B8), then Theorem B1 still holds.
The proof of Corollary B2 is presented in Chapter 1 and [17].
It has to be mentioned that, according to Theorem B1, the tracking accuracy of the output trajectory of the classical (update law (B2)) ILC process is
limsup_{l→∞} sup_{t∈[0,T]} e^{−λt} ||Δy_l(t)|| ≤ b_y
(see Eqs. (B5) and (B6)) and, obviously, the radius ε of the hypertube must be greater than this accuracy.
Thus, we are in a position to prove the robustness and convergence of the learning algorithm proposed in Section 3.4 for robotic manipulators; i.e., it has to be shown that Theorem B1 is applicable to the dynamic equation of robot motion (3.1):
A(q)q̈ + b(q, q̇) + Dq̇ + g(q) + f_c = u_c + u_l,   (B9)
where, from Eq. (3.6) in Section 3.3, we have
u_c = Â(q)[q̈_d + K_v(q̇_d − q̇_l) + K_p(q_d − q_l)] + b̂ + D̂q̇_l + ĝ + f̂,   (B10)
and Eq. (3.15) in Section 3.4 can be rewritten as
u_{l+1}(t, T_l^s) = u_l(t, T_l^s) + u*_l(t, T_l^s), u_1(t, T_0^s) = 0,
u*_l(t, T_l^s) = Â(q_l)[(q̈_d − q̈_l) + L_v(q̇_d − q̇_l) + L_p(q_d − q_l)], t ∈ [0, T_l^s], T_l^s ∈ (0, T];
u*_l(t, T_l^s) = 0, t ∈ (T_l^s, T].   (B11)
Using equations (B9) and (B10), and taking into account that A(q) is an n × n symmetric positive-definite matrix, the equation (B9) can be written as:
q̈_l = −A^{−1}(q_l)[b(q_l, q̇_l) + Dq̇_l + g(q_l)] + A^{−1}(q_l) u_c + A^{−1}(q_l) u_l − A^{−1}(q_l) f_c.   (B12)
Assuming x_l = (q_l, q̇_l)^T and f = f* + f**, where
f* = −A^{−1}(q_l)[b(q_l, q̇_l) + Dq̇_l + g(q_l)]
and
f** = A^{−1}(q_l){Â(q_l)[q̈_d + K_v(q̇_d − q̇_l) + K_p(q_d − q_l)] + b̂ + D̂q̇_l + ĝ},
with B = A^{−1}(q_l) and y_l = x_l, and treating the Coulomb friction as a disturbance ω_l = A^{−1}(q_l)(f̂ − f_c), from equation (B12) we obtain equation (B1). In [3] it is proven that q_l is bounded, implying that f* and B satisfy assumptions (1)-(5), and consequently f** also satisfies assumptions (1)-(5).
Assuming L(y_l(t), t) = (0, Â(q_l)) and K(y_l(t), t) = (Â(q_l) L_p, Â(q_l) L_v), and taking into account that ε = 0, from equation (B8) we obtain equation (B11). It has to be mentioned that in [3], as an example for a robotic manipulator, the sufficient condition for robustness and convergence (3.4), validated in Section 3.3, is obtained from equation (B3).
Thus, the application of Theorem B1, Corollary B1 and Corollary B2 to the system (B12), with the feedback controller described by Eq. (B10) and the learning update law (B11), results in the robustness and convergence of the proposed bounded-error NILC algorithm for robotic manipulators.