Sofia University “St. Kl. Ohridski”
Faculty of Mathematics and Informatics
Department of Mechatronics, Robotics and Mechanics
Iterative learning control for nonlinear
time-variant systems describing the
dynamics of robot manipulators
Kamen Krastev Delchev
Assoc. Prof., Ph.D.
Habilitation thesis for promotion to Full Professor in
4.5. Mathematics
(applied mechanics, robotics)
Sofia, Bulgaria
(March), 2018
© 2017, 2018 Kamen Delchev, [email protected]
With a feeling of deep respect, I would like to express my great gratitude to my teachers: Prof. Konstantin Markov, supervisor of my master's thesis at Sofia University "St. Kliment Ohridski"; Prof. L. Lilov, my lecturer at Sofia University "St. Kliment Ohridski"; Acad. D. E. Okhotsimsky, supervisor of my doctoral work at the Keldysh Institute of Applied Mathematics of the Russian Academy of Sciences; and to my colleagues Prof. E. Manoach, Prof. E. Zahariev and Assoc. Prof. P. Kiriazov of the Institute of Mechanics, Bulgarian Academy of Sciences; Prof. G. Boyadzhiev and Prof. E. Krastev of Sofia University "St. Kliment Ohridski"; Prof. A. K. Platonov and leading researcher V. S. Yaroshevsky of the Keldysh Institute of Applied Mathematics, Russian Academy of Sciences; as well as to all the colleagues and friends who have contributed to my development in science and to the realization of my research.
I would like to thank my family, and especially my wife, for the support, patience, understanding and help that I have received and continue to receive in my scientific work.
P R E F A C E
This habilitation thesis consists of a "General Introduction", three chapters and a "List of References", and reflects my research and the results obtained over the last ten years in the field of "Iterative Learning Control" (ILC) for nonlinear and time-variant systems modelling the dynamics of robot manipulators [1-20]. The research is motivated by the fact that ILC is the best approach to the optimization of feedforward control [1, 2, 12] through minimization of the error of the output trajectory of the system, an error which arises from inaccuracies in the structure of the model, in the values of its parameters, and from deterministic disturbances. Since the optimization is achieved by the real plant executing a certain number of iterations of the assigned motion, the number of iterations and its minimization (fast convergence) is of essential interest for the practical application of ILC, and this motivated the initial studies. The most essential motivation for developing a new ILC method came from the need to solve the problem of "transient growth" of the intermediate error in learning control. In practice, this problem renders the application of ILC pointless, and no effective method for solving it for nonlinear systems was known [6, 17]. The "bounded-error ILC algorithm" proposed in Chapter 1 [17] solves the problem described above in a natural, easy-to-implement and safe way, and constitutes the main contribution of the present habilitation thesis. Chapter 2 presents an approach to the stabilization of nonlinear and time-variant systems with a discretized feedback signal, based on the use of ILC to compensate, through feedforward control, the inaccuracies due to the discretization [19]. Chapter 3 considers an application of the new ILC method proposed in Chapter 1 to the trajectory control of a virtual PUMA 560 robot, with the aim of verifying the "bounded-error ILC algorithm" and comparing its characteristics with those of the standard ILC procedure [20].
The results presented in this habilitation thesis are published in the papers [17, 19, 20, 15, 18].
The results summarized in Chapter 1 were reported at the international conference EUROMECH Colloquium 515, Advanced Applications and Perspectives of Multibody System Dynamics, Blagoevgrad, Bulgaria, July 13-16, 2010.
Contents
General Introduction . . . . . . . . . . . . . . . . . . . . . . . 8
1 Iterative learning control for nonlinear systems: bounded-error algorithm 15
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2 Iterative learning algorithm with bounded tracking error . 20
1.3 Simulation results . . . . . . . . . . . . . . . . . . . . 26
1.4 Discussion and conclusions . . . . . . . . . . . . . . . . 32
2 Iterative Learning Control with Sampled-Data Feedback 34
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 34
2.2 Formulation of the problem . . . . . . . . . . . . . . . . 36
2.3 Convergence and robustness of the learning control with a
sampled-data feedback . . . . . . . . . . . . . . . . . . 39
2.4 Simulation results . . . . . . . . . . . . . . . . . . . . 48
2.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . 55
3 Iterative learning control with bounded transient errors: an application for PUMA 560 robotic manipulator 57
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 57
3.2 Dynamic model of PUMA560 robot . . . . . . . . . . . 59
3.3 Iterative learning control for robotic manipulators. . . . . 61
3.4 Iterative learning algorithm with a bounded tracking error 66
3.5 Simulation results and discussions . . . . . . . . . . . . 70
3.5.1 Simulation results in cases of an unacceptable initial
tracking error . . . . . . . . . . . . . . . . . . . . . . 72
3.5.2 Simulation results in cases of an unacceptable transient
error . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.5.3 Simulation results in cases of both initial and transient
error . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 82
4 References 84
Appendix A 88
Appendix B 90
General Introduction
Survey of the state-of-the-art in the field and
motivation
Iterative learning control (ILC) improves the tracking accuracy of repetitive processes. ILC is based on the idea that information from the previous trial is used to update the control input in order to obtain a better performance of the assigned task on the next trial [1-17]. Some other definitions of ILC are quoted by Ahn et al. in [2]. A common feature among them is the "repetition" of the assigned-task performance and the usage of the information from previous trials (passes, iterations, cycles or repetitions).
Therefore, ILC is applicable to the optimization of repetitive processes, especially of industrial robot motions (motions of rigid-body robots in an industrial environment) [14-16, 19, 20].
In classical ILC the following postulates are required [2, 5], although recently some ILC algorithms have been reported in which these postulates are broken or relaxed (adopted from [2]):

• Every trial ends in a fixed time duration.

In this habilitation thesis (Chapter 1) we present a new bounded-error learning algorithm which does not satisfy this postulate.

• Repetition of the initial setting (initial state coordinates) is satisfied.

This postulate has also been weakened by Heinzinger et al. in [3, 4], where the influence of errors in the initial conditions on the ILC convergence is proven.

• Invariance of the system dynamics is ensured throughout the repetition.

• The system output is measured in a deterministic way.

According to the first postulate, the control input trajectory should be updated off line, and this is an advantageous way of updating.
The classical offline ILC scheme for a plant with a feedback controller attached is depicted in Fig. 1,
where P represents the plant; C is a feedback controller; L is a learning controller; the input trajectory $u_l$ is the feedforward term of the control law; $y_l$ is the actual output trajectory; $y_d$ is the desired output trajectory. The feedforward term $u_{l+1}$ improves trajectory tracking, and $l$ is the iteration number.
With respect to the learning controller L (the learning update law), ILC can be categorized as P-type (update proportional to the state-space error), D-type (update proportional to the error derivative), I-type (update proportional to the integral of the state-space error), PI-type, PD-type, and PID-type [3, 4, 11, 12].
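By way of illustration, these update types can be sketched as follows. This is a minimal Python sketch with hypothetical gain matrices; the error signal `e` is assumed to be the sampled tracking error $y_d - y_l$ stored as a (channels × samples) array, which is not a convention fixed by the cited works:

```python
import numpy as np

def p_type_update(u, e, Lp):
    # P-type: u_{l+1}(t) = u_l(t) + Lp e_l(t)
    return u + Lp @ e

def d_type_update(u, e, dt, Ld):
    # D-type: u_{l+1}(t) = u_l(t) + Ld d/dt e_l(t)
    return u + Ld @ np.gradient(e, dt, axis=1)

def pd_type_update(u, e, dt, Lp, Ld):
    # PD-type: combination of the proportional and derivative terms
    return u + Lp @ e + Ld @ np.gradient(e, dt, axis=1)
```

I-type and PID-type variants would add a cumulative (integrated) error term in the same fashion.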
In a similar way, a closed-loop ILC can be classified according to the type of the attached feedback controller (P, PD, PID, ...).
Thus, the resulting controller-based classification of a closed-loop ILC (based on the learning and feedback controllers) can be P-P, P-PD, PD-D, or any other combination of the types of the learning and feedback controllers, respectively [11].
On the other hand, with regard to the model of the plant, ILC can be classified as linear (LILC, an ILC for linear systems) [2, 7, 9] or nonlinear (NILC) [1-4, 11-13] iterative learning control.
Linear iterative learning control (LILC) is an ILC for linear systems or one based on a linear model of a nonlinear system [1, 2]. If the linear approximation of a nonlinear dynamics results in large uncertainties, the corresponding LILC may fail to ensure the admissible tracking accuracy. In this case, one should resort to nonlinear models and nonlinear iterative learning control (NILC) [1, 12].
Fig. 1. Learning control application for a plant with a feedback controller attached [3].
Moreover, the ILC update law can be classified as linear or nonlinear with respect to the learning operator [12]. The ILC update laws proposed in [1] for linear or nonlinear dynamic systems utilize learning operators which are constant gain matrices or do not involve the robot dynamic model. Another approach to learning-operator synthesis is based on dynamic models with estimated parameters [14].
Classical linear feedback control (PD or PID) has many applications in practice, but linear PD/PID control of nonlinear and uncertain systems is not adequate for precise tracking [11]. Thus, the combination of nonlinear feedback control and nonlinear iterative learning control is an effective approach to achieving good tracking performance [11].
The main advantages of the offline closed-loop ILC are as follows:

• The ILC minimizes trajectory-tracking errors that arise from unmodelled dynamics (uncertain systems). It is well known in the literature [1, 2, 12] that the best approach for improving the feedforward control of uncertain systems is iterative learning control.

• The ILC procedure can avoid closed-loop stability problems, because at the end of the ILC procedure the feedback terms of the control input are reduced, so the feedforward controller dominates the feedback controller [16].

• The calculation of the ILC control update is performed offline, so the update law can be complicated and all available tools for data processing (filtering, interpolation, etc.) can be utilized.
However, the applicability of ILC suffers from the following basic problems:

• An admissible number of iterations should produce the desired tracking accuracy. Therefore, the ILC procedure must be convergent with a high rate of convergence [1, 9, 11, 12, 14].

• Even when the initial tracking error is small enough, the ILC can fail due to non-monotonic convergence that results in a big transient error [2, 6, 11, 14, 15], as shown in Fig. 2. The learning transient (transient growth) is the major problem for the applicability of an ILC procedure [15].
Computer simulations presented in [6, 15, 17] reveal the non-monotonicity of the maximal tracking errors versus iterations, which leads to inapplicability of the corresponding ILC. Numerous publications are therefore available on conditions for monotonic convergence of the ILC, and most of them consider linear systems [2, 7, 9]. A monotonically convergent ILC for nonlinear systems (Hamiltonian systems) is proposed in [13], but an assumption of large enough feedback gains must hold. Similar ideas for a monotonically convergent switching-gain PD-PD type ILC, for a specific class of nonlinear time-varying systems, based on increasing feedback gains, are addressed in [11]. Following these works and the ideas for accelerating the ILC convergence reported in [2], a simulation-based design of a monotonically convergent PD-type ILC for nonlinear time-varying systems is proposed in [18]. It has to be mentioned that [18] presents an engineering approach to the practical synthesis of a monotonically convergent ILC law, while the general solution (the bounded-error algorithm) of the problem of transient errors (see Fig. 2) is presented in [17].

Fig. 2. Maximal tracking error of the TT3000 robot.
An NILC algorithm aimed at solving the problems of a big initial tracking error and/or the transient growth is proposed in [15]. The algorithm is based on dividing the time interval of the trajectory performance into N subintervals: the NILC is executed for the first subinterval, then for the first and second subintervals, and so forth. Considerable difficulties in the implementation of this algorithm arise from the necessity of determining N through computer simulations so that an acceptable maximal ILC tracking error is obtained.
The main purpose of this habilitation thesis is to present some new methods, ILC designs, results and applications of the ILC theory, motivated by the above-mentioned problems of ILC applicability (in particular, the transient growth in ILC) and by the stabilization problems of nonlinear time-variant systems with sampled-data feedback that arise from the discretization of the feedback signal.
The work is organized as follows:
In Chapter 1 [17] we propose a new approach (the bounded-error algorithm) motivated by the major problem for ILC implementation, the transient growth. It is not aimed at the monotonic convergence of the ILC procedure; instead, it limits the tracking (transient) error (Fig. 2) in a simple, efficient and safe way with respect to the application of the learning procedure. This algorithm ensures that the trajectory-tracking errors of the proposed NILC, when implemented, are bounded by a given error norm bound. A special feature of the algorithm is that the trial-time interval is finite but not fixed, as it is for other iterative learning algorithms. A sufficient condition for convergence and robustness of the bounded-error learning procedure is derived.
Chapter 2 [19] deals with the improvement of the stability of sampled-data (SD) feedback control for nonlinear multiple-input multiple-output time-varying systems, such as robotic manipulators, by incorporating an off-line model-based nonlinear iterative learning controller. The proposed scheme of nonlinear iterative learning control (NILC) with SD feedback is applicable to a large class of robots, because sampled-data feedback is required for model-based feedback controllers, especially for robotic manipulators with complicated dynamics (6 or 7 DOF, or more), while the feedforward control from the off-line iterative learning controller can be assumed to be continuous. The robustness and convergence of the proposed NILC law with SD feedback is proven, and the derived sufficient condition for convergence is the same as the condition for an NILC with continuous feedback control.
Chapter 3 [20] is concerned with a model-based nonlinear iterative learning control (NILC) for nonlinear multiple-input multiple-output mechanical systems of robotic manipulators. A new strategy for the NILC implementation, based on the new bounded-error algorithm, is proposed. This algorithm ensures that the trajectory-tracking errors of the proposed NILC, when implemented, are bounded by a given error norm bound. Both the standard and the bounded-error learning control laws with attached feedback controllers are considered. The NILC synthesis is based on a dynamic model of a 6-DOF robotic manipulator. The dynamic model includes viscous and Coulomb friction, and the input generalized torques are bounded. With respect to the bounded-error and standard learning processes applied to a virtual PUMA 560 robot, simulation results are presented in order to verify the maximal tracking errors, convergence and applicability of the proposed learning control.
Chapter 1
Iterative learning control for nonlinear systems:
bounded-error algorithm
1.1 Introduction
The problem of the increased transient error (transient growth) of an ILC system
is due to the non-monotonic convergence of the tracking error (Fig. 1.2). The
problem formulation is based on the convergence theorem presented by
Heinzinger et al. in [3, 4] concerning a learning scheme (Fig. 1.1):
Fig. 1.1. Learning control scheme with a feedback controller attached.

where P represents the plant; C is a feedback controller; M is memory; the input trajectory $u_l$ is the feedforward term of the control law; $y_l$ is the actual output trajectory; $y_d$ is the desired output trajectory. The feedforward term $u_{l+1}$ improves the trajectory-tracking performance according to the following standard update law ($l = 0, 1, 2, \dots$):

$$u_{l+1}(t) = u_l(t) + L(y_l(t), t)\,\bigl(\dot y_d(t) - \dot y_l(t)\bigr), \qquad (1.1)$$

where $t \in [0, T]$ is the tracking time and $[0, T]$ is the time interval of the trial duration; $L(y_l(t), t)$ is a nonlinear time-varying learning operator and $u_0$ is the initial control input.
The learning scheme in Fig. 1.1 is convenient for describing the following class of multi-input multi-output nonlinear time-varying state-space equations:

$$\dot x_l(t) = f(x_l(t), t) + B(x_l(t), t)\,u_l(t) + \omega_l(t), \qquad y_l(t) = g(x_l(t), t), \qquad (1.2)$$

where, for $l \in \{0, 1, \dots\}$ and all $t \in [0, T]$, $x_l(t) \in \mathbb{R}^n$, $y_l(t) \in \mathbb{R}^m$, $u_l(t) \in \mathbb{R}^r$ are not necessarily continuous, and $\omega_l(t) \in \mathbb{R}^n$ represents both deterministic and random disturbances. The functions $f: \mathbb{R}^n \times [0, T] \to \mathbb{R}^n$ and $B: \mathbb{R}^n \times [0, T] \to \mathbb{R}^{n \times r}$ are piecewise continuous in $t \in [0, T]$, and $g: \mathbb{R}^n \times [0, T] \to \mathbb{R}^m$ is differentiable in $x$ and $t$, with partial derivatives $g_x(\cdot,\cdot)$ and $g_t(\cdot,\cdot)$. In addition, the following assumptions hold (adopted from [3, 4]):

I. For each fixed initial state $x(0)$ with $\omega(\cdot) \equiv 0$, the output map $O: C([0,T], \mathbb{R}^r) \to C([0,T], \mathbb{R}^m)$ and the state map $S: C([0,T], \mathbb{R}^r) \to C([0,T], \mathbb{R}^n)$ are one-to-one. In this notation, $y_l(\cdot) = O(u_l(\cdot), x_l(0))$ and $x_l(\cdot) = S(u_l(\cdot), x_l(0))$.

II. The disturbance $\omega_l(\cdot)$ is bounded on $[0, T]$, i.e. $\|\omega(t)\| \le b_\omega$, where $\|\cdot\|$ is the Euclidean norm.

III. The functions $f(\cdot,\cdot)$, $B(\cdot,\cdot)$, $g_x(\cdot,\cdot)$, and $g_t(\cdot,\cdot)$ are uniformly globally Lipschitz in $x$ on the interval $[0, T]$.

IV. The operators $B(\cdot,\cdot)$ and $g_x(\cdot,\cdot)$ are bounded on $[0, T] \times \mathbb{R}^n$.

V. All functions are assumed measurable and integrable.
Lemma 1.1 [3]. If $\{a_l\}$, $l \in \{0, 1, \dots\}$, is a sequence of real numbers such that $a_{l+1} \le \rho\, a_l + \varepsilon$, $0 \le \rho < 1$, then $\limsup_{l \to \infty} a_l \le \varepsilon\,(1 - \rho)^{-1}$.

The proof of Lemma 1.1 is presented in [3].
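Lemma 1.1 can also be checked numerically. The sketch below iterates the extreme case $a_{l+1} = \rho\, a_l + \varepsilon$ of the inequality and compares the limit with $\varepsilon\,(1-\rho)^{-1}$; the values of $\rho$, $\varepsilon$ and the start value are arbitrary test choices:

```python
rho, eps = 0.8, 0.1     # contraction factor and offset (arbitrary test values)
a = 5.0                 # arbitrary initial term a_0
for _ in range(200):
    a = rho * a + eps   # a_{l+1} = rho * a_l + eps (equality: the worst case)
limit = eps / (1.0 - rho)
# the sequence settles at eps / (1 - rho), as Lemma 1.1 predicts
```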
Let us consider the following learning update law

$$u_{l+1}(t) = (1 - \lambda)\,u_l(t) + \lambda\,u_0(t) + L(y_l(t), t)\,\bigl(\dot y_d(t) - \dot y_l(t)\bigr), \qquad (1.3)$$

where $L: \mathbb{R}^m \times [0, T] \to \mathbb{R}^{r \times m}$ is a bounded learning operator and $\lambda \in [0, 1)$ allows the influence of a bias term [3, 4].
Theorem 1.1 [3, 4]. Let the system described by Eq. (1.2) satisfy assumptions I-V and use the update law (1.3). Given an attainable desired trajectory $y_d(t) = g(x_d(t), t)$, $t \in [0, T]$, $u_d(t)$ is the corresponding input and $x_d(t)$ is the corresponding state (according to assumption I). If

$$\bigl\| (1 - \lambda)\,I - L(g(x, t), t)\,g_x(x, t)\,B(x, t) \bigr\| \le \bar\rho < 1, \qquad \forall\,(x, t) \in \mathbb{R}^n \times [0, T], \qquad (1.4)$$

and the initial state error $\|x_d(0) - x_l(0)\|$ is bounded by $b_{x0}$, then, as $l \to \infty$, the error between $u_l(t)$ and $u_d(t)$ is bounded. In addition, the state and output asymptotic errors are bounded. These bounds depend continuously on the bound on the initial state error, the bound on the state disturbance, and $\lambda$; as $b_{x0}$, $b_\omega$, and $\lambda$ tend to zero, these bounds also tend to zero.
The proof of Theorem 1.1 is presented by Heinzinger et al. in [3, 4], where, in particular, it is proven that

$$\|\delta u_{l+1}(t)\|_\lambda \le \bar\rho\,\|\delta u_l(t)\|_\lambda + \varepsilon, \qquad 0 \le \bar\rho < 1, \qquad (1.5)$$

and

$$\|\delta x_l(t)\|_\lambda \le k\,\bigl(\|\delta x_l(0)\| + \|\delta u_l(t)\|_\lambda\bigr), \qquad \|\delta y_l(t)\|_\lambda \le k_g\,\|\delta x_l(t)\|_\lambda, \qquad (1.6)$$

where $\delta u_l(t) = u_d(t) - u_l(t)$, $\delta x_l(t) = x_d(t) - x_l(t)$, $\delta y_l(t) = y_d(t) - y_l(t)$, $k = k(\lambda, T) > 0$, $\bar\rho, \varepsilon \ge 0$, $\varepsilon$ combines $b_{x0}$, $b_\omega$ and $\lambda\,\|\delta u_0\|_\lambda$, and $k_g \in (0, \infty)$ is the Lipschitz constant for $g(x_l(t), t)$. The time-weighted norm ($\lambda$-norm) is defined by

$$\|u_l(t)\|_\lambda = \sup_{t \in [0, T]} e^{-\lambda t}\,\|u_l(t)\|. \qquad (1.7)$$

Applying Lemma 1.1 to inequality (1.5) and using Eq. (1.6) yields

$$\limsup_{l \to \infty} \|\delta u_l(t)\|_\lambda \le \varepsilon\,(1 - \bar\rho)^{-1}, \qquad \limsup_{l \to \infty} \|\delta y_l(t)\|_\lambda \le b_y, \quad b_y = k_g\,k\,\bigl(b_{x0} + \varepsilon\,(1 - \bar\rho)^{-1}\bigr). \qquad (1.8)$$

If $b_{x0} \to 0$, $b_\omega \to 0$, and $\lambda \to 0$, then $\varepsilon \to 0$ and $b_y \to 0$.

Because in all practical cases the input signal belongs to a compact subset of $\mathbb{R}^r$, further we will consider that $\|u_l(t)\|_s := \sup_{t \in [0, T]} \|u_l(t)\| \le b_u$. If the ILC update law (see Eq. (1.3)) produces a feedforward input that is outside the allowable compact set of inputs, and this set is convex, then projecting back into the allowable set preserves the validity of Theorem 1.1 [3, 4].
Thus, the robustness and monotonic convergence of the control-input error with respect to the corresponding time-weighted norm is proven, and the monotonic convergence of the state-error time-weighted norm $\|\delta x_l(t)\|_\lambda$ and of the output-error time-weighted norm $\|\delta y_l(t)\|_\lambda$ is proven as well (see Eq. (1.6)) [3, 4]. Moreover, the convergence of the PD-type and PID-type versions of the standard learning update law (Eq. (1.3)), with nonlinear proportional (P) and integral (I) learning gains, is proven in [3]. Unfortunately, the monotonic convergence of the time-weighted norm $\|\delta u_l(t)\|_\lambda$ is not a sufficient condition for monotonic convergence of the corresponding supremum norm $\|\delta u_l(t)\|_s = \sup_{t \in [0, T]} \|\delta u_l(t)\|$, because

$$\|\delta u_l(t)\|_\lambda \le \|\delta u_l(t)\|_s \le e^{\lambda T}\,\|\delta u_l(t)\|_\lambda, \qquad t \in [0, T]. \qquad (1.9)$$

Obviously, if $\|\delta u_l(t)\|_\lambda$, with $\lambda > 0$ or (and) $\lambda T \gg 0$, is convergent to zero with respect to $l$, then $\|\delta u_l(t)\|_s$ and $\|\delta y_l(t)\|_s$ are also zero-convergent (see Eq. (1.9)), but a large transient error $\|\delta y_l(t)\|_s$ is possible [2, 6, 12], as shown in Fig. 1.2.
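The gap between the $\lambda$-norm of Eq. (1.7) and the supremum norm in Eq. (1.9) can be illustrated numerically. The error profile below, with a large transient late in the trial, is a hypothetical example chosen only for illustration:

```python
import numpy as np

T, lam = 1.0, 8.0
t = np.linspace(0.0, T, 1001)
# hypothetical error profile: small at first, with a large transient near t = 0.9 T
err = 0.01 + 10.0 * np.exp(-100.0 * (t - 0.9 * T) ** 2)

sup_norm = np.max(np.abs(err))                     # supremum norm ||.||_s
lam_norm = np.max(np.exp(-lam * t) * np.abs(err))  # lambda-norm, Eq. (1.7)

# Eq. (1.9): ||.||_lambda <= ||.||_s <= e^{lam T} ||.||_lambda
assert lam_norm <= sup_norm <= np.exp(lam * T) * lam_norm
# the lambda-norm stays tiny while the supremum norm sees the full transient
```

Both norms satisfy Eq. (1.9), yet the $\lambda$-norm remains small while the supremum norm is dominated by the transient, which is exactly why $\lambda$-norm convergence alone does not prevent transient growth.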
Fig. 1.2. Maximal tracking error of the PUMA 560 robot.

Referring to Fig. 1.2, one can see that the corresponding learning control is not applicable, and an optimization of the ILC is necessary to reduce the transient error. However, the only approach reported in the literature [2, 12] for transient-error reduction is the achievement of a monotonic convergence of the tracking error by imposing conditions on the learning operator (learning gains) [2, 11, 13].

In the next section we present a new approach to the problem of the increased transient error of an ILC system. It is based on the natural idea of stopping the tracking of the desired trajectory if the output error in the time domain is so large that further tracking is impossible. For example, if a robot tracks a desired trajectory and the corresponding tracking error increases dramatically, then the robot should stop moving to prevent damage to the robot or its surroundings.
This requirement raises two basic questions:

• How can the feedforward control (Eq. (1.1)) be updated using only a part of the data $y_l(t)$, $t \in [0, T_l^s]$, where $T_l^s < T$, in the case of a premature termination of the trial performance due to the increased tracking error?

• If we have answered the first question and continue the learning procedure, is this procedure asymptotically convergent?

Chapter 1 is aimed at answering both questions.

This chapter is organized as follows. Section 1.2 presents the main result of this habilitation thesis: a new algorithm for NILC. The new algorithm is validated and examined by computer simulation in Section 1.3.
1.2 Iterative learning algorithm with bounded tracking error

In this section, we present the main result of this habilitation thesis: a new algorithm for nonlinear iterative learning control (for the system in Eq. (1.2)) with a bounded supremum norm of the tracking error, together with the proof of its robustness and convergence.

The main idea of the new algorithm is that the output trajectory at each iteration must lie inside a hypertube of width $2\varepsilon$ around the desired trajectory, i.e.:

$$y_l(t) \in Y(\varepsilon) = \bigl\{\, y(t) : \|y_d(t) - y(t)\| \le \varepsilon,\ t \in [0, T^s],\ T^s \in (0, T] \,\bigr\}, \qquad l = 0, 1, \dots, \qquad (1.10)$$

where $y_l(t)$, $t \in [0, T_l^s]$, $T_l^s \in (0, T]$, is the output trajectory in Eq. (1.2) and $\varepsilon > e^{\lambda T} b_y$ is a preliminarily given error norm bound. It has to be mentioned that, according to Theorem 1.1, the tracking accuracy of the output trajectory of the standard ILC process (update law (1.3)) is $\|\delta y_l(t)\|_s \le e^{\lambda T} b_y$, $l \to \infty$ (see Eqs. (1.8) and (1.9)); obviously, the radius $\varepsilon$ of the hypertube must be greater than this accuracy.

Given an attainable desired output trajectory and an error bound $\varepsilon$ ($> e^{\lambda T} b_y$), the new algorithm for the implementation of the NILC procedure can be formulated as follows:
i. Set the initial iteration number $l = 0$ and begin the iterative procedure.

ii. Starting from the initial position $y_l(0)$, the system tracks the desired trajectory under the control $u_l(t)$ until $\|y_d(T_l^s) - y_l(T_l^s)\| = \varepsilon$ or the end position $y_l(T)$ is reached. When $t = T_l^s$, $T_l^s \in (0, T]$, the tracking process stops.

iii. After the current tracking performance has finished, the learning controller updates the feedforward control term according to the following learning update law:

$$u_{l+1}(t, T_l^s) = (1 - \lambda)\,u_l(t, T_{l-1}^s) + \lambda\,u_0(t) + \Delta u_l^*(t, T_l^s), \qquad u_0(t, T_0^s) = u_0(t),$$

$$\Delta u_l^*(t, T_l^s) = \begin{cases} L(y_l(t), t)\,\bigl(\dot y_d(t) - \dot y_l(t)\bigr), & t \in [0, T_l^s],\ T_l^s \in (0, T]; \\ 0, & t \in (T_l^s, T]. \end{cases} \qquad (1.11)$$

iv. If the output error $\|\delta y_l(t)\|_s$ is less than or equal to an acceptable tracking accuracy, then exit from the learning procedure; else set $l = l + 1$ and go to step (ii).
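Steps (i)-(iv) can be sketched on a deliberately simple toy example. The memoryless plant, the gains, and the bound $\varepsilon$ below are hypothetical choices that only illustrate the control flow of the algorithm (early stopping at $T_l^s$, update on $[0, T_l^s]$ only), not the thesis model:

```python
import numpy as np

def run_trial(u, eps, y_d):
    # Step (ii): track y_d sample by sample and stop once the error reaches eps.
    # Toy memoryless plant with uncertain gain and offset: y_k = 0.8 u_k + 0.05.
    y = []
    for k in range(len(u)):
        y.append(0.8 * u[k] + 0.05)
        if abs(y_d[k] - y[-1]) >= eps:
            return np.array(y), k + 1      # premature stop: T_l^s < T
    return np.array(y), len(u)             # full trial: T_l^s = T

N, eps, L = 100, 0.3, 1.0
y_d = np.linspace(0.0, 1.0, N)             # desired output trajectory
u = np.zeros(N)                            # step (i): l = 0, u_0 = 0

for l in range(200):
    y, ks = run_trial(u, eps, y_d)
    e = y_d[:ks] - y                       # error is known only on [0, T_l^s]
    u[:ks] += L * e                        # step (iii): update on [0, T_l^s], zero beyond
    if ks == N and np.max(np.abs(e)) < 1e-3:
        break                              # step (iv): acceptable accuracy reached
```

Each premature stop extends the learned horizon; once the whole trajectory stays inside the hypertube, the iteration reduces to the standard update, mirroring the two phases of the learning process.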
Thus, we are in a position to prove the robustness and convergence of the new bounded-error learning algorithm.

Corollary 1.1. If the update law (1.3) is replaced with (1.11), then Theorem 1.1 still holds.

Proof. The main idea of the proof is to show that a sequence

$$\bigl\{\, \|\delta u_{l+1}(t, T_l^s)\|_\lambda \,\bigr\}, \qquad l \in \{0, 1, \dots\},\ t \in [0, T], \qquad (1.12)$$

exists and

$$\|\delta u_{l+1}(t, T_l^s)\|_\lambda \le \rho^*\,\|\delta u_l(t, T_{l-1}^s)\|_\lambda + \varepsilon^*, \qquad 0 \le \rho^* < 1, \qquad (1.13)$$

and therefore that Lemma 1.1 can be applied.

At first, we have to prove that the sequence defined by (1.12) exists and that inequality (1.13) holds when the update law is given by Eq. (1.11), $T_l^s \in (0, T)$, and the learning process runs in a hypertube of width $2\varepsilon$ around the desired trajectory. Then we shall prove that there exists $l^*$ such that, for all $l \ge l^*$, $T_l^s = T$ and $\|\delta y_l(T_l^s)\| < \varepsilon$.

Thus, the iterative learning process consists of two phases. During the first phase $l \in \{0, 1, \dots, l^*\}$, $t \in [0, T_l^s]$, and $T_l^s \in (0, T)$, and during the second one $l \ge l^*$, $t \in [0, T]$, and $T_l^s = T$. Moreover, the update laws (1.3) and (1.11) are equivalent when $T_l^s = T$, so we can directly apply Theorem 1.1 during the second phase, when $l \to \infty$, $t \in [0, T]$, and the entire output trajectory (from the beginning at $t = 0$ to the end at time $t = T$) lies inside the hypertube.
For simplification, we introduce the following notations:

$x_l = x_l(t)$, $x_d = x_d(t)$, $y_l = y_l(t)$, $y_d = y_d(t)$, $u_l = u_l(t)$, $u_d = u_d(t)$, $\omega_l = \omega_l(t)$, $f_l = f(x_l(t), t)$, $f_d = f(x_d(t), t)$, $B_l = B(x_l(t), t)$, $B_d = B(x_d(t), t)$,

$g_{xl} = \partial g(x, t)/\partial x\,\big|_{x = x_l(t)}$, $\quad g_{xd} = \partial g(x, t)/\partial x\,\big|_{x = x_d(t)}$, $\quad g_{tl} = \partial g(x, t)/\partial t\,\big|_{x = x_l(t)}$, $\quad g_{td} = \partial g(x, t)/\partial t\,\big|_{x = x_d(t)}$,

and $k_{gx}$, $k_{gt}$, $k_f$, $k_B$ are the Lipschitz constants for $g_x(\cdot,\cdot)$, $g_t(\cdot,\cdot)$, $f(\cdot,\cdot)$, and $B(\cdot,\cdot)$, respectively, and $b_{gx}$, $b_{ud}$, and $b_B$ are the corresponding norm bounds on $g_x(\cdot,\cdot)$, $u_d(\cdot)$, and $B(\cdot,\cdot)$, and $b_d = \sup_{t \in [0, T]} \|f_d + B_d u_d\|$.
In the case of $T_l^s \in (0, T)$ we consider the time interval $[0, T_l^s]$ and $\delta y_l(t) = y_d(t) - y_l(t)$. Using the above notations, from Eq. (1.2) we obtain:

$$\delta \dot y_l = g_{xd}\,(f_d + B_d u_d) + g_{td} - g_{xl}\,(f_l + B_l u_l + \omega_l) - g_{tl}. \qquad (1.14)$$

Combining Eq. (1.14) with

$$g_{xl}\,(f_d + B_d u_d) - g_{xl}\,(f_l + B_l u_l) = g_{xl}\,(f_d - f_l) + g_{xl}\,(B_d - B_l)\,u_d + g_{xl}\,B_l\,(u_d - u_l),$$

taking norms, and using the bounds and the Lipschitz conditions (see assumptions II-IV) yields:

$$\|\delta \dot y_l\| \le \bigl(k_{gx} b_d + k_{gt} + b_{gx}\,(k_f + k_B b_{ud})\bigr)\,\|\delta x_l\| + b_{gx} b_B\,\|\delta u_l\| + b_{gx} b_\omega. \qquad (1.15)$$
Noticing that $\|\delta u_l\| \le b_u + b_{ud}$, from inequality (1.15) and inequality (1.6), using Eq. (1.9), we obtain that $\|\delta \dot y_l\|$ is bounded, and consequently $\delta y_l(t)$ is Lipschitz on $t \in [0, T_l^s]$. Thus, $\delta y_l(t)$ is continuous on $[0, T_l^s]$ with $\|\delta y_l(0)\| \le k_{y0}$ and $\|\delta y_l(T_l^s)\| = \varepsilon > k_{y0}$, where $k_{y0} = e^{\lambda T} k_g k\, b_{x0}$. Therefore, there exists $t^* \in (0, T_l^s)$ such that

$$\|\delta y_l(t^*)\| = 0.5\,(\varepsilon + k_{y0}).$$

On the other hand, $\|\delta y_l(T_l^s)\| - \|\delta y_l(t^*)\| \le k_y\,(T_l^s - t^*)$, where $k_y \in (0, \infty)$ is the Lipschitz constant for $\delta y_l(t)$, and $\|\delta y_l(T_l^s)\| - \|\delta y_l(t^*)\| = 0.5\,(\varepsilon - k_{y0}) =: b_t$. Consequently,

$$T_l^s - t^* \ge \frac{b_t}{k_y} > 0. \qquad (1.16)$$

We can apply Theorem 1.1 for $t \in [0, t^*]$, so that from Eq. (1.6), using Eq. (1.9), we have:

$$\|\delta y_l(t^*)\| \le \sup_{t \in [0, t^*]} \|\delta y_l(t)\| \le k_{y0} + k_u \sup_{t \in [0, t^*]} \|\delta u_l(t)\|, \qquad k_u = e^{\lambda T} k_g k, \qquad (1.17)$$

and consequently

$$\sup_{t \in [0, t^*]} \|\delta u_l(t)\| \ge \frac{b_t}{k_u} > 0. \qquad (1.18)$$
In the case of $T_l^s \in (0, T)$, from (1.11) and (1.7), for $\|\delta u_{l+1}(t, T_l^s)\|_\lambda$ and $\|\delta u_l(t, T_{l-1}^s)\|_\lambda$ it follows that

$$\|\delta u_{l+1}(t, T_l^s)\|_\lambda = \sup_{t \in [0, T]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| = \max\Bigl\{ \sup_{t \in [0, T_l^s]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\|,\ \sup_{t \in [T_l^s, T]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| \Bigr\} \qquad (1.19)$$

and

$$\|\delta u_l(t, T_{l-1}^s)\|_\lambda = \sup_{t \in [0, T]} e^{-\lambda t}\,\|\delta u_l(t, T_{l-1}^s)\| = \max\Bigl\{ \sup_{t \in [0, T_l^s]} e^{-\lambda t}\,\|\delta u_l(t, T_{l-1}^s)\|,\ \sup_{t \in [T_l^s, T]} e^{-\lambda t}\,\|\delta u_l(t, T_{l-1}^s)\| \Bigr\}. \qquad (1.20)$$

We consider both possible cases for $\|\delta u_{l+1}(t, T_l^s)\|_\lambda$: when

$$\sup_{t \in [0, T]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| = \sup_{t \in [T_l^s, T]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| \qquad (1.21)$$

and when

$$\sup_{t \in [0, T]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| = \sup_{t \in [0, T_l^s]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\|. \qquad (1.22)$$
In the case when equality (1.21) holds, choose $\lambda^*$ such that

$$e^{-\lambda^*\,(b_t/k_y)}\,(b_u + b_{ud}) \le \frac{b_t}{2\,k_u}.$$

Then from inequalities (1.16) and (1.18) it follows that

$$e^{-\lambda^* T_l^s}\,(b_u + b_{ud}) \le e^{-\lambda^* t^*}\,\frac{b_t}{2\,k_u} \le \tfrac{1}{2}\,\sup_{t \in [0, t^*]} e^{-\lambda^* t}\,\|\delta u_l(t)\|,$$

and, using $\sup_{t \in [T_l^s, T]} \|\delta u_{l+1}(t, T_l^s)\| \le b_u + b_{ud}$, Eq. (1.9) for $t \in [0, t^*]$, and the inequality $\sup_{t \in [0, t^*]} e^{-\lambda^* t}\,\|\delta u_l(t, T_{l-1}^s)\| \le \sup_{t \in [0, T_l^s]} e^{-\lambda^* t}\,\|\delta u_l(t, T_{l-1}^s)\|$, we obtain

$$\sup_{t \in [T_l^s, T]} e^{-\lambda^* t}\,\|\delta u_{l+1}(t, T_l^s)\| \le \tfrac{1}{2}\,\sup_{t \in [0, T_l^s]} e^{-\lambda^* t}\,\|\delta u_l(t, T_{l-1}^s)\|. \qquad (1.23)$$

Noticing that $0 \le \tfrac{1}{2} < 1$, from Eq. (1.19) and Eq. (1.23) we have that

$$\sup_{t \in [0, T]} e^{-\lambda^* t}\,\|\delta u_{l+1}(t, T_l^s)\| \le \sup_{t \in [0, T_l^s]} e^{-\lambda^* t}\,\|\delta u_l(t, T_{l-1}^s)\|. \qquad (1.24)$$

Taking into account that $\varepsilon^* \ge 0$, inequality (1.23) combined with Eq. (1.24) and Eq. (1.21), by using Eq. (1.7), yields (1.12) and (1.13) for $\rho^* = \max(\bar\rho, \tfrac{1}{2})$.
In the case when equality (1.22) holds, we can apply Theorem 1.1 for $t \in [0, T_l^s]$, because the update laws (1.3) and (1.11) are equivalent on $[0, T_l^s]$; from inequality (1.5), using Eq. (1.7), we obtain

$$\sup_{t \in [0, T_l^s]} e^{-\lambda t}\,\|\delta u_{l+1}(t, T_l^s)\| \le \bar\rho\,\sup_{t \in [0, T_l^s]} e^{-\lambda t}\,\|\delta u_l(t, T_{l-1}^s)\| + \varepsilon. \qquad (1.25)$$

Combining equalities (1.22) and (1.24) with inequality (1.25) for $t \in [0, T]$, and with $\rho^* = \max(\bar\rho, \tfrac{1}{2})$, yields (1.12) and inequality (1.13).
Now, assuming that $l \to \infty$ while $T_l^s \in (0, T)$ and $t \in [0, T]$, we are in a position to apply Lemma 1.1. Thus, from inequalities (1.8) we have that

$$\limsup_{l \to \infty} \|\delta y_{l+1}(t, T_l^s)\|_\lambda \le b_y^*, \qquad (1.26)$$

and consequently $\|\delta y_{l+1}(T_l^s, T_l^s)\| \le e^{\lambda T} b_y^*$ (see Eq. (1.9)). On the other hand, from the algorithm description (see step (ii)), $\|\delta y_{l+1}(T_l^s, T_l^s)\| = \varepsilon$, implying that $\varepsilon \le e^{\lambda T} b_y^*$; but $\varepsilon > e^{\lambda T} b_y^*$ by definition. Therefore, the above assumption is not valid, and there exists $l^*$ such that, for all $l \ge l^*$, $T_l^s = T$ and $\|\delta y_l(T_l^s)\| < \varepsilon$. Thus, the results of Theorem 1.1 are obtained during the second phase of the iterative learning process, when $l \ge l^*$. Consequently, Theorem 1.1 still holds.
1.3 Simulation results
In this section, we present the simulation results from the implementation of the bounded-error NILC algorithm described in the previous section. These results are compared with the simulation results from the standard NILC algorithm. We consider a dynamic model of the TT-3000 SCARA-type robot of SEIKO Instruments Inc. [21].

The Lagrange formulation of the equations of motion for this robot is [21]:

$$A(q)\,\ddot q + b(q, \dot q) + D\,\dot q + f_c(\dot q) = u, \qquad (1.27)$$
$$A = \begin{bmatrix} a_{11}(q_2) & a_{12}(q_2) \\ a_{12}(q_2) & a_{22} \end{bmatrix}, \qquad b = \begin{bmatrix} -\,m_2 l_1 s_2 \sin q_2\,(2\,\dot q_1 \dot q_2 + \dot q_2^2) \\ m_2 l_1 s_2 \sin q_2\,\dot q_1^2 \end{bmatrix},$$

$$D\,\dot q = \begin{bmatrix} d_1 \dot q_1 \\ d_2 \dot q_2 \end{bmatrix}, \qquad f_c(\dot q) = \begin{bmatrix} f_{c1}\,\mathrm{sign}(\dot q_1) \\ f_{c2}\,\mathrm{sign}(\dot q_2) \end{bmatrix}, \qquad u = [u_1, u_2]^{\mathrm T}, \quad q = [q_1, q_2]^{\mathrm T}, \quad \dot q = [\dot q_1, \dot q_2]^{\mathrm T}, \qquad (1.28)$$

$$a_{11}(q_2) = I_1 + m_1 s_1^2 + I_2 + m_2\,(l_1^2 + s_2^2 + 2\,l_1 s_2 \cos q_2), \qquad a_{12}(q_2) = I_2 + m_2\,(s_2^2 + l_1 s_2 \cos q_2), \qquad a_{22} = I_2 + m_2 s_2^2, \qquad (1.29)$$
and $q_i$, $i = 1, 2$, is the corresponding joint angle, $d_i$ is the viscous friction coefficient, $f_{ci}$ is the Coulomb friction coefficient, $l_i$ is the length of link $i$, $i = 1, 2$, given in the specifications of the robot, $s_i$ is the position of the centre of mass, $m_i$ is the total mass of the link, and $I_i$ is the inertia of the link about its centre of mass; $\mathbf{u}$ is the vector of generalized torques, $\mathbf{u} = \mathbf{u}_l + \mathbf{u}_c$, where $\mathbf{u}_l, \mathbf{u}_c$ are the feedforward and feedback terms described below.
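For reference, the dynamic terms of Eqs. (1.27)-(1.29) can be sketched in Python. This is a minimal sketch, not the authors' code: the lumped inertia and friction values follow Table 1.1 (with an assumed ordering of the three inertia parameters), and the Coriolis coefficient `gamma` (standing in for $\beta = m_2 l_1 s_2$) is a placeholder value, since it is not listed separately in the table.

```python
import numpy as np

# Lumped parameters of the two-link SCARA arm (Table 1.1, "Standard
# parameters"; the x1e-2 scaling of the inertia values is applied here).
A11, A12, A22 = 0.65962, 0.15056, 0.14534   # [Nm^2], assumed ordering
D1, D2 = 17.29, 12.49                        # viscous friction [Nms]
FC1, FC2 = 6.46, 3.44                        # Coulomb friction [Nm]

def dynamics_terms(q, dq, gamma=0.05):
    """Return A(q), b(q, dq), D, f_c of Eqs. (1.27)-(1.29).

    gamma = m2*l1*s2 is a placeholder value; it is not given in Table 1.1.
    """
    c2, s2 = np.cos(q[1]), np.sin(q[1])
    # Inertia matrix: constant lumped terms plus the cos(q2) coupling.
    A = np.array([[A11 + 2.0 * gamma * c2, A12 + gamma * c2],
                  [A12 + gamma * c2,       A22]])
    # Coriolis/centrifugal vector.
    b = np.array([-gamma * s2 * (2.0 * dq[0] * dq[1] + dq[1] ** 2),
                   gamma * s2 * dq[0] ** 2])
    D = np.diag([D1, D2])
    f_c = np.array([FC1 * np.sign(dq[0]), FC2 * np.sign(dq[1])])
    return A, b, D, f_c

A, b, D, f_c = dynamics_terms(np.array([0.1, -0.2]), np.array([0.5, 0.3]))
```

At any configuration the returned inertia matrix should be symmetric and positive definite, which is the property the text relies on when inverting $\mathbf{A}(\mathbf{q})$.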
We propose the following standard PD update control law, which is similar to the update law described in equation (1.3) when $\alpha = 0$:

$$\mathbf{u}_{l+1} = \mathbf{u}_l + \hat{\mathbf{A}}(\mathbf{q}_l(t))[L_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + L_p(\mathbf{q}_d - \mathbf{q}_l)], \quad (1.30)$$

where: $\mathbf{q}_d$ is an attainable and desired trajectory, $q^1_d(t) = 0.25\cos(2\pi t) - 0.25$, $q^2_d(t) = 0.5\cos(\pi t) - 0.5$, $t \in [0, T]$, and $\mathbf{q}_l$ is the output trajectory at the $l$th iteration; $\hat{\mathbf{A}}(\mathbf{q}_l(t))$, $l = 0, 1, \ldots, N$, is the learning operator; $\mathbf{u}_0(t) \equiv \mathbf{0}$ is the initial feedforward control input; $L_v$ and $L_p$ are learning control gains. Using equation (1.11), the corresponding error-bounded learning update law is given by:

$$\mathbf{u}^s_{l+1}(t, T^s_l) = \mathbf{u}^s_l(t, T^s_l) + \Delta\mathbf{u}^s_l(t, T^s_l), \quad \mathbf{u}^s_0(t, T) \equiv \mathbf{0}, \quad (1.31)$$

$$\Delta\mathbf{u}^s_l = \begin{cases} \hat{\mathbf{A}}(\mathbf{q}_l(t))[L_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + L_p(\mathbf{q}_d - \mathbf{q}_l)], & t \in [0, T^s_l],\ T^s_l \in (0, T],\ \|\Delta\mathbf{y}^s_l(T^s_l, T^s_l)\| = \varepsilon; \\ \mathbf{0}, & t \in (T^s_l, T]. \end{cases}$$
Then, we consider the following stabilizing P-type feedback controller proposed in [22]:

$$\mathbf{u}_c = \hat{\mathbf{A}}(\mathbf{q}_l)[K_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + K_p(\mathbf{q}_d - \mathbf{q}_l)] + \hat{\mathbf{b}} + \hat{D}\dot{\mathbf{q}}_l + \hat{\mathbf{f}}_c, \quad (1.32)$$

where $K_v$ and $K_p$ are the feedback gains, and $\hat{\mathbf{A}}, \hat{\mathbf{b}}, \hat{D}$, and $\hat{\mathbf{f}}_c$ are the corresponding estimates of $\mathbf{A}, \mathbf{b}, D$, and $\mathbf{f}_c$ in equation (1.27).
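The interplay of the update law (1.30), the truncation rule (1.31), and the error bound $\varepsilon$ can be illustrated on a toy scalar plant. This is a sketch under stated assumptions: the first-order lag plant, the gain values, and all function names are ours, not the robot model or the gains of the text; only the truncate-at-$T^s_l$ logic and the zero initial feedforward follow the algorithm above.

```python
import numpy as np

def bounded_error_ilc(plant, q_d, dq_d, Lv, Lp, eps, iters):
    """Sketch of the bounded-error learning loop of Eqs. (1.30)-(1.31):
    a PD-type feedforward update that is switched off after the first
    time instant T_l^s at which the tracking error reaches the bound eps."""
    u = np.zeros_like(q_d)                  # u_0(t) = 0
    for _ in range(iters):
        q, dq = plant(u)
        hit = np.flatnonzero(np.abs(q_d - q) >= eps)
        k_stop = hit[0] if hit.size else len(q_d) - 1   # index of T_l^s
        du = Lv * (dq_d - dq) + Lp * (q_d - q)
        du[k_stop + 1:] = 0.0               # Eq. (1.31): no update on (T_l^s, T]
        u = u + du
    return u

# Toy first-order plant q' = -q + u (a stand-in, not the model (1.27)).
dt, T = 0.01, 2.0
t = np.arange(0.0, T + dt / 2, dt)

def plant(u):
    q = np.zeros_like(u)
    for k in range(len(u) - 1):             # forward-Euler simulation
        q[k + 1] = q[k] + dt * (-q[k] + u[k])
    return q, np.gradient(q, dt)

q_d = 0.25 * np.cos(2 * np.pi * t) - 0.25   # desired trajectory, q_d(0) = 0
dq_d = -0.5 * np.pi * np.sin(2 * np.pi * t)
u = bounded_error_ilc(plant, q_d, dq_d, Lv=0.8, Lp=0.5, eps=0.3, iters=60)
q_final, _ = plant(u)
```

As in Fig. 1.4 of the text, the early trials are truncated (the initial error exceeds the bound) and the tracked interval $[0, T^s_l]$ grows with the iteration number until the whole trajectory is tracked within the bound.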
In order to simulate system uncertainties (Eqs. (1.27)-(1.32)) we use two sets
of model parameters which are described in Table 1.1.
Table 1.1. The standard and estimated model parameters.

Standard parameters:
  inertia parameters (x10^-2):  65.962 [Nm^2]   15.056 [Nm^2]   14.534 [Nm^2]
  d_1 = 17.29 [Nms]   d_2 = 12.49 [Nms]   f_c1 = 6.46 [Nm]   f_c2 = 3.44 [Nm]

Experimental estimations:
  inertia parameters (x10^-2):  65.553 [Nm^2]   14.718 [Nm^2]   18.478 [Nm^2]
  d_1 = 17.29 [Nms]   d_2 = 12.49 [Nms]   \hat{f}_c1 = 8.13 [Nm]   \hat{f}_c2 = 5.51 [Nm]
28
Let assume (0) (0) 0l d q q , 0.3[ ],rad 2 [sec]T . The feedback
gains are 2vK and 0pK , and the learning gains are 0vL and 3pL .
As a result of the implementation of the bounded-error NILC, for $l = 0$ we obtain the output trajectory at the first iteration, depicted in Fig. 1.3 by a dotted line together with the desired trajectory (solid line).

The maximum of the initial tracking error (Fig. 1.3) is 0.2297 [rad], which is less than $\varepsilon = 0.3$ [rad], and therefore the end time of the tracking process is $T^s_0 = T = 2$ [sec].
Further, the bounded-error learning process is illustrated in detail in Fig. 1.4.
Fig. 1.3. The desired trajectory (solid line) and the initial
trajectory (dotted line).
Fig. 1.4. The bounded-error learning process.
Fig. 1.5. Max. error of the standard (dotted line) and the
bounded-error (solid line) ILC.
Trajectories $q^2_l(t)$, $l = 3, 4, 5, 6$, and 7, are shown, and the corresponding supremum norms of the tracking errors are bounded, i.e., $\|\mathbf{q}_d(t) - \mathbf{q}_l(t)\| < \varepsilon = 0.3$ [rad] for $t \in [0, T^s_l)$, $\|\mathbf{q}_d(T^s_l) - \mathbf{q}_l(T^s_l)\| = \varepsilon$, and $T^s_l \in (0, T]$. This is shown for the whole bounded-error learning process by the profile of the maximum tracking error vs. iteration number, which is depicted by a solid line in Fig. 1.5, where the corresponding profile for the standard NILC is plotted by a dotted line.
Referring to Fig. 1.5, we can see that the maximum transient error of the
standard NILC (Eq. (1.30)) shown by a dotted line is successfully truncated
(bounded) as a result (the solid line) of the new bounded-error learning algorithm
(Eq. (1.31)).
Fig. 1.6. Tracking times of the bounded-error learning
process.
Moreover, the profile of the values of $T^s_l \in (0, T]$, $0 \le l \le 15$, plotted in Fig. 1.6, reveals that, in this particular case, the bounded-error learning process is faster than the standard one ($\sum_{l=2}^{6} T^s_l + 10T < 15T$).
1.4 Discussion and conclusions
The main feature of the presented approach is that it solves, in the time
domain, the problem of the transient growth of the tracking error, (Fig. 1.5), while
the other methods concerning this problem provide solutions in the iteration
domain. The implementation of the proposed bounded-error learning update law,
(Eq. (1.11)), results in a premature termination of the tracking process in the time
domain if the tracking error norm is greater than the given value. Consequently
this learning algorithm does not satisfy the postulate which states that the time
interval of duration of each trial must be fixed. If we weaken this postulate, i.e., the time duration of each trial is finite but not fixed and is less than or equal to the time duration of the desired trajectory, then the bounded-error algorithm satisfies this relaxed version.
To some extent, the proposed algorithm could be compared to the Newton-
type ILC [12] because the Newton-type ILC changes the learning update law
from linear to nonlinear depending on the supremum norm of the tracking error.
But the great difference between these methods is that the bounded-error
algorithm terminates the tracking process in the time domain (time interval [0, ]T
) in order to bound the increase in tracking error, while the Newton-type ILC
changes the update law in the iteration domain and therefore it satisfies the first
postulate about the fixed trial duration.
We came to the following conclusions:
A nonlinear iterative learning control for a class of nonlinear time-varying
systems is proposed. Using a new learning strategy (bounded-error
algorithm) it ensures that the transient growth of tracking error is bounded
by terminating the tracking process in the time domain.
The robustness and convergence of the bounded-error learning algorithm
is proven with respect to the time-weighted norm of the input error.
The applicability and efficiency of the bounded-error learning algorithm
are tested by a numerical simulation of the proposed learning control for
TT-3000 SCARA-type robot of SEIKO Instruments Inc. The simulation
result reveals that the increase in transient tracking error is successfully
bounded and the convergence rate in this particular case is similar to the
convergence rate of the standard learning algorithm. Moreover, the new
learning process is faster than the standard one and this is of considerable
interest for further research.
Chapter 2
Iterative Learning Control with Sampled-Data
Feedback
2.1 Introduction
Sampled-data systems are a class of control systems, where a continuous-time
plant is controlled by a discrete-time controller [23-25]. The stabilization of
sampled-data systems (SD stabilization problem) is motivated by the use of
digital computers in most recent controllers. In particular, a robot arm with
rotational joints controlled by a programmable industrial controller is a typical
nonlinear time-varying sampled-data (SD) system. The difference between the
discrete system and the SD one is shown in Fig. 2.1(a) and Fig. 2.1(b),
correspondingly,
Fig. 2.1. (a) Control of a discrete system [23]; (b) Control of a sampled-data
system [23].
where P is a plant and C is a feedback controller, solid lines represent continuous
signals and dotted lines represent discrete signals, and S and H are a sampler
(Analog to Digital Converter) and a holder (Digital to Analog Converter),
correspondingly, and $t \in [0, T]$, $\{t_k\}_{k=0}^{K}$: $t_k = k t_s \in [0, T]$, $K = \lfloor T / t_s \rfloor$, where $\lfloor \cdot \rfloor$ denotes the integer part of a real number, and $t_s$ is the sampling interval.
Three basic approaches for SD controller design are discussed by Petrew et al. [23]: 1) continuous-time emulation design, 2) discrete-time discretization design, and 3) sampled-data direct design. Unfortunately, the efficiency of these methods for SD system stabilization requires a sufficiently high sampling rate. Therefore, in [26] a feedforward gain was proposed to significantly improve control efficiency. It is well known in the literature [1, 2, 12] that the best approach for an improvement of the feedforward control for uncertain systems is Iterative Learning Control (ILC).
A classical off-line sampled-data ILC scheme for a plant with a sampled-data feedback controller attached is depicted in Fig. 2.2(a) [24], where sampled signals (dotted lines) from a continuous plant P are processed by the digital learning and feedback controllers L and C, and where the inputs $\mathbf{u}_l$ and $\mathbf{u}_c$ (solid lines) are obtained by using a hold device H on the discrete-time signals generated by both controllers. So, the feedback and feedforward channels have equal sampling rates (Fig. 2.2(a)), and therefore SD ILC is a particular case of the multirate ILC [24, 27] shown in Fig. 2.2(b), where the sampling rate for the feedback loop (dotted lines in Fig. 2.2(b)) is different from the sampling rate for the feedforward loop (dashed lines in Fig. 2.2(b)).
Fig. 2.2. (a) Sampled-data ILC with a sampled-data feedback; the sampled signals (dotted lines) have the same sampling rate. (b) Multirate ILC – the sampling rate for the feedback loop (dotted lines) is different from the sampling rate for the feedforward loop (dashed lines).
In this chapter we consider the special case of a continuous ILC for a continuous nonlinear time-varying system (plant P in Fig. 2.3) with a sampled-data feedback controller attached. This case is applicable to ILC for robot manipulators with multi-joint arms with 4-6 degrees of freedom (DOF) [14] or a redundant kinematic structure [28]. This study aims to present a model-based nonlinear ILC with SD feedback that improves SD system stabilization (compensates for the uncertainties of the sampled-data feedback), to prove the stability (uniform asymptotic convergence) of the new ILC law for robot manipulators, and to verify the proposed ILC algorithm by computer simulations of the PUMA 560 robot manipulator.
Chapter 2 is organized as follows. Section 2.2 presents the design of a
nonlinear ILC with a SD feedback for robot manipulators. Section 2.3 introduces
the robustness and convergence analysis of the presented learning algorithm. The
proposed ILC procedure with the SD feedback is validated and examined by
computer simulation in Section 2.4.
2.2 Formulation of the problem
We consider a robotic manipulator with $n$ DOF. The nonlinear multiple-input multiple-output (MIMO) dynamic model of the robot is based on the Lagrange formulation of the equations of motion in the space of generalized coordinates:

$$\mathbf{A}(\mathbf{q}_l)\ddot{\mathbf{q}}_l + \mathbf{b}(\mathbf{q}_l, \dot{\mathbf{q}}_l) + D\dot{\mathbf{q}}_l + \mathbf{g}(\mathbf{q}_l) + \mathbf{f} = \mathbf{u}, \quad (2.1)$$

where: $l = 0, 1, \ldots, N$ is the iteration number; $\mathbf{q}_l$ is the $n \times 1$ vector of generalized coordinates (joint angles), $q^i_l \in [Q^*_i, Q^{**}_i]$, $i = 1, \ldots, n$, where $Q^{**}_i$ is the upper joint limit and $Q^*_i$ is the lower joint limit of $q^i_l$; $\mathbf{A}(\mathbf{q}_l)$ is the $n \times n$ symmetric positive-definite inertia matrix; the $n \times 1$ vector $\mathbf{b}(\mathbf{q}_l, \dot{\mathbf{q}}_l)$ takes into account the Coriolis and centrifugal torques; $D = \mathrm{diag}\{\delta_1, \ldots, \delta_n\}$ denotes the diagonal $n \times n$ matrix of the coefficients of viscous friction; $\mathbf{g}(\mathbf{q}_l)$ is the $n \times 1$ vector representing gravity torques; $\mathbf{f} = [f^1 \mathrm{sign}(\dot{q}^1_l) \ \ldots \ f^n \mathrm{sign}(\dot{q}^n_l)]^{\mathrm{T}}$ is the vector of Coulomb friction torques; and $\mathbf{u} = \mathbf{u}_l + \mathbf{u}_c$ is the $n \times 1$ vector of generalized torques, where $\mathbf{u}_l$ and $\mathbf{u}_c$ are feedforward and feedback terms, respectively. The allowable set of generalized torques is a rectangular hyper-parallelepiped: $u^i \in [-U^i_{\max}, U^i_{\max}]$, where $U^i_{\max}$ and $-U^i_{\max}$ are the upper and lower limits of the control signal $u^i$.
The synthesis of an ILC with SD feedback consists of three steps: first, synthesis of an update control law, then synthesis of a SD-feedback control law, and finally specification of a learning operator.

We assume the following notations: $h \equiv h(\cdot)$ and $H \equiv H(\cdot, \cdot)$, and $\|\cdot\|$ denotes the Euclidean norm.

For the nonlinear MIMO dynamic model in equation (2.1), we propose the following feedforward update law:

$$\mathbf{u}_{l+1} = \mathbf{u}_l + \mathbf{L}(\mathbf{q}_l)[L_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + L_p(\mathbf{q}_d - \mathbf{q}_l)], \quad (2.2)$$

where: $l = 0, 1, \ldots, N$ is the iteration number of the ILC procedure, $\mathbf{q}_d$ is an attainable and desired trajectory and $\mathbf{q}_l$ is the output trajectory at the $l$th iteration; $\mathbf{L} \equiv \mathbf{L}(\mathbf{q}_l(t))$ is a learning operator; $\mathbf{u}_0 \equiv \mathbf{u}_0(t)$ is the initial feedforward control input; $t \in [0, T]$ is the tracking time and $[0, T]$ is the robot tracking time interval; $L_v$ and $L_p$ are learning control gains.
It has to be mentioned that the calculation of $\mathbf{u}_{l+1}$ in Eq. (2.2) is performed offline, and in the case of noisy measurements of $\mathbf{q}_l$ or $\dot{\mathbf{q}}_l$, a commonly used method is to low-pass filter the measured data and then differentiate the resulting signal.
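The filter-then-differentiate step mentioned above can be sketched as follows. The text does not specify a particular filter, so a plain moving average stands in for the low-pass stage; the signal, noise level, and window length are illustrative choices of ours.

```python
import numpy as np

def filtered_derivative(q_meas, dt, window=11):
    """Low-pass filter a sampled joint trajectory (moving average, a
    simple stand-in for the unspecified filter) and then differentiate
    the smoothed signal by central differences."""
    kernel = np.ones(window) / window
    # 'same'-mode convolution keeps the array length; edges are biased.
    q_smooth = np.convolve(q_meas, kernel, mode="same")
    return np.gradient(q_smooth, dt)

dt = 0.001
t = np.arange(0.0, 2.0, dt)
rng = np.random.default_rng(0)
q_true = 0.5 * np.cos(np.pi * t) - 0.5          # noiseless joint angle
q_meas = q_true + 0.001 * rng.standard_normal(t.size)

dq_est = filtered_derivative(q_meas, dt)         # filter, then differentiate
dq_raw = np.gradient(q_meas, dt)                 # differentiating raw noise
dq_true = -0.5 * np.pi * np.sin(np.pi * t)
```

Differentiating the raw measurement amplifies the noise by roughly $1/t_s$, while filtering first keeps the velocity estimate close to the true derivative, which is why this preprocessing matters for the $\dot{\mathbf{q}}_l$ term of the update law.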
Let us assume the following notations: $h^k = h(t_k)$, $H^k = H(t_k, \cdot)$, $\{t_k\}_{k=0}^{K}$: $t_k = k t_s \in [0, T]$, $K = \lfloor T / t_s \rfloor$, and

$$h^{*,t} \equiv h^{*,t}(t) = \begin{cases} h^k, & t \in [t_k, t_{k+1}),\ k = 0, 1, \ldots, K-1, \\ h^K, & t \in [t_K, T], \end{cases}$$

is the corresponding piecewise-constant function on $[0, T]$, and $t_s$ is the sampling interval.
$$\mathbf{u}^{*,t}_c(t) \equiv \mathbf{u}^k_c, \quad \mathbf{u}^k_c := \hat{\mathbf{A}}^k[K_v(\dot{\mathbf{q}}^k_d - \dot{\mathbf{q}}^k_l) + K_p(\mathbf{q}^k_d - \mathbf{q}^k_l)] + \hat{\mathbf{b}}^k + \hat{D}\dot{\mathbf{q}}^k_l + \hat{\mathbf{g}}^k + \hat{\mathbf{f}}^k, \quad (2.3)$$

where: $\hat{\mathbf{A}}^k \equiv \hat{\mathbf{A}}(\mathbf{q}_l(t_k))$ and $\hat{\mathbf{A}}$ is an estimate of the inertia matrix $\mathbf{A}$, obtained by a parameter identification technique, $\hat{\mathbf{b}}^k, \hat{D}, \hat{\mathbf{g}}^k$, and $\hat{\mathbf{f}}^k$ are the corresponding estimates of $\mathbf{b}^k, D, \mathbf{g}^k$, and $\mathbf{f}^k$, and $K_v$ and $K_p$ are the feedback gains.
Usually, the learning operator should be selected to satisfy a sufficient
condition for convergence of the ILC algorithm [1,14]. Therefore, the third step
(learning operator selection) of the ILC synthesis will be completed after the
proof of the learning convergence.
Fig. 2.3. Continuous-time ILC scheme with sampled-data feedback.
Equations (2.1), (2.2) and (2.3) describe the following classical off-line ILC scheme with SD feedback, depicted in Fig. 2.3, where P represents the robot arm; C and L are the feedback and feedforward controllers, respectively; M is the memory of the control system; the input trajectory $\mathbf{u}_l$ is the feedforward term of the control law $\mathbf{u} = \mathbf{u}_l + \mathbf{u}_c$ and $l = 0, 1, \ldots, N$ is the current iteration number; $\mathbf{q}_l$ is the actual output trajectory; $\mathbf{q}_d$ is the desired output trajectory; S and H are the sampler and the holder, respectively. The off-line computed feedforward term $\mathbf{u}_{l+1}$ improves the tracking performance of the robot on the next iteration.
In contrast to the SD ILC and multirate ILC schemes shown in Fig. 2.2(a) and
Fig. 2.2 (b), we assume that the proposed ILC (Fig. 2.3) is continuous because of
the following reasons.
The digital ILC controller (computer) calculates the feedforward control term off-line. Therefore, the discretization frequency (sampling rate) of the output of the learning controller can be as high as needed. For example, if the robot actuators are controlled by a PWM (Pulse Width Modulation) signal, any content of the feedforward control output faster than the duty-cycle command rate of the PWM will not be reflected in the PWM output [29], and the feedforward output can be assumed to be continuous.
The ILC systems with input and output signals that are transmitted
through a communication network (networked ILC systems [25])
require a sampled-data or a multirate ILC, but this is not the typical
case of industrial robot applications. That is why we do not consider networked ILC systems in this chapter.
Recently, several multirate ILC schemes have been proposed to
guarantee good learning transient (acceptable transient error) [27].
As the learning transient (transient growth) is the major problem for
the applicability of a nonlinear ILC [6], we use the bounded error
ILC algorithm [17] in order to solve this problem in a safe, fast and
simple manner and consequently multirate ILC schemes are not
needed to solve the learning transient problem.
It has to be mentioned that the output signal of the feedback controller, depicted by a dotted line in Fig. 2.3, cannot be assumed to be continuous, because it has to be calculated online in real time. Moreover, the calculation time of the feedback, generated by the proposed model-based (dynamics-based) controller in Eq. (2.3), cannot be neglected, especially for a multi-joint robotic arm, and a zero-order holder, which generates a piecewise-constant (staircase) signal, must be used.
2.3 Convergence and robustness of the learning
control with a sampled-data feedback
In this section we present the main result of this work, that is, the proof of the
robustness and convergence of the proposed ILC scheme with a sampled-data
feedback.
We consider a class of multi-input-multi-output nonlinear time-varying systems described by the following state-space equations [3, 4]:

$$\dot{\mathbf{x}}_l(t) = f(\mathbf{x}_l(t), t) + B(\mathbf{x}_l(t), t)\mathbf{u}_l(t) + \boldsymbol{\omega}_l(t), \qquad \mathbf{y}_l(t) = g(\mathbf{x}_l(t), t), \quad (2.4)$$

where, for $l \in \{0, 1, \ldots\}$ and all $t \in [0, T]$, $\mathbf{x}_l(t) \in \mathbb{R}^n$, $\mathbf{y}_l(t) \in \mathbb{R}^m$, $\mathbf{u}_l(t) \in \mathbb{R}^r$ are not necessarily continuous, and $\boldsymbol{\omega}_l(t) \in \mathbb{R}^n$ represents both deterministic and random disturbances. The functions $f: \mathbb{R}^n \times [0, T] \to \mathbb{R}^n$ and $B: \mathbb{R}^n \times [0, T] \to \mathbb{R}^{n \times r}$ are piecewise continuous in $t \in [0, T]$, and $g: \mathbb{R}^n \times [0, T] \to \mathbb{R}^m$ is differentiable in $\mathbf{x}$ and $t$, with partial derivatives $g_x(\cdot, \cdot)$ and $g_t(\cdot, \cdot)$. In addition, the following assumptions hold [3, 4]:

I. For each fixed initial state $\mathbf{x}(0)$ with $\boldsymbol{\omega}(\cdot) \equiv 0$, the output map $O: C([0, T], \mathbb{R}^r) \to C([0, T], \mathbb{R}^m)$ and the state map $S: C([0, T], \mathbb{R}^r) \to C([0, T], \mathbb{R}^n)$ are one-to-one. In this notation $\mathbf{y}_l(\cdot) = O(\mathbf{u}_l(\cdot), \mathbf{x}_l(0))$ and $\mathbf{x}_l(\cdot) = S(\mathbf{u}_l(\cdot), \mathbf{x}_l(0))$.

II. The disturbance $\boldsymbol{\omega}_l(\cdot)$ is bounded on $[0, T]$, i.e. $\|\boldsymbol{\omega}_l(t)\| \le b_\omega$.

III. The functions $f(\cdot, \cdot)$, $B(\cdot, \cdot)$, $g_x(\cdot, \cdot)$, and $g_t(\cdot, \cdot)$ are uniformly globally Lipschitz in $\mathbf{x}$ on the interval $[0, T]$.

IV. The operators $B(\cdot, \cdot)$ and $g_x(\cdot, \cdot)$ are bounded on $[0, T] \times \mathbb{R}^n$.

V. All functions are assumed measurable and integrable.

Let us consider the following learning update law

$$\mathbf{u}_{l+1}(t) = (1 - \alpha)\mathbf{u}_l + \alpha\mathbf{u}_0 + L(\mathbf{y}_l, t)(\dot{\mathbf{y}}_d - \dot{\mathbf{y}}_l), \quad (2.5)$$

where $L: \mathbb{R}^m \times [0, T] \to \mathbb{R}^{r \times m}$ is a bounded learning operator, and $\alpha \in [0, 1)$ allows the influence of a bias term $\mathbf{y}_d(\cdot) - \mathbf{y}_0(\cdot)$ [3]. Equation (2.5) describes a standard D-type NILC for a class of nonlinear time-varying systems (Eq. (2.4)), with a nonlinear time-varying learning operator.

Lemma 2.1 [3].
If $\{a_l\}$, $l \in \{0, 1, \ldots\}$, is a sequence of real numbers such that $a_{l+1} \le \rho a_l + \varepsilon$, $0 \le \rho < 1$, then $\limsup_{l \to \infty} a_l \le (1 - \rho)^{-1}\varepsilon$.

The proof of Lemma 2.1 is presented in [25].
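A quick numerical illustration of Lemma 2.1 (the constants here are arbitrary choices of ours): iterating the worst case, where the inequality holds with equality, drives the sequence to the bound $\varepsilon/(1-\rho)$.

```python
# Lemma 2.1: for a_{l+1} <= rho * a_l + eps with 0 <= rho < 1, the
# sequence satisfies limsup a_l <= eps / (1 - rho).  We iterate the
# equality (worst) case from an arbitrary large starting value.
rho, eps = 0.6, 0.2
a = 10.0
for _ in range(200):
    a = rho * a + eps          # contraction plus constant perturbation
limit = eps / (1.0 - rho)      # the bound of the lemma (= 0.5 here)
```

This is the mechanism behind the convergence proofs below: each iteration contracts the input error by $\bar{\rho} < 1$, so the residual settles at a level proportional to the perturbation $\varepsilon$.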
We define the time-weighted norm ($\lambda$-norm) for a function $h(t)$, $t \in [0, T]$, by

$$\|h(\cdot)\|_\lambda = \sup_{t \in [0, T]} e^{-\lambda t}\|h(t)\|.$$

Lemma 2.2
If $h^k$ is a discrete form of $h(t)$, $t \in [0, T]$ ($h^k = h(t_k)$, $\{t_k\}_{k=0}^{K}$: $t_k = k t_s \in [0, T]$, $K = \lfloor T/t_s \rfloor$), and $h^{*,t}$ is the corresponding piecewise-constant function on $[0, T]$, then $\|h^{*,t}(\cdot)\|_\lambda \le \|h(\cdot)\|_\lambda$.

The proof of Lemma 2.2 is presented in Appendix 1.

Using equations (2.2) and (2.3), and taking into account that $\mathbf{A}(\mathbf{q})$ is an $n \times n$ symmetric positive-definite matrix, equation (2.1) can be written as:

$$\begin{bmatrix} \dot{\mathbf{q}}_l \\ \ddot{\mathbf{q}}_l \end{bmatrix} = \begin{bmatrix} \dot{\mathbf{q}}_l \\ -\mathbf{A}^{-1}(\mathbf{b} + D\dot{\mathbf{q}}_l + \mathbf{g}) \end{bmatrix} + \begin{bmatrix} \mathbf{0} \\ \mathbf{A}^{-1} \end{bmatrix}\mathbf{u}^{*,t}_c + \begin{bmatrix} \mathbf{0} \\ \mathbf{A}^{-1} \end{bmatrix}(\mathbf{u}_l - \mathbf{f}). \quad (2.6)$$

Let us define

$$\mathbf{x}_l = \begin{bmatrix} \mathbf{q}_l \\ \dot{\mathbf{q}}_l \end{bmatrix}, \quad f(\mathbf{x}_l, t) = f_1(\mathbf{q}_l, \dot{\mathbf{q}}_l) + B f^{*,t}_2(\mathbf{q}^k_l, \dot{\mathbf{q}}^k_l), \quad f_1 = \begin{bmatrix} \dot{\mathbf{q}}_l \\ -\mathbf{A}^{-1}(\mathbf{b} + D\dot{\mathbf{q}}_l + \mathbf{g}) \end{bmatrix},$$

and

$$f^{*,t}_2 \equiv f^k_2 := \hat{\mathbf{A}}^k[K_v(\dot{\mathbf{q}}^k_d - \dot{\mathbf{q}}^k_l) + K_p(\mathbf{q}^k_d - \mathbf{q}^k_l)] + \hat{\mathbf{b}}^k + \hat{D}\dot{\mathbf{q}}^k_l + \hat{\mathbf{g}}^k, \quad (2.7)$$

$$B = \begin{bmatrix} \mathbf{0} \\ \mathbf{A}^{-1} \end{bmatrix}, \quad \text{and} \quad \mathbf{y}_l = \mathbf{x}_l.$$

Thus, treating the Coulomb friction as a disturbance, $\boldsymbol{\omega}_l = \begin{bmatrix} \mathbf{0} \\ \mathbf{A}^{-1}(\hat{\mathbf{f}} - \mathbf{f}) \end{bmatrix}$, from equation (2.6) we obtain equation (2.4) for $r = m = n$.
Let us consider the continuous feedback control law proposed in [22], which corresponds to the discrete feedback law defined by Eq. (2.3):

$$\mathbf{u}_c(\mathbf{q}_l, \dot{\mathbf{q}}_l, t) = \hat{\mathbf{A}}[K_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + K_p(\mathbf{q}_d - \mathbf{q}_l)] + \hat{\mathbf{b}} + \hat{D}\dot{\mathbf{q}}_l + \hat{\mathbf{g}} + \hat{\mathbf{f}}. \quad (2.8)$$

Using Eq. (2.7) and Eq. (2.8), we obtain the continuous form of $f^{*,t}_2$:

$$f_2 = \hat{\mathbf{A}}[K_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + K_p(\mathbf{q}_d - \mathbf{q}_l)] + \hat{\mathbf{b}} + \hat{D}\dot{\mathbf{q}}_l + \hat{\mathbf{g}}. \quad (2.9)$$

Let us define $\Delta\mathbf{u}_l(t) = \mathbf{u}_d(t) - \mathbf{u}_l(t)$ and $\Delta\mathbf{y}_l(t) = \mathbf{y}_d(t) - \mathbf{y}_l(t)$. Because in all practical cases the input signal belongs to a compact subset of $\mathbb{R}^r$, further we will consider that $\forall \mathbf{u}_l(t)$, $t \in [0, T]$: $\|\mathbf{u}_l(\cdot)\|_\infty \le b_u$, where $\|\mathbf{u}_l(\cdot)\|_\infty = \sup_{t \in [0, T]}\|\mathbf{u}_l(t)\|$ is the supremum norm.

For simplification, we introduce the following notations:

$\mathbf{x}_l \equiv \mathbf{x}_l(t)$, $\mathbf{x}_d \equiv \mathbf{x}_d(t)$, $\mathbf{y}_l \equiv \mathbf{y}_l(t)$, $\mathbf{y}_d \equiv \mathbf{y}_d(t)$, $\mathbf{u}_l \equiv \mathbf{u}_l(t)$, $\mathbf{u}_d \equiv \mathbf{u}_d(t)$, $\boldsymbol{\omega}_l \equiv \boldsymbol{\omega}_l(t)$, $f_l \equiv f(\mathbf{x}_l(t), t)$, $f_d \equiv f(\mathbf{x}_d(t), t)$, $B_l \equiv B(\mathbf{x}_l(t), t)$, $B_d \equiv B(\mathbf{x}_d(t), t)$, $g_{xl} \equiv g_x(\mathbf{x}_l(t), t)$, $g_{xd} \equiv g_x(\mathbf{x}_d(t), t)$, $g_{tl} \equiv g_t(\mathbf{x}_l(t), t)$, $g_{td} \equiv g_t(\mathbf{x}_d(t), t)$,

and $k_{gx}$, $k_{gt}$, $k_f$, $k_{f1}$, $k_{f2}$, and $k_B$ are Lipschitz constants for $g_x(\cdot, \cdot)$, $g_t(\cdot, \cdot)$, $f(\cdot, \cdot)$, $f_1(\cdot, \cdot)$, $f_2(\cdot, \cdot)$, and $B(\cdot, \cdot)$, respectively, and $b_{gx}$, $b_{ud}$, $b_{f2d}$, $b_L$, and $b_B$ are the corresponding norm bounds on $g_x(\cdot, \cdot)$, $\mathbf{u}_d(\cdot)$, $f^{*,t}_2(\cdot, \cdot)$, $L(\cdot, \cdot)$, and $B(\cdot, \cdot)$, and $b_d$ is the bound on $f_d + B_d\mathbf{u}_d = \dot{\mathbf{x}}_d$.
Now we proceed to prove a sufficient condition for convergence and
robustness of the ILC with the SD feedback for robot manipulators described in
Eqs. (2.6), (2.5) and (2.3).
Theorem 2.2
Let the system of robot dynamics with a SD feedback controller attached be described by the state-space equations in (2.6), where the feedback control law is given by Eq. (2.7) with the piecewise-constant function $f^{*,t}_2$, and the function $f_2$ in Eq. (2.9) is the continuous form of $f^{*,t}_2$. Suppose the system in (2.6) with $f(\mathbf{x}_l, t) = f_1(\mathbf{q}_l, \dot{\mathbf{q}}_l) + B f_2(\mathbf{q}_l, \dot{\mathbf{q}}_l)$, i.e. with the feedback control given by $f_2$ in Eq. (2.9), satisfies Assumptions I-V and uses the update law in Eq. (2.5), and the sufficient condition (2.10) for robustness and convergence of the ILC is satisfied:

$$\|(1 - \alpha)\mathbf{I} - L(g(\mathbf{x}, t), t)\,g_x(\mathbf{x}, t)\,B(\mathbf{x}, t)\| \le \rho < 1, \quad \forall (\mathbf{x}, t) \in \mathbb{R}^n \times [0, T]. \quad (2.10)$$

Given an attainable desired trajectory $\mathbf{y}_d(t) = g(\mathbf{x}_d(t), t)$, $t \in [0, T]$, where $\mathbf{u}_d(t)$ is the corresponding input and $\mathbf{x}_d(t)$ is the corresponding state (according to I), and the initial state error is bounded by $\|\mathbf{x}_d(0) - \mathbf{x}_l(0)\| \le b_{x0}$, then, as $l \to \infty$, the error between $\mathbf{u}_l(t)$ and $\mathbf{u}_d(t)$ is bounded. In addition, the state and output asymptotic errors are bounded. These bounds depend continuously on the bound $b_{x0}$ on the initial state error, the bound $b_\omega$ on the state disturbance, and $\alpha$; as $b_{x0}$, $b_\omega$, and $\alpha$ tend to zero, these bounds also tend to zero.

Proof
From Eq. (2.5) it follows that

$$\Delta\mathbf{u}_{l+1} = (1 - \alpha)\Delta\mathbf{u}_l + \alpha\Delta\mathbf{u}_0 - L\,\Delta\dot{\mathbf{y}}_l. \quad (2.11)$$

Using the above notations, for $\Delta\dot{\mathbf{y}}_l$ from Eq. (2.4) we obtain:

$$\Delta\dot{\mathbf{y}}_l = g_{xd}(f_d + B_d\mathbf{u}_d) + g_{td} - g_{xl}(f_l + B_l\mathbf{u}_l + \boldsymbol{\omega}_l) - g_{tl}. \quad (2.12)$$

Combining Eq. (2.12) with the terms $g_{xl}(f_d + B_d\mathbf{u}_d)$ and $g_{xl}B_l f^{*,t}_{2d}$ and $g_{xl}B_l\mathbf{u}_d$ added and subtracted, and $f_l = f_{1l} + B_l f^{*,t}_{2l}$, and $f_d = f_{1d} + B_d f^{*,t}_{2d}$, yields

$$\Delta\dot{\mathbf{y}}_l = (g_{xd} - g_{xl})(f_d + B_d\mathbf{u}_d) + g_{xl}(f_{1d} - f_{1l}) + g_{xl}B_l(f^{*,t}_{2d} - f^{*,t}_{2l}) + g_{xl}(B_d - B_l)f^{*,t}_{2d} + g_{xl}B_l\Delta\mathbf{u}_l + g_{xl}(B_d - B_l)\mathbf{u}_d + g_{td} - g_{tl} - g_{xl}\boldsymbol{\omega}_l. \quad (2.13)$$
Combining Eq. (2.11) with Eq. (2.13) yields

$$\Delta\mathbf{u}_{l+1} = [(1 - \alpha)\mathbf{I} - L g_{xl}B_l]\Delta\mathbf{u}_l + \alpha\Delta\mathbf{u}_0 - L\big[(g_{xd} - g_{xl})(f_d + B_d\mathbf{u}_d) + g_{xl}(f_{1d} - f_{1l}) + g_{xl}B_l(f^{*,t}_{2d} - f^{*,t}_{2l}) + g_{xl}(B_d - B_l)f^{*,t}_{2d} + g_{xl}(B_d - B_l)\mathbf{u}_d + g_{td} - g_{tl} - g_{xl}\boldsymbol{\omega}_l\big]. \quad (2.14)$$

Taking norms, using the bounds, and using the Lipschitz conditions (see Assumptions II-IV), from Eq. (2.14) we obtain:
$$\|\Delta\mathbf{u}_{l+1}\| \le \|(1 - \alpha)\mathbf{I} - L g_{xl}B_l\|\,\|\Delta\mathbf{u}_l\| + \alpha\|\Delta\mathbf{u}_0\| + b_L\big(k_{gx}b_d + k_{gt} + b_{gx}(k_f + k_B(b_{ud} + b_{f2d}))\big)\|\Delta\mathbf{x}_l\| + b_L b_{gx}k_{f2}b_B\|\Delta\mathbf{x}^{*,t}_l\| + b_L b_{gx}\|\boldsymbol{\omega}_l\|. \quad (2.15)$$

Multiplying (2.15) by $e^{-\lambda t}$, defining

$$k_1 = b_L\big(k_{gx}b_d + k_{gt} + b_{gx}(k_f + k_B(b_{ud} + b_{f2d}))\big), \quad k_2 = b_L b_{gx}k_{f2}b_B, \quad k_3 = b_L b_{gx},$$

and assuming the sufficient condition for convergence (2.10), the inequality (2.15) simplifies to

$$e^{-\lambda t}\|\Delta\mathbf{u}_{l+1}\| \le \rho e^{-\lambda t}\|\Delta\mathbf{u}_l\| + \alpha e^{-\lambda t}\|\Delta\mathbf{u}_0\| + k_1 e^{-\lambda t}\|\Delta\mathbf{x}_l\| + k_2 e^{-\lambda t}\|\Delta\mathbf{x}^{*,t}_l\| + k_3 b_\omega. \quad (2.16)$$

Taking the supremum of both sides of (2.16) and applying Lemma 2.2 to $\Delta\mathbf{x}^{*,t}_l$ yields

$$\sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{u}_{l+1}\| \le \rho\sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{u}_l\| + \alpha\sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{u}_0\| + (k_1 + k_2)\sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{x}_l\| + k_3 b_\omega. \quad (2.17)$$

Using the time-weighted norm definition and defining $k = k_1 + k_2$, from (2.17) we obtain

$$\|\Delta\mathbf{u}_{l+1}\|_\lambda \le \rho\|\Delta\mathbf{u}_l\|_\lambda + \alpha\|\Delta\mathbf{u}_0\|_\lambda + k\|\Delta\mathbf{x}_l\|_\lambda + k_3 b_\omega. \quad (2.18)$$
On the other hand, using the integral expression

$$\mathbf{x}_l(t) = \mathbf{x}_l(0) + \int_0^t \big[f(\mathbf{x}_l(\tau), \tau) + B(\mathbf{x}_l(\tau), \tau)\mathbf{u}_l(\tau) + \boldsymbol{\omega}_l(\tau)\big]\,d\tau,$$

where $f(\mathbf{x}(t), t) = f_1(\mathbf{x}(t), t) + B(\mathbf{x}(t), t)f^{*,t}_2(\mathbf{x}(t), t)$ and $f^{*,t}_2(\mathbf{x}(t), t)$, $t \in [0, T]$, is a piecewise-constant function, and taking norms, for $\Delta\mathbf{x}_l$ we have

$$\|\Delta\mathbf{x}_l\| \le \|\Delta\mathbf{x}_l(0)\| + \int_0^t\|B_d\mathbf{u}_d - B_l\mathbf{u}_l\|\,d\tau + \int_0^t\|f_{1d} - f_{1l}\|\,d\tau + \int_0^t\|B_d f^{*,t}_{2d} - B_l f^{*,t}_{2l}\|\,d\tau + \int_0^t\|\boldsymbol{\omega}_l\|\,d\tau. \quad (2.19)$$

Adding and subtracting $B_l f^{*,t}_{2d}$ and $B_l\mathbf{u}_d$ in Eq. (2.19), and using the bounds and the Lipschitz conditions, yields

$$\|\Delta\mathbf{x}_l(t)\| \le b_{x0} + b_B\int_0^t\|\Delta\mathbf{u}_l\|\,d\tau + k_4\int_0^t\|\Delta\mathbf{x}_l\|\,d\tau + k_5\int_0^t\|\Delta\mathbf{x}^{*,t}_l\|\,d\tau + b_\omega t, \quad (2.20)$$

where $k_4 = k_B(b_{ud} + b_{f2d}) + k_{f1}$ and $k_5 = b_B k_{f2}$.
Multiplying (2.20) by $e^{-\lambda t}$, we have that

$$e^{-\lambda t}\|\Delta\mathbf{x}_l\| \le b_{x0}e^{-\lambda t} + b_B e^{-\lambda t}\int_0^t e^{\lambda\tau}e^{-\lambda\tau}\|\Delta\mathbf{u}_l\|\,d\tau + k_4 e^{-\lambda t}\int_0^t e^{\lambda\tau}e^{-\lambda\tau}\|\Delta\mathbf{x}_l\|\,d\tau + k_5 e^{-\lambda t}\int_0^t e^{\lambda\tau}e^{-\lambda\tau}\|\Delta\mathbf{x}^{*,t}_l\|\,d\tau + b_\omega t e^{-\lambda t}. \quad (2.21)$$

Noticing that $e^{\lambda\tau}$ is always of the same sign and $\sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{x}^{*,t}_l(t)\| \le \sup_{t \in [0, T]} e^{-\lambda t}\|\Delta\mathbf{x}_l(t)\|$ (see Lemma 2.2), and applying the Second Mean Value Theorem for Integrals [30] to the integral expressions in inequality (2.21), yields:
$$e^{-\lambda t}\|\Delta\mathbf{x}_l\| \le b_{x0}e^{-\lambda t} + \Big(b_B\sup_{t \in [0, T]}\big(e^{-\lambda t}\|\Delta\mathbf{u}_l\|\big) + (k_4 + k_5)\sup_{t \in [0, T]}\big(e^{-\lambda t}\|\Delta\mathbf{x}_l\|\big)\Big)e^{-\lambda t}\int_0^t e^{\lambda\tau}\,d\tau + b_\omega t e^{-\lambda t}. \quad (2.22)$$

Taking into account that $e^{-\lambda t}\int_0^t e^{\lambda\tau}\,d\tau = \lambda^{-1}(1 - e^{-\lambda t}) \le \lambda^{-1}(1 - e^{-\lambda T})$ for $t \in [0, T]$, and taking the supremum of both sides of (2.22), we have that

$$\|\Delta\mathbf{x}_l\|_\lambda \le b_{x0} + b_B\lambda^{-1}(1 - e^{-\lambda T})\|\Delta\mathbf{u}_l\|_\lambda + k_6\lambda^{-1}(1 - e^{-\lambda T})\|\Delta\mathbf{x}_l\|_\lambda + b_\omega T, \quad (2.23)$$

where $k_6 = k_4 + k_5$. Thus, from inequality (2.23) we obtain

$$\|\Delta\mathbf{x}_l\|_\lambda \le \big(1 - k_6\lambda^{-1}(1 - e^{-\lambda T})\big)^{-1}\Big(b_{x0} + b_B\lambda^{-1}(1 - e^{-\lambda T})\|\Delta\mathbf{u}_l\|_\lambda + b_\omega T\Big). \quad (2.24)$$
Combining inequalities (2.18) and (2.24) yields

$$\|\Delta\mathbf{u}_{l+1}\|_\lambda \le \left(\rho + \frac{k\,b_B\lambda^{-1}(1 - e^{-\lambda T})}{1 - k_6\lambda^{-1}(1 - e^{-\lambda T})}\right)\|\Delta\mathbf{u}_l\|_\lambda + \alpha\|\Delta\mathbf{u}_0\|_\lambda + \frac{k(b_{x0} + b_\omega T)}{1 - k_6\lambda^{-1}(1 - e^{-\lambda T})} + k_3 b_\omega. \quad (2.25)$$

Defining

$$\bar{\rho} = \rho + k\,b_B\lambda^{-1}(1 - e^{-\lambda T})\big(1 - k_6\lambda^{-1}(1 - e^{-\lambda T})\big)^{-1}, \quad k_7 = k\big(1 - k_6\lambda^{-1}(1 - e^{-\lambda T})\big)^{-1},$$
$$k_8 = k_3 + kT\big(1 - k_6\lambda^{-1}(1 - e^{-\lambda T})\big)^{-1}, \quad \varepsilon = k_7\|\Delta\mathbf{x}_l(0)\| + k_8 b_\omega + \alpha\|\Delta\mathbf{u}_0\|_\lambda,$$

we have that

$$\|\Delta\mathbf{u}_{l+1}\|_\lambda \le \bar{\rho}\|\Delta\mathbf{u}_l\|_\lambda + \varepsilon. \quad (2.26)$$

Since $\rho < 1$, we can find $\lambda$ sufficiently large so that $\bar{\rho} = \rho + k\,b_B\lambda^{-1}(1 - e^{-\lambda T})\big(1 - k_6\lambda^{-1}(1 - e^{-\lambda T})\big)^{-1} < 1$. Thus we can apply Lemma 2.1, so that

$$\limsup_{l \to \infty}\|\Delta\mathbf{u}_l\|_\lambda \le (1 - \bar{\rho})^{-1}\varepsilon, \quad (2.27)$$

implying that $\mathbf{u}_l$ converges to $\mathbf{u}_d$ when $l \to \infty$ and $\Delta\mathbf{x}_l(0) = \mathbf{0}$, $b_\omega = 0$, and $\alpha = 0$, implying that $\varepsilon = 0$.
Corollary 2.1
If the update law (2.5) is replaced with

$$\mathbf{u}_{l+1}(t) = (1 - \alpha)\mathbf{u}_l + \alpha\mathbf{u}_0 + L(\mathbf{y}_l, t)(\dot{\mathbf{y}}_d - \dot{\mathbf{y}}_l) + K(\mathbf{y}_l, t)(\mathbf{y}_d - \mathbf{y}_l), \quad (2.28)$$

with $K(\cdot, \cdot)$ bounded, then Theorem 2.2 still holds. Equation (2.28) describes a PD-type ILC.
It has to be mentioned that all corollaries of Theorem 1.1 in Chapter 1, proven in [3] for continuous-time systems, are also valid for Theorem 2.2 (for nonlinear time-varying systems with sampled-data feedback), implying that Corollary 2.1 holds and, consequently, the proposed PD-type ILC update law (2.2) is robust and convergent if the sufficient condition for convergence (2.10) is satisfied.

Moreover, for robust convergence the update laws (2.2), (2.5) and (2.28) must contain derivatives of the output [3].
Thus, applying Theorem 2.2 to prove the convergence of the considered ILC algorithm with SD feedback for the nonlinear system in Eq. (2.6), one can obtain from Eq. (2.10) the simplest ($\alpha = 0$, $\mathbf{y}_l = \mathbf{x}_l$) sufficient condition for convergence of the proposed ILC design for robotic manipulators [3]:

$$\|\mathbf{I} - \mathbf{L}\mathbf{A}^{-1}\| \le \rho < 1, \quad (2.29)$$
where I is the identity matrix of size n.
Consequently, the learning operator remains to be specified so that the sufficient condition for robustness and convergence, Eq. (2.29), holds.
2.4 Simulation results
In this section, we present the simulation results from the implementation of the ILC with SD feedback described in the previous sections. We consider the dynamic model of the PUMA 560 robot (6 DOF) reported in [31] and given by Eq. (2.1).
Table 2.1. Model parameters of the virtual robot arm.

ξ       Links:                 1       2       3        4        5       6
Mass values [kg] (without wrist)
ξ_i1    m_i, i = 1,...,6       -       17.4    4.8      0.82     0.35    0.09
Link centre of gravity (COG) [mm]
ξ_i2    S_xi                   -       68      0        0        0       0
ξ_i3    S_yi                   -       6       -70      0        0       0
ξ_i4    S_zi                   -       -16     14       -19      0       32
Moments of inertia about COG [kgm^2]
ξ_i5    I_xxi                  -       0.130   66.0e-3  1.8e-3   300e-6  150e-6
ξ_i6    I_yyi                  -       0.524   12.5e-3  1.8e-3   300e-6  150e-6
ξ_i7    I_zzi                  0.35    0.539   86.0e-3  1.3e-3   400e-6  40e-6
Effective motor inertia [kgm^2]
ξ_i8    J_mi                   0.784   2.305   0.576    0.1057   0.0946  0.1074
Kinematic constants
ξ_i9    α_{i-1} [deg.]         0       -90     0        90       -90     90
ξ_i10   a_{i-1} [m]            0       0       0.4318   -0.0203  0       0
ξ_i11   d_i [m]                0       0.2435  -0.0934  0.4331   0       0
Viscous friction coefficients [Nmsec/rad]
ξ_i12   δ_i, dq_i > 0          3.45    8.53    3.02     -        -       -
        δ_i, dq_i < 0          4.94    7.67    3.27     -        -       -
Coulomb friction coefficients [Nm]
ξ_i13   f_i, dq_i > 0          8.26    11.34   5.57     -        -       -
        f_i, dq_i < 0          8.43    12.77   5.93     -        -       -
For a realistic computer simulation of parameter uncertainty, we are going to use two sets of model parameters. We assume the explicit dynamic model of the PUMA 560 robot arm (left-hand side of Eq. (2.1)) with parameters reported by Armstrong et al. [31], Corke, and Armstrong-Helouvry [32]. This first set of model parameters is described in Table 2.1, and we consider them as virtual parameters $\xi_i = (\xi_{i1}, \ldots, \xi_{i13})$, $i = 1, \ldots, 6$, where $i$ corresponds to the link number.
Table 2.2. Model parameter estimates of the virtual robot arm.

ξ̂       Links:                 1       2       3        4        5       6
Mass values [kg] (without wrist)
ξ̂_i1    m̂_i, i = 1,...,6       -       22.4    5        1.2      0.62    0.16
Link centre of gravity (COG) [mm]
ξ̂_i2    Ŝ_xi                   -       103     20       0        0       0
ξ̂_i3    Ŝ_yi                   -       5       -4       -3       -1      0
ξ̂_i4    Ŝ_zi                   -       -40     14       -86      -10     3
Moments of inertia about COG [kgm^2]
ξ̂_i5    Î_xxi                  -       0.403   74.8e-3  5.32e-3  487e-6  123e-6
ξ̂_i6    Î_yyi                  -       0.969   7.3e-3   5.20e-3  482e-6  123e-6
ξ̂_i7    Î_zzi                  0.177   0.965   75.6e-3  3.37e-3  572e-6  58e-6
Effective motor inertia [kgm^2]
ξ̂_i8    Ĵ_mi                   0.776   2.34    0.5823   0.1057   0.0946  0.1074
Kinematic constants
ξ̂_i9    α_{i-1} [deg.]         0       -90     0        90       -90     90
ξ̂_i10   a_{i-1} [m]            0       0       0.4318   -0.0203  0       0
ξ̂_i11   d_i [m]                0       0.2435  -0.0934  0.4331   0       0
Viscous friction coefficients [Nmsec/rad]
ξ̂_i12   δ̂_i, dq_i > 0          3.85    8.89    5.31     -        -       -
        δ̂_i, dq_i < 0          3.20    11.7    2.91     -        -       -
Coulomb friction coefficients [Nm]
ξ̂_i13   f̂_i, dq_i > 0          6.74    13.0    5.87     -        -       -
        f̂_i, dq_i < 0          7.24    14.7    4.19     -        -       -
Torque limits [Nm]
ξ̂_i14   Û_imax                 100     180     90       25       25      25
For the learning-control-input synthesis, we assume the PUMA 560 model parameters estimated by Tarn et al. [33]. This second set of parameters, denoted $\hat{\xi}_i = (\hat{\xi}_{i1}, \ldots, \hat{\xi}_{i14})$, $i = 1, \ldots, 6$, is shown in Table 2.2, and we consider it as the set of identification estimates of the virtual parameters.

From Eq. (2.29), following Arimoto's ideas in [1] for a better convergence rate, we consider a learning operator as close as possible to the inertia matrix $\mathbf{A}$. Therefore, we propose the learning operator to be identically equal to $\hat{\mathbf{A}}$, i.e. $\mathbf{L} \equiv \hat{\mathbf{A}}$ [14]. Thus, the ILC synthesis for robotic manipulators is completed by the specification of the learning operator.
Let the desired trajectory be given in generalized (joint) coordinates by:

$$q^1_d(t) = 0; \quad q^2_d(t) = 2\cos(4t) - 2; \quad q^3_d(t) = \cos(4t) - 1;$$
$$q^4_d(t) = 3\cos(4t) - 3; \quad q^5_d(t) = 1.5\cos(4t) - 1.5; \quad q^6_d(t) = 0, \quad (2.30)$$

and $t \in [0, 0.25\pi]$.
Thus, for $q^i_l \in [-2\pi, 2\pi]$, $i = 1, 2, \ldots, 6$, and $\mathbf{L} \equiv \hat{\mathbf{A}}$, combining Eq. (2.1) with Eq. (2.2) and Eq. (2.3) yields the control law of the proposed NILC with SD feedback:

$$\mathbf{A}\ddot{\mathbf{q}}_{l+1} + \mathbf{b} + D\dot{\mathbf{q}}_{l+1} + \mathbf{g} + \mathbf{f} = \mathbf{u}_l + \hat{\mathbf{A}}(\mathbf{q}_l)[L_v(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}_l) + L_p(\mathbf{q}_d - \mathbf{q}_l)] + \hat{\mathbf{A}}^k[K_v(\dot{\mathbf{q}}^k_d - \dot{\mathbf{q}}^k_{l+1}) + K_p(\mathbf{q}^k_d - \mathbf{q}^k_{l+1})] + \hat{\mathbf{b}}^k + \hat{D}\dot{\mathbf{q}}^k_{l+1} + \hat{\mathbf{g}}^k + \hat{\mathbf{f}}^k, \quad (2.31)$$
where $l \in \{0, 1, \ldots, N\}$ and $\hat{\mathbf{A}}, \hat{\mathbf{b}}, \hat{D}, \hat{\mathbf{g}}$, and $\hat{\mathbf{f}}$, calculated by using the parameters in Table 2.2, are the corresponding estimates of $\mathbf{A}, \mathbf{b}, D, \mathbf{g}$, and $\mathbf{f}$ in Eq. (2.1), calculated by using the parameters in Table 2.1.
We are able to validate inequality (2.29) for $q^i_l \in [-2\pi, 2\pi]$, $i = 1, \ldots, 6$. A computation of the maximum of the left-hand side of (2.29) results in:

$$\max_{q^i_l}\|\mathbf{I} - \hat{\mathbf{A}}(\mathbf{q}_l, \hat{\boldsymbol{\xi}})\mathbf{A}^{-1}(\mathbf{q}_l, \boldsymbol{\xi})\| = 0.6013,$$

with a tolerance of 0.0018, and consequently the sufficient condition for convergence (2.29) holds with respect to $q^i_l \in [-2\pi, 2\pi]$, $i = 1, \ldots, 6$, and the NILC procedure proposed in Eq. (2.31) is convergent.
The maximal error of the iterative learning procedure is given by:

$$e^{\max} = \max_l(e^{\max}_l), \quad l = 0, 1, \ldots, N, \quad (2.32)$$

where:

$$e^{\max}_l = \max_t\|\mathbf{q}_l(t) - \mathbf{q}_d(t)\|, \quad t \in [0, T]. \quad (2.33)$$
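Eqs. (2.32)-(2.33) translate directly into code; the sketch below operates on hypothetical trajectory arrays (the data and names are ours, not the simulation output of the text).

```python
import numpy as np

def max_errors(q_iters, q_d):
    """Implement Eqs. (2.32)-(2.33).

    q_iters: list of (n_time, n_joints) output trajectories, one per
    iteration; q_d: the desired trajectory on the same time grid.
    Returns (e_l_max per iteration, overall e_max)."""
    e_l = [np.max(np.linalg.norm(q - q_d, axis=1)) for q in q_iters]
    return e_l, max(e_l)

# Hypothetical two-iteration example on a 3-sample, 2-joint grid.
q_d = np.zeros((3, 2))
e_l, e_max = max_errors([np.full((3, 2), 0.2), np.full((3, 2), 0.1)], q_d)
```

Plotting `e_l` against the iteration number gives exactly the convergence profiles shown in Fig. 2.5.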
In fact, there are great differences between the values of $\boldsymbol{\xi}$ and $\hat{\boldsymbol{\xi}}$ reported in Table 2.1 and Table 2.2, respectively. Consequently, a significant maximal initial tracking error $e^{\max}_0$ is expected. Therefore, solving the nonlinear differential equation (2.1) for $l = 0$ and $K_v = 6$ and $K_p = 3$, with $\mathbf{u}_0(t) \equiv \mathbf{0}$ and $\mathbf{u}_c = \mathbf{u}^{*,t}_c(\mathbf{q}_0, \hat{\boldsymbol{\xi}}, t)$ in Eq. (2.3), for a sampling interval $t_s = 0.1$ [sec], and with initial conditions $\mathbf{q}_0(0) = \mathbf{0}$ and $\dot{\mathbf{q}}_0(0) = \mathbf{0}$, results in trajectories $\mathbf{q}_0(t)$ ($q^i_0(t)$), which are depicted in Fig. 2.4 together with the desired trajectories ($q^i_d(t)$).
Fig. 2.4. The initial and desired trajectories: (a) $q^i_0$, $q^i_d$, $i = 1, 2, 3$; (b) $q^i_0$, $q^i_d$, $i = 4, 5, 6$.
Simulation results for the first iteration, shown in Fig. 2.4(a), reveal significant initial tracking errors of the second and third joints. It has to be mentioned that the on-line computation time of the feedback control (the computed torque calculated by MATLAB software) is between 0.0620 sec and 0.0940 sec; consequently, the sampling interval of 0.1 sec is adequate for the simulation. The time of the off-line computation (update) of the feedforward control is approximately 30 sec, depending on the solution of the dynamic equations; in a real implementation this calculation time could be reduced by the use of an optimal spline approximation and a powerful computer system. The initial tracking error e_0^max given by Eq. (2.33) for l = 0 arises not only from the parameter uncertainties but also from the large sampling interval Δt_s = 0.1 [sec]. For example, e_0^max = 3.1944 [rad] for Δt_s = 0.1 [sec] and e_0^max = 2.8745 [rad] for Δt_s = 0.0 [sec]. The calculation of the maximal tracking errors e_l^max for l = 0,1,…,20 by solving the nonlinear differential equation (2.31) (K_v = 6, K_p = 3, L_v = 0, L_p = 0, u_0(t) = 0, Δt_s = 0.1 [sec], q_l(0) = 0, q̇_l(0) = 0) reveals the convergence behavior and the convergence rate of the proposed NILC with SD feedback. The profile of e_l^max versus the iteration number is depicted (continuous line) in Fig. 2.5.
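The convergence profile of such a scheme can be reproduced on a minimal one-degree-of-freedom stand-in: a true inertia, an inaccurate model estimate, a zero-order-hold (sampled-data) computed-torque feedback, and an iteratively learned feedforward term. Every number below is illustrative; this is not the PUMA560 model.

```python
import numpy as np

# Toy NILC with sampled-data feedback on the plant m*q'' = u_ff + u_c.
m, m_hat = 2.0, 1.5                      # true inertia vs. model estimate
Kv, Kp, Lv, Lp = 6.0, 3.0, 6.0, 3.0
T, h, dt_s = 2 * np.pi, 1e-3, 0.1        # horizon, solver step, sampling
t = np.arange(0.0, T, h)
qd, dqd, ddqd = 1 - np.cos(t), np.sin(t), np.cos(t)
hold = round(dt_s / h)                   # solver steps per feedback sample

def run_iteration(u_ff):
    """Integrate the plant with the feedback held between samples (ZOH)."""
    q, v = np.zeros_like(t), np.zeros_like(t)
    u_c = 0.0
    for k in range(len(t) - 1):
        if k % hold == 0:                # refresh the ZOH feedback
            u_c = m_hat * (ddqd[k] + Kv * (dqd[k] - v[k]) + Kp * (qd[k] - q[k]))
        v[k + 1] = v[k] + h * (u_ff[k] + u_c) / m
        q[k + 1] = q[k] + h * v[k]
    return q, v

u, errors = np.zeros_like(t), []
for l in range(8):
    q, v = run_iteration(u)
    errors.append(np.max(np.abs(qd - q)))
    ddq = np.gradient(v, h)              # measured acceleration
    u = u + m_hat * ((ddqd - ddq) + Lv * (dqd - v) + Lp * (qd - q))
print(["%.4f" % e for e in errors])      # e_l^max should shrink with l
```

Here |1 − m̂/m| = 0.25 < 1, so the sufficient condition of the contraction type holds and the per-iteration maximal error decreases.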
Fig. 2.5. e_l^max obtained by the standard (continuous line) and bounded-error (dotted line) algorithms.
Referring to Fig. 2.5, one can see a fast and monotonic convergence of the iterative procedure; obviously, in this case the maximal learning error e^max given by Eq. (2.32) equals the initial tracking error e_0^max. However, a problem for the practical application of the considered NILC arises because of the large initial error e_0^max; this problem can be solved by using the "Bounded-error algorithm" for NILC presented in Chapter 1.

The profile of e_l^max obtained by the "Bounded-error algorithm" (with a 0.5 [rad] error-norm bound) is shown (dotted line) in Fig. 2.5.
In order to illustrate how the feedforward control term u_l(t), t ∈ [0, 0.25π], compensates for the uncertainties of the sampled-data feedback control term u_c^{*,Δt}, we investigate by simulation the control input at the final, 20th iteration, which results in the profiles of u_c^{*,Δt}, u_20 and u shown in Fig. 2.6.

The graphs in Fig. 2.6(a), Fig. 2.6(b) and Fig. 2.6(c) show the feedback (u_c^i, u_c = u_c^{*,Δt}), feedforward (u_20^i) and resultant (u^i = u_c^i + u_20^i, i = 1,2,3) control inputs for the first, second and third joints, respectively, while Fig. 2.6(d), Fig. 2.6(e) and Fig. 2.6(f) illustrate the same control terms for the fourth, fifth and sixth joints of the robot arm.
Fig. 2.6. Feedback u_c^i: (a) i = 1–3, (d) i = 4–6; feedforward u_20^i: (b) i = 1–3, (e) i = 4–6; u^i = u_c^i + u_20^i: (c) i = 1–3, (f) i = 4–6. Continuous line: i = 1,4; dashed line: i = 2,5; dotted line: i = 3,6.
Finally, the way the piecewise-constant feedback control (dotted line) and the piecewise-continuous feedforward control (dashed line) add up to the continuous control input (continuous line) of the second joint is schematically represented in Fig. 2.7.
The simulation result shows that the piecewise-continuous feedforward control compensates for the inaccuracies (instability) that arise from the piecewise-constant feedback with a sampling interval of 0.1 sec. This sampling interval is relatively long and, given the capabilities of modern CPUs, makes real-time implementation (real-time computation) of complex dynamics-based feedback control possible.
Fig. 2.7. Feedback u_c^2 (dotted line), feedforward u_20^2 (dashed line) and u^2 = u_c^2 + u_20^2 (continuous line).
2.5 Conclusions

The continuous nonlinear iterative learning control with sampled-data feedback proposed in this paper for continuous nonlinear multiple-input multiple-output time-varying systems, such as robotic manipulators, consists of two controllers (see Eq. (2.31)): a continuous-time learning controller (see Eq. (2.2)) and a sampled-data (SD) feedback controller (see Eq. (2.3)). An off-line ILC scheme with SD feedback is presented in Fig. 2.3. The SD feedback is required for feedback controllers based on complicated dynamic models, for instance, dynamic models of 6- or 7-DOF robotic manipulators (see Eq. (2.31)). In this case, the nonlinear model-based feedback control requires many on-line calculations, which cause a time delay and a corresponding sampling interval that should not be neglected, while the learning controller requires only off-line calculations; therefore, the nonlinear model-based feedforward control can be assumed to be continuous.
We came to the following conclusions for the proposed NILC with SD
feedback for robotic manipulators:
- The robustness and convergence of the proposed NILC with SD feedback are proven (by Theorem 2.2) with respect to the time-weighted norm (λ-norm) of the control input error.
- The sufficient condition for the robustness and convergence of the proposed NILC with SD feedback is the same as the corresponding condition of a standard NILC with a continuous feedback controller attached, implying that the convergence does not depend on the sampling interval, as is the case for other types of Sampled-data ILC (Fig. 2(a)) and Multirate ILC (Fig. 2(b)) controllers.
- The proposed NILC with SD feedback for the PUMA560 robotic manipulator is robust and convergent.
- The analysis of the iterative learning control simulations on the PUMA560 robot reveals that at the final iteration the piecewise feedforward control term successfully compensates for the uncertainties of the piecewise-constant feedback control term, so that the sum of both terms is a continuous control input.
- The proposed solution to the instability problem in the Sampled-Data feedback is the main contribution of this paper to the ILC implementation in practice.
In all practical cases of robotic manipulators the input signal belongs to a compact subset and the output trajectory belongs to an allowable subset (q_l^i ∈ [−2π, 2π], i = 1,2,…,6). If the ILC update law (see Eq. (2.2)) produces a feedforward input that lies outside the allowable compact set of inputs, and this set is convex, then projecting it back into the allowable set ensures the validity of Theorem 2.1. If the output trajectories are outside the corresponding allowable set due to trajectory errors, then an application of the "Bounded-error algorithm" presented in Chapter 1 can solve this problem for the NILC.
The main contribution of Theorem 2.2 is that it proves the asymptotic convergence (e_l^max → 0 as l → ∞) of the ILC with a Sampled-Data feedback controller, because only the asymptotic convergence of an ILC can assure its practical implementation.
Chapter 3
Iterative learning control with bounded transient
errors: an application for PUMA 560 robotic
manipulator
3.1 Introduction
Iterative learning control (ILC) for robotic manipulators is a class of self-tuning algorithms that repeatedly execute assigned robot-motion tasks in order to minimize positioning or trajectory-tracking errors, implementation time, or power costs, or to improve other motion characteristics. Thus, the ILC is applicable to industrial robots, which work in a repeatable manner in a deterministic environment.
The main advantages of the ILC for robotic manipulators are as follows. The ILC minimizes the trajectory-tracking errors that arise from the unmodeled dynamics of an actual robot. Theoretically, using only a feedback controller and increasing the corresponding feedback gains also leads to smaller tracking errors [36,37,38], but in practice it may result in an unstable robot motion [37,38]. That is why a precise feedforward control, obtained by an ILC procedure, can improve tracking accuracy and avoid closed-loop stability problems. Moreover, at the end of the ILC procedure the feedback terms of the control input are reduced, so the feed-forward controller dominates the feedback controller [16], which reduces the power costs of the robot motion.
Nevertheless, the practical applicability of the ILC for robotic manipulators suffers from the following basic problems. The number of iterations necessary for a successful ILC procedure is very important for the ILC implementation. Therefore, first, the ILC procedure must converge to a desired trajectory-tracking error, and then an admissible number of iterations should produce the preliminarily given tracking accuracy. Unfortunately, the conditions for ILC convergence proposed in [1,3,35,36] are only sufficient but not necessary. Consequently, an ILC procedure could have a very good rate of convergence even if the corresponding sufficient condition for convergence does not hold [14]. In addition, when these conditions apply, the tracking error converges to zero as the iteration number tends to infinity. So the ILC convergence with respect to different learning operators or learning gains can be examined by computer simulation [1,34-36,14]. If the computer simulation of an ILC procedure reveals a fast convergence for a particular learning operator and desired trajectory, then it is reasonable to expect an acceptable convergence rate during the implementation of this ILC on the actual robot. The validation of the sufficient conditions for ILC convergence may also cause problems due to the necessity of knowing the real robot dynamics. It should be pointed out that only estimates of the dynamic-model parameters are available through identification techniques [31,33,32]. Therefore, to avoid the above-mentioned problems, two conditions for ILC convergence are proposed in [14]. Both conditions concern only the estimates of the dynamic parameters and the accuracy of the applied identification method. An inadmissible error of the initial trajectory tracking can also spoil the practical applicability of the corresponding ILC procedure. As was mentioned, properly tuned feedback gains of a closed-loop ILC can reduce the initial tracking error, but even with a small enough initial tracking error the ILC could fail due to a non-monotonic convergence of the iterative procedure. The ILC computer simulations presented in [35,36,14,15] reveal the non-monotony of the maximal tracking errors versus iterations, which leads to tracking errors of some trials of the ILC procedure that exceed the maximal initial error. An NILC algorithm aimed at solving the problems of the initial tracking error and non-monotonic convergence is proposed in [15]. The algorithm is based on dividing the time interval of the trajectory performance into N subintervals: the NILC is executed for the first subinterval, then for the first and second subintervals, and so forth. Considerable difficulties in the implementation of this algorithm arise from the necessity of determining N through computer simulations so that an acceptable maximal ILC tracking error is obtained.
The objective of this paper is to present and examine, by computer simulation, a new NILC algorithm for robotic manipulators. This algorithm avoids the problems of the practical implementation of the NILC that arise from an inadmissible initial tracking error or from an inadmissible maximal tracking error of the whole NILC procedure; i.e., the new ILC algorithm solves the problem that arises from a possible large transient growth (an increasing transient error). Transient growth is a major problem for a practical ILC application [6], because at the beginning of the iterative procedure the tracking error can temporarily grow very large before converging to a small value. This problem is generally avoided by using some criteria for monotonic convergence [2], but this imposes fundamental performance limitations [39]. The new bounded-error learning algorithm (Chapter 1) does not rely on monotonic convergence to overcome the transient-growth problem; only a classical condition for ILC convergence (Eq. (1.4)) is required.
The chapter is organized as follows. Section 3.2 presents the full dynamic model of the PUMA560, a six-degrees-of-freedom robot [31,32]. Section 3.3 presents the considered NILC for robotic manipulators. The new algorithm for the performance of the NILC procedure is proposed in Section 3.4. The proposed NILC procedure and the new algorithm are validated and examined by computer simulation in Section 3.5.
3.2 Dynamic model of PUMA560 robot
We consider the PUMA560 robotic manipulator with 6 degrees of freedom (6-DOF). The nonlinear MIMO dynamic model of the robot is based on the Lagrange formulation of the equations of motion in the space of generalized coordinates:

\[ A(q)\ddot q + b(q,\dot q) + D\dot q + g(q) + f = u, \tag{3.1} \]

where: q is the 6×1 vector of generalized coordinates (joint angles), q^i ∈ [−2π, 2π], i = 1,…,6; A(q) is the 6×6 symmetric positive-definite inertia matrix; the 6×1 vector b(q, q̇) = B(q)[q̇ q̇] + C(q)[q̇²] takes into account the Coriolis and centrifugal torques, where B(q) and C(q) are the corresponding 6×15 and 6×6 matrices [40]; D = diag{δ_1,…,δ_6} denotes the diagonal 6×6 matrix of the coefficients of viscous friction; g(q) is the 6×1 vector representing the gravity torques; f = [f_1 sign(q̇_1) … f_6 sign(q̇_6)]^T is the vector of the Coulomb friction torques; and u = u_l + u_c is the 6×1 vector of generalized torques, where u_l and u_c are the feed-forward and feedback terms, respectively. The allowable set of generalized torques is a rectangular hyper-parallelepiped: u^i ∈ [−U_i^max, U_i^max]. The symbols [q̇ q̇] and [q̇²] are the 15×1 vector of velocity products and the 6×1 vector of squared velocities, respectively. They are given as follows:

\[ [\dot q\,\dot q] = [\dot q_1\dot q_2,\ \dot q_1\dot q_3,\ \ldots,\ \dot q_{n-1}\dot q_n]^T,\ n = 6; \qquad [\dot q^2] = [\dot q_1^2,\ \ldots,\ \dot q_6^2]^T. \]
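The vectors [q̇ q̇] and [q̇²] entering b(q, q̇) can be assembled mechanically. A minimal sketch follows; the lexicographic ordering of the velocity pairs is an assumption here, since [31] fixes the actual column order of B(q).

```python
import numpy as np
from itertools import combinations

def velocity_products(qdot):
    """Build [q.q.], the 15x1 vector of pairwise velocity products
    q_i' q_j', i < j, and [q.^2], the 6x1 vector of squared joint
    velocities, as used in b(q, q') = B(q)[q.q.] + C(q)[q.^2]."""
    qdot = np.asarray(qdot, dtype=float)
    pairs = np.array([qdot[i] * qdot[j]
                      for i, j in combinations(range(len(qdot)), 2)])
    return pairs, qdot**2

qq, qsq = velocity_products([1, 2, 3, 4, 5, 6])
print(len(qq), len(qsq))   # 15 and 6 for a 6-DOF arm
```

For n joints the pair vector has n(n−1)/2 entries, which gives exactly the 15 columns of B(q) for n = 6.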
This dynamic model is appropriate for computer simulation of a learning control, since the matrices A(q), B(q), C(q) and the vector g(q) are expressed in explicit form in [31].
The coordinate frames attached to the PUMA560 links are shown in Fig. 3.1
and are assigned by the modified Denavit-Hartenberg method [41].
For a realistic computer simulation of the NILC procedure proposed in the next section, we use two sets of model parameters. We assume the explicit dynamic model of the PUMA560 robot arm (the left-hand side of Eq. (3.1)) with the parameters reported by Armstrong et al. [31] and by Corke and Armstrong-Helouvry [32]. This first set of model parameters is described in Table 2.1, and we consider them as virtual parameters ξ_i = (ξ_{1i}, …, ξ_{13i}), i = 1,…,6, where i corresponds to the link number.

Fig. 3.1. Kinematic and design scheme of the PUMA 560 robot arm in its initial position with zero values of the generalized coordinates [14].

For the learning-control-input synthesis, we assume the PUMA 560 model parameters estimated by Tarn et al. [33]. This second set of parameters, denoted ξ̂_i = (ξ̂_{1i}, …, ξ̂_{14i}), i = 1,…,6, is shown in Table 2.2, and we consider it as the set of identification estimates of the virtual parameters.
The joint-torque limits are described only in Table 2.2 because ξ̂ concerns the control input u. These limits are calculated using the parameter estimates of the PUMA560 motors and drives reported in [32].
Another advantage of using the PUMA560 dynamic model is the availability of more than two sets of parameter estimates [32], so that one can select appropriate model parameter values for an accurate computer simulation of the robot motion. However, it is necessary to mention that there are considerable differences between some values of the corresponding parameters reported in Table 2.1 and Table 2.2, respectively, and an inaccurate tracking of the prescribed robot trajectories should be expected. Thus, using inaccurate parameter estimates, we are able to validate, through computer simulation, the efficiency of the new NILC algorithm proposed in Section 3.4.
3.3 Iterative learning control for robotic manipulators
The control system of an industrial robotic manipulator is typically based on a
feedback controller. An improved ILC scheme for a robotic manipulator with a
feedback controller attached is depicted in Fig. 3.2:
Fig. 3.2. Learning control scheme with a feedback controller attached.
where: P represents the plant; C is a feedback controller; the vector of the input trajectory u_l is the feedforward term of the control law; q_l is the actual vector of the output trajectory; and q_d is the vector of the desired output trajectory. The learning controller {L, K} improves u_l by using the tracking error of the output trajectory (q_d − q_l) multiplied by a learning-gain matrix K (P-type learning) and/or the error derivative (q̇_d − q̇_l) multiplied by a learning-operator matrix L (D-type learning). If the learning operator is based on the dynamic model of the plant, then the ILC can be specified as model-based.
The realization of the considered ILC scheme (Fig. 3.2) requires the specification of the feed-forward and feedback controllers, respectively. The feed-forward controller is based on the update control law that improves the feed-forward control term u_l.
For the nonlinear MIMO dynamic model (3.1), we propose the following
update control law:
\[ u_{l+1}(t) = u_l(t) + L(q_l(t))\,[(\ddot q_d(t) - \ddot q_l(t)) + L_v(\dot q_d(t) - \dot q_l(t)) + L_p(q_d(t) - q_l(t))], \tag{3.2} \]

where: L(q_l(t)), l = 0,1,…,N, is a learning operator; u_0(t) is the initial feed-forward control input; t ∈ [0,T] is the tracking time and [0,T] is the robot tracking-time interval; L_v and L_p are positive constants.
The convergence of the iterative procedure (3.2) is necessary for its successful implementation. A sufficient condition for the robustness and convergence of the proposed update control law (3.2) is proven in [3] for the considered dynamic model of robotic manipulators (3.1), i.e.:

\[ \| I - L^{-1}A \| < 1, \tag{3.3} \]

where I is the identity matrix of size 6; A is the inertia matrix of the dynamic equations of motion (3.1); L is the learning operator that is to be specified; and ‖·‖ is the Euclidean matrix norm.
Unfortunately, the sufficient condition (3.3) is not applicable in practice, because the inertia matrix A of an actual robot is not exactly known and only an estimate of A may be available through identification [31,33]. Therefore, we have to specify a learning operator that satisfies (3.3), but we are not able to obtain the learning operator directly from (3.3). That is why, following ideas from [1] for a better convergence rate, we consider a learning operator that is as close as possible to the inertia matrix; in particular, it is taken to be identically equal to an estimate of the inertia matrix, L(q, ξ̂) = Â(q, ξ̂) [14,15], where ξ̂ is the vector of estimated parameters stated in Table 2.2. Thus, for a virtual PUMA560 robot arm based on the dynamic model (3.1) with the parameters ξ stated in Table 2.1, from (3.3) we obtain the sufficient condition for convergence in the case L(q, ξ̂) = Â(q, ξ̂), i.e.:

\[ \| I - \hat A^{-1}(q,\hat\xi)\, A(q,\xi) \| < 1. \tag{3.4} \]
This condition is also not applicable to a real robotic manipulator, but it is applicable to a computer simulation of the virtual robot. So, using the explicit formulation of A reported in [31] and the parameter values of ξ and ξ̂ from Table 2.1 and Table 2.2, respectively, we are able to validate inequality (3.4) for q^i ∈ [0, 2π], i = 1,…,6. A computation of the maximum of the left-hand side of (3.4) results in:

\[ \max_q \| I - \hat A^{-1}(q,\hat\xi)\, A(q,\xi) \| = 0.6013, \quad q^i \in [0, 2\pi], \tag{3.5} \]

with a tolerance of 0.0018. From (3.5) it follows that the sufficient condition for convergence (3.4) holds with respect to q^i ∈ [0, 2π], i = 1,…,6, and consequently the proposed NILC procedure (3.2) is convergent. Moreover, since the inertia matrix A is a periodic function of q with period 2π, i.e. A(q) = A(sin(q_i), cos(q_i)), i = 1,…,6 [31], the learning procedure (3.2) is convergent in the whole space of generalized coordinates q^i ∈ (−∞, ∞), i = 1,…,6.
Before specifying u_0(t), L_v and L_p, it is reasonable to formulate the feedback term u_c of the control law u = u_l + u_c. Therefore, we consider the following stabilizing feedback term [22]:

\[ u_c(q_l,\hat\xi) = \hat A(q_l,\hat\xi)\,[\ddot q_d + K_v(\dot q_d - \dot q_l) + K_p(q_d - q_l)] + \hat b(q_l,\dot q_l,\hat\xi) + \hat D(\hat\xi)\dot q_l + \hat g(q_l,\hat\xi) + \hat f(\hat\xi), \quad l = 1,\ldots,N, \tag{3.6} \]

where, according to Fig. 3.2, q_l and q̇_l are the feedback generalized coordinates and velocities of the output trajectory before it is memorized at the current iteration l, and Â, b̂, D̂, ĝ and f̂, obtained by means of the parameters from Table 2.2, are the corresponding estimates of A, b, D, g and f of Eq. (3.1).
Using the parameters from Table 2.1 and Table 2.2 and combining Eqs. (3.1), (3.2) and (3.6) yields the following NILC law for robotic manipulators in the case L(q, ξ̂) = Â(q, ξ̂):

\[
\begin{aligned}
& A(q_{l+1},\xi)\ddot q_{l+1} + b(q_{l+1},\dot q_{l+1},\xi) + D(\xi)\dot q_{l+1} + g(q_{l+1},\xi) + f(\xi) = u_{l+1}(t,\hat\xi) + u_c(q_{l+1},\hat\xi),\\
& u_{l+1}(t,\hat\xi) = u_l(t,\hat\xi) + \hat A(q_l(t),\hat\xi)\,(\ddot q_d(t) - \ddot q_l(t))\\
& \qquad\qquad + \hat A(q_l(t),\hat\xi)\,[L_v(\dot q_d(t) - \dot q_l(t)) + L_p(q_d(t) - q_l(t))],\\
& u_c(q_{l+1},\hat\xi) = \hat A(q_{l+1},\hat\xi)\,[\ddot q_d + K_v(\dot q_d - \dot q_{l+1}) + K_p(q_d - q_{l+1})] + \hat b(q_{l+1},\dot q_{l+1},\hat\xi)\\
& \qquad\qquad + \hat D(\hat\xi)\dot q_{l+1} + \hat g(q_{l+1},\hat\xi) + \hat f(\hat\xi),\\
& q_{l+1}(0) = q_d(0),\quad \dot q_{l+1}(0) = \dot q_d(0),\quad t \in [0,T],\quad l = 0,\ldots,N.
\end{aligned}
\tag{3.7}
\]
Assuming u_0(t) = 0, from (3.7) for l = 0 we obtain the initial robot control for the proposed iterative process as follows:

\[
\begin{aligned}
& A(q_1,\xi)\ddot q_1 + b(q_1,\dot q_1,\xi) + D(\xi)\dot q_1 + g(q_1,\xi) + f(\xi)\\
& \quad = \hat A(q_1,\hat\xi)\,[\ddot q_d + K_v(\dot q_d - \dot q_1) + K_p(q_d - q_1)] + \hat b(q_1,\dot q_1,\hat\xi) + \hat D(\hat\xi)\dot q_1 + \hat g(q_1,\hat\xi) + \hat f(\hat\xi).
\end{aligned}
\tag{3.8}
\]
If ξ = ξ̂, and consequently Â(q) = A(q), b̂(q, q̇) = b(q, q̇), D̂ = D, ĝ(q) = g(q) and f̂ = f, then from (3.8) we obtain the initial-error dynamics:

\[ A(q_1,\xi)\,(\ddot e_1 + K_v\dot e_1 + K_p e_1) = 0, \tag{3.9} \]

where e_l := q_d − q_l, l = 1. In this case, taking into account that A(q, ·) is positive-definite, the initial feedback control, Eq. (3.8), is asymptotically stable if K_v and K_p are Hurwitz matrices [22]. Therefore, for simplicity, we assume K_v = K_v I and K_p = K_p I, where K_v and K_p are positive constants.
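With scalar gains, each joint of (3.9) reduces to ë + K_v ė + K_p e = 0, whose error decays for any K_v, K_p > 0. A minimal numerical sketch with the gains used in the simulations below:

```python
# Per-joint error dynamics of Eq. (3.9) with Kv = 6, Kp = 3:
# e'' + Kv e' + Kp e = 0, integrated by explicit Euler from a
# unit initial error (illustrative check of asymptotic stability).
Kv, Kp = 6.0, 3.0
h, steps = 1e-3, 20000          # step size, 20 s horizon
e, de = 1.0, 0.0
for _ in range(steps):
    dde = -Kv * de - Kp * e
    e, de = e + h * de, de + h * dde
print(abs(e))                   # decays toward zero for positive gains
```

The characteristic roots here are real and negative (about −0.55 and −5.45), which is why the slower mode dominates the decay rate of the initial tracking transient.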
In fact, there are great differences between the values of ξ and ξ̂ reported in Table 2.1 and Table 2.2, respectively. Consequently, if the values of K_v and K_p are not specified properly, a significant maximal initial tracking error

\[ e_1^{\max} = \max_t \| e_1(t) \|, \quad t \in [0,T], \tag{3.10} \]

is expected because of the initial control (3.8). On the other hand, increasing the feedback gains K_v and K_p leads to a reduction of e_1^max [36-38]. That is why, by varying the values of K_v and K_p, we are able to examine, through computer simulation, the new NILC algorithm proposed in the next section in the case of a large or small initial error e_1^max.
The maximal error of the iterative learning procedure:

\[ e^{\max} = \max_l (e_l^{\max}), \quad l = 1,\ldots,N, \tag{3.11} \]

where:

\[ e_l^{\max} = \max_t \| q_l(t) - q_d(t) \|, \quad t \in [0,T],\ l = 1,\ldots,N, \tag{3.12} \]

is of great importance for the applicability of the proposed NILC (3.7). Therefore, we are also in a position to vary the learning gains L_v and L_p in order to examine the new NILC algorithm in the case of a large transient error e^max.
3.4 Iterative learning algorithm with a bounded
tracking error
In this section, we present the main result of this paper, consisting of an implementation of the new bounded-error algorithm of Chapter 1 for the proposed NILC for robot manipulators (Eq. (3.7)). The algorithm solves problems of the practical application of the NILC arising from inadmissible maximal transient errors e_l^max (Eq. (3.12)).
Our previous simulation results presented in Chapter 1 and Chapter 2 [14,15], and similar results reported in [35,36], reveal the non-monotony of the maximal tracking errors e_l^max, l = 1,…,N (Eq. (3.11)) versus iterations, as shown in Fig. 3.3.
Fig. 3.3. Convergence of the maximal trajectory error e_l^max [rad] of the PUMA560 tracking performance [14].
The maximal transient error e^max (Eq. (3.12)) of the iterative procedure depicted in Fig. 3.3 equals the corresponding initial error e_1^max (Eq. (3.10)), and consequently increasing the feedback gains K_v and K_p (Eq. (3.9)) could reduce it [36-38]; in practice, however, for a digital controller with a time delay, increasing the feedback gains may cause tracking instability [37,38]. On the other hand, a small initial error e_1^max does not guarantee a correspondingly smaller maximal transient error e^max, as shown in Fig. 2, Fig. 1.2 and Fig. 3.4.

For instance, if the initial error e_1^max is admissible for an assigned task and the transient error e^max is larger than e_1^max (Fig. 3.4), then the ILC procedure could fail due to a collision between the robot arm and some obstacle in the environment. That is why an algorithm for the ILC performance with a bounded maximal transient error e^max is of great importance for the ILC implementation in practice.
The main idea of the new algorithm is that at each iteration the robot motion trajectory must stay inside a hyper-tube of width 2ε around the desired trajectory in the space of generalized coordinates, i.e.:

Fig. 3.4. Maximal tracking error [rad] vs. iterations.

\[ q_l(t) \in \{\, q(t) : \| q(t) - q_d(t) \| \le \varepsilon,\ t \in [0, T_l^f],\ T_l^f \in (0, T] \,\}, \quad l = 1,\ldots,N, \tag{3.13} \]
where ε > 0 is a preliminarily given error bound, and q_l(t), t ∈ [0, T_l^f], T_l^f ∈ (0, T], is the solution of the differential equation (3.7) with the initial conditions q_l(0) = q_d(0) and q̇_l(0) = q̇_d(0) for the corresponding iteration l. Thus, starting from the initial position q_l(0) with the initial velocity q̇_l(0), the trajectory-tracking motion of the robot continues while the tracking-error norm is less than ε, and the robot stops if this norm becomes equal to ε (inequality (3.13)). Obviously, an immediate stop of the robot motion is impossible. Consequently, the condition for stopping the robot has to be more restrictive than (3.13), so that the robot arm starts decelerating in advance in order for inequality (3.13) to remain valid when it stops. Therefore, the robot should start decelerating when the following condition holds:

\[ \| q_l(T_l^s) - q_d(T_l^s) \| = \hat\varepsilon, \quad \hat\varepsilon > 0,\ T_l^s \in (0, T],\ l = 1,\ldots,N, \tag{3.14} \]

where ε̂ = ε − Δq with Δq ≥ Δq_i, i = 1,…,6, and Δq_i is the joint-angle displacement necessary for the corresponding joint to come to a stop from maximum velocity at maximum deceleration when (3.14) holds. Thus, condition (3.14) guarantees that inequality (3.13) holds and the tracking error e_l^max, l = 1,…,N, is bounded by ε.
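The braking displacement needed in (3.14) follows from uniform deceleration: a joint braking from its maximum velocity at constant maximum deceleration travels Δq_i = (q̇_i^max)² / (2 q̈_i^max). The sketch below uses illustrative velocity and deceleration limits, not the PUMA560 drive data of [32], and the way ε̂ is tightened from ε is an assumption consistent with the text above.

```python
import numpy as np

def stop_displacement(v_max, a_max):
    """Joint-angle displacement traveled while braking from maximum
    velocity v_max at constant maximum deceleration a_max:
    dq = v_max**2 / (2 * a_max), used to tighten the stop threshold
    in condition (3.14)."""
    v_max = np.asarray(v_max, dtype=float)
    a_max = np.asarray(a_max, dtype=float)
    return v_max**2 / (2.0 * a_max)

# illustrative per-joint limits [rad/s] and [rad/s^2] (hypothetical)
v_lim = np.array([3.0, 3.0, 3.0, 4.0, 4.0, 4.0])
a_lim = np.array([10.0, 10.0, 10.0, 20.0, 20.0, 20.0])
dq = stop_displacement(v_lim, a_lim)
eps = 0.7
eps_hat = eps - dq.max()        # one way to pick eps_hat (assumption)
print(dq.max(), eps_hat)        # dq.max() = 0.45 with these limits
```

The worst-case joint dictates how much earlier than the ε boundary the deceleration must begin.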
Given an attainable desired trajectory q_d(t), q_d^i(t) ∈ [−2π, 2π], i = 1,…,6, t ∈ [0, T], the feedback gains K_v and K_p, the learning gains L_v and L_p, and an error-norm bound ε̂ > 0, the new algorithm for the implementation of the NILC procedure (Eq. (3.7)) can be formulated as follows:

I. Set the initial iteration number l = 0 and begin the iterative procedure.

II. Starting from the initial position q_l(0) = q_d(0) with the initial velocity q̇_l(0) = q̇_d(0), the robot arm follows the desired trajectory under the NILC law, described on the right-hand side of Eq. (3.7), until ||q_l(t) − q_d(t)|| = ε̂, t ∈ [0, T_l^s), or until the end tracking position q_l(T) is reached. If during the current tracking performance condition (3.14) becomes valid at the very time t = T_l^s, T_l^s ∈ (0, T], then the robot arm comes to a stop as fast as possible at the maximum deceleration.
III. After the current tracking performance has finished, the learning controller updates the feed-forward control term according to the following learning update law:

\[
\begin{aligned}
& u_{l+1}(t, T_l^s, \hat\xi) = u_l(t, T_l^s, \hat\xi) + u_l^*(t, T_l^s, \hat\xi), \qquad u_0(t, T_0^s, \hat\xi) = 0,\\
& u_l^*(t, T_l^s, \hat\xi) = \hat A(q_l(t),\hat\xi)\,[\ddot q_d(t) - \ddot q_l(t) + L_v(\dot q_d(t) - \dot q_l(t))]\\
& \qquad\qquad + \hat A(q_l(t),\hat\xi)\,L_p(q_d(t) - q_l(t)), \quad t \in [0, T_l^s],\ T_l^s \in (0, T];\\
& u_l^*(t, T_l^s, \hat\xi) = 0, \quad t \in (T_l^s, T],
\end{aligned}
\tag{3.15}
\]

which is obtained from (3.2) for the resultant trajectory q_l(t), t ∈ [0, T_l^s], T_l^s ∈ (0, T].
IV. If the maximal tracking error e_l^max for q_l(t), t ∈ [0, T_l^s], is less than or equal to an acceptable tracking accuracy, then exit the learning procedure; else set l = l + 1 and go to step II.
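Steps I-IV can be sketched on a toy one-degree-of-freedom plant (not the PUMA560 model; all numbers are illustrative): each trial is cut short as soon as the error norm reaches ε̂, and the feedforward term is updated only on the completed portion [0, T_l^s].

```python
import numpy as np

# Bounded-error NILC (steps I-IV) on the toy plant m q'' = u.
m, m_hat = 2.0, 1.5
Kv, Kp, Lv, Lp = 6.0, 3.0, 6.0, 3.0
eps_hat = 0.5                              # stop threshold of step II
T, h = 2 * np.pi, 1e-3
t = np.arange(0.0, T, h)
qd, dqd, ddqd = 1 - np.cos(t), np.sin(t), np.cos(t)

def trial(u_ff):
    """One tracking trial; stops early once |q - q_d| reaches eps_hat
    (step II) and returns the stop index k_s, i.e. T_l^s = t[k_s]."""
    q, v = np.zeros_like(t), np.zeros_like(t)
    for k in range(len(t) - 1):
        if abs(q[k] - qd[k]) >= eps_hat:
            return q, v, k                 # early stop
        u_c = m_hat * (ddqd[k] + Kv * (dqd[k] - v[k]) + Kp * (qd[k] - q[k]))
        v[k + 1] = v[k] + h * (u_ff[k] + u_c) / m
        q[k + 1] = q[k] + h * v[k]
    return q, v, len(t) - 1

u, max_errors = np.zeros_like(t), []
for l in range(6):
    q, v, ks = trial(u)
    max_errors.append(np.max(np.abs(q[:ks + 1] - qd[:ks + 1])))
    corr = m_hat * ((ddqd - np.gradient(v, h)) + Lv * (dqd - v) + Lp * (qd - q))
    u[:ks + 1] += corr[:ks + 1]            # step III: update only on [0, T_s]
print(["%.3f" % e for e in max_errors])    # trial errors stay bounded by eps_hat
```

Even when a trial is truncated, the measured maximal error never exceeds the bound (up to one solver step), which is exactly the guarantee that condition (3.14) provides for the real robot.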
Therefore, from (3.1), (3.6) and (3.15), for K_v = K_v I, K_v > 0, and K_p = K_p I, K_p > 0, it follows that the NILC law of the new algorithm is:

\[
\begin{aligned}
& A(q_{l+1},\xi)\ddot q_{l+1} + b(q_{l+1},\dot q_{l+1},\xi) + D(\xi)\dot q_{l+1} + g(q_{l+1},\xi) + f(\xi) = u_{l+1}(t, T_l^s, \hat\xi) + u_c(q_{l+1},\hat\xi),\\
& u_{l+1}(t, T_l^s, \hat\xi) = u_l(t, T_l^s, \hat\xi) + u_l^*(t, T_l^s, \hat\xi), \quad t \in [0, T],\ T_l^s \in (0, T],\\
& T_l^s:\ \| q_l(T_l^s) - q_d(T_l^s) \| = \hat\varepsilon,\\
& u_c(q_{l+1},\hat\xi) = \hat A(q_{l+1},\hat\xi)\,[\ddot q_d + K_v(\dot q_d - \dot q_{l+1}) + K_p(q_d - q_{l+1})] + \hat b(q_{l+1},\dot q_{l+1},\hat\xi)\\
& \qquad\qquad + \hat D(\hat\xi)\dot q_{l+1} + \hat g(q_{l+1},\hat\xi) + \hat f(\hat\xi),\\
& q_{l+1}(0) = q_d(0),\quad \dot q_{l+1}(0) = \dot q_d(0),\quad u_0(t, T_0^s, \hat\xi) = 0,\quad l = 0,\ldots,N,
\end{aligned}
\tag{3.16}
\]
where ε̂ is the preliminarily given tracking-error norm bound, which fixes T_l^s as described in step II. In this way T_l^s determines the current output trajectory q_l(t), t ∈ [0, T_l^s], which is used for the off-line update of the feed-forward term u_{l+1}(t) according to (3.16).
The NILC procedure described above (Eq. (3.16)) is robust and convergent if the sufficient condition (3.4) holds. The proof is presented in Appendix B.
3.5 Simulation results and discussions
In this section, we present the simulation results from the implementation of the NILC algorithm (Eq. (3.16)) described in the previous section. These results are compared with the simulation results of the standard NILC algorithm (Eq. (3.2)) in cases of inadmissible initial tracking errors e_1^max (Eq. (3.10)), of inadmissible transient errors e_l^max, l > 1 (Eq. (3.12)), and of both inadmissible initial and transient errors. The convergence rate of both NILC procedures (Eq. (3.7) and Eq. (3.16)) is also investigated through a simulation study of the corresponding maximal tracking errors e_l^max, l = 1,…,N (Eq. (3.11)) versus iterations.
Using the model parameters stated in Table 2.1 and Table 2.2, we are in a position to perform computer simulations of both the standard and the new (Eq. (3.2) and Eq. (3.15)) learning control procedures for the virtual robot PUMA 560, carrying out the following steps:

I. Define a desired trajectory in the space of generalized coordinates: q_d(t) = (q_d^1(t), …, q_d^6(t))^T, q_d^i(t) ∈ [−2π, 2π], t ∈ [0, T], i = 1,…,6;

II. Define the matrices A(q, ξ), B(q, ξ), C(q, ξ), D and the vectors g(q, ξ) and f using the model parameters of the virtual robot arm (Table 2.1) and the formulas reported in [31];

III. Define the matrices Â(q, ξ̂), B̂(q, ξ̂), Ĉ(q, ξ̂), D̂ and the vectors ĝ(q, ξ̂) and f̂ using the parameter estimates (Table 2.2) and the same formulas [31] in order to synthesize the learning operator L(q, ξ̂) = Â(q, ξ̂) and the feedback control term u_c(q, ξ̂) (Eq. (3.7) or Eq. (3.16)).

IV. Implement by computer simulation the iterative learning procedure, which consists of:
- solution of the differential equation of motion (Eq. (3.7) or Eq. (3.16)) to obtain the resultant output trajectory, taking into account the constraints on the control inputs u^i ∈ [−Û_i^max, Û_i^max], i = 1,…,6 (Table 2.2), i.e.: if u^i > Û_i^max or u^i < −Û_i^max, then u^i = Û_i^max or u^i = −Û_i^max, respectively;
- update of the control input according to the learning update law (Eq. (3.2) or Eq. (3.15)),
until an acceptable accuracy is reached for the tracking error e_l^max, l ≤ N.
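The input-constraint handling in step IV is a componentwise clipping of the generalized torques; a minimal sketch (the numerical limits below are illustrative, not the values of Table 2.2):

```python
import numpy as np

def saturate(u, u_max):
    """Clip each generalized torque to [-U_i^max, U_i^max], as in
    step IV; u_max holds the per-joint torque limits."""
    u_max = np.asarray(u_max, dtype=float)
    return np.clip(u, -u_max, u_max)

u_max = [97.6, 186.4, 89.4, 24.2, 20.1, 21.3]   # illustrative limits [N m]
u = np.array([120.0, -50.0, 100.0, 10.0, -30.0, 21.3])
print(saturate(u, u_max))
```

Inside the simulation loop the clipping is applied at every evaluation of the right-hand side, so the solver always integrates with admissible torques.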
Let the desired trajectory be given in generalized coordinates by:

\[
\begin{aligned}
& q_d^1(t) = 0; \quad q_d^2(t) = 2\cos(t) - 2; \quad q_d^3(t) = \cos(t) - 1;\\
& q_d^4(t) = 3\cos(t) - 3; \quad q_d^5(t) = 1.5\cos(t) - 1.5; \quad q_d^6(t) = 0,
\end{aligned}
\tag{3.17}
\]

and t ∈ [0, 2π]. The specification of q_d^1(t) = 0 and q_d^6(t) = 0 does not simplify the calculation of the learning operator L(q, ξ̂) = Â(q, ξ̂) (Eq. (3.2)), because according to the Full Explicit PUMA 560 Model [31] the formulation of the inertia matrix A(q) does not depend on q_d^1(t) and q_d^6(t). From Eq. (3.17), one can obtain the trajectories of the desired generalized velocities q̇_d^i, i = 1,…,6, i.e.:

\[
\begin{aligned}
& \dot q_d^1(t) = 0; \quad \dot q_d^2(t) = -2\sin(t); \quad \dot q_d^3(t) = -\sin(t);\\
& \dot q_d^4(t) = -3\sin(t); \quad \dot q_d^5(t) = -1.5\sin(t); \quad \dot q_d^6(t) = 0,
\end{aligned}
\tag{3.18}
\]

and t ∈ [0, 2π].
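A trajectory pair of this kind is straightforward to tabulate. In the sketch below the profiles are taken in the form q_d^i(t) = a_i (cos t − 1) with amplitudes (0, 2, 1, 3, 1.5, 0), which matches the zero initial positions and velocities used in the simulations; the number of samples is arbitrary.

```python
import numpy as np

t = np.linspace(0.0, 2 * np.pi, 629)
amps = np.array([0.0, 2.0, 1.0, 3.0, 1.5, 0.0])   # joint amplitudes a_i
qd = amps[:, None] * (np.cos(t) - 1.0)            # desired positions
dqd = -amps[:, None] * np.sin(t)                  # desired velocities
print(qd.shape, float(np.max(np.abs(qd[:, 0]))))  # (6, 629); starts at 0
```

Tabulating both q_d and q̇_d on the same time grid is what the learning update needs, since the correction term is evaluated sample by sample.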
Profiles of q_d^i(t) and q̇_d^i(t), t ∈ [0, 2π], i = 1,…,6, computed with MATLAB, are shown in Fig. 3.5, where the number of each curve corresponds to the coordinate number i = 1,…,6.
3.5.1 Simulation results in cases of an unacceptable initial
tracking error
Solving the nonlinear differential equation (3.8) with different values of the feedback gains K_v and K_p of the initial control u_c(q_1, ξ̂) and with the initial conditions q_1(0) = 0 and q̇_1(0) = 0 (Eq. (3.17)), one can obtain different trajectories q_1(t) and, consequently, different corresponding initial maximal errors e_1^max (Eq. (3.10)). Thus, as a result of varying the values of K_v and K_p, for K_v = 6 and K_p = 3 we have an attainable trajectory q_1(t), where q_1^i(t) ∈ [−2π, 2π], t ∈ [0, 2π], i = 1,…,6, depicted in Fig. 3.6.
a) Graphs of q_{di}(t);  b) Graphs of q̇_{di}(t).
Fig. 3.5. Desired trajectories of generalized coordinates and the corresponding velocities; the number of each curve is the coordinate number i (axes: t [s], q [rad]).
a) Graphs of q_1^1(t), q_1^25(t) and q_{d1}(t);  b) Graphs of q_2^1(t), q_2^25(t) and q_{d2}(t);
c) Graphs of q_3^1(t), q_3^25(t) and q_{d3}(t);  d) Graphs of q_4^1(t), q_4^25(t) and q_{d4}(t);
e) Graphs of q_5^1(t), q_5^25(t) and q_{d5}(t);  f) Graphs of q_6^1(t), q_6^25(t) and q_{d6}(t).
Fig. 3.6. Trajectories q_i^1(t) (dashed lines), q_i^25(t) (solid lines) and q_{di}(t), i = 1,…,6 (dotted lines), in the case of the standard NILC control (Eq. (3.7)).
The maximal initial tracking error e_1^max, calculated from (3.10), is 1.5843 [rad], and we assume that it is unacceptable. Nevertheless, despite this error, assuming learning gains (Eq. (3.2)) equal to the feedback gains, L_v = K_v and L_p = K_p, and continuing the computer simulation of the standard NILC procedure (3.7), at the 25th iteration we obtain an acceptable tracking error e_25^max = 0.0025 [rad]. The prescribed accuracy of the MATLAB solver of ordinary differential equations, based on the Runge-Kutta method, is 10^{-4}. The corresponding trajectories q_i^25(t), t ∈ [0, 2π], together with the desired trajectories of the robot links in the space of generalized coordinates, are also shown in Fig. 3.6.
The profile of the maximal tracking errors e_l^max, l = 1,…,25, versus iterations provides the best information about the convergence of the standard NILC procedure. This profile is depicted in Fig. 3.7.
Referring to Fig. 3.7, the profile of e_l^max, l = 1,…,25 (dashed line), is monotonically decreasing and the rate of convergence is high; but in this case the standard NILC procedure (3.7) is not applicable in practice because of the large initial error.
Assuming K_v = 6, L_v = 6 and K_p = 3, L_p = 3, and ε = 0.7 [rad], ε̂ = 0.5 [rad], the computer simulation of the iterative learning procedure (3.16) results in the profile of the corresponding tracking errors e_l^max vs. iterations l = 1,…,25 given by the solid curve in Fig. 3.7. The accuracy of the Runge-Kutta method applied by MATLAB for solving the differential equation (3.16) is 10^{-4}, and the corresponding maximal tracking error at the last iteration is e_25^max = 0.0018 [rad]. Thus, the accuracy of both NILC procedures (Eq. (3.7) and Eq. (3.16)) is similar, but the maximal transient error e^max of the NILC (3.16) is less than 0.6 [rad] and, consequently, the problem considered above concerning the initial error of the standard NILC is solved.
The graph of the error e_l^max of the iterative process (3.16) (Fig. 3.7) shows a lower rate of convergence than the corresponding rate for (3.7). For the particular case considered, the standard NILC reaches an acceptable accuracy at the 15th iteration, while the bounded-error NILC (3.16) requires 20 iterations for the same result. Actually, the last iteration number N is of great importance for the applicability of the learning control algorithm, because the total elapsed time of the standard learning process equals the number of iterations N multiplied by the iteration time T (NT). As mentioned above, the standard procedure requires at least 15 iterations, and the corresponding elapsed time is 30π [sec]. At least 20 iterations are required for implementation of the new learning procedure (Eq. (3.16)), but the elapsed time is less than 40π [sec], because for each of the iterations l = 1,…,8 the tracking time T_l^s < T, so that the elapsed time of the whole process is Σ_{l=1}^{8} T_l^s + 12T. Its exact value can be calculated using the profile of T_l^s, l = 1,…,25, vs. iterations shown in Fig. 3.8.
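The elapsed-time bookkeeping described above can be sketched as follows; the eight shortened tracking times in this example are invented numbers, not the simulated values of Fig. 3.8:

```python
import math

def elapsed_time(tracking_times):
    """Total duration of a learning process whose l-th trial runs for
    tracking_times[l] seconds (T_l^s <= T, with T_l^s = T once the
    error stays within the bound for the whole trial)."""
    return sum(tracking_times)

T = 2.0 * math.pi                 # iteration (trial) time
# hypothetical example: 8 early trials stopped before T, then 12 full trials
short = [0.5, 0.7, 1.0, 1.4, 1.9, 2.5, 3.3, 4.2]   # invented T_l^s values
times = short + [T] * 12
total = elapsed_time(times)       # = sum of the short trials + 12*T
standard = 20 * T                 # the standard process always runs N*T
```

Because the first trials are cut short, the bounded-error process is shorter than the 20-iteration standard process even though it needs more iterations.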
Fig. 3.8. Tracking times of the error-bounded learning for K_v = 6, L_v = 6 and K_p = 3, L_p = 3.
Fig. 3.9. Tracking trajectories of the third joint: q_3^l(t), t ∈ (0, T_l^s], l = 1, 4, 6, 7, 10, 20, for K_v = 6, L_v = 6 and K_p = 3, L_p = 3.
Consequently, the total duration of the new procedure (3.16) is less than the duration of the standard one, despite the greater number of iterations minimally required for its successful implementation.
Fig. 3.9 illustrates in detail the new learning process for the third link of the robot arm. The trajectories q_3^l(t), t ∈ (0, T_l^s], l = 1, 4, 6, 7, 10 and 20, are shown, and the corresponding maximal tracking errors e_3^l = max_t |q_{d3}(t) − q_3^l(t)|, t ∈ (0, T_l^s], are bounded, i.e. e_3^l < 0.8 [rad].
3.5.2 Simulation results in cases of an unacceptable transient error
The results presented here concern both the NILCs (3.7) and (3.16) in the case of an acceptable initial error e_1^max and an inadmissible transient error e^max (Eq. (3.11)).
Fig. 3.10. Maximal tracking error of the standard and the error-bounded learning for K_v = 16, L_v = 0 and K_p = 48, L_p = 16.
Increasing the values of the feedback gains K_v and K_p reduces the initial tracking errors [36-38], but one can find learning gains L_v and L_p which cause bigger errors e_l^max, l = 2,…,N. Thus, for K_v = 16, L_v = 0 and K_p = 48, L_p = 16, the resultant graph of the tracking errors e_l^max vs. iterations l = 1,…,50, obtained from the computer simulation of the iterative learning procedure (3.7), is given by the dashed curve in Fig. 3.10.
The initial tracking error is e_1^max = 0.4835 [rad], while the transient error is e^max = 3.0805 [rad] and the final tracking error is e_50^max = 0.007 [rad]. For ε = 0.7 [rad], ε̂ = 0.5 [rad] and the same feedback and learning gains, the computer simulation of the NILC (3.16) gives the profile of the tracking errors e_l^max shown by the solid curve in Fig. 3.10. The values of the corresponding errors are e_1^max = 0.4835 [rad], e^max = 0.5101 [rad] and e_50^max = 0.0077 [rad]. Again, the accuracy of the Runge-Kutta method for both computer simulations is 10^{-4}.
Fig. 3.11. Tracking times of the error-bounded learning for K_v = 16, L_v = 0 and K_p = 48, L_p = 16.
Consequently, the iterative procedures (3.7) and (3.16) have a similar rate of convergence and reach an acceptable accuracy at the 50th iteration; but the transient error e^max of the standard NILC is unacceptable, while the corresponding error of the NILC (3.16) is bounded, i.e. e^max < ε, and thus the problem arising from the big transient error of the standard procedure is solved.
The duration of the standard learning process is 100π [sec]. Referring to Fig. 3.11, where the profile of the tracking times T_l^s, l = 1,…,50, is plotted, the elapsed time of the error-bounded learning process (3.16), equal to the sum of these tracking times, is again shorter.
3.5.3 Simulation results in cases of both initial and transient errors
As results of the computer simulations of the NILCs (3.7) and (3.16) for K_v = 6, L_v = 3 and K_p = 3, L_p = 6, and ε = 0.7 [rad], ε̂ = 0.5 [rad], two graphs of the corresponding tracking errors e_l^max, l = 2,…,20, are shown in Fig. 3.12.
In this case, the initial error of the standard NILC (3.7) is the same (e_1^max = 1.5843 [rad]) as for the first graph depicted by the dashed line in Fig. 3.7, because the feedback gains are also the same. The corresponding transient and final tracking errors of the dashed and solid curves shown in Fig. 3.12 are e^max = 2.8762 [rad] and e_20^max = 0.0037 [rad] for the dashed curve, and e^max = 0.5218 [rad] and e_20^max = 0.0012 [rad] for the solid curve. Analogously to Fig. 3.7, the standard learning procedure has a better rate of convergence than the error-bounded one. The dashed curve reaches an acceptable error at the 10th iteration (e_10^max = 0.0062 [rad]), while the solid one does the same at the 12th iteration (e_12^max = 0.0054 [rad]); however, the corresponding elapsed times (20π [sec] for the standard procedure, against the smaller sum of the shortened tracking times for the bounded one) show that, in this particular case, the learning algorithm proposed in Section 3.4 is faster than the standard one described in Section 3.3.
Fig. 3.12. Maximal tracking error of the standard and the bounded-error learning for K_v = 6, L_v = 3 and K_p = 3, L_p = 6.
Fig. 3.13. Tracking times of the error-bounded learning for K_v = 6, L_v = 3 and K_p = 3, L_p = 6.
The profile of the tracking times T_l^s, l = 1,…,20, is depicted in Fig. 3.13. The simulation results presented above verify the practical applicability of the proposed learning algorithm (Section 3.4).
With regard to the possible reduction of the transient error of the standard NILC (3.7), it has to be mentioned that similar or better results could be obtained by specification of appropriate feedback and learning gains; but, in a practical setting, such a specification with respect to an actual industrial robot is difficult, complicated and time-consuming. Therefore, computer simulations could be used to find proper values of these gains. However, due to the unmodeled dynamics, neither simulations nor analytical methods fully guarantee a successful real implementation of the standard NILC (3.7). That is why, by ensuring in a natural way that a robot will track a desired trajectory within a given tracking error, the proposed NILC (3.16) is useful in practical applications without any need for preliminary computer simulations. In any case, a computer simulation of a learning procedure saves effort, time and energy if the resultant feed-forward control is going to be applied as the initial feed-forward term of an ILC implemented in practice. The resultant feed-forward control trajectories u_i^20(t), t ∈ [0, 2π], i = 1,…,6, obtained from the last simulation of the error-bounded learning procedure (3.16), are shown in Fig. 3.14.
Fig. 3.14. Generalized torques of the feed-forward control input at the 20th iteration of the error-bounded learning for K_v = 6, L_v = 3 and K_p = 3, L_p = 6.
a) Graphs of u_i^20(t), i = 1, 2, 3;  b) Graphs of u_i^20(t), i = 4, 5, 6.
In this chapter, we consider a constant norm bound on the maximal tracking error in order to simplify the description of the error-bounded learning algorithm. The trajectory-tracking errors in the space of generalized coordinates can be bounded more precisely if the condition (3.14) is replaced by
|q_i^l(T_l^s) − q_{di}(T_l^s)| ≤ ε̂_i, ε̂_i > 0, i ∈ {1,…,6}, T_l^s ∈ (0, T], l = 1,…,N.
This will increase the applicability of the error-bounded learning algorithm. Moreover, an error bound could be a function of time, defined by ε̂_i(t) > 0, i = 1,…,6, t ∈ [0, T], because the proof of the convergence of the NILC (3.16) does not depend on ε̂_i(t). Thus, in case of obstacles in the robot's working space, more complicated conditions on the trajectory-tracking errors can be stated in this 3D space. Further, from these conditions one can obtain the corresponding conditions in generalized coordinates by solving the inverse kinematic problems that arise [42], and consequently the proposed learning control (3.16) can be used.
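A sketch of this relaxed, per-joint (and possibly time-varying) error condition is given below; the helper `within_bounds` is a hypothetical illustration written for this text, not part of the thesis software:

```python
import numpy as np

def within_bounds(q, q_d, eps_hat):
    """Check the per-joint, possibly time-varying error condition
    |q_i(t_k) - q_di(t_k)| <= eps_hat_i(t_k) at every sample t_k.

    q, q_d:  arrays of shape (6, K) -- actual and desired joint samples
    eps_hat: a scalar bound, a per-joint bound of shape (6, 1), or a
             time-varying bound of shape (6, K); NumPy broadcasting
             handles all three cases uniformly.
    """
    return bool(np.all(np.abs(q - q_d) <= eps_hat))
```

For example, a constant scalar reproduces the condition (3.14), while an array `eps_hat[i, k]` can encode a tighter tube near an obstacle.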
Finally, the accuracy (the maximal tracking error at the last iteration) of the
presented simulation results improves when a higher accuracy is set for the
MATLAB solver of ordinary differential equations, but the corresponding
computation time also increases.
3.6 Conclusions
A new algorithm for the implementation of a model-based NILC of robotic manipulators is proposed. The algorithm ensures that at each iteration the tracking error norm is bounded by a given error bound ε > 0. Thus, during the implementation of the learning process, all output trajectories in the space of generalized coordinates lie in a hyper-tube of radius ε around the desired trajectory; consequently, this learning strategy solves the problem arising from the unacceptable tracking errors of standard strategies for NILC implementation. The uniform convergence and robustness of the proposed error-bounded learning algorithm are proven.
With regard to the PUMA 560 6-DOF robot, computer simulations are presented in order to verify the practical applicability of the new algorithm. The comparative analysis of the corresponding results of the standard and the error-bounded learning procedures reveals:
- High rates of convergence and achievement of acceptable tracking errors within reasonable numbers of iterations (from 10 to 50);
- A shorter duration of the error-bounded learning process (Eq. (3.16)), compared to the standard one (Eq. (3.7)), in all considered cases;
- Tracking errors of the new learning algorithm that are bounded, in contrast to the corresponding errors of the conventional one.
Chapter 4
References
1. Arimoto, S., S. Kawamura and F. Miyazaki, “Bettering Operation of dynamic
systems by learning: A new control theory for servomechanism of
mechatronics systems”, Proc. of 23rd Conference on Decision and Control,
Las Vegas, NV, pp. 1064-1068, (1984).
2. Ahn, H.-S., K. L. Moore and Y. Chen, Iterative Learning Control: Robustness and Monotonic Convergence for Interval Systems, Springer-Verlag London Limited, London, (2007).
3. Heinzinger, D., B. Fenwick, B. Paden and F. Miyazaki, “Robust Learning
Control”, Proc. of 28th Conference on Decision and Control, Tampa, FL, pp.
436-440, (1989).
4. Heinzinger, D., B. Fenwick, B. Paden and F. Miyazaki, “Stability of Learning
Control with Disturbances and Uncertain Initial Conditions”, IEEE
Transactions on Automatic Control, Vol. 37, No. 1, pp. 110-114, (1992).
5. Ahn, H.-S., and D. A. Bristow, “Special Issue on Iterative Learning Control”,
Asian Journal of Control, Vol. 13, No. 1, pp. 1-2, (2011).
6. Longman, R.W., “Iterative learning control and repetitive control for
engineering practice”, International Journal of Control, Vol. 73, No. 10, pp.
930-954, (2000).
7. Madady, A. and H.-R. Reza-Alikhani, “A guaranteed monotonically
convergent iterative learning control”, Asian Journal of Control, (2011), doi:
10.1002/asjc.397.
8. Schoellig, A. P., J. Alonso-Mora, and R. D'Andrea, “Limited benefit of joint
estimation in multi-agent iterative learning”, Asian Journal of Control,
(2011), doi: 10.1002/asjc.398.
9. Ruan, X., Z. Z. Bien, and Q. Wang, “Convergence properties of iterative
learning control processes in the sense of the Lebesgue-P norm”, Asian
Journal of Control, (2011), doi: 10.1002/asjc.425.
10. Hahn, B. and K. Oldham, “On-off iterative adaptive controller for low-power
micro-robotic step regulation”, Asian Journal of Control, (2011), doi:
10.1002/asjc.410.
11. Ouyang P.R., B.A. Petz and F.F. Xi, “Iterative Learning Control with
Switching Gain Feedback for Nonlinear Systems”, Transactions of the
ASME, Journal of Computational and Nonlinear Dynamics, Vol. 6, No. 1,
pp. 011020-1 –7, (2011).
12. Xu, J.-X. and Y. Tan, Linear and Nonlinear Iterative Learning Control,
Lecture Notes in Control and Information Sciences, Springer, New York,
(2003).
13. Fujimoto, K. and T. Sugie, “Iterative learning control of Hamiltonian
systems: I/O based optimal control approach”, IEEE Trans. on Automatic
Control, 48(10), pp. 1756–1761, (2003).
14. Delchev, K. and E. Zahariev, “Computer simulation based synthesis of
learning control law of robots”, Mechanics Based Design of Structures and
Machines Vol. 36 No 3, pp. 225-248, (2008).
15. Delchev, K. and G. Boyadjiev, “Synthesis and computer simulation of
learning control of horizontal robot arm”, Proc. of 9th IFAC Symposium on
Robot Control “SYROCO’09”, Gifu, Japan, September 9-12, pp. 301-306,
(2009).
16. S. Gopinath and I. Kar: Iterative learning control scheme for manipulators
including actuator dynamics. Mechanism and Machine Theory, 39, (2004),
1367-1384.
17. Delchev K., Iterative learning control for nonlinear systems: A bounded-
error algorithm, Asian Journal of Control 15 (2) , pp. 453-460 (2013).
18. Delchev K., Simulation-based design of monotonically convergent iterative
learning control for nonlinear systems, Archives of Control Sciences,
Volume 22(LVIII) No. 4, (2012) pp. 371–384.
19. Delchev K., Boiadjiev G., Kawasaki, H., Mouri T., Iterative learning
control with sampled-data feedback for robot manipulators, Archives of
Control Sciences, Volume 24(LX) No. 3, (2014) pp. 257–277.
20. Delchev K., Iterative learning control for robotic manipulators: A bounded-error algorithm, International Journal of Adaptive Control and Signal Processing, 28, pp. 1454–1473, (2014). DOI: 10.1002/acs.2454.
21. Shinji, W. and T. Mita, “A Parameter Identification Method of Horizontal
Robot Arms”, Advanced Robotics, Vol. 4, No 4, pp. 337-352, (1990).
22. Kozlov, V., V. Makarychev, A. Timofeev and E. Yurevich, Dynamics of
Robot Control. Moscow, Science, (1984), (in Russian).
23. Pertew, H. Marquez and Q. Zhao, A Direct Sampled-data Design Approach
for Robot Stabilization, in Proc.2008 American Control Conference, Seattle,
Washington, (2008), 369-374.
24. T. Oomen, van de Wijdeven and Q. Zhao, Suppressing intersample behavior
in iterative learning control, Automatica 45, (2009), 981–988.
25. Y.-J. Pan, H. Marquez and T. Chen, Sampled-data Iterative Learning Control
for a Class of Nonlinear Networked Control Systems, in Proc. 2006 American
Control Conference, Minneapolis, Minnesota, (2006), 3494-3499.
26. T.-W. Ma and H. Yang, Sampled Data Feedback-Feedforward Control of
Structures with Time Delays, Journal of Structural Engineering, 132(7),
(2006), 1129–1138. DOI: 10.1061/(ASCE)0733-9445(2006)132:7(1129)
27. B. Zhang, D. Wang, Y. Wang, Y. Ye and K. Zhou, Comparison Studies on
Anti-Aliasing/Anti-Imaging Filtering and Signal Extension in Multi-rate
ILC, in Proc.17th World Congress The International Federation of Automatic
Control, Seoul, (2008), 12468-12473.
28. D. Vassileva, G. Boiadjiev, H. Kawasaki and T. Mouri, Force Compensating
Trajectories for Redundant Robots: Experimental Results, Journal of
Robotics and Mechatronics, 21(1), (2009), 104-112.
29. J. Zhang, Advanced Pulse Width Modulation Controller ICs for Buck DC-
DC Converters, Ph.D. Thesis, University of California, Berkeley, 2006.
30. R. James and G. James, Mathematics dictionary, 5th ed., Chapman & Hall,
New York, 1992.
31. B. Armstrong, O. Khatib and J. Burdick, The explicit dynamic model and
inertial parameters of the PUMA 560 arm, in Proc. IEEE Int. Conf. Robotics
and Automation, vol. 1, San Francisco, USA, (1986), 510-518.
32. P. Corke and B. Armstrong-Helouvry, A search for consensus among model
parameters reported for the PUMA 560 robot, in Proc. IEEE Int. Conf.
Robotics and Automation, vol. 1, San Diego, USA, (1994), 1608-1613.
33. T. Tarn, K. Bejczy, S. Han and X. Yun, Inertia parameters of Puma 560 robot
arm. Tech. Rep. SSM-RL-85-01, Washington University, St. Louis, MO,
1985.
34. Smolders K, Volckaert M, Swevers J. Tracking control of nonlinear lumped
continuous-time systems: A model-based iterative learning approach.
Mechanical Systems and Signal Processing 2008; 22 (8): 1896-1916. doi:
10.1016/j.ymssp.2008.03.004.
35. Park HJ, Cho HS. On the realization of an accurate hydraulic servo through
an iterative learning control. Mechatronics 1992; 2(1): 75-88.
36. Kawamura S, Svinin M. Advances in Robot Control: From Everyday Physics
to Human-Like Movements. Springer, Berlin, 2006.
37. Insperger T, Stepan G. Optimization of digital control with delay by periodic
variation of the gain parameters. Proc. of IFAC Workshop on Adapt. and
Learn. in Control and Signal Processes, Yokohama, Japan, 2004; 145-150.
38. Al-Salem N, Fanni M. Design of digital high-gain PD control systems using
analytical approach. JSME international journal. Series C, Mechanical
systems, machine elements and manufacturing, 2004; 47(3). 792-802.
39. Bristow DA, Singler JR. Analysis of Transient Growth in Iterative Learning
Control Using Pseudospectra. Proc. of Symposium on Learning Control,
Shanghai, China, 2009.
40. Liegois A, Khalil W, Dumas J, Renaud M. Mathematical and computer
models of interconnected mechanical systems. Proc. of Theory and Practice
of Robots and Manipulators, Warsaw, 1977; 5-17.
41. Craig JJ. Introduction to Robotics: Mechanics and Control, Addison-Wesley,
Reading, 1985.
42. Le Boudec B, Saad M, Nerguizian V. Modeling and adaptive control of
redundant robots. Mathematics and Computers in Simulation 2006; 71(4-6):
395-403.
Appendix A
Lemma 2.2
If h_k = h(t_k) is a discrete form of h(t), t ∈ [0, T], where t_k = k Δt_s, k = 0, 1, …, K, Δt_s > 0 is the sampling interval, K = T/Δt_s, and h*(t) = h*(t, h_k) is the corresponding continuous piecewise-constant function on [0, T], then ||h*||_λ ≤ ||h||_λ.
Proof
The λ-norm of h(t), t ∈ [0, T], is defined by
||h||_λ = sup_{t∈[0,T]} e^{−λt} ||h(t)||.   (A1)
For the continuous piecewise-constant function it has to be mentioned that
h*(t, h_k) = h_k, t ∈ [t_k, t_{k+1}), k = 0, 1, …, K−1, and h*(t, h_K) = h_K, t ∈ [t_K, T].
Using h*(t, h_k) = h_k in the right-hand side of the definition (A1), it follows that:
sup_{t∈[t_k, t_{k+1})} e^{−λt} ||h*(t, h_k)|| = ||h_k|| sup_{t∈[t_k, t_{k+1})} e^{−λt} = e^{−λ t_k} ||h_k||, k = 0, 1, …, K−1;
sup_{t∈[t_K, T]} e^{−λt} ||h*(t, h_K)|| = e^{−λ t_K} ||h_K||.   (A2)
On the other hand, for t ∈ [0, T],
sup_{t∈[0,T]} e^{−λt} ||h*(t, h_k)|| = max{ sup_{t∈[t_0, t_1)} e^{−λt} ||h*(t, h_0)||, …, sup_{t∈[t_K, T]} e^{−λt} ||h*(t, h_K)|| }.   (A3)
Thus, from Eq. (A2) and Eq. (A3) it follows:
sup_{t∈[0,T]} e^{−λt} ||h*(t, h_k)|| = max_k e^{−λ t_k} ||h_k||, k = 0, 1, …, K,   (A4)
but h_k = h(t_k), k = 0, 1, …, K, and
max_k e^{−λ t_k} ||h(t_k)|| ≤ sup_{t∈[0,T]} e^{−λt} ||h(t)||.   (A5)
From Eq. (A4) and inequality (A5) we have
sup_{t∈[0,T]} e^{−λt} ||h*(t, h_k)|| ≤ sup_{t∈[0,T]} e^{−λt} ||h(t)||,   (A6)
and, using the definition (A1), we obtain from inequality (A6) that ||h*||_λ ≤ ||h||_λ. ∎
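Lemma 2.2 can be checked numerically; the sketch below builds the piecewise-constant interpolant of an arbitrary test signal and compares the two λ-norms on a fine grid (the signal and all constants are invented for the illustration):

```python
import numpy as np

lam, T, K = 2.0, 1.0, 10
t_fine = np.linspace(0.0, T, 2001)
h = np.sin(7.0 * t_fine) + 0.5 * t_fine          # an arbitrary test signal h(t)

def lam_norm(values, times, lam):
    """Discrete evaluation of the lambda-norm (A1): max of exp(-lam*t)*|h(t)|."""
    return np.max(np.exp(-lam * times) * np.abs(values))

# piecewise-constant interpolant: h*(t) = h(t_k) for t in [t_k, t_{k+1})
t_k = np.linspace(0.0, T, K + 1)
h_k = np.sin(7.0 * t_k) + 0.5 * t_k              # the sampled values h_k = h(t_k)
idx = np.searchsorted(t_k, t_fine, side='right') - 1
h_star = h_k[np.clip(idx, 0, K)]

# the lemma: ||h*||_lambda <= ||h||_lambda
assert lam_norm(h_star, t_fine, lam) <= lam_norm(h, t_fine, lam) + 1e-12
```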
Appendix B
This appendix presents a proof of the robustness and convergence of the learning algorithm proposed in Section 3.4. The proof is based on the convergence Theorem 1.1, presented in Chapter 1 and [3,4], concerning a learning scheme for the following class of multi-input-multi-output nonlinear time-varying state-space equations:
ẋ_l(t) = f(x_l(t), t) + B(x_l(t), t) u_l(t) + ω_l(t),
y_l(t) = g(x_l(t), t),   (B1)
where, for l ∈ {0, 1, …} and all t ∈ [0, T], x_l(t) ∈ R^n, y_l(t) ∈ R^m and u_l(t) ∈ R^r are not necessarily continuous, and ω_l(t) ∈ R^n represents both deterministic and random disturbances. The functions f: R^n × [0, T] → R^n and B: R^n × [0, T] → R^{n×r} are piecewise continuous in t ∈ [0, T], and g: R^n × [0, T] → R^m is differentiable in x and t, with partial derivatives g_x(·,·) and g_t(·,·). In addition, the following assumptions hold [3,4]:
(1) For each fixed initial state x(0) with ω(·) ≡ 0, the output map O: C([0, T], R^r) × R^n → C([0, T], R^m) and the state map S: C([0, T], R^r) × R^n → C([0, T], R^n) are one-to-one. In this notation y_l(·) = O(u_l(·), x_l(0)) and x_l(·) = S(u_l(·), x_l(0)).
(2) The disturbance ω_l(·) is bounded on [0, T], i.e. ||ω(t)|| ≤ b_ω.
(3) The functions f(·,·), B(·,·), g_x(·,·) and g_t(·,·) are uniformly globally Lipschitz in x on the interval [0, T].
(4) The operators B(·,·) and g_x(·,·) are bounded on [0, T] × R^n.
(5) All functions are assumed measurable and integrable.
Let us consider the following learning update law
u_{l+1}(t) = (1 − ε) u_l(t) + ε u_0(t) + L(y_l(t), t)(ẏ_d(t) − ẏ_l(t)),   (B2)
where L: R^m × [0, T] → R^{r×m} is a bounded learning operator and ε ∈ [0, 1) allows the influence of a bias term [3,4].
Theorem B1 (Theorem 1.1) [3,4]
Let the system described by equation (B1) satisfy assumptions (1)-(5) and use the update law, Equation (B2). Given an attainable desired trajectory y_d(t) = g(x_d(t), t), t ∈ [0, T], u_d(t) is the corresponding input and x_d(t) is the corresponding state (according to assumption (1)). If
||(1 − ε)I − L(g(x, t), t) g_x(x, t) B(x, t)|| ≤ ρ < 1, ∀x ∈ R^n, t ∈ [0, T],   (B3)
and the initial state error ||x_d(0) − x_l(0)|| is bounded by b_{x0}, then, as l → ∞, the error between u_l(t) and u_d(t) is bounded. In addition, the state and output asymptotic errors are bounded. These bounds depend continuously on the bound on the initial state error, the bound on the state disturbance, and ε; as b_{x0}, b_ω and ε tend to zero, these bounds also tend to zero.
The proof of Theorem B1 is presented in [3,4], where, in particular, it is proven that
||Δu_{l+1}(t)||_λ ≤ ρ̄ ||Δu_l(t)||_λ + ε̄, 0 ≤ ρ̄ < 1,   (B4)
where Δu_l(t) = u_d(t) − u_l(t), ρ̄ = const, and the λ-norm is defined by
||u_l(t)||_λ = sup_{t∈[0,T]} e^{−λt} ||u_l(t)||,   (B5)
and
limsup_{l→∞} ||Δu_l(t)||_λ ≤ ε̄ (1 − ρ̄)^{−1},
limsup_{l→∞} ||Δy_l(t)||_λ ≤ b_y,  (b_y = k b_{x0} + k_g ε̄ (1 − ρ̄)^{−1}),   (B6)
where Δy_l(t) = y_d(t) − y_l(t), k = k(ρ̄, T) > 0, ε̄ combines b_{x0}, b_ω and ε ||Δu_0||_λ, and k_g is the Lipschitz constant of g(x_l(t), t).
If b_{x0} = 0, b_ω = 0 and ε = 0, then ε̄ = 0 and b_y = 0.
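The asymptotic bound in (B6) follows from iterating the contraction (B4); the following sketch checks numerically that the worst-case recurrence d_{l+1} = ρ̄ d_l + ε̄ settles at the fixed point ε̄(1 − ρ̄)^{−1} (the numbers are arbitrary):

```python
# Iterating  d_{l+1} = rho*d_l + eps_bar,  the worst case of (B4)
rho, eps_bar = 0.6, 0.02
d = 5.0                         # initial lambda-norm of the input error
for _ in range(200):
    d = rho * d + eps_bar
limit = eps_bar / (1.0 - rho)   # the asymptotic bound of (B6)
# with eps_bar = 0 (no disturbance, no initial-state error, eps = 0)
# the same recurrence drives the input error to zero, as stated after (B6)
```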
Corollary B1 (Corollary 1.1) [3]
If the update law (B2) is replaced with
u_{l+1}(t) = (1 − ε) u_l(t) + ε u_0(t) + L(y_l(t), t)(ẏ_d(t) − ẏ_l(t)) + K(y_l(t), t)(y_d(t) − y_l(t)),   (B7)
with K(·,·) bounded, then Theorem B1 still holds.
The proof of Corollary B1 is presented in [3].
For y_l(t) = x_l(t), t ∈ [0, T], x_l(t) = (q_l(t), q̇_l(t))^T, L(y_l(t)) = (0, L(q_l(t))) and K(y_l(t)) = L(q_l(t))(L_p, L_v), from equation (B7) we obtain the update law, Eq. (3.2), considered in Section 3.3.
Applying the proposed algorithm, Eq. (3.15), to the dynamic model described by equation (B1), from equation (B7) we obtain the following update law:
u_{l+1}(t, T_l^s) = (1 − ε) u_l(t, T_l^s) + ε u_0(t) + u*_l(t, T_l^s), u_1(t, T_0^s) = u_0(t),   (B8)
u*_l(t, T_l^s) = L(y_l(t), t)(ẏ_d(t) − ẏ_l(t)) + K(y_l(t), t)(y_d(t) − y_l(t)), t ∈ [0, T_l^s], T_l^s ∈ (0, T];
u*_l(t, T_l^s) = 0, t ∈ (T_l^s, T],
where T_l^s is defined by ||y_d(T_l^s) − y_l(T_l^s)|| = ε̂, l ∈ [0, 1, …).
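The truncated correction u*_l of (B8) and the stopping time T_l^s can be sketched as follows (hypothetical helper functions, exercised with an artificial, linearly growing error signal):

```python
import numpy as np

def stopping_time(t, err, eps_hat, T):
    """T_l^s: the first sample time at which the output error reaches
    eps_hat, or the full trial time T if it never does."""
    hit = np.flatnonzero(np.abs(err) >= eps_hat)
    return t[hit[0]] if hit.size else T

def truncated_correction(t, Ts, correction):
    """u*_l(t, T_l^s) of (B8): the learning correction acts on [0, T_l^s]
    and is zero on (T_l^s, T], so samples recorded after the trial was
    stopped never enter the update."""
    return np.where(t <= Ts, correction, 0.0)

# usage: a linearly growing error hits the bound 0.5 halfway through the trial
t = np.linspace(0.0, 1.0, 101)
Ts = stopping_time(t, err=t, eps_hat=0.5, T=1.0)
u_star = truncated_correction(t, Ts, np.ones_like(t))
```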
Corollary B2 (Corollary 1.2 in Chapter 1)
If ε̂ > e^{λT} b_y (λ ≥ 0) and the update law (B2) is replaced with (B8), then Theorem B1 still holds.
The proof of Corollary B2 is presented in Chapter 1 and [17].
It has to be mentioned that, according to Theorem B1, the tracking accuracy of the output trajectory of the classical (update law (B2)) ILC process is
limsup_{l→∞} sup_{t∈[0,T]} e^{−λt} ||Δy_l(t)|| ≤ b_y
(see Eqs. (B5) and (B6)) and, obviously, the radius ε of the hypertube must be greater than this accuracy.
Thus, we are in a position to prove the robustness and convergence of the learning algorithm proposed in Section 3.4 for robotic manipulators; i.e., it has to be shown that Theorem B1 is applicable to the dynamic equation of robot motion (3.1):
A(q)q̈ + b(q, q̇) + Dq̇ + g(q) + f_c = u_c + u_l,   (B9)
where, from Eq. (3.6) in Section 3.3, we have
u_c = Â(q)[q̈_d + K_v(q̇_d − q̇_l) + K_p(q_d − q_l)] + b̂ + D̂q̇_l + ĝ + f̂,   (B10)
and Eq. (3.15) in Section 3.4 can be rewritten as
u_{l+1}(t, T_l^s) = u_l(t, T_l^s) + u*_l(t, T_l^s), u_1(t, T_0^s) = 0,
u*_l(t, T_l^s) = Â(q_l)[(q̈_d − q̈_l) + L_v(q̇_d − q̇_l) + L_p(q_d − q_l)], t ∈ [0, T_l^s], T_l^s ∈ (0, T];
u*_l(t, T_l^s) = 0, t ∈ (T_l^s, T].   (B11)
Using equations (B9) and (B10), and taking into account that A(q) is an n × n symmetric positive-definite matrix, the equation (B9) can be written as:
q̈_l = −A^{−1}(q_l)[b(q_l, q̇_l) + Dq̇_l + g(q_l)] + A^{−1}(q_l) u_c + A^{−1}(q_l) u_l − A^{−1}(q_l) f_c.   (B12)
Assuming x_l = (q_l, q̇_l)^T and f = f* + f**, where
f* = −A^{−1}(q_l)[b(q_l, q̇_l) + Dq̇_l + g(q_l)]
and
f** = A^{−1}(q_l){Â(q_l)[q̈_d + K_v(q̇_d − q̇_l) + K_p(q_d − q_l)] + b̂ + D̂q̇_l + ĝ},
with B = A^{−1}(q_l) and y_l = x_l, and treating the Coulomb friction as a disturbance ω_l = A^{−1}(q_l)(f̂ − f_c), from equation (B12) we obtain equation (B1). In [3] it is proven that q_l is bounded, implying that f* and B satisfy assumptions (1)-(5), and consequently f** also satisfies assumptions (1)-(5).
Assuming L(y_l(t), t) = (0, Â(q_l)) and K(y_l(t), t) = (Â(q_l) L_p, Â(q_l) L_v), and taking into account that ε = 0, from equation (B8) we obtain equation (B11). It has to be mentioned that in [3], as an example for a robotic manipulator, the sufficient condition for robustness and convergence (3.4), validated in Section 3.3, is obtained from equation (B3).
Thus, the application of Theorem B1, Corollary B1 and Corollary B2 to the system (B12), with the feedback controller described by Eq. (B10) and the learning update law (B11), results in the robustness and convergence of the proposed bounded-error NILC algorithm for robotic manipulators.