Intelligent Control and Automation, 2010, 1, 90-95
doi:10.4236/ica.2010.12010 Published Online November 2010 (http://www.SciRP.org/journal/ica)
Copyright © 2010 SciRes. ICA

Analysis and Causal Formulation Proof of an Optimal Iterative Learning Algorithm

Vassiliki Vita1, Marina Vassilaki1, Panagiotis Karampelas1,2, George E. Chatzarakis1
1Department of Electrical Engineering Educators, School of Pedagogical and Technological Education (ASPETE), Athens, Greece
2IT Faculty, Hellenic American University, Athens, Greece
E-mail: vasvita@yahoo.co.uk, pkarampelas@hau.gr, geaxatz@otenet.gr
Received August 12, 2010; revised September 6, 2010; accepted September 19, 2010

Abstract

Iterative learning control (ILC) is used to control systems that operate in a repetitive mode, improving tracking accuracy of the control by transferring data from one repetition of a task to the next. In this paper an optimal iterative learning algorithm for discrete linear systems is analyzed and a solution for its attainment is proposed. Finally, the mathematical proof of the algorithm's causal formulation is provided in its complete form, since the algorithm's implementation requires its causal formulation.

Keywords: Causal Formulation, Discrete Systems, Iterative Learning Control

1. Introduction

Iterative Learning Control (ILC) is a relatively new concept in control theory. It evolved from the need to control dynamical systems that are supposed to carry out a given task repetitively; more specifically, systems where the desired output values are a function of time and the task is carried out repeatedly. A classical example is the robot in the car industry that follows a given trajectory and welds at specific points along the trajectory. Normally the robot would be tuned once, through feedback or feedforward control or even a combination of both. After the tuning, it would carry out the repetitions performing in the same way.
The obvious drawback of this approach is that if there is an error between the measured trajectory and the reference trajectory, due to a wrong selection of the control input trajectory, the robot will repeat this error at each trial; i.e., if there is an error in the performance, it will repeat the same error at each iteration. To overcome this problem, Arimoto, one of the inventors of ILC [1,2], suggested that both the information from the previous tasks or "trials" and the current task should be used to improve the control action during the current trial. In other words, the controller should learn iteratively the correct control actions in order to minimize the difference between the output of the system and the given reference signal. He called this method a "betterment process" [3]. This approach is more or less an imitation of the learning process of every intelligent being. Intelligent beings tend to learn by performing a trial (i.e., selecting a control input) and observing the end result of this control input selection. After that they try to change their behaviour (i.e., to pick a new control input) in order to obtain an improved performance during the next trial. Because a) the overall idea of ILC and the procedure results in a controller that learns the correct control input, and b) the learning is done through iteration, the term Iterative Learning Control is nowadays used to describe control algorithms that realize the "betterment process" as suggested by Arimoto [3]. In this work an optimal iterative learning algorithm for discrete linear systems is analyzed and a solution for its attainment is proposed. Finally, the mathematical proof of the algorithm's causal formulation is provided in its complete form, since the algorithm's implementation requires its causal formulation.

2. The Iterative Learning Control Aim

The aim of the ILC algorithm [4,5] is to find iteratively an optimal input u* for the plant under investigation.
This input, when applied to the plant, should generate an output y* that tracks the desired output y_d "as accurately as possible". The use of the phrase "as accurately as possible" states the significance of obtaining the smallest possible difference between the actual output y* and the desired output y_d. This difference is the error e, and the above leads to the conclusion that the error e is desired to be minimal. The ultimate aim of ILC is to push the error e to zero. It is thus clear that applying an ILC algorithm results in a two-dimensional system, because the information propagates both along the time axis t and along the iteration axis k. The 2-dimensionality of iterative learning control introduces problems in the analysis as well as the design of the system; the two dimensions are an expression of the system's dependent dynamics: a) the trial index k and b) the elapsed time t during each trial. In order to overcome such difficulties, an algorithm for Iterative Learning Control is introduced, which has the property of solving the 2-dimensionality problem.

3. Proposed Cost Function

One way of solving this problem is to introduce a cost function for the system under investigation. The minimization of the cost function will provide an effective algorithm for iterative learning control (ILC). The following cost function is proposed:

J(u_{k+1}) = ||u_{k+1} - u_k||_R^2 + ||e_{k+1}||_Q^2   (1)

where J(u_{k+1}) is the cost function with respect to the current trial input, ||u_{k+1} - u_k||^2 is the norm of the difference between the current and previous trial inputs, ||e_{k+1}||^2 is the norm of the current error, and R, Q are symmetric and positive definite weighting matrices. It is assumed that R and Q are diagonal matrices, and for simplicity R = rI and Q = qI, where q and r are positive real numbers, q, r ∈ R. In ILC literature, researchers have suggested different cost functions to solve the ILC problem.
The reasons that lead to the realization that the selected cost function (1) is effective and appropriate are the following:

1) The ||u_{k+1} - u_k||^2 factor represents the importance of keeping the deviation of the input between the trials small. Intuitively this should result in smooth convergence behavior. This requirement could also be stated as the need for producing smooth control signals, in order to obtain smooth manipulation of actuators.

2) The ||e_{k+1}||^2 factor represents the main objective of reducing the tracking error at each iteration.

3) The weighting matrices R and Q are used to state which of the above two factors plays the more significant role in the cost function. If the interest is focused on keeping the deviation of the input between the trials small, then the ratio r/q has to be "large". On the other hand, if keeping the error small is more significant, then the ratio r/q has to be "small". The actual meaning of "small" and "large" depends on the system being considered and the units measured.

4) The optimal value of the cost function is bounded. If the cost function is evaluated with u_{k+1} = u_k, then (1) becomes J(u_k) = ||u_k - u_k||_R^2 + ||e_k||_Q^2 = ||e_k||_Q^2, and hence the optimal value satisfies J(u_{k+1}) ≤ ||e_k||_Q^2. It is also clear that the optimal cost function has a lower bound: J(u_{k+1}) ≥ ||e_{k+1}||_Q^2. Hence, combining the above, the upper and lower bounds are expressed with (2):

||e_{k+1}||_Q^2 ≤ J(u_{k+1}) ≤ ||e_k||_Q^2   (2)

4. The Norm-Optimal ILC Algorithm

The differentiation of the cost function (1) with respect to u_{k+1} produces the solution used to update the input, the norm-optimal ILC algorithm, which is investigated in this work.
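As a numerical check of the bounds (2), the following sketch minimizes (1) for a single trial in lifted (super-vector) form, y = Gu. The first-order plant, horizon, desired output and weights are illustrative assumptions, not values from the paper.

```python
import numpy as np

# One norm-optimal trial in lifted form, assuming an illustrative
# first-order SISO plant with impulse response a^i (not from the paper).
N, a = 8, 0.9
G = np.array([[a ** (t - j) if j <= t else 0.0 for j in range(N)]
              for t in range(N)])        # lifted plant operator: y = G u
q, r = 10.0, 1.0                         # Q = qI, R = rI as in Section 3

y_d = np.ones(N)                         # desired output (arbitrary choice)
u_k = np.zeros(N)                        # previous trial input
e_k = y_d - G @ u_k                      # previous trial error

# Minimizing (1) over u_{k+1}, with e_{k+1} = e_k - G(u_{k+1} - u_k),
# gives the linear system (R + G^T Q G)(u_{k+1} - u_k) = G^T Q e_k.
du = np.linalg.solve(r * np.eye(N) + q * G.T @ G, q * G.T @ e_k)
e_next = e_k - G @ du                    # current trial error e_{k+1}

J = r * du @ du + q * e_next @ e_next    # optimal value of the cost (1)

# bounds (2): ||e_{k+1}||_Q^2 <= J(u_{k+1}) <= ||e_k||_Q^2
assert q * e_next @ e_next <= J <= q * e_k @ e_k
```

The upper bound holds because u_{k+1} = u_k is always a feasible (zero-step) choice, and the lower bound because the input-deviation term of (1) is non-negative.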
The input update law is the following:

∂J(u_{k+1})/∂u_{k+1} = 0  ⟹  u_{k+1} = u_k + R^{-1} G^T Q e_{k+1} = u_k + G* e_{k+1}   (3)

or

u_{k+1} = u_k + R^{-1} G^T Q e_{k+1}   (4)

where u_{k+1} is the current trial input, u_k is the previous trial input, e_{k+1} is the current trial error, and G* = R^{-1} G^T Q is the gain matrix (the adjoint of the plant), which represents the relative weighting of the cost function requirements (error versus input deviation).

5. Characteristics and Properties of the ILC Algorithm

The main characteristics of the update law for the proposed iterative learning control scheme (3) can be summarized as follows:

1) Non-causality. The dependence of the input update law (3) on the adjoint of the plant G*, where G* is a function of the error at sample times later than the current one, implies non-causality.

2) Integrator-type approach. The algorithm utilizes the input change instead of the input itself to evaluate the performance of the system.

3) Feedback approach. This characteristic is one of the most important ones. Several authors suggested feedforward approaches where the error of the previous trial was used in the input update law: u_{k+1} = u_k + K e_k. On the contrary, the update law in (3) is of the form u_{k+1} = u_k + K e_{k+1}, which means that it uses feedback of the current trial error; the feedback operator K is in this case the adjoint of the plant G*. The feedforward approaches face the problem of ignoring current trial effects such as plant modeling errors and possible disturbances, which means the algorithm is not sensitive to these effects. On the contrary, the feedback approach of (3) takes all of this into account, producing in this way a more robust solution, and this is likely to be a more effective one in the plant-stabilizing sense.

4) Descent algorithm. With the help of the inequality ||e_{k+1}||_Q^2 ≤ J(u_{k+1}) ≤ ||e_k||_Q^2, it is deduced that the norm of the error e_{k+1} is monotonically decreasing from trial to trial.
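Both the feedback character of (4) and the descent property can be observed numerically. The sketch below iterates the lifted-form update over several trials; the plant and weights are illustrative assumptions. Since (4) involves the current trial error e_{k+1}, it is implicit in u_{k+1}; the code computes the step by the equivalent explicit solve and then checks (4) directly.

```python
import numpy as np

# Several trials of the norm-optimal update on an assumed lifted
# SISO plant G (lower-triangular impulse-response matrix).
N, a, q, r = 8, 0.9, 5.0, 1.0
G = np.array([[a ** (t - j) if j <= t else 0.0 for j in range(N)]
              for t in range(N)])
y_d = np.sin(np.linspace(0.1, np.pi, N))   # illustrative desired output

u = np.zeros(N)
err_norms = []
for k in range(15):
    e = y_d - G @ u
    err_norms.append(q * e @ e)            # ||e_k||_Q^2
    # Substituting e_{k+1} = e_k - G(u_{k+1} - u_k) into (4) yields the
    # explicit linear solve for the input step:
    du = np.linalg.solve(r * np.eye(N) + q * G.T @ G, q * G.T @ e)
    e_next = e - G @ du
    # feedback form (4): the step equals R^{-1} G^T Q e_{k+1}
    assert np.allclose(du, (q / r) * G.T @ e_next)
    u = u + du

# descent property 4): ||e_k||_Q^2 is monotonically non-increasing
assert all(b <= a2 + 1e-12 for a2, b in zip(err_norms, err_norms[1:]))
```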
Two of the most important properties of the algorithm are presented below:

1) Convergence of the error sequence to zero. It can be shown not only that the error e_k converges to a final value e_∞ as the number of iterations k → ∞, but also that e_∞ = 0. It is significant to mention that the convergence is exponential. It is also significant to mention at this point that the error sequence convergence can be proved even for an arbitrarily chosen initial error e_0. This is a form of robustness, since it ensures that the error will converge to zero independently of the initial error value.

2) Convergence of the input sequence. It can be shown that the input of the algorithm converges to a final value u_∞ as the number of trials k → ∞.

6. Causal Formulation of the ILC Algorithm

One of the main characteristics of the ILC algorithm (3) is that the algorithm seems to be non-causal. Because of its characteristics it is a very efficient algorithm in terms of plant control. The main drawback is the fact that an enormous amount of data and computational effort is required due to the size of the matrices. A solution to this problem would be a recursive implementation of the algorithm; on the other hand, in order to be able to implement the algorithm it is essential to derive it in a causal form.

6.1. The Proposed Causal Algorithm

In what follows the plant is assumed to be described by the discrete state-space model x_k(t+1) = A x_k(t) + B u_k(t), y_k(t) = C x_k(t), over the finite horizon t ∈ [0, N]. The causal solution to the above problem is given by introducing the following proposed algorithm for Norm-Optimal ILC, which consists of the following terms:

Term 1: The gain matrix K(t), given in the form of the discrete Riccati equation [6,7]:

K(t) = A^T K(t+1) A - A^T K(t+1) B [B^T K(t+1) B + R]^{-1} B^T K(t+1) A + C^T Q C   (5)

for t ∈ [0, N-1] and with the terminal condition K(N) = 0.

Term 2: The feedforward (predictive) term:

ξ_{k+1}(t) = [I + K(t) B R^{-1} B^T]^{-1} {A^T ξ_{k+1}(t+1) + C^T Q e_k(t+1)}   (6)

for t ∈ [0, N-1] and with the terminal condition ξ_{k+1}(N) = 0.

Term 3: The input update law.
u_{k+1}(t) = u_k(t) - [B^T K(t) B + R]^{-1} B^T K(t) A [x_{k+1}(t) - x_k(t)] + R^{-1} B^T ξ_{k+1}(t)   (7)

It can easily be observed that Term 1 is independent of the inputs, outputs and states of the system; Term 2, the predictive term ξ_{k+1}(t), depends on the previous trial error e_k(t); and Term 3, the input update law, depends on the previous trial input u_k(t), the current trial state x_{k+1}(t), the previous trial state x_k(t), and the predictive term ξ_{k+1}(t). This is hence a causal iterative learning control algorithm consisting of current-trial full-state feedback along with feedforward of the previous trial error data.

6.2. Mathematical Proof of the Causal Formulation

In this section a complete mathematical proof of the algorithm is provided, demonstrating all the necessary steps from the optimization problem to the conception, development and finally the acquisition of the three terms of the algorithm's causal form. The proof is based on the general idea proposed in [8], but it is for the first time that it is given in its complete form, as it includes all the intermediate steps and calculations in an analytic way. The proof is as follows:

Rewriting the input update law as the difference between the current trial input and the previous trial input gives relationship (4): u_{k+1} - u_k = R^{-1} G^T Q e_{k+1}. This form of the input update law, when written in super-vector notation, results in relationship (8).
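To illustrate that Terms 1-3 reproduce the non-causal norm-optimal update, the sketch below runs one trial of the causal algorithm (backward passes for K(t) and ξ_{k+1}(t), then a forward pass for the input) and compares the result with the lifted-form solution. The plant matrices, horizon and weights are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Illustrative second-order plant x(t+1) = Ax(t) + Bu(t), y(t) = Cx(t)
A = np.array([[0.8, 0.2], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
N, q, r = 6, 4.0, 1.0
Q, R = q * np.eye(1), r * np.eye(1)
Rinv = np.linalg.inv(R)
I2 = np.eye(2)
y_d = np.ones(N)                      # desired output at t = 1..N

def trial(u):
    """Simulate one trial from x(0) = 0; e(t+1) is stored at index t."""
    x = np.zeros((N + 1, 2))
    e = np.zeros(N)
    for t in range(N):
        x[t + 1] = A @ x[t] + B[:, 0] * u[t]
        e[t] = y_d[t] - (C @ x[t + 1])[0]
    return x, e

u_prev = np.zeros(N)
x_prev, e_prev = trial(u_prev)

# Term 1: backward Riccati recursion (5), terminal condition K(N) = 0
K = [None] * (N + 1)
K[N] = np.zeros((2, 2))
for t in range(N - 1, -1, -1):
    Kn = K[t + 1]
    gain = np.linalg.solve(B.T @ Kn @ B + R, B.T @ Kn @ A)
    K[t] = A.T @ Kn @ A - A.T @ Kn @ B @ gain + C.T @ Q @ C

# Term 2: backward predictive term (6), driven by the PREVIOUS trial error
xi = [None] * (N + 1)
xi[N] = np.zeros(2)
for t in range(N - 1, -1, -1):
    rhs = A.T @ xi[t + 1] + (C.T @ Q)[:, 0] * e_prev[t]   # C^T Q e_k(t+1)
    xi[t] = np.linalg.solve(I2 + K[t] @ B @ Rinv @ B.T, rhs)

# Term 3: forward pass with current-trial state feedback (7)
u_new = np.zeros(N)
x_new = np.zeros((N + 1, 2))
for t in range(N):
    dx = x_new[t] - x_prev[t]
    fb = np.linalg.solve(B.T @ K[t] @ B + R, B.T @ K[t] @ A @ dx)
    u_new[t] = u_prev[t] - fb[0] + (Rinv @ B.T @ xi[t])[0]
    x_new[t + 1] = A @ x_new[t] + B[:, 0] * u_new[t]

# Cross-check against the lifted solution (R + G^T Q G) du = G^T Q e_k
G = np.zeros((N, N))
for t in range(N):
    for j in range(t + 1):
        G[t, j] = (C @ np.linalg.matrix_power(A, t - j) @ B)[0, 0]
du = np.linalg.solve(r * np.eye(N) + q * G.T @ G, q * G.T @ e_prev)
assert np.allclose(u_new, u_prev + du)
```

The final assertion confirms that the causal trial produces exactly the input that the non-causal super-vector solve would have produced.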
u_{k+1} - u_k = R^{-1} [ B^T C^T Q   B^T A^T C^T Q   ...   B^T (A^T)^{N-1} C^T Q ]
                       [ 0           B^T C^T Q       ...   B^T (A^T)^{N-2} C^T Q ]
                       [ ...         ...             ...   ...                   ]
                       [ 0           0               ...   B^T C^T Q             ] e_{k+1}   (8)

The difference of the control input ũ(t) = u_{k+1}(t) - u_k(t) at the t-th sample (obtained by examining the (t+1)-th row of (8)), with 0 ≤ t < N, can be written:

u_{k+1}(t) - u_k(t) = R^{-1} Σ_{j=t+1}^{N} B^T (A^T)^{j-t-1} C^T Q e_{k+1}(j)   (9)

u_{k+1}(t) = u_k(t) + R^{-1} Σ_{j=t+1}^{N} B^T (A^T)^{j-t-1} C^T Q e_{k+1}(j)   (10)

In order to derive a state-space description of the adjoint system, the discrete co-state p(t) is introduced, such that:

u_{k+1}(t) = u_k(t) + R^{-1} B^T p_{k+1}(t)   (11)

Comparing (10) and (11), it follows that the co-state satisfies:

p(t) = Σ_{j=t+1}^{N} (A^T)^{j-t-1} C^T Q e(j),   0 ≤ t ≤ N   (12)

By expanding the sum in (12):

p(t) = C^T Q e(t+1) + A^T C^T Q e(t+2) + (A^T)^2 C^T Q e(t+3) + ... + (A^T)^{N-t-1} C^T Q e(N)   (13)

but

A^T p(t+1) = A^T C^T Q e(t+2) + (A^T)^2 C^T Q e(t+3) + ... + (A^T)^{N-t-1} C^T Q e(N)   (14)

Hence the above yields the recursion relation for p(t):

p(t) = C^T Q e(t+1) + A^T p(t+1),   0 ≤ t < N   (15)

with p(N-1) = C^T Q e(N), or equivalently p(N) = 0. Consequently, the complete discrete input update law is as follows:

u_{k+1}(t) = u_k(t) + R^{-1} B^T p_{k+1}(t),   0 ≤ t < N   (16)

u_{k+1}(t) - u_k(t) = R^{-1} B^T p_{k+1}(t)   (17)

and

p_{k+1}(t) = C^T Q e_{k+1}(t+1) + A^T p_{k+1}(t+1),   p_{k+1}(N) = 0   (18)

The main observation is the fact that the adjoint system has a terminal condition (at t = N) instead of an initial condition; this is a sign of a non-causal solution. Equation (18) is an expression for the current trial co-state. Another expression for the co-state p_{k+1}(t) can be calculated as follows. A standard technique [6,7] is to set the discrete co-state equal to:

p_{k+1}(t) = -K(t) [I + B R^{-1} B^T K(t)]^{-1} A [x_{k+1}(t) - x_k(t)] + ξ_{k+1}(t)   (19)

with a gain matrix K(t) and a correction term ξ_{k+1}(t).
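The co-state construction can be verified numerically: the backward recursion (15)/(18) must reproduce the explicit sum (12), and R^{-1} B^T p(t) must reproduce row t of the super-vector update (9). The plant matrices, horizon and error signal below are illustrative assumptions.

```python
import numpy as np

# Illustrative plant data and a random "current trial error" signal
A = np.array([[0.5, 1.0], [0.0, 0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
N, q, r = 5, 2.0, 1.0
rng = np.random.default_rng(1)
e = rng.standard_normal(N)            # e(t+1) stored at index t

# backward recursion (15): p(t) = C^T Q e(t+1) + A^T p(t+1), p(N) = 0
p = np.zeros((N + 1, 2))
for t in range(N - 1, -1, -1):
    p[t] = (C.T * q)[:, 0] * e[t] + A.T @ p[t + 1]

# explicit sum (12): p(t) = sum_{j=t+1}^{N} (A^T)^{j-t-1} C^T Q e(j)
for t in range(N):
    s = sum(np.linalg.matrix_power(A.T, j - t - 1) @ (C.T * q)[:, 0] * e[j - 1]
            for j in range(t + 1, N + 1))
    assert np.allclose(p[t], s)

# row t of the lifted update (9) equals R^{-1} B^T p(t)
G = np.zeros((N, N))
for t in range(N):
    for j in range(t + 1):
        G[t, j] = (C @ np.linalg.matrix_power(A, t - j) @ B)[0, 0]
du = (1 / r) * q * G.T @ e
assert np.allclose(du, [(1 / r) * (B.T @ p[t])[0] for t in range(N)])
```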
Both the terms K(t) and ξ_{k+1}(t) can be computed before each trial. Let us set:

M(t) = K(t) [I + B R^{-1} B^T K(t)]^{-1} A   (20)

and then substitute this into equation (19):

p_{k+1}(t) = -M(t) [x_{k+1}(t) - x_k(t)] + ξ_{k+1}(t)   (21)

Let x̃(t) = x_{k+1}(t) - x_k(t) denote the state difference between trials. From the plant model and (17), x̃(t+1) = A x̃(t) + B R^{-1} B^T p_{k+1}(t) = [A - B R^{-1} B^T M(t)] x̃(t) + B R^{-1} B^T ξ_{k+1}(t), while the current trial error satisfies e_{k+1}(t) = e_k(t) - C x̃(t), since e = y_d - y. Substituting (21) and these relations into the co-state recursion (18) and collecting terms gives an identity of the form f(t) x̃(t) + g(t) = 0, where f(t) multiplies the state difference and g(t) is independent of the states:

f(t) = -M(t) + [C^T Q C + A^T M(t+1)] [A - B R^{-1} B^T M(t)]

g(t) = -ξ_{k+1}(t) + A^T ξ_{k+1}(t+1) + C^T Q e_k(t+1) - [C^T Q C + A^T M(t+1)] B R^{-1} B^T ξ_{k+1}(t)

Since the identity must hold for every state difference, both f(t) = 0 and g(t) = 0 are required.

1) Calculating the (feedforward) predictive term ξ_{k+1}(t). Setting f(t) = 0 and comparing with the definition (20) of M(t) identifies the bracketed matrix as the gain matrix:

K(t) = C^T Q C + A^T M(t+1)   (22)

Then setting g(t) = 0 gives [I + K(t) B R^{-1} B^T] ξ_{k+1}(t) = A^T ξ_{k+1}(t+1) + C^T Q e_k(t+1), and solving for the predictive term results in:

ξ_{k+1}(t) = [I + K(t) B R^{-1} B^T]^{-1} {A^T ξ_{k+1}(t+1) + C^T Q e_k(t+1)}

which is simply the desired expression (6) for the (feedforward) predictive term.

2) Calculating the gain matrix K(t). Substituting the definition (20) of M(t+1) into (22) gives the compact form:

K(t) = C^T Q C + A^T K(t+1) [I + B R^{-1} B^T K(t+1)]^{-1} A   (23)

By the matrix inversion lemma, K(t+1) [I + B R^{-1} B^T K(t+1)]^{-1} = K(t+1) - K(t+1) B [B^T K(t+1) B + R]^{-1} B^T K(t+1). Substituting this into (23) and expanding the brackets results in the desired relationship for the gain matrix (5), which is the usual discrete matrix Riccati equation:

K(t) = A^T K(t+1) A - A^T K(t+1) B [B^T K(t+1) B + R]^{-1} B^T K(t+1) A + C^T Q C
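The step from the compact form (23) to the standard Riccati equation (5) rests on the matrix identity K'(I + B R^{-1} B^T K')^{-1} = K' - K'B (B^T K' B + R)^{-1} B^T K'. A quick numerical check of this identity on a random positive semi-definite matrix (dimensions and values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 2
X = rng.standard_normal((n, n))
Kp = X @ X.T                              # stand-in for K(t+1), PSD
B = rng.standard_normal((n, m))
R = 2.0 * np.eye(m)                       # positive definite weight
Rinv = np.linalg.inv(R)

# left side: the compact (push-through) form used in (23)
lhs = Kp @ np.linalg.inv(np.eye(n) + B @ Rinv @ B.T @ Kp)
# right side: the matrix-inversion-lemma expansion used in (5)
rhs = Kp - Kp @ B @ np.linalg.inv(B.T @ Kp @ B + R) @ B.T @ Kp
assert np.allclose(lhs, rhs)
```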
3) Calculating the input update law for the new input u_{k+1}(t). In order to prove the desired relationship for the new input, equation (16) is used:

u_{k+1}(t) = u_k(t) + R^{-1} B^T p_{k+1}(t)

Substituting for the discrete co-state p_{k+1}(t) its equivalent from equation (19), the following equation is produced:

u_{k+1}(t) = u_k(t) + R^{-1} B^T {-K(t) [I + B R^{-1} B^T K(t)]^{-1} A [x_{k+1}(t) - x_k(t)] + ξ_{k+1}(t)}

By expanding the brackets:

u_{k+1}(t) = u_k(t) - R^{-1} B^T K(t) [I + B R^{-1} B^T K(t)]^{-1} A [x_{k+1}(t) - x_k(t)] + R^{-1} B^T ξ_{k+1}(t)

The push-through identity B^T K(t) [I + B R^{-1} B^T K(t)]^{-1} = [I + B^T K(t) B R^{-1}]^{-1} B^T K(t), together with R^{-1} [I + B^T K(t) B R^{-1}]^{-1} = [B^T K(t) B + R]^{-1}, allows the feedback gain to be rewritten as:

R^{-1} B^T K(t) [I + B R^{-1} B^T K(t)]^{-1} = [B^T K(t) B + R]^{-1} B^T K(t)

After these manipulations and simplifications the following relationship comes out, which is simply the expression for the causal Term 3 (7), the input update law:

u_{k+1}(t) = u_k(t) - [B^T K(t) B + R]^{-1} B^T K(t) A [x_{k+1}(t) - x_k(t)] + R^{-1} B^T ξ_{k+1}(t)

7. Conclusion

In this paper an optimal iterative learning algorithm for discrete systems was analyzed. The algorithm's properties and characteristics were presented, the most important feature being that the error sequence converges to zero at an exponential rate for the case of discrete-time plants. Finally, the complete causal formulation of the algorithm was derived in detail, consisting of optimal state feedback and optimal prediction feedforward.

8. References

[1] Y. Chen and C. Wen, "Iterative Learning Control: Convergence, Robustness and Applications," Springer-Verlag, London, 1999.
[2] Z. Bien and J. X.
Xu, "Iterative Learning Control: Analysis, Design, Integration and Applications," Kluwer Academic Publishers, Dordrecht, 1998.
[3] S. Arimoto, S. Kawamura and F. Miyazaki, "Bettering Operation of Dynamic Systems by Learning: A New Control Theory for Servomechanism or Mechatronic Systems," Proceedings of the 23rd IEEE Conference on Decision and Control, Las Vegas, Nevada, 1984, pp. 1064-1069.
[4] N. Amann, D. H. Owens and E. Rogers, "Iterative Learning Control for Discrete Time Systems with Exponential Rate of Convergence," IEE Proceedings - Control Theory and Applications, Vol. 143, No. 2, 1996, pp. 217-224.
[5] K. L. Moore, "Iterative Learning Control for Deterministic Systems," Advances in Industrial Control Series, Springer-Verlag, London, 1993.
[6] K. J. Astrom and B. Wittenmark, "Computer-Controlled Systems," 2nd Edition, Prentice Hall, Englewood Cliffs, 1990.
[7] T. Kailath, "Linear Systems," Prentice Hall, Englewood Cliffs, 1980.
[8] N. Amann, "Optimal Algorithms for Iterative Learning Control," Ph.D. Dissertation, University of Exeter, UK, 1996.