Intelligent Control and Automation, 2010, 1, 90-95
doi:10.4236/ica.2010.12010 Published Online November 2010 (http://www.SciRP.org/journal/ica)
Copyright © 2010 SciRes. ICA

Analysis and Causal Formulation Proof of an Optimal Iterative Learning Algorithm

Vassiliki Vita1, Marina Vassilaki1, Panagiotis Karampelas1,2, George E. Chatzarakis1
1Department of Electrical Engineering Educators, School of Pedagogical and Technological Education (ASPETE), Athens, Greece
2IT Faculty, Hellenic American University, Athens, Greece
E-mail: vasvita@yahoo.co.uk, pkarampelas@hau.gr, geaxatz@otenet.gr
Received August 12, 2010; revised September 6, 2010; accepted September 19, 2010

Abstract

Iterative learning control (ILC) is used to control systems that operate in a repetitive mode, improving tracking accuracy of the control by transferring data from one repetition of a task to the next. In this paper an optimal iterative learning algorithm for discrete linear systems is analyzed and a solution for its attainment is proposed. Finally, the mathematical proof of the algorithm's causal formulation is provided in its complete form, since the algorithm's implementation requires its causal formulation.

Keywords: Causal Formulation, Discrete Systems, Iterative Learning Control

1. Introduction

Iterative Learning Control (ILC) is a relatively new concept in control theory. It evolved from the need to control dynamical systems that are supposed to carry out a given task repetitively; more specifically, systems where the desired output values are a function of time and the task is carried out repeatedly. A classical example is the robot in the car industry that follows a given trajectory and welds at specific points along the trajectory. Normally the robot would be tuned once, through feedback or feedforward control or even a combination of both. After the tuning, it would carry out the repetitions performing in the same way.
The obvious drawback of this approach is that if there is an error between the measured trajectory and the reference trajectory, due to a wrong selection of the control input trajectory, the robot will repeat this error at each trial; i.e., if there is an error in the performance, it will repeat the same error at each iteration. To overcome this problem, Arimoto, one of the inventors of ILC [1,2], suggested that both the information from the previous tasks or "trials" and the current task should be used to improve the control action during the current trial. In other words, the controller should learn iteratively the correct control actions in order to minimize the difference between the output of the system and the given reference signal. He called this method a "betterment process" [3]. This approach is more or less an imitation of the learning process of every intelligent being. Intelligent beings tend to learn by performing a trial (i.e., selecting a control input) and observing the end result of this control input selection. After that they try to change their behaviour (i.e., to pick a new control input) in order to obtain an improved performance during the next trial. Because a) the overall idea of ILC and the procedure results in a controller that learns the correct control input, and b) the learning is done through iteration, the term Iterative Learning Control is nowadays used to describe control algorithms that realize the "betterment process" as suggested by Arimoto [3]. In this work an optimal iterative learning algorithm for discrete linear systems is analyzed and a solution for its attainment is proposed. Finally, the mathematical proof of the algorithm's causal formulation is provided in its complete form, since the algorithm's implementation requires its causal formulation.

2. The Iterative Learning Control Aim

The aim of the ILC algorithm [4,5] is to find iteratively an optimal input u* for the plant under investigation.
This input, when applied to the plant, should generate an output y* that tracks the desired output y_d "as accurately as possible". The use of the phrase "as accurately as possible" states the significance of obtaining the smallest possible difference between the actual output y* and the desired output y_d. This difference is the error e, and the above leads to the conclusion that the error e is desired to be minimal. The ultimate aim of ILC is to push the error e to zero. It is thus clear that applying an ILC algorithm results in a two-dimensional system, because the information propagates both along the time axis t and along the iteration axis k. The 2-dimensionality of iterative learning control introduces problems in the analysis as well as the design of the system; the two dimensions are an expression of the system's dependent dynamics: a) the trial index k and b) the elapsed time t during each trial. In order to overcome such difficulties, an algorithm for Iterative Learning Control is introduced, which has the property of solving the 2-dimensionality problem.

3. Proposed Cost Function

One way of solving this problem is to introduce a cost function for the system under investigation. The minimization of the cost function will provide an effective algorithm for iterative learning control (ILC). The following cost function is proposed:

J(u_{k+1}) = ||u_{k+1} - u_k||_R^2 + ||e_{k+1}||_Q^2   (1)

where J(u_{k+1}) is the cost function with respect to the current trial input, ||u_{k+1} - u_k||^2 is the norm of the difference between the current and previous trial inputs, ||e_{k+1}||^2 is the norm of the current error, and R, Q are symmetric and positive definite weighting matrices. It is assumed that R and Q are diagonal matrices, and for simplicity R = rI and Q = qI, where q and r are positive real numbers, q, r ∈ R. In ILC literature, researchers have suggested different cost functions to solve the ILC problem.
The reasons that lead to the realization that the selected cost function (1) is effective and appropriate are the following:

1) The ||u_{k+1} - u_k||^2 factor represents the importance of keeping the deviation of the input between the trials small. Intuitively this should result in smooth convergence behavior. This requirement could also be stated as the need for producing smooth control signals, in order to obtain smooth manipulation of actuators.

2) The ||e_{k+1}||^2 factor represents the main objective of reducing the tracking error at each iteration.

3) The weighting matrices R and Q are used to state which of the above two factors plays the more significant role in the cost function. If the interest is focused on keeping the deviation of the input between the trials small, then the ratio r/q has to be "large". On the other hand, if keeping the error small is more significant, then the ratio r/q has to be "small". The actual meaning of "small" and "large" depends on the system being considered and the units measured.

4) The optimal value of the cost function is bounded. If the cost function is evaluated with u_{k+1} = u_k, then (1) becomes J(u_k) = ||u_k - u_k||_R^2 + ||e_k||_Q^2 = ||e_k||_Q^2, and hence the optimal value satisfies J(u_{k+1}) ≤ ||e_k||_Q^2. It is also clear that the optimal cost function has a lower bound: J(u_{k+1}) ≥ ||e_{k+1}||_Q^2. Hence, combining the above, the upper and lower bounds are expressed with (2):

||e_{k+1}||_Q^2 ≤ J(u_{k+1}) ≤ ||e_k||_Q^2   (2)

4. The Norm-Optimal ILC Algorithm

The differentiation of the cost function (1) with respect to u_{k+1} produces the solution used to update the input, the norm-optimal ILC algorithm, which is investigated in this work.
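As a numerical check of the bounds (2), the following sketch minimizes (1) for a single trial in lifted (super-vector) form, y = Gu. The first-order plant, horizon, desired output and weights are illustrative assumptions, not values from the paper.

```python
import numpy as np

# One norm-optimal trial in lifted form, assuming an illustrative
# first-order SISO plant with impulse response a^i (not from the paper).
N, a = 8, 0.9
G = np.array([[a ** (t - j) if j <= t else 0.0 for j in range(N)]
              for t in range(N)])        # lifted plant operator: y = G u
q, r = 10.0, 1.0                         # Q = qI, R = rI as in Section 3

y_d = np.ones(N)                         # desired output (arbitrary choice)
u_k = np.zeros(N)                        # previous trial input
e_k = y_d - G @ u_k                      # previous trial error

# Minimizing (1) over u_{k+1}, with e_{k+1} = e_k - G(u_{k+1} - u_k),
# gives the linear system (R + G^T Q G)(u_{k+1} - u_k) = G^T Q e_k.
du = np.linalg.solve(r * np.eye(N) + q * G.T @ G, q * G.T @ e_k)
e_next = e_k - G @ du                    # current trial error e_{k+1}

J = r * du @ du + q * e_next @ e_next    # optimal value of the cost (1)

# bounds (2): ||e_{k+1}||_Q^2 <= J(u_{k+1}) <= ||e_k||_Q^2
assert q * e_next @ e_next <= J <= q * e_k @ e_k
```

The upper bound holds because u_{k+1} = u_k is always a feasible (zero-step) choice, and the lower bound because the input-deviation term of (1) is non-negative.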
The input update law is the following:

∂J(u_{k+1})/∂u_{k+1} = 0  ⟹  u_{k+1} = u_k + R^{-1} G^T Q e_{k+1} = u_k + G* e_{k+1}   (3)

or

u_{k+1} = u_k + R^{-1} G^T Q e_{k+1}   (4)

where u_{k+1} is the current trial input, u_k is the previous trial input, e_{k+1} is the current trial error, and G* = R^{-1} G^T Q is the gain matrix (the adjoint of the plant), which represents the relative weighting of the cost function requirements (error versus input deviation).

5. Characteristics and Properties of the ILC Algorithm

The main characteristics of the update law for the proposed iterative learning control scheme (3) can be summarized as follows:

1) Non-causality. The dependence of the input update law (3) on the adjoint of the plant G*, where G* is a function of the error at sample times later than the current one, implies non-causality.

2) Integrator-type approach. The algorithm utilizes the input change instead of the input itself to evaluate the performance of the system.

3) Feedback approach. This characteristic is one of the most important ones. Several authors suggested feedforward approaches where the error of the previous trial was used in the input update law: u_{k+1} = u_k + K e_k. On the contrary, the update law in (3) is of the form u_{k+1} = u_k + K e_{k+1}, which means that it uses feedback of the current trial error; the feedback operator K is in this case the adjoint of the plant G*. The feedforward approaches face the problem of ignoring current trial effects such as plant modeling errors and possible disturbances, which means the algorithm is not sensitive to these effects. On the contrary, the feedback approach of (3) takes all of this into account, producing in this way a more robust solution, and this is likely to be a more effective one in the plant-stabilizing sense.

4) Descent algorithm. With the help of the inequality ||e_{k+1}||_Q^2 ≤ J(u_{k+1}) ≤ ||e_k||_Q^2, it is deduced that the norm of the error e_{k+1} is monotonically decreasing from trial to trial.
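Both the feedback character of (4) and the descent property can be observed numerically. The sketch below iterates the lifted-form update over several trials; the plant and weights are illustrative assumptions. Since (4) involves the current trial error e_{k+1}, it is implicit in u_{k+1}; the code computes the step by the equivalent explicit solve and then checks (4) directly.

```python
import numpy as np

# Several trials of the norm-optimal update on an assumed lifted
# SISO plant G (lower-triangular impulse-response matrix).
N, a, q, r = 8, 0.9, 5.0, 1.0
G = np.array([[a ** (t - j) if j <= t else 0.0 for j in range(N)]
              for t in range(N)])
y_d = np.sin(np.linspace(0.1, np.pi, N))   # illustrative desired output

u = np.zeros(N)
err_norms = []
for k in range(15):
    e = y_d - G @ u
    err_norms.append(q * e @ e)            # ||e_k||_Q^2
    # Substituting e_{k+1} = e_k - G(u_{k+1} - u_k) into (4) yields the
    # explicit linear solve for the input step:
    du = np.linalg.solve(r * np.eye(N) + q * G.T @ G, q * G.T @ e)
    e_next = e - G @ du
    # feedback form (4): the step equals R^{-1} G^T Q e_{k+1}
    assert np.allclose(du, (q / r) * G.T @ e_next)
    u = u + du

# descent property 4): ||e_k||_Q^2 is monotonically non-increasing
assert all(b <= a2 + 1e-12 for a2, b in zip(err_norms, err_norms[1:]))
```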
Two of the most important properties of the algorithm are presented below:

1) Convergence of the error sequence to zero. It can be shown not only that the error e_k converges to a final value e_∞ as the number of iterations k → ∞, but also that e_∞ = 0. It is significant to mention that the convergence is exponential. It is also significant to mention at this point that the error sequence convergence can be proved even for an arbitrarily chosen initial error e_0. This is a form of robustness, since it ensures that the error will converge to zero independently of the initial error value.

2) Convergence of the input sequence. It can be shown that the input of the algorithm converges to a final value u_∞ as the number of trials k → ∞.

6. Causal Formulation of the ILC Algorithm

One of the main characteristics of the ILC algorithm (3) is that the algorithm seems to be non-causal. Because of its characteristics it is a very efficient algorithm in terms of plant control. The main drawback is the fact that an enormous amount of data and computational effort is required due to the size of the matrices. A solution to this problem would be a recursive implementation of the algorithm; on the other hand, in order to be able to implement the algorithm it is essential to derive it in a causal form.

6.1. The Proposed Causal Algorithm

In what follows the plant is assumed to be described by the discrete state-space model x_k(t+1) = A x_k(t) + B u_k(t), y_k(t) = C x_k(t), over the finite horizon t ∈ [0, N]. The causal solution to the above problem is given by introducing the following proposed algorithm for Norm-Optimal ILC, which consists of the following terms:

Term 1: The gain matrix K(t), given in the form of the discrete Riccati equation [6,7]:

K(t) = A^T K(t+1) A - A^T K(t+1) B [B^T K(t+1) B + R]^{-1} B^T K(t+1) A + C^T Q C   (5)

for t ∈ [0, N-1] and with the terminal condition K(N) = 0.

Term 2: The feedforward (predictive) term:

ξ_{k+1}(t) = [I + K(t) B R^{-1} B^T]^{-1} {A^T ξ_{k+1}(t+1) + C^T Q e_k(t+1)}   (6)

for t ∈ [0, N-1] and with the terminal condition ξ_{k+1}(N) = 0.

Term 3: The input update law.
u_{k+1}(t) = u_k(t) - [B^T K(t) B + R]^{-1} B^T K(t) A [x_{k+1}(t) - x_k(t)] + R^{-1} B^T ξ_{k+1}(t)   (7)

It can easily be observed that Term 1 is independent of the inputs, outputs and states of the system; Term 2, the predictive term ξ_{k+1}(t), depends on the previous trial error e_k(t); and Term 3, the input update law, depends on the previous trial input u_k(t), the current trial state x_{k+1}(t), the previous trial state x_k(t), and the predictive term ξ_{k+1}(t). This is hence a causal iterative learning control algorithm consisting of current-trial full-state feedback along with feedforward of the previous trial error data.

6.2. Mathematical Proof of the Causal Formulation

In this section a complete mathematical proof of the algorithm is provided, demonstrating all the necessary steps from the optimization problem to the conception, development and finally the acquisition of the three terms of the algorithm's causal form. The proof is based on the general idea proposed in [8], but it is for the first time that it is given in its complete form, as it includes all the intermediate steps and calculations in an analytic way. The proof is as follows:

Rewriting the input update law as the difference between the current trial input and the previous trial input gives relationship (4): u_{k+1} - u_k = R^{-1} G^T Q e_{k+1}. This form of the input update law, when written in super-vector notation, results in relationship (8).
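To illustrate that Terms 1-3 reproduce the non-causal norm-optimal update, the sketch below runs one trial of the causal algorithm (backward passes for K(t) and ξ_{k+1}(t), then a forward pass for the input) and compares the result with the lifted-form solution. The plant matrices, horizon and weights are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Illustrative second-order plant x(t+1) = Ax(t) + Bu(t), y(t) = Cx(t)
A = np.array([[0.8, 0.2], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
N, q, r = 6, 4.0, 1.0
Q, R = q * np.eye(1), r * np.eye(1)
Rinv = np.linalg.inv(R)
I2 = np.eye(2)
y_d = np.ones(N)                      # desired output at t = 1..N

def trial(u):
    """Simulate one trial from x(0) = 0; e(t+1) is stored at index t."""
    x = np.zeros((N + 1, 2))
    e = np.zeros(N)
    for t in range(N):
        x[t + 1] = A @ x[t] + B[:, 0] * u[t]
        e[t] = y_d[t] - (C @ x[t + 1])[0]
    return x, e

u_prev = np.zeros(N)
x_prev, e_prev = trial(u_prev)

# Term 1: backward Riccati recursion (5), terminal condition K(N) = 0
K = [None] * (N + 1)
K[N] = np.zeros((2, 2))
for t in range(N - 1, -1, -1):
    Kn = K[t + 1]
    gain = np.linalg.solve(B.T @ Kn @ B + R, B.T @ Kn @ A)
    K[t] = A.T @ Kn @ A - A.T @ Kn @ B @ gain + C.T @ Q @ C

# Term 2: backward predictive term (6), driven by the PREVIOUS trial error
xi = [None] * (N + 1)
xi[N] = np.zeros(2)
for t in range(N - 1, -1, -1):
    rhs = A.T @ xi[t + 1] + (C.T @ Q)[:, 0] * e_prev[t]   # C^T Q e_k(t+1)
    xi[t] = np.linalg.solve(I2 + K[t] @ B @ Rinv @ B.T, rhs)

# Term 3: forward pass with current-trial state feedback (7)
u_new = np.zeros(N)
x_new = np.zeros((N + 1, 2))
for t in range(N):
    dx = x_new[t] - x_prev[t]
    fb = np.linalg.solve(B.T @ K[t] @ B + R, B.T @ K[t] @ A @ dx)
    u_new[t] = u_prev[t] - fb[0] + (Rinv @ B.T @ xi[t])[0]
    x_new[t + 1] = A @ x_new[t] + B[:, 0] * u_new[t]

# Cross-check against the lifted solution (R + G^T Q G) du = G^T Q e_k
G = np.zeros((N, N))
for t in range(N):
    for j in range(t + 1):
        G[t, j] = (C @ np.linalg.matrix_power(A, t - j) @ B)[0, 0]
du = np.linalg.solve(r * np.eye(N) + q * G.T @ G, q * G.T @ e_prev)
assert np.allclose(u_new, u_prev + du)
```

The final assertion confirms that the causal trial produces exactly the input that the non-causal super-vector solve would have produced.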
u_{k+1} - u_k = R^{-1} [ B^T C^T Q   B^T A^T C^T Q   ...   B^T (A^T)^{N-1} C^T Q ]
                       [ 0           B^T C^T Q       ...   B^T (A^T)^{N-2} C^T Q ]
                       [ ...         ...             ...   ...                   ]
                       [ 0           0               ...   B^T C^T Q             ] e_{k+1}   (8)

The difference of the control input ũ(t) = u_{k+1}(t) - u_k(t) at the t-th sample (obtained by examining the (t+1)-th row of (8)), with 0 ≤ t < N, can be written:

u_{k+1}(t) - u_k(t) = R^{-1} Σ_{j=t+1}^{N} B^T (A^T)^{j-t-1} C^T Q e_{k+1}(j)   (9)

u_{k+1}(t) = u_k(t) + R^{-1} Σ_{j=t+1}^{N} B^T (A^T)^{j-t-1} C^T Q e_{k+1}(j)   (10)

In order to derive a state-space description of the adjoint system, the discrete co-state p(t) is introduced, such that:

u_{k+1}(t) = u_k(t) + R^{-1} B^T p_{k+1}(t)   (11)

Comparing (10) and (11), it follows that the co-state satisfies:

p(t) = Σ_{j=t+1}^{N} (A^T)^{j-t-1} C^T Q e(j),   0 ≤ t ≤ N   (12)

By expanding the sum in (12):

p(t) = C^T Q e(t+1) + A^T C^T Q e(t+2) + (A^T)^2 C^T Q e(t+3) + ... + (A^T)^{N-t-1} C^T Q e(N)   (13)

but

A^T p(t+1) = A^T C^T Q e(t+2) + (A^T)^2 C^T Q e(t+3) + ... + (A^T)^{N-t-1} C^T Q e(N)   (14)

Hence the above yields the recursion relation for p(t):

p(t) = C^T Q e(t+1) + A^T p(t+1),   0 ≤ t < N   (15)

with p(N-1) = C^T Q e(N), or equivalently p(N) = 0. Consequently, the complete discrete input update law is as follows:

u_{k+1}(t) = u_k(t) + R^{-1} B^T p_{k+1}(t),   0 ≤ t < N   (16)

u_{k+1}(t) - u_k(t) = R^{-1} B^T p_{k+1}(t)   (17)

and

p_{k+1}(t) = C^T Q e_{k+1}(t+1) + A^T p_{k+1}(t+1),   p_{k+1}(N) = 0   (18)

The main observation is the fact that the adjoint system has a terminal condition (at t = N) instead of an initial condition; this is a sign of a non-causal solution. Equation (18) is an expression for the current trial co-state. Another expression for the co-state p_{k+1}(t) can be calculated as follows. A standard technique [6,7] is to set the discrete co-state equal to:

p_{k+1}(t) = -K(t) [I + B R^{-1} B^T K(t)]^{-1} A [x_{k+1}(t) - x_k(t)] + ξ_{k+1}(t)   (19)

with a gain matrix K(t) and a correction term ξ_{k+1}(t).
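The co-state construction can be verified numerically: the backward recursion (15)/(18) must reproduce the explicit sum (12), and R^{-1} B^T p(t) must reproduce row t of the super-vector update (9). The plant matrices, horizon and error signal below are illustrative assumptions.

```python
import numpy as np

# Illustrative plant data and a random "current trial error" signal
A = np.array([[0.5, 1.0], [0.0, 0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
N, q, r = 5, 2.0, 1.0
rng = np.random.default_rng(1)
e = rng.standard_normal(N)            # e(t+1) stored at index t

# backward recursion (15): p(t) = C^T Q e(t+1) + A^T p(t+1), p(N) = 0
p = np.zeros((N + 1, 2))
for t in range(N - 1, -1, -1):
    p[t] = (C.T * q)[:, 0] * e[t] + A.T @ p[t + 1]

# explicit sum (12): p(t) = sum_{j=t+1}^{N} (A^T)^{j-t-1} C^T Q e(j)
for t in range(N):
    s = sum(np.linalg.matrix_power(A.T, j - t - 1) @ (C.T * q)[:, 0] * e[j - 1]
            for j in range(t + 1, N + 1))
    assert np.allclose(p[t], s)

# row t of the lifted update (9) equals R^{-1} B^T p(t)
G = np.zeros((N, N))
for t in range(N):
    for j in range(t + 1):
        G[t, j] = (C @ np.linalg.matrix_power(A, t - j) @ B)[0, 0]
du = (1 / r) * q * G.T @ e
assert np.allclose(du, [(1 / r) * (B.T @ p[t])[0] for t in range(N)])
```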
Both the terms K(t) and ξ_{k+1}(t) can be computed before each trial. Let us set:

M(t) = K(t) [I + B R^{-1} B^T K(t)]^{-1} A   (20)

and then substitute this into equation (19):

p_{k+1}(t) = -M(t) [x_{k+1}(t) - x_k(t)] + ξ_{k+1}(t)   (21)

Let x̃(t) = x_{k+1}(t) - x_k(t) denote the state difference between trials. From the plant model and (17), x̃(t+1) = A x̃(t) + B R^{-1} B^T p_{k+1}(t) = [A - B R^{-1} B^T M(t)] x̃(t) + B R^{-1} B^T ξ_{k+1}(t), while the current trial error satisfies e_{k+1}(t) = e_k(t) - C x̃(t), since e = y_d - y. Substituting (21) and these relations into the co-state recursion (18) and collecting terms gives an identity of the form f(t) x̃(t) + g(t) = 0, where f(t) multiplies the state difference and g(t) is independent of the states:

f(t) = -M(t) + [C^T Q C + A^T M(t+1)] [A - B R^{-1} B^T M(t)]

g(t) = -ξ_{k+1}(t) + A^T ξ_{k+1}(t+1) + C^T Q e_k(t+1) - [C^T Q C + A^T M(t+1)] B R^{-1} B^T ξ_{k+1}(t)

Since the identity must hold for every state difference, both f(t) = 0 and g(t) = 0 are required.

1) Calculating the (feedforward) predictive term ξ_{k+1}(t). Setting f(t) = 0 and comparing with the definition (20) of M(t) identifies the bracketed matrix as the gain matrix:

K(t) = C^T Q C + A^T M(t+1)   (22)

Then setting g(t) = 0 gives [I + K(t) B R^{-1} B^T] ξ_{k+1}(t) = A^T ξ_{k+1}(t+1) + C^T Q e_k(t+1), and solving for the predictive term results in:

ξ_{k+1}(t) = [I + K(t) B R^{-1} B^T]^{-1} {A^T ξ_{k+1}(t+1) + C^T Q e_k(t+1)}

which is simply the desired expression (6) for the (feedforward) predictive term.

2) Calculating the gain matrix K(t). Substituting the definition (20) of M(t+1) into (22) gives the compact form:

K(t) = C^T Q C + A^T K(t+1) [I + B R^{-1} B^T K(t+1)]^{-1} A   (23)

By the matrix inversion lemma, K(t+1) [I + B R^{-1} B^T K(t+1)]^{-1} = K(t+1) - K(t+1) B [B^T K(t+1) B + R]^{-1} B^T K(t+1). Substituting this into (23) and expanding the brackets results in the desired relationship for the gain matrix (5), which is the usual discrete matrix Riccati equation:

K(t) = A^T K(t+1) A - A^T K(t+1) B [B^T K(t+1) B + R]^{-1} B^T K(t+1) A + C^T Q C
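The step from the compact form (23) to the standard Riccati equation (5) rests on the matrix identity K'(I + B R^{-1} B^T K')^{-1} = K' - K'B (B^T K' B + R)^{-1} B^T K'. A quick numerical check of this identity on a random positive semi-definite matrix (dimensions and values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 2
X = rng.standard_normal((n, n))
Kp = X @ X.T                              # stand-in for K(t+1), PSD
B = rng.standard_normal((n, m))
R = 2.0 * np.eye(m)                       # positive definite weight
Rinv = np.linalg.inv(R)

# left side: the compact (push-through) form used in (23)
lhs = Kp @ np.linalg.inv(np.eye(n) + B @ Rinv @ B.T @ Kp)
# right side: the matrix-inversion-lemma expansion used in (5)
rhs = Kp - Kp @ B @ np.linalg.inv(B.T @ Kp @ B + R) @ B.T @ Kp
assert np.allclose(lhs, rhs)
```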
3) Calculating the input update law for the new input u_{k+1}(t). In order to prove the desired relationship for the new input, equation (16) is used:

u_{k+1}(t) = u_k(t) + R^{-1} B^T p_{k+1}(t)

Substituting for the discrete co-state p_{k+1}(t) its equivalent from equation (19), the following equation is produced:

u_{k+1}(t) = u_k(t) + R^{-1} B^T {-K(t) [I + B R^{-1} B^T K(t)]^{-1} A [x_{k+1}(t) - x_k(t)] + ξ_{k+1}(t)}

By expanding the brackets:

u_{k+1}(t) = u_k(t) - R^{-1} B^T K(t) [I + B R^{-1} B^T K(t)]^{-1} A [x_{k+1}(t) - x_k(t)] + R^{-1} B^T ξ_{k+1}(t)

The push-through identity B^T K(t) [I + B R^{-1} B^T K(t)]^{-1} = [I + B^T K(t) B R^{-1}]^{-1} B^T K(t), together with R^{-1} [I + B^T K(t) B R^{-1}]^{-1} = [B^T K(t) B + R]^{-1}, allows the feedback gain to be rewritten as:

R^{-1} B^T K(t) [I + B R^{-1} B^T K(t)]^{-1} = [B^T K(t) B + R]^{-1} B^T K(t)

After these manipulations and simplifications the following relationship comes out, which is simply the expression for the causal Term 3 (7), the input update law:

u_{k+1}(t) = u_k(t) - [B^T K(t) B + R]^{-1} B^T K(t) A [x_{k+1}(t) - x_k(t)] + R^{-1} B^T ξ_{k+1}(t)

7. Conclusion

In this paper an optimal iterative learning algorithm for discrete systems was analyzed. The algorithm's properties and characteristics were presented, the most important feature being that the error sequence converges to zero at an exponential rate for the case of discrete-time plants. Finally, the complete causal formulation of the algorithm was derived in detail, consisting of optimal state feedback and optimal prediction feedforward.

8. References

[1] Y. Chen and C. Wen, "Iterative Learning Control: Convergence, Robustness and Applications," Springer-Verlag, London, 1999.
[2] Z. Bien and J. X.
Xu, "Iterative Learning Control: Analysis, Design, Integration and Applications," Kluwer Academic Publishers, Dordrecht, 1998.
[3] S. Arimoto, S. Kawamura and F. Miyazaki, "Bettering Operation of Dynamic Systems by Learning: A New Control Theory for Servomechanism or Mechatronic Systems," Proceedings of the 23rd IEEE Conference on Decision and Control, Las Vegas, Nevada, 1984, pp. 1064-1069.
[4] N. Amann, D. H. Owens and E. Rogers, "Iterative Learning Control for Discrete Time Systems with Exponential Rate of Convergence," IEE Proceedings - Control Theory and Applications, Vol. 143, No. 2, 1996, pp. 217-224.
[5] K. L. Moore, "Iterative Learning Control for Deterministic Systems," Advances in Industrial Control Series, Springer-Verlag, London, 1993.
[6] K. J. Astrom and B. Wittenmark, "Computer-Controlled Systems," 2nd Edition, Prentice Hall, Englewood Cliffs, 1990.
[7] T. Kailath, "Linear Systems," Prentice Hall, Englewood Cliffs, 1980.
[8] N. Amann, "Optimal Algorithms for Iterative Learning Control," Ph.D. Dissertation, University of Exeter, UK, 1996.