Intelligent distributed cooperative control for multiple nonholonomic mobile robots subject to unknown dynamics and external disturbances - Nguyen Tan Luy

This paper provides a new distributed optimal cooperative tracking control method with disturbance rejection for multiple mobile robots. The method avoids splitting the main controller into separate kinematic and dynamic controllers and reduces the ACD structure with three NNs to a structure with only one NN, for which novel NN weight-tuning laws are designed. The online optimal cooperative control algorithms of the scheme, which do not require knowledge of the internal dynamics, approximate the Nash equilibrium solutions of the HJI equations. The algorithms guarantee that the value functions and the control and disturbance laws converge simultaneously to the optimal values and that the cooperative tracking errors of the closed-loop systems and the approximation errors of the NN weights are uniformly ultimately bounded. Comparative simulations were carried out, and the experimental results on a testbed equipped with an omnidirectional vision system were consistent with the simulations. The experimental results suggest that the method is effective for practical control systems involving multiple nonholonomic mobile mechanical agents or autonomous vehicles that track both positions and velocities.



advantages. First, they included two iterative loops, i.e., while the parameters of the disturber NN were updated in one iterative loop, the parameters of the actor NN had to wait to be updated in the other loop. Second, knowledge of the system internal dynamics was required. Finally, the initial stability of the system strongly depended on how the weights of the three NNs were initialized. For those reasons, increased computational complexity and wasted resources were inevitable [21, 22]. In our previous work [23], the design of an optimal cooperative control of multiple MIMO nonlinear systems overcame the drawbacks of using many NNs. However, disturbance rejection was only considered for each agent, not for neighborhoods. To the best of our knowledge, optimal cooperative tracking control schemes with disturbance rejection for multiple nonlinear agents without knowledge of internal dynamics, with application to nonholonomic robot systems, have not yet been considered. In this paper, we provide such a scheme with the following main contributions.

1. A bounded $L_2$-gain synchronization problem of a multi-NMR system in a distributed communication graph is formulated. In contrast with the work in [18], we avoid separating kinematics and dynamics when designing the control scheme. Thus, a performance index function subject to all signals is minimized.

2. An optimal cooperative tracking control scheme is designed. This work extends the work of [14] in three respects: (i) nonlinear agents are considered instead of linear agents; (ii) only one NN is used for each agent instead of three, which overcomes some disadvantages caused by the large number of NNs; (iii) no prior knowledge of the system internal dynamics is needed for analyzing and designing the control algorithms.
We prove that the system parameters converge to the approximately optimal values, and that the cooperative tracking errors and the NN approximation errors are uniformly ultimately bounded.

3. Through simulations, the proposed algorithms are compared with other algorithms to demonstrate their effectiveness. To test our algorithms in practical applications, a hardware testbed, consisting of NMRs equipped with an omnidirectional vision system, is designed and constructed. The experimental results suggest that our method is effective for practical control systems involving multiple nonholonomic mobile mechanical agents or autonomous vehicles.

The paper is organized as follows. Section 2 provides the theoretical background on graphs and nonholonomic mobile robots, from which the integrated cooperative control is derived. Section 3 designs an optimal cooperative tracking control scheme. Section 4 shows the results of the simulation and experiment. A brief conclusion is given in Section 5.

2. BACKGROUND AND PRELIMINARIES

2.1. Distributed Communication Graph Theory

Consider $m$ robots in a cooperative system. The distributed communication of the system can be represented by a directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathcal{A})$, where the robots are characterized by the set of nodes $\mathcal{V} = \{s_0, \ldots, s_m\}$, in which $s_0$ is a leader node. Relationships among the robots are determined by the set of edges $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ with a connectivity weight matrix $\mathcal{A} = [a_{ij}]$, where $a_{ii} = 0$, $a_{ij} > 0$ for $(s_i, s_j) \in \mathcal{E}$, and $a_{ij} = 0$ otherwise. If the states of robot $i$ are available to robot $j$, then $s_j$ is a neighbor of $s_i$. All neighbors of $s_i$ form the set $\mathcal{N}_i = \{j : s_j \in \mathcal{V}, (s_i, s_j) \in \mathcal{E}\}$. Define the graph Laplacian matrix $\mathcal{L} = \mathcal{D} - \mathcal{A} \in \mathbb{R}^{(m+1) \times (m+1)}$, where $\mathcal{D} = \mathrm{diag}(b_i)$, $b_i = \sum_{j \in \mathcal{N}_i} a_{ij}$. Note that the row sums of $\mathcal{L}$ are equal to zero. A directed path is a sequence of ordered edges $(s_i, s_{i+1})$, $i = 0, \ldots, m-1$.
If a directed path from $s_i$ to $s_j$ exists for all $(s_i, s_j)$, $s_i \neq s_j$, then the directed graph is strongly connected. The graph has a directed spanning tree if the set $\{s_1, \ldots, s_m\}$ contains at least one node with a directed path to all other nodes. The connectivity matrix between the $i$th robot and its leader is defined as

$$\mathcal{C} = \mathrm{diag}(c_1, c_2, \ldots, c_m) \tag{1}$$

where $c_i = 1$ if the $i$th robot connects to its leader, and $c_i = 0$ otherwise.

2.2. Nonholonomic Mobile Robot and Integrated Cooperative Control Problem

Consider an NMR represented by the node $s_i$. Its mass $m_i$, including the mass of the platform without wheels and the mass of the wheels, is concentrated at the center point. The distance between the driven wheels is $b_i$. The radius of each wheel is $r_i$. The distance from the center point to the driving axle is $l_i$. Without loss of generality, $l_i$ can be set to zero. An NMR is a mechanical system with $n$ generalized configuration variables $(q_{i1}, q_{i2}, \ldots, q_{in})$ subject to $p$ nonholonomic constraints [19]. The kinematics and dynamics of the $i$th NMR are written as

$$\dot{q}_i = S_i(q_i)\, v_i(t), \qquad \bar{M}_i(q_i)\, \dot{v}_i + \bar{C}_i(q_i, \dot{q}_i)\, v_i + \bar{F}_i + \bar{\eta}_{di} = \bar{B}_i(q_i)\, \eta_i \tag{2}$$

where $q_i = [q_{i1}, q_{i2}, \ldots, q_{in}]^T \in \mathbb{R}^n$ are position vectors, and $v_i(t) = [\upsilon_i, \omega_i]^T \in \mathbb{R}^{n-p}$ are velocity vectors, in which $\upsilon_i$ and $\omega_i$ are the translational and rotational velocities. $S_i \in \mathbb{R}^{n \times (n-p)}$ are full-rank matrices, and $\bar{B}_i = S_i^T B_i$ with $B_i \in \mathbb{R}^{n \times (n-p)}$ are the input transformation matrices. $\bar{M}_i = S_i^T M_i S_i$ with $M_i \in \mathbb{R}^{n \times n}$ are inertia matrices consisting of the total mass $m_i$ and the moment of inertia $I_i$. Centripetal and Coriolis matrices are defined as $\bar{C}_i = S_i^T M_i \dot{S}_i + S_i^T C_i S_i$ with $C_i \in \mathbb{R}^{n \times n}$, and surface friction and gravitational vectors are defined as $\bar{F}_i = S_i^T F_i$ with $F_i \in \mathbb{R}^n$. Bounded disturbances, including unstructured unmodeled dynamics and external disturbances, are denoted by $\bar{\eta}_{di} = \bar{B}_i \eta_{di}$ with $\eta_{di} \in \mathbb{R}^{n-p}$, and control torque vectors are denoted by $\eta_i = [\eta_{li}, \eta_{ri}]^T \in \mathbb{R}^{n-p}$, where $\eta_{li}$ and $\eta_{ri}$ are the left and right torques, respectively.
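The kinematic part of (2) can be illustrated with a minimal sketch. It assumes the common differential-drive case with $n = 3$, $p = 1$, $q_i = [x_i, y_i, \theta_i]^T$ and $l_i = 0$, for which a standard choice of $S_i(q_i)$ has columns for pure translation and pure rotation. This explicit matrix, the function names `S_matrix`, `q_dot` and `constraint_row`, and the numerical values are assumptions made for illustration, not taken from the paper.

```python
import numpy as np

def S_matrix(theta):
    """Assumed kinematic matrix S(q) for q = [x, y, theta], v = [upsilon, omega], l = 0."""
    return np.array([[np.cos(theta), 0.0],
                     [np.sin(theta), 0.0],
                     [0.0,           1.0]])

def q_dot(q, v):
    """Kinematics of (2): q_dot = S(q) v."""
    return S_matrix(q[2]) @ np.asarray(v, dtype=float)

def constraint_row(theta):
    """Row A(q) of the no-side-slip constraint A(q) q_dot = 0 (the p = 1 constraint)."""
    return np.array([-np.sin(theta), np.cos(theta), 0.0])

# Illustrative state and velocity (hypothetical values).
q = np.array([0.0, 0.0, np.pi / 4])
v = np.array([0.2, 0.5])          # [translational m/s, rotational rad/s]
dq = q_dot(q, v)
slip = constraint_row(q[2]) @ dq  # should vanish for any admissible v
```

Because the columns of `S_matrix` span the null space of `constraint_row`, every velocity generated through $S_i$ satisfies the nonholonomic constraint by construction.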
Properties 1: iM are asymmetric positive definite matrices. The parameters in (2) are bounded Nguyen Tan Luy, Tran Ngoc Anh, Dang Quang Minh 144 (Boundedness [19, 24]), i.e., min maxii im mM , min maxii ic cC , maxdi idη η , min maxqii is sg , 1 1 max minvi i ii im gB m B , 1 1 max minvi iim mk , and max max maxi i i i i iF f q f s v max max maxi i i i i iF f q f s v , with positive constant scalars minim , maxim , maxic , minic , max min,di iη s , maxis and maxif . From (2), the nonlinear dynamics of NMR i with unknown internal dynamics vif and disturbances diη can be written as ( ) ( ) ( , ) ( , ) ( , ) qi i qi i i vi i i vi i i i vi i i i di i f q g q v f q v g q v q v η k q v η (3) where 0qif , qi ig S , 1 vi i i i if M C v F , 1 ivi iMg B , 1 vi ik M . Remark 1: vif is Lipschitz on the compact set n p iΩ such that 1 min max max max max( , ( )) i i i i i i ivi i if q s μv m c f v v for a positive constant scalar maxiμ [25]. diη has 2L -gain with 2 0 diη dt [26]. Definition 1: It is assumed that a leader robot (virtual robot) generates the bounded smooth trajectory 0 nq that holds 0 0 0 0 0 0 0 0 ( ) ( , )v q S q v v f q v (4) where 0v is the desired velocity, and 0vf is an acceleration function which satisfies Lipschitz assumption [27]. Then the cooperative tracking control problem of the multi-NMR system is to design iη in (3) so that when 0diη , if each NMR directly connects to his leader, 0( ) ( ) 0iq t q t and 0( ) ( ) 0iv t v t , or directly connects to its neighborhoods, ij : ( ) ( ) 0i jq t q t , ( ) ( ) 0i jv t v t . For the ith NMR, the local tracking error functions are defined as [7] 0( ) ( ) i qi ij i j i ij e a q q c q q (5) 0( ) ( ).i i v ij i j i ij e a v v c v v (6) Furthermore, to avoid collisions, (5) is written as 0( ) ( ). i qi ij i i j j i i ij N e a q δ q δ c q δ q (7) where n iδ are coordinates of the front points on NMRs i and j . 
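The local tracking errors (5) and (6) can be sketched numerically. The example below assumes a hypothetical three-follower chain (leader 0 to robot 1 to robot 2 to robot 3), follower-only matrices `A` and `c`, and made-up positions; none of these values come from the paper. It also checks that the per-robot sums agree with the stacked Kronecker form $((\mathcal{L} + \mathcal{C}) \otimes I_n)(q - \underline{1} \otimes q_0)$.

```python
import numpy as np

# Hypothetical follower graph: A[i, j] > 0 when robot i+1 uses the state of robot j+1.
A = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
c = np.array([1.0, 0.0, 0.0])          # only robot 1 is connected to the leader
L = np.diag(A.sum(axis=1)) - A         # follower Laplacian, row sums are zero

def local_errors(x, x0):
    """e_i = sum_j a_ij (x_i - x_j) + c_i (x_i - x_0), as in (5) and (6)."""
    e = np.zeros_like(x)
    for i in range(len(x)):
        for j in range(len(x)):
            e[i] += A[i, j] * (x[i] - x[j])
        e[i] += c[i] * (x[i] - x0)
    return e

# Made-up positions [x, y, theta] of the leader and the three followers.
q0 = np.array([1.0, 2.0, 0.0])
q = np.array([[0.5, 2.0, 0.1],
              [0.0, 1.0, 0.0],
              [-1.0, 0.5, -0.2]])

e_q = local_errors(q, q0)
# Stacked form of the same errors.
e_q_stacked = np.kron(L + np.diag(c), np.eye(3)) @ (q - q0).reshape(-1)
```

The same function applies unchanged to velocities for (6); for the collision-avoiding variant (7), the offsets $\delta_i$ would be added to the positions before calling it.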
Taking the derivative of (5) or (7) and (6), with notation of (3) and (4), functions of tracking dynamics are rewritten as 0 ( ) i qi i i i qi i ij qj jj e c q b c g v a g v (8) Intelligent distributed cooperative control for multiple nonholonomic mobile robots 145 ( ) ( )( ) ( ) i vi evi i i vi i vi di ij vj j vj djj e f t b c g η k η a g η k η (9) where 0( ) ( ) ( ) i evi ij vi vj i vi vj f t a f f c f f . For m NMRs, the overall functions of tracking dynamics are given by 0 ( ) ( )q n n qe I q I g q v (10) ( ) ( ) ( , ) ( , )( )v ev n p v v de f t I g q v η k q v η (11) where is Kronecker product operator, and 0 01 nq I q , with 1 1,...,1 m , and ,n n pI I are identify matrices, 1 ,..., mq q q , 1,...,q q qme e e , 1( ) ,...,q q qmg q diag g g , 1 ,..., mv v v , 1,...,v v vme e e , 1( ) ,...,ev ev evmf t f f , 1( , ) , ,v v vmg q v diag g g , ( , )vk q v 1, ,v vmdiag k k , 1 ,..., mη η η , 1,...,d d dmη η η . , where and are sub-matrices of and , formed by removing the elements of the leader. Note that, (10) and (11) denote kinematic and dynamic equations. In almost reported studies, kinematic and dynamic controllers were designed successively based on these equations. In this paper, the objective of our design is to gain integrated controllers without separating kinematics and dynamics. To this end, the following transformations are performed. 
Adding and subtracting ( )n q vI g q e to (10), yields ( )( ) ( )( ) ( )q n q a q ve I g q v v g q e (12) where 1[ ,..., ]a a amv v v , holds 0( ) ( ) .( ) ( )n q a n n q vI g q v I q I g q e (13) Adding and subtracting ( )n q qI g q e to (11), yields ( ) ( ) ( , )( ) ( , ) ( )( )v ev n p v a v d n q qe f t I g q v η η k q v η I g q e (14) where 1[ ,..., ]a a amη η η , holds ( ) ( , ) ( ) .n p v a n q qI g q v η I g q e (15) Next, the pseudo-control inputs of (10) and the real-control inputs of (11) are defined as * av v v (16) * aη η η (17) where the bounded 2L -gain optimal control inputs * * * 1 ,..., mv v v and * * * 1 ,..., mη η η will be designed in the next section. Nguyen Tan Luy, Tran Ngoc Anh, Dang Quang Minh 146 Now, we introduce Lemma 1 to convert the cooperative tracking control problem of the multi-NMR system into the stabilization of an integrated cooperative dynamical system. Lemma 1: Let the integrated cooperative control inputs *[ , ]r au v η u u hold (16) and (17), where the inputs [ , ]a a au v η hold (13) and (15), and the bounded 2L -gain optimal control inputs * * *[ , ]u v η stabilize the following integrated cooperative dynamical system *2( ) ( ) ( ) ( )( )e n pe f t I g x u k x d (18) where [ , ]q ve e e , ,x q v , 1[0 , ]mn dd η , 1( ) 0 ,e mn evf t f denotes unknown internal dynamics, ( )( ) 0 ,mn m n p vk x diag k and ( ) [ , ]q vg x diag g g are input matrices. Then, the optimal cooperative tracking control scheme of the multi-NMR system with dynamics (3) is equivalent to the bounded 2L -gain optimal control scheme of (18). In other words, if the dynamical systems (3) are applied by the control law * r au u u , the tracking error dynamics are transformed into the cooperative dynamical system (18). 
Proof: To evaluate the stability of the system (3) when applying ru , we rewrite the tracking error dynamic equations of (3) by substituting the control laws (16) and (17) into (12) and (14) with noting (13) and (15): * * ( ) 0 0 00 0 ( ) ( ) 0 ( , )( ) 0 ( , ) ( ) 0 ( ) 0 ( ) n qq n n p vev dn p vv n q v qn p q I g qe Iv I k q vf t ηI g q ve η I g q e eI g q (19) Then, we choose the candidate Lyapunov function / 2J e e , and take derivative through (19): * * ( ) 0 0 00 0 ( ) ( ) 0 ( , )( ) 0 ( , ) ( ) 0 ( ) . 0 ( ) n q n n p vev dn p v n q v q v qn p q I g q Iv J e I k q vf t ηI g q v η I g q e e e eI g q (20) One may easily recognize that the last term in the right-hand side of equation (20) is equal to zero. Thus, Eq. (21) can be rewritten in the reduced form as *2( ) ( ) ( ) ( )( ( )).e n pJ e f t I g x u k x d (21) On the other hand, if we also choose the Lyapunov candidate J for closed-loop dynamical system (18), after taking derivative, we obtain the result as (21). One can conclude that the existence of a bounded 2L -gain optimal tracking controller to make (21) is negative to stabilize the dynamical system (18) is sufficient to make dynamical systems (3) stable. Intelligent distributed cooperative control for multiple nonholonomic mobile robots 147 Remark 2: Because system internal dynamics ( )ef t and external disturbance d in (18) are derived from (3), which obviously hold the facts in properties 1 and remark 1, knowledge of ( )ef t is completely unknown and disturbance 2[0, )d L . 3. DESIGN OF OPTIMAL COOPERATIVE TRACKING CONTROL SCHEME WITH DISTURBANCE REJECTION In this section, motivated by the design of the optimal cooperative tracking control scheme with disturbance rejection but only applied to multiple linear agents [14], we propose a novel control scheme to apply to multiple nonlinear agents, of which the multi-NMR system in the paper is one. 3.1. 
Bounded 2L -Gain Problem for Multi-NMR Systems Consider nonlinear cooperative dynamics of each NMR, derived from (18), with the measured outputs q iy ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) i i ei i i i i i i i i ij j j j j j jj i i i e f t b c g x u k x d a g x u k x d y h e (22) where ( )i ih e are continuous smooth functions. Define the general disturbances i i iω d d with ,i j id d j , and performance outputs ( ) ( ) ( )i i i iz e t u t u t , ,i j iu u j , satisfying the following inequality of the bounded 2L -gain for all NMRs when disturbances 0iω 2 2 2 0 0 0 2 0 ( ) ( (0)) ( (0)) ( ) i i T T T i ii i i ii i i i T i ii i j ij j jj jj i ijz dt Q e u R u dt γ ω dt β e γ d T d d T d d u R u t β e (23) for some bounded functions β such that (0) 0β [28], where 0 ( ) 0i ii ie Q e and 0 (0) 0i iie Q , 0iiR , 0ijR , 0iiT and 0ijT . *γ γ is the prescribed disturbance attenuation level, where *γ is the minimum gain of γ for which the bounded 2L -gain condition (23) is satisfied. The objective in this section is to design an optimal cooperative tracking control scheme for each NMR subject to unknown internal dynamics eif and external disturbances id and jd to make all signals in (22) 2L -bounded. 
Define the local infinite horizontal tracking performance function for each NMR 2 2 0 ( (0), , , , ) ( ) .( ) i i j ij j ji i i i i i ii i i ii i i ii i ij jj j J e u u d d Q e u R u γ du R u dT d γ T dd t (24) Nguyen Tan Luy, Tran Ngoc Anh, Dang Quang Minh 148 Two-player zero-sum differential game theory [29] can extensively be studied to find solutions of the bounded 2L -gain problem for the system (22) subject to (24) * (0) minmax (0), , , , ,( ) ( ) i i i i i i i i i i u d V e J e u u d d (25) i.e., the saddle point ( * *,i iu d ), where * iu and * id are the optimal control law and the worst disturbance law, respectively, holds * * * * *(0) minmax (0), , , , max min (0), , , , .( ) ( ) ( ) i ii i i i i i i i i i i i i i i i u dd u V e J e u u d d J e u u d d (26) * iV is known as the Nash equilibrium value of the multi-player game holding the following constraints for all laws iu and id . * * * * * * * * * *, , , , , , , , , .( ) ( ) ( )i i i i i i i i i i i i i i iJ u u d d J u u d d J u u d d (27) Followed by ADP principle, for state feedback control laws iu and id , we define a local cooperative value function for the ith NMR as [14] 2 2( ) ( )( ) ( ) i i i i ii i i ii i j ij j i ii i j ij jj jt V e t Q e u R u u R u γ d T d γ d T d dt (28) Using Leibnizs formula, a differential equivalent to (28) is given by the Hamiltonian 2 2 , , , , , ( ) ( ) ( ) ( ) ( ) ( ) 0. ( ) ( ( ) ) i i i i i i i i i i ei ei i i i i i i i i ij j j j j j j i j ii i i ii i i ii ij ij j j ij jj j V H e u u d d V f b c g x u k x d a g x u R u d T u k x d e Q e u R u γ d dT d γ (29) Using the stationary condition for (29), one obtains 1 1 0 ( ) ( ) ( ) 2 i i i i i i ii i i i i H V u e b c R g x u e (30) 1 2 1 0 ( ) ( ) ( ) , 2 i i i i i i ii i i i i H V d e b c T k x d eγ (31) with boundary condition (0) 0iV . 
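The stationarity step leading to (30) and (31) can be spelled out as a short worked derivation. This is a sketch that assumes symmetric $R_{ii}$ and $T_{ii}$, writes $\nabla V_i = \partial V_i / \partial e_i$, and takes the Hamiltonian in the form implied by the dynamics (22) and the value function (28).

```latex
\begin{aligned}
H_i &= \nabla V_i^{T}\Big[f_{ei} + (b_i + c_i)\big(g_i u_i + k_i d_i\big)
      - \sum_{j \in \mathcal{N}_i} a_{ij}\big(g_j u_j + k_j d_j\big)\Big] \\
    &\quad + Q_{ii}(e_i) + u_i^{T} R_{ii} u_i + \sum_{j \in \mathcal{N}_i} u_j^{T} R_{ij} u_j
      - \gamma^{2} d_i^{T} T_{ii} d_i - \gamma^{2} \sum_{j \in \mathcal{N}_i} d_j^{T} T_{ij} d_j, \\[4pt]
\frac{\partial H_i}{\partial u_i} &= (b_i + c_i)\, g_i^{T} \nabla V_i + 2 R_{ii} u_i = 0
  \;\Longrightarrow\;
  u_i^{*} = -\tfrac{1}{2}\,(b_i + c_i)\, R_{ii}^{-1} g_i^{T}(x_i)\, \nabla V_i, \\[4pt]
\frac{\partial H_i}{\partial d_i} &= (b_i + c_i)\, k_i^{T} \nabla V_i - 2 \gamma^{2} T_{ii} d_i = 0
  \;\Longrightarrow\;
  d_i^{*} = \tfrac{1}{2\gamma^{2}}\,(b_i + c_i)\, T_{ii}^{-1} k_i^{T}(x_i)\, \nabla V_i.
\end{aligned}
```

The opposite signs of the two laws reflect the zero-sum structure: $u_i$ minimizes the Hamiltonian while $d_i$ maximizes it.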
The following coupled HJI equations for the cooperative tracking problem are obtained by substituting (30) and (31) into (29): 2 1 2 1 1 2 1 1 2 2 2 1 1 ( ) ( ) ( ) ( ) ( ) ( ) 4 4 1 1 ( ) ( ) ( ) ( ) ( ) 4 4 ( i i jci i i ii i ei i i i i ii i i j j j j jj ijj i i i j j j j i jj j j j j j j jj ij jj j j i ij j j j i i i VV V V Q e f b c g x R g x b c g x R R e e e e V V V V R g x b c k x T T T k x b c e e e eγ γ k x 1) ( ) 0iii i i i V T k x e (32) where the closed–loop system corresponding to the ith NMR is defined as Intelligent distributed cooperative control for multiple nonholonomic mobile robots 149 2 1 1 2 2 1 1 2 1 1 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) 2 2 1 ( ) ( ) ( ) ( ) ( ) . 2 1 2 i i jc i ei ei i i i i ii i i j j j j jj j j i i i j ji i i ii i i i j j j j jj j j i jj ijj j VV f f t b c g x R g x b c g x R g x b c e e γ VV k x a T k x b c k x T k x e γ a e (33) Let all optimal control laws and worst disturbance laws be functions of given solution iV , i.e., * ( )i i iu u V , * ( )i i iu u V , * ( )i i id d V and * ( )i i id d V , then the HJIs (32) become * * * *, , , , , 0, (0) 0.( )ii i i i i i i i V H e u u d d V e (34) Lemma 2: Assume that ( )i iV e , 1,...,i m is smooth, ( ) 0i iV e , and ( )i iV e is a the solution to the coupled HJI equation (32). Let optimal control laws of neighborhoods be given. Then, for every * iu and * id , the following condition holds * * * * * * 2 * *, , , , , ( ) ( ) ( ) ( ).( )ii i i i i i i i ii i i i i ii i i i V H e u u d d u u R u u γ d d T d d e (35) Proof: Complete the squares in (32) to obtain (35). Lemma 3: Choose *γ γ . Assume that * iV , 1,...,i m is smooth, * 0iV , and * iV is a the solution to the coupled HJI equation (32). Let optimal control laws of neighborhoods be given. Then the equilibrium point of the closed-loop system * *( ) ( ) ( ) ( ) i i ei i i i i i ij j j jj e f t b c g x u a g x u (36) is asymptotically stable with control inputs * *( )i i iu u V given by (30) in terms of * iV . 
In addition, in the presence of disturbances $d_i$, $u_i^*$ makes the bounded $L_2$-gain condition (23) satisfied, where $Q_{ii} = h_i^T Q_i h_i$ with $Q_i$ a positive-definite matrix.

Proof: The proof is similar to the proof of Theorem 1 in [14] and is omitted here.

Lemma 4 (Solution to Multi-player Zero-sum Game [14]): Choose $\gamma \geq \gamma^*$. Assume that the value of the game (33) is finite and that the optimal control laws of the neighborhoods are available. Let $V_i^*$, $i = 1, \ldots, m$, be smooth, $V_i^* > 0$, and be the solution to the coupled HJI equation (32), such that the closed-loop system

$$\dot{e}_i = f_{ei}(t) + (b_i + c_i)\, g_i(x_i)\, u_i^* - \sum_{j \in \mathcal{N}_i} a_{ij}\, g_j(x_j)\, u_j^* + (b_i + c_i)\, k_i(x_i)\, d_i^* - \sum_{j \in \mathcal{N}_i} a_{ij}\, k_j(x_j)\, d_j^* \tag{37}$$

is asymptotically stable around its equilibrium point. Then, the Nash condition (27) is satisfied for the control and disturbance laws $u_i^* = u_i(V_i^*)$ and $d_i^* = d_i(V_i^*)$, given by (30) and (31) in terms of $V_i^*$. Further, there exists a value of the game, i.e., the solution $V_i^*(e_i(0))$, $i = 1, \ldots, m$, of the HJI equation (32).

Proof: The proof is similar to the proof of Theorem 2 in [14] and is omitted here.

It is shown in [29] that the nonlinear bounded $L_2$-gain optimal control problem relies on the values of multi-player zero-sum games; in particular, as shown in Lemma 4, the values are the solutions of the coupled HJI equations (32). However, the HJI equations are impossible to solve analytically.

3.2. Optimal Cooperative Tracking Control Scheme with Disturbance Rejection and Algorithms in Real Time

In this section, we design a control scheme and online algorithms to solve the HJI equations (32) based on the reinforcement learning techniques of [12, 14]. However, in contrast to most existing schemes that use ADP to learn the solutions of the HJI equations, the scheme in this paper uses only one critic NN and does not need knowledge of the system internal dynamics. Moreover, the online algorithms synchronously update the parameters in one iterative loop.
According to the Weierstrass higher order approximation Theorem [30], there exists an NN such that the smooth value function iV , i m is approximated by *( ) ( ) ( )Ti i i i i i iV e W e e (38) where (2 )( ) : n pi ie is the activation function vector of neurons in the hidden layer, ( )i ie the NN approximation error, and iW R the ideal weight vector. Properties 2 (Approximation [31]): ( )i ie , 1,...,i m can be selected as a complete independent basis set so that when , ( ) 0i ie and ( ) ( ) / 0ei i i i ie e e , and for fixed , max( )i i ie , max( )ei i eie , where maxi and maxei are positive constants. Substituting (30), (31) and (38) into (32), the NN-based coupled HJI equations are obtained 2 2 1 1 ( ) ( ) ( ) ( ) 4 2 1 ( ) 0 4 ( ) ( ) ( ) i ii ii i i ei ei i i i ei i i ei i ij j j i ei j j ej jj j j j ej i i ej j Hj Q e W f t b c W g k W a b c W g k W b c W g k W (39) where { , }l i j , 1 l l ll lg g R g , 2 1 l l ll lk k T k , ( ) /el l l le e , /el l le , 1 1 i j jj ij jj jg g R R R g , 1 1 2 1 i j jj ij jj jk k T T T k . The residual errors iH e , caused by the function approximation errors, are computed as Intelligent distributed cooperative control for multiple nonholonomic mobile robots 151 * * 2 * * 1 * 1 * 2 1 ( ) ( )( ) ( ) 2 4 1 ( ) ( ) 2 1 ( ) . 4 ( ) ( ) ( ) ( ) ( ) ( ) i i i i i H ei ei i i i i i i i i ei i i ei ei ij j j j jj ij j j ei j j ej j j ej j jj ij j j jj ij jj j j j ej i i ejj ε ε f t b c g u k d b c ε g k ε ε a g u k d a b c ε g k ε b c ε g R R u k T T d b c ε g k ε (40) Remark 3: According to Properties 1, { , , }l i j i , the functions lg and lk are positive definite and bounded, e.g., min max0 i i ig g g , where 2 min min max ( )i i iig g R and 2 max max min( )i i iig g R with min and max are the functions of largest and smallest eigenvalues, respectively. Then, using Property 2, Hi is bounded on a compact set. That is max max max0, ( ) : sup iHi Hi e Hi HiN . Moreover, if , Hie converges uniformly to zero [30]. 
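The single-critic idea behind (38) can be illustrated with a minimal sketch: once the value function is written as weights times basis functions, its gradient $\phi_{ei}^T W_i$ can be substituted into the stationarity laws (30) and (31), so no separate actor or disturber network is needed. The quadratic basis `phi`, the two-dimensional error, and all numerical values below are hypothetical choices for illustration only.

```python
import numpy as np

def phi(e):
    """Hypothetical quadratic activation vector for a 2-D error e = [e1, e2]."""
    e1, e2 = e
    return np.array([e1 ** 2, e1 * e2, e2 ** 2])

def phi_e(e):
    """Jacobian d(phi)/d(e), one row per basis function."""
    e1, e2 = e
    return np.array([[2 * e1, 0.0],
                     [e2,     e1],
                     [0.0,    2 * e2]])

def value(W, e):
    """Critic output V = W^T phi(e)."""
    return W @ phi(e)

def control(W, e, g, R, b_plus_c):
    """u = -1/2 (b + c) R^{-1} g^T phi_e^T W, i.e. (30) with the NN gradient."""
    grad_V = phi_e(e).T @ W
    return -0.5 * b_plus_c * np.linalg.solve(R, g.T @ grad_V)

def disturbance(W, e, k, T, b_plus_c, gamma):
    """d = 1/(2 gamma^2) (b + c) T^{-1} k^T phi_e^T W, i.e. (31) with the NN gradient."""
    grad_V = phi_e(e).T @ W
    return 0.5 / gamma ** 2 * b_plus_c * np.linalg.solve(T, k.T @ grad_V)

# Illustrative numbers: V(e) = e1^2 + e2^2, identity input matrices and weights.
W = np.array([1.0, 0.0, 1.0])
e = np.array([0.4, -0.2])
I2 = np.eye(2)
u = control(W, e, g=I2, R=I2, b_plus_c=1.0)
d = disturbance(W, e, k=I2, T=I2, b_plus_c=1.0, gamma=2.0)
```

With these numbers the control law reduces to `u = -e` and the worst-case disturbance to `d = e / 4`, which makes the effect of the attenuation level `gamma` easy to see.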
The ideal weight vectors iW (38) are unknown, thus ( )i iV e are approximated by ˆiW : ˆ ˆ( ) ( ).i i i i iV e W e (41) Then, the estimated control and disturbance laws become 11 ˆˆ ( ) ( ) ( ) , 2 i i i i ii i i ei iu e b c R g x W (42) 1 2 1ˆ ˆ( ) ( ) ( ) . 2 i i i i ii i i ei id e b c T k x W (43) The approximate Hamiltonian are obtained by substituting (43), (42) and (41) into (29): 2 2 1ˆ ˆˆ ˆ ˆ ˆ ˆˆ ˆ, , , , , ( ) ( ) ( ) 4 1 1ˆ ˆ ˆ ˆ( ) ( ) . 2 4 ( ) ( ) ( ) ( ) i i i i i i i i i ii i i ei ei i i i ei i i ei i ij j j i ei j j ej j j j j ej i i ej jj j H e W u u d d Q e W f t b c W g k W a b c W g k W b c W g k W (44) It is desired that besides turning ˆiW to minimize residual error functions of (44) such that ˆ i iW W , assumption of identifying knowledge of internal dynamics should be removed. The residual error functions are chosen as the square integral functions 1 2 i i i H HE e e , where ˆ ˆˆ ˆ ˆ ˆ, , , , ,( ) i t H i i i i i i i t T e H e W u u d d dη (45) where T > 0 is a chosen time interval. Based on the normalized gradient descent scheme that is modified from Levenberg-Marquardt algorithm, we propose the following weight-tuning laws Nguyen Tan Luy, Tran Ngoc Anh, Dang Quang Minh 152 ( ) ( )2 2 ( 1) ˆ ( 1) i i i it it i t-T i t-T i i i i i i i i i α ζ Ψ if e e e e ζ ζ W α ζ Ψ Ξ otherwise ζ ζ (46) where ( )it ie e t , ( ) ( )i t T ie e t T , and i , i and i are defined in (47), (48) and (49), respectively, with positive constants i and i . 21 1ˆ ˆ( ) ( ) ( )( ) ( ) 2 2 ( ) ( ) ( ) . ( ) ( ) ( ) ( ) i t t i ei ei i i i i ei i ij j j j j ej j ei ijt T t T i i i i i i f b c g k W a b c g k W d e s d e t e t T e t (47) 2 2( ) 1 1ˆ ˆ ˆ ˆ ˆ( ) ( ) ( ) . 4 4 ( ) i t i i i ii i i i ei i i ei i j ej i i ej j t T j jj W Q b c W g k W W g kc db W (48) ( ) ( ) ( ) ( ) . 
2 ( ) i t i i i i ei i i i ij j j ej j j jjt T b c g k e a b c g g e d (49) Later, Theorem 1 shows that if the tuning laws (46) are used in online algorithms to learn the solutions of (39), the approximated NN weights will converge and be stable. Remark 4: Knowledge of system internal dynamics ( )eif t is relaxed for the weight-tuning laws (46). Remark 5: If ( ) ( ) 0i je t e t , the values in the right side of the weight-tuning laws (46) are zeros. Thus, ˆ iW are not tuned any more. To guarantee that ˆ iW converge to the true values, we apply the Persistence of Excitation (PE) condition [32] though the following lemma. Lemma 5: Let iu and id , i m be any given bounded stable laws so that the value function (28) can be written as 2 2( ) ( ) ( ) .( ) ( ) ( ) ii j i t i i ii i i ii i i ii i j ij j i ijt j jjT V e t T Q e u R u γ d T d γ d T d dtu R u V e t (50) Using NN (36) for (50), 2 2( )( ) ii i t T ii i i ii i j ij j i ii i j ij j i i i Bj jt T Q e u R u u R u d T d d T d dt W e e (51) where iB e is the reconstruction error. If PE condition be satisfied in the interval [ , ]pt T t , 0pT 1 2( ) ( ) p t T i i i i t T β I ζ η ζ η dη β I (52) where 1i , 2i are positive constants, / ( 1) T i i i i and I is the identity matrix with the appropriate dimension. Then, Intelligent distributed cooperative control for multiple nonholonomic mobile robots 153 For 0 iB e (no reconstruction error), the NN weight approximation error converges to zero exponentially fast; For maxiB i e e , the NN weight approximation error converges exponentially fast to a residual set. Proof: From (42) and (43), Eq. (47) can be written as 21 1ˆ ˆ ˆ ˆ( ) ( ) ( )( ) . 2 2 ( ) i p t T i i i i ei ei i i i i ei i ij j j j j ej jj t T e W W f b c g k W a b c g k W d (53) With noting that ˆi i iW W W ˆ( )i iW W and pT T , substituting (53) into (46), function approximation error dynamics is obtained as 2 2 2 1 1ˆ ˆ ˆ ˆ( ) ( ) ( ) ( ) ( 1) 4 2 1ˆ ˆ ˆ( ) ( ) ( ) . 
4 ( ) i p i t i i i ii i i ei ei i i i ei i i ei i ij j j i eiT j i i t T j j ej j j j j ej i i ej jj W Q e W f b c W g k W a b c W g k W b c W g k W d (54) Note that, from (44), iH e in (45, i m can be written as 2 2 1 1ˆ ˆ ˆ ˆ( ) ( ) ( ) ( ) ( ) 4 2 1ˆ ˆ ˆ( ) ( ) ( ) . 4 ( ) i ip i t H ii i i ei e i i i ei i i ei i ij j j ijt T ei j j ej j j j j ej i i ej jj e Q e W f t b c W g k W a b c W g k W b c W g k W (55) Using (53) and (55), one obtains 2 2 0 ˆ( ) .( ) i i i t T ii i i ii i j ij j i ii i j ij j i i i Hj j Q e u R u u R u γ d T d γ d T d dt W Δ e e (56) Notice that, using pT for (51), and then subtracting to (56), . i iH i i i B e W e e (57) On the other hand, comparing (55) with (54), we obtain: 2 . 1 i i i i H i i ζ W α e ζ ζ (58) Inserting (57) into (58), note that i i (47), iW becomes i T i i i i i i i B i ζ W α ζ ζ W α e m (59) where 1 T i i im . This approximation error is the same as the approximation error in [33], and the reminder of the proof is followed by the proof of Theorem 1 in [33]. Online Algorithms: Based on (46), we design online algorithms for optimal cooperative tracking control (OCTC) (as shown in Algorithm 1), where all parameters of control and Nguyen Tan Luy, Tran Ngoc Anh, Dang Quang Minh 154 disturbance laws are updated simultaneously in one iterative loop, in contrast to policy iteration algorithms using three NNs for each agent in [14]. Algorithm 1. Online OCTC Step 1: 1,...,i m , ij select iiQ , iiR , iiT , ijR , ijT , , the activate functions vector i , i , i , choose the sampling interval T , initialize stable values of (0)ˆ iW and (0)ˆ jW . Compute (0)ˆ iV (41), (0)ˆ iu (42) and (0)ˆ id (43), choose probing noise i for the PE condition (52). Assign 0l , stopl (time to stop the algorithm), (the small positive real number) for the convergence criteria. 
Step 2: 1,...,i m add the probing noise to ( )ˆ l iu and ( )ˆ l id to excite the system: ( ) ( )ˆ ˆl l i i iu u and ( ) ( )ˆ ˆl l i i id d ; observe ( ), ( )i ie t e t T for i (47); update ( 1)ˆ l iW (46); compute ( 1)ˆ l iV (41) update both ( 1)ˆ l iu (42) and ( 1)ˆ l id (43), simultaneously. Step 3: 1,...,i m if ( 1) ( )ˆ ˆl l i iW W then assign 0i . If stopl l and 0i then stop the algorithm, else 1l l , go back to Step 2. 3.3. Stability and Convergence Analysis Stability and convergence of the closed–loop systems (22) when the ith NMW performs the online OCTC with the NN weight tuning law (46), control law (42) and disturbance law (43), are stated and proven by the following theorem. Theorem 1: Let the cooperative dynamics of the multi–NMR systems be defined in (18), which gives the ith dynamics in (22), for all i, i = 1, ..., m. Let the cooperative value function of NMR i be chosen as (28) and the coupled HJI equations as (32). Let the NN weight–tuning law be defined in (46), the control law in (42) and the worst case disturbance law in (43). Let i be the PE condition (52). Assume that NMR i performs the online OCTC and that the control law and disturbance law of the neighborhoods in the previous steps were updated and stable with * . Then, the online OCTC guarantees that (Stabilization) The cooperative tracking errors ie of the closed–loop systems and the NN approximation errors iW are uniformly ultimately bounded (UUB). (Convergence) After a limited number of iterative steps, the value function, the control law as well as the worst disturbance law are synchronously converged to the approximately optimal values, i.e., * ˆ ii i v V V e , * ˆ ii i u u u e and * ˆ ii i d d d e for small positive constants iv e , iu e and id e . 4. HARDWARE TESTBED AND RESULTS 4.1. 
Hardware Testbed with Omnidirectional Vision

Using the graph theory in Section 2, the communication topology of the multi-NMR system is chosen as shown in Fig. 2, where the virtual leader is indexed by 0. The information exchange between NMR $i$ and its neighbors, including positions $q_i = [x_i, y_i, \theta_i]^T$, velocities $v_i = [\upsilon_i, \omega_i]^T$ and torques $\eta_i = [\eta_{li}, \eta_{ri}]^T$, is represented by arrows. The information of the virtual leader is available only to NMR 1. To verify the effectiveness of the proposed algorithm for practical applications, we developed a hardware testbed that consists of three experimental NMRs, as shown in Fig. 1. The geometric parameters of the NMRs are $r_1 = 0.05$ m, $b_1 = 0.5$ m, $l_1 = 0$, $r_2 = r_3 = 0.025$ m, $b_2 = b_3 = 0.2$ m, and $l_2 = l_3 = 0$. The total masses of the NMRs are $m_1 = 5$ kg and $m_2 = m_3 = 0.5$ kg. Using these values, the parameters of $M_i$ and $B_i$ in (2) are obtained.

Figure 1. Experimental NMRs; (a): Rear view, (b): Front view.

Figure 2. Communication graph of the multi-NMR system.

The hardware of NMR 1, shown in Fig. 3, consists of three main parts: the mechanical part (a mechanical frame, DC motors with stall torques of 0.73 Nm, and digital quadrature encoders with 400 divisions), a control board (a PIC micro-controller, a power circuit, and an XBee wireless module), and an embedded computer with an Intel Atom D510 CPU at 1.66 GHz that executes the online OCTC. An omnidirectional vision system is constructed to identify the feedback states of positions and linear velocities for all NMRs. Each neighbor of NMR 1 has only a frame and a control board. Communication with the other robots is achieved using XBee radio transceivers. At each sample point, the neighbors send their encoder pulses to NMR 1 and receive torques from it. In the case of a dropped packet, the previous data packet is used.
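The dropped-packet rule described above amounts to holding the last valid sample. A minimal sketch of that rule follows; the class name `HoldLastPacket`, the use of `None` to signal a drop, and the packet layout (a tuple of left and right encoder pulse counts) are assumptions for illustration and do not describe the actual testbed software.

```python
from typing import Optional, Sequence, Tuple

class HoldLastPacket:
    """Returns the newest packet, or the previous one when a packet is dropped."""

    def __init__(self, initial: Sequence[int]):
        self._last: Tuple[int, ...] = tuple(initial)
        self.dropped = 0  # number of dropped packets seen so far

    def update(self, packet: Optional[Sequence[int]]) -> Tuple[int, ...]:
        if packet is None:
            # Dropped packet: reuse the previous data packet.
            self.dropped += 1
        else:
            self._last = tuple(packet)
        return self._last

# Hypothetical stream of (left, right) encoder pulse counts; None marks a drop.
rx = HoldLastPacket(initial=(0, 0))
stream = [(10, 12), None, (11, 13), None, None]
held = [rx.update(p) for p in stream]
```

Counting the drops, as done here, makes it possible to detect a link that has been silent for too many consecutive samples.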
On the embedded computer, software written in VC++ and running on Windows performs the following tasks: image processing with OpenCV to identify the positions and linear velocities of all NMRs, reading the encoder pulses to compute the rotational velocities, handling communication, executing the online algorithm, and sending the torque signals to the micro-controllers, which drive the DC motors by pulse width modulation (PWM) at a frequency of 20 kHz. The upper bound of each torque is selected as 0.2 Nm. In addition, the software generates the reference trajectories of positions and velocities. The embedded computer and the micro-controller communicate with each other via the RS232 protocol. Users can remotely start or stop the testbed through interface tools on a remote computer connected to the embedded computer over the wireless network. The data generated by the NMRs during movement are stored and plotted in Matlab.

Figure 3. Structural diagram of NMR 1.

Figure 4. Local and global Cartesian coordinate systems.

Only NMR 1 is equipped with the omnidirectional vision system (OVS), which is shown in Fig. 3. The OVS consists of a camera and a hyperbolic mirror with optical center $C$ and bend radius $R$. The mirror is placed on a glass tube with a diameter of $D = 0.2$ m. The camera, with a resolution of 1280 × 720 pixels and a frame rate of 30 fps, is fixed to the top of the robot platform at its geometrical center. The vertical center axis of the glass tube passes through the focus point of the camera and the mirror center. The distance between the origin $O$ and the bottom point of the mirror $C_1$, measured along the vertical axis, is $H = 0.5$ m. The OVS can recognize colored landmarks in any direction; therefore, no mechanism for rotating the camera about the pan, tilt and yaw axes is needed.
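The software described in this subsection reads encoder pulses to compute rotational velocities. A minimal sketch of that conversion for a differential-drive platform follows, using the NMR 1 values $r_1 = 0.05$ m and $b_1 = 0.5$ m. Several points are assumptions rather than facts from the testbed: 400 counts per wheel revolution (no 4x quadrature decoding), $b$ interpreted as half the wheel separation, and the function names themselves. The linear velocity is returned only for illustration, since on the testbed it is measured by the vision system.

```python
import math
from typing import Tuple

COUNTS_PER_REV = 400   # encoder divisions per revolution (assumed: no 4x decoding)
WHEEL_RADIUS = 0.05    # r_1 in metres, from the text
HALF_AXLE = 0.5        # b_1 in metres; reading b as half the wheel separation is an assumption


def wheel_speed(delta_pulses: int, dt: float) -> float:
    """Wheel angular speed in rad/s from pulses counted over dt seconds."""
    return 2.0 * math.pi * delta_pulses / (COUNTS_PER_REV * dt)


def body_velocities(d_left: int, d_right: int, dt: float) -> Tuple[float, float]:
    """Linear (m/s) and rotational (rad/s) velocity of a differential-drive robot."""
    w_l = wheel_speed(d_left, dt)
    w_r = wheel_speed(d_right, dt)
    linear = WHEEL_RADIUS * (w_r + w_l) / 2.0
    rotational = WHEEL_RADIUS * (w_r - w_l) / (2.0 * HALF_AXLE)
    return linear, rotational
```

Equal pulse counts on both wheels give pure translation, and opposite counts give pure rotation, which is a quick sanity check on the wiring and the sign conventions.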
Consider the Cartesian coordinate system OXY fixed to the surface of NMR 1, where the origin O coincides with the geometrical center and OX is aligned with the axis of symmetry. Using the OVS and image processing, it is not difficult to measure, in pixels, the center coordinates of any landmark placed around the robot in the image space. However, when the robot moves, an identification step is needed to obtain every coordinate in the world space in units of length [25]. Here, we use an NN with radial basis functions (RBFs). The RBF network is trained offline on samples whose inputs are the landmark center coordinates measured in the image space (pixels) and whose desired outputs are the same coordinates in the world space (meters). Using the approximation ability of the RBF network, the centers of any landmarks surrounding the robot can be recognized and transformed into world coordinates.

To determine the center position of NMR 1 in the horizontal plane, we choose the Cartesian coordinate system Oxy based on two differently colored landmarks (as shown in Fig. 4). Without loss of generality, the origin O coincides with the center of the blue landmark, and the axis Ox passes through the center of the red landmark. Suppose that the center coordinates of the two landmarks in OXY are $(X_1, Y_1)$ and $(X_2, Y_2)$; they can be transformed to Oxy to obtain the robot center position vector $[x_1, y_1, \theta_1]^T$ [25]:

$$
\theta_1 = \arccos\frac{X_2 - X_1}{\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}}, \quad
x_1 = -\frac{Y_1 (Y_2 - Y_1) + X_1 (X_2 - X_1)}{\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}}, \quad
y_1 = -\frac{Y_1 (X_2 - X_1) - X_1 (Y_2 - Y_1)}{\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}}. \tag{60}
$$

Referring to Fig. 4, we can also determine the coordinates of the neighbors, e.g., NMR 2, in Oxy. On the surface of NMR 2, we place two colored objects. The center of the green object coincides with the center of the robot, and the center of the yellow object is located on the longitudinal axis.
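As an illustration of the landmark-to-pose transformation for NMR 1, the sketch below recovers $(x_1, y_1, \theta_1)$ from the two landmark centers measured in the robot frame. It is a geometric sketch under stated assumptions, not the authors' implementation: both frames are taken as right-handed, the robot heading is the direction of OX, and `atan2` is used in place of the arccos form so that the sign of the heading is retained. The function name `pose_from_landmarks` is hypothetical.

```python
import math
from typing import Tuple


def pose_from_landmarks(blue: Tuple[float, float],
                        red: Tuple[float, float]) -> Tuple[float, float, float]:
    """Robot pose (x, y, theta) in Oxy from landmark centers given in the robot frame.

    blue: (X1, Y1), landmark at the origin of Oxy.
    red:  (X2, Y2), landmark on the positive Ox axis.
    """
    X1, Y1 = blue
    X2, Y2 = red
    dX, dY = X2 - X1, Y2 - Y1
    dist = math.hypot(dX, dY)
    if dist == 0.0:
        raise ValueError("landmark centers coincide; heading is undefined")
    # Unit vector of Ox expressed in the robot frame; Oy is its 90-degree rotation.
    ux, uy = dX / dist, dY / dist
    # The robot center is the origin of the robot frame, so its offset from
    # the blue landmark is (-X1, -Y1); project that offset on Ox and Oy.
    x = -(X1 * ux + Y1 * uy)
    y = X1 * uy - Y1 * ux
    # Ox is seen at angle atan2(dY, dX) from OX, so OX is at the negative of that in Oxy.
    theta = -math.atan2(dY, dX)
    return x, y, theta
```

For example, a robot standing at (1, 2) in Oxy with a heading of 90 degrees sees the blue landmark at (-2, 1) and a red landmark placed 3 m along Ox at (-2, -2) in its own frame; feeding those two points to the function returns the original pose.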
The coordinates of NMR 2 in Oxy can be identified by the following formulas:

$$
x_2 = x_1 + l_{12}\cos(\psi_{12} + \theta_1), \quad
y_2 = y_1 + l_{12}\sin(\psi_{12} + \theta_1), \quad
\theta_2 = \theta_1 + \lambda_{12}, \tag{61}
$$

where $l_{12} = \sqrt{(X_2^1)^2 + (Y_2^1)^2}$ and $\psi_{12} = \arccos(X_2^1 / l_{12})$, with $(X_2^1, Y_2^1)$ the center of NMR 2 measured in the frame of NMR 1, and the relative heading $\lambda_{12}$ is obtained from an arccos expression of the same form as in (60), applied to the centers of the green and yellow objects.

4.2. Simulation and Experimental Results

In this subsection, to evaluate the effectiveness of the proposed method, simulations of the OCTC algorithm with one NN and of the ACD algorithm [14] with three NNs, extended to the multi-NMR system, are first performed and compared. The experiment is then carried out on the basis of the simulation results. The parameter values of the NMRs in the simulation are exactly those of the experiment, so that the NN weights that converged in simulation can be used to initialize the NN weights in the experiment and speed up learning in practice. The velocity vector that the virtual leader uses to generate the smooth reference postures is chosen as

$$
v_0 = [\nu_0, \omega_0]^T = \left[ \sqrt{(A_1\cos(\omega_1 t))^2 + (A_2\cos(\omega_2 t))^2}, \;
\frac{A_1\omega_1\sin(\omega_1 t)\, A_2\cos(\omega_2 t) - A_2\omega_2\sin(\omega_2 t)\, A_1\cos(\omega_1 t)}{(A_1\cos(\omega_1 t))^2 + (A_2\cos(\omega_2 t))^2} \right]^T, \tag{62}
$$

where $\omega_1 = 0.04$ rad/s, $\omega_2 = 0.02$ rad/s, $A_1 = 0.022$ m/s and $A_2 = 0.02$ m/s. The distance vectors (3) between NMR $i$ and its neighbors $j$, $i \ne j$, $i, j = 1, 2, 3$, are set to $[0.5, 0, 0]^T$ for the pair (1, 2) and $[1.0, 0, 0]^T$ for the pair (3, 2). For both algorithms, the NN weights, with 15 elements for each NMR, are defined as $\hat{W}_i = [\hat{W}_{i1}, \hat{W}_{i2}, \dots, \hat{W}_{i15}]^T$; their initial values are zero in OCTC but must be properly chosen for the three NNs in ACD. Note that the total number of NN weights is 45 for OCTC but 135 for ACD. The two adaptive gains are selected as 25 and 0.01, respectively.
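To make the leader reference (62) concrete, the sketch below evaluates the linear and rotational reference velocities and integrates a unicycle model to produce reference postures. It rests on assumptions: the Cartesian velocity components are taken as $A_1\cos(\omega_1 t)$ and $A_2\cos(\omega_2 t)$, the turn rate is the usual curvature expression derived from them, forward Euler integration is used for brevity, and the function names and the step size are illustrative only.

```python
import math
from typing import List, Tuple

OMEGA1, OMEGA2 = 0.04, 0.02   # rad/s, from the text
A1, A2 = 0.022, 0.02          # m/s, from the text


def leader_velocity(t: float) -> Tuple[float, float]:
    """Linear and rotational reference velocity of the virtual leader at time t."""
    xd = A1 * math.cos(OMEGA1 * t)             # assumed x-velocity component
    yd = A2 * math.cos(OMEGA2 * t)             # assumed y-velocity component
    xdd = -A1 * OMEGA1 * math.sin(OMEGA1 * t)  # time derivative of xd
    ydd = -A2 * OMEGA2 * math.sin(OMEGA2 * t)  # time derivative of yd
    speed_sq = xd * xd + yd * yd
    linear = math.sqrt(speed_sq)
    rotational = (xd * ydd - yd * xdd) / speed_sq
    return linear, rotational


def leader_path(q0: Tuple[float, float, float], t_end: float,
                dt: float) -> List[Tuple[float, float, float]]:
    """Forward-Euler integration of the unicycle model driven by leader_velocity."""
    x, y, th = q0
    path = [(x, y, th)]
    for k in range(int(round(t_end / dt))):
        v, w = leader_velocity(k * dt)
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
        th += w * dt
        path.append((x, y, th))
    return path
```

With these frequencies the two cosine terms never vanish at the same instant, so the division by the squared speed is well defined; a smaller `dt` or a higher-order integrator would be the natural refinement if the reference needs to be more accurate.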
The activation functions $\phi_i(e_i)$ are chosen as the 15 quadratic monomials of the error components, $\phi_i(e_i) = [e_{xi}^2, e_{xi}e_{yi}, e_{xi}e_{\theta i}, e_{xi}e_{vi}, e_{xi}e_{\omega i}, e_{yi}^2, e_{yi}e_{\theta i}, e_{yi}e_{vi}, e_{yi}e_{\omega i}, e_{\theta i}^2, e_{\theta i}e_{vi}, e_{\theta i}e_{\omega i}, e_{vi}^2, e_{vi}e_{\omega i}, e_{\omega i}^2]^T$. Select $Q_{ii}(e_i) = e_i^T Q_{1i} e_i$, $Q_{1i} = I_{5\times 5}$, $R_{ii} = R_{ij} = T_{ii} = T_{ij} = 1$, and $\gamma = 1$. The PE condition is guaranteed by adding the exponentially decaying probing noise $0.008\,\mathrm{rand}(t)\,e^{-t}$ to the control and disturbance inputs, where $\mathrm{rand}(t)$ generates random signals in the range $[-1, 1]$. The initial position of the leader is $q_0 = [-0.4, -0.6, 0.6]^T$. The initial positions and velocities of the NMRs are $q_1 = [0, 0, 0]^T$, $v_1 = [0, 0]^T$, $q_2 = [0, -0.5, 0]^T$, $v_2 = [0, 0]^T$, $q_3 = [-0.5, -0.5, 0]^T$, $v_3 = [0, 0]^T$.

The evolution of the cooperative position tracking errors of the NMRs under both OCTC and ACD is shown in Fig. 5. After the parameters converge, all errors in both algorithms are approximately zero. In the early periods, however, the errors under OCTC decrease faster than those under ACD. The cooperative trajectories $x$, $y$ and $\theta$ for both algorithms are shown in Figs. 6, 7 and 8, respectively. NMR 1 tracks the leader while keeping the formation with its neighbors such that the tracking errors are as small as possible, i.e., $[x_1, y_1, \theta_1] \to [x_0, y_0, \theta_0]$ and $[x_3, y_3, \theta_3] - [x_1, y_1, \theta_1] \to [-0.5, 0, 0]$. NMR 3 keeps the formation with NMRs 1 and 2, i.e., $[x_2, y_2, \theta_2] - [x_3, y_3, \theta_3] \to [1, 0, 0]$. The corresponding results for NMRs 2 and 1 can be deduced in the same way. These figures again show that the control performance under OCTC is better than that under ACD. The cooperative performance of the linear and rotational velocities of the NMRs under both algorithms is shown in Figs. 9 and 10, respectively. After the parameters converge, the velocities approach approximately optimal values, i.e., $[\nu_{2,3}, \omega_{2,3}] \to [\nu_1, \omega_1] \to [\nu_0, \omega_0]$, in finite time.
Again, the cooperative velocity performance of the NMRs under OCTC is better than that under ACD, especially at the peaks of the linear velocity curves.

Now, the proposed control scheme is applied to the testbed. Note that the NN weights that converged in the OCTC simulation are used to initialize the NNs in the experiment. The other learning parameters are the same as those in the simulation.

Figure 5. Evolution of the formation errors of positions for NMRs 1, 2, 3 by Algorithms OCTC with one NN and ACD with three NNs.

Figure 6. Evolution of the cooperative positions for $x_1, x_2, x_3$ by Algorithms OCTC with one NN and ACD with three NNs.

Figure 7. Evolution of the cooperative positions for $y_1, y_2, y_3$ by Algorithms OCTC with one NN and ACD with three NNs.

Figure 8. Evolution of the cooperative positions for $\theta_1, \theta_2, \theta_3$ by Algorithms OCTC with one NN and ACD with three NNs.

Figure 9. Evolution of the cooperative linear velocities for $\nu_1, \nu_2, \nu_3$.

Figure 10. Evolution of the cooperative rotational velocities for $\omega_1, \omega_2, \omega_3$.

Figure 11. Experimental formation for $x_1, x_2, x_3$.

Figure 12. Experimental formation for $y_1, y_2, y_3$.

Figure 13. Experimental formation for $\theta_1, \theta_2, \theta_3$.

Figures 11, 12 and 13 show the positions $x_i, y_i, \theta_i$ for all $i = 1, 2, 3$. For the final stages of the experiment, Fig. 14 shows that NMR 1 tracks the desired virtual trajectory while maintaining the formation with its neighbors. Figs. 15(a) and 15(b) show the linear velocities $\nu_i$ and the rotational velocities $\omega_i$, respectively. The experimental results are consistent with the simulation.

Figure 14. Experimental formation.

Figure 15. Experimental velocities: (a) linear velocities; (b) rotational velocities.

5.
CONCLUSION

This paper provides a new distributed optimal cooperative tracking control method with disturbance rejection for multiple mobile robots. The method avoids separating the kinematics and dynamics in the main controller and replaces the ACD structure with three NNs by a structure with only one NN, for which novel NN weight-tuning laws were designed. Online optimal cooperative control algorithms, which do not require knowledge of the internal dynamics, were proposed to approximate the Nash equilibrium solutions of the HJI equations. The algorithms guarantee that the value functions and the control and disturbance laws converge simultaneously to the optimal values and that the cooperative tracking errors of the closed-loop systems and the approximation errors of the NN weights are uniformly ultimately bounded. Comparative simulations were carried out, and the experimental results on the testbed equipped with an omnidirectional vision system were consistent with the simulation. The experimental results indicate that the method is effective for practical control of multiple nonholonomic mobile mechanical agents or autonomous vehicles that must track both positions and velocities.

REFERENCES

1. Sun D., Wang C., Shang W., and Feng G. - A synchronization approach to trajectory tracking of multiple mobile robots while maintaining time-varying formations, IEEE Trans. Robot. 25 (5) (2009) 1074–1086.
2. Loria A., Dasdemir J., and Jarquin N. A. - Leader–follower formation and tracking control of mobile robots along straight paths, IEEE Trans. Contr. Syst. Technol. 24 (2) (2016) 727–732.
3. Dong W. - Tracking control of multiple-wheeled mobile robots with limited information of a desired trajectory, IEEE Trans. Robot. 28 (1) (2012) 262–268.
4. Gu D. and Wang Z. - Leader–follower flocking: Algorithms and experiments, IEEE Trans. Contr. Syst. Technol. 17 (5) (2009) 1211–1219.
5. Wang Z. and Gu D. - Cooperative target tracking control of multiple robots, IEEE Trans. Ind. Electron. 59 (8) (2012) 3232–3240.
6. Yu X. and Liu L. - Distributed formation control of nonholonomic vehicles subject to velocity constraints, IEEE Trans. Ind. Electron. 63 (2) (2016) 1289–1298.
7. Khoo S., Xie L., and Man Z. - Robust finite-time consensus tracking algorithm for multirobot systems, IEEE/ASME Trans. Mechatr. 14 (2) (2009) 219–228.
8. Wang W., Huang J., Wen C., and Fan H. - Distributed adaptive control for consensus tracking with application to formation control of nonholonomic mobile robots, Automatica 50 (4) (2014) 1254–1263.
9. Peng Z., Yang S., Wen G., Rahmani A., and Yu Y. - Adaptive distributed formation control for multiple nonholonomic wheeled mobile robots, Neurocomputing 173 (3) (2016) 1485–1494.
10. Dierks T. and Jagannathan S. - Neural network output feedback control of robot formations, IEEE Trans. Syst., Man, and Cybern., B Cybern. 40 (2) (2010) 383–399.
11. Movric K. H. and Lewis F. L. - Cooperative optimal control for multiagent systems on directed graph topologies, IEEE Trans. Autom. Contr. 59 (3) (2014) 769–774.
12. Vamvoudakis K. G., Lewis F. L., and Hudas G. R. - Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality, Automatica 48 (2012) 1598–1611.
13. Zhang H., Lewis F. L., and Das A. - Optimal design for synchronization of cooperative systems: State feedback, observer and output feedback, IEEE Trans. Autom. Contr. 56 (8) (2011) 1948–1952.
14. Jiao Q., Modares H., Xu S., Lewis F., and Vamvoudakis K. G. - Multiagent zero-sum differential graphical games for disturbance rejection in distributed control, Automatica 69 (2016) 24–34.
15. Tatari F., Naghibi-Sistani M.-B., and Vamvoudakis K. G. - Distributed learning algorithm for nonlinear differential graphical games, Transactions of the Institute of Measurement and Control, first published (2015), doi: 10.1177/0142331215603791.
16. Cao W., Zhang J., and Ren W. - Leader–follower consensus of linear multi-agent systems with unknown external disturbances, Systems & Control Letters 82 (2015) 64–70.
17. Wang J. and Xin M. - Distributed optimal cooperative tracking control of multiple autonomous robots, Robotics and Autonomous Systems 60 (4) (2012) 572–583.
18. Dierks T., Brenner B., and Jagannathan S. - Neural network-based optimal control of mobile robot formations with reduced information exchange, IEEE Trans. Contr. Syst. Technol. 21 (4) (2013) 1407–1415.
19. Fierro R. and Lewis F. L. - Control of a nonholonomic mobile robot using neural networks, IEEE Trans. Neur. Netw. 9 (4) (1998) 589–600.
20. Vamvoudakis K. G. and Lewis F. L. - Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton–Jacobi equations, Automatica 47 (8) (2011) 1556–1569.
21. Huai-Ning W. and Biao L. - Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear H∞ control, IEEE Trans. Neur. Netw. Learn. Syst. 23 (12) (2012) 1884–1895.
22. Zargarzadeh H., Dierks T., and Jagannathan S. - Optimal control of nonlinear continuous-time systems in strict-feedback form, IEEE Trans. Neur. Netw. Learn. Syst. 26 (10) (2015) 2535–2549.
23. Luy N. T. - Adaptive dynamic programming-based design of integrated neural network structure for cooperative control of multiple MIMO nonlinear systems, Neurocomputing (2016).
24. Khoshnam S., Alireza M. S., and Ahmadrez T. - Adaptive feedback linearizing control of nonholonomic wheeled mobile robots in presence of parametric and nonparametric uncertainties, Robotics and Computer Integrated Manufacturing 27 (1) (2011) 194–204.
25. Luy N. T. - Robust adaptive dynamic programming based online tracking control algorithm for real wheeled mobile robot with omnidirectional vision system, Transactions of the Institute of Measurement and Control (2016), doi: 10.1177/0142331215620267.
26. Chang Y. and Chen B. - A nonlinear adaptive H∞ tracking control design in robotic systems via neural networks, IEEE Trans. Contr. Syst. Technol. 5 (1) (1997) 13–29.
27. Wang L., Wang X., and Hu X. - Connectivity maintenance and distributed tracking for double-integrator agents with bounded potential functions, Int. J. Robust Nonlinear Control 25 (4) (2015) 542–558.
28. Aliyu M. D. S. - Nonlinear H∞ control, Hamiltonian systems and Hamilton–Jacobi equations, CRC Press, 2011.
29. Basar T. and Bernhard P. - H∞ Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach, 2nd ed., Boston, MA: Birkhäuser, 1995.
30. Abu-Khalaf M. and Lewis F. L. - Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach, Automatica 41 (5) (2005) 779–791.
31. Hornik K., Stinchcombe M., and White H. - Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks, Neural Networks 3 (5) (1990) 551–560.
32. Ioannou P. and Fidan B. - Adaptive Control Tutorial, Advances in Design and Control, Philadelphia, PA: SIAM, 2006.
33. Vamvoudakis K. G. and Lewis F. L. - Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem, Automatica 46 (5) (2010) 878–888.
34. Lewis F. L., Jagannathan S., and Yesildirek A. - Neural Network Control of Robot Manipulators and Nonlinear Systems, Taylor and Francis, Philadelphia, PA, 1999.
