This paper presented a new distributed optimal cooperative tracking control method with disturbance rejection for multiple mobile robots. The method avoided splitting the main controller into separate kinematic and dynamic parts and reduced the ACD structure with three NNs to a structure with only one NN, for which novel NN weight-tuning laws were designed. Online optimal cooperative control algorithms, which do not require knowledge of the internal dynamics, were proposed to approximate the Nash equilibrium solutions of the HJI equations. The algorithms guaranteed that the value functions and the control and disturbance laws converged simultaneously to the optimal values, and that the cooperative tracking errors of the closed-loop systems and the approximation errors of the NN weights were uniformly ultimately bounded. Comparative simulations were carried out, and the experimental results on a testbed equipped with an omnidirectional vision system were consistent with the simulations. The experimental results suggest that the method is effective for practical control applications involving multiple nonholonomic mobile mechanical agents or autonomous vehicles that track both positions and velocities.
Intelligent distributed cooperative control for multiple nonholonomic mobile robots subject to unknown dynamics and external disturbances - Nguyen Tan Luy, Tran Ngoc Anh, Dang Quang Minh
advantages. First, they included two iterative loops, i.e., while the parameters of the disturber NN were updated in one iterative loop, the parameters of the actor NN had to wait to be updated in the other loop. Second, knowledge of the system internal dynamics was required. Finally, the initial stability of the system depended strongly on how the weights of the three NNs were initialized. For these reasons, increased computational complexity and wasted resources were inevitable [21, 22].
In our previous work [23], an optimal cooperative control design for multiple MIMO nonlinear systems overcame the drawbacks of using many NNs. However, disturbance rejection was considered only for each agent, not for its neighborhood.
To the best of our knowledge, optimal cooperative tracking control schemes with disturbance rejection for multiple nonlinear agents without knowledge of internal dynamics, with application to nonholonomic robot systems, have not yet been considered. In this paper, we provide such a scheme with the following main contributions.
1. A bounded $L_2$-gain synchronization problem of a multi-NMR system in a distributed communication graph is formulated. In contrast with the work in [18], we avoid separating kinematics and dynamics when designing the control scheme. Thus, a performance index function subject to all signals is minimized.
2. An optimal cooperative tracking control scheme is designed. This work extends the work of [14] in three respects: (i) nonlinear agents are considered instead of linear agents; (ii) only one NN is used for each agent instead of three, which overcomes some disadvantages caused by the large number of NNs; (iii) no prior knowledge of the system internal dynamics is needed for analyzing and designing the control algorithms. We prove that the system parameters converge to the approximately optimal values, and that the cooperative tracking errors and the NN approximation errors are uniformly ultimately bounded.
3. Through simulations, the proposed algorithms are compared with other algorithms to demonstrate their effectiveness. To test our algorithms in practical applications, a hardware testbed consisting of NMRs equipped with an omnidirectional vision system is designed and constructed. The experimental results suggest that the method is effective for practical control applications involving multiple nonholonomic mobile mechanical agents or autonomous vehicles.
The paper is organized as follows. Section 2 provides the theoretical background on graph theory and nonholonomic mobile robots, from which the integrated cooperative control problem is derived. Section 3 designs an optimal cooperative tracking control scheme. Section 4 presents the simulation and experimental results. A brief conclusion is given in Section 5.
2. BACKGROUND AND PRELIMINARIES 
2.1. Distributed Communication Graph Theory 
Consider $m$ robots in a cooperative system. The distributed communication of the system can be represented by a directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathcal{A})$, where the robots are characterized by the set of nodes $\mathcal{V} = \{s_0, \dots, s_m\}$, in which $s_0$ is the leader node. Relationships among the robots are determined by the set of edges $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ with a connectivity weight matrix $\mathcal{A} = [a_{ij}]$, where $a_{ii} = 0$, $a_{ij} > 0$ if $(s_i, s_j) \in \mathcal{E}$, and $a_{ij} = 0$ otherwise. If the states of robot $j$ are available to robot $i$, then $s_j$ is a neighbor of $s_i$. All neighbors of $s_i$ form the set $N_i = \{j : s_j \in \mathcal{V}, (s_i, s_j) \in \mathcal{E}\}$. Define the graph Laplacian matrix $\mathcal{L} = \mathcal{D} - \mathcal{A} \in \mathbb{R}^{(m+1) \times (m+1)}$, where $\mathcal{D} = \mathrm{diag}(b_i)$ with $b_i = \sum_{j \in N_i} a_{ij}$. Note that the row sums of $\mathcal{L}$ are equal to zero.
A directed path is a sequence of ordered edges $(s_i, s_{i+1})$, $i = 0, \dots, m-1$. If a directed path from $s_i$ to $s_j$ exists for every pair of distinct nodes $s_i \neq s_j$, then the directed graph is strongly connected. The graph contains a directed spanning tree if there exists at least one node in the set $\{s_1, \dots, s_m\}$ with a directed path to all other nodes.
The connectivity matrix between the $i$th robot and its leader is defined as
$$\mathcal{C} = \mathrm{diag}(c_1, c_2, \dots, c_m) \tag{1}$$
where $c_i = 1$ if the $i$th robot connects to its leader, and $c_i = 0$ otherwise.
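As an illustration of these definitions, the following Python sketch builds the Laplacian and the leader connectivity weights for a small graph. The four-node chain topology, the unit weights, and the helper name `laplacian` are assumptions made for this example only; they are not the graph used in the experiments.

```python
def laplacian(adj):
    """Return L = D - A for a weighted adjacency matrix given as a list of rows."""
    n = len(adj)
    lap = [[-adj[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        # b_i is the sum of the weights a_ij over the neighbors of node i
        lap[i][i] = sum(adj[i][j] for j in range(n) if j != i)
    return lap


# Hypothetical chain s0 -> s1 -> s2 -> s3, where s0 is the leader.
# adj[i][j] > 0 means that node i uses the states of node j.
adj = [
    [0, 0, 0, 0],  # the leader s0 listens to nobody
    [1, 0, 0, 0],  # s1 listens to the leader
    [0, 1, 0, 0],  # s2 listens to s1
    [0, 0, 1, 0],  # s3 listens to s2
]
L = laplacian(adj)
row_sums = [sum(row) for row in L]

# Leader connectivity weights c_1..c_m of (1): only s1 is linked to s0.
c = [1, 0, 0]
```

Every entry of `row_sums` is zero, which is the row-sum property noted above.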
2.2. Nonholonomic Mobile Robot and Integrated Cooperative Control Problem 
Consider an NMR represented by the node $s_i$. Its mass $m_i$, including the mass of the platform without wheels and the mass of the wheels, is concentrated at the center point. The distance between the driven wheels is $b_i$, the radius of each wheel is $r_i$, and the distance from the center point to the driving axle is $l_i$. Without loss of generality, $l_i$ can be set to zero. The NMR is a mechanical system with $n$ generalized configuration variables $(q_{i1}, q_{i2}, \dots, q_{in})$ subject to $p$ nonholonomic constraints [19]. The kinematics and dynamics of the $i$th NMR are written as
$$\begin{aligned} \dot q_i &= S_i(q_i)\, v_i(t) \\ \bar M_i(q_i)\,\dot v_i + \bar C_i(q_i, \dot q_i)\, v_i + \bar F_i(\dot q_i) + \bar\eta_{di} &= \bar B_i(q_i)\,\eta_i \end{aligned} \tag{2}$$
where $q_i = [q_{i1}, q_{i2}, \dots, q_{in}]^T \in \mathbb{R}^{n}$ are position vectors, and $v_i(t) = [\nu_i, \omega_i]^T \in \mathbb{R}^{n-p}$ are velocity vectors, where $\nu_i$ and $\omega_i$ are the translational and rotational velocities. $S_i \in \mathbb{R}^{n \times (n-p)}$ are full-rank matrices, and $\bar B_i = S_i^T B_i$ with $B_i \in \mathbb{R}^{n \times (n-p)}$ are the input transformation matrices. $\bar M_i = S_i^T M_i S_i$ with $M_i \in \mathbb{R}^{n \times n}$ are inertia matrices consisting of the total mass $m_i$ and the moment of inertia $I_i$. Centripetal and Coriolis matrices are defined as $\bar C_i = S_i^T M_i \dot S_i + S_i^T C_i S_i$ with $C_i \in \mathbb{R}^{n \times n}$, and surface friction and gravitational vectors are defined as $\bar F_i = S_i^T F_i$ with $F_i \in \mathbb{R}^{n}$. Bounded disturbances, including unstructured unmodeled dynamics and external disturbances, are denoted by $\bar\eta_{di} = \bar B_i \eta_{di}$ with $\eta_{di} \in \mathbb{R}^{n-p}$, and control torque vectors are denoted by $\eta_i = [\eta_{li}, \eta_{ri}]^T \in \mathbb{R}^{n-p}$, where $\eta_{li}$ and $\eta_{ri}$ are the left and right torques, respectively.
Properties 1: $\bar M_i$ are symmetric positive definite matrices. The parameters in (2) are bounded (Boundedness [19, 24]), i.e., $m_{i\min} \le \|\bar M_i\| \le m_{i\max}$, $c_{i\min} \le \|\bar C_i\| \le c_{i\max}$, $\|\bar\eta_{di}\| \le \eta_{di\max}$, $s_{i\min} \le \|g_{qi}\| \le s_{i\max}$, $m_{i\max}^{-1}\|\bar B_i\| \le \|g_{vi}\| \le m_{i\min}^{-1}\|\bar B_i\|$, $m_{i\max}^{-1} \le \|k_{vi}\| \le m_{i\min}^{-1}$, and $\|\bar F_i\| \le f_{i\max}\|\dot q_i\| \le f_{i\max}\, s_{i\max}\|v_i\|$, with positive constant scalars $m_{i\min}$, $m_{i\max}$, $c_{i\max}$, $c_{i\min}$, $\eta_{di\max}$, $s_{i\min}$, $s_{i\max}$ and $f_{i\max}$.
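From (2), the nonlinear dynamics of NMR $i$ with unknown internal dynamics $f_{vi}$ and disturbances $\bar\eta_{di}$ can be written as
$$\begin{aligned} \dot q_i &= f_{qi}(q_i) + g_{qi}(q_i)\, v_i \\ \dot v_i &= f_{vi}(q_i, v_i) + g_{vi}(q_i, v_i)\,\eta_i + k_{vi}(q_i, v_i)\,\bar\eta_{di} \end{aligned} \tag{3}$$
where $f_{qi} = 0$, $g_{qi} = S_i$, $f_{vi} = -\bar M_i^{-1}(\bar C_i v_i + \bar F_i)$, $g_{vi} = \bar M_i^{-1} \bar B_i$, and $k_{vi} = -\bar M_i^{-1}$.
Remark 1: $f_{vi}$ is Lipschitz on the compact set $\Omega_i \subset \mathbb{R}^{n-p}$ such that
$$\|f_{vi}(q_i, v_i)\| \le m_{i\min}^{-1}\,(c_{i\max} + f_{i\max}\, s_{i\max})\,\|v_i\| \le \mu_{i\max}\|v_i\|$$
for a positive constant scalar $\mu_{i\max}$ [25]. $\bar\eta_{di}$ is $L_2$-bounded, i.e., $\int_0^\infty \|\bar\eta_{di}\|^2\, dt < \infty$ [26].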
Definition 1: It is assumed that a leader robot (virtual robot) generates the bounded smooth trajectory $q_0 \in \mathbb{R}^{n}$ that satisfies
$$\begin{aligned} \dot q_0 &= S_0(q_0)\, v_0 \\ \dot v_0 &= f_{v0}(q_0, v_0) \end{aligned} \tag{4}$$
where $v_0$ is the desired velocity, and $f_{v0}$ is an acceleration function that satisfies the Lipschitz assumption [27]. Then the cooperative tracking control problem of the multi-NMR system is to design $\eta_i$ in (3) so that, when $\bar\eta_{di} = 0$, each NMR that directly connects to its leader achieves $\|q_i(t) - q_0(t)\| \to 0$ and $\|v_i(t) - v_0(t)\| \to 0$, and each NMR that directly connects to its neighbors achieves, $\forall j \in N_i$: $\|q_i(t) - q_j(t)\| \to 0$, $\|v_i(t) - v_j(t)\| \to 0$.
For the $i$th NMR, the local tracking error functions are defined as [7]
$$e_{qi} = \sum_{j \in N_i} a_{ij}(q_i - q_j) + c_i(q_i - q_0) \tag{5}$$
$$e_{vi} = \sum_{j \in N_i} a_{ij}(v_i - v_j) + c_i(v_i - v_0). \tag{6}$$
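The local errors (5) and (6) share one structure, which the following Python sketch evaluates for a single follower. The function name `local_error`, the zero-based follower indexing, and the numerical states are assumptions chosen only for illustration.

```python
def local_error(i, x, x0, a, c):
    """Local neighborhood tracking error of follower i, as in (5) or (6).

    x  : list of follower state vectors (positions q or velocities v)
    x0 : leader state vector
    a  : follower adjacency weights, a[i][j] > 0 if j is a neighbor of i
    c  : leader connectivity weights, c[i] = 1 if i is linked to the leader
    """
    dim = len(x0)
    err = [0.0] * dim
    for j, a_ij in enumerate(a[i]):
        if a_ij > 0:
            for k in range(dim):
                err[k] += a_ij * (x[i][k] - x[j][k])
    for k in range(dim):
        err[k] += c[i] * (x[i][k] - x0[k])
    return err


# Hypothetical example: three followers in a chain behind the leader.
q0 = [1.0, 2.0, 0.0]
q = [[1.5, 2.0, 0.0], [2.0, 2.5, 0.1], [0.0, 0.0, 0.0]]
a = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
c = [1, 0, 0]
e_q = [local_error(i, q, q0, a, c) for i in range(3)]
```

Only the first follower uses the leader state; the others use their neighbor's state, so every entry of `e_q` is computable from locally available information.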
Furthermore, to avoid collisions, (5) is written as
$$e_{qi} = \sum_{j \in N_i} a_{ij}(q_i - \delta_i - q_j + \delta_j) + c_i(q_i - \delta_i - q_0) \tag{7}$$
where $\delta_i, \delta_j \in \mathbb{R}^{n}$ are coordinates of the front points on NMRs $i$ and $j$. Taking the derivative of
(5) or (7) and (6), and using the notation of (3) and (4), the tracking dynamics are rewritten as
$$\dot e_{qi} = -c_i \dot q_0 + (b_i + c_i)\, g_{qi} v_i - \sum_{j \in N_i} a_{ij}\, g_{qj} v_j \tag{8}$$
$$\dot e_{vi} = f_{evi}(t) + (b_i + c_i)(g_{vi}\eta_i + k_{vi}\bar\eta_{di}) - \sum_{j \in N_i} a_{ij}(g_{vj}\eta_j + k_{vj}\bar\eta_{dj}) \tag{9}$$
where $f_{evi}(t) = \sum_{j \in N_i} a_{ij}(f_{vi} - f_{vj}) + c_i(f_{vi} - f_{v0})$.
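For $m$ NMRs, the overall tracking dynamics are given by
$$\dot e_q = -(\mathcal{C} \otimes I_n)\,\underline{\dot q}_0 + \big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_n\big)\, g_q(q)\, v \tag{10}$$
$$\dot e_v = f_{ev}(t) + \big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p}\big)\big(g_v(q, v)\,\eta + k_v(q, v)\,\bar\eta_d\big) \tag{11}$$
where $\otimes$ is the Kronecker product operator, $\underline{q}_0 = (\underline{1} \otimes I_n)\, q_0$ with $\underline{1} = [1, \dots, 1]^T \in \mathbb{R}^{m}$, $I_n$ and $I_{n-p}$ are identity matrices, $q = [q_1^T, \dots, q_m^T]^T$, $e_q = [e_{q1}^T, \dots, e_{qm}^T]^T$, $g_q(q) = \mathrm{diag}[g_{q1}, \dots, g_{qm}]$, $v = [v_1^T, \dots, v_m^T]^T$, $e_v = [e_{v1}^T, \dots, e_{vm}^T]^T$, $f_{ev}(t) = [f_{ev1}^T, \dots, f_{evm}^T]^T$, $g_v(q, v) = \mathrm{diag}[g_{v1}, \dots, g_{vm}]$, $k_v(q, v) = \mathrm{diag}[k_{v1}, \dots, k_{vm}]$, $\eta = [\eta_1^T, \dots, \eta_m^T]^T$, and $\bar\eta_d = [\bar\eta_{d1}^T, \dots, \bar\eta_{dm}^T]^T$. Here $\bar{\mathcal{L}} = \bar{\mathcal{D}} - \bar{\mathcal{A}}$, where $\bar{\mathcal{D}}$ and $\bar{\mathcal{A}}$ are sub-matrices of $\mathcal{D}$ and $\mathcal{A}$, formed by removing the elements of the leader.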
Note that (10) and (11) are the kinematic and dynamic equations, respectively. In most reported studies, kinematic and dynamic controllers were designed successively based on these equations. In this paper, the objective of our design is to obtain integrated controllers without separating kinematics and dynamics. To this end, the following transformations are performed.
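Adding and subtracting $\big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_n\big)\, g_q(q)\, e_v$ in (10) yields
$$\dot e_q = \big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_n\big)\, g_q(q)\,(v - v_a) + \big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_n\big)\, g_q(q)\, e_v \tag{12}$$
where $v_a = [v_{a1}^T, \dots, v_{am}^T]^T$ satisfies
$$\big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_n\big)\, g_q(q)\, v_a = (\mathcal{C} \otimes I_n)\,\underline{\dot q}_0 + \big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_n\big)\, g_q(q)\, e_v. \tag{13}$$
Adding and subtracting $\big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p}\big)\, g_q^T(q)\, e_q$ in (11) yields
$$\dot e_v = f_{ev}(t) + \big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p}\big)\big(g_v(q, v)(\eta - \eta_a) + k_v(q, v)\,\bar\eta_d\big) - \big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p}\big)\, g_q^T(q)\, e_q \tag{14}$$
where $\eta_a = [\eta_{a1}^T, \dots, \eta_{am}^T]^T$ satisfies
$$\big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p}\big)\, g_v(q, v)\,\eta_a = -\big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p}\big)\, g_q^T(q)\, e_q. \tag{15}$$
Next, the pseudo-control inputs of (10) and the real control inputs of (11) are defined as
$$v = v_a + v^* \tag{16}$$
$$\eta = \eta_a + \eta^* \tag{17}$$
where the bounded $L_2$-gain optimal control inputs $v^* = [v_1^{*T}, \dots, v_m^{*T}]^T$ and $\eta^* = [\eta_1^{*T}, \dots, \eta_m^{*T}]^T$ will be designed in the next section.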
Now, we introduce Lemma 1 to convert the cooperative tracking control problem of the multi-NMR system into the stabilization of an integrated cooperative dynamical system.
Lemma 1: Let the integrated cooperative control inputs $u_r = [v^T, \eta^T]^T = u_a + u^*$ satisfy (16) and (17), where the inputs $u_a = [v_a^T, \eta_a^T]^T$ satisfy (13) and (15), and the bounded $L_2$-gain optimal control inputs $u^* = [v^{*T}, \eta^{*T}]^T$ stabilize the following integrated cooperative dynamical system
$$\dot e = f_e(t) + \big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{2n-p}\big)\big(g(x)\, u^* + k(x)\, d\big) \tag{18}$$
where $e = [e_q^T, e_v^T]^T$, $x = [q^T, v^T]^T$, $d = [0_{1 \times mn}, \bar\eta_d^T]^T$, $f_e(t) = [0_{1 \times mn}, f_{ev}^T]^T$ denotes the unknown internal dynamics, and $k(x) = \mathrm{diag}[0_{mn \times m(n-p)}, k_v]$ and $g(x) = \mathrm{diag}[g_q, g_v]$ are input matrices. Then, the optimal cooperative tracking control scheme of the multi-NMR system with dynamics (3) is equivalent to the bounded $L_2$-gain optimal control scheme of (18). In other words, if the control law $u_r = u_a + u^*$ is applied to the dynamical systems (3), the tracking error dynamics are transformed into the cooperative dynamical system (18).
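Proof: To evaluate the stability of the system (3) when applying $u_r$, we rewrite the tracking error dynamic equations of (3) by substituting the control laws (16) and (17) into (12) and (14), noting (13) and (15):
$$\begin{bmatrix} \dot e_q \\ \dot e_v \end{bmatrix} = \begin{bmatrix} 0 \\ f_{ev}(t) \end{bmatrix} + \begin{bmatrix} ((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_n)\, g_q(q) & 0 \\ 0 & ((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p})\, g_v(q, v) \end{bmatrix} \begin{bmatrix} v^* \\ \eta^* \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & ((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p})\, k_v(q, v) \end{bmatrix} \begin{bmatrix} 0 \\ \bar\eta_d \end{bmatrix} + \begin{bmatrix} ((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_n)\, g_q(q) & 0 \\ 0 & -((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p})\, g_q^T(q) \end{bmatrix} \begin{bmatrix} e_v \\ e_q \end{bmatrix} \tag{19}$$
Then, we choose the candidate Lyapunov function $J = e^T e / 2$ and take its derivative along (19):
$$\dot J = e^T \left\{ \begin{bmatrix} 0 \\ f_{ev}(t) \end{bmatrix} + \begin{bmatrix} ((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_n)\, g_q(q) & 0 \\ 0 & ((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p})\, g_v(q, v) \end{bmatrix} \begin{bmatrix} v^* \\ \eta^* \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & ((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p})\, k_v(q, v) \end{bmatrix} \begin{bmatrix} 0 \\ \bar\eta_d \end{bmatrix} \right\} + \begin{bmatrix} e_q^T & e_v^T \end{bmatrix} \begin{bmatrix} ((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_n)\, g_q(q) & 0 \\ 0 & -((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{n-p})\, g_q^T(q) \end{bmatrix} \begin{bmatrix} e_v \\ e_q \end{bmatrix}. \tag{20}$$
One may recognize that the last term on the right-hand side of (20) is equal to zero. Thus, Eq. (20) can be rewritten in the reduced form
$$\dot J = e^T \Big( f_e(t) + \big((\bar{\mathcal{L}} + \mathcal{C}) \otimes I_{2n-p}\big)\big(g(x)\, u^* + k(x)\, d\big) \Big). \tag{21}$$
On the other hand, if we choose the same Lyapunov candidate $J$ for the closed-loop dynamical system (18), taking its derivative gives (21) as well. One can conclude that the existence of a bounded $L_2$-gain optimal tracking controller that makes (21) negative, and thus stabilizes the dynamical system (18), is sufficient to make the dynamical systems (3) stable.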
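The cancellation used in the step from (20) to (21) can be checked numerically. The sketch below assumes that the two coupling blocks are exact negative transposes of each other, written here as a generic matrix `G` and `-G^T`; the matrix entries and error vectors are arbitrary illustrative numbers, not values from the testbed.

```python
def matvec(m, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


G = [[1.0, 0.5], [0.0, 2.0], [-1.5, 0.25]]  # maps e_v (dim 2) into e_q space (dim 3)
e_q = [0.3, -1.2, 0.7]
e_v = [2.0, -0.4]

# e_q^T G e_v - e_v^T G^T e_q: the two scalars are transposes of each other
cross = dot(e_q, matvec(G, e_v)) - dot(e_v, matvec(transpose(G), e_q))
```

Under that assumption `cross` vanishes up to rounding for any choice of `G`, `e_q` and `e_v`, which is why the coupling terms do not appear in (21).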
Remark 2: Because the system internal dynamics $f_e(t)$ and the external disturbance $d$ in (18) are derived from (3), for which Properties 1 and Remark 1 hold, $f_e(t)$ is completely unknown and the disturbance satisfies $d \in L_2[0, \infty)$.
3. DESIGN OF OPTIMAL COOPERATIVE TRACKING CONTROL SCHEME WITH 
DISTURBANCE REJECTION 
In this section, motivated by the design of the optimal cooperative tracking control scheme 
with disturbance rejection but only applied to multiple linear agents [14], we propose a novel 
control scheme to apply to multiple nonlinear agents, of which the multi-NMR system in the 
paper is one. 
3.1. Bounded $L_2$-Gain Problem for Multi-NMR Systems
Consider the nonlinear cooperative dynamics of each NMR, derived from (18), with the measured outputs $y_i \in \mathbb{R}^{q}$
$$\begin{aligned} \dot e_i &= f_{ei}(t) + (b_i + c_i)\big(g_i(x_i)\, u_i + k_i(x_i)\, d_i\big) - \sum_{j \in N_i} a_{ij}\big(g_j(x_j)\, u_j + k_j(x_j)\, d_j\big) \\ y_i &= h_i(e_i) \end{aligned} \tag{22}$$
where $h_i(e_i)$ are continuous smooth functions. Define the general disturbances $\omega_i = [d_i^T, d_{-i}^T]^T$ with $d_{-i} = \{d_j : j \in N_i\}$, and the performance outputs $z_i = [e_i^T(t), u_i^T(t), u_{-i}^T(t)]^T$ with $u_{-i} = \{u_j : j \in N_i\}$, satisfying the following bounded $L_2$-gain inequality for all NMRs when the disturbances $\omega_i \neq 0$:
$$\int_0^T \|z_i\|^2 dt = \int_0^T \Big( Q_{ii}(e_i) + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j \Big) dt \le \gamma^2 \int_0^T \|\omega_i\|^2 dt + \beta(e_i(0)) = \gamma^2 \int_0^T \Big( d_i^T T_{ii} d_i + \sum_{j \in N_i} d_j^T T_{ij} d_j \Big) dt + \beta(e_i(0)) \tag{23}$$
for some bounded functions $\beta$ such that $\beta(0) = 0$ [28], where $Q_{ii}(e_i) > 0$ for all $e_i \neq 0$ and $Q_{ii}(0) = 0$ for $e_i = 0$, $R_{ii} > 0$, $R_{ij} \ge 0$, $T_{ii} > 0$ and $T_{ij} \ge 0$. $\gamma \ge \gamma^*$ is the prescribed disturbance attenuation level, where $\gamma^*$ is the minimum gain $\gamma$ for which the bounded $L_2$-gain condition (23) is satisfied.
The objective in this section is to design an optimal cooperative tracking control scheme for each NMR subject to unknown internal dynamics $f_{ei}$ and external disturbances $d_i$ and $d_j$ so that all signals in (22) are $L_2$-bounded.
Define the local infinite-horizon tracking performance function for each NMR
$$J_i(e_i(0), u_i, u_{-i}, d_i, d_{-i}) = \int_0^\infty \Big( Q_{ii}(e_i) + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j - \gamma^2 d_i^T T_{ii} d_i - \gamma^2 \sum_{j \in N_i} d_j^T T_{ij} d_j \Big) dt. \tag{24}$$
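The inequality (23) can be checked on logged signals. The following Python sketch assumes that $z_i$ and $\omega_i$ are available as uniformly sampled vectors, that the weighting matrices are identities, and that the integrals are approximated by a rectangle rule; the function names and the sample data are illustrative only.

```python
def energy(signal, dt):
    """Rectangle-rule approximation of the integral of ||s(t)||^2."""
    return sum(sum(c * c for c in sample) for sample in signal) * dt


def l2_gain_satisfied(z, w, dt, gamma, beta0=0.0):
    """Check  int ||z||^2 dt <= gamma^2 * int ||w||^2 dt + beta0  on samples."""
    return energy(z, dt) <= gamma ** 2 * energy(w, dt) + beta0


# Hypothetical logged samples of the performance output and the disturbance.
z = [[1.0, 0.0], [0.5, 0.5], [0.0, 0.0]]
w = [[1.0], [1.0], [0.0]]

ok_gamma_1 = l2_gain_satisfied(z, w, dt=0.1, gamma=1.0)
ok_gamma_half = l2_gain_satisfied(z, w, dt=0.1, gamma=0.5)
```

For this data the attenuation level `gamma = 1.0` is met while `gamma = 0.5` is not, which illustrates the role of the minimum gain $\gamma^*$.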
Two-player zero-sum differential game theory [29] can be used to find solutions of the bounded $L_2$-gain problem for the system (22) subject to (24):
$$V_i^*(e_i(0)) = \min_{u_i} \max_{d_i} J_i(e_i(0), u_i, u_{-i}, d_i, d_{-i}), \tag{25}$$
i.e., the saddle point $(u_i^*, d_i^*)$, where $u_i^*$ and $d_i^*$ are the optimal control law and the worst disturbance law, respectively, satisfies
$$V_i^*(e_i(0)) = \min_{u_i} \max_{d_i} J_i(e_i(0), u_i, u_{-i}, d_i, d_{-i}) = \max_{d_i} \min_{u_i} J_i(e_i(0), u_i, u_{-i}, d_i, d_{-i}). \tag{26}$$
$V_i^*$ is known as the Nash equilibrium value of the multi-player game, which satisfies the following constraints for all laws $u_i$ and $d_i$:
$$J_i(u_i^*, u_{-i}^*, d_i, d_{-i}^*) \le J_i(u_i^*, u_{-i}^*, d_i^*, d_{-i}^*) \le J_i(u_i, u_{-i}^*, d_i^*, d_{-i}^*). \tag{27}$$
Following the ADP principle, for state feedback control laws $u_i$ and $d_i$, we define a local cooperative value function for the $i$th NMR as [14]
$$V_i(e_i(t)) = \int_t^\infty \Big( Q_{ii}(e_i) + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j - \gamma^2 d_i^T T_{ii} d_i - \gamma^2 \sum_{j \in N_i} d_j^T T_{ij} d_j \Big) ds. \tag{28}$$
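Using Leibniz's formula, a differential equivalent to (28) is given by the Hamiltonian
$$H_i(e_i, u_i, u_{-i}, d_i, d_{-i}, V_{ei}) \equiv V_{ei}^T \Big( f_{ei} + (b_i + c_i)\big(g_i(x_i) u_i + k_i(x_i) d_i\big) - \sum_{j \in N_i} a_{ij}\big(g_j(x_j) u_j + k_j(x_j) d_j\big) \Big) + Q_{ii}(e_i) + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j - \gamma^2 d_i^T T_{ii} d_i - \gamma^2 \sum_{j \in N_i} d_j^T T_{ij} d_j = 0 \tag{29}$$
where $V_{ei} = \partial V_i / \partial e_i$. Using the stationarity condition for (29), one obtains
$$\frac{\partial H_i}{\partial u_i} = 0 \;\Rightarrow\; u_i(e_i) = -\frac{1}{2}(b_i + c_i)\, R_{ii}^{-1} g_i^T(x_i)\, \frac{\partial V_i}{\partial e_i} \tag{30}$$
$$\frac{\partial H_i}{\partial d_i} = 0 \;\Rightarrow\; d_i(e_i) = \frac{1}{2\gamma^2}(b_i + c_i)\, T_{ii}^{-1} k_i^T(x_i)\, \frac{\partial V_i}{\partial e_i}, \tag{31}$$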
with boundary condition $V_i(0) = 0$. The following coupled HJI equations for the cooperative tracking problem are obtained by substituting (30) and (31) into (29):
$$\begin{aligned} & Q_{ii}(e_i) + V_{ei}^T f_{ei}^c + \frac{1}{4}(b_i + c_i)^2\, V_{ei}^T g_i(x_i) R_{ii}^{-1} g_i^T(x_i) V_{ei} + \frac{1}{4} \sum_{j \in N_i} (b_j + c_j)^2\, V_{ej}^T g_j(x_j) R_{jj}^{-1} R_{ij} R_{jj}^{-1} g_j^T(x_j) V_{ej} \\ & \quad - \frac{1}{4\gamma^2} \sum_{j \in N_i} (b_j + c_j)^2\, V_{ej}^T k_j(x_j) T_{jj}^{-1} T_{ij} T_{jj}^{-1} k_j^T(x_j) V_{ej} - \frac{1}{4\gamma^2}(b_i + c_i)^2\, V_{ei}^T k_i(x_i) T_{ii}^{-1} k_i^T(x_i) V_{ei} = 0 \end{aligned} \tag{32}$$
where the closed-loop system corresponding to the $i$th NMR is defined as
$$\begin{aligned} f_{ei}^c = {} & f_{ei}(t) - \frac{1}{2}(b_i + c_i)^2\, g_i(x_i) R_{ii}^{-1} g_i^T(x_i) V_{ei} + \frac{1}{2} \sum_{j \in N_i} a_{ij}(b_j + c_j)\, g_j(x_j) R_{jj}^{-1} g_j^T(x_j) V_{ej} \\ & + \frac{1}{2\gamma^2}(b_i + c_i)^2\, k_i(x_i) T_{ii}^{-1} k_i^T(x_i) V_{ei} - \frac{1}{2\gamma^2} \sum_{j \in N_i} a_{ij}(b_j + c_j)\, k_j(x_j) T_{jj}^{-1} k_j^T(x_j) V_{ej}. \end{aligned} \tag{33}$$
Let all optimal control laws and worst disturbance laws be functions of a given solution $V_i$, i.e., $u_i^* = u_i(V_i)$, $u_{-i}^* = u_{-i}(V_{-i})$, $d_i^* = d_i(V_i)$ and $d_{-i}^* = d_{-i}(V_{-i})$; then the HJI equations (32) become
$$H_i\Big(e_i, u_i^*, u_{-i}^*, d_i^*, d_{-i}^*, \frac{\partial V_i}{\partial e_i}\Big) = 0, \quad V_i(0) = 0. \tag{34}$$
Lemma 2: Assume that $V_i(e_i)$, $i = 1, \dots, m$ is smooth, $V_i(e_i) > 0$, and $V_i(e_i)$ is a solution to the coupled HJI equation (32). Let the optimal control laws of the neighbors be given. Then, for every $u_i$ and $d_i$, the following condition holds:
$$H_i\Big(e_i, u_i, u_{-i}^*, d_i, d_{-i}^*, \frac{\partial V_i}{\partial e_i}\Big) = (u_i - u_i^*)^T R_{ii} (u_i - u_i^*) - \gamma^2 (d_i - d_i^*)^T T_{ii} (d_i - d_i^*). \tag{35}$$
Proof: Complete the squares in (32) to obtain (35).
Lemma 3: Choose $\gamma \ge \gamma^*$. Assume that $V_i^*$, $i = 1, \dots, m$ is smooth, $V_i^* > 0$, and $V_i^*$ is a solution to the coupled HJI equation (32). Let the optimal control laws of the neighbors be given. Then the equilibrium point of the closed-loop system
$$\dot e_i = f_{ei}(t) + (b_i + c_i)\, g_i(x_i)\, u_i^* - \sum_{j \in N_i} a_{ij}\, g_j(x_j)\, u_j^* \tag{36}$$
is asymptotically stable with control inputs $u_i^* = u_i(V_i^*)$ given by (30) in terms of $V_i^*$. In addition, in the presence of disturbances $d_i$, $u_i^*$ makes the bounded $L_2$-gain condition (23) satisfied, where $Q_{ii} = h_i^T Q_i h_i$ and $Q_i$ is a positive-definite matrix.
Proof: The proof is similar to the proof of Theorem 1 in [14] and is omitted here.
Lemma 4 (Solution to Multi-player Zero-sum Game [14]): Choose $\gamma \ge \gamma^*$. Assume that the value of the game (33) is finite and that the optimal control laws of the neighbors are available. Let $V_i^*$, $i = 1, \dots, m$ be smooth, $V_i^* > 0$, and be a solution to the coupled HJI equation (32), such that the closed-loop system
$$\dot e_i = f_{ei}(t) + (b_i + c_i)\, g_i(x_i)\, u_i^* - \sum_{j \in N_i} a_{ij}\, g_j(x_j)\, u_j^* + (b_i + c_i)\, k_i(x_i)\, d_i^* - \sum_{j \in N_i} a_{ij}\, k_j(x_j)\, d_j^* \tag{37}$$
is asymptotically stable around its equilibrium point. Then, the Nash condition (27) is satisfied for the control and disturbance laws $u_i^* = u_i(V_i^*)$ and $d_i^* = d_i(V_i^*)$, given by (30) and (31) in terms of $V_i^*$. Further, the value of the game exists and is given by the solution $V_i^*(e_i(0))$, $i = 1, \dots, m$ of the HJI equation (32).
Proof: The proof is similar to the proof of Theorem 2 in [14] and is omitted here.
It is shown in [29] that the nonlinear bounded $L_2$-gain optimal control problem relies on the values of multi-player zero-sum games; in particular, as shown in Lemma 4, the values are the solutions of the coupled HJI equations (32). However, the HJI equations generally cannot be solved analytically.
3.2. Optimal Cooperative Tracking Control Scheme with Disturbance Rejection and 
Algorithms in Real Time 
In this section, we design a control scheme and online algorithms to solve the HJI equations 
(32) based on reinforcement learning techniques of [12, 14]. However, in contrast to most 
existing schemes using ADP for learning the solutions of the HJI equations, the scheme in this 
paper uses only one critic NN and does not need knowledge of system internal dynamics. 
Moreover, the online algorithms synchronously update parameters in one iterative loop. 
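According to the Weierstrass higher-order approximation theorem [30], there exists an NN such that the smooth value function $V_i^*$, $i = 1, \dots, m$ is approximated by
$$V_i^*(e_i) = W_i^T \sigma_i(e_i) + \varepsilon_i(e_i) \tag{38}$$
where $\sigma_i(e_i): \mathbb{R}^{2n-p} \to \mathbb{R}^{N}$ is the activation function vector of the $N$ neurons in the hidden layer, $\varepsilon_i(e_i)$ is the NN approximation error, and $W_i \in \mathbb{R}^{N}$ is the ideal weight vector.
Properties 2 (Approximation [31]): $\sigma_i(e_i)$, $i = 1, \dots, m$ can be selected as a complete independent basis set so that when $N \to \infty$, $\varepsilon_i(e_i) \to 0$ and $\varepsilon_{ei}(e_i) = \partial \varepsilon_i(e_i) / \partial e_i \to 0$, and for fixed $N$, $\|\varepsilon_i(e_i)\| \le \varepsilon_{i\max}$ and $\|\varepsilon_{ei}(e_i)\| \le \varepsilon_{ei\max}$, where $\varepsilon_{i\max}$ and $\varepsilon_{ei\max}$ are positive constants.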
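A minimal Python sketch of a single-layer critic of the form (38) is given below. The quadratic monomial basis is an assumed choice of activation functions, made here only because its gradient is easy to write down; the approximation error term is ignored, and the weights are arbitrary numbers.

```python
def sigma(e):
    """Assumed quadratic basis: all products e_a * e_b with a <= b."""
    n = len(e)
    return [e[a] * e[b] for a in range(n) for b in range(a, n)]


def sigma_jacobian(e):
    """Rows are the gradients of each basis function with respect to e."""
    n = len(e)
    rows = []
    for a in range(n):
        for b in range(a, n):
            grad = [0.0] * n
            grad[a] += e[b]
            grad[b] += e[a]  # for a == b this accumulates to 2 * e_a
            rows.append(grad)
    return rows


def value(w, e):
    """V(e) = W^T sigma(e), without the approximation error term."""
    return sum(w_k * s_k for w_k, s_k in zip(w, sigma(e)))


def value_gradient(w, e):
    """dV/de = sigma_e^T W, the quantity that enters the control laws."""
    jac = sigma_jacobian(e)
    return [sum(w[k] * jac[k][col] for k in range(len(w))) for col in range(len(e))]


w = [1.0, 0.5, 2.0]
e = [1.0, 2.0]
v = value(w, e)
grad_v = value_gradient(w, e)
```

With these numbers the critic represents $V = e_1^2 + 0.5\, e_1 e_2 + 2 e_2^2$, so the value and its gradient can be verified by hand.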
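Substituting (30), (31) and (38) into (32), the NN-based coupled HJI equations are obtained:
$$\begin{aligned} & Q_{ii}(e_i) + W_i^T \sigma_{ei} f_{ei}(t) - \frac{1}{4}(b_i + c_i)^2\, W_i^T \sigma_{ei} (\bar g_i - \bar k_i) \sigma_{ei}^T W_i + \frac{1}{2} \sum_{j \in N_i} a_{ij}(b_j + c_j)\, W_i^T \sigma_{ei} (\bar g_j - \bar k_j) \sigma_{ej}^T W_j \\ & \quad + \frac{1}{4} \sum_{j \in N_i} (b_j + c_j)^2\, W_j^T \sigma_{ej} (\bar g_{-i} - \bar k_{-i}) \sigma_{ej}^T W_j = \varepsilon_{Hi} \end{aligned} \tag{39}$$
where, for $l \in \{i, j\}$, $\bar g_l = g_l R_{ll}^{-1} g_l^T$, $\bar k_l = \frac{1}{\gamma^2} k_l T_{ll}^{-1} k_l^T$, $\sigma_{el} = \partial \sigma_l(e_l) / \partial e_l$, $\varepsilon_{el} = \partial \varepsilon_l / \partial e_l$, and $\bar g_{-i} = g_j R_{jj}^{-1} R_{ij} R_{jj}^{-1} g_j^T$, $\bar k_{-i} = \frac{1}{\gamma^2} k_j T_{jj}^{-1} T_{ij} T_{jj}^{-1} k_j^T$. The residual errors $\varepsilon_{Hi}$, caused by the function approximation errors, are computed as
$$\begin{aligned} \varepsilon_{Hi} = {} & -\varepsilon_{ei}^T \big( f_{ei}(t) + (b_i + c_i)(g_i u_i^* + k_i d_i^*) \big) - \frac{1}{4}(b_i + c_i)^2\, \varepsilon_{ei}^T (\bar g_i - \bar k_i) \varepsilon_{ei} + \varepsilon_{ei}^T \sum_{j \in N_i} a_{ij}(g_j u_j^* + k_j d_j^*) \\ & - \frac{1}{2} \sum_{j \in N_i} a_{ij}(b_j + c_j)\, \varepsilon_{ei}^T (\bar g_j - \bar k_j) \varepsilon_{ej} + \sum_{j \in N_i} (b_j + c_j)\, \varepsilon_{ej}^T \big( g_j R_{jj}^{-1} R_{ij} u_j^* + k_j T_{jj}^{-1} T_{ij} d_j^* \big) \\ & + \frac{1}{4} \sum_{j \in N_i} (b_j + c_j)^2\, \varepsilon_{ej}^T (\bar g_{-i} - \bar k_{-i}) \varepsilon_{ej}. \end{aligned} \tag{40}$$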
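Remark 3: According to Properties 1, $\forall l \in \{i, j, -i\}$, the functions $\bar g_l$ and $\bar k_l$ are positive definite and bounded, e.g., $0 < \bar g_{i\min} \le \|\bar g_i\| \le \bar g_{i\max}$, where $\bar g_{i\min} = g_{i\min}^2 / \lambda_{\max}(R_{ii})$ and $\bar g_{i\max} = g_{i\max}^2 / \lambda_{\min}(R_{ii})$, with $\lambda_{\min}$ and $\lambda_{\max}$ denoting the smallest and largest eigenvalues, respectively. Then, using Properties 2, $\varepsilon_{Hi}$ is bounded on a compact set. That is, $\forall N > 0$, $\exists\, \varepsilon_{Hi\max}(N)$: $\sup \|\varepsilon_{Hi}\| \le \varepsilon_{Hi\max}$. Moreover, if $N \to \infty$, $\varepsilon_{Hi}$ converges uniformly to zero [30].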
The ideal weight vectors $W_i$ in (38) are unknown, thus $V_i(e_i)$ are approximated using the estimates $\hat W_i$:
$$\hat V_i(e_i) = \hat W_i^T \sigma_i(e_i). \tag{41}$$
Then, the estimated control and disturbance laws become
$$\hat u_i(e_i) = -\frac{1}{2}(b_i + c_i)\, R_{ii}^{-1} g_i^T(x_i)\, \sigma_{ei}^T \hat W_i, \tag{42}$$
$$\hat d_i(e_i) = \frac{1}{2\gamma^2}(b_i + c_i)\, T_{ii}^{-1} k_i^T(x_i)\, \sigma_{ei}^T \hat W_i. \tag{43}$$
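The structure of (42) and (43) can be sketched in Python as follows. The sketch assumes diagonal weighting matrices $R_{ii}$ and $T_{ii}$ so that no matrix inversion routine is needed, and it takes the estimated gradient $\sigma_{ei}^T \hat W_i$ as an input; the function name, argument names and numbers are illustrative assumptions.

```python
def estimated_laws(grad_v, g, k, r_diag, t_diag, b_plus_c, gamma):
    """Return (u_hat, d_hat) following the structure of (42) and (43).

    grad_v   : sigma_e^T W_hat, the estimated gradient of the value function
    g, k     : input and disturbance matrices as lists of rows
    r_diag   : diagonal entries of R_ii (assumed diagonal in this sketch)
    t_diag   : diagonal entries of T_ii (assumed diagonal in this sketch)
    b_plus_c : the scalar b_i + c_i
    """
    rows = range(len(grad_v))
    # g^T grad_v and k^T grad_v, computed column by column
    gt_v = [sum(g[r][col] * grad_v[r] for r in rows) for col in range(len(g[0]))]
    kt_v = [sum(k[r][col] * grad_v[r] for r in rows) for col in range(len(k[0]))]
    u_hat = [-0.5 * b_plus_c * x / r for x, r in zip(gt_v, r_diag)]
    d_hat = [0.5 * b_plus_c * x / (gamma ** 2 * t) for x, t in zip(kt_v, t_diag)]
    return u_hat, d_hat


u_hat, d_hat = estimated_laws(
    grad_v=[3.0, 8.5],
    g=[[1.0, 0.0], [0.0, 2.0]],
    k=[[0.0], [1.0]],
    r_diag=[1.0, 2.0],
    t_diag=[1.0],
    b_plus_c=2.0,
    gamma=2.0,
)
```

Both laws are driven by the same critic gradient, which is why a single NN per agent suffices: the control law descends the value gradient while the disturbance law ascends it, scaled by $1/\gamma^2$.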
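The approximate Hamiltonians are obtained by substituting (43), (42) and (41) into (29):
$$\begin{aligned} \hat H_i(e_i, \hat W_i, \hat u_i, \hat u_{-i}, \hat d_i, \hat d_{-i}) = {} & Q_{ii}(e_i) + \hat W_i^T \sigma_{ei} f_{ei}(t) - \frac{1}{4}(b_i + c_i)^2\, \hat W_i^T \sigma_{ei} (\bar g_i - \bar k_i) \sigma_{ei}^T \hat W_i \\ & + \frac{1}{2} \sum_{j \in N_i} a_{ij}(b_j + c_j)\, \hat W_i^T \sigma_{ei} (\bar g_j - \bar k_j) \sigma_{ej}^T \hat W_j + \frac{1}{4} \sum_{j \in N_i} (b_j + c_j)^2\, \hat W_j^T \sigma_{ej} (\bar g_{-i} - \bar k_{-i}) \sigma_{ej}^T \hat W_j. \end{aligned} \tag{44}$$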
Besides tuning $\hat W_i$ to minimize the residual error functions of (44) such that $\hat W_i \to W_i$, it is desired that the assumption of identifying knowledge of the internal dynamics be removed. The residual error functions are chosen as the squared integral functions $E_i = \frac{1}{2} e_{Hi}^T e_{Hi}$, where
$$e_{Hi} = \int_{t-T}^{t} \hat H_i(e_i, \hat W_i, \hat u_i, \hat u_{-i}, \hat d_i, \hat d_{-i})\, d\eta \tag{45}$$
where $T > 0$ is a chosen time interval. Based on the normalized gradient descent scheme modified from the Levenberg-Marquardt algorithm, we propose the following weight-tuning laws:
$$\dot{\hat W}_i = \begin{cases} -\alpha_i \dfrac{\zeta_i}{(\zeta_i^T \zeta_i + 1)^2}\, \Psi_i & \text{if } e_{it}^T e_{it} \le e_{i(t-T)}^T e_{i(t-T)} \\[2ex] -\alpha_i \dfrac{\zeta_i}{(\zeta_i^T \zeta_i + 1)^2}\, \Psi_i + \Xi_i & \text{otherwise} \end{cases} \tag{46}$$
where $e_{it} = e_i(t)$, $e_{i(t-T)} = e_i(t - T)$, and $\zeta_i$, $\Psi_i$ and $\Xi_i$ are defined in (47), (48) and (49), respectively, with positive constants $\alpha_i$ and $\beta_i$.
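$$\begin{aligned} \zeta_i &= \int_{t-T}^{t} \sigma_{ei} \Big( f_{ei} - \frac{1}{2}(b_i + c_i)^2 (\bar g_i - \bar k_i) \sigma_{ei}^T \hat W_i + \frac{1}{2} \sum_{j \in N_i} a_{ij}(b_j + c_j)(\bar g_j - \bar k_j) \sigma_{ej}^T \hat W_j \Big) d\eta = \int_{t-T}^{t} \sigma_{ei}\, \dot e_i(s)\, ds \\ &= \sigma_i(e_i(t)) - \sigma_i(e_i(t - T)) \equiv \Delta\sigma_i(e_i(t)). \end{aligned} \tag{47}$$
$$\Psi_i = \hat W_i^T \zeta_i + \int_{t-T}^{t} \Big( Q_{ii}(e_i) + \frac{1}{4}(b_i + c_i)^2\, \hat W_i^T \sigma_{ei} (\bar g_i - \bar k_i) \sigma_{ei}^T \hat W_i + \frac{1}{4} \sum_{j \in N_i} (b_j + c_j)^2\, \hat W_j^T \sigma_{ej} (\bar g_{-i} - \bar k_{-i}) \sigma_{ej}^T \hat W_j \Big) d\eta. \tag{48}$$
$$\Xi_i = \frac{\beta_i}{2} \int_{t-T}^{t} \Big( (b_i + c_i)\, \sigma_{ei} (\bar g_i - \bar k_i)\, e_i - \sum_{j \in N_i} a_{ij}(b_j + c_j)\, \sigma_{ej} (\bar g_j - \bar k_j)\, e_j \Big) d\eta. \tag{49}$$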
Later, Theorem 1 shows that if the tuning laws (46) are used in online algorithms to learn 
the solutions of (39), the approximated NN weights will converge and be stable. 
Remark 4: Knowledge of system internal dynamics ( )eif t is relaxed for the weight-tuning laws 
(46). 
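The point of Remark 4 is that the regressor in the tuning law can be formed from measured errors alone. The Python sketch below illustrates this for the first branch of (46) only, under several assumptions that are not taken from the paper: a quadratic basis, a scalar integral term supplied by the caller as `psi`, and an explicit Euler discretization with step `dt`. The robustifying term of the second branch is omitted.

```python
def sigma(e):
    """Assumed quadratic basis: all products e_a * e_b with a <= b."""
    n = len(e)
    return [e[a] * e[b] for a in range(n) for b in range(a, n)]


def zeta(e_now, e_prev):
    """Regressor sigma(e(t)) - sigma(e(t - T)); needs no model of f_ei."""
    return [s1 - s0 for s1, s0 in zip(sigma(e_now), sigma(e_prev))]


def tuning_step(w_hat, z, psi, alpha, dt):
    """One Euler step of dW/dt = -alpha * z / (z^T z + 1)^2 * psi."""
    norm = (sum(v * v for v in z) + 1.0) ** 2
    return [w - dt * alpha * v / norm * psi for w, v in zip(w_hat, z)]


# Hypothetical measurements: the error norm shrank over the last interval.
z = zeta(e_now=[1.0, 0.0], e_prev=[2.0, 0.0])
w_next = tuning_step(w_hat=[1.0, 1.0, 1.0], z=z, psi=2.0, alpha=10.0, dt=0.1)
```

The normalization by $(\zeta^T \zeta + 1)^2$ keeps the step bounded even when the regressor is large, and components of the weight vector whose basis functions did not change over the interval are left untouched.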
Remark 5: If $e_i(t) = e_j(t) = 0$, the values on the right-hand side of the weight-tuning laws (46) are zero. Thus, $\hat W_i$ are no longer tuned. To guarantee that $\hat W_i$ converge to the true values, we apply the Persistence of Excitation (PE) condition [32] through the following lemma.
Lemma 5: Let $u_i$ and $d_i$, $i = 1, \dots, m$ be any given bounded stable laws so that the value function (28) can be written as
$$V_i(e_i(t - T)) = \int_{t-T}^{t} \Big( Q_{ii}(e_i) + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j - \gamma^2 d_i^T T_{ii} d_i - \gamma^2 \sum_{j \in N_i} d_j^T T_{ij} d_j \Big) dt + V_i(e_i(t)). \tag{50}$$
Using the NN (38) for (50),
$$\int_{t-T}^{t} \Big( Q_{ii}(e_i) + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j - \gamma^2 d_i^T T_{ii} d_i - \gamma^2 \sum_{j \in N_i} d_j^T T_{ij} d_j \Big) dt = -W_i^T \Delta\sigma_i(e_i) + \varepsilon_{Bi} \tag{51}$$
where $\varepsilon_{Bi}$ is the reconstruction error. Let the PE condition be satisfied in the interval $[t - T_p, t]$, $T_p > 0$:
$$\beta_{1i} I \le \int_{t - T_p}^{t} \bar\zeta_i(\eta)\, \bar\zeta_i^T(\eta)\, d\eta \le \beta_{2i} I \tag{52}$$
where $\beta_{1i}$, $\beta_{2i}$ are positive constants, $\bar\zeta_i = \zeta_i / (\zeta_i^T \zeta_i + 1)$, and $I$ is the identity matrix with the appropriate dimension. Then,
 For $\varepsilon_{Bi} = 0$ (no reconstruction error), the NN weight approximation error converges to zero exponentially fast;
 For $\|\varepsilon_{Bi}\| \le \varepsilon_{i\max}$, the NN weight approximation error converges exponentially fast to a residual set.
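Whether logged regressors satisfy a condition of the form (52) can be checked by computing the extreme eigenvalues of the integrated outer product. The Python sketch below is restricted to a two-dimensional regressor, for which the eigenvalues have a closed form, and uses a rectangle-rule integral; the function names and sample trajectories are assumptions for illustration.

```python
import math


def normalized(z):
    """zeta_bar = zeta / (zeta^T zeta + 1)."""
    m = sum(v * v for v in z) + 1.0
    return [v / m for v in z]


def pe_bounds_2d(samples, dt):
    """Smallest and largest eigenvalue of the integral of zeta_bar zeta_bar^T."""
    s11 = s12 = s22 = 0.0
    for z in samples:
        zb = normalized(z)
        s11 += zb[0] * zb[0] * dt
        s12 += zb[0] * zb[1] * dt
        s22 += zb[1] * zb[1] * dt
    tr, det = s11 + s22, s11 * s22 - s12 * s12
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return (tr - disc) / 2.0, (tr + disc) / 2.0


# A regressor that visits both directions versus one that stays on one axis.
rich_lo, rich_hi = pe_bounds_2d([[1.0, 0.0], [0.0, 1.0]], dt=1.0)
poor_lo, poor_hi = pe_bounds_2d([[1.0, 0.0], [2.0, 0.0]], dt=1.0)
```

The first trajectory yields a strictly positive lower bound, while the second yields a lower bound of zero, so no positive $\beta_{1i}$ exists for it; this is the situation that the probing noise in the online algorithm is meant to prevent.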
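Proof: From (42) and (43), Eq. (47) can be written as
$$\hat W_i^T \Delta\sigma_i(e_i) = \hat W_i^T \int_{t - T_p}^{t} \sigma_{ei} \Big( f_{ei} - \frac{1}{2}(b_i + c_i)^2 (\bar g_i - \bar k_i) \sigma_{ei}^T \hat W_i + \frac{1}{2} \sum_{j \in N_i} a_{ij}(b_j + c_j)(\bar g_j - \bar k_j) \sigma_{ej}^T \hat W_j \Big) d\eta. \tag{53}$$
Noting that $\tilde W_i = W_i - \hat W_i$ ($\dot{\tilde W}_i = -\dot{\hat W}_i$) and $T = T_p$, and substituting (53) into (46), the function approximation error dynamics are obtained as
$$\begin{aligned} \dot{\tilde W}_i = \alpha_i \frac{\zeta_i}{(\zeta_i^T \zeta_i + 1)^2} \int_{t - T_p}^{t} \Big( & Q_{ii}(e_i) + \hat W_i^T \sigma_{ei} f_{ei} - \frac{1}{4}(b_i + c_i)^2\, \hat W_i^T \sigma_{ei} (\bar g_i - \bar k_i) \sigma_{ei}^T \hat W_i + \frac{1}{2} \sum_{j \in N_i} a_{ij}(b_j + c_j)\, \hat W_i^T \sigma_{ei} (\bar g_j - \bar k_j) \sigma_{ej}^T \hat W_j \\ & + \frac{1}{4} \sum_{j \in N_i} (b_j + c_j)^2\, \hat W_j^T \sigma_{ej} (\bar g_{-i} - \bar k_{-i}) \sigma_{ej}^T \hat W_j \Big) d\eta. \end{aligned} \tag{54}$$
Note that, from (44), $e_{Hi}$ in (45), $i = 1, \dots, m$ can be written as
$$\begin{aligned} e_{Hi} = \int_{t - T_p}^{t} \Big( & Q_{ii}(e_i) + \hat W_i^T \sigma_{ei} f_{ei}(t) - \frac{1}{4}(b_i + c_i)^2\, \hat W_i^T \sigma_{ei} (\bar g_i - \bar k_i) \sigma_{ei}^T \hat W_i + \frac{1}{2} \sum_{j \in N_i} a_{ij}(b_j + c_j)\, \hat W_i^T \sigma_{ei} (\bar g_j - \bar k_j) \sigma_{ej}^T \hat W_j \\ & + \frac{1}{4} \sum_{j \in N_i} (b_j + c_j)^2\, \hat W_j^T \sigma_{ej} (\bar g_{-i} - \bar k_{-i}) \sigma_{ej}^T \hat W_j \Big) d\eta. \end{aligned} \tag{55}$$
Using (53) and (55), one obtains
$$\int_{t - T_p}^{t} \Big( Q_{ii}(e_i) + \hat u_i^T R_{ii} \hat u_i + \sum_{j \in N_i} \hat u_j^T R_{ij} \hat u_j - \gamma^2 \hat d_i^T T_{ii} \hat d_i - \gamma^2 \sum_{j \in N_i} \hat d_j^T T_{ij} \hat d_j \Big) dt = -\hat W_i^T \Delta\sigma_i(e_i) + e_{Hi}. \tag{56}$$
Using $T_p$ in (51) and then subtracting (56) gives
$$e_{Hi} = -\tilde W_i^T \Delta\sigma_i(e_i) + \varepsilon_{Bi}. \tag{57}$$
On the other hand, comparing (55) with (54), we obtain
$$\dot{\tilde W}_i = \alpha_i \frac{\zeta_i}{(\zeta_i^T \zeta_i + 1)^2}\, e_{Hi}. \tag{58}$$
Inserting (57) into (58) and noting that $\zeta_i = \Delta\sigma_i$ by (47), $\dot{\tilde W}_i$ becomes
$$\dot{\tilde W}_i = -\alpha_i \bar\zeta_i \bar\zeta_i^T \tilde W_i + \alpha_i \frac{\bar\zeta_i}{m_i}\, \varepsilon_{Bi} \tag{59}$$
where $m_i = \zeta_i^T \zeta_i + 1$. These approximation error dynamics are the same as those in [33], and the remainder of the proof follows the proof of Theorem 1 in [33].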
Online Algorithms: Based on (46), we design online algorithms for optimal cooperative tracking control (OCTC), as shown in Algorithm 1, where all parameters of the control and disturbance laws are updated simultaneously in one iterative loop, in contrast to the policy iteration algorithms in [14], which use three NNs for each agent.
Algorithm 1. Online OCTC 
Step 1: $\forall i = 1, \dots, m$, $\forall j \in N_i$: select $Q_{ii}$, $R_{ii}$, $T_{ii}$, $R_{ij}$, $T_{ij}$, $\gamma$, the activation function vector $\sigma_i$, and the constants $\alpha_i$, $\beta_i$; choose the sampling interval $T$; initialize stable values of $\hat W_i^{(0)}$ and $\hat W_j^{(0)}$. Compute $\hat V_i^{(0)}$ (41), $\hat u_i^{(0)}$ (42) and $\hat d_i^{(0)}$ (43); choose the probing noise for the PE condition (52). Assign $l = 0$, $l_{stop}$ (the time to stop the algorithm), and a small positive real number as the threshold of the convergence criterion.
Step 2: $\forall i = 1, \dots, m$: add the probing noise to $\hat u_i^{(l)}$ and $\hat d_i^{(l)}$ to excite the system; observe $e_i(t)$ and $e_i(t - T)$ for $\zeta_i$ (47); update $\hat W_i^{(l+1)}$ (46); compute $\hat V_i^{(l+1)}$ (41), and update both $\hat u_i^{(l+1)}$ (42) and $\hat d_i^{(l+1)}$ (43) simultaneously.
Step 3: $\forall i = 1, \dots, m$: if $\|\hat W_i^{(l+1)} - \hat W_i^{(l)}\|$ does not exceed the convergence threshold, then set the probing noise to zero. If $l \ge l_{stop}$ and the probing noise is zero, then stop the algorithm; else set $l = l + 1$ and go back to Step 2.
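The control flow of Steps 2 and 3 can be sketched for a single agent as follows. The callable `update_weights` is a hypothetical placeholder for everything that happens inside Step 2 (applying the laws, observing the errors, and evaluating the tuning law); the contraction used in the example merely stands in for a converging weight update and is not the tuning law of the paper.

```python
def run_octc(w0, update_weights, l_stop, tol):
    """Skeleton of the loop in Algorithm 1 for one agent.

    update_weights(w, probing) is a hypothetical callable that performs
    Step 2 and returns the next weight vector.
    """
    w, probing, l = list(w0), True, 0
    while True:
        w_next = update_weights(w, probing)
        change = max(abs(a - b) for a, b in zip(w_next, w))
        if change <= tol:
            probing = False  # Step 3: switch the probing noise off
        w = w_next
        if l >= l_stop and not probing:
            break
        l += 1
    return w, l


# Stand-in update: a contraction whose fixed point is 2.0.
w_final, iterations = run_octc(
    w0=[0.0],
    update_weights=lambda w, probing: [0.5 * x + 1.0 for x in w],
    l_stop=5,
    tol=1e-6,
)
```

As written, the loop ends only when both the iteration count has reached `l_stop` and the weights have stopped changing, so a practical implementation would likely add an upper bound on the number of iterations.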
3.3. Stability and Convergence Analysis 
The stability and convergence of the closed-loop systems (22), when the $i$th NMR performs the online OCTC with the NN weight-tuning law (46), control law (42) and disturbance law (43), are stated in the following theorem.
Theorem 1: Let the cooperative dynamics of the multi-NMR systems be defined in (18), which gives the $i$th dynamics in (22), for all $i = 1, \dots, m$. Let the cooperative value function of NMR $i$ be chosen as (28) and the coupled HJI equations as (32). Let the NN weight-tuning law be defined in (46), the control law in (42) and the worst-case disturbance law in (43). Let $\bar\zeta_i$ satisfy the PE condition (52). Assume that NMR $i$ performs the online OCTC and that the control and disturbance laws of the neighbors in the previous steps were updated and stable with $\gamma \ge \gamma^*$. Then, the online OCTC guarantees that
 (Stabilization) The cooperative tracking errors $e_i$ of the closed-loop systems and the NN approximation errors $\tilde W_i$ are uniformly ultimately bounded (UUB).
 (Convergence) After a limited number of iterative steps, the value function, the control law and the worst disturbance law converge synchronously to the approximately optimal values, i.e., $\|V_i^* - \hat V_i\| \le \varepsilon_{vi}$, $\|u_i^* - \hat u_i\| \le \varepsilon_{ui}$ and $\|d_i^* - \hat d_i\| \le \varepsilon_{di}$ for small positive constants $\varepsilon_{vi}$, $\varepsilon_{ui}$ and $\varepsilon_{di}$.
4. HARDWARE TESTBED AND RESULTS 
4.1. Hardware Testbed with Omnidirectional Vision 
Intelligent distributed cooperative control for multiple nonholonomic mobile robots 
155 
Using the graph theory in Section II, the communication of the multi-NMR system is 
chosen (Fig. 2), where the virtual leader is indexed by 0. The information exchange between 
NMR i and its neighborhoods, including positions [ , , ]i i i iq x y , velocities [ , ]i i iv 
and torques [ , ]i il ir , is represented by arrows. It is desired that information of the virtual 
leader is only available to NMR 1 . 
To verify the effectiveness of the proposed algorithm in practical applications, we
developed a hardware testbed that consists of three experimental NMRs, as shown in Fig. 1.
The geometric parameters of the NMRs are $r_1 = 0.05$ m, $b_1 = 0.5$ m, $l_1 = 0$, $r_2 = r_3 = 0.025$ m,
$b_2 = b_3 = 0.2$ m, and $l_2 = l_3 = 0$. The total masses of the NMRs are $m_1 = 5$ kg and
$m_2 = m_3 = 0.5$ kg. Using these values, the parameters of $M_i$ and $B_i$ in (2) are obtained.
Figure 1. Experimental NMRs: (a) rear view, (b) front view.
Figure 2. Communication graph of the multi–NMR system.
The hardware diagram of NMR 1 , shown in Fig. 3, consists of three main parts: the 
mechanical part (a mechanical frame, DC motors with stall torques of 0.73 Nm, and digital 
quadrature encoders with 400 divisions), a control board (a PIC micro-controller, a power
circuit, and an XBee wireless module), and an embedded computer with an Intel Atom
D510@1.66 GHz CPU executing the online OCTC. An omnidirectional vision system is 
constructed to identify feedback states of positions and linear velocities for all NMRs. 
Each neighbor of NMR 1 has only one frame and a control board. Communication with 
others is achieved using XBee radio transceivers. At each sample point, the neighbors send their 
encoder pulses to and receive torques from NMR 1 . In the case of a dropped packet, the previous 
data packet is used. 
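The dropped-packet rule above can be illustrated with a minimal sketch. The class name `HoldLastReceiver`, the use of `None` to mark a missing packet, and the sample values are all hypothetical; the testbed software itself is written in VC++ and is not reproduced here.

```python
from typing import Optional, Sequence, Tuple


class HoldLastReceiver:
    """Keeps the most recent valid packet and reuses it when one is dropped."""

    def __init__(self, initial: Sequence[float]):
        self._last = tuple(initial)
        self.dropped = 0  # number of sample points served from the held packet

    def update(self, packet: Optional[Sequence[float]]) -> Tuple[float, ...]:
        # `None` stands for a packet that did not arrive at this sample point.
        if packet is None:
            self.dropped += 1
        else:
            self._last = tuple(packet)
        return self._last


# Hypothetical stream of (left, right) torque packets with two drops in a row.
rx = HoldLastReceiver(initial=(0.0, 0.0))
stream = [(0.05, 0.04), None, None, (0.07, 0.06)]
used = [rx.update(p) for p in stream]
```

Holding the last value keeps the actuators driven by a bounded, recently valid command; a long burst of drops would still need separate handling, which this sketch does not attempt.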
In the embedded computer, software based on the VC++ programming language running 
on the Windows platform is programmed to implement image processing via the OpenCV 
software to identify the positions and linear velocities of all NMRs, read the encoder pulses to 
compute the rotation velocities, process communication, execute the online algorithm, and send 
the torque signals to the micro-controllers for controlling the DC motors through the pulse width 
modulation (PWM) technique with a frequency of 20 kHz. The upper bounds of the torques are
selected as 0.2 Nm for every NMR. In addition, the software generates the reference trajectories of the positions
and the velocities. The embedded computer and the micro-controller communicate with each other
via the RS232 protocol. Users can remotely start or stop the testbed through interfacing tools on
the remote computer connected to the embedded computer through the wireless network. The 
data generated by the NMRs during movement are stored and plotted in Matlab. 
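Two of the steps listed above, computing a rotation velocity from encoder pulses and bounding the torque command, can be sketched as follows. The assumption that one wheel revolution yields exactly 400 counts (no gearing, no 4x quadrature decoding) and the function names are illustrative only; the 0.2 Nm bound is the value quoted in the text.

```python
import math

COUNTS_PER_REV = 400   # assumed counts per wheel revolution (gearing ignored)
TORQUE_LIMIT = 0.2     # N·m, upper bound of the torques quoted in the text


def wheel_speed(delta_counts: int, dt: float,
                counts_per_rev: int = COUNTS_PER_REV) -> float:
    """Wheel angular speed in rad/s from the counts accumulated over dt seconds."""
    return 2.0 * math.pi * delta_counts / (counts_per_rev * dt)


def saturate(torque: float, limit: float = TORQUE_LIMIT) -> float:
    """Clip a torque command to the interval [-limit, limit]."""
    return max(-limit, min(limit, torque))


speed = wheel_speed(delta_counts=400, dt=1.0)   # one revolution per second
command = saturate(0.5)                         # clipped to the torque bound
```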
Figure 3. Structural diagram of NMR 1.
Figure 4. Local and global Cartesian coordinate systems.
Only NMR 1 is equipped with the omnidirectional vision system (OVS), which is shown in
Fig. 3. The OVS consists of a camera and a hyperbolic mirror with optical center $C$ and
bend radius $R$. The mirror is placed on a glass tube with diameter $D = 0.2$ m. The camera,
with a resolution of 1280 × 720 pixels and a frame rate of 30 fps, is fixed to the top of the robot
platform at the geometrical center. The vertical center axis of the glass tube passes through the
focal point of the camera and the mirror center. The distance between the origin $O$ and the
bottom point of the mirror $C_1$, measured along the vertical axis, is $H = 0.5$ m. The OVS
can recognize the colored landmarks in any direction; therefore, no mechanism for rotating
the camera about the pan, tilt or yaw axes is needed.
Consider the Cartesian coordinate system OXY fixed on the surface of NMR 1, where the
origin O coincides with the geometrical center and OX is aligned with the vertical axis of symmetry.
Via the OVS and image processing, the center coordinates of any landmark placed around the
robot are readily measured in the image space OXY in units of pixels. However, when the
robot moves, identifying every coordinate in the world space OXY in units of length requires an
identification operation [25]. Here, we use an NN with radial basis functions (RBFs). The RBF
network is trained offline on samples consisting of the landmark center coordinates measured in the
image space (pixels) as inputs and in the world space (meters) as desired outputs. Using the
approximation ability of the RBF network, the centers of any landmarks surrounding the robot can be
recognized and transformed into world coordinates.
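A minimal sketch of such an offline pixel-to-meter calibration is given below. It assumes Gaussian basis functions with one unit centered on each calibration sample (exact interpolation), a hand-picked width, and a made-up distortion law used only to generate synthetic samples; none of these choices are taken from the paper or from [25].

```python
import numpy as np


def rbf_matrix(points, centers, width):
    """Gaussian RBF activations for every (point, center) pair."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))


def train_rbf(pixels, meters, width):
    """Output weights of a network with one Gaussian unit per sample."""
    return np.linalg.solve(rbf_matrix(pixels, pixels, width), meters)


def pixel_to_world(query, pixels, width, weights):
    """Map pixel coordinates to world coordinates with the trained network."""
    return rbf_matrix(np.atleast_2d(query), pixels, width) @ weights


# Synthetic calibration set on a 5 x 5 pixel grid (hypothetical distortion law).
grid = np.linspace(-200.0, 200.0, 5)
pixels = np.array([(u, v) for u in grid for v in grid])
rho2 = (pixels ** 2).sum(axis=1, keepdims=True)
meters = pixels * (0.002 + 1e-8 * rho2)

WIDTH = 100.0  # assumed basis-function width, equal to the grid spacing
weights = train_rbf(pixels, meters, WIDTH)
fit_error = np.abs(pixel_to_world(pixels, pixels, WIDTH, weights) - meters).max()
estimate = pixel_to_world([50.0, -30.0], pixels, WIDTH, weights)[0]
```

With one unit per sample the network reproduces the calibration points up to numerical precision; a real calibration would more likely use fewer units than samples and a least-squares fit to average out measurement noise.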
To determine the center position coordinates of NMR 1 in the horizontal plane, we choose
the Cartesian coordinate system Oxy based on two differently colored landmarks (as shown in
Fig. 4). Without loss of generality, the origin O coincides with the center of one landmark (blue),
and the axis Ox passes through the center of the other landmark (red). Suppose that the center
coordinates of the two landmarks measured in OXY are $(X_1, Y_1)$ and $(X_2, Y_2)$; they can
be transformed to Oxy to obtain the robot pose vector $[x_1, y_1, \theta_1]$ [25]:
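$$
\begin{aligned}
\theta_1 &= \arccos\frac{X_2 - X_1}{\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}},\\
x_1 &= -\frac{Y_1 (Y_2 - Y_1) + X_1 (X_2 - X_1)}{\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}},\\
y_1 &= -\frac{Y_1 (X_2 - X_1) - X_1 (Y_2 - Y_1)}{\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}}.
\end{aligned}
\tag{60}
$$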
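The landmark-to-pose transformation can be checked numerically with a short sketch. It assumes the geometry described in the text (blue landmark at the origin of Oxy, red landmark on the Ox axis, both measured in the robot frame OXY) and uses `atan2` instead of arccos so that the sign of the heading is recovered; the sign conventions and the function name are assumptions of this sketch, not statements of the authors' implementation.

```python
import math


def robot_pose_from_landmarks(p_blue, p_red):
    """Pose (x1, y1, theta1) of the robot in Oxy from two landmark centers
    measured in the robot frame OXY (blue = origin of Oxy, red on Ox)."""
    X1, Y1 = p_blue
    X2, Y2 = p_red
    dX, dY = X2 - X1, Y2 - Y1
    L = math.hypot(dX, dY)
    # The robot center is the origin of OXY; project the vector from the
    # blue landmark to the robot onto the unit vectors of Ox and Oy.
    x = -(Y1 * dY + X1 * dX) / L
    y = -(Y1 * dX - X1 * dY) / L
    theta = math.atan2(-dY, dX)  # heading of OX relative to Ox
    return x, y, theta


# Round trip: place the robot at a known pose, synthesize the measurements,
# and recover the pose (all numbers are illustrative).
true_x, true_y, true_theta = 1.0, 2.0, 0.3


def to_robot_frame(px, py):
    dx, dy = px - true_x, py - true_y
    c, s = math.cos(true_theta), math.sin(true_theta)
    return (c * dx + s * dy, -s * dx + c * dy)


pose = robot_pose_from_landmarks(to_robot_frame(0.0, 0.0), to_robot_frame(3.0, 0.0))
```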
Referring to Fig. 4, the coordinates of the neighbors, e.g., NMR 2, can also be determined in Oxy.
Two colored objects are placed on the surface of NMR 2: the center of the green object coincides
with the center of the robot, and the center of the yellow object lies on the longitudinal
axis. The pose of the robot in Oxy is then given by the following formulas:
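$$
\begin{aligned}
x_2 &= x_1 + l_{12}\cos(\psi_{12} + \theta_1),\\
y_2 &= y_1 + l_{12}\sin(\psi_{12} + \theta_1),\\
\theta_2 &= \theta_1 + \lambda_{12}.
\end{aligned}
\tag{61}
$$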
where $l_{12} = \sqrt{(X_2^1)^2 + (Y_2^1)^2}$, $\psi_{12} = \arccos\left(X_2^1 / l_{12}\right)$, and $\lambda_{12}$ is determined from the angle
$$
\arccos\frac{X_2^1 - X_2^2}{\sqrt{(X_2^1 - X_2^2)^2 + (Y_2^1 - Y_2^2)^2}}.
$$
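The neighbor-pose computation can be sketched in the same way. The sketch assumes that both object centers on NMR 2 are measured in the frame of NMR 1, that the yellow object lies ahead of the green one along the longitudinal axis, and that `atan2` replaces arccos to keep the angle signs; these conventions and the function name are assumptions made for illustration.

```python
import math


def neighbor_pose(pose1, center_xy, marker_xy):
    """Pose of NMR 2 in Oxy from the pose of NMR 1 and the two object
    centers on NMR 2 measured in the frame of NMR 1."""
    x1, y1, theta1 = pose1
    Xc, Yc = center_xy      # green object: center of NMR 2
    Xm, Ym = marker_xy      # yellow object: assumed ahead on the longitudinal axis
    l12 = math.hypot(Xc, Yc)                  # distance between the two robots
    psi12 = math.atan2(Yc, Xc)                # bearing of NMR 2 seen from NMR 1
    lam12 = math.atan2(Ym - Yc, Xm - Xc)      # heading of NMR 2 relative to NMR 1
    return (x1 + l12 * math.cos(psi12 + theta1),
            y1 + l12 * math.sin(psi12 + theta1),
            theta1 + lam12)


# Round trip with illustrative numbers.
pose1 = (1.0, 2.0, 0.3)
true2 = (2.0, 2.5, 1.0)


def in_frame_1(px, py):
    dx, dy = px - pose1[0], py - pose1[1]
    c, s = math.cos(pose1[2]), math.sin(pose1[2])
    return (c * dx + s * dy, -s * dx + c * dy)


green = in_frame_1(true2[0], true2[1])
yellow = in_frame_1(true2[0] + 0.1 * math.cos(true2[2]),
                    true2[1] + 0.1 * math.sin(true2[2]))
pose2 = neighbor_pose(pose1, green, yellow)
```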
4.2. Simulation and Experimental Results 
In this subsection, first, to evaluate the effectiveness of the proposed method, simulations 
of the OCTC algorithm with one NN and the ACD algorithm [14], extended for the multi–NMR 
system, with three NNs, are performed and compared. Then, based on the simulation results, the 
experiment is implemented. The parameter values of the NMRs in the simulation are set
exactly equal to those in the experiment so that the converged NN weights after the simulation can
be used to initialize the NN weights in the experiment to speed up the learning process in 
practice. 
The velocity vector that the virtual leader uses to generate the smooth reference postures is 
chosen as 
$$
v_0 = [\nu_0, \omega_0]^T = \left[ \sqrt{(A_1\cos(\omega_1 t))^2 + (A_2\cos(\omega_2 t))^2},\;\; \frac{A_1\omega_1\sin(\omega_1 t)\,A_2\cos(\omega_2 t) - A_2\omega_2\sin(\omega_2 t)\,A_1\cos(\omega_1 t)}{(A_1\cos(\omega_1 t))^2 + (A_2\cos(\omega_2 t))^2} \right]^T,
\tag{62}
$$
where $\omega_1 = 0.04$ rad/s, $\omega_2 = 0.02$ rad/s, $A_1 = 0.022$ m/s, $A_2 = 0.02$ m/s. The distance vectors (3)
between NMR i and its neighbors j, $i, j = 1, 2, 3$, are set to $[0.5, 0, 0]$ for the pair (1, 2) and
$[1.0, 0, 0]$ for the pair (3, 2).
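The leader velocity profile can be generated with a few lines of code. The sketch assumes that the reference path has Cartesian velocity components $A_1\cos(\omega_1 t)$ and $A_2\cos(\omega_2 t)$, so that the linear velocity is their Euclidean norm and the rotational velocity is the rate of change of the path heading; the function name `leader_velocity` is hypothetical.

```python
import math

W1, W2 = 0.04, 0.02     # rad/s, values quoted in the text
A1, A2 = 0.022, 0.02    # m/s, values quoted in the text


def leader_velocity(t):
    """Linear and rotational velocity of the virtual leader at time t,
    assuming xdot = A1 cos(W1 t) and ydot = A2 cos(W2 t)."""
    xd = A1 * math.cos(W1 * t)
    yd = A2 * math.cos(W2 * t)
    xdd = -A1 * W1 * math.sin(W1 * t)
    ydd = -A2 * W2 * math.sin(W2 * t)
    speed2 = xd * xd + yd * yd   # never zero for these two frequencies
    return math.sqrt(speed2), (xd * ydd - yd * xdd) / speed2


v_start, w_start = leader_velocity(0.0)
v_later, w_later = leader_velocity(10.0)
```

Because $\omega_1 = 2\omega_2$, the two cosine terms never vanish at the same time, so the division by the squared speed is well defined for all t.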
For both algorithms, the NN weights, with 15 elements for each NMR, are defined as
$\hat{W}_i = [\hat{W}_i^1, \hat{W}_i^2, \ldots, \hat{W}_i^{15}]$, of which the initial values are zeros in OCTC but are properly chosen
for the three NNs in ACD. Note that the total number of NN weights of OCTC is 45, but that of
ACD is 135. The adaptive gains are selected as 25 and 0.01, respectively. The activation
functions of $e_i$ are chosen as the vector of quadratic terms
$[e_{xi}^2, e_{xi}e_{yi}, e_{xi}e_{\theta i}, e_{xi}e_{vi}, e_{xi}e_{\omega i}, e_{yi}^2, e_{yi}e_{\theta i}, e_{yi}e_{vi}, e_{yi}e_{\omega i}, e_{\theta i}^2, e_{\theta i}e_{vi}, e_{\theta i}e_{\omega i}, e_{vi}^2, e_{vi}e_{\omega i}, e_{\omega i}^2]^T$.
Select $Q_{ii}(e_i) = e_i^T Q_{1i} e_i$, $Q_{1i} = I_{5\times 5}$, $R_{ii} = R_{ij} = T_{ii} = T_{ij} = 1$,
with the remaining scalar design parameter also equal to 1. The PE condition is guaranteed by adding the probing noise
$0.008\,\mathrm{rand}(t)\,e^{-t}$ to the
control and disturbance inputs, where rand(t) is the function that generates random signals in
the range $[-1, 1]$. The initial position of the leader is $q_0 = [-0.4, -0.6, 0.6]$. The initial positions
and velocities of the NMRs are $q_1 = [0, 0, 0]$, $v_1 = [0, 0]$, $q_2 = [0, -0.5, 0]$, $v_2 = [0, 0]$,
$q_3 = [-0.5, -0.5, 0]$, $v_3 = [0, 0]$.
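The activation vector and the weight counts quoted above can be reproduced with a short sketch. It assumes that the 15 activation functions are all pairwise products of the five error components (position, heading, and velocity errors), which is what the listed terms suggest; the function name `activation` and the sample error values are hypothetical.

```python
from itertools import combinations_with_replacement


def activation(e):
    """All quadratic monomials e[a] * e[b] with a <= b, in lexicographic order."""
    n = len(e)
    return [e[a] * e[b] for a, b in combinations_with_replacement(range(n), 2)]


# Illustrative error vector [e_x, e_y, e_theta, e_v, e_omega].
phi = activation([1.0, 2.0, 3.0, 4.0, 5.0])

NUM_NMRS = 3
weights_octc = NUM_NMRS * len(phi)       # one NN per NMR
weights_acd = NUM_NMRS * 3 * len(phi)    # three NNs per NMR
```

Five error components give 5·6/2 = 15 monomials, hence 45 weights for three NMRs under OCTC and 135 under ACD, matching the totals stated in the text.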
The evolutions of the cooperative position tracking errors for the NMRs under both OCTC
and ACD are shown in Fig. 5. After the parameters converge, all errors in both algorithms are
approximately zero. In the early periods, however, the errors under OCTC decrease faster than
those under ACD. The cooperative trajectories $x$, $y$ and $\theta$ in both algorithms are shown in
Figs. 6, 7 and 8, respectively. NMR 1 tracks the leader while keeping the
formation with its neighbors such that the tracking errors are as small as possible, i.e.,
$[x_1, y_1, \theta_1] \to [x_0, y_0, \theta_0]$ and $[x_3, y_3, \theta_3] - [x_1, y_1, \theta_1] \to [-0.5, 0, 0]$. NMR 3 keeps the
formation with NMRs 1 and 2, i.e., $[x_2, y_2, \theta_2] - [x_3, y_3, \theta_3] \to [1, 0, 0]$. The corresponding results for
NMRs 2 and 1 follow similarly. These figures again show that the
control performance under OCTC is better than that under ACD.
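The steady-state offsets quoted above can be cross-checked against the distance vectors chosen for the simulation. The sketch below reads each distance vector as "neighbor pose minus robot pose", which is an assumption about the convention in (3); under that reading the offset between NMRs 1 and 3 follows by subtraction.

```python
# Distance vectors from the simulation setup, read as q_j - q_i (assumed convention).
offset_1_to_2 = (0.5, 0.0, 0.0)   # q2 - q1
offset_3_to_2 = (1.0, 0.0, 0.0)   # q2 - q3


def sub(a, b):
    """Component-wise difference of two pose offsets."""
    return tuple(x - y for x, y in zip(a, b))


# q3 - q1 = (q2 - q1) - (q2 - q3)
offset_1_to_3 = sub(offset_1_to_2, offset_3_to_2)
```

The result, (-0.5, 0, 0), agrees with the limit stated for the offset between NMR 3 and NMR 1.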
The cooperative performance of the linear and rotational velocities among the NMRs using 
both algorithms is shown in Figs. 9 and 10, respectively. It can be seen that, after the parameters 
converge, the velocities approach approximately optimal values, i.e.,
$[\nu_{2,3}, \omega_{2,3}] \to [\nu_1, \omega_1] \to [\nu_0, \omega_0]$, in finite time. Again, the cooperative velocity
performance of the NMRs under OCTC is better than that under ACD, especially at
the peaks of the linear velocity curves.
Now, the proposed control scheme is applied to the testbed. It is important to notice that the 
converged NN weights in Algorithm OCTC after the simulation are used to initialize the NNs in 
the experiment. The other learning parameters are chosen as those in the simulation. 
Figure 5. Evolution of the formation errors of positions for NMRs 1, 2, 3 by Algorithms OCTC with one NN and ACD with three NNs.
Figure 6. Evolution of the cooperative positions for $x_1, x_2, x_3$ by Algorithms OCTC with one NN and ACD with three NNs.
Figure 7. Evolution of the cooperative positions for $y_1, y_2, y_3$ by Algorithms OCTC with one NN and ACD with three NNs.
Figure 8. Evolution of the cooperative positions for $\theta_1, \theta_2, \theta_3$ by Algorithms OCTC with one NN and ACD with three NNs.
Figure 9. Evolution of the cooperative linear velocities for $\nu_1, \nu_2, \nu_3$.
Figure 10. Evolution of the cooperative rotational velocities for $\omega_1, \omega_2, \omega_3$.
Figure 11. Experimental formation for $x_1, x_2, x_3$.
Figure 12. Experimental formation for $y_1, y_2, y_3$.
Figure 13. Experimental formation for $\theta_1, \theta_2, \theta_3$.
Figures 11, 12 and 13 show the positions $x_i$, $y_i$, $\theta_i$ for all $i = 1, 2, 3$. For the final stages of
the experiment, Fig. 14 shows that NMR 1 tracks the desired virtual trajectory while maintaining
the formation with its neighbors. Figs. 15(a) and 15(b) show the linear velocities $\nu_i$ and
the rotational velocities $\omega_i$, respectively. The experimental results are
consistent with the simulation.
Figure 14. Experimental formation.
Figure 15. Experimental velocities: (a) linear velocities, (b) rotational velocities.
5. CONCLUSION 
This paper presents a new distributed optimal cooperative tracking control method with
disturbance rejection for multiple mobile robots. The method avoids separating the main
controller and reduces the ACD structure with three NNs to a structure
with only one NN, for which novel NN weight-tuning laws are designed. The online
optimal cooperative control algorithms of the scheme, which do not require knowledge of the internal
dynamics, approximate the Nash equilibrium solutions of the HJI
equations. The algorithms guarantee that the value functions and the control and disturbance
laws converge simultaneously to approximately optimal values and that the cooperative tracking errors of
the closed-loop systems and the approximation errors of the NN weights are uniformly ultimately
bounded. Comparative simulations were carried out, and the experimental results on the testbed
equipped with an omnidirectional vision system were consistent with the simulation. The
experimental results suggest that the method is effective in practical control
applications involving multiple nonholonomic mobile agents
or autonomous vehicles that track both positions and velocities.
REFERENCES 
1. Sun D., Wang C., Shang W., and Feng G. - A synchronization approach to trajectory 
tracking of multiple mobile robots while maintaining time varying formations, IEEE 
Trans. Robot 25 (5) (2009) 1074–1086. 
2. Loria A., Dasdemir J., and Jarquin N. A. - Leader–follower formation and tracking 
control of mobile robots along straight paths, IEEE Trans. Contr. Syst. Technol. 24 (2) 
(2016) 727–732. 
3. Dong W. - Tracking control of multiple-wheeled mobile robots with limited information 
of a desired trajectory, IEEE Trans. Robot 28 (1) (2012) 262–268. 
4. Gu D. and Wang Z. - Leader–follower flocking: Algorithms and experiments, IEEE
Trans. Contr. Syst. Technol. 17 (5) (2009) 1211–1219.
5. Wang Z. and Gu D. - Cooperative target tracking control of multiple robots, IEEE Trans. 
Ind. Electron. 59 (8) (2012) 3232–3240. 
6. Yu X. and Liu L. - Distributed formation control of nonholonomic vehicles subject to
velocity constraints, IEEE Trans. Ind. Electron. 63 (2) (2016) 1289–1298.
7. Khoo S., Xie L., and Man Z. - Robust finite-time consensus tracking algorithm for
multirobot systems, IEEE/ASME Trans. Mechatr. 14 (2) (2009) 219–228.
8. Wang W., Huang J., Wen C., and Fan H. - Distributed adaptive control for consensus
tracking with application to formation control of nonholonomic mobile robots,
Automatica 50 (4) (2014) 1254–1263.
9. Peng Z., Yang S., Wen G., Rahmani A., and Yu Y. - Adaptive distributed formation
control for multiple nonholonomic wheeled mobile robots, Neurocomputing 173 (3)
(2016) 1485–1494.
10. Dierks T. and Jagannathan S. - Neural network output feedback control of robot
formations, IEEE Trans. Syst., Man, Cybern. B, Cybern. 40 (2) (2010) 383–399.
11. Movric K. H. and Lewis F. L. - Cooperative optimal control for multiagent systems on
directed graph topologies, IEEE Trans. Autom. Contr. 59 (3) (2014) 769–774.
12. Vamvoudakis K. G., Lewis F. L., and Hudas G. R. - Multi-agent differential graphical 
games: Online adaptive learning solution for synchronization with optimality, Automatica 
48 (2012) 1598–1611. 
13. Zhang H., Lewis F. L., and Das A. - Optimal design for synchronization of cooperative
systems: State feedback, observer and output feedback, IEEE Trans. Autom. Contr. 56 (8)
(2011) 1948–1952.
14. Jiao Q., Modares H., Xu S., Lewis F., and Vamvoudakis K. G. - Multiagent zero-sum
differential graphical games for disturbance rejection in distributed control, Automatica
69 (2016) 24–34.
15. Tatari F., Naghibi-Sistani M.-B., and Vamvoudakis K. G. - Distributed learning algorithm 
for nonlinear differential graphical games, Transactions of the Institute of Measurement 
and Control, first published (2015), doi: 10.1177/0142331215603791. 
16. Cao W., Zhang J., and Ren W. - Leader–follower consensus of linear multi-agent
systems with unknown external disturbances, Systems & Control Letters 82 (2015) 64–70.
17. Wang J. and Xin M. - Distributed optimal cooperative tracking control of multiple
autonomous robots, Robotics and Autonomous Systems 60 (4) (2012) 572–583.
18. Dierks T., Brenner B., and Jagannathan S. - Neural network-based optimal control of 
mobile robot formations with reduced information exchange, IEEE Trans. Contr. Syst. 
Technol. 21 (4) (2013) 1407–1415. 
19. Fierro R. and Lewis F. L. - Control of a nonholonomic mobile robot using neural 
networks, IEEE Trans. Neur. Netw. 9 (4) (1998) 589–600. 
20. Vamvoudakis K. G. and Lewis F. L. - Multi-player non-zero-sum games: online adaptive
learning solution of coupled Hamilton-Jacobi equations, Automatica 47 (8) (2011)
1556–1569.
21. Wu H.-N. and Luo B. - Neural network based online simultaneous policy update
algorithm for solving the HJI equation in nonlinear H∞ control, IEEE Trans. Neur. Netw.
Learn. Syst. 23 (12) (2012) 1884–1895.
22. Zargarzadeh H., Dierks T., and Jagannathan S. - Optimal control of nonlinear continuous-
time systems in strict-feedback form, IEEE Trans. Neur. Netw. Learn. Syst. 26 (10) 
(2015) 2535–2549. 
23. Luy N. T. - Adaptive dynamic programming-based design of integrated neural network
structure for cooperative control of multiple MIMO nonlinear systems, Neurocomputing
(2016).
24. Shojaei K., Shahri A. M., and Tarakameh A. - Adaptive feedback linearizing control of
nonholonomic wheeled mobile robots in presence of parametric and nonparametric
uncertainties, Robotics and Computer Integrated Manufacturing 27 (1) (2011) 194–204.
25. Luy N. T. - Robust adaptive dynamic programming based online tracking control 
algorithm for real wheeled mobile robot with omnidirectional vision system, Transactions 
of the Institute of Measurement and Control (2016), doi: 10.1177/0142331215620267. 
26. Chang Y. and Chen B. - A nonlinear adaptive H∞ tracking control design in robotic
systems via neural networks, IEEE Trans. Contr. Syst. Technol. 5 (1) (1997) 13–29.
27. Wang L., Wang X., and Hu X. - Connectivity maintenance and distributed tracking for 
double-integrator agents with bounded potential functions, Int. J. Robust Nonlinear 
Control 25 (4) (2015) 542–558. 
28. Aliyu M. D. S. - Nonlinear H∞-control, Hamiltonian systems and Hamilton–Jacobi
equations, CRC Press, 2011.
29. Basar T. and Bernhard P. - H∞-Optimal Control and Related Minimax Design Problems:
A Dynamic Game Approach, 2nd ed., Birkhäuser, Boston, MA, 1995.
30. Abu-Khalaf M. and Lewis F. L. - Nearly optimal control laws for nonlinear systems with
saturating actuators using a neural network HJB approach, Automatica 41 (5) (2005) 779–791.
31. Hornik K., Stinchcombe M., and White H. - Universal approximation of an unknown 
mapping and its derivatives using multilayer feedforward networks, Neural Networks 3 
(5) (1990) 551–560. 
32. Ioannou P. and Fidan B. - Adaptive Control Tutorial, Advances in Design and Control,
SIAM, Philadelphia, PA, 2006.
33. Vamvoudakis K. G. and Lewis F. L. - Online actor-critic algorithm to solve the
continuous-time infinite horizon optimal control problem, Automatica 46 (5) (2010)
878–888.
34. Lewis F. L., Jagannathan S., and Yesildirek A. - Neural Network Control of Robot 
Manipulators and Nonlinear Systems, Taylor and Francis, Philadelphia, PA, 1999. 