Abstract: In this paper, we adopt D-type and PD-type learning laws with an initial state learning rule to solve the uniform tracking problem of multi-agent systems subject to impulsive inputs. For the multi-agent system with impulses, we show that all agents are driven to achieve a given asymptotical consensus as the iteration number increases via the proposed learning laws if the virtual leader has a path to any follower agent. Finally, an example is presented to verify the effectiveness of the proposed laws in tracking a continuous or piecewise continuous desired trajectory.
Keywords: iterative learning control, multi-agent systems, impulsive consensus tracking.
Article
Iterative learning control for multi-agent systems with impulsive consensus tracking

Received: 12 December 2019
Revised: 28 May 2020
Published: 01 January 2021
Multi-agent systems (MASs) have been widely used in various disciplines such as unmanned vehicles, wireless sensor networks, and communication networks in the past decades. In a satellite positioning system such as GPS, for instance, information can be exchanged between satellites and transmitted to the ground to guarantee accurate positioning. The consensus problem is a fundamental issue for MASs because of its wide applications in formation control, distributed estimation, and congestion control. In fact, consensus tracking over networks means that the outputs of all agents track a given objective synchronously. We note that abrupt changes of states may exist at some time instants in biological and physical systems. For example, the migration of birds is subject to abrupt changes due to harvesting and diseases. For this scenario, MASs with impulses can well describe the inevitable interference during actual system operation. When GPS satellites suffer from solar storms and other external interference, their trajectories may shift, which is an impulse phenomenon. This paper only discusses the case of instantaneous impulses; that is, the duration of an impulse is very short compared with the whole process. To study the uniform tracking problem of impulsive MASs is to study whether the agents can return to the predetermined trajectory through information exchange after being disturbed by the external environment. In this regard, Cui conducted related research in [6]. However, very few existing papers have considered the consensus problem of MASs with impulses, see, for example, [8, 14, 16, 21, 32, 35, 36, 38], in the conventional consensus framework. In addition, the impulsive control approach is advantageous in its simplicity and flexibility for such systems because standard continuous state information is not required. As a consequence, this approach has been used to study the uniform tracking problem [9–11, 15, 27, 28, 31, 39] and adaptive consistency and synchronization problems [5, 7, 22–25, 29] for MASs.
For a robot performing a trajectory tracking task over a finite time interval, iterative learning control (ILC) uses the error information measured during the previous operation or operations to correct the control input, such that the operation performance improves along the iteration axis. Consequently, the desired trajectory can be precisely tracked over the entire time interval by the inherent mechanism of learning. ILC was first proposed in [2] for a robot, whereas Ahn and Chen [1] applied ILC to the consensus tracking of a MAS. Recently, ILC laws have been extensively studied for various types of MASs such as fractional-order MASs [4, 17–20, 26, 33, 34, 37]. Note that MASs with impulses can generate discontinuous inputs; thus it is still challenging to consider whether ILC can be successfully applied to collect the sampled error data from each agent and track a continuous or discontinuous trajectory, i.e., to achieve leader-following consensus for nonlinear MASs with impulses. In addition, [12, 13] use Lyapunov stability theory to analyze the coordination performance of MASs.
In consideration of the above discussion, we address the application of learning-type consensus tracking algorithms for MASs in this paper. In particular, we use D-type and PD-type ILC laws to derive the tracking performance of impulsive MASs under a fixed topology. The D-type ILC update law is a differential learning law: it uses the derivative of the error signals from the previous iteration to correct the input signals for the next iteration. The PD-type ILC update law is the superposition of a proportional learning law and a differential learning law: it uses the error signals from the last iteration and their derivatives to correct the input signals for the next iteration. A fundamental challenge in this paper is how to design an effective ILC law by using information about the tracked trajectory and the specified agent's neighbors. This challenge is resolved by providing flexible control inputs according to the changes of the system states at fixed points. The output can then track a piecewise continuous trajectory by using continuous-time topology connections involving some instantaneous information exchanges.
The rest of the paper is organized as follows: Section 2 provides the problem formulation and preliminaries. Section 3 provides the main results of this paper. An illustrative example is presented in Section 4.
Consider a weighted directed graph Q = (V, E, Z) composed of a set of vertices V = {1, 2, 3, ..., N}, where N represents the number of agents in the system, a set of edges E ⊆ V × V, and an adjacency matrix Z. V represents the set of agents. The edge set E is composed of directed ordered pairs (i, j), where (i, j) ∈ E means that agent i can pass information to agent j; that is, i is called the parent node of j, and j is called the child node of i. All vertices adjacent to agent i form the neighbor set of agent i, denoted N_i = {j ∈ V : (j, i) ∈ E}. Z = (z_{i,j}) ∈ R^{N×N} is the weighted adjacency matrix of Q, composed of nonnegative elements z_{i,j}. In particular, z_{i,i} = 0; if (j, i) ∈ E, then z_{i,j} = 1, which means that agent j can pass information to agent i; if (j, i) ∉ E, then z_{i,j} = 0, which means that agent j cannot pass information to agent i. The Laplacian of Q is defined as P = D − Z, where D = diag(d_1, d_2, ..., d_N) and d_i represents the in-degree of vertex i, that is, d_i = Σ_j z_{i,j}. If a directed graph has one node that has no parent and all other nodes have exactly one parent, the directed graph is called a spanning tree.
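As a concrete illustration of these notions, the following minimal sketch (not from the paper; the chain topology and the NumPy realization are our own illustrative choices) builds the in-degree Laplacian P = D − Z from an adjacency matrix:

```python
import numpy as np

def laplacian(Z: np.ndarray) -> np.ndarray:
    """In-degree Laplacian P = D - Z, where d_i = sum_j z_{i,j}."""
    D = np.diag(Z.sum(axis=1))  # in-degree of each vertex on the diagonal
    return D - Z

# Illustrative 4-agent chain 1 -> 2 -> 3 -> 4: z[i, j] = 1 iff agent j+1
# can pass information to agent i+1 (0-based indices).
Z = np.array([[0, 0, 0, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)
P = laplacian(Z)  # each row of P sums to zero
```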
In this paper, ‖x‖ is used to represent the 2-norm of a vector x, and ‖A‖ is used to represent a matrix norm compatible with it. The λ-norm of a function v : [0, α] → R^n is defined as ‖v‖_λ = sup_{τ∈[0,α]} e^{−λτ}‖v(τ)‖, λ > 0.
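Numerically, the λ-norm can be approximated on a time grid; the following sketch mirrors the definition above (the grid resolution, the trajectory, and the value of λ are illustrative choices, not taken from the paper):

```python
import numpy as np

def lambda_norm(v, alpha: float, lam: float, num: int = 1001) -> float:
    """Approximate ||v||_lambda = sup_{tau in [0, alpha]} exp(-lam*tau) * ||v(tau)||."""
    taus = np.linspace(0.0, alpha, num)
    return max(np.exp(-lam * t) * np.linalg.norm(v(t)) for t in taus)

# Example: a vector-valued trajectory on [0, 6] with lambda = 5.
print(lambda_norm(lambda t: np.array([np.sin(t), t]), alpha=6.0, lam=5.0))
```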
The standard Kronecker product is defined as
$$H \otimes L = \begin{pmatrix} h_{11}L & h_{12}L & \cdots & h_{1c}L \\ \vdots & \vdots & & \vdots \\ h_{a1}L & h_{a2}L & \cdots & h_{ac}L \end{pmatrix},$$
where H = (h_{ij}) ∈ R^{a×c} and L ∈ R^{b×d}, so that H ⊗ L ∈ R^{ab×cd}.
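In code, the Kronecker product is available directly; a small check with illustrative matrices:

```python
import numpy as np

H = np.array([[1.0, 2.0],
              [0.0, 3.0]])   # H = (h_ij) in R^{2x2}
L = np.eye(3)                # L in R^{3x3}
K = np.kron(H, L)            # block (i, j) of K equals h_ij * L
assert K.shape == (H.shape[0] * L.shape[0], H.shape[1] * L.shape[1])
```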
Consider a system with N agents, each agent with q impulse points; Q = (V, E, Z) represents their interaction topology. The jth agent is governed by the following nonlinear impulsive system:
$$\begin{cases} \dot X_{i,j}(\tau) = k(\tau, X_{i,j}(\tau)) + B\,u_{i,j}(\tau), & \tau \in [0, \alpha],\ \tau \neq \tau_t, \\ X_{i,j}(\tau_t) = X_{i,j}(\tau_t^-) + M_t(X_{i,j}(\tau_t^-)), & t = 1, 2, \ldots, q, \\ y_{i,j}(\tau) = C(\tau)\,X_{i,j}(\tau) \end{cases} \tag{1}$$
for all j ∈ V, τ ∈ [0, α]. This system is right-continuous, where X_{i,j} ∈ R^n is the state vector of the jth agent, u_{i,j} ∈ R^p is the control function of the jth agent, B is an R^{n×p} matrix, y_{i,j} ∈ R^m is the output vector of the jth agent, k(·, ·) : [0, α] × R^n → R^n and M_t : R^n → R^n are continuous, and C(τ) is a continuous R^{m×n} matrix function. The impulsive time sequence is denoted by 0 < τ_1 < τ_2 < ··· < τ_q < α. X(τ_t^+) = lim_{ε→0^+} X(τ_t + ε) and X(τ_t^-) = lim_{ε→0^+} X(τ_t − ε) denote the right and left limits of X(τ) at τ = τ_t, respectively.
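For intuition, a single agent of a system of the form (1) can be simulated by forward Euler with jumps applied at the impulse instants; this is a hedged sketch (the integrator, the right-continuous jump convention, and a constant output matrix C are implementation choices, not the paper's method):

```python
import numpy as np

def simulate(k, B, C, M, u, x0, alpha, impulse_times, dt=1e-3):
    """Integrate dx/dt = k(tau, x) + B u(tau) with jumps x <- x + M(tau_t, x)."""
    taus = np.arange(0.0, alpha, dt)
    X = np.empty((len(taus), len(x0)))
    x = np.asarray(x0, dtype=float)
    pending = sorted(impulse_times)            # 0 < tau_1 < ... < tau_q < alpha
    for step, tau in enumerate(taus):
        if pending and tau >= pending[0]:
            x = x + M(pending.pop(0), x)       # impulsive jump at tau_t
        X[step] = x                            # right-continuous sample
        x = x + dt * (k(tau, x) + B @ u(tau))  # continuous flow between jumps
    return taus, X, X @ C.T                    # outputs y = C x (C constant here)
```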
We need the following conditions:
(H1) k(·, ·) satisfies the Lipschitz condition
$$\|k(\tau, X_{i+1,j}) - k(\tau, X_{i,j})\| \le \theta_k \|X_{i+1,j} - X_{i,j}\| \tag{2}$$
for any τ ∈ [0, α] and X_{i+1,j}, X_{i,j} ∈ R^n.
(H2) M.(·) satisfies the Lipschitz condition
$$\|M_t(x) - M_t(y)\| \le \theta_t \|x - y\|, \quad t = 1, 2, \ldots, q, \tag{3}$$
for any x, y ∈ R^n.
Under assumptions (H1) and (H2), following [30, Remark 4.1], system (1) with X(0) = X_0 has a unique solution in the piecewise continuous function space
$$PC([0, \alpha], \mathbb{R}^n) = \bigl\{X : [0, \alpha] \to \mathbb{R}^n \mid X \text{ is continuous at } \tau \neq \tau_t,\ X(\tau_t) = X(\tau_t^+),\ \text{and } X(\tau_t^-) \text{ exists},\ t = 1, 2, \ldots, q\bigr\}.$$
Let y_d(τ) be the desired consensus trajectory of the MAS on the time interval τ ∈ [0, α], 0 < α < ∞. Here, y_d(τ) is not necessarily continuous on the whole time interval [0, α]. We regard the desired trajectory y_d(τ) as the virtual leader in the communication topology and mark it with vertex 0. Then, the information exchange among agents can be represented by an extended communication topology graph Q̄ = (V ∪ {0}, Ē, Z̄), where Ē represents the edge set and Z̄ represents the weighted adjacency matrix. The control objective is to design appropriate iterative learning laws such that the outputs of all agents asymptotically converge to the desired trajectory y_d(τ).
We use the symbol σ_{i,j}(τ) to represent all the information received by the jth agent in the ith iteration. It can be expressed as the sum of the information transmitted from other agents to the jth agent and the information possibly transmitted from the leader to the jth agent:
$$\sigma_{i,j}(\tau) = \sum_{l \in N_j} z_{j,l}\bigl(y_{i,l}(\tau) - y_{i,j}(\tau)\bigr) + d_j\bigl(y_d(\tau) - y_{i,j}(\tau)\bigr).$$
The jth agent can get information directly from the desired trajectory; that is, if (0, j) ∈ Ē, then d_j = 1; otherwise, d_j = 0. The first subscript of σ and y indicates the iteration number, and the second subscript indicates the index of the agent; the subscripts of the other quantities introduced in Section 2 are read in the same way. The derivative of σ_{i,j}(τ) is defined as follows:
$$\dot\sigma_{i,j}(\tau) = \sum_{l \in N_j} z_{j,l}\bigl(\dot y_{i,l}(\tau) - \dot y_{i,j}(\tau)\bigr) + d_j\bigl(\dot y_d(\tau) - \dot y_{i,j}(\tau)\bigr).$$
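Assuming the combination of neighbor and leader output differences given above for σ_{i,j} (the paper's exact weighting may differ; this sketch uses the adjacency weights z_{j,l} and the leader-access flags d_j), σ can be evaluated for all agents at one time instant as follows:

```python
import numpy as np

def sigma(Y: np.ndarray, y_d: np.ndarray, Z: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Distributed measurement for all agents at one time instant.

    Y[j]: output of agent j; y_d: leader output; Z: adjacency matrix;
    d[j] = 1 iff agent j receives the leader's trajectory directly.
    """
    out = np.zeros_like(Y)
    for j in range(Y.shape[0]):
        out[j] = Z[j] @ (Y - Y[j]) + d[j] * (y_d - Y[j])  # neighbors + leader
    return out
```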
In order to make the agents track the target trajectory as the iteration number increases, the following D-type learning law is employed:
$$u_{i+1,j}(\tau) = u_{i,j}(\tau) + \Gamma(\tau)\,\dot\sigma_{i,j}(\tau), \tag{5}$$
where Γ(τ) is an R^{p×m} matrix function, differentiable on the interval [0, α]. The initial state learning rule is as follows:
$$X_{i+1,j}(0) = X_{i,j}(0) + B\,\Gamma(0)\,\sigma_{i,j}(0). \tag{6}$$
Set ψ_{i,j}(τ) as the tracking error of the jth agent; that is, ψ_{i,j}(τ) = y_d(τ) − y_{i,j}(τ). The learning law (5) can then be written as
$$u_{i+1,j}(\tau) = u_{i,j}(\tau) + \Gamma(\tau)\Bigl(\sum_{l \in N_j} z_{j,l}\bigl(\dot\psi_{i,j}(\tau) - \dot\psi_{i,l}(\tau)\bigr) + d_j\,\dot\psi_{i,j}(\tau)\Bigr). \tag{7}$$
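On a sampled time grid, the D-type update can be realized with a finite-difference stand-in for σ̇ (an implementation choice of ours, not part of the paper's analysis; with impulses, σ should be differentiated piecewise between the instants τ_t):

```python
import numpy as np

def d_type_update(U, Sigma, Gamma, dt):
    """u_{i+1} = u_i + Gamma * d(sigma_i)/dt on a grid.

    U: (T, N, p) inputs; Sigma: (T, N, m) measurements; Gamma: (p, m) gain.
    """
    dSigma = np.gradient(Sigma, dt, axis=0)            # numeric sigma-dot
    return U + np.einsum('pm,tnm->tnp', Gamma, dSigma)
```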
We stack the corresponding quantities of all agents at an arbitrary iteration into vector form as X_i(τ) = (X_{i,1}(τ)^T, X_{i,2}(τ)^T, ..., X_{i,N}(τ)^T)^T, u_i(τ) = (u_{i,1}(τ)^T, u_{i,2}(τ)^T, ..., u_{i,N}(τ)^T)^T, ψ_i(τ) = (ψ_{i,1}(τ)^T, ψ_{i,2}(τ)^T, ..., ψ_{i,N}(τ)^T)^T, and σ_i(τ) = (σ_{i,1}(τ)^T, σ_{i,2}(τ)^T, ..., σ_{i,N}(τ)^T)^T, where (·)^T denotes the transpose. Then (5), (6), and (7) can be written as follows:
$$u_{i+1}(\tau) = u_i(\tau) + \bigl(I_N \otimes \Gamma(\tau)\bigr)\dot\sigma_i(\tau), \qquad \sigma_i(\tau) = \bigl((P + \bar D) \otimes I_m\bigr)\psi_i(\tau), \tag{8}$$
$$X_{i+1}(0) = X_i(0) + \bigl(I_N \otimes B\,\Gamma(0)\bigr)\sigma_i(0), \tag{9}$$
where $\bar D = \mathrm{diag}(d_1, d_2, \ldots, d_N)$.
To study the multi-agent consensus problem with impulse points, (H1), (H2), and the following assumption are necessary in this paper.
Assumption 1. The desired trajectory y_d(τ) is trackable; that is, there exists a state X_d(τ) satisfying y_d(τ) = C(τ)X_d(τ).
For brevity, let θ_M = max_t(θ_t) and

and

where θ_t is the Lipschitz constant in (3).
Theorem 1. Consider the multi-agent system (1) under a fixed communication topology with (H1), (H2), and Assumption 1 holding, and apply the D-type learning control law (5) and the initial state learning rule (6). As the iteration number approaches infinity, the tracking error ψ_i(τ) converges to zero, i.e., lim_{i→∞} y_{i,j}(τ) = y_d(τ) for all τ ∈ [0, α], if the desired trajectory has a path to any follower agent and

where Φ is defined by (10).
It should be noted that each iteration updates the parameters of the entire system; the range of the system's independent variable τ is bounded, but the number of iterations is not limited. In other words, the convergence here means pointwise convergence over the entire time interval as the iteration number increases to infinity.
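Schematically, the proof below follows the usual λ-norm contraction argument of ILC; the following sketch shows only the structure (ε(λ) stands for the Gronwall-type remainder; these are not the paper's exact constants):

```latex
% Sketch of the contraction structure (illustrative, not the exact estimates):
\|\psi_{i+1}\|_{\lambda}
   \le \bigl(\rho + \varepsilon(\lambda)\bigr)\|\psi_{i}\|_{\lambda},
   \qquad \rho < 1, \quad \varepsilon(\lambda) \xrightarrow[\lambda\to\infty]{} 0,
% so, for \lambda large enough, \rho + \varepsilon(\lambda) < 1 and
\|\psi_{i}\|_{\lambda}
   \le \bigl(\rho + \varepsilon(\lambda)\bigr)^{i}\,\|\psi_{0}\|_{\lambda}
   \longrightarrow 0 \quad (i \to \infty).
```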
Proof. The tracking error of the jth agent in the (i + 1)th iteration is

and

From (4) it can be known that

where K(s, X_i) = (k(s, X_{i,1})^T, k(s, X_{i,2})^T, ..., k(s, X_{i,N})^T)^T, and I_N is the N × N identity matrix.
According to (8) and (9), (14) can be written as

where

Then

Taking norms on both sides of (17) and using (2) and (3), we can get

In a similar way, we can get

Multiply both sides of inequality (19) by e^{−λτ}:

Substituting (15) into (13) and taking norms, according to (2) and (3), we can get

Multiply both sides of inequality (21) by e^{−λτ} and use (20):

Then, taking the λ-norm of (22), we have

According to (18) and the impulsive Gronwall inequality (see [3, Lemma 4.2]), we can get

Then, taking the λ-norm of (25), we have

Substitute (25) into (23), and then let λ → ∞:

and

By (26) and (11),

The proof is completed.
Further, we consider the PD-type learning law
$$u_{i+1,j}(\tau) = u_{i,j}(\tau) + \Gamma(\tau)\,\dot\sigma_{i,j}(\tau) + \Lambda(\tau)\,\sigma_{i,j}(\tau), \tag{27}$$
where Γ(τ) and Λ(τ) are R^{p×m} matrix functions, differentiable on the interval [0, α]. The initial state learning rule is as follows:

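A grid version of the PD-type update simply adds a proportional term to the D-type sketch given earlier (the gain names are ours; the derivative is again a finite-difference stand-in):

```python
import numpy as np

def pd_type_update(U, Sigma, P_gain, D_gain, dt):
    """u_{i+1} = u_i + P_gain * sigma_i + D_gain * d(sigma_i)/dt on a grid."""
    dSigma = np.gradient(Sigma, dt, axis=0)
    return (U
            + np.einsum('pm,tnm->tnp', P_gain, Sigma)    # proportional term
            + np.einsum('pm,tnm->tnp', D_gain, dSigma))  # differential term
```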
From the above, one has the following result.
Theorem 2. Consider the multi-agent system (1) under a fixed communication topology with (H1), (H2), and Assumption 1 holding, and apply the PD-type learning control law (27) and the initial state learning rule (28). As the iteration number approaches infinity, the tracking error ψ_i(τ) converges to zero, i.e., lim_{i→∞} y_{i,j}(τ) = y_d(τ) for all τ ∈ [0, α], if the desired trajectory has a path to any follower agent and

where Φ is defined by (10).
Proof. The proof is similar to that of Theorem 1, so we mainly present the differences.
Clearly, the tracking error is

We need to compute the state error X_{i+1}(τ) − X_i(τ) similarly to (14); by (15) and (16), we obtain

Taking norms on both sides of (30), according to (2) and (3), we can derive

Similar to (19), we can get

Multiply both sides of inequality (32) by e^{−λτ}:

Substituting (30) into (29) and taking norms, according to (2) and (3), we can get

Similarly to (22), multiply both sides of inequality (34) by e^{−λτ} and use (33):

Then, taking the λ-norm of (35) and letting λ → ∞, we have

By (36) and (11),

The proof is completed.
We can use the following procedure to carry out computer simulation experiments (a code sketch is given after the steps):
Step 1. Give the expression of the target trajectory y_d, the expression of the multi-agent system (1), and the initial parameters of the D-type or PD-type learning laws.
Step 2. Generate the system output y_i.
Step 3. Calculate the tracking error ψ and its norm ‖ψ‖. If ‖ψ‖ < ε, the program ends. If ‖ψ‖ ≥ ε, go to Step 4. Here, ε is a given positive real number.
Step 4. Update the input according to the learning law using tracking errors and the communication topological relationship between agents, then go to Step 2.
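The four steps translate into a short loop; in this sketch, run_system, update, and the stopping threshold eps are illustrative stand-ins for the pieces defined earlier, not the paper's code:

```python
import numpy as np

def ilc_loop(update, run_system, y_desired, u0, eps=1e-3, max_iters=300):
    """Steps 1-4: iterate the learning law until the error norm drops below eps."""
    u = u0
    for _ in range(max_iters):
        y = run_system(u)                                  # Step 2: system output
        psi = y_desired - y                                # Step 3: tracking error
        if np.max(np.linalg.norm(psi, axis=-1)) < eps:
            break                                          # tolerance reached
        u = update(u, psi)                                 # Step 4: learning update
    return u, psi
```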
We consider the following MAS consisting of five agents:

for all i ∈ V, τ ∈ [0, 6], where the first and second components denote the first and second states of the ith agent, respectively. The initial values are as follows:

The communication topology is shown in Fig. 1, where 0 represents the leader. According to Fig. 1, the Laplacian matrix is


and D = diag(1, 2, 2, 1). The target trajectory, i.e., the trajectory of vertex 0, is as follows: y_d = (y_{d1}, y_{d2})^T, where

and

Here, y_{d1} and y_{d2} represent the first and second dimensions of the target trajectory, respectively. The D-type learning control law is

while the PD-type counterpart is

where the initial input is u_1(τ) = [0, 0]^T, and Φ(C, B, P) = 0.8918 < 1, which satisfies the condition of Theorems 1 and 2. Therefore, the multi-agent system can uniformly track the target trajectory under the given learning control laws. Figures 2 and 3 show that the error between the output and the target trajectory gradually converges to 0 (for both the D-type and PD-type laws).
Figures 4–7 show the iterative learning process of two output trajectories with D-type learning law. Figures 8–11 show the iterative learning process of two output trajectories with PD-type learning law. Figure 12 shows the iteration profile of the initial values.
As the number of iterations increases, the output trajectories gradually converge to the desired trajectory. At the 250th iteration, the consensus errors of the D-type and PD-type learning laws are shown in Table 1.
As can be seen from the table, when the number of iterations reaches 250, the system's convergence error under the D-type learning law is significantly smaller than under the PD-type learning law. For this numerical example, a more complex learning law does not necessarily lead to a better control effect. However, we should note that the introduction of a proportional term may help stabilize the system dynamics.
To solve the uniform tracking problem of impulsive MASs, this paper uses two kinds of iterative learning laws to control the system and finds sufficient conditions for the system to converge to the target trajectory under each of the two learning laws. The conditions show that when the initial parameters of the system meet certain requirements, we can adjust the initial parameters of the learning law so that, after finitely many iterations, the error between the output and the target trajectory becomes sufficiently small. Compared with a single agent, a MAS can exchange information between agents, which better ensures the effectiveness of tracking. Compared with a continuous system, an impulsive system is more general and more in line with real cases. Finally, a numerical example is given to demonstrate the effectiveness of the conclusions. In future work, we will construct a fractional-order iterative learning law to control impulsive MASs and study their consensus tracking.
The authors are grateful to the referees for their careful reading of the manuscript and their valuable comments. We also thank the editor.