Unconstrained Evolutionary and Gradient Descent-Based Tuning of Fuzzy-partitions for UAV Dynamic Modeling

In this paper, a novel fuzzy identification method for dynamic modelling of quadrotors UAV is presented. The method is based on a special parameterization of the antecedent part of fuzzy systems that results in fuzzy-partitions for antecedents. This antecedent parameter representation method of fuzzy rules ensures upholding of predefined linguistic value ordering and ensures that fuzzy-partitions remain intact throughout an unconstrained hybrid evolutionary and gradient descent based optimization process. In the equations of motion the first order derivative component is calculated based on Christoffel symbols, the derivatives of fuzzy systems are used for modelling the Coriolis effects, gyroscopic and centrifugal terms. The non-linear parameters are subjected to an initial global evolutionary optimization scheme and fine tuning with gradient descent based local search. Simulation results of the proposed new quadrotor dynamic model identification method are promising.


INTRODUCTION
Autonomous quad-rotor helicopters increasingly attract the attention of potential researchers. A wide area of robotics research is dedicated to aerial platforms. The quad-rotor architecture has low dimensions, good manoeuvrability, simple mechanics and payload capability. The electrically powered four-rotor quadrotor helicopter architecture has been chosen for this research ( Figure 1).

Figure 1. Quad-rotor helicopter
This structure can be attractive in several applications, in particular for surveillance, for imaging dangerous environments and for outdoor navigation and mapping. The study of kinematics and dynamics helps to understand the physics of the quad-rotor and its behaviour.
Together with modelling, the determination of the control algorithm structure is very important [1][2][3][4][5][6]. The quad-rotor UAV is controlled by angular speeds of four motors. Each motor produces a thrust and a torque, whose combination generates the main trust, the yaw torque, the pitch torque, and the roll torque acting on the quad-rotor. Motors produce a force proportional to the square of the angular speed and the angular acceleration; the acceleration term is commonly neglected as the speed transients are short thus exerting no significant effects.
Soft computing methods can be efficiently applied together with and also instead of conventional controllers. Fuzzy modelling [7][8][9][10] can be conducted as black-box modelling where all the system knowledge is mere input-output data. However when expert knowledge is readily available, we should take advantage of it -fuzzy grey-box modelling is a rational choice.
Identification of linear parameters is a well-studied area, with efficient matrix algebra and singular value decomposition based reliable tools. Non-linear parameters can also be simply traced to their local optimum with well-studied gradient descent methods, but we should always keep in mind that gradient descent methods are trapped by local optimum areas. Evolutionary algorithms are robust global optimum search engines, capable of multi-objective search as described in [8][9][10].
The paper is organized as follows: In Section 1 the Introduction is given. In Section 2 the quad-rotor dynamic model is presented. In Section 3 the Fuzzy-Logic systems are illustrated. In Section 4 the Multi-Objective Genetic Algorithms are illustrated.
Section 5 presents the simulation setup and results. Conclusions are given in Section 6.

DYNAMIC MODEL OF THE QUAD-ROTOR HELICOPTER
Dynamic modelling [11][12][13][14][15][16][17] of the quad-rotor helicopter is a well elaborated field of aeronautics. The model of the quad-rotor helicopter [18] and the rotational directions of the propellers can be seen in Figure 2. The rotor pair 2 and 4 rotates clockwise direction and the rotor pair 1 and 3, anticlockwise direction. A quad-rotor helicopter has fixed pitch angle rotors and the rotor speeds are controlled in order to produce the desired lift forces.
The quadrotor helicopter has four actuatorsbrushless DC motors which exert lift forces F 1 , F 2 , F 3 , F 4 proportional to the square of the angular velocities of the rotors. Actually, four motor driver boards are needed to amplify the power delivered to the motors. Their rotation is transmitted to the propellers which move the entire structure.
The main thrust is the sum of individual trusts of each four motor. The pitch torque is a function of difference in forces produced on one pair of motors, while the roll torque is a function of difference in forces produced on other pair of motors. The yaw torque is the sum of all four motor reaction torques due to shaft acceleration and blades drag. The motor torque is opposed by a general aerodynamical drag. The complete dynamics of an aircraft, taking into account aero-elastic effects, flexibility of wings, internal dynamics of the engine, and the whole set of changing environmental variables is quite complex and somewhat unmanageable for the purpose of autonomous control engineering.
For a full dynamic model of a quad-rotor system both (1) the center of mass position vector of (x, y, z) in fixed frame coordinates and (2) the orientation Euler angles: roll, pitch, yaw angles (Φ, θ, ψ) around body axes X, Y, Z are considered for the vector of generalized coordinates q. Using the Euler-Lagrange approach it can be shown how the translational forces F ξ applied to the rotorcraft due to main trust can be full decoupled from the yaw, pitch and roll moments For a full dynamic model of a quad-rotor system both (1) the center of mass position vector of (x, y, z) in fixed frame coordinates and (2) the orientation Euler angles: roll, pitch, yaw angles (ϕ, θ, ψ) around body axes x, y, z are considered for the vector of generalized coordinates q. Using the Euler-Lagrange approach it can be shown how the translational forces F ξ applied to the rotorcraft due to main thrust can be full decoupled from the yaw, pitch and roll moments τ: where m is the quadrotor mass and g is the gravitational constant. , where J is a 3x3 matrix, called the inertia matrix and C is also a 3x3 matrix that refers to Coriolis, gyroscopic and centrifugal terms. Further on, for the scope of this paper we shall address only equation (2) as the quadrotor dynamic model to be identified. Equation (2) can be analyzed as three resultant torques i τ acting along the i th axes respectively as i∈ (ϕ, θ, ψ), which using Christoffel symbols of the first kind can be defined as a function of the state vector of Euler angles q = [ϕ, θ, ψ], their velocities ( The first component of equation (3) is shortly referred to as Jq ɺɺ the inertia matrix part, while the second as Cq ɺ the Coriolis matrix term for which components are defined as: are in general, highly non-linear scalar functions of the state vector q. They contain sin( . ) and cos( . ) functions of q, and their products and sums defined by the geometry of the system. There are general relations that can be used for reducing the number of unknown elements of J and C, like: (1) J is symmetric and (2) ijk D are Christoffelsymbols of ij D [12] thus further properties are inherently defined as: It should be noted that direct measurement of any single component from equation (4) is not possible; the only measurable data, on the output of the system, is the resultant torque of equation (3). Identification of all non-linear functions (4) under these terms is a considerable problem.

FUZZY-LOGIC SYSTEMS
The Takagi-Sugeno-Kang (TSK) type Fuzzy-logic systems (FLSs) having n inputs and 1 output are defined in [19] as: where M is the number of rules, q is the vector of n input variables, y l is a scalar function of n input variables, defined by (n+1) c parameters as in equation (8). The antecedent, the premise part of a fuzzy rule is: is the membership function (MF) of the i th input variable in the l th rule that defines the linguistic value F l(i) . The linguistic form of the l th rule from the previously described first order TSK FLS is defined in [20] as: Zadeh-formed MFs are the z-, the s-, and πfunctions (named after their shape) defined respectively as: are parameters defining MFs. In case there is more than one value q such that the degree of membership of q is equal to one, the interval where A linguistic variable can be assigned K different linguistic values, each described by a MF such that for every input q it holds , the MFs are said to form a fuzzy-partition. Forming fuzzy-partitions by antecedent membership functions ensures that there cannot be a numerical input within the defined input range that will not result in firing at least one rule consequent of the fuzzy model, which means that there is a defined output for all possible input states. Keeping specific properties of fuzzy-partitions imposes a set of hard constraints on membership function parameters as detailed in [20]. By imposing these restrictions on all linguistic variables of the FLS, and additionally assuming that the rule base is complete in the sense that it covers the whole input domain, it immediately follows that the TSK model structure of equation (6) simplifies to: Automatic fine tuning FLS parameters that satisfies all of above listed constraints is a significant problem.
In [21] a method is introduced that simplifies parameter optimisation of equation (11) while preserving all required constraints. Fuzzy-partitions can be simply formed from Zadeh-typed MFs by making equal the last two parameters of each preceding MF to the first two parameters of the succeeding MF. This way a fuzzy partition of K MFs is defined by parameters. Let our input space be normalised (q min = 0 and q max = 1). If we do not want to allow any plateaux, parameter b 2 must be equal to b 3 in equation (9) this way the number of parameters is further reduced to 2 − K .
b 1 <b 2 <…<b K-1 Let us add two more constraints: Let us define the first MF to be: Let the Kth, the last MF concluding the fuzzy partition be: Let us define intermediate k th MFs to be: for k = 1, …, 2 − K , where b 0 = 0 and b K-1 =1. This way the ordered series of 2 − K b i parameters together with constants 1 and 0 are the minimal number of parameters to define a fuzzy-partition of Zadeh-formed MFs.
This minimal number of non-linear parameters is a very important issue for optimisation as over parameterised systems are hard to optimise [22][23]. The only problem is that when tuning the non-linear parameters of a FLS having an n dimensional input space, we must comply with ∑ =

FME Transactions
Although there are a number of constrained optimisation methods it is obvious that an unconstrained optimisation method would be more efficient. Let us consider K-1 pieces of rational, positive or zero parameters as proposed in [ When we define b k as: for every k = 1, …, K-2; all the constraints of equation (12) and equation (13) are automatically fulfilled for every κ a from equation (18) without any further restrictions on any κ a , other than κ a ≤ 0 . An ANFIS like optimisation, defined in [23] or any other efficient unconstrained nonlinear numerical method can be applied to minimise equation (11) error along the κ a parameters. For calculating all linear parameters a linear least square (LS) method can be applied to c l(j) parameters of the consequent part. To avoid traps of local optimal solutions for κ a , a preliminary global search should be applied.

MULTI-OBJECTIVE GENETIC ALGORITHMS
A genetic algorithm (GA) is constructed on the basis of imitating natural biological processes and Darwinian evolution [24]. GAs are widely used as powerful global search and optimization tools [25]. Real life optimization problems often have multiple objectives. To establish ranking of chromosomes for Gas the comparison of two objective vectors is required. Often a simple weighted sum is used, but its drawbacks are widely known [9]. Pareto based comparison is the bases of a few popular methods like Non-dominated Sorting GA (NSGA) [7] and Multi-Objective GA (MOGA).
A general multi-objective optimization problem consists of n number of scalar minimization objectives where every scalar objective function fi(x) is to be minimized simultaneously, where x is an n-dimensional vector of parameters. As maximization can be easily transformed to minimization, the generality of the previous statement stands.
A vector x1 Pareto-dominates x2, when no scalar component of x2 is less than the appropriate component of x1, and at least one component of x1 is strictly smaller than the appropriate component of x2. Since no metrics can be assigned to Pareto-dominance, there have been two different attempts to define a GA ranking method, which can be used for Pareto-dominance vector comparison: (1) "Block-type" ranking is defined as: Rank is equal to 1 + (number of individuals that dominate the ith individual) (2) "Slice-type" ranking is defined in [7] as: Rank is equal to 1 + (number of turns when the non-dominated individuals are eliminated, needed for the ith individual to become non-dominated).
Quantity-dominance, as proposed in [10] is defined as: vector a=[ai] Quantity-dominates vector b=[bi] if a has more such ai components, which are better than the corresponding bi component of vector b, and a has less such aj components, which are worse than the corresponding bj. A metrics can be defined as: the measurement of the extent of Quantity-dominance is the difference between the number of better and the number of worse components. For a measurement based ranking method the Rank of the ith objective vector can be simply defined as the sum of Quantity-dominance measurements for every individual measured from the ith vector. This ranking method can be readily applied with Quantity-comparison.
The proposed vector comparison method provides more information when comparing two vectors than the classic Pareto-based comparison, thus the GA is faster, more efficient in its search. The MMNGA algorithm is computationally less expensive, and more efficient compared to the classical methods, and its results analyzed on a number of GA hard problems are at least equally good [10].
A simple method for realistic training data acquisition is to define the desired state position vector (x, y, z) with the desired yaw rotation angle . As the quadrotor dynamics is very sensitive to jerks, sudden changes in the third derivatives of state variables, the desired state variables will be defined through jounce (the fourth derivatives of state variables) with continuous functions π mf from equation (9), like: A general basic smooth jounce function setup to minimise jerk and thus results in smooth torque transitions is presented in Figure 3.
The proposal of this paper is to identify D ij inertia matrix components of the dynamic model in equation (4) as FLSs defined by equations (11) to (18), where the FLS general input variable q will be substituted for the appropriate state variables of (ϕ, θ, ψ). When the D ij inertia matrix components are constructed in this way, forming the D ijk components as Christoffel symbols is to be expressed by partial derivatives of equation (11) The unknown inertia matrix components of equation (2) to be identified are: Based on quadrotor system structure and inertia matrix symmetry the remaining inertia components are known to be: 11 12 21 12 31 13 32 23 , 0, , , Based on equation (5) Remaining D ijk components are trivial identities defined by equation (5).
The calculated roll, pitch and yaw motions are as presented on Figures 4, 5 and 6. The simulated resultant torque training data is as presented on Figure 7. parameters of the system are identified in a manner that first the input space is normalised to the unit hyper-cube. A set of non-linear parameters consists of six times four K a integer parameters for defining six fuzzy-partitions of five MFs each, where each partition consists of one z-type MF, three π-type MFs and one s-type MF as in (9)- (18). These six fuzzy-partitions serve as antecedents for the four fuzzy-systems like in equation (11) and (21), used for identifying D ij , ij= (13,22,23,33) as defined in equations (22)- (24) and (5).
Two unknown linear parameters D 11 and D 12 of the quadrotor model as in equation (23), together with 170 linear parameters of the four TSK FLSs (2 FLSs with 5 MFs on one input, each rule with 2 c parameters, plus 2 FLSs with 5 MFs on both of the 2 inputs, each rule with 3 c parameters) of equations (22) and equations (24) are determined by the SVD-based LS method as used in [21].
Concluded from equation (17) six fuzzy-partitions (antecedent part of 2 FLSs with 1 input, plus 2 FLSs with 2 inputs are covered by 6 independent fuzzypartitions) are represented by a vector of six times four K a parameters, which are optimized by a multiobjective hybrid genetic algorithm as detailed in [10]. Each chromosome evaluation is extended to include an additional round of nonlinear LSQ optimization of K a parameters. Chromosomes are updated before applying further GA operators, so the GA does not waste time on local optimization; only global search capabilities of the GA are utilized. Three objectives are set for minimization: (1) the root mean square of the torque identification error, (2) the maximum absolute error for any given training data input-output pair, and (3) the condition number of the linear system of equations used for LS calculation of linear parameters.
The GA is set to work on a population of 125, divided into 5 subpopulations with migration rate 0.2 taking place after each 5 completed generations. Crossover rate, generation gap and insertion rate is 0.8, selection pressure is 1.5. In each generation 4% of individuals are subject to mutation, when 1% of the binary genotype is mutated. Individuals, chromosomes are comprised of 24 Gray-coded integers, each consist of 16 bits. The initial population is set up in a completely random manner.
Matrix of the linear equation is pre-processed from equation (3), where FLSs like equation (11) and their partial derivatives like equation (21) are inserted as described in equations (22)- (24). Unknown linear parameters are D 11 , D 12 and the 170 c parameters of fuzzy-rule consequents.
Evaluation of each individual is done as follows: (1) Convert the coded K a values from the chromosome to b k by equation (18).
(3) The pre-processed matrix of the linear equation is evaluated with the MFs. (4) Linear components of equations (11) and (21) are calculated by SVD decomposition as described in [21]. (5) Next the K a parameters are fine-tuned by the Matlab "lsqnonlin" function, (6) MFs are re-calculated for the optimised K a parameters and all linear parameters are recalculated. (7) Resulting optimised K a parameters are re-assigned into the chromosome of the evaluated chromosome.
For the multi-objective rank assignment described in [10], the objective vector is created from: (i) the mean square of the identified torque error, (ii) the maximum absolute torque identification error and (iii) the condition number of the matrix of the linear equation.
Stochastic universal sampling is used for selecting the next generation without explicit elitism. To speed up the GA processing, a database of evaluated chromosomes and their objective vectors is created, so only unique new individuals are evaluated in each generation.
Convergence is achieved in some 50 generation evaluations, when the mean square error is in order of 5e -7 , the maximum torque error is <0.005 Nm. For nondominated chromosomes the condition number of the used matrix of linear equation is in order of 1e +38 .
One typical non-dominated chromosome and the corresponding torque identification error are presented on  The quality of identification with the relative torque error being uniformly <0.1% is excellent, suitable for taking part in a model based control algorithm.
The typical condition number for used linear parameter evaluations is very high for the used training data setup, so a more advanced trajectory has to be planned with sufficient inertia excitation along the complete input domain.
Also the FLS structure is to be made flexible, in terms that the GA should be able to turn off unnecessary MFs and thus reduce the number of unnecessary rules and linear parameters.