Dynamic Neural Network-Based Output Feedback …ncr.mae.ufl.edu/papers/JDSMC17_2.pdf · Dynamic...

Dynamic Neural Network-Based

Output Feedback Tracking Control

for Uncertain Nonlinear Systems

Huyen T. Dinh1

Department of Mechanical Engineering,

University of Transport and Communications,

Hanoi, Vietnam

e-mail: [email protected]

S. BhasinDepartment of Electrical Engineering,

Indian Institute of Technology,

Delhi, India


R. KamalapurkarSchool of Mechanical and Aerospace Engineering,

Oklahoma State University,

Stillwater, OK 74074


W. E. DixonDepartment of Mechanical and Aerospace Engineering,

University of Florida,

Gainesville, FL 32611


A dynamic neural network (DNN) observer-based outputfeedback controller for uncertain nonlinear systems with boundeddisturbances is developed. The DNN-based observer works inconjunction with a dynamic filter for state estimation using onlyoutput measurements during online operation. A sliding modeterm is included in the DNN structure to robustly account forexogenous disturbances and reconstruction errors. Weight updatelaws for the DNN, based on estimation errors, tracking errors,and the filter output are developed, which guarantee asymptoticregulation of the state estimation error. A combination of a DNNfeedforward term, along with the estimated state feedback andsliding mode terms yield an asymptotic tracking result. The devel-oped output feedback (OFB) method yields asymptotic trackingand asymptotic estimation of unmeasurable states for a class ofuncertain nonlinear systems with bounded disturbances. A two-link robot manipulator is used to investigate the performance ofthe proposed control approach. [DOI: 10.1115/1.4035871]

1 Introduction

The problem of output feedback (OFB) tracking control fornonlinear dynamic systems has been a topic of considerable inter-est over the past several decades. Motivation arises from the factthat full access to system states is sometimes impossible in manypractical systems. An obvious method to estimate the unmeasura-ble states is using ad hoc numerical differentiation. The simplicityof this technique makes it particularly useful for implementation.However, if output measurements are noisy, such numericaltechniques will amplify the high frequency content which mayproduce undesired oscillations or even system instability. Other

solutions can be classified as observer-based or filter-basedtechniques that utilize the output information for estimatingunmeasurable states. While observers estimate the output deriva-tive by approximating the system dynamics, filters approximatethe behavior of a differentiator over a range of frequencies.Hence, observer designs need partial or exact model knowledge ofthe system dynamics, whereas filters can provide a model-freemeans of estimating unmeasurable states.

Output feedback controllers using model-based observers weredeveloped in Refs. [1–4], based on the assumption of exact modelknowledge. OFB control for systems with parametric uncertaintieshave been developed in Refs. [5–7]. However, a limitation of suchprevious adaptive OFB control approaches is that only linear-in-the-parameters (LP) uncertainties are considered. As a result, ifuncertainties in the system do not satisfy the LP condition or ifthe system is affected by disturbances, the results will reduce to auniformly ultimately bounded result.

Neural network (NN) and fuzzy logic are employed to compen-sate adaptively for the uncertainties to relax the LP condition as inRefs. [8–14]; however, both estimation and tracking errors areonly guaranteed to be bounded due to the existence of reconstruc-tion errors. A semiglobal asymptotic OFB tracking result forsecond-order dynamic systems, under the condition that uncertaindynamics are first-order differentiable, was developed in Ref. [15]using a novel filter design. All of the uncertain nonlinearities inRef. [15] are damped out by a sliding mode term, so the discontin-uous controller requires high-gain. However, it is not clear how tosimply add a NN-based feedforward estimation of the nonlinear-ities in results such as Ref. [15] to mitigate the high-gain condi-tion, because of the need to inject nonlinear functions of theunmeasurable state. The approach used in this paper avoids thisissue by exploiting the recurrent nature of a dynamic neural net-work (DNN) structure to inject terms that cancel cross terms inthe stability analysis associated with the unmeasurable state.

In this paper and the preliminary work in Ref. [16], a DNN-basedobserver-controller is proposed for uncertain nonlinear systemsaffected by bounded disturbances, to achieve a two-fold result:asymptotic estimation of the unmeasurable states and asymptotictracking control. The uncertain dynamics are assumed to be first-order differentiable. The universal approximation property of DNNsis utilized to approximate the uncertain nonlinear system. A modi-fied version of the filter introduced in Ref. [15] is used to estimatethe output derivative. A combination of a NN feedforward term,along with estimated state feedback and sliding mode terms aredesigned for the controller. The DNN observer adapts online fornonlinear uncertainties and should heuristically perform better thana robust feedback observer. Weight update laws for the DNN basedon the estimation error, tracking error, and filter output are pro-posed. Asymptotic regulation of the estimation error and asymptotictracking are achieved. Experiments on a two-link robot manipulatorshow the effectiveness of the developed method compared with aproportional–integral–derivative (PID) controller and the approachin Ref. [15].

2 System Model and Objectives

Consider a control-affine second-order Euler–Lagrangelikenonlinear system of the form

€x ¼ f ðx; _xÞ þ GðxÞuþ d (1)

where x 2 Rn is the measurable output with a finite-initial condi-tion xð0Þ ¼ x0; u 2 Rn is the control input, f : R2n ! Rn andG : Rn ! Rn�n are continuous functions, and dðtÞ 2 Rn is anexogenous disturbance. The following assumptions about the sys-tem in Eq. (1) will be utilized in the subsequent development.

ASSUMPTION 1. The time derivatives of the system output _x; €xare not measurable.

ASSUMPTION 2. The unknown function f is C1, and the function Gis known, invertible, and the matrix inverse G�1 is bounded.

1Corresponding author.Contributed by the Dynamic Systems Division of ASME for publication in the

JOURNAL OF DYNAMIC SYSTEMS, MEASUREMENT, AND CONTROL. Manuscript receivedMay 3, 2016; final manuscript received January 20, 2017; published online May 10,2017. Assoc. Editor: Yongchun Fang.

Journal of Dynamic Systems, Measurement, and Control JULY 2017, Vol. 139 / 074502-1Copyright VC 2017 by ASME

Downloaded From: http://dynamicsystems.asmedigitalcollection.asme.org/pdfaccess.ashx?url=/data/journals/jdsmaa/936186/ on 08/02/2017 Terms of Use: http://www.asme.org/about-asme/terms-of-use

ASSUMPTION 3. The nonlinear disturbance d and its first timederivative are bounded (i.e., d; _d 2 L1Þ:

The universal approximation property of multilayer NNs(MLNN) states that given any continuous function F : S! Rn,where S is a compact set, there exist ideal weights such that theoutput of the NN, F approximates F to an arbitrary accuracy[17,18]. Hence, the unknown function f in Eq. (1) can be replacedby a MLNN, and the system can be represented as

€x ¼ WTrðVT1 xþ VT

2 _xÞ þ eðx; _xÞ þ Guþ d (2)

where W 2 RNþ1�n; V1;V2 2 Rn�N are unknown ideal constantweight matrices of the MLNN having N hidden layer neurons,

r¢rðVT1 xþ VT

2 _xÞ : R2n ! RNþ1 is the activation function(sigmoid, hyperbolic tangent, etc.), and e 2 Rn is a functionreconstruction error. The following assumptions will be used inthe DNN-based observer and controller development and stabilityanalysis.

ASSUMPTION 4. The ideal NN weights are bounded by knownpositive constants [19], i.e., jjWjj � W ; jjV1jj � V1; jjV2jj � V2.

ASSUMPTION 5. The activation function r and its partialderivatives r0; r00 are bounded [19]. This assumption is satisfiedfor typical activation functions (e.g., sigmoid, hyperbolic tangent).

ASSUMPTION 6. The function reconstruction error and its firsttime derivative are bounded [19], as jjejj � e1; jj_ejj � e2; wheree1; e2 are known positive constants.

A contribution of this paper is the development of a robustDNN-based observer such that the estimated states asymptoticallyconverge to the real states of the system (1), and a discontinuouscontroller enables the system state to asymptotically track adesired time-varying trajectory xd 2 Rn, despite uncertainties anddisturbances in the system. To quantify these objectives, an esti-mation error ~x 2 Rn and a tracking error e 2 Rn are defined as

~x¢x� x; e¢x� xd (3)

where x 2 Rn is the state of the DNN observer which is intro-duced in the subsequent development. The desired trajectory xd

and its derivatives xðiÞd (i ¼ 1; 2Þ; are assumed to exist and be

bounded. To compensate for the lack of direct measurements of _x,a filtered estimation error, res 2 Rn, and a filtered tracking error,rtr 2 Rn, are defined as

res¢ _~x þ a~x þ g; rtr¢ _e þ aeþ g (4)

where a 2 R is a positive constant gain, and g 2 Rn is an outputof the dynamic filter

g ¼ p� ðk þ aÞ~x_p ¼ �ðk þ 2aÞp� � þ ððk þ aÞ2 þ 1Þ~x þ e

_� ¼ p� a� � ðk þ aÞ~x; pð0Þ ¼ ðk þ aÞ~xð0Þ; �ð0Þ ¼ 0

(5)

where � 2 Rn is another output of the filter, p 2 Rn is used as aninternal filter variable, and k 2 R is a positive constant controlgain. The filtered estimation error res and the filtered trackingerror rtr are not measurable since the expressions in Eq. (4) dependon _x:

Remark 1. The basic structure of the second-order dynamic fil-ter in Eq. (5) was first proposed in Eq. [15]. The filter in Eq. (5)admits the estimation error ~x and the tracking error e as its inputsand produces two signal outputs � and g. An interesting point isthat there is a virtual filter inside the introduced filter, where g isthe filtered signal of � since g and � are related as g ¼ _� þ a�.The auxiliary signal p is utilized to only generate the signal gwithout involving the unmeasurable state _x. Hence, the filter canbe physically implemented since it depends only on the estimationerror ~x and the tracking error e which are measurable.

3 Dynamic Neural Network-Based Robust Observer

The following MLDNN architecture is proposed to observe thesystem in Eq. (1)

€x ¼ WTr þ Gu� ðk þ 3aÞgþ b1sgnð~x þ �Þ (6)

where ½xT _xT�T 2 R2n are the states of the DNN observer,

W 2 RNþ1�n, V1; V2 2 Rn�N are the weight estimates,

r¢rðVT

1 x þ VT

2_xÞ : R2n ! RNþ1, and b1 2 R is a positive con-

stant control gain.Remark 2. The term ðk þ 3aÞg in the DNN observer in Eq. (6)

is a cross-term which is canceled in the stability analysis. Thesliding mode term sgnð~x þ �Þ is added to the observer structureto provide robustness against NN reconstruction errors and

external disturbances. The NN term WTr receives feedback of the

observer states x; _x as inputs; hence the observer exploits a DNNstructure. Motivation for the DNN-based observer design is thatthe DNN is proven to approximate nonlinear dynamic systemswith any degree of accuracy [17,20], and the DNN includesstate feedback yielding computational advantages over a staticfeedforward NN [21].

The weight update laws for the DNN in (6) are developed basedon the subsequent stability analysis as

_W ¼ Cwproj½rdð~x þ eþ 2�ÞT�_V 1 ¼ Cv1proj½xdð~x þ eþ 2�ÞTW

Tr0d�

_V 2 ¼ Cv2proj½ _xdð~x þ eþ 2�ÞTWTr0d�

(7)

where Cw 2 RðNþ1Þ�ðNþ1Þ; Cv1;Cv2 2 Rn�n; are constant sym-metric positive-definite adaptation gains, the terms rd; r

0d are

defined as rd¢rðV T

1 xd þ VT

2 _xdÞ; r 0d¢drð1Þ=d1j1¼V

T

1 xdþVT

2 _xd, and

projð�Þ is a smooth projection operator [22,23] used to guarantee

that the weight estimates W ; V1; V 2 remain bounded.To facilitate the subsequent analysis, Eqs. (4) and (5) can be

used to express the time derivative of g as

_g ¼ �ðk þ aÞres � agþ ~x þ e� � (8)

The closed-loop dynamics of the filtered estimation error inEq. (4) can be determined by using Eqs. (2)–(4), (6), and (8) as

_res ¼ WTr� WTr þ eþ d þ ðk þ 3aÞg� b1sgnð~x þ �Þ

þ aðres � a~x � gÞ � ðk þ aÞres � agþ ~x þ e� � (9)

Adding and subtracting WTrd þWTrd þ WTrd , where rd¢r

ðVT1 xd þ VT

2 _xdÞ, the expression in Eq. (9) can be rewritten as

_res ¼ ~N1 þ N � kres � b1sgnð~x þ �Þ þ ðk þ aÞg� ~x (10)

where the auxiliary term ~N1 2 Rn is defined as

~N1¢WTðr� rdÞ � WTðr � rdÞ � ða2 � 2Þ~x � � þ e (11)

and N 2 Rn is segregated into two parts as

N¢ND þ NB (12)

In Eq. (12), ND; NB 2 Rn are defined as

ND¢eþ d; NB¢NB1þ NB2

(13)

074502-2 / Vol. 139, JULY 2017 Transactions of the ASME


In Eq. (13), NB1; NB2

2 Rn are defined as

NB1¢WTOð ~V T

1 xd þ ~VT

2 _xdÞ2 þ ~WTr 0dð ~V

T

1 xd þ ~VT

2 _xdÞ

NB2¢ ~W

Trd þ W

Tr0dð ~V

T

1 xd þ ~VT

2 _xdÞ(14)

where ~W¢W � W 2 RNþ1�n; ~V 1¢V1 � V 1 2 Rn�N ; ~V2¢V2

�V2 2 Rn�N are the estimate mismatches for the ideal NN

weights, and Oð ~VT

1 xd þ ~VT

2 _xdÞ2 2 RNþ1 is the higher-order termin the Taylor series of the vector functions rd in the neighborhood

of VT

1 xd þ VT

2 _xd as

rd ¼ rd þ r 0dð ~VT

1 xd þ ~VT

2 _xdÞ þ Oð ~VT

1 xd þ ~VT

2 _xdÞ2 (15)

Motivation for segregating the terms in Eqs. (10), (12), and (13) isderived from the fact that different terms have different bounds.The term ~N1 includes all terms which can be upper bounded bystates, whereas N includes all terms which can be upper boundedby constants. The difference between the terms ND and NB inEq. (12) is that the first time derivative of ND is upper-bounded bya constant, whereas the term _NB is state dependent. The term NB

is further segregated as Eq. (13) to aid in the weight update lawdesign for the DNN in Eq. (7). In the subsequent stability analysis,the term NB1

is canceled by the error feedback and the slidingmode term, while the term NB2

is partially compensated for by theweight update laws and partially canceled by the error feedbackand the sliding mode term.

Using Eqs. (3), (4), and Assumptions 4; 5; the projð�Þ algorithmin Eq. (7) and the mean value theorem [24], the auxiliary function~N1 in Eq. (11) can be upper-bounded as

jj ~N1jj � f1jjzjj (16)

where f1 2 R is a computable positive constant, and z 2 R6n isdefined as

z¢½~xT eT rTes rT

tr �T gT�T (17)

Based on Assumptions 3–6, the Taylor series expansion inEq. (15), and the weight update laws in Eq. (7), the followingbounds can be developed:

jjNDjj � f2; jjNB1jj � f3; jjNB2

jj � f4

jj _NDjj � f5; jj _NBjj � f6 þ f7jjzjj(18)

where fi 2 R; i ¼ 2; 3;…; 7; are computable positive constants.

4 Robust Adaptive Tracking Controller

The control objective is to force the system state to asymptoti-cally track the desired trajectory xd, despite uncertainties anddisturbances in the system. Quantitatively, the objective is to reg-ulate the filtered tracking controller rtr to zero. Using Eqs. (2)–(4)and (8), the open-loop dynamics of the tracking error in Eq. (4)are expressed as

_r tr ¼ WTrþ Guþ eþ d � €xd þ aðrtr � ae� gÞ�ðk þ aÞres � agþ ~x þ e� � (19)

The control input u is designed as a composition of theDNN term, the estimated states x; _x, and the sliding mode term as

u ¼ G�1½€xd � WTrd � ðk þ aÞð _e þ aeÞ � b2sgnðeþ �Þ� (20)

where b2 2 R is a positive constant control gain, and the trackingerror estimate e 2 Rn is defined as e¢x � xd: Based on the factthat the estimated states are measurable, the tracking error esti-mate e and its derivative _e are measurable; moreover, rtr is relatedto res as

rtr ¼ res þ _e þ ae (21)

Adding and subtracting WTrd þWTrd and using Eqs. (19)–(21),the closed-loop error system becomes

_r tr ¼ ~N2 þ N � krtr � b2sgnðeþ �Þ � e (22)

where the auxiliary function ~N2 2 Rn is defined as

~N2¢WTðr� rdÞ � ða2 � 2Þe� � þ ~x � 2ag (23)

and the function N is introduced in Eq. (12). Similarly, usingEqs. (3), (4), Assumptions 4; 5; the projð�Þ algorithm in Eq. (7),and mean value theorem [24], the auxiliary function ~N2 inEq. (23) can be upper-bounded as

jj ~N2jj � f8jjzjj (24)

where f8 2 R is a computable positive constant.To facilitate the subsequent stability analysis, let y 2 R6nþ2 be

defined as y¢½zTffiffiffiPp ffiffiffiffi

Qp�T where the auxiliary function

P 2 R is the Filippov solution to the differential equation

_P¢L

P0 ¼ b1

Xn

j¼1

j~xjð0Þ þ �jð0Þj þ b2

Xn

j¼1

jejð0Þ þ �jð0Þj

� ð~xð0Þ þ eð0Þ þ 2�ð0ÞÞTNð0Þ (25)

where the subscript j ¼ 1; 2; ::; n denotes the jth element of ~xð0Þ,e(0), or �ð0Þ, and the auxiliary term L 2 R is defined as

L¢� rTesðND þ NB1

� b1sgnð~x þ �ÞÞ� rT

trðND þ NB1� b2sgnðeþ �ÞÞ

� ð _~x þ _e þ 2 _�ÞTNB2þ b3jjzjj2 (26)

where b1; b2 are introduced in Eqs. (6) and (20), and b3 2 R is apositive constant. The control gains bi; i ¼ 1; 2; 3 are selectedaccording to the sufficient conditions

b1;b2 > max f2 þ f3 þ f4; f2 þ f3 þf5

aþ f6

a

� �(27)

b3 > 2f7

where fi, i ¼ 1; 2;…; 7 are introduced in Eqs. (16) and (18).Provided the sufficient conditions in Eq. (27) are satisfied, thefollowing inequality can be obtained P � 0 (see Ref. [25]). Theauxiliary function Q 2 R is defined as

Q¢a2

tr ~WTC�1

w~W

� �þ tr ~V

T

1 C�1v1

~V1

� �þ tr ~V

T

2 C�1v2

~V2

� �h i(28)

where trð�Þ denotes the trace of a matrix. Since the gainsCw;Cv1;Cv2 are symmetric, positive-definite matrices, Q � 0:

5 Lyapunov Stability Analysis for Dynamic Neural

Network-Based Observation and Control

THEOREM 1. The DNN-based observer and controller proposedin Eqs. (6) and (20), respectively, along with the weight updatelaws in Eq. (7) ensure asymptotic estimation and tracking in sensethat

jj _~xjj ! 0 as t!1; and jjejj ! 0 as t!1

provided the gain conditions in Eq. (27) are satisfied, and the con-trol gains a and k ¼ k1 þ k2 introduced in Eqs. (4) and (5) areselected as

Journal of Dynamic Systems, Measurement, and Control JULY 2017, Vol. 139 / 074502-3


k¢min a; k1ð Þ > f21 þ f2

8

4k2

þ b3 (29)

where f1; f8; b3 are introduced in Eqs. (16), (24), and (26),respectively.

Proof. Consider the Lyapunov candidate function VL : D ! R;which is a Lipschitz continuous positive definite functiondefined as

VL¢1

2~xT~x þ 1

2eTeþ 1

2�T� þ 1

2gTgþ 1

2rT

esres þ1

2rT

trrtr þ Pþ Q

(30)

which satisfies the following inequalities:

1

2jjyjj2 � VL � jjyjj2 (31)

Let _y ¼ h represent the closed-loop differential equations in

Eqs. (4)–(7), (8), (10), (22), and (25), where h 2 R6nþ2 denotesthe right-hand side of the closed-loop error signals. UsingFilippov’s theory of differential inclusions [26–29], the existenceof solutions can be established for _y 2 K½h�ðyÞ, where

K½h�¢\d>0 \lM¼0 cohðBðy; dÞ �MÞ; where \lM¼0 denotes theintersection of all sets M of Lebesgue measure zero, co denotes

convex closure, and Bðy; dÞ ¼ fw 2 R6nþ2jjjy� wjj < dg: Thegeneralized time derivative of Eq. (30) exists almost everywhere

(a.e.), i.e., for almost all t 2 ½t0; tf �, and _VL2a:e: _~VL, where_~V L ¼ \n2@VLðyÞ n

TK½W�T; @VL is the generalized gradient of VL

[30], and W¢½ _~xT_eT _�T _gT _rT

es _rTtrð1=2ÞP�ð1=2Þ _Pð1=2ÞQ�ð1=2Þ _Q�.

Since VL is continuously differentiable, _~V L can be simplifiedas [31]

_~V L ¼ rVTKWT ¼ ~xT eT �T gT rTes rT

tr 2P12 2Q

12

h iKWT

Using the calculus for K½�� from Ref. [32] (Theorem 1; Properties2, 5, 7), and substituting the dynamics from Eqs. (4), (5), (8), (10),

(22), (25), (26), and (28) ; _~VL can be rewritten as

_~V L � ~xTðres � a~x � gÞ þ eTðrtr � ae� gÞþ gT½�ðk þ aÞres � agþ ~x þ e� ��þ �Tðg� a�Þ þ rT

esfðk þ aÞg� ~xgþ rT

esf ~N1 þ N � kres � b1K½sgnð~x þ �Þ�gþ rT

trf ~N2 þ N � krtr � b2K½sgnðeþ �Þ� � eg� rT

esfND þ NB1� b1K½sgnð~x þ �Þ�g

� rTtrfND þ NB1

� b2K½sgnðeþ �Þ�g þ b3jjzjj2

� ð _~x þ _e þ 2 _�ÞTNB2� atrð ~W

TC�1

w_WÞ

� atrð ~VT

1 C�1v1

_V1Þ � atrð ~V T

2 C�1v2

_V 2Þ (32)

Using the fact that K½sgnð~x þ �Þ� ¼ SGNð~x þ �Þ [32], such thatSGNð~xi þ �iÞ ¼ 1 if ð~xi þ �iÞ > 0; ½�1; 1� if ð~xi þ �iÞ ¼ 0; and�1 if ð~xi þ �iÞ < 0 (the subscript i denotes the ith element), the setin Eq. (32) reduces to the scalar inequality, since the right-handside is continuous a.e., i.e., the right-hand side is continuous except

for the Lebesgue measure zero set of times when1 rTesSGNð~x þ �Þ

�rTesSGNð~x þ �Þ ¼ 0. The fact that rT

trSGNðeþ �Þ � rTtr SGN

ðeþ �Þ ¼ 0 can be achieved similarly. Substituting the weightupdate laws in Eq. (7) and canceling common terms yields

_~VL �a:e:�a~xT~x � aeTe� a�T� � agTg� krT

esres

� krTtrrtr þ rT

es~N1 þ rT

tr~N2 þ b3jjzjj2 (33)

Using Eqs. (16) and (24), substituting k ¼ k1 þ k2; and complet-ing the squares, the expression in Eq. (33) can be furtherbounded as

_~VL �a:e:�ajj~xjj2 � ajjejj2 � ajj�jj2 � ajjgjj2 � k1jjresjj2

� k1jjrtrjj2 þf2

1 þ f28

4k2

þ b3

!jjzjj2

�a:e:� k� f2

1 þ f28

4k2

� b3

!jjzjj2 �

a:e:�cjjzjj2 (34)

for some positive constant c, and k is defined in Eq. (29). Theinequalities in Eqs. (31) and (34) show that VL 2 L1; hence,~x; e; �; g; res; rtr; P, and Q 2 L1. Using Eq. (4), it can be shownthat _~x; _e 2 L1: Based on the assumption that xd; _xd 2 L1; ande; _e 2 L1; x; _x 2 L1 by Eq. (3); moreover, using Eq. (3) and~x; _~x 2 L1; x; _x 2 L1: Based on Assumptions 2 and 5, the pro-jection algorithm in Eq. (7), the boundedness of the sgn and rfunctions, and xd; _xd; x; _x 2 L1, the control input u is boundedfrom Eq. (20). Similarly, _�; _g; _res; _r tr 2 L1 by using Eqs. (5),(8), (9), (22); hence _z 2 L1; using Eq. (17); hence, z is uniformlycontinuous. From Eq. (34), Ref. [33, Corollary 1] can be invokedto show that cjjzjj2 ! 0 as t!1: Using the definition of z inEq. (17), it can be shown that jj~xjj; jjejj; jjresjj; jjrtrjj; jj�jj; jjgjj! 0 as t!1: Using Eq. (4), and standard linear analysis, it canbe further shown that jj _~xjj ! 0 as t!1: �

6 Experiment Results

The performance of the output feedback control method istested on a two-link robot manipulator, where two aluminumlinks are mounted on a 240 N�m (first link) and a 20 N�m (secondlink) switched reluctance motor. The motor resolvers providerotor position measurements with a resolution of 614,400 pulses/revolution. Data acquisition and control implementation were per-formed in real-time using QNX at a frequency of 1.0 kHz. Thetwo-link revolute robot is modeled with the following dynamics:

M€x þ Vm _x þ Fþ sd ¼ u (35)

where x ¼ ½ x1 x2 �T are the angular positions (rad), _x ¼½ _x1 _x2 �T are the angular velocities ðrad=sÞ of the two links,

respectively, M 2 R2�2 is the inertia matrix, Vm 2 R2�2 denotes

the centripetal-Coriolis matrix, F 2 R2 denotes friction, and

sd 2 R2 is the external disturbance. The system in Eq. (35) can be

rewritten as €x ¼ f þ Guþ d; where f and G are defined as f¢�M�1ðVm _x þ FÞ; GðxÞ¢M�1: The desired trajectory for each linkof the manipulator is given as (in degrees) x1d ¼ 30 sinð1:5tÞð1� expð�0:01t3ÞÞ; x2d ¼ 30 sinð2:0tÞð1� expð�0:05t3ÞÞ: Thecontrol gains are chosen as k ¼ diagð25; 90Þ; a ¼ diagð22; 30Þ,b1 ¼ b2 ¼ 0:2; and Cw ¼ 0:2I8�8; Cv1 ¼ Cv2 ¼ 0:2I2�2, whereIn�n denotes an identity matrix of appropriate dimensions. TheNNs was implemented with seven hidden layer neurons, and theneural network weights are initialized as uniformly distributedrandom numbers in the interval ½0:1; 0:3�. The initial conditions of

the system and the observer were selected as x ¼ _x ¼ ½0 0�T, and

x ¼ _x ¼ ½0 0�T, respectively.The Lyapunov-based analysis provides conservative sufficient

gain conditions. The control gains for the experiments wereobtained by choosing gains and then adjusting them based on thetransient and steady-state performance. If the response exhibited a

1Let U¢~x þ � The set of times K¢ft 2 ½0;1Þ : resðtÞT K½sgnðUðtÞÞ��resðtÞT K½sgnðUðtÞÞ� 6¼ f0gg is equal to the set of times ft : UðtÞ ¼ 0 � resðtÞ 6¼ 0gUsing the fact that g ¼ _� þ a�, res can be expressed as res ¼ _U þ aU Thus, the setK can also be represented by ft : UðtÞ ¼ 0 � _UðtÞ 6¼ 0g Since / : ½0;1Þ ! Rn iscontinuously differentiable, it can be shown that the set of time instancesft : UðtÞ ¼ 0 � _UðtÞ 6¼ 0g is isolated, and thus, measure zero; hence, K is measurezero.



prolonged transient response (compared with the response obtainedwith other gains), the proportional gains were adjusted. If theresponse exhibited overshoot, derivative gains were adjusted. Thecontrol gains for the experiments were tuned based on this trial anderror basis. In contrast to this trial and error approach, the controlgains could have been adjusted using more methodical approachesas described in various survey papers on the topic [34,35].

The performance of the proposed output feedback controller iscompared with two controllers: a classical PID controller and thediscontinuous OFB controller in Ref. [15]. A standard backward dif-ference algorithm is used to numerically determine velocity fromthe encoder readings to implement the PID controller. Control gainsfor the discontinuous controller in Ref. [15] were selected asK1 ¼ 0:2; K2 ¼ diagð410; 38Þ, and control gains for the PID con-troller were selected as Kd ¼ diagð120; 30Þ; Kp ¼ diagð750; 90Þ;and Ki ¼ diagð650; 100Þ: The DNN-based observer yields bettervelocity estimation in comparison with the high frequency contentresults from a backward difference method as depicted in Fig. 1.Moreover, the tracking errors and control torques for all controllersare illustrated in Figs. 2 and 3, respectively. Table 1 shows the root-mean-square (RMS) and peak tracking errors and torques of Links 1and 2 at steady-state for all methods. The developed controller isshown to exhibit lower tracking errors with less control authoritythan the comparative controllers. Hence, the experiments illustratethat using the velocity estimation from a DNN-based observer,which adaptively compensates for unknown uncertainties in the

system, results in improved control performance with lower fre-quency content than the compared methods. To illustrate the lowerfrequency response of the proposed method compared to Ref. [15]and the PID controller, the frequency analysis plots of the experi-ment results are shown in Fig. 4. The high frequency content in thevelocity estimation of the backward difference method results in thehighest frequency content in control torques of PID controller. Theproposed method is a robust adaptive controller with a DNN struc-ture to learn the system uncertainties to asymptotically observe theunmeasurable state and asymptotically track the desired trajectory.On the other hand, the OFB control method in Ref. [15] is a purelyrobust feedback method, where all uncertainties are damped out bya sliding mode term resulting in higher frequency control torquesthan the proposed adaptive control method, as seen in experimentresults.

7 Conclusion

A DNN observer-based output feedback control of a class ofsecond-order nonlinear uncertain systems is developed. TheDNN-based observer works in conjunction with a dynamic filterto estimate the unmeasurable state. The DNN is updated online byweight update laws based on the estimation error, tracking error,and filter output. The controller is a combination of the NN feed-forward term, and the estimated state feedback and sliding modeterms. Asymptotic estimation of the unmeasurable state and

Fig. 1 Velocity estimation _x ðtÞ using (a) DNN-based observer and (b) numerical backwards difference: (a) velocity estima-tion by DNN observer and (b) velocity estimation by backwards difference

Fig. 2 The tracking errors e(t) of (a) link 1 and (b) link 2 using classical PID, robust discontinuous OFB controller [15], andproposed controller: (a) link 1 tracking error and (b) link 2 tracking error



asymptotic tracking results are achieved, simultaneously. Resultsfrom an experiment using a two-link robot manipulator demon-strate the performance of the proposed output feedback controller.

Acknowledgment

This research was supported in part by NSF award numbers0547448, 0901491, 1161260, and 1217908.

References[1] Berghuis, H., and Nijmeijer, H., 1993, “A Passivity Approach to Controller-

Observer Design for Robots,” IEEE Trans. Robot. Autom., 9(6), pp. 740–754.

[2] Do, K. D., Jiang, Z., and Pan, J., 2004, “A Global Output-Feedback Controllerfor Simultaneous Tracking and Stabilization for Unicycle-Type MobileRobots,” IEEE Trans. Robot. Autom., 20(3), pp. 589–594.

[3] Lim, S. Y., Dawson, D. M., and Anderson, K., 1996, “Re-Examining theNicosia–Tomei Robot Observer-Controller for a Backstepping Perspective,”IEEE Trans. Control Syst. Technol., 4(3), pp. 304–310.

[4] Ordaz, P., Espinoza, E. S., and Munoz, F., 2014, “Global Stability of PDþController With Velocity Estimation,” 53rd IEEE Conference on Decision andControl, Dec. 15–17, pp. 2585–2590.

[5] Arteaga, M. A., and Kelly, R., 2004, “Robot Control Without Velocity Meas-urements: New Theory and Experiment Results,” IEEE Trans. Robot. Autom.,20(2), pp. 297–308.

[6] Kaneko, K., and Horowitz, R., 1997, “Repetitive and Adaptive Control ofRobot Manipulators With Velocity Estimation,” IEEE Trans. Robot. Autom.,13(2), pp. 204–217.

Fig. 3 The control inputs for link 1 and link 2 using (a), (d) classical PID, (b), (e) robust discontinuous OFB controller [15],and (c), (f) proposed controller: (a) link 1 control input (PID controller), (b) link 1 control input (robust OFB controller), (c) link1 control input (proposed), (d) link 2 control input (PID controller), (e) link 2 control input (robust OFB controller), and (f) link2 control input (proposed)

Table 1 Steady-state RMS errors and torques for each of the analyzed control designs

SSRMS e1 SSRMS e2 Max je1j Max je2j SSRMS s1 SSRMS s2 Max js1j Max js2j

Classical PID 0.4538 0.2700 0.7371 0.5267 6.5805 2.4133 14.5871 9.0015Robust OFB [15] 0.3552 0.2947 0.5819 0.6429 8.6509 1.2585 56.5796 4.6107Proposed 0.1743 0.1740 0.3100 0.3760 6.3484 0.6944 12.5562 2.2122

Fig. 4 Frequency analysis of torques u(t) using (a) classical PID and (b) robust discontinuous OFB controller [15], and (c) pro-posed controller: (a) frequency analysis (PID controller), (b) frequency analysis (robust OFB controller), and (c) frequencyanalysis (proposed)



http://dx.doi.org/10.1109/70.265918

http://dx.doi.org/10.1109/TRA.2004.825470

http://dx.doi.org/10.1109/87.491205

http://dx.doi.org/10.1109/CDC.2014.7039784

http://dx.doi.org/10.1109/TRA.2003.820872

http://dx.doi.org/10.1109/70.563643

[7] Burg, T., Dawson, D. M., and Vedagarbha, P., 1997, “A Redesigned Dcal Con-troller Without Velocity Measurements: Theory and Demonstration,” Robotica,15(4), pp. 337–346.

[8] Kim, Y. H., and Lewis, F. L., 1999, “Neural Network Output Feedback Controlof Robot Manipulators,” IEEE Trans. Robot. Autom., 15(2), pp. 301–309.

[9] Patino, H. D., and Liu, D., 2000, “Neural Network-Based Model ReferenceAdaptive Control System,” IEEE Trans. Syst., Man, Cybern., Part B, 30(1),pp. 198–204.

[10] Seshagiri, S., and Khalil, H., 2000, “Output Feedback Control of NonlinearSystems Using RBF Neural Networks,” IEEE Trans. Neural Network, 11(1),pp. 69–79.

[11] Choi, J. Y., and Farrell, J. A., 2001, “Adaptive Observer Backstepping ControlUsing Neural Network,” IEEE Trans. Neural Network, 12(5), pp. 1103–1112.

[12] Calise, A. J., Hovakimyan, N., and Idan, M., 2001, “Adaptive Output FeedbackControl of Nonlinear Systems Using Neural Networks,” Automatica, 37(8),pp. 1201–1211.

[13] Hovakimyan, N., Nardi, F., Calise, A., and Kim, N., 2002, “Adaptive OutputFeedback Control of Uncertain Nonlinear Systems Using Single-Hidden-LayerNeural Networks,” IEEE Trans. Neural Networks, 13(6), pp. 1420–1431.

[14] Islam, S., and Liu, P., 2011, “Robust Adaptive Fuzzy Output Feedback ControlSystem for Robot Manipulators,” IEEE/ASME Trans. Mechatron., 16(2),pp. 288–296.

[15] Xian, B., de Queiroz, M. S., Dawson, D. M., and McIntyre, M., 2004, “A Dis-continuous Output Feedback Controller and Velocity Observer for NonlinearMechanical Systems,” Automatica, 40(4), pp. 695–700.

[16] Dinh, H. T., Bhasin, S., Kim, D., and Dixon, W. E., 2012, “Dynamic NeuralNetwork-Based Global Output Feedback Tracking Control for UncertainSecond-Order Nonlinear Systems,” American Control Conference, Montr�eal,Canada, June 27–29, pp. 6418–6423.

[17] Polycarpou, M., and Ioannou, P., 1991, “Identification and Control ofNonlinear Systems Using Neural Network Models: Design and StabilityAnalysis,” Systems Report, University of Southern California, Los Angeles,CA, Tech. Report No. 91-09-01.

[18] Hornick, K., 1991, “Approximation Capabilities of Multilayer FeedforwardNetworks,” Neural Networks, 4(2), pp. 251–257.

[19] Lewis, F. L., Selmic, R., and Campos, J., 2002, Neuro-Fuzzy Control of Indus-trial Systems With Actuator Nonlinearities, Society for Industrial and AppliedMathematics, Philadelphia, PA.

[20] Funahashi, K., and Nakamura, Y., 1993, “Approximation of Dynamic Systems byContinuous-Time Recurrent Neural Networks,” Neural Networks, 6(6), pp. 801–806.

[21] Gupta, M., Jin, L., and Homma, N., 2003, Static and Dynamic Neural Net-works: From Fundamentals to Advanced Theory, Wiley, Hoboken, NJ.

[22] Dixon, W. E., Behal, A., Dawson, D. M., and Nagarkatti, S., 2003, NonlinearControl of Engineering Systems: A Lyapunov-Based Approach, Birkhauser,Boston, MA.

[23] Krstic, M., Kokotovic, P. V., and Kanellakopoulos, I., 1995, Nonlinear andAdaptive Control Design, Wiley, New York.

[24] Xian, B., Dawson, D. M., de Queiroz, M. S., and Chen, J., 2004, “A ContinuousAsymptotic Tracking Control Strategy for Uncertain Nonlinear Systems,” IEEETrans. Autom. Control, 49(7), pp. 1206–1211.

[25] Dinh, H., Bhasin, S., and Dixon, W. E., 2010, “Dynamic Neural Network-BasedRobust Identification and Control of a Class of Nonlinear Systems,” IEEE Con-ference on Decision Control, Atlanta, GA, Dec. 15–17, pp. 5536–5541.

[26] Filippov, A., 1964, “Differential Equations With Discontinuous Right-HandSide,” Am. Math. Soc. Transl., 42(2), pp. 199–231.

[27] Filippov, A. F., 1988, Differential Equations With Discontinuous Right-HandSides, Kluwer Academic Publishers, Dordrecht, The Netherlands.

[28] Smirnov, G. V., 2002, Introduction to the Theory of Differential Inclusions,Vol. 41, American Mathematical Society, Providence, RI.

[29] Aubin, J. P., and Frankowska, H., 2009, Set-Valued Analysis, Birkh€auser, Bos-ton, MA.

[30] Clarke, F. H., 1990, Optimization and Nonsmooth Analysis, SIAM, New York.[31] Shevitz, D., and Paden, B., 1994, “Lyapunov Stability Theory of Nonsmooth

Systems,” IEEE Trans. Autom. Control, 39(9), pp. 1910–1914.[32] Paden, B., and Sastry, S., 1987, “A Calculus for Computing Filippov’s Differ-

ential Inclusion With Application to the Variable Structure Control of RobotManipulators,” IEEE Trans. Circuits Syst., 34(1), pp. 73–82.

[33] Fischer, N., Kamalapurkar, R., and Dixon, W. E., 2013, “Lasalle-YoshizawaCorollaries for Nonsmooth Systems,” IEEE Trans. Automat. Control, 58(9), pp.2333–2338.

[34] Killingsworth, N. J., and Krstic, M., 2006, “PID Tuning Using ExtremumSeeking: Online, Model-Free Performance Optimization,” IEEE Control Syst.Mag., 26(1), pp. 70–79.

[35] Astr€om, K., H€agglund, T., Hang, C., and Ho, W., 1993, “Automatic Tuning andAdaptation for PID Controllers—A Survey,” Control Eng. Pract., 1(4),pp. 669–714.



http://dx.doi.org/10.1017/S0263574797000386

http://dx.doi.org/10.1109/70.760351

http://dx.doi.org/10.1109/3477.826961

http://dx.doi.org/10.1109/72.822511

http://dx.doi.org/10.1109/72.950139

http://dx.doi.org/10.1016/S0005-1098(01)00070-X

http://dx.doi.org/10.1109/TNN.2002.804289

http://dx.doi.org/10.1109/TMECH.2010.2041464

http://dx.doi.org/10.1016/j.automatica.2003.12.007

http://ieeexplore.ieee.org/abstract/document/6315234/

https://pdfs.semanticscholar.org/4b72/5bc82249baf6149c7c9c7aa4240f29a7fcc8.pdf

http://dx.doi.org/10.1016/0893-6080(91)90009-T

http://dx.doi.org/10.1137/1.9780898717563

http://dx.doi.org/10.1137/1.9780898717563

http://dx.doi.org/10.1016/S0893-6080(05)80125-X

http://dx.doi.org/10.1002/0471427950

http://dx.doi.org/10.1002/0471427950

http://dx.doi.org/10.1007/978-1-4612-0031-4

http://dx.doi.org/10.1007/978-1-4612-0031-4

http://dx.doi.org/10.1109/TAC.2004.831148


http://dx.doi.org/10.1109/CDC.2010.5717445

http://dx.doi.org/10.1090/trans2/042/13

http://dx.doi.org/10.1007/978-0-8176-4848-0

http://dx.doi.org/10.1137/1.9781611971309

http://dx.doi.org/10.1109/9.317122

http://dx.doi.org/10.1109/TCS.1987.1086038


http://dx.doi.org/10.1109/MCS.2006.1580155

http://dx.doi.org/10.1109/MCS.2006.1580155

http://dx.doi.org/10.1016/0967-0661(93)91394-C

Dynamic Neural Network-Based Output Feedback …ncr.mae.ufl.edu/papers/JDSMC17_2.pdf · Dynamic...

Documents

Transcript of Dynamic Neural Network-Based Output Feedback …ncr.mae.ufl.edu/papers/JDSMC17_2.pdf · Dynamic...