LEARNING ALGORITHMS for SERVO- MECHANISM TIME SUBOPTIMAL CONTROL


1 - Time Optimal Control - Switching Function (SwF)
2 - Sliding Mode Control (SMC), Adaptive Sliding Mode Control
3 - Learning Control (LC) based on SMC - approximation of SwF
4 - LC based on Neural Nets - quasi real time computation
5 - LC based on Identification - real time computation of SwF
6 - Real Time Simulation

M. Alexík, University of Žilina, Slovak Republic

[Figure: laboratory model of the servomechanism. Labels: cart with variable load, (6+1) x 0.6 kg; time and position display; DC drive with gear; hand control; communication with PC (RS 232); load.]

GOAL: Derivation of a time optimal control algorithm for a servomechanism with variable load.

„Time Optimal (feedback) Control“ - „Sliding Mode Control“ - estimation of the switching function (switching curved line, or an approximation: only a line, or a polynomial).

For a variable, unknown load of the servomechanism, time suboptimal control requires a learning algorithm that looks for the switching function (curved line, line).

Problem: Nonlinearities - variable friction, two springs - insensitivity in the output variable.

Laboratory Model of Servomechanism

[Figure labels: spring, load.]

M. Alexík, KEGA,06- 08, Žilina, Sept. 2008

µP Atmel

Physical Model of Servomechanism - real time simulation

Weights (circular mechanism) | Weights (cart) | Colour | Tem [s] | Km
6 | 1 | Brown  | 2.0  | 0.034
4 | 1 | Green  | 1.75 | 0.046
2 | 1 | Purple | 1.5  | 0.058
0 | 1 | Blue   | 1.0  | 0.085

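State model of the servomechanism:

$$\dot{\mathbf{x}}(t) = \mathbf{A}\,\mathbf{x}(t) + \mathbf{b}\,u(t), \qquad \mathbf{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} e(t) \\ \dot{e}(t) \end{bmatrix}, \qquad e(t) = w - y(t)$$

m = weights (changeable), b = coefficient of friction (changeable); then Km and Tem are also changeable:

$$S(s) = \frac{K_m}{s\,(T_{em}\,s + 1)}, \qquad K_m = \frac{1}{b}, \qquad T_{em} = \frac{m}{b}$$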
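As a small worked example of the relations Km = 1/b and Tem = m/b, the sketch below inverts them to obtain the equivalent friction coefficient and mass for a given pair (Km, Tem). The function name `friction_and_mass` is hypothetical, and the sample values Km = 0.085, Tem = 1.0 s are used only for illustration; the units of b and m depend on how Km is scaled, which is an assumption here.

```python
def friction_and_mass(km, tem):
    """Invert Km = 1/b and Tem = m/b (hypothetical helper, not from the slides)."""
    b = 1.0 / km      # friction coefficient
    m = tem * b       # equivalent mass, since Tem = m / b
    return b, m


if __name__ == "__main__":
    # Illustrative values only: Km = 0.085, Tem = 1.0 s.
    b, m = friction_and_mass(0.085, 1.0)
    print(f"b = {b:.3f}, m = {m:.3f}")
```

Doubling the mass at constant friction doubles Tem and leaves Km unchanged, so a change of load alone shows up mainly in the time constant.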

[Figure: controller output, drawn on a 20 times reduced scale, plotted against L [m]. Umax = 5 [V], Umin = -5 [V].]

D/A converter: pulse modulation of the action variable u(k), with u(k)max = 5 [V], u(k)min = -5 [V].

Why do we need hysteresis on the controller output? The controller output has to be free of oscillation (zero) in steady state. The price is a small steady-state control error, which depends on the controller output, the sampling interval and the plant dynamics. Under good conditions the transient state is also free of oscillation.

Time Optimal Responses - digital simulation: hysteresis (insensitivity, dead zone) on the controller output

[Figure: responses versus t [s]; annotations mark the effects that come from the hysteresis on the controller output.]

Hysteresis in these simulation examples: deS = (-0.05, 0.05).

Analog model + real time hardware-in-the-loop simulation

Sampling interval: 5, 10, 20 [ms]. Problem with interrupts: DOS, Linux, W98, XP.

Position measurement: 1 m = 2600 impulses, 1 impulse = 0.384 mm.

Speed measurement: 0.1 m/s = 260 imp/s = 1.3 imp per 5 ms.
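The arithmetic behind these figures, and the reason speed is hard to measure at short sampling intervals, can be sketched as follows. This is an illustration only: the constant 2600 impulses per metre comes from the slide, while the function names and the difference-based speed estimate are assumptions.

```python
PULSES_PER_METRE = 2600  # from the slide: 1 m = 2600 impulses


def position_resolution_mm():
    """Length of one impulse in millimetres."""
    return 1000.0 / PULSES_PER_METRE


def pulses_per_sample(speed_m_s, t0_s):
    """Average number of impulses counted in one sampling interval."""
    return speed_m_s * PULSES_PER_METRE * t0_s


def speed_resolution_m_s(t0_s):
    """Smallest non-zero speed step of a difference-based estimate."""
    return 1.0 / (PULSES_PER_METRE * t0_s)


def quantized_speed(x_prev_m, x_now_m, t0_s):
    """Speed from two quantized position samples (backward difference)."""
    n_prev = round(x_prev_m * PULSES_PER_METRE)
    n_now = round(x_now_m * PULSES_PER_METRE)
    return (n_now - n_prev) / (PULSES_PER_METRE * t0_s)


if __name__ == "__main__":
    print(position_resolution_mm())          # about 0.385 mm per impulse
    print(pulses_per_sample(0.1, 0.005))     # about 1.3 impulses per 5 ms
    print(speed_resolution_m_s(0.005))       # about 0.077 m/s per impulse
```

At 5 ms one impulse of difference already corresponds to roughly 0.077 m/s, so a true speed of 0.1 m/s is seen as either one or two impulses per sample. A longer sampling interval reduces this quantization step at the cost of a slower loop.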

[Figure: time optimal response in real time simulation. Axes: x1 - position [mm] and u(k) - control output [V] versus time [s]. Curves: 1 - control trajectory; 2 - control output 5 [V]; 3 - controlled variable; 4 - set point w = 400 [mm]. Settling time = 3.75 [s].]

Time Optimal Responses - real time simulation: speed measurement problem

[Figure panels: sampling interval 5 [ms], no filter, no noise; sampling interval 20 [ms].]

Remedy: add a special noise signal to the measured position to eliminate the speed quantization error, and filter afterwards. Alternatively, use a state observer that reconstructs both position and speed (see later).

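Optimal responses and trajectories

Time optimal control problem:

$$J^{*} = \min_{u(t)} J = \min_{u(t)} \int_{t_0}^{t_k} dt = \min_{u(t)} (t_k - t_0)$$

$$\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t), \qquad \mathbf{x}(t_0) = \mathbf{x}_0, \qquad \mathbf{x}(t_N) = \mathbf{x}_N$$

State model of the servomechanism:

$$\dot{\mathbf{x}}(t) = \mathbf{A}\,\mathbf{x}(t) + \mathbf{b}\,u(t), \qquad \mathbf{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} e(t) \\ \dot{e}(t) \end{bmatrix}, \qquad e(t) = w - y(t)$$

$$\mathbf{A} = \begin{bmatrix} 0 & 1 \\ 0 & -1/T_{em} \end{bmatrix}, \qquad \mathbf{b} = \begin{bmatrix} 0 \\ K_m/T_{em} \end{bmatrix}, \qquad \mathbf{c} = \begin{bmatrix} 1 & 0 \end{bmatrix}$$

$$S(s) = \frac{K_m}{s\,(T_{em}\,s + 1)}, \qquad K_m = \frac{1}{b}, \qquad T_{em} = \frac{m}{b}$$

m = mass (variable), b = coefficient of friction (variable); then Km and Tem are also variable.

Cp = e(t) / [e'(t)], e'(t) = d/dt[e(t)]

Cp = optimal slope of the switching line

Cp3 = x1,3(Cp) / x2,3(Cp) = tg(α3,p)

[Figure: optimal responses y versus t [s] (second axis L [m]) and the corresponding phase trajectories, x2 = e'(t) [rad/s] versus x1(t) = e(t) [rad], with the switching line slopes Cp1, Cp2, Cp3 and the angle α3,p. Curves: 1 - nominal Jm (T1, K1); 2 - J = 5*Jm (T2, K2); 3 - J = 10*Jm (T3, K3).]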


Optimal trajectories and switching curved line

[Figure: optimal trajectories with the switching function (switching curved line) and two switching lines: for w = 100 [rad/s] (slope Cp3) and for w = 300 [rad/s] (slope Cp1); Cp1 < Cp3.]

There is one switching curved line (switching function), but more than one switching line: the switching line depends on the set point.

$$S(s) = \frac{K_m}{s\,(T_{em}\,s + 1)}$$

$$S(s) = \frac{58.64}{s\,(0.108\,s\,(0.0812\,s + 1) + 1)}$$


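Switching curved line function

It can be computed only for known „Km“, „Tem“. A trajectory driven by the constant input F_i reaches the origin at time t_k:

$$0 = x_1(t) + K_m F_i\,t_k + T_{em}\,[x_2(t) - K_m F_i]\,[1 - \exp(-t_k/T_{em})]$$

$$0 = K_m F_i + [x_2(t) - K_m F_i]\,\exp(-t_k/T_{em})$$

Eliminating t_k:

$$t_k = T_{em} \ln\left(1 - \frac{x_2(t)}{K_m F_i}\right)$$

$$x_1(t) = -T_{em}\,x_2(t) - K_m F_i\,T_{em} \ln\left(1 - \frac{x_2(t)}{K_m F_i}\right)$$

with F_i = U_max for x_2(t) < 0 and F_i = U_min for x_2(t) > 0.

Switching function:

$$V[\mathbf{x}(t)] = -x_1(t) - T_{em}\,x_2(t) + \operatorname{sign}(x_2)\,K_m T_{em} U_{max} \ln\left(1 + \frac{|x_2(t)|}{K_m U_{max}}\right)$$

Switching line:

$$V[\mathbf{x}(t)] = -x_2(t) - C\,x_1(t), \qquad C_p = \operatorname{tg}(\alpha_p)$$

Control law with hysteresis:

u[x(t)] = 0 for -deS ≤ V[x(t)] ≤ deS
u[x(t)] = Umax for V[x(t)] > deS
u[x(t)] = Umin for V[x(t)] < -deS

deS - hysteresis of state variable measurement

[Figure: relay characteristic between Umin and Umax with the dead zone (-deS, deS).]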
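The switching function V and the relay with dead zone deS on this slide can be tried out in a short simulation. This is a minimal sketch under stated assumptions, not the author's implementation: the model is taken in the sign convention of the trajectory equations (dx1/dt = x2, Tem dx2/dt + x2 = Km u), Umin = -Umax, integration is forward Euler, and the values Km = 0.085, Tem = 1.0 s, Umax = 5 V, deS = 0.005 are illustrative. The closed form of V is the usual switching curve for Km/(s(Tem s + 1)) and is assumed to match the slide. All function names are hypothetical.

```python
import math


def switching_function(x1, x2, km, tem, u_max):
    """V[x]: zero on the time optimal switching curve (assumed closed form)."""
    sgn = (x2 > 0) - (x2 < 0)
    ku = km * u_max
    return -x1 - tem * x2 + sgn * ku * tem * math.log(1.0 + abs(x2) / ku)


def relay_with_dead_zone(v, u_max, u_min, des):
    """Three-level controller output with dead zone (-deS, deS)."""
    if v > des:
        return u_max
    if v < -des:
        return u_min
    return 0.0


def simulate(x1_0, km=0.085, tem=1.0, u_max=5.0, des=0.005,
             dt=0.001, t_end=10.0):
    """Forward Euler simulation from x = (x1_0, 0); returns the final state."""
    x1, x2 = x1_0, 0.0
    for _ in range(int(t_end / dt)):
        v = switching_function(x1, x2, km, tem, u_max)
        u = relay_with_dead_zone(v, u_max, -u_max, des)
        dx2 = (-x2 + km * u) / tem
        x1, x2 = x1 + dt * x2, x2 + dt * dx2
    return x1, x2


if __name__ == "__main__":
    print(simulate(0.4))  # final state, expected close to the origin
```

With the dead zone the state should come to rest with zero controller output and a residual error of the order of deS, which is the trade-off described earlier for hysteresis on the controller output.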

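Sliding Mode Control - SMC

[Block diagram: the controller evaluates the switching function s(x) and applies u+ or u- to the plant (input u, state x). Phase plane: x1 [m], x2 [m/s], regions u > 0 and u < 0, switching functions sA and sB with the outputs yA and yB.]

Sliding mode - the trajectory “slides” along the sliding line:

$$s(\mathbf{x}) = \mathbf{C}^{T}\mathbf{x}(t) = 0$$

Condition of SMC:

$$s(\mathbf{x})\,\dot{s}(\mathbf{x}) < 0 \quad \text{for } s(\mathbf{x}) \neq 0$$

Lyapunov function:

$$V(\mathbf{x}, t) > 0, \qquad \dot{V}(\mathbf{x}, t) < 0$$

Relay control:

$$u = U_{max}\,\operatorname{sign}(s(\mathbf{x}))$$

Switching line (Cx - instantaneous slope of trajectory point):

$$s_A(\mathbf{x}) = -x_2(t) - C_x\,x_1(t)$$

Switching curved line:

$$s_B(\mathbf{x}) = -x_1(t) - T\,x_2(t) + \operatorname{sign}(x_2)\,K\,T\,U_{max} \ln\left(1 + \frac{\operatorname{abs}(x_2(t))}{K\,U_{max}}\right)$$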
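Relay control on a straight switching line can be illustrated with a few lines of simulation. This is a sketch under assumptions rather than the controller from the slides: the plant is taken as dx1/dt = x2, Tem dx2/dt + x2 = Km u, the switching line is s(x) = -x2 - C x1 with a fixed slope C = 1, integration is forward Euler, and the parameter values are illustrative. The name `simulate_smc` is hypothetical.

```python
def simulate_smc(x1_0, c=1.0, km=0.085, tem=1.0, u_max=5.0,
                 dt=0.001, t_end=15.0):
    """Relay control u = Umax * sign(s(x)) with s(x) = -x2 - c * x1."""
    x1, x2 = x1_0, 0.0
    for _ in range(int(t_end / dt)):
        s = -x2 - c * x1
        if s > 0:
            u = u_max
        elif s < 0:
            u = -u_max
        else:
            u = 0.0
        # Forward Euler step of the assumed second-order model.
        x1, x2 = x1 + dt * x2, x2 + dt * (-x2 + km * u) / tem
    return x1, x2


if __name__ == "__main__":
    print(simulate_smc(0.4))  # final state after sliding along the line
```

Once the trajectory reaches the line it stays on it, so x2 = -C x1 and the error decays exponentially with the time constant 1/C. That is slower than the time optimal response, which is why the slope of the line is later adapted.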

Adaptive - SMC

Adaptive adjusting of switching line slope

[Equations: adaptation rule for the slope C, written in terms of the samples x1(k), x2(k) and x1(k-l), x2(k-l), the distance d and the slope step ΔC.]

Ci - initial slope of switching line

[Figure: phase plane and responses (axes x1, t and x1, y) with regions u > 0 and u < 0, the slope C, the step ΔC and the distance d. Curves: 1 - t-suboptimal control with SL (switching line); 2 - t-suboptimal adaptive control; 3 - t-optimal control. The SL for t-optimal control is also shown.]


Adaptive Algorithm based on Sliding Mode

[Figure: phase plane, Xe2 [rad/s] versus position error Xe1 [rad]. Curves: 1 - time optimal trajectory; 2 - adaptive trajectory; 3 - sliding mode trajectory.]

[Figure: responses X1 [rad] versus time [s]. Curves: 1 - time optimal response; 2 - adaptive sliding mode response; 3 - conventional sliding mode response; 4 - actuating variable for response 2 (times 10).]

Adaptive adjustment of the switching line slope

[Figure: phase plane, Xe2 [rad/s] versus position error Xe1 [rad], with the trajectory points 1 to 5, the point (Xe1(1), Xe2(1)) and the slopes Cp0, Cp, Copt. Curves: 1 - time optimal trajectory; 2 - adaptive trajectory; 3 - sliding mode trajectory.]

Ct = Xe1 / Xe2

Es - angular speed error

IF C_1 = C3 > Ct = C4 (C5) THEN change Cp

Optimal trajectory of all second-order systems: the slope of the switching line decreases along the optimal trajectory.

[Figure: optimal trajectories of the systems S1, S2, S3, S4 in the phase plane (x1, x2).]

$$S_1(s) = \frac{1}{s^2}, \qquad S_2(s) = \frac{0.5\,s + 1}{2\,s^2}, \qquad S_3(s) = \frac{1}{s\,(2\,s + 1)}, \qquad S_4(s) = \frac{1}{(0.7\,s + 1)(1.7\,s + 1)}$$


Automatic generation of suboptimal responses and trajectories

[Figure: phase trajectories 1 to 5 and 1', 3', 4', 5' in the plane e'(t) versus e(t), with e(t) = w - y(t), and the responses y(t), u(k) versus t [s].]

1, 2, 3, 4, 5 - progressive generation of Cp
[e5(t), e'5(t)] - first found point of the switching curve
[e5'(t), e'5'(t)] - second found point of the switching curve
Time optimal course of the controller output: points 5, 5' on the trajectories.
Controller outputs on trajectories 4 and 4' in Fig. 6a.

Generation of suboptimal trajectories

Learning Controller based on SMC - basic problems

Point for the slope of the suboptimal switching line

Learning = looking for points of the suboptimal switching line + look-up table (memory) for them + classification (identification) of the load (parameters of the transfer function, i.e. parameters of the controlled process):

$$S(s) = \frac{K_m}{s\,(T_{em}\,s + 1)}$$

1 - SL - Switching line
2 - LSC - Linear switching curve
3 - NS - Neural network
4 - SCL - Switching curved line (identification of Km, Tem and computation of SCL)

Classification options:
1 - Hopfield net
2 - fuzzy clustering
3 - ART net
(1-3: classification off line)
4 - Parameters identification (on line)

Possibilities of learning (historical evolution):
1 - fractional changing of the SL slope and polynomial interlace
2 - adaptation of the LSC profile (on line and off line)
3 - simulation of finishing trajectories on a neuro-model
(1, 2, 3: off-line learning)
4 - continuous identification of the process parameters Km, Tem (on-line learning)

1 - slope of SL and polynomial parameters
2 - LSC points
3 - NS weights
4 - structure of SC function

[Block diagram: the learning algorithm and the classification (identification) block, whose output is the number of load c_sus, address the memory; the SMC control algorithm evaluates the switching function (line) s(x) and applies u+ or u- to the plant (input u, state x).]

After the learning process: recognition of the „number of load“ - Km, Tem

Classification problems: nonlinearities in Km, Tem change the instantaneous values of these parameters, and with them the step response for the same number of load.


Learning algorithm based on switching line (SL) - Real Time Simulation Experiments

[Block diagram: the learning algorithm and the classification block (output c_sus) address the memory; SMC evaluates s(x) = -x2 - C x1 and applies u+ or u- to the plant (input u, state x).]

Memory - „look up“ tables, indexed by c_sus:
- Cmin, Cmax, step e(0)
- a0, a1, a2

Memory: polynomial approximation of the switching function

$$s(\mathbf{x}) = a_2\,x_2^{2} + a_1\,x_2 + x_1$$

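Classification - Hopfield NET

[Diagram: net with inputs x1, x2, ..., xn, neurons S1, S2, ..., Sm and weights wij.]

Neuron update:

$$S_i = \operatorname{sgn}(h_i), \qquad h_i = \sum_{j=1}^{N} w_{ij}\,S_j$$

Stochastic asynchronous dynamics:

$$P(S_i \to -S_i) = \frac{1}{1 + \exp(2\,h_i S_i / T)}$$

Weights:

$$w_{ij} = \frac{1}{N} \sum_{p} (x_i^{p} - b)(x_j^{p} - b)$$

Scale adaptation:

[Equation: scale adaptation of the local field h_i, with the parameters r and b.]

Energy:

$$E = -0.5 \sum_{i=1}^{N} S_i\,x_i$$

Pattern coding:

[Figure: transient response y(t) versus t, coded by the angles α1, α2, ..., α7, scaled as 4α1, 4α2, ..., 4α7.]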
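How such a net recalls a stored pattern can be sketched in plain Python. This is an illustration under assumptions, not the classifier from the slides: it uses bipolar patterns (+1/-1), the bias b = 0 in the weight rule, and the deterministic limit of the asynchronous dynamics (S_i = sgn(h_i), i.e. temperature T close to zero). The two 8-element patterns and all function names are made up for the example.

```python
def train_hebbian(patterns):
    """Hebbian weights w_ij = (1/N) * sum_p x_i^p x_j^p, zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def energy(w, s):
    """Standard Hopfield energy E = -0.5 * sum_ij w_ij S_i S_j."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j]
                      for i in range(n) for j in range(n))


def recall(w, s, sweeps=5):
    """Deterministic asynchronous updates S_i = sgn(h_i) until stable."""
    s = list(s)
    n = len(s)
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            h = sum(w[i][j] * s[j] for j in range(n))
            new = 1 if h >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s


if __name__ == "__main__":
    p1 = [1, 1, 1, 1, -1, -1, -1, -1]
    p2 = [1, -1, 1, -1, 1, -1, 1, -1]
    w = train_hebbian([p1, p2])
    noisy = [-1, 1, 1, 1, -1, -1, -1, -1]   # p1 with the first element flipped
    print(recall(w, noisy) == p1)
```

The energy of the recalled state is lower than that of the noisy input, which corresponds to the energy plot mentioned for the classification results.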


Classification - Hopfield net (N=255)

1. S1    2. S2
3. S4    4. S5
5. S1    6. S4
7. S5    8. S2
9. S2    10. S4
11. S3   12. S3

[Figure: input and output patterns with the panel labels n.č.1, n.č.3, s.č.1, s.č.3, s.č.2, n.č.4, n.č.2, s.č.3, s.č.3, s.č.2, s.č.3, s.č.3, and the evolution of the net's energy against the number of iterations.]

Advantage: quality
Disadvantages: speed, limited number of patterns, pattern numbering

Fuzzy classification

Plant | Input data (x) | Output (y)
S1 | [x1, x2, ..., x25]_1 | 1
S2 | [x1, x2, ..., x25]_2 | 2
... | ... | ...
Sn | [x1, x2, ..., x25]_n | n

[Diagram: inputs x1, x2, ..., x25 -> FIS (Sugeno) -> output y; plot of y(t) versus t.]

Data clustering (counts of rules and membership functions)

Parameter estimation in the consequent rules of the fuzzy classifier

$$c\_sus = \operatorname{round}(y), \qquad Error = y - \operatorname{round}(y)$$


Fuzzy classification

1. S1    2. S3
3. S3    4. S3
5. S2    6. S3
7. S2    8. S3
9. S4    10. S3
11. S4   12. S2

Advantage: quality
Disadvantages: too many parameters, necessity to keep data patterns


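Classification - ART network

[Diagram: input layer x1, x2, ..., xn and output layer y1, y2, ..., ym, connected by the weights wij and tij; control signal 1, control signal 2. Pattern coding of y(t) versus t by the angles 4α1, 4α2, ..., 4α7.]

• Initialisation
• Recognition:

$$y_{j^*} = \max_{j}\,(y_j)$$

• Comparison:

$$S = \frac{\lVert \mathbf{t}_{j^*} \cdot \mathbf{x} \rVert}{\lVert \mathbf{x} \rVert}$$

• Searching
• Adaptation:

$$t_{ij^*}^{new} = t_{ij^*}\,x_i, \qquad w_{ij^*}^{new} = \frac{t_{ij^*}\,x_i}{0.5 + \sum_{i=1}^{N} t_{ij^*}\,x_i}$$

Advantages: quality, speed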


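Learning switching curve (LSC) definition

[Figure: phase plane (x1, x2), axes from -1 to 1, showing the SC for t-optimal control and the linearized SC through the points x(m) and x(n), with the distance r. Second panel: SC for t-optimal control with the LSC for LSCstep = 1 and LSCstep = 2.]

For the segment with x(n) ≤ x(t) ≤ x(m):

$$s_{LSC}(\mathbf{x}(t)) = (x_2(n) - x_2(m))\,(x_1(t) - x_1(n)) - (x_1(n) - x_1(m))\,(x_2(t) - x_2(n))$$

$$u(t) = U_{max}\,\operatorname{sign}(s_{LSC}(\mathbf{x}(t)))$$

Method for LSC points setting:

[Equation: the LSC points x^{i-1}(n), x^{i}(n), x^{i+1}(n), ... are set from the sequence of slopes ..., C_{i-1}, C_i, C_{i+1}, ..., each slope being a ratio of the coordinates x1(n), x2(n) of its point.]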
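A piecewise-linear switching curve of this kind can be evaluated with a segment lookup followed by a cross-product test. The sketch below is only an illustration under assumptions: the LSC points are stored as (x1, x2) pairs sorted by x1, the segment is chosen by the x1 coordinate of the current state, and the sign orientation (positive below the curve) is chosen to agree with s(x) = -x2 - C x1. The sample points and the names `s_lsc` and `control` are hypothetical.

```python
from bisect import bisect_right


def s_lsc(x, points):
    """Cross-product switching function for the segment that brackets x[0].

    points: list of (x1, x2) LSC points sorted by increasing x1.
    """
    xs = [p[0] for p in points]
    k = bisect_right(xs, x[0])
    k = min(max(k, 1), len(points) - 1)   # clamp to the first/last segment
    m, n = points[k - 1], points[k]
    return (n[1] - m[1]) * (x[0] - n[0]) - (n[0] - m[0]) * (x[1] - n[1])


def control(x, points, u_max=5.0):
    """Relay control u = Umax * sign(s_LSC(x))."""
    s = s_lsc(x, points)
    return u_max if s > 0 else (-u_max if s < 0 else 0.0)


if __name__ == "__main__":
    lsc = [(-1.0, 0.5), (-0.25, 0.25), (0.0, 0.0), (0.25, -0.25), (1.0, -0.5)]
    print(control((0.5, 0.0), lsc))    # state above the curve
    print(control((0.5, -0.5), lsc))   # state below the curve
```

Storing only the break points keeps the look-up table small, and adding points (a smaller LSC step) brings the broken line closer to the switching curve for t-optimal control.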

Settings of LSC profile (1. learning step)

On-line - according to adaptation, for LSC points.

Off-line - according to trajectory profile, for LSC points.

[Equations: setting of the LSC point from the samples x1(k), x2(k), the slope C_i and the condition u(k) = 0.]

[Figure: axes x1, t and x2, y. Curves: 1 - trajectory; 2 - LSC; 3 - SC for t-optimal control; 4 - system output.]

[Figure: axes x1, t and x2, y, with the slope step ∆C. Curves: 1 - LSC according to adaptation; 2 - LSC according to trajectory; 3 - SC for t-optimal control; 4 - system output.]


Control on LSC for different set points

[Figure: axes x1, t and x2, y. Curves: 1 - LSC in single steps; 2 - SC for t-optimal control; 3 - system output. Panels: according to adaptation; according to trajectory profile; together.]


Learning algorithm based on LSC - Real Time Simulation Experiments

[Block diagram: the learning algorithm and the classification block (output c_sus) address the memory, which stores C; SMC evaluates sLPK(x) and applies u+ or u- to the system (input u, state x).]

Learning algorithm based on neuro networks (NN) - Basic description

Two neural networks are used: NS1 and NS2.

First step: NS1 is set up from measured values of the input (Umax, Umin) and the output y(k). NS1 can then generate t-optimal phase trajectories, which are used to set up NS2.

Second step: t-optimal control with NS2 as the switching function.

It is possible to find the t-suboptimal control from only ONE loop response (with a switching line). This t-suboptimal control is valid for all set points (but only for one combination of loads).

NS1 - 2 layers (6 and 1 neurons) with linear activation function; a model of the servo system (output) in inverted time, where n = transfer function order (2, 3):

$$y(k) = f_{NS1}\big(y(k-1), \ldots, y(k-n),\; u(k), u(k-1), \ldots, u(k-n+1)\big)$$

NS2 - 3 layers, model of the switching function. Input layer: 6 neurons with tangential sigmoid activation function. Hidden layer: 6 neurons with linear activation function. Output layer: 1 neuron with linear activation function.

$$s_{NS2}(\mathbf{x}) = x_n(k) - f_{NS2}\big(x_1(k), x_2(k), \ldots, x_{n-1}(k)\big)$$

For a second-order transfer function, approximately 300 points from simulation are needed as the substitute of the switching function.


Learning algorithm based on NN - Steps of computation

[Figure: axes x2(t), y(t) versus x1(t), t. Curves: output (1. step); output (2. step); phase trajectory (1. step); switching function (2. step).]

1. Step: real time response.
2. Step:
a: Off-line computation of the switching function: 5 [s] {DOS}, 3 [s] Windows; on-line computation {in progress}.
b: Real time suboptimal time response.


Learning algorithm based on NN - Real Time Simulation Experiment

[Block diagram: the learning algorithm and the classification block (output c_sus) address the memory, which stores WNS2; SMC evaluates sNS2(x) and applies u+ or u- to the system (input u, state x); Model NN1 is the block of simulation according to the NN model.]

NN model in inverted time:

$$y(k) = f_{NS1}\big(y(k-1), \ldots, y(k-n),\; u(k), u(k-1), \ldots, u(k-n+1)\big)$$

NN switching function:

$$s_{NS2}(\mathbf{x}) = x_n(k) - f_{NS2}\big(x_1(k), x_2(k), \ldots, x_{n-1}(k)\big)$$

Learning algorithm based on NN - Steps of Computation

[Figure: axes x2(t), y(t) versus x1(t), t. Curves: output (1. step); output (2. step); switching function (2. step).]

Learning algorithm based on NN - Simulation Experiments

[Figure panels: Load 1+0, Load 1+2, Load 1+4, Load 1+6.]

Response quality: settling time tR = 2.83 [s], 3.31; IAE = 1.53 [Vs], 1.63

Response quality: tR = 3.68 [s], 3.99; IAE = 1.76 [Vs], 1.81



Optimal Trajectories for 3. Order Controlled System - Computed by Neuro Networks

$$S(s) = \frac{4}{(s + 0.7)(s + 1)(s + 2)}$$

[Figure: state space (x1, x2, x3), axes from -1 to 1. Shown: the initial switching plane s(x), a linear function of x1, x2, x3; the switching plane according to NN2; points of phase trajectories from simulation; model phase trajectories for u = Umax and for u = Umin; the final state.]

[Figure: y(t) versus t. First control according to the switching plane; second control according to the neural net NN2.]

Classification with Identification - 3 possibilities

Continuous and discrete transfer function:

$$S(s) = \frac{K_m}{s\,(T_{em}\,s + 1)}, \qquad S(z) = \frac{b_1 z^{-1} + b_2 z^{-2}}{1 + a_1 z^{-1} + a_2 z^{-2}}$$

Step response of the transfer function:

$$h(t) = K_m\,t + K_m T_{em} \exp(-t/T_{em}) - K_m T_{em}$$

Analytical derivation of the parameters Km and Tem from the speed x2 after a step of the controller output to Umax:

$$K_m = \frac{[x_2(t/2)]^2}{U_{max}\,[2\,x_2(t/2) - x_2(t)]}, \qquad T_{em} = \frac{-t}{\ln\left[1 - \dfrac{x_2(t)}{K_m U_{max}}\right]}$$

The parameters can be obtained by static optimisation or by continuous identification:

1. Static optimisation from h(t):

$$\frac{h(t)}{K_m} - t = T_{em}\,\big(\exp(-t/T_{em}) - 1\big)$$

2. Continuous identification. The parameters of the discrete transfer function (a_i, b_i) are identified and recalculated to the parameters Km, Tem of the continuous transfer function:

$$T_{em} = \frac{T_0}{\ln(1/a_2)}, \qquad K_m = \frac{b_1}{T_0 + T_{em}\,(a_2 - 1)}$$

Advantages: direct calculation of the parameters of the switching function.
Disadvantages: real time calculation of the RLS algorithm.

3. Iterative computation of Km, Tem.
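The recalculation from the discrete parameters back to Km and Tem can be checked with a round trip. The sketch below assumes a zero-order-hold discretization of Km/(s(Tem s + 1)) with the sampling interval T0, which gives a2 = exp(-T0/Tem) and b1 = Km [T0 - Tem (1 - a2)]; the RLS identification itself is not shown. Function names and numeric values are illustrative.

```python
import math


def discretize_zoh(km, tem, t0):
    """Zero-order-hold discretization of Km / (s (Tem s + 1)) (assumed)."""
    d = math.exp(-t0 / tem)
    a1 = -(1.0 + d)
    a2 = d
    b1 = km * (t0 - tem * (1.0 - d))
    b2 = km * (tem * (1.0 - d) - t0 * d)
    return a1, a2, b1, b2


def continuous_from_discrete(a2, b1, t0):
    """Tem = T0 / ln(1/a2), Km = b1 / (T0 + Tem (a2 - 1))."""
    tem = t0 / math.log(1.0 / a2)
    km = b1 / (t0 + tem * (a2 - 1.0))
    return km, tem


if __name__ == "__main__":
    a1, a2, b1, b2 = discretize_zoh(km=0.085, tem=1.0, t0=0.005)
    print(continuous_from_discrete(a2, b1, 0.005))  # close to (0.085, 1.0)
```

Because b1 is a small difference of nearly equal terms at short sampling intervals, estimates of Km obtained this way can be sensitive to noise in the identified parameters.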


Classification with Identification - speed measurement problem

$$x_2(k) = \frac{x_1(k) - x_1(k-1)}{T_0}$$

T0 - sampling interval

Speed measurement: 0.1 m/s = 260 imp/s = 1.3 imp per 5 ms.

[Figure: x1 - position [mm] and u(k) - control output [V] versus time [s]. Curves: 1 - control trajectory; 2 - u(k), control output 5 [V]; 3 - controlled variable; 4 - set point w = 400 [mm] (one legend gives w = 0.6 [m]). Settling time = 3.75 [s].]

Classification with Identification

[Figure: x1 - position [mm] and u(k) - control output [V] versus time [s]. Curves: 1 - control trajectory; 2 - control output 5 [V]; 3 - controlled variable; 4 - set point w = 400 [mm]. Settling time = 4.04 [s].]


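With x2(t) the speed at time t after a step of the controller output to Umax:

$$K_m = \frac{[x_2(t/2)]^2}{U_{max}\,[2\,x_2(t/2) - x_2(t)]}, \qquad T_{em} = \frac{-t}{\ln\left[1 - \dfrac{x_2(t)}{K_m U_{max}}\right]}$$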
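These two formulas can be checked against a synthetic speed response. The sketch assumes that the speed after a step to Umax follows x2(t) = Km Umax (1 - exp(-t/Tem)), which is the derivative of the step response of Km/(s(Tem s + 1)) scaled by Umax; measurement noise and quantization are ignored. The function names and numeric values are illustrative.

```python
import math


def speed_after_step(t, km, tem, u_max):
    """Assumed speed response x2(t) = Km * Umax * (1 - exp(-t / Tem))."""
    return km * u_max * (1.0 - math.exp(-t / tem))


def km_tem_from_speed(x2_half, x2_full, t, u_max):
    """Km and Tem from the speed samples x2(t/2) and x2(t)."""
    km = x2_half ** 2 / (u_max * (2.0 * x2_half - x2_full))
    tem = -t / math.log(1.0 - x2_full / (km * u_max))
    return km, tem


if __name__ == "__main__":
    t = 1.0
    x2_half = speed_after_step(t / 2.0, km=0.058, tem=1.5, u_max=5.0)
    x2_full = speed_after_step(t, km=0.058, tem=1.5, u_max=5.0)
    print(km_tem_from_speed(x2_half, x2_full, t, 5.0))  # close to (0.058, 1.5)
```

Only two speed samples are needed, so the quality of the result depends directly on the speed measurement, which is why the state estimator is introduced next.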

Classification with Identification - state estimator

[Block diagram: state estimator with the blocks b, r, S, z^-1, F, c, h; signals y(t), u(k), w, e, s, δ and ε(k); estimated state x(k) = [x1(k), x2(k)]; derivative d[e(t)]/dt.]


Learning algorithm - Identification + state estimator: real time hardware-in-the-loop simulation

[Figure: u [V] and x1 = w - y(t), position [mm], versus time [s] (second axis L [m]). Curves: 1 - control trajectory (2 times reduced scale); 2 - control output 5 [V] (2.5 times reduced scale, u(k)/2.5); 3 - y(t), controlled variable; set point w = 400 [mm]. Settling time = 3.7 [s].]


Comparison of Learning algorithms - loop response quality: real time hardware-in-the-loop simulation

[Figure panels: Load 1+2, Load 1+2, Load 1+4, Load 1+6.]

[Figure: y(t), 0.5*x1 and 0.4*u(k) versus 0.5*x1, t [s], with the set point. Curves: switching function from identification and state estimator; switching function learned with NN; trajectory (neuro); state trajectory (estimator); controller output (neuro).]

Algorithm | Set point [m] | Settling time [s] | Integral of absolute value of the error [m s]
Switching function | 0.4 | 4.06 | 0.76
Identif. + estimation | 0.4 | 3.86 | 0.82
Neuro | 0.4 | 3.64 | 0.98

Conclusion and outlook

2 - Nowadays the paradigm of optimal and adaptive control theory is culminating. Problems that remain to be solved include MIMO control, multi-level and large-scale dynamic systems with discrete events, and intelligent control. This requires treating adaptive control as part of classical theory. Moreover, adaptive systems with one loop should be classified as classic ones, and the focus should move to multi-level algorithms and hierarchical systems. Then we will be able to formulate a new paradigm of large-scale systems control and intelligent control.

3 - Realization of t-optimal control based on sliding mode and neural nets (real time computation of NS1 and NS2), and likewise real time identification with a state estimator, has to use parallel computing. Such a control algorithm can then be classified as „intelligent control“.

