ExploitingErgodicityfortheAnalysisofShortTimeSeries · short data segments by exploiting phase...

Nonlinear Phenomena in Complex Systems, vol. 11, no. 3 (2008), pp. 399 - 402

Exploiting Ergodicity for the Analysis of Short Time SeriesMatja�z Perc

Department of Physics, Faculty of Natural Sciences and Mathematics,University of Maribor, Koro�ska cesta 160, SI-2000 Maribor, SLOVENIA

Qing Yun Wang∗ and Zhi Sheng DuanState Key Laboratory for Turbulence and Complex Systems,

Department of Mechanics and Aerospace Engineering,College of Engineering, Peking University, Beijing 100871, CHINA

(Received 13 May, 2008)

A new generally applicable method for analyzing short time series that exploits the ergodicnature of chaotic systems is presented and tested on a simple example. The method can beintegrated into several existing algorithms and should prove valuable for characterizing shortexperimental signals.

PACS numbers: 05.45.Tp, 05.45.AcKeywords: chaotic systems, Lyapunov exponents

The accurate characterization of dynamicsin an experimental time series is fundamentalfor understanding the observed behavior of anynatural or arti�cial system. While numerousmethods are available for this task [1], theiraccuracy more or less depends on the availabilityof long data sets. This requirement often hindersproductive research from experimental data,especially in �elds where acquisition of longdata sets is impossible due to intrinsic systemproperties or lack of non-invasive recordingtechniques that destroy the system's dynamics.Such instances are frequent in biology, especiallyat the cellular level [2], and can also beencountered in economics [3], astronomy [4], aswell as man-made systems in general.

Here we propose a new approach foranalyzing short time series based on ergodicityof deterministic chaotic systems [5]. We exploitthe fact that an arbitrarily small neighborhoodof every point that forms the chaotic attractoris repeatedly visited by the trajectory duringthe temporal evolution of the system. Considerthe reconstructed phase space vectors in mdimensions ~yi = [xi, xi+τ , . . . , xi+(m−1)τ ] that

∗Also at School of Statistics and Mathematics, InnerMongolia Finance and Economics College, Huhhot 010051,CHINA

were obtained from a short scalar time seriesxi [6]. If xi is su�ciently short, the trajectorymay occupy only a small portion of the wholephase space, as shown in Fig. 1(a). In this case,it is usually impossible to carry out furtheranalyses. However, ergodicity assures that thelast point of the trajectory, denoted as ~yn, hasan arbitrary close neighbor ~yj , from which thetrajectory continues, possibly further occupyingthe available phase space, as shown in Fig.1(b). It is thereby crucial to note that j isn'tnecessarily the n + 1 point in time. Based on thisobservation we conclude that several short timeseries segments, although not obtained continuousin time, i.e. from a single experiment, maystill yield a long enough continuous trajectoryin the phase space to allow further analysis.Since initial conditions in real-life settings arenever know perfectly it is certain that repetitionsof the same experiment will always yield atleast slightly di�erent time series, thus providingfurther information about the underlying system.Obviously, parameter settings of the system mustbe kept constant during repetitions so thatstationarity criteria are not violated [7].

Following the above reasoning let ussummarize the algorithm. Suppose we possessr short time series each occupying n datapoints that were obtained from r repetitionsof an experiment. Altogether then s = rn

399

400 Matja�z Perc, Qing Yun Wang, and Zhi Sheng Duan

-15 -10 -5 0 5 10 15

-15

-10

-5

0

5

10

15

yj

yn

x i+

xi

-15 -10 -5 0 5 10 15

-15

-10

-5

0

5

10

15

b)

a)

x i+

xi

FIG. 1. Reconstructed phase space (d = 3, τ =17) from 50 randomly selected data segments of theLorenz system [8], each occupying n = 200 points. Theline connects points from the 3rd [panels (a) and (b)]and the 18th [panel (b)] segment.

data points are available for further analysis,which we can simply merge together, one datasegment after another, to form a new timeseries xi, where i = 1, 2, . . . , s (provided pointsin all data segments were evenly sampled).This new time series is discontinuous whenever(i mod n) = 0. To correct this, each time thetraditional embedding procedure [6] encountersa point xi where [n − (i mod n)] < (m −1)τ [~yi cannot be formed without xi+(m−1)τ

going through a discontinuity] a search for aclose phase space neighbor, denoted as ~zj =[xj , xj−τ , . . . , xj−(m−1)τ ], begins that is less than εapart from ~zi+t = [xi+t, xi+t−τ , . . . , xi+t−(m−1)τ ],

where t = [n − (i mod n]. Note that a di�erentletter is used for these delay vectors since theyare formed backwards in time as opposed to ~yi

, whereas t is introduced in order to start thesearch algorithm at the last point of each datasegment to minimize data wastage. Moreover, ~zj

is considered as a suitable close neighbor only if[n− (j mod n)] ≥ [(m− 1)τ − t] so that the pointxj has enough successors within its own datasegment to completely form the un�nished delayvector ~yi. The `missing' coordinate xi+(m−1)τ ofthe delay vector ~yi is, upon �nding a suitable~zj , replaced by xj−t+(m−1)τ . Successive delayvectors then have to be formed from xj onwards,whereby if xj−t+(m−1)τ is introduced as the m-thdelay coordinate of ~yi, then the �rst coordinateof ~yi+1 (note that delay vectors are numberedconsecutively) must be xj−t+1. This procedurehas to be repeated until as many points of thewhole data set (i = 1, 2, . . . , s) as possible areused while keeping ε small. In general, a trade o�between ε (continuity of the �nal `merged' timeseries) and the percent of all occupied data pointshas to be made. The whole set of data points canbe viewed as a labyrinth. However, the catch liesnot simply in coming from one end to the other,but rather to do so in an as continuous manneras possible, whereby none of the points shouldbe used more than m times, i.e. each time as adi�erent delay coordinate. At the end, one obtainsa continuous long set of delay vectors ~yi, i.e. acontinuous trajectory in the phase space, wherethe time course of a delay coordinate representsa new continuous time series that has exactly thesame properties as each individual data segment,and is due to its extended length suitable forfurther analysis.

Note that although the above procedureassumes that proper embedding parameters mand τ are known, this must not be the casein general. The algorithm can be integrateddirectly into existing methods for determiningproper embedding parameters [9], whereby thosem and τ that are being considered as suitableare used for calculations. In fact, the algorithmcan be integrated into any existing methods thatincorporate time operations of the form xa →xa+b .

Íåëèíåéíûå ÿâëåíèÿ â ñëîæíûõ ñèñòåìàõ Ò. 11, � 3, 2008

Exploiting Ergodicity for the Analysis of Short Time Series 401

An excellent tool for optimizing the outcomeof the presented algorithm is the determinismtest introduced by Kaplan and Glass [10]. Thetest measures average directional vectors in acoarse-grained m-dimensional embedding space.If xj−t+(m−1)τ is not a suitable replacement forxi+(m−1)τ , either because ‖ ~zj − ~zi+t ‖ is toolarge, or the trajectory doesn't evolve furtherin the same direction, the norm of the averagedirectional vector of all passes through the boxthat occupies the emergent discontinuity will besmaller than unity (if each pass is treated as a unitvector). Thus, we may introduce the weightedaverage of all lengths of average directionalvectors Λ (de�ned as in [10]) as the crucialquantity that, for an optimal execution of thealgorithm, has to be maximal. Of course Λ neednot be exactly 1, since the time series itself maybe burdened with measurement error or mayeven origin from a stochastic system. The coarsegrained �ow of the phase space that was mergedtogether from 50 data segments of the Lorenzsystem, after the application of the algorithmwith εmax = 0.7 (approximately 1/10 of thestandard deviation of data), is presented in Fig.2. It can be observed that basically all vectors areof unit length. Accordingly Λ = 0.99. Since wedemonstrate our results on a numerical example(the input data segments are noise-free) this highΛ is not surprising and con�rms the successfulnessof our algorithm.

As a �nal stringent test, we calculateLyapunov exponents of the newly obtainedphase space by approximating the �ow withradial basis function as advocated in [11]. Theobtained results presented in Fig. 3 show excellentagreement with accurate values obtained with thehelp of di�erential equations. Noteworthy, similarresults were also obtained by using polynomialbasis functions [12].

The method introduced in this Letter isset out to enable time series analysis of veryshort data segments by exploiting phase spaceergodicity of chaotic systems. Thereby, the factthat an arbitrarily small neighborhood of everypoint of the chaotic attractor is repeatedly visitedby the trajectory during the temporal evolutionof the system is exploited. Although throughout

-15 -10 -5 0 5 10 15

-15

-10

-5

0

5

10

15size of the unit vector

x i+

xi

FIG. 2. Directional �ow of the merged phase spacethat was coarse grained into a 30× 30× 30 grid. Thelength of arrows is equal to the vector norm in 3D,whereas directions correspond to projections onto thexi − xi+τ plane.

0 2500 5000 7500 10000

-15

-10

-5

0

5

i

FIG. 3. Lyapunov exponents of the merged phasespace shown in Fig. 2. Thin straight lines indicateaccurate values. Thick lines depict the convergenceof the maximal, middle and the lowest Lyapunovexponent from top to bottom.

the Letter stress is put on deterministic chaoticsystems, which should de�nitely not be assumedad hoc when attempting analysis of experimentaltime series, the method can be implementedon any data sets. As always, the outcomeof the method has to be veri�ed by knowndeterminism [11, 13] and stationarity [7] tests

Nonlinear Phenomena in Complex Systems Vol. 11, no. 3, 2008

402 Matja�z Perc, Qing Yun Wang, and Zhi Sheng Duan

so the time series can be deemed suitable forfurther analysis. An obvious weak point of themethod is that it assumes multiple short timeseries being at hand prior to implementation.Although conclusions drawn from an experimentare valid only if the experiment can be repeatedand conclusions veri�ed, this is a rather poorargument if the experiment is costly or otherwisedi�cult to repeat, as are for example certainmedical conditions like rarely occurring heartarrhythmia or epileptic seizures. Nevertheless,there exist several experimental situations thatare currently being studied predominantlywith mathematical modeling and numericalsimulations, as for example e�ects of di�erent

hormones on intracellular calcium oscillations[14], for which the approach advocated in thispaper may be especially useful.

Acknowledgments

This work was supported by the NationalScience Foundation of China (Funds No. 10702023and 60674093) and China Post-Doctoral ScienceFoundation (Fund No. 20070410022). Matja�zPerc additionally acknowledges support from theSlovenian Research Agency (Project No. Z1-9629).

References

[1] H. D. I. Abarbanel, Analysis of Observed ChaoticData (Springer, New York, 1996); H. Kantz andT. Schreiber, Nonlinear Time Series Analysis(Cambridge University Press, Cambridge, 1997);J. C. Sprott, Chaos and Time-Series Analysis(Oxford University Press, Oxford, 2003).

[2] A. Goldbeter, Biochemical Oscillations andCellular Rhythms (Cambridge University Press,Cambridge, 1996).

[3] R. S. Tsay, Analysis of Financial Time Series(John Wiley and Sons, New York, 2002).

[4] A. S. Weigend and N. A. Gershenfeld (eds.),Time Series Prediction: Forecasting the Futureand Understanding the Past (Addison-Wesley,Reading, 1993).

[5] J.-P. Eckmann and D. Ruelle, Rev. Mod. Phys.57, 617 (1985).

[6] F. Takens, in: D. Rand and L. S. Young (eds.),Dynamical Systems and Turbulence (Springer-Verlag, Berlin, 1981); T. Sauer, J. Yorke, and M.A. Casdagli, J. Stat. Phys. 65, 579 (1991).

[7] T. Schreiber, Phys. Rev. Lett. 78, 843 (1997).[8] E. N. Lorenz, Atmos Sci. 20, 130 (1963);

Parameters r = 25, b = 8/3, c = 10 and timestep for numerical integration dt = 0.01.

[9] A. M. Fraser and H. L. Swinney, Phys. Rev. A33, 1134 (1986); M. B. Kennel, R. Brown, and H.D. I. Abarbanel, Phys. Rev. A 45, 3403 (1992).

[10] D. T. Kaplan and L. Glass, Phys. Rev. Lett. 68,427 (1992).

[11] J. Holzfuss and U. Parlitz, in: L. Arnold, H.Crauel, and J.-P. Eckmann (eds.), Proceedingsof the Conference on Lyapunov Exponents(Springer-Verlag, Berlin, 1991); U. Parlitz, Int.J. Bifurcat. Chaos, 2, 155 (1992).

[12] K. Briggs, Phys. Lett. A 151, 27 (1990); P.Bryant, R. Brown, and H. D. I. Abarbanel, Phys.Rev. Lett. 65, 1523 (1990).

[13] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian,and J. D. Farmer, Physica D 58, 77 (1992); T.Schreiber and A. Schmitz, Physica D 142, 346(2000).

[14] K. S. R. Cuthbertson and P. H. Cobbold,Nature 316, 541 (1985); N. M. Woods, K. S. R.Cuthbertson, and P. H. Cobbold, Nature 319,600 (1986).

Íåëèíåéíûå ÿâëåíèÿ â ñëîæíûõ ñèñòåìàõ Ò. 11, � 3, 2008

ExploitingErgodicityfortheAnalysisofShortTimeSeries · short data segments by exploiting phase...

Documents

Transcript of ExploitingErgodicityfortheAnalysisofShortTimeSeries · short data segments by exploiting phase...