Application of Symbolic Paa


  • 8/6/2019 Application of Symbolic Paa

    APPLICATION OF SYMBOLIC PIECEWISE AGGREGATE APPROXIMATION (PAA) ANALYSIS TO ECG SIGNALS

    Burcu Külahçıoğlu, Serhan Özdemir, Bora İ. Kumova

    Izmir Institute of Technology, Izmir, Turkey

    Outline

    • Past Work
    • Symbolic Time Series
    • Application of Symbolic PAA Analysis to ECG Data
        - Experimental Settings
        - Steps Applied
        - Experimental Results
        - Application on Synthetic Data
    • Conclusion

    25.06.2008

    Past work

    • Studies on symbolic time representations emerged in the 1980s.
    • Crutchfield introduced the computational mechanics approach to symbolic time series analysis.
    • Ray analyzed the symbolic dynamics of complex systems for anomaly detection (2003).
    • SAX (Symbolic Aggregate Approximation) was introduced by Keogh & Lin (2002).

    Symbolic time series

    • Huge amounts of data in the form of time series are available today.
    • Symbolic time series give the end user a symbolic representation of time series.
    • The goal is to derive a more compact and expressive view of large time series through a symbolic representation that preserves as much information as possible.

    Application of symbolic PAA analysis to ECG data

    • Experimental Settings:
        - Symbolize a problematic ECG recording
        - Measured at 250 Hz for 70 seconds
        - It consists of 15080 data points

    Application of symbolic PAA analysis to ECG data

    • Steps Applied:
        - Noise separation (assumed no noise)
        - Segmentation methods
        - Symbolization

    Application of symbolic PAA analysis to ECG data - Segmentation

    • Segmentation Methods Performed:
        - PLA (piecewise linear approximation) was proposed first, with these variants:
            · Top-Down
            · Sliding Window
            · Bottom-Up
            · SWAB
        - The Bottom-Up approach was utilized
        - PAA was used instead of PLA
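The bottom-up idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a piecewise-constant (mean) fit as the merge cost, in the spirit of using PAA rather than PLA, and the names `segment_cost`, `bottom_up` and `init_size` are hypothetical.

```python
import numpy as np

def segment_cost(x):
    """Squared error of approximating x by its mean (a constant, PAA-style fit)."""
    return float(np.sum((x - x.mean()) ** 2))

def bottom_up(series, n_segments, init_size=2):
    """Merge the cheapest adjacent pair of segments until n_segments remain."""
    x = np.asarray(series, dtype=float)
    # Start from the finest segmentation: a boundary every init_size points.
    bounds = list(range(0, len(x), init_size)) + [len(x)]
    while len(bounds) - 1 > n_segments:
        # Cost of merging segment i with segment i + 1, for every adjacent pair.
        costs = [segment_cost(x[bounds[i]:bounds[i + 2]])
                 for i in range(len(bounds) - 2)]
        cheapest = int(np.argmin(costs))
        del bounds[cheapest + 1]  # drop the boundary between the merged pair
    return bounds

# Toy signal with three flat levels of unequal length.
toy = np.array([0.0] * 6 + [5.0] * 4 + [1.0] * 6)
boundaries = bottom_up(toy, n_segments=3)
print(boundaries)
```

On this toy signal the surviving boundaries coincide with the level changes, so the resulting segments have unequal sizes.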

    Application of symbolic PAA analysis to ECG data - Segmentation

    • PLA with merging 15000 data points
    • PLA with merging 10000 data points
    • PLA with merging 14500 data points

    Application of symbolic PAA analysis to ECG data - Symbolization

    • SAX:
        - A symbolization method
        - To use SAX, the time series must first be converted into its PAA representation

    SAX - Keogh & Lin
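A minimal sketch of the PAA-then-SAX pipeline, assuming equal-width frames, z-normalization, and a three-letter alphabet. The breakpoints -0.43 and 0.43 are the usual equiprobable Gaussian cut points for three symbols; the function names are illustrative only.

```python
import numpy as np

def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each of n_segments frames."""
    x = np.asarray(series, dtype=float)
    # np.array_split tolerates lengths that are not a multiple of n_segments.
    return np.array([frame.mean() for frame in np.array_split(x, n_segments)])

def sax(series, n_segments, breakpoints, alphabet="abc"):
    """z-normalize, reduce with PAA, then map each frame mean to a letter."""
    x = np.asarray(series, dtype=float)
    z = (x - x.mean()) / x.std()  # assumes the series is not constant
    means = paa(z, n_segments)
    # np.digitize returns the index of the interval each mean falls into.
    return "".join(alphabet[i] for i in np.digitize(means, breakpoints))

word = sax([0, 0, 0, 5, 5, 5, 10, 10, 10], n_segments=3, breakpoints=[-0.43, 0.43])
print(word)
```

Each frame of the input is reduced to one mean value, and each mean becomes one letter, so nine samples become a three-letter word here.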

    Application of symbolic PAA analysis to ECG data - Symbolization

    • Extended SAX:
        - Adds the minimum and maximum points of each segment to the symbol string

    Extended SAX - Lkhagva, Suzuki & Kawagoe
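One plausible reading of Extended SAX is that each segment yields three letters, for its minimum, mean and maximum, written in the order in which they occur in time. The sketch below follows that reading. Placing the mean at the segment midpoint and applying the breakpoints to raw (not normalized) values are assumptions of this sketch, and `extended_sax_word` is a hypothetical name.

```python
import numpy as np

def extended_sax_word(segment, breakpoints, alphabet="abc"):
    """Return three letters for one segment: min, mean and max, ordered by time position."""
    seg = np.asarray(segment, dtype=float)
    # Pair each summary value with the position at which it occurs;
    # the mean is assigned the segment midpoint (an assumption of this sketch).
    points = [
        (int(seg.argmin()), seg.min()),
        ((len(seg) - 1) / 2, seg.mean()),
        (int(seg.argmax()), seg.max()),
    ]
    points.sort(key=lambda p: p[0])
    values = [value for _, value in points]
    return "".join(alphabet[i] for i in np.digitize(values, breakpoints))

# A segment with an early peak followed by a dip, loosely resembling one heartbeat.
beat = [1.0, 3.0, 0.2, 1.0, 1.0, 1.0, 1.0, 1.0]
beat_word = extended_sax_word(beat, breakpoints=[0.9, 1.2])
print(beat_word)
```

For this segment the maximum comes first, then the minimum, then the mean at the midpoint, which gives the word `cab`.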

    Application of symbolic PAA analysis to ECG data - PAA and SAX

    $$\text{symbol} = \begin{cases} a & \text{if value} < 1 \\ b & \text{if } 1 < \text{value} < 2 \\ c & \text{if value} > 2 \end{cases}$$

    • Euclidean distance will be used as the similarity measure:

    $$d(x, y) = \sqrt{(x_1 - y_1)^2 + \dots + (x_n - y_n)^2}$$
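The symbol mapping and the distance measure can be written as two small functions. This is a sketch under one stated assumption: the slide does not say which symbol a value exactly equal to a breakpoint receives, so here it is assigned to the upper symbol.

```python
import math
from bisect import bisect_right

def symbolize(value, breakpoints=(1.0, 2.0), alphabet="abc"):
    """Map a value to a letter: below 1 -> a, between 1 and 2 -> b, above 2 -> c."""
    # Values equal to a breakpoint go to the upper symbol (an assumption).
    return alphabet[bisect_right(breakpoints, value)]

def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

symbols = [symbolize(v) for v in (0.4, 1.5, 2.7)]
distance = euclidean([0, 0], [3, 4])
print(symbols, distance)
```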

    Application of symbolic PAA analysis to ECG data - Experimental results

    • Breakpoints 0.9 and 1.2 yield the following symbol string:

      aca aca aca bca aca bca bca bca bca bca bca bca cca cca caa cca aca bca bca caa caa caa caa cab cab caa cab caa caa cab cab caa cab cab cab cab abc aac bca bca aca aca cab cab cab cab cab cab cac cab cab cac cac cab cab cac cac cac cab cab cab cba aca caa cab cac cac cac cab cab

    Application of symbolic PAA analysis to ECG data - Experimental results

    • aca aca aca bca aca bca bca bca bca bca bca bca cca cca caa cca aca bca bca caa caa caa caa cab cab caa cab caa caa cab cab caa cab cab cab cab abc aac bca bca aca aca cab cab cab cab cab cab cac cab cab cac cac cab cab cac cac cac cab cab cab cba ACA caa cab cac cac cac cab cab
    • There is a symbol shift (bca - cab - abc)
    • It might be caused by unequal segment sizes
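The words bca, cab and abc are cyclic rotations of one another, which is what a shift of the segment window relative to the beat would produce. A small check illustrates this; the helper `is_rotation` is hypothetical and not part of the original analysis.

```python
def is_rotation(word, other):
    """True if `other` can be obtained by cyclically shifting `word`."""
    return len(word) == len(other) and other in word + word

shifted = [is_rotation("bca", w) for w in ("cab", "abc", "aca")]
print(shifted)
```

The first two words are rotations of `bca`; `aca` is not, so it differs in content rather than in alignment.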

    Application of symbolic PAA analysis to synthetic data

    • A synthetic ECG series with a segment size of 74:
        - 740 data points
        - Noise with a maximum value of 0.1
        - The value of one segment was decreased by 1.0 (anomaly)
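A series with these properties could be generated along the following lines. The beat template, the uniform noise in [-0.1, 0.1], the random seed, and the choice of which segment to disturb are all assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

SEGMENT_SIZE = 74
N_SEGMENTS = 10      # 10 * 74 = 740 data points
ANOMALY_INDEX = 8    # hypothetical choice of the disturbed segment

rng = np.random.default_rng(seed=0)

# Hypothetical beat template: flat baseline with one sharp peak and one dip.
template = np.ones(SEGMENT_SIZE)
template[10] = 2.5
template[14] = 0.2

clean = np.tile(template, N_SEGMENTS)
noise = rng.uniform(-0.1, 0.1, size=clean.size)  # noise magnitude at most 0.1
series = clean + noise

# Anomaly: lower every value of one segment by 1.0.
start = ANOMALY_INDEX * SEGMENT_SIZE
series[start:start + SEGMENT_SIZE] -= 1.0

segment_means = series.reshape(N_SEGMENTS, SEGMENT_SIZE).mean(axis=1)
print(np.round(segment_means, 2))
```

Because every segment has exactly 74 points, the segment boundaries stay aligned with the beats and no symbol shift can occur.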

    Application of symbolic PAA analysis to synthetic data

    • Symbol string obtained with breakpoints 0.8 and 1.8:

      cab cab cab cbb cab cab cab cab baa cbb

    • The aim is for the anomalous segment to produce an identifiably larger error than the other segments
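One way to turn the symbol string into a per-segment error is to map letters to integers and measure the Euclidean distance of each word from a reference word. Both the integer mapping (a=0, b=1, c=2) and the use of the most frequent word as reference are assumptions of this sketch, not necessarily the procedure used in the study.

```python
import math
from collections import Counter

words = "cab cab cab cbb cab cab cab cab baa cbb".split()

def to_vector(word):
    """Map letters to integers (a=0, b=1, c=2) so words can be compared numerically."""
    return [ord(ch) - ord("a") for ch in word]

def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Use the most frequent word as the reference pattern (an assumption of this sketch).
reference = Counter(words).most_common(1)[0][0]
errors = [euclidean(to_vector(w), to_vector(reference)) for w in words]

for index, (word, error) in enumerate(zip(words, errors), start=1):
    print(index, word, round(error, 3))
```

With this measure the ninth word, `baa`, has the largest error (about 1.414), while the two `cbb` words have an error of 1.0 and every `cab` has an error of 0.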

    Conclusions

    • The error caused by the anomaly is similar to the error caused by some other parts of the symbol string.
    • This discrepancy is a result of the symbol shifts that occur because ECG data has non-equal segment sizes.
    • On artificially created, completely periodic ECG data, the lack of shifts exposes the true anomalies.

    Conclusions

    • PAA along with Extended SAX is not capable of handling aperiodic data.
    • Why were PAA and SAX used instead of other methods?
        - They seemed to be the most appropriate methods for symbolizing time series
        - New hybrid techniques may be developed to suit the needs

    References

    • G. Hebrail, B. Hugueney, Symbolic representation of long time series, Conference on Applied Statistical Models and Data Analysis (ASMDA 2001), Compiègne, 2001.
    • A. Ray, Symbolic dynamic analysis of complex systems for anomaly detection, Signal Processing, 84(7), 2004, 1115-1130.
    • E. Keogh, S. Lonardi, B. Chiu, A Symbolic Representation of Time Series, with Implications for Streaming Algorithms, Proc. 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, San Diego, CA, 2003.
    • B. Lkhagva, Y. Suzuki, K. Kawagoe, Extended SAX: Extension of symbolic aggregate approximation for financial time series data representation, DEWS, 2006, 4A-i8.
    • E. Keogh, A tutorial on indexing and mining time series data, ICDM '01: The 2001 IEEE International Conference on Data Mining, San Jose.
    • A. Ray, Anomaly detection and failure mitigation in complex dynamical systems, Seminar at National Institute of Standards and Technology, 2004.

    References (continued)

    • C. R. Shalizi, K. L. Shalizi, J. P. Crutchfield, Pattern Discovery in Time Series, Part I: Theory, Algorithm, Analysis, and Convergence, Journal of Machine Learning Research, 2003.
    • L. Karamitopoulos, G. Evangelidis, Current trends in time series representation, A-68, Panhellenic Conference on Informatics (P.C.I.) 2007, Patras.
    • B. Chiu, E. Keogh, S. Lonardi, Probabilistic discovery of time series motifs, Proc. 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington D.C., 2003, 493-498.
    • M. Falk, F. Marohn, R. Michel, D. Hofmann, M. Macke, A First Course on Time Series Analysis (Chair of Statistics, University of Würzburg, 2006).
    • T. K. Moon, W. C. Stirling, Mathematical methods and algorithms for signal processing (Prentice Hall, 2000).
