Avionics Stability Summary - Test Science · 2018. 3. 26. · T e s t i n g -T a c t i c s -T r a i...


  • Testing - Tactics - Training - Innovation - Integration

    OUTLIERS

    HOW TO PREDICT A UMBC UPSET

    THE FIRST 16 SEED WIN OVER A 1 SEED

  • JUST KIDDING!

    IT RUINED MY BRACKET AS WELL…

  • 59 TES

    ANOMALY DETECTION: AIRCRAFT SYSTEM HEALTH DATA STABILITY REPORTING

    PRESENTED BY 1LT KYLE “KODAC” GARTRELL

    06/23/16

    This Briefing is: UNCLASSIFIED

  • ANOMALIES

    Start with a Definition:

    “An outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs.”

    - Grubbs, 1969

    “An observation which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism.”

    - Hawkins, 1980

    Going forward, keep in mind the second definition.

  • Health Code Reports (HRC)

    Intelligent Queuing Data (ICUE)

    Pilot Reports

  • Assess System Stability

    Discovery

  • Anomalies Discovery

    Types of Anomalies:

    Point Anomaly

    Contextual Anomaly

    Collective Anomaly (small cluster)

    Outlier Categories:

    Local Outlier - Density Based

    Example Technique: LOF

    Global Outlier - Distance Based

    Example Technique: K-means

    *The presentation on assessing system stability will be held Feb 29, 2019

  • Local Outlier Factor

    K-means

    Pattern based Outlier Detection using Support Vector Machines (POD-SVM)

  • Point Anomaly: A single instance or observation that differs from the rest with respect to the data set features

    Ex. a mission system fails or reports degradation in a flight several standard deviations outside the norm

    Easy to see yet hard to diagnose as random or a new trend

    Mission dependent

    Distance and density techniques work well

    System Fail = a system loses total functionality and requires a reboot

    System Degrade = a system loses partial functionality, but is still usable

  • Contextual Anomaly: Considered abnormal when viewed against meta-information associated with the data points

    Numerically similar clusters

    When taken in context, easy to see

    Computers have a hard time analyzing categorical features
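A minimal Python sketch of the idea above, assuming the "meta-information" is a categorical mission type attached to each flight. The records, field names, and the 1.5 threshold are all hypothetical and chosen only for illustration. A count of 9 degrades looks unremarkable against all flights, but stands out once it is compared only with flights of the same mission type.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records: (flight_id, mission_type, degrade_count).
flights = [
    (1, "training", 2), (2, "training", 3), (3, "training", 2),
    (4, "training", 3), (5, "training", 9),
    (6, "test", 9), (7, "test", 10), (8, "test", 8), (9, "test", 9),
]

def contextual_scores(records):
    """Return {flight_id: z-score of the count within its own mission type}."""
    by_context = defaultdict(list)
    for _, context, value in records:
        by_context[context].append(value)
    scores = {}
    for flight_id, context, value in records:
        values = by_context[context]
        spread = pstdev(values)
        scores[flight_id] = 0.0 if spread == 0 else (value - mean(values)) / spread
    return scores

# Score each flight against its own context and flag large deviations.
scores = contextual_scores(flights)
flagged = [fid for fid, z in scores.items() if abs(z) > 1.5]

# The same value scored against every flight, ignoring context.
all_values = [value for _, _, value in flights]
global_z = (9 - mean(all_values)) / pstdev(all_values)
```

Here only flight 5 is flagged: its count matches the "test" flights numerically, so it is only anomalous in context.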

  • Collective Anomaly: A group of instances that exhibits anomalous behavior compared to the other groups of instances

    Group of instances or actions found nonconforming

    Not necessarily a point anomaly

    May be found based on distance techniques, but density techniques may struggle to identify as anomalous
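The contrast between distance and density techniques can be sketched on a toy one-dimensional feature. This is an illustration under stated assumptions, not the briefing's method: the data are made up, "distance" is taken as distance from the overall median, and "density" is approximated by the distance to the k-th nearest neighbour with k smaller than the size of the small cluster.

```python
from statistics import median

# Hypothetical 1-D feature: a large "normal" group plus a small tight group.
normal_group = [0.1 * i for i in range(20)]   # 0.0 .. 1.9
small_cluster = [10.0, 10.1, 10.2, 10.3]
values = normal_group + small_cluster

def distance_score(x, data):
    """Global, distance-based score: how far x sits from the overall median."""
    return abs(x - median(data))

def knn_distance(x, data, k):
    """Local, density-style score: distance from x to its k-th nearest neighbour."""
    gaps = sorted(abs(x - other) for other in data)
    return gaps[k]  # gaps[0] is the zero distance from x to itself

# The distance score separates the small cluster from the rest.
far_points = [x for x in values if distance_score(x, values) > 5]

# The density-style score looks the same inside both groups.
max_small = max(knn_distance(x, values, 2) for x in small_cluster)
max_normal = max(knn_distance(x, values, 2) for x in normal_group)
```

Because the members of the small cluster sit as close to each other as normal points do, a purely local measure sees nothing unusual. A larger k (at least the cluster size) would change that, which is one reason such techniques are sensitive to the neighbourhood size.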

  • The Data

    Step One: Retrieve the system data from the Jet (ICUE)

    Step Two: Subset the ICUE data for relevant mission systems and features

    Step Three: Merge the Pilot Reports with the ICUE data and tidy

    Step Four: Perform analysis and identify anomalous flights

    Step Five: Use HRC data to identify the cause

    Flights: 1489

    Flight Hours: 2165

    Time Frame: 2016-Present

    Jets: 23

    Mission Systems: 46

    Location: Nellis AFB
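Steps Two and Three can be sketched roughly as follows. The briefing does not show the ICUE or pilot report schema, so every field name, system name, and record below is a hypothetical stand-in, and the merge is assumed to be a left join on a flight identifier.

```python
from collections import defaultdict

# Hypothetical ICUE rows: one row per flight and mission system.
icue_rows = [
    {"flight_id": 101, "system": "radar", "fails": 1, "degrades": 2},
    {"flight_id": 101, "system": "other", "fails": 5, "degrades": 0},
    {"flight_id": 102, "system": "radar", "fails": 0, "degrades": 1},
    {"flight_id": 102, "system": "comms", "fails": 0, "degrades": 3},
]
# Hypothetical pilot reports keyed by flight.
pilot_reports = {101: "radar reset in flight", 103: "no issues"}

RELEVANT_SYSTEMS = {"radar", "comms"}  # Step Two: systems kept for analysis

def build_flight_table(rows, reports, systems):
    """Subset rows to the chosen systems, then merge in pilot reports (one tidy row per flight)."""
    table = defaultdict(lambda: {"fails": 0, "degrades": 0})
    for row in rows:
        if row["system"] in systems:
            table[row["flight_id"]]["fails"] += row["fails"]
            table[row["flight_id"]]["degrades"] += row["degrades"]
    # Left join: keep every flight seen in ICUE, attach a report when one exists.
    return {
        fid: {**counts, "pilot_report": reports.get(fid)}
        for fid, counts in sorted(table.items())
    }

flight_table = build_flight_table(icue_rows, pilot_reports, RELEVANT_SYSTEMS)
```

Flights with no pilot report keep a `None` entry rather than being dropped, so the later analysis still sees every flight recorded by the jet.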

  • Created a new column holding the sum total of all the degrades and fails

    Took the log transform of the totals to achieve a more “normal” distribution

    Computed how many standard deviations each data point lies from the mean and picked the top 10 outliers

    Used these specific outliers to rank LOF, K-means, and POD-SVM
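The baseline described above can be sketched in a few lines. The original analysis was done in R; this is a Python illustration under assumptions. In particular, `log1p` is used so that flights with zero events stay defined (the slide only says "log"), the population standard deviation is used, and the flight records are invented.

```python
import math
from statistics import mean, pstdev

def baseline_outliers(flights, top_n=10):
    """flights: {row_id: (degrades, fails)}. Returns [(row_id, z)] sorted by |z|, largest first."""
    totals = {rid: degrades + fails for rid, (degrades, fails) in flights.items()}
    # Assumption: log1p instead of log, so a total of zero does not fail.
    logged = {rid: math.log1p(total) for rid, total in totals.items()}
    values = list(logged.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # every flight is identical, nothing to rank
    z = {rid: (value - mu) / sigma for rid, value in logged.items()}
    ranked = sorted(z.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top_n]

# Hypothetical flights: row_id -> (degrades, fails).
sample = {1: (2, 1), 2: (3, 0), 3: (1, 2), 4: (2, 2), 5: (40, 15), 6: (3, 1)}
top = baseline_outliers(sample, top_n=3)
```

Ranking by absolute z-score means unusually quiet flights can rank as high as unusually troubled ones; ranking by signed z would keep only the high-failure side.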


  • Local Outlier Factor (LOF)

    Top outliers were the flights with the fewest system failures (10 flights)

    The top outlier is ranked 21st
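For reference, a compact Python sketch of the LOF score as defined by Breunig et al. (2000). The briefing's analysis used R packages, so this is not the code behind the result above. The toy points are hypothetical, and the sketch assumes no duplicate points (duplicates can make a local density infinite).

```python
import math

def lof_scores(points, k):
    """Local Outlier Factor for each point (a list of coordinate tuples), with k < len(points)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]  # k-distance of point i
        k_dist.append(kd)
        neighbours.append([j for j in order if dist[i][j] <= kd])  # ties included

    def local_reachability_density(i):
        # Inverse of the mean reachability distance from i to its neighbours.
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        total = sum(reach)
        return math.inf if total == 0 else len(reach) / total

    lrd = [local_reachability_density(i) for i in range(n)]
    # LOF: average neighbour density divided by the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]

# Hypothetical data: four points on a unit square and one point far away.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
toy_scores = lof_scores(toy_points, k=2)
```

A score near 1 means a point is about as dense as its neighbours, while a score well above 1 marks a point in a sparser region than its neighbours. The score says nothing about the direction of the deviation, only about local sparsity.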

  • K-means

    Centers = 6

    Identified 4 of the top 10 outliers (3, 4, 7, 8)

    The top outlier is ranked 7th
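The slide does not say how K-means clusters were turned into an outlier ranking. One common approach, assumed here, is to score each point by its distance to the nearest cluster center. The sketch below is plain Lloyd's algorithm in Python (the briefing used R), run on hypothetical points with 2 centers rather than the 6 used above.

```python
import math
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns the list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if the cluster is empty).
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def rank_by_centroid_distance(points, centroids):
    """Return point indices sorted from most to least outlying."""
    score = [min(math.dist(p, c) for c in centroids) for p in points]
    return sorted(range(len(points)), key=lambda i: score[i], reverse=True)

# Hypothetical data: two tight groups and one straggler at index 8.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
    (4.0, 4.0),
]
ranking = rank_by_centroid_distance(points, kmeans(points, k=2))
```

The number of centers matters: with too many centers a small group of unusual flights can receive its own centroid and stop looking distant.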

  • POD-SVM (MADOF)

    Identified 5 of the top 10 outliers (2406, 2458, 2382, 2472, 2405)

    The top outlier is ranked 1st

  • Top Outlier = 2406

    1. MADOF: Top Outlier 1st

    2. K-Means: Top Outlier 7th

    3. LOF: Top Outlier 21st

    *Each entry in the data table is the row id and corresponds to a specific flight

    *Column header is the order rank for each outlier
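The comparison in the table can be expressed as a small helper. It counts how many of the baseline top outliers a technique places in its own top N, and reports where the technique ranks the baseline's top outlier. This is a sketch with hypothetical row ids, not the briefing's data.

```python
def compare_to_baseline(baseline_top, technique_ranking):
    """baseline_top: row ids in baseline rank order; technique_ranking: row ids as ordered by a technique."""
    top_n = len(baseline_top)
    wanted = set(baseline_top)
    # Baseline outliers that the technique also places in its own top N.
    hits = [rid for rid in technique_ranking[:top_n] if rid in wanted]
    top_id = baseline_top[0]
    # 1-based rank of the baseline's top outlier, or None if the technique never ranked it.
    rank_of_top = technique_ranking.index(top_id) + 1 if top_id in technique_ranking else None
    return hits, rank_of_top

# Hypothetical row ids for illustration only.
baseline_top = [7, 3, 9]
technique_ranking = [3, 5, 7, 9, 1]
hits, rank_of_top = compare_to_baseline(baseline_top, technique_ranking)
```

In this toy case the technique recovers two of the three baseline outliers and ranks the baseline's top outlier third.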

  • SOURCES

    Barnett, V., Lewis, T., 1994. Outliers in Statistical Data, 3rd ed, 584.

    Katdare, Mandar, 2011. Pattern based Outlier Detection in Mixed-Attribute Datasets.

    Hayes, Michael A., Capretz, Miriam A. M., 2014. Contextual anomaly detection framework for big sensor data. Retrieved from https://journalofbigdata.springeropen.com/articles/10.1186/s40537-014-0011-y

    Jawaharlal, Vijayakumar, 2014. Knn Using caret R package. Retrieved from https://rpubs.com/njvijay/16444

    Saxena, Rahul, 2017. K-Nearest Neighbor Implementation in R Using Caret Package. Retrieved from http://dataaspirant.com/2017/01/09/knn-implementation-r-using-caret-package/

  • SOURCES

    http://www.cse.ust.hk/~leichen/courses/comp5331/lectures/LOF_Example.pdf

    Breunig, Markus M., Kriegel, Hans-Peter, Ng, Raymond T., Sander, Jörg, 2000. LOF: Identifying Density-Based Local Outliers. Retrieved from http://www.dbs.ifi.lmu.de/Publikationen/Papers/LOF.pdf

    ANALYSIS

    RStudio Version 1.1.419 – © 2009-2018 RStudio, Inc.

    R i386 3.4.3

    Packages: ellipse, Rlof, caret, AppliedPredictiveModeling, DMwR, FNN, factoextra, gridExtra, probsvm, stats, tidyr, e1071, distances, ggplot2, dplyr