Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Data Science Conference 11-12 October 2016 Belgrade, Serbia

Transcript of Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Page 1: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski


Page 3: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Aircraft challenge

Marko, don’t forget to show the aircraft to the audience

Page 5: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Dr David Warren

Page 6: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

3,400 g shock for 6.5 ms

500 lb dropped from 10 ft with a ¼-inch-diameter contact point

1,100 °C flame for 30 minutes

260 °C for 10 hours

Immersion in aircraft fluids for 24 hours

Immersion in sea water for 30 days

5,000 pounds crush for 5 minutes on each axis

Pressure equivalent to a depth of 20,000 ft

Page 9: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Avoiding Black Boxes with Data Science

Raffaele Rainone & Marko Vasiljevski


Page 10: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski


Dr Raffaele Rainone

Page 11: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

“I don’t know…, pure mathematician, Python developer, data scientist, pizza lover, feeling a bit homesick for Italy, …”

Page 12: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Chatting about…

Aviation Safety

A story about flight data monitoring (FDM)

A story of a flight data analyst

A story of a flight safety statistician

Page 14: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Chatting about…

Data science in (a bit of) action

Jet-engine health – Working with Mr. Bayes

Detecting safety concerns – PCA, a friend

Finding cuckoo’s eggs – Mr. Markov’s chains

Page 15: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Flight Data Monitoring (FDM)

Page 18: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Flight Data Monitoring (FDM) what?!

Part of safety management system (SMS)

Airlines worldwide obliged to do FDM

Spotting deviations from safe operation

Non-punitive – learn from mistakes

Confidential

Page 20: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Quick Access Recorder

Flight Data Recorder

Data Acquisition Unit

Page 24: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

002AF5C0 00 00 76 07 04 01 64 08 06 01 3C 0C F2 0E 3C 0C ..v...d...<.ò.<.
002AF5D0 00 00 58 02 02 0C 00 0C 48 08 00 00 00 00 00 0C ..X.....H.......
002AF5E0 00 0C 00 00 00 00 E0 0F 00 00 00 0C 40 0C 00 0C ......à.....@...
002AF5F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90 01 ...............�
002AF600 47 02 77 07 03 00 2B 0E DF 0F 02 0C A5 0F 78 00 G.w...+.ß...¥.x.
002AF610 55 04 D3 01 D8 0F EA 03 A4 0F EE 0F FD 0F 00 00 U.Ó.Ø.ê.¤.î.ý...
002AF620 08 00 10 00 00 00 08 00 11 00 00 00 06 00 1E 00 ................
002AF630 F8 0F 10 00 00 00 8C 0B 0C 00 0A 00 00 00 00 00 ø.....Œ.........
002AF640 00 00 77 07 96 0F 64 08 06 01 3C 0C F2 0E 08 08 ..w.–.d...<.ò...
002AF650 00 00 00 00 01 00 AC 00 4C 08 01 08 00 00 00 00 ......¬.L.......
002AF660 00 00 04 02 D0 02 0D 02 12 02 00 0B 44 01 00 00 ....Ð.......D...
002AF670 A4 06 40 01 03 00 00 00 40 02 00 00 00 00 00 00 ¤.@.....@.......
002AF680 02 00 77 07 00 00 2B 0E DF 0F 04 0C A5 0F 00 08 ..w...+.ß...¥...
002AF690 54 04 D4 01 DA 0F EA 03 A4 0F 73 0E 00 00 01 00 T.Ô.Ú.ê.¤.s.....
002AF6A0 08 00 10 00 01 00 08 00 00 00 03 00 06 00 1D 00 ................
002AF6B0 FB 0F 11 00 02 00 01 00 00 00 03 00 00 00 00 00 û...............
002AF6C0 01 00 76 07 00 00 64 08 06 01 3C 0C F2 0E 3C 0C ..v...d...<.ò.<.
002AF6D0 6A 02 21 03 03 00 00 00 48 08 00 00 00 00 00 00 j.!.....H.......
002AF6E0 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00 00 ................
002AF6F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FC 0F ..............ü.
002AF700 CC 08 77 07 03 00 2B 0E DF 0F 02 0C A5 0F 79 01 Ì.w...+.ß...¥.y.
002AF710 54 04 D4 01 D8 0F EA 03 A7 0F EE 0F FE 0F 01 00 T.Ô.Ø.ê.§.î.þ...
002AF720 08 00 00 00 00 00 00 00 11 00 02 00 06 00 00 00 ................
002AF730 F8 0F 00 00 00 00 8C 0B 00 00 00 00 00 00 00 00 ø.....Œ.........
002AF740 00 00 76 07 94 0F 64 08 06 01 3C 0C F2 0E 08 08 ..v.”.d...<.ò...
002AF750 9A 0C 18 00 61 0C 68 01 48 08 F8 0F A4 0C 00 00 š...a.h.H.ø.¤...
002AF760 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5B 02 ..............[.
002AF770 00 00 03 02 00 00 2B 0F F0 0F 2A 0F 00 00 00 00 ......+.ð.*.....
002AF780 00 00 77 07 00 00 2B 0E DF 0F 04 0C A5 0F 2C 00 ..w...+.ß...¥.,.
002AF790 54 04 D4 01 D8 0F EB 03 A6 0F 00 0D 00 00 01 00 T.Ô.Ø.ë.¦.......
002AF7A0 08 00 17 00 03 00 0F 00 00 00 00 00 06 00 1C 00 ................
002AF7B0 F8 0F 10 00 02 00 01 00 00 00 03 00 00 00 00 00 ø...............
002AF7C0 00 00 76 07 11 00 64 08 06 01 3C 0C F2 0E 3C 0C ..v...d...<.ò.<.

What now?

Data frames
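A minimal sketch of the first step from raw bytes towards data frames. It assumes each pair of bytes in the dump is a little-endian 16-bit container holding one 12-bit data word, which is consistent with the values shown (no word exceeds 0x0FFF) but is not stated on the slide. The function name decode_words is hypothetical, and mapping words to named parameters needs the aircraft's frame layout, which is not shown here.

```python
import struct


def decode_words(raw):
    """Split raw recorder bytes into 12-bit words (assumed little-endian 16-bit containers)."""
    count = len(raw) // 2
    containers = struct.unpack(f"<{count}H", raw[: count * 2])
    # Keep only the low 12 bits; the top 4 bits are assumed to be padding.
    return [value & 0x0FFF for value in containers]


# First row of the dump (address 002AF5C0).
first_row = bytes.fromhex("00 00 76 07 04 01 64 08 06 01 3C 0C F2 0E 3C 0C")
words = decode_words(first_row)
print([f"{w:03X}" for w in words])
```

Once the words are split out, each position in a frame can be turned into a column of a data frame, one row per frame.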

Page 26: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Questions, suggestions, congestions?

Page 27: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

The story of a flight data analyst

Page 36: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Questions, suggestions, congestions?

Page 37: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

The story of a flight safety statistician

STATISTICS

Page 39: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

How do we monitor safety in FDM

[Figure: event count, scale 0 to 8]

Event Count 3.141592653589793238462643383384

Page 40: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

And…

Domain knowledge

Experience

Common sense

Page 41: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Uncertainty - good companion

Normal acceleration – Tdwn

Statistic            Value
Total count          80,663
Average              1.31
(Min, Max)           (1.03, 2.40)
Range                1.37
Standard deviation   0.10

Wider histogram, less confidence in mean value

Normal acceleration – Lift-off

Statistic            Value
Total count          80,663
Average              1.19
(Min, Max)           (0.66, 1.63)
Range                0.97
Standard deviation   0.05

Narrower histogram, more confidence in mean value
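One common way to put a number on "confidence in mean value" is the standard error of the mean, the standard deviation divided by the square root of the count. The sketch below applies it to the two sets of figures from the slide, taking them in the order they appear; the helper name standard_error is hypothetical and the slide does not say which measure the speakers used.

```python
import math


def standard_error(std_dev, count):
    """Standard error of the mean for a sample with the given spread and size."""
    return std_dev / math.sqrt(count)


# Figures taken from the two tables on the slide.
wider_se = standard_error(0.10, 80_663)      # standard deviation 0.10
narrower_se = standard_error(0.05, 80_663)   # standard deviation 0.05

print(f"wider histogram:    {wider_se:.6f}")
print(f"narrower histogram: {narrower_se:.6f}")
```

With equal counts, halving the standard deviation halves the uncertainty in the mean.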

Page 42: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

How apples relate to flight safety

Page 43: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Micromanaging events

[Bar chart: Top 5 Events – January 2016; value axis 0 to 9]

Page 44: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Micromanaging events

[Bar chart: Top 7 Events – January 2016; value axis 0 to 9. Events: High speed taxiing, High lateral g taxiing, Flap overspeed, Airspeed high 10000-5000 ft, Excessive braking, Pull up, Stick shaker]

Page 45: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Most severe safety event – a riddle

It doesn’t happen in the air

It doesn’t happen at the gate

It happens in the office

Page 46: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Micromanaging events

[Bar chart: Top 7 Events – January 2016; value axis 0 to 9. Events: High speed taxiing, High lateral g taxiing, Flap overspeed, Airspeed high 10000-5000 ft, Excessive braking, Pull up, Stick shaker]

Page 47: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Not looking at your data!

Page 48: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Trending?

Page 49: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Correlation?

Page 50: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Questions, suggestions, congestions?

Page 51: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Enough, it’s a data science conference!

Page 52: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Improving current state

Jet-engine health with Mr. Bayes

Page 55: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Engine performance monitoring

N1

N2

Fuel Flow

Exhaust Gas Temperature

Page 56: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Engine performance monitoring

Fuel Flow

Exhaust Gas Temperature

Page 58: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

End of take-off flow

Page 65: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

End of take-off – a problem

If not detected – problems with engine health

Difficult to do with classical signal processing

What if a parameter fails (not so rarely)

Page 66: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

End of take-off – a solution

Time at which vertical navigation mode (VNAV) is selected

Drop/rise in fuel flow (FF) and exhaust gas temperature (EGT)

Search for changes around that point in time

Gather the knowledge

Calculate the probability of end of take-off (EOT) for an unseen flight
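The steps above can be sketched as a small Bayesian calculation. This is an illustration under stated assumptions, not the speakers' model: the prior is a Gaussian centred on the VNAV selection time plus an offset that would be learned from past flights, and the likelihood is simply proportional to the sample-to-sample drop in fuel flow. The function name eot_posterior and all numbers are hypothetical.

```python
import numpy as np


def eot_posterior(fuel_flow, vnav_index, offset_mean, offset_std):
    """Posterior probability, per sample, that the end of take-off (EOT) occurs there."""
    t = np.arange(len(fuel_flow))
    # Prior (assumed Gaussian): EOT is expected near VNAV selection plus a learned offset.
    prior = np.exp(-0.5 * ((t - vnav_index - offset_mean) / offset_std) ** 2)
    # Likelihood proxy (assumption): bigger fuel-flow drops are more likely to be EOT.
    drop = np.zeros(len(fuel_flow))
    drop[1:] = np.clip(fuel_flow[:-1] - fuel_flow[1:], 0.0, None)
    likelihood = drop + 1e-9  # small floor so the posterior is never all zeros
    posterior = prior * likelihood
    return posterior / posterior.sum()


# Synthetic flight: fuel flow steps down at sample 30, VNAV selected at sample 28.
fuel_flow = np.concatenate([np.full(30, 5000.0), np.full(30, 4200.0)])
posterior = eot_posterior(fuel_flow, vnav_index=28, offset_mean=2.0, offset_std=3.0)
eot_index = int(np.argmax(posterior))
print(eot_index)
```

The same structure extends to EGT by multiplying in a second likelihood term, and it degrades gracefully if one parameter is missing, since the prior and the remaining term still give a usable posterior.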

If crazy about this, search for the PyData London 2016 video

M. Vasiljevski & R. Rainone, “Python flying at 40,000 feet”

Page 73: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Questions, suggestions, congestions?

Page 74: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Finding new concerns

Novelty detection with principal component analysis

Page 78: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Standardising

mean = 0

standard deviation = 1

Page 82: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Formally…

The previous 700 x 123 matrix is A

Multiply the transpose of A by A (and scale by the number of rows) to get the covariance matrix, C. It is 123 x 123

Do a singular value decomposition of C to find its eigenvectors and eigenvalues

The eigenvectors with the largest eigenvalues coincide with the directions of highest variance

Keep just the first couple of eigenvectors to reduce dimensionality
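A minimal NumPy sketch of the recipe above. The 700 x 123 shape comes from the slide; the data itself is random placeholder data, and the last two lines (reconstruction error as a novelty score) are one common way to use PCA for novelty detection, assumed here rather than taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
raw = rng.normal(size=(700, 123))  # placeholder for 700 flights x 123 measurements

# Standardise each column: mean 0, standard deviation 1.
A = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

# Covariance matrix of the standardised data, 123 x 123.
C = A.T @ A / (A.shape[0] - 1)

# For a symmetric positive semi-definite matrix the SVD yields its eigenvectors
# (columns of U) and eigenvalues (singular values), sorted largest first.
eigvecs, eigvals, _ = np.linalg.svd(C)

k = 2                          # keep just the first couple of directions
components = eigvecs[:, :k]    # 123 x k
scores = A @ components        # 700 x k, the reduced representation

# Assumed novelty score: how badly each row is reconstructed from k components.
reconstruction = scores @ components.T
novelty = np.linalg.norm(A - reconstruction, axis=1)
```

Rows with an unusually large novelty value are the ones that the dominant directions of variance explain poorly.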

Page 92: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Questions, suggestions, congestions?

Page 93: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Helping business

Flight data upload monitor based on Markov chains

Page 94: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Flight data upload data data data

State_1 = M x State_0

State_2 = M x State_1 = M x (M x State_0) = M^2 x State_0

State_N = M^N x State_0

No need to know the intermediate states, just the powers of M (the transition probabilities)
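The relation State_N = M^N x State_0 can be checked numerically. The sketch below uses a hypothetical two-state chain with made-up probabilities (the talk's actual states and matrix are not in the transcript). Columns of M sum to 1 because the slide multiplies the matrix on the left of the state vector.

```python
import numpy as np

# Hypothetical states: index 0 = "upload on time", index 1 = "upload late".
# Column j holds the probabilities of moving from state j to each state.
M = np.array([[0.9, 0.5],
              [0.1, 0.5]])
state0 = np.array([1.0, 0.0])  # start in "upload on time" with certainty


def state_after(n_steps, transition, start):
    """Distribution after n_steps, using a matrix power instead of stepping."""
    return np.linalg.matrix_power(transition, n_steps) @ start


state3 = state_after(3, M, state0)

# Same result by stepping through the intermediate states one at a time.
stepped = state0
for _ in range(3):
    stepped = M @ stepped

print(state3, stepped)
```

Both routes give the same distribution, which is why only the powers of M are needed.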

Page 106: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Questions, suggestions, congestions?

Page 107: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Credits

Page 109: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

CHRIS JESSE

Page 110: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

RAFFAELE (PIZZA) RAINONE

Page 111: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

MARTA VASILJEVSKI

Page 112: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

MILAN BOROTA

Page 113: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

MILICA IVANIŠEVIĆ + foeniculum vulgare

Page 114: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

ZIZI (paid for my tickets)

Page 115: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

Main takeaways from this talk

Page 116: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

LEARN

Page 117: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

HAVE FUN

Page 118: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

SHARE KNOWLEDGE

Page 119: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

THINK

Page 120: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

PLAY

Page 121: Avoiding Black Boxes with Data Science part I & II - Marko Vasiljevski

THANK YOU & SAFE FLYING