Machine learning for machine data

Page 1

Machine learning for machine data

David Andrzejewski - @davidandrzej
Data Sciences, Sumo Logic
Strata Conference – Machine Data Track
February 13, 2014

Page 2

This talk: Machine Learning + Machine Data = Awesome!

• YES
  – overview of log data
  – solving log data problems with machine learning
  – specific examples
    • (mostly) Sumo Logic-related
    • customer use cases
  – general lessons learned

Page 3

This talk: Machine Learning + Machine Data = Awesome!

• NO (or, not much)
  – Sumo Logic deep dive
  – Tech stack talk
    • In-memory Hadoop for real-time Cassandra SQL in hybrid clouds
  – Big data "shock and awe"
    • 800 yottabytes / second ZOMG!!11!!
  – Algorithm shootout
    • Deep learning vs random forests vs SVMs vs coin flips vs ...
  – Extreme math

[Slide image: page 131 of "HyperLogLog: analysis of a near-optimal cardinality algorithm"]

Definition 1 An ideal multiset of cardinality $n$ is a sequence obtained by arbitrary replications and permutations applied to $n$ uniform identically distributed random variables over the real interval $[0, 1]$.

In the analytical part of our paper (Sections 2 and 3), we postulate that the collection of hashed values $h(\mathcal{M})$, which the algorithm processes constitutes an ideal multiset. This assumption is a natural way to model the outcome of well designed hash functions. Note that the number of distinct elements of such an ideal multiset equals $n$ with probability 1. We henceforth let $\mathbb{E}_n$ and $\mathbb{V}_n$ be the expectation and variance operators under this model.

Theorem 1 Let the algorithm HYPERLOGLOG of Figure 2 be applied to an ideal multiset of (unknown) cardinality $n$, using $m \ge 3$ registers, and let $E$ be the resulting cardinality estimate.

(i) The estimate $E$ is asymptotically almost unbiased in the sense that

$$\frac{1}{n}\,\mathbb{E}_n(E) \underset{n\to\infty}{=} 1 + \delta_1(n) + o(1), \quad \text{where } |\delta_1(n)| < 5 \cdot 10^{-5} \text{ as soon as } m \ge 16.$$

(ii) The standard error defined as $\frac{1}{n}\sqrt{\mathbb{V}_n(E)}$ satisfies as $n \to \infty$,

$$\frac{1}{n}\sqrt{\mathbb{V}_n(E)} \underset{n\to\infty}{=} \frac{\beta_m}{\sqrt{m}} + \delta_2(n) + o(1), \quad \text{where } |\delta_2(n)| < 5 \cdot 10^{-4} \text{ as soon as } m \ge 16,$$

the constants $\beta_m$ being bounded, with $\beta_{16} \doteq 1.106$, $\beta_{32} \doteq 1.070$, $\beta_{64} \doteq 1.054$, $\beta_{128} \doteq 1.046$, and $\beta_\infty = \sqrt{3\log(2) - 1} \doteq 1.03896$.

The standard error measures in relative terms the typical error to be observed (in a mean quadratic sense). The functions $\delta_1(n)$, $\delta_2(n)$ represent oscillating functions of a tiny amplitude, which are computable, and whose effect could in theory be at least partly compensated; they can anyhow be safely neglected for all practical purposes.

Plan of the paper. The bulk of the paper is devoted to the proof of Theorem 1. We determine the asymptotic behaviour of $\mathbb{E}_n(Z)$ and $\mathbb{V}_n(Z)$, where $Z$ is the indicator $1/\sum 2^{-M^{(j)}}$. The value of $\alpha_m$ in Equation (3), which makes $E$ an asymptotically almost unbiased estimator, is derived from this analysis, as is the value of the standard error. The mean value analysis forms the subject of Section 2. In fact, the exact expression of $\mathbb{E}_n(Z)$ being hard to manage, we first "poissonize" the problem and examine $\mathbb{E}_{\mathcal{P}(\lambda)}(Z)$, which represents the expected value of the indicator $Z$ when the total number of elements is not fixed, but rather obeys a Poisson law of parameter $\lambda$. We then prove that, asymptotically, the behaviours of $\mathbb{E}_n(Z)$ and $\mathbb{E}_{\mathcal{P}(\lambda)}(Z)$ are close, when one chooses $\lambda := n$: this is the depoissonization step. The variance analysis of the indicator $Z$, hence of the standard error, is sketched in Section 3 and is entirely parallel to the mean value analysis. Finally, Section 4 examines how to implement the HYPERLOGLOG algorithm in real-life contexts, presents simulations, and discusses optimality issues.

2 Mean value analysis

Our starting point is the random variable $Z$ (the "indicator") defined in (2). We recall that $\mathbb{E}_n$ refers to expectations under the ideal multiset model, when the (unknown) cardinality $n$ is fixed. The analysis starts from the exact expression of $\mathbb{E}_n(Z)$ in Proposition 1, continues with an asymptotic analysis of the corresponding Poisson expectation summarized by Proposition 2, and concludes with the depoissonization argument of Proposition 3.

Page 4

Context: me

• Data sciences @ Sumo Logic
• Co-organizer @ SF ML Meetup
• Previous
  – Post-doc in knowledge discovery
• Even more previous machine data research projects
  – University of Wisconsin-Madison
  – Microsoft Research

[Figure: 3-D scatter plot with axes PCA1, PCA2, PCA3. Legend: 254 bug1 runs, 106 bug3 runs, 147 bug4 runs, 329 bug5 runs, 206 bug8 runs, 186 other runs]

Page 5

Context: Sumo Logic

"Turning Machine Data Into IT and Business Insights"

Learn, classify, predict

Search, monitor, visualize

Page 7

Anatomy of a log message: Five W's

Page 8

Anatomy of a log message: Five W's

• When? Timestamp with time zone

Page 9

Anatomy of a log message: Five W's

• When? Timestamp with time zone
• Where? Host, module, code location

Page 10

Anatomy of a log message: Five W's

• When? Timestamp with time zone
• Where? Host, module, code location
• Who? Authentication context

Page 11

Anatomy of a log message: Five W's

• When? Timestamp with time zone
• Where? Host, module, code location
• Who? Authentication context
• What? Log level and key-value pairs
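
The fields listed above can be pulled out of a single line with a small parser. This is only a sketch: the line layout, the `parse_line` name, and the sample line are invented for illustration, since real logs vary widely in format.

```python
import re
from datetime import datetime

# Hypothetical layout, invented for this sketch:
#   <ISO timestamp with offset> <host> <module> [user=<id>] <LEVEL> key=value ...
LINE = re.compile(
    r"(?P<when>\S+) (?P<host>\S+) (?P<module>\S+) "
    r"\[user=(?P<who>[^\]]+)\] (?P<level>[A-Z]+) (?P<rest>.*)"
)


def parse_line(line):
    """Split one log line into when / where / who / what, or return None."""
    m = LINE.match(line)
    if m is None:
        return None
    # Everything after the level is treated as space-separated key=value pairs.
    pairs = dict(kv.split("=", 1) for kv in m.group("rest").split() if "=" in kv)
    return {
        "when": datetime.fromisoformat(m.group("when")),  # keeps the UTC offset
        "where": (m.group("host"), m.group("module")),
        "who": m.group("who"),
        "what": (m.group("level"), pairs),
    }


example = parse_line(
    "2014-02-13T10:15:00-08:00 pay002.sjc billing [user=234fsf] "
    "ERROR txn=failed amount=1725.00"
)
```

Lines that do not fit the assumed layout return `None` rather than raising, which is usually the more convenient behaviour when formats are mixed.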

Page 12

What's missing

Page 13

Traversing the stack

• Custom App Code
• Server / OS
• Virtualization
• Databases
• Network
• Open Source Software
• Middleware

12/20/2011 17:23:44 PST [user=234fsf] failed transaction, sessionid:2F0A232324, [host=pay002.sjc] amount=1725.00

12/20/11 17:23:34 AMQ7163: WebSphere MQ job number 18429 started FOR client_session=2F0A232324.

12202011 17:23:27 /usr/local/build/mysql/libexec/mysqld: Abnormal shutdown [18429]

20-12-2011 17:23:19 database-host login[3866]: DEAD_PROCESS: 18429 ttys000

Dec 20, 2011 17:22:14,,, message=Created virtual machine user-3 on esxi01.office.thedomain.com

<134>Dec 20 2011 17:22:12: %PIX-6-106100: access-list inside_access_out denied tcp inside/68.162.72.163(4326) -> outside/45.200.244.124(3127) hit-cnt 1(first hit)

66.249.67.24 - - [20/Dec/2011:17:23:40 -0700] "POST /APP/Order.php HTTP/1.1" 304 146 "-" SESSION=2F0A232324

Callouts: Customer ID, Session ID, Job number, Process ID, Root cause!
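
In the sample lines above, the session ID 2F0A232324 appears in the web, application, and MQ lines, and the number 18429 appears in the MQ, mysqld, and login lines. A rough way to follow such a chain automatically is to treat shared identifiers as links and walk them breadth-first. The `ID` pattern below is a crude heuristic chosen for these sample lines (it also picks up the digit-only timestamp `12202011`, which is harmless here), and `linked_lines` is a hypothetical helper, not anything from the talk.

```python
import re
from collections import deque

# Crude heuristic: runs of 5+ uppercase-hex characters count as identifiers.
ID = re.compile(r"\b[0-9A-F]{5,}\b")

LOGS = [
    '66.249.67.24 - - [20/Dec/2011:17:23:40 -0700] "POST /APP/Order.php HTTP/1.1" 304 146 "-" SESSION=2F0A232324',
    "12/20/2011 17:23:44 PST [user=234fsf] failed transaction, sessionid:2F0A232324, [host=pay002.sjc] amount=1725.00",
    "12/20/11 17:23:34 AMQ7163: WebSphere MQ job number 18429 started FOR client_session=2F0A232324.",
    "12202011 17:23:27 /usr/local/build/mysql/libexec/mysqld: Abnormal shutdown [18429]",
    "20-12-2011 17:23:19 database-host login[3866]: DEAD_PROCESS: 18429 ttys000",
    "Dec 20, 2011 17:22:14,,, message=Created virtual machine user-3 on esxi01.office.thedomain.com",
]


def linked_lines(lines, start_id):
    """Return indices of lines reachable from start_id via shared identifiers."""
    ids_per_line = [set(ID.findall(line)) for line in lines]
    seen_ids, seen_lines = {start_id}, set()
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for i, ids in enumerate(ids_per_line):
            if current in ids and i not in seen_lines:
                seen_lines.add(i)
                # Any other identifier on this line becomes a new lead to follow.
                for other in ids - seen_ids:
                    seen_ids.add(other)
                    queue.append(other)
    return sorted(seen_lines)


chain = linked_lines(LOGS, "2F0A232324")
```

Starting from the session ID, the walk reaches the first five lines and leaves the unrelated virtual-machine line out.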

Page 14

Log use cases – "organizational perception"

Enhanced visibility into machine behaviors
• Compliance
  – Operational (SLA)
  – Regulatory (audits)
  – Security
• Availability / performance
  – Faster MTTR
• Business insights ($$$)

Page 15

Log challenges

• (wildly) varying formats
  – printf, JSON, XML, Windows, X-delimited, ...
• Specialized knowledge
• Noise
• Cascading failures

[2008-05-07 09:50:08.450 'App' 3560 verbose] [VpxdHeartbeat] Invalid heartbeat from 10.17.218.46

Page 16

Complexity

"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." - Leslie Lamport

[Figure: London Underground map, Transport for London, December 2013]

Page 17

"OMG java.lang.NullPointerException #fail"

• Logs: like "computer tweets"
• Twitter 2013*
  – Peak @ ~144k TPS
  – Avg ~6k tweets / second
• Log data
  – Example: 1 TB / day
  – Avg ~25k logs / second

* https://blog.twitter.com/2013/new-tweets-per-second-record-and-how
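
The figures above can be sanity-checked with a little arithmetic. The sketch assumes a decimal terabyte (10^12 bytes) and treats both rates as daily averages; the implied average message size is a derived number, not one stated on the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

bytes_per_day = 10**12      # 1 TB / day, decimal terabyte (assumption)
logs_per_second = 25_000    # average rate quoted on the slide

bytes_per_second = bytes_per_day / SECONDS_PER_DAY       # roughly 11.6 MB/s
avg_bytes_per_log = bytes_per_second / logs_per_second   # roughly 463 bytes
logs_per_day = logs_per_second * SECONDS_PER_DAY         # 2.16 billion

tweets_per_day = 6_000 * SECONDS_PER_DAY                 # 518.4 million
```

Under these assumptions, 1 TB / day at ~25k logs / second corresponds to messages of a few hundred bytes each, and a single such deployment emits roughly four times as many messages per day as the quoted average tweet rate.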

Page 18

Systems that learn from experience

Page 19

Unsupervised clustering
• Given: set of items
• Do: group similar items

Page 21

Too many logs! "data disorientation"

~60k results: 30 minutes, one component

Page 22

Distill logs down to underlying structure

Page 23

LogReduce: results "compressed" ~1000x

Page 24

In the beginning, there was the printf()

printf("Health status check: %s is %s",
       hostid, hoststatus)

Log generation:

Health status check: zim-5 is OK
Health status check: gir-3 is OK
Health status check: gir-2 is TIMED OUT
Health status check: dib-1 is OK

Page 25

Reverse engineering printf()

printf("Health status check: %s is %s",
       hostid, hoststatus)

Log generation:

Health status check: zim-5 is OK
Health status check: gir-3 is OK
Health status check: gir-2 is TIMED OUT
Health status check: dib-1 is OK

"magic":

Health status check: *** is ***
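
One simple way to approximate the "magic" step is to align messages token by token and replace whatever differs with a wildcard. The sketch below uses Python's `difflib.SequenceMatcher` for the alignment; it is an illustration under that assumption, not a description of how LogReduce works, and the function names are made up.

```python
from difflib import SequenceMatcher

WILDCARD = "***"


def merge(sig_tokens, msg_tokens):
    """Keep tokens shared by both sequences; collapse differences to a wildcard."""
    out = []
    matcher = SequenceMatcher(a=sig_tokens, b=msg_tokens, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(sig_tokens[i1:i2])
        elif not out or out[-1] != WILDCARD:
            # replace / insert / delete: one wildcard covers the whole gap,
            # so "OK" and "TIMED OUT" both fit the same slot.
            out.append(WILDCARD)
    return out


def infer_signature(messages):
    """Fold every message into a single signature string."""
    tokens = messages[0].split()
    for message in messages[1:]:
        tokens = merge(tokens, message.split())
    return " ".join(tokens)


signature = infer_signature([
    "Health status check: zim-5 is OK",
    "Health status check: gir-3 is OK",
    "Health status check: gir-2 is TIMED OUT",
    "Health status check: dib-1 is OK",
])
```

On the four sample lines this recovers the signature shown on the slide. It assumes the messages already belong to the same group; deciding which messages belong together is the clustering problem.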

Page 26

Unsupervised clustering
• Given: log messages
• Do: group by "signature"

1. Define string distance function (e.g., Levenshtein)
2. Do distance-based clustering
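
The two steps can be sketched in a few lines. The choices here are assumptions made for illustration: the distance is computed over whitespace-separated tokens rather than characters, the clustering is a greedy single pass against each cluster's first message, and `max_ratio=0.5` is an arbitrary threshold.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


def cluster(messages, max_ratio=0.5):
    """Greedy clustering: join the first cluster whose representative is close."""
    clusters = []  # each cluster is a list; element 0 is the representative
    for message in messages:
        tokens = message.split()
        for group in clusters:
            rep = group[0].split()
            distance = levenshtein(rep, tokens)
            if distance / max(len(rep), len(tokens)) <= max_ratio:
                group.append(message)
                break
        else:
            clusters.append([message])
    return clusters


groups = cluster([
    "Health status check: zim-5 is OK",
    "Health status check: gir-3 is OK",
    "Request processed in 12 ms",
    "Health status check: gir-2 is TIMED OUT",
    "Request processed in 9 ms",
    "Health status check: dib-1 is OK",
])
```

The result depends on message order and on the threshold, which is one reason user feedback on the clusters is useful.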

Page 27

Drill-down into the original raw logs

Page 28

Partially supervised clustering
• Given: set of items + side info
• Do: group similar items

Page 30

Too many wildcards!

Page 31

"Hint" from human user

Page 32

Not enough wildcards!

Page 33

"Hint" from human user
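
The slides show these hints only as screenshots, so the following is one possible reading rather than the product's behaviour. It models two hypothetical operations on signatures: `split_on` pins an over-general wildcard to the values actually observed, and `merge_signatures` generalizes two over-specific signatures into one. Both assume messages and signatures have the same number of whitespace-separated tokens.

```python
WILDCARD = "***"


def matches(signature, message):
    """True if the message fits the signature token for token."""
    sig, msg = signature.split(), message.split()
    return len(sig) == len(msg) and all(
        s == WILDCARD or s == m for s, m in zip(sig, msg)
    )


def split_on(signature, messages, position):
    """'Too many wildcards' hint: pin the wildcard at `position` to each value seen."""
    groups = {}
    for message in messages:
        if matches(signature, message):
            tokens = signature.split()
            tokens[position] = message.split()[position]
            groups.setdefault(" ".join(tokens), []).append(message)
    return groups


def merge_signatures(sig_a, sig_b):
    """'Not enough wildcards' hint: wildcard every position where the two differ."""
    a, b = sig_a.split(), sig_b.split()
    if len(a) != len(b):
        raise ValueError("this sketch only merges signatures of equal length")
    return " ".join(x if x == y else WILDCARD for x, y in zip(a, b))


messages = [
    "Health status check: zim-5 is OK",
    "Health status check: gir-3 is OK",
    "Health status check: gir-2 is DOWN",
    "Health status check: dib-1 is OK",
]
by_status = split_on("Health status check: *** is ***", messages, position=5)
merged = merge_signatures("Connection from *** accepted", "Connection from *** refused")
```

Splitting on the status token separates the healthy checks from the failing one, which is the kind of distinction a user might care about and a purely unsupervised method cannot know to preserve.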

Page 34

Learning to rank
• Given: set of items, historical data
• Do: rank by "relevance"

Page 35

Two pages is still too many!

Page 36

Learning to rank
• Given: signatures, user activity
• Do: rank by "relevance"
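
As a toy illustration of ranking from user activity, the sketch below scores each signature by a Laplace-smoothed ratio of positive to total votes. This is a deliberately simple stand-in for a learned ranking model; the vote format and the `rank_signatures` name are assumptions, not details from the talk.

```python
from collections import Counter


def rank_signatures(signatures, feedback, prior=1.0):
    """Order signatures by smoothed share of positive votes, best first.

    feedback: iterable of (signature, vote) pairs, vote > 0 meaning "relevant".
    """
    up, down = Counter(), Counter()
    for signature, vote in feedback:
        (up if vote > 0 else down)[signature] += 1

    def score(signature):
        # Signatures with no votes get the neutral score 0.5.
        return (up[signature] + prior) / (
            up[signature] + down[signature] + 2 * prior
        )

    return sorted(signatures, key=score, reverse=True)


ranked = rank_signatures(
    ["Health check OK", "Txn timeout, retry", "Request processed"],
    [("Txn timeout, retry", +1), ("Txn timeout, retry", +1), ("Health check OK", -1)],
)
```

A real system would likely use features of the signature text as well, so that feedback on one signature also informs the ranking of similar ones.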

Page 37

True story: troubleshooting @ digital media co.

• Site "acting weird"
• Investigation with LogReduce
  – error logs → issue with content push/publish workflow
    • root cause
  – 50 minutes later: "object missing" errors serving content
    • user-visible outage
• Benefits
  – rapidly "skim" the logs
  – create an alert

Page 38

More true stories

• Domains
  – financial services
  – SaaS vendors
• Use cases
  – Availability / performance
    • Mean time to investigation (MTTI) = "hours to minutes"
  – Security
    • Quickly bubble up unusual logs

Page 39

General lessons

Page 40

General lessons

• Combat "data disorientation"

Page 41

General lessons

• Combat "data disorientation"
• Surface latent structure

Page 42

General lessons

• Combat "data disorientation"
• Surface latent structure
• Link to underlying raw data

Page 43

General lessons

• Combat "data disorientation"
• Surface latent structure
• Link to underlying raw data
• Empower user to improve results

Page 44

unknown unknowns

Page 45

Outlier detection
• Given: data points
• Do: identify outliers
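
A minimal example of the generic task, assuming one-dimensional numeric data: the sketch flags points whose modified z-score, based on the median and the median absolute deviation (MAD), exceeds a threshold. The 0.6745 scaling and the 3.5 cut-off are the conventional values for this score; the talk does not say which method was used.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the points are identical; the score is undefined.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


outliers = mad_outliers([10, 12, 11, 13, 12, 11, 95])
```

Using the median instead of the mean keeps a single extreme value from masking itself by inflating the spread estimate.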

Page 47

Anomaly detection
• Given: log data
• Do: flag anomalies

Health check OK
Request processed
Txn timeout, retry
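
One way to turn log data into something an outlier detector can use is to count signatures per time window and compare the current window with history. The sketch below flags a signature when its count is far from its historical mean, or when it has never been seen before. The window representation, the `k=3.0` threshold, and the floor of 1.0 on the standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def flag_anomalies(history, current, k=3.0):
    """Compare per-signature counts in `current` with past windows.

    history: non-empty list of {signature: count} dicts, one per past window.
    current: {signature: count} for the window under inspection.
    Returns {signature: human-readable reason} for each flagged signature.
    """
    if not history:
        raise ValueError("need at least one past window")
    flagged = {}
    known = {signature for window in history for signature in window}
    for signature in known | set(current):
        past = [window.get(signature, 0) for window in history]
        now = current.get(signature, 0)
        mu, sigma = mean(past), pstdev(past)
        if not any(past) and now > 0:
            flagged[signature] = "never seen before"
        elif abs(now - mu) > k * max(sigma, 1.0):
            flagged[signature] = f"count {now} vs typical {mu:.1f}"
    return flagged


history = [
    {"Health check OK": 60, "Request processed": 100, "Txn timeout, retry": 1},
    {"Health check OK": 60, "Request processed": 110, "Txn timeout, retry": 0},
    {"Health check OK": 60, "Request processed": 90, "Txn timeout, retry": 2},
]
current = {"Health check OK": 60, "Request processed": 105, "Txn timeout, retry": 40}
flagged = flag_anomalies(history, current)
```

Returning a reason string alongside each flag makes it easier to answer "why was this flagged?" and to link back to the raw logs behind the count.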

Page 49

Investigate and annotate events

[Diagram labels: logs, signatures, RAW DATA, HUMAN]

Page 51

Investigate and annotate events

[Diagram labels: logs, signatures, event, RAW DATA, HUMAN]

Page 52

Investigate and annotate events

[Diagram labels: logs, signatures, event, timeline / alerts, RAW DATA, HUMAN]

Page 53

Supervised classification
• Given: labeled data points
• Do: predict future labels


Supervised classification
• Given: log data, annotated events
• Do: classify new occurrences

[Diagram labels: event, timeline / alerts]
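As a rough illustration of classifying new occurrences from annotated events, the sketch below trains a small multinomial naive Bayes model on whitespace tokens. The talk does not say which classifier was used, so the technique, the class name `NaiveBayesLogClassifier`, and the example labels are assumptions.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesLogClassifier:
    """Multinomial naive Bayes over log tokens with add-one smoothing."""

    def __init__(self):
        self.label_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, messages, labels):
        # Each annotated event contributes its tokens to its label's counts.
        for msg, label in zip(messages, labels):
            tokens = msg.lower().split()
            self.label_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, message):
        tokens = message.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n in self.label_counts.items():
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(n / total)
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A user who annotates a few failed-login messages could then have later, similar messages labeled automatically and routed to a timeline or alert.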


True stories

• SSH problems
  – configuration errors
  – script user auth failures
• Potential security events
  – surge of failed logins
• Unhappy infrastructure
  – Oracle
  – VMware


More general lessons

• Explain algorithm “decisions”
  – why was this flagged as an anomaly?
• Link to underlying raw data
  – (AGAIN)
• Empower user to improve results
  – (AGAIN)


Numerical time-series data


Signal decomposition
• Given: time-series
• Do: extract model components
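A minimal sketch of one classical decomposition, assuming an additive model (series = trend + seasonal + residual) and an odd seasonal period. The slide does not name a specific model, so the function `decompose` and the moving-average approach are illustrative choices rather than the talk's method.

```python
def decompose(series, period):
    """Split a series into trend, seasonal, and residual parts (additive).

    Assumes an odd `period` and at least two full periods of data.
    Positions without a full centered window get None for trend and residual.
    """
    n = len(series)
    half = period // 2
    # Trend: centered moving average over one full period.
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half : i + half + 1]
        trend[i] = sum(window) / len(window)
    # Seasonal: average detrended value at each position in the cycle.
    buckets = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            buckets[i % period].append(series[i] - trend[i])
    seasonal_means = [sum(b) / len(b) for b in buckets]
    offset = sum(seasonal_means) / period  # center the seasonal part on zero
    seasonal = [seasonal_means[i % period] - offset for i in range(n)]
    # Residual: whatever trend and seasonality do not explain.
    residual = [
        None if trend[i] is None else series[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, residual
```

Large residuals are then natural candidates for closer inspection, since they are the part of the signal the model components leave unexplained.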


Outlier detection
• Given: data points
• Do: identify outliers


[Figure: Windowed model. Panels: Raw; Smoothed; Smoothed vs +3σ]
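The figure labels suggest smoothing the raw series and comparing it against a +3σ threshold from a windowed model. The sketch below is one plausible reading of that idea, not a reconstruction of the talk's implementation: the exponential smoothing, the trailing-window statistics, and the names `smooth` and `flag_outliers` are assumptions.

```python
import statistics


def smooth(series, alpha=0.5):
    """Exponentially weighted moving average of the raw series."""
    out, level = [], series[0]
    for x in series:
        level = alpha * x + (1 - alpha) * level
        out.append(level)
    return out


def flag_outliers(series, window=10, n_sigma=3.0):
    """Return indices whose value exceeds mean + n_sigma * std
    of the `window` points immediately before them."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window : i]
        mean = statistics.mean(history)
        std = statistics.pstdev(history)
        if series[i] > mean + n_sigma * std:
            flagged.append(i)
    return flagged


raw = [10, 11, 10, 9, 10, 11, 10, 9, 10, 11, 10, 9, 50, 10, 11]
spikes = flag_outliers(smooth(raw), window=8)
```

Because the threshold comes from the preceding window only, a spike also inflates the threshold for the points that follow it, which keeps one event from triggering a run of alerts.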


True stories

• Financial services: bad data points
• Security: misbehavior
• Operations: alerting


Yet more general lessons

• Time-series data analysis well-studied
• Read the literature(s)!


<  OBLIGATORY  PLUGS  >  


freesumo.com  


BONUS: scale-out, streaming architecture

[Diagram labels: logs (×3)]


BONUS: approximating with Count-Min Sketch


Approximate counting with Count-Min Sketch
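For readers who have not seen the data structure, here is a minimal Count-Min Sketch. It keeps `depth` rows of `width` counters, increments one counter per row on each update, and answers a query with the minimum across rows, so estimates can overcount but never undercount. The default sizes and the salted SHA-256 hashing are assumptions made for this illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size approximate counter for streaming items."""

    def __init__(self, width=1000, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive one hash per row by salting the item with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best bound.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )
```

Memory use is fixed at `width * depth` counters regardless of how many distinct items arrive, which is what makes the structure attractive for counting log signatures in a stream.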
