Page 1: when Data Science takes the fielddidawiki.cli.di.unipi.it/lib/exe/fetch.php/bigdataanalytics/bda/bda... · Sir Alex Ferguson. The Maldini principle: the better a defender the fewer

Soccer Analytics: when Data Science takes the field

Charles Reep (1904 - 2002)

Valeri Lobanovskyi (1939 - 2002)

“The greatest mistake of my life”

Sir Alex Ferguson

The Maldini principle: the better a defender, the fewer the tackles

The numbers game: why everything you know about football is wrong

C. Anderson, D. Sally

Soccer-logs

● events involving the ball occurring during a game

● player, team, position, time, outcome

● semi-automatic collection

Soccer-logs collection system

{'eventName': 8, 'eventSec': 8.221464, 'id': 217097515, 'matchId': 2576132, 'matchPeriod': '1H', 'playerId': 8306, 'positions': [{'x': 42, 'y': 14}, {'x': 74, 'y': 33}], 'subEventName': 83, 'tags': [{'id': 1801}], 'teamId': 3158}

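Annotations on the event: eventName 8 = pass; subEventName 83 = high pass; tag 1801 = accurate; id, matchId, playerId and teamId are identifiers.

1700 events per match (on average)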
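The event record above can be decoded with a few lines of Python. This is only a sketch: the three lookup tables are hypothetical and contain just the codes annotated on the slide, `describe_event` is an illustrative name, and the coordinates are left in the log's own units because the slide does not state them.

```python
# Hypothetical lookup tables: only the three codes annotated on the slide.
EVENT_NAMES = {8: "pass"}
SUB_EVENT_NAMES = {83: "high pass"}
TAG_NAMES = {1801: "accurate"}

# The event record exactly as shown on the slide.
event = {
    'eventName': 8, 'eventSec': 8.221464, 'id': 217097515,
    'matchId': 2576132, 'matchPeriod': '1H', 'playerId': 8306,
    'positions': [{'x': 42, 'y': 14}, {'x': 74, 'y': 33}],
    'subEventName': 83, 'tags': [{'id': 1801}], 'teamId': 3158,
}


def describe_event(ev):
    """Turn a raw soccer-log event into a readable summary."""
    start, end = ev["positions"][0], ev["positions"][-1]
    return {
        "event": EVENT_NAMES.get(ev["eventName"], "unknown"),
        "sub_event": SUB_EVENT_NAMES.get(ev["subEventName"], "unknown"),
        "tags": [TAG_NAMES.get(t["id"], "unknown") for t in ev["tags"]],
        "period": ev["matchPeriod"],
        "second": ev["eventSec"],
        "player": ev["playerId"],
        "team": ev["teamId"],
        # Displacement of the ball between the first and last position.
        "dx": end["x"] - start["x"],
        "dy": end["y"] - start["y"],
    }


summary = describe_event(event)
print(summary["event"], summary["sub_event"], summary["tags"])
```

Any code missing from the tables is reported as "unknown" rather than guessed.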

Video tracking

● video cameras are installed in stadiums

● each player is identified

● the trajectory of each player is inferred

GPS devices track training sessions

Prediction is better than cure

using AI to predict injuries of soccer players

188M € in Spain

16% days of absence

Economic costs estimation of soccer injuries in first and second Spanish division professional teams

http://bit.ly/cost_injuries_soccer

[Figure: model inputs. Workload features: High Speed Running, Metabolic Distance, Accelerations, Decelerations, Dynamic Stress Load, Total Distance, Fatigue Index, High Metabolic Load Distance. Player features: age, role, height, weight, injuries]

Training features (GPS)

• Total Distance
• High Speed Running (>19.8 km/h)
• Metabolic Distance (>20 W/kg)
• High Metabolic Load Distance (>25.5 W/kg)
• High Metabolic Load Distance Per Minute
• Explosive Distance (>25 W/kg, <19.8 km/h)
• Accelerations >2 m/s²
• Accelerations >3 m/s²
• Decelerations >2 m/s²
• Decelerations >3 m/s²
• Dynamic Stress Load (>2 g)
• Fatigue Index (Dynamic Stress Load / Speed Intensity)

Players' features

• Age
• Height
• Weight
• Role
• Previous injuries: the number of injuries a player had suffered before each training session
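A few of the GPS features listed above can be derived from a raw speed trace. The sketch below is not the vendor's or the authors' implementation. It assumes a speed series in m/s sampled at a fixed rate, and it counts an "acceleration" as one uninterrupted run of samples above the threshold. The function names, the 10 Hz default and the dictionary keys are all illustrative.

```python
HSR_THRESHOLD_KMH = 19.8        # High Speed Running threshold from the slide
ACC_THRESHOLDS = (2.0, 3.0)     # m/s^2 thresholds from the slide


def count_efforts(flags):
    """Count runs of consecutive True values (one run = one effort)."""
    efforts, previous = 0, False
    for flag in flags:
        if flag and not previous:
            efforts += 1
        previous = flag
    return efforts


def gps_features(speed_ms, hz=10.0):
    """Derive a few workload features from a speed trace in m/s.

    Assumption: samples are evenly spaced at `hz` samples per second.
    """
    dt = 1.0 / hz
    step_dist = [v * dt for v in speed_ms]                  # metres per sample
    accel = [(b - a) / dt for a, b in zip(speed_ms, speed_ms[1:])]
    features = {
        "total_distance_m": sum(step_dist),
        "high_speed_running_m": sum(
            d for v, d in zip(speed_ms, step_dist)
            if v * 3.6 > HSR_THRESHOLD_KMH                  # m/s -> km/h
        ),
    }
    for thr in ACC_THRESHOLDS:
        features[f"accelerations_gt_{thr:g}"] = count_efforts(a > thr for a in accel)
        features[f"decelerations_gt_{thr:g}"] = count_efforts(a < -thr for a in accel)
    return features


# Made-up trace sampled once per second: speed up, cruise, slow down.
print(gps_features([0, 3, 6, 6, 6, 3, 0], hz=1.0))
```

The metabolic and stress-load features need the device's power and accelerometer models, which the slides do not describe, so they are left out.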

a classification problem

injury examples are very rare (just 2% of the examples)

Re-balancing the dataset

ADASYN: a technique to rebalance the dataset

It generates synthetic examples of the minority class
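The idea can be sketched in plain Python. This is a simplified illustration in the spirit of ADASYN, not a reference implementation. It assumes numeric feature vectors, Euclidean distance and at least two minority examples; the function name and parameters are illustrative. Minority examples surrounded by more majority neighbours receive more synthetic copies, and each copy is an interpolation between the example and one of its nearest minority neighbours.

```python
import math
import random


def adasyn(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Oversample the minority class with ADASYN-style interpolation."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(min_idx)
    n_maj = len(y) - n_min
    n_new = (n_maj - n_min) * beta      # total synthetic examples wanted

    def neighbours(i, candidates):
        """Indices of the k candidates closest to example i (excluding i)."""
        ranked = sorted((j for j in candidates if j != i),
                        key=lambda j: math.dist(X[i], X[j]))
        return ranked[:k]

    # r_i: share of majority examples among the k nearest neighbours of x_i.
    ratios = []
    for i in min_idx:
        nn = neighbours(i, range(len(X)))
        ratios.append(sum(y[j] != minority for j in nn) / len(nn))
    total = sum(ratios)
    if total == 0:                      # no majority neighbours: spread evenly
        ratios, total = [1.0] * n_min, float(n_min)

    X_new = []
    for i, r in zip(min_idx, ratios):
        g_i = round(r / total * n_new)  # harder examples get more synthetics
        same_class = neighbours(i, min_idx)
        for _ in range(g_i):
            j = rng.choice(same_class)
            lam = rng.random()          # random point on the segment x_i -> x_j
            X_new.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return list(X) + X_new, list(y) + [minority] * len(X_new)
```

Oversampling of this kind is normally applied to the training split only, so that evaluation still reflects the true injury rate.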

ACWR = acute workload (7 days) / chronic workload (28 days)

State of the art

Pro:
- simple to compute
- high recall

Cons:
- monodimensional
- low precision
- many false alarms
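The ratio is straightforward to compute from a list of daily workloads. The sketch below assumes rolling averages over the last 7 and 28 days, which is one common reading of "acute" and "chronic" workload; the slide does not say which aggregation is used, and the function name is illustrative.

```python
import math


def acwr(daily_load, acute_days=7, chronic_days=28):
    """Acute:chronic workload ratio at the end of `daily_load`.

    Assumption: both workloads are rolling means of one daily workload
    value, most recent day last.
    """
    if len(daily_load) < chronic_days:
        raise ValueError("need at least `chronic_days` days of history")
    acute = sum(daily_load[-acute_days:]) / acute_days
    chronic = sum(daily_load[-chronic_days:]) / chronic_days
    if chronic == 0:
        return math.nan                 # ratio undefined without chronic load
    return acute / chronic


# Made-up loads: three steady weeks, then one week at double the load.
print(acwr([100] * 21 + [200] * 7))
```

Because the ratio summarises a single workload variable, it is monodimensional by construction, which is the limitation listed above.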

Acute Chronic Workload Ratio (ACWR)

[Figure: injury risk as a function of ACWR (axis ticks 0.5, 1, 1.5, 2). Under-trained: increased risk; optimal training zone: low risk; over-trained: increased risk, up to high risk]

high recall (> 90%)

low precision (< 4%)
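How high recall and low precision can coexist when injuries are about 2% of the examples is easiest to see with numbers. The counts below are invented for illustration and are not results from the study.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = injury, 0 = no injury)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented example: 1000 sessions, 20 injuries (2%).
# The alarm fires on 500 sessions and catches 19 of the 20 injuries.
y_true = [1] * 20 + [0] * 980
y_pred = [1] * 19 + [0] * 1 + [1] * 481 + [0] * 499

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.2f}")
```

An alarm that fires this often misses almost nothing, but more than 96% of its warnings are false alarms.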

[Figure: ACWR compared with a null model]

high recall (> 90%)

low precision (< 4%)

Injury Forecaster (IF)

infer the relation between workload variables and injury likelihood

[Figure: workload features (High Speed Running, Accelerations, Dynamic Stress Load, Fatigue Index, Metabolic Distance, Decelerations, Total Distance, High Metabolic Load Distance) and player features (age, role, weight) feed IF, which answers "Will get injured?" with Yes or No]

IF: classify a session as injury or non-injury

[Figure: workload features and player features (age, role, weight) feed IF, which answers "Will get injured?" with Yes or No]

How is IF made?

Decision trees are simple and interpretable models
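A decision tree is interpretable because every node is a threshold test on one feature. The sketch below finds the best single threshold for one feature using Gini impurity, which is one node of such a tree. A full tree applies the same search recursively over all features. The data are made up, the function names are illustrative, and the slides do not say which splitting criterion IF uses.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Best 'value <= threshold' split of one feature.

    Returns (threshold, weighted_gini); threshold is None when no split
    improves on the unsplit impurity.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue                    # cannot split between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2, score)
    return best


# Made-up high-speed-running values and injury labels (1 = injury).
hsr = [50, 80, 100, 120, 150, 200]
injured = [0, 0, 0, 1, 1, 1]
print(best_split(hsr, injured))
```

The learned threshold reads directly as a rule of the form "feature above value implies higher injury risk", which is what makes the model easy to explain to coaching staff.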

[Figure: ACWR compared with IF; labels: +50%, 16]

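How is IF made?

[Figure: three decision rules over High Speed Running, Total Distance and Previous Injuries; thresholds as extracted, in label order]

High Speed Running < 112.3; Previous Injuries < 0.39

High Speed Running > 112.3; Total Distance > 1.78; Previous Injuries < 0.03

High Speed Running > 112.3 and < 374.7; Total Distance > 2.65; Previous Injuries > 1.01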

Specialized IF

IF_Inter

IF_Juventus

Personalized IF

IF_Icardi

IF_Ronaldo

Effective injury forecasting in soccer with GPS training data and machine learning

http://bit.ly/plosone_injury

Phases of the project

1. Motivate your proposal: find material demonstrating the importance of your proposal

2. State of the art: search for existing solutions

3. Define: formalize your problem in terms of a predictive task

4. Extract information: extract meaningful features

5. Implement: realize your solution using the most suitable technique

6. Evaluate: evaluate the quality of your solution

7. Interpret: interpret your model to extract new knowledge