PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone...

29

Transcript of PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone...

Page 1: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives
Page 2: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives
Page 3: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives
Page 4: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Predicting biathlon shooting performanceusing machine learning

Thomas Maier1, Daniel Meister2, Severin Trösch1, Jon Peter Wehrlin1

1Eidgenössische Hochschule für Sport Magglingen EHSM2Datahouse AG

Page 5: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives
Page 6: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Introduction

• Shooting is crucial for end ranking (~50%)(Luchsinger et al. 2017)

• Influence of fatigue and biomechanical parameters(Hoffmann et al. 1992; Sattlecker et al. 2017)

• Shooting mode, athlete level, variation in performance(Luchsinger et al. 2017; Skattebo & Losnegard 2017)

• How predictable are individual shots?

Page 7: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Data

• World Cup, World Championships und Olympic Games (only single athlete categories)

• From HoRa, supplier of target system

• Training data: Test data: 2012/13 – 2015/16 2016/17

Total of 152’640 shots

Page 8: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Data … as PDF

xkcd

Page 9: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives
Page 10: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Tidy data

One row for each shot

Page 11: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Reorganise data with dplyr

Page 12: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Gather data

Page 13: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Feature Engineering (29 Variables)

Page 14: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Rolling functions with zoo

Page 15: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Analysis

Exploratory Data Analysis

• 95% Confidence limits

• Pearson Correlations

• Chi-squared- / Mann-Whitney-U-Tests

Machine Learning

• LogReg: logistic regression using only 1 input-variable

• XGB: extreme gradient boosting with trees

• NNet: artifical neural network

Page 16: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

LogReg XGB NNet

Sequential trees to fit errorsof previous trees

Page 17: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Training Prediction

Training Prediction

Training Prediction

PredictionTraining

Training data Test data

Time

Time sliced cross-validation

Page 18: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Caret – ML model wrapper

Page 19: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives
Page 20: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Final model configurations

Page 21: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Results – Exploratory Analysis

Hit rate varies between: Athletes > disciplines > shooting modes > shot number

Page 22: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Results – ML Models

All models show low predictive power

Complex models show about the same performance as LogReg

Page 23: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Discussion

• Largest differences in hit rates between athletes

• Individual preceding mode-specific hit rate holds almost all predictive information

• Individual shots can be modelled as Bernoulli trial → explains observed variation

• High random influence in competition results (± 1-2 hits / competition)

Page 24: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Selina was really concentratedtoday, so she was able to accessher true potential. She is a professional athlete!

Irene was losing her confidencemidway where she started tothink too much, the pressure was too high on the last two shots.

A Swiss coach

Another Swiss coach

xkcd

Page 25: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

The hot hand [in basketball] is a massive and widespread cognitive illusion.

Daniel Kahneman

Page 26: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives
Page 27: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Final thoughts…

• Not everyone understands probabilities / randomness

• Not everyone is interested in the complexity of your models

• Coaches / customers / executives / the public …

… are interested in stories and specific instructions

Page 28: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives

Thomas MaierSenior Data Scientist

Datahouse AGAlte Börse - Zürich

044 289 92 [email protected]

Page 29: PowerPoint-Präsentation · •Not everyone understands probabilities / randomness •Not everyone is interested in the complexity of your models •Coaches / customers / executives