Page 1

NFL Play Predictions

Will Burton, NCSU Industrial Engineering 2015

Michael Dickey, NCSU Statistics 2015

Page 2

Motivation

If an offensive play call can be predicted, then the defense can adjust their formation to potentially shut down the offense

Method

1) Research classification methods
2) Gather and clean NFL play by play data
3) Exploratory analysis and significance testing
4) Model building
5) Implement models in R Shiny Application

Page 3

Final Variables Used For Modeling

+ Down
+ Field Position
+ Yards to Go
+ Cumulative Number Interceptions
+ Cumulative Number Fumbles
+ Cumulative number of sacks
+ Down * Yards To Go
+ Number Offensive Points
+ Number Defensive Points
+ Point Differential
+ Offensive Timeouts Remaining
+ Defensive Timeouts Remaining
+ Minutes Remaining in Quarter
+ Previous play call
+ Quarter
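Two of the variables above are derived from others: the Down * Yards To Go interaction and the Point Differential. A minimal Python sketch of that derivation follows. It is not the authors' R code; the field names are hypothetical, and the sign convention (offense minus defense) is an assumption.

```python
def build_features(play):
    """Add the two derived predictors to a raw play record.

    Field names are hypothetical; the slides do not give the column names.
    """
    features = dict(play)
    # Interaction term: Down * Yards To Go
    features["down_x_yards_to_go"] = play["down"] * play["yards_to_go"]
    # Assumed sign convention: offense points minus defense points
    features["point_differential"] = play["offense_points"] - play["defense_points"]
    return features


# Example: 3rd and 7, offense trailing 14-17
example_play = {"down": 3, "yards_to_go": 7, "offense_points": 14, "defense_points": 17}
row = build_features(example_play)
print(row["down_x_yards_to_go"], row["point_differential"])  # 21 -3
```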

Data Characteristics

● Source: ArmchairAnalysis.com
● NFL plays from 2000-2014 seasons
● 132,220 observations (trained using only 2011-2014 seasons)

Page 4

Logistic Regression

Final Model Created For:

+ 1st Quarter
+ 2nd Quarter
+ 3rd Quarter
+ 4th Quarter Winning
+ 4th Quarter Losing
+ 4th Quarter Tied
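With six situation-specific models, each play has to be routed to the right one before prediction. A small Python sketch of that routing is below. The key names are hypothetical, and treating "winning" as a positive point differential from the offense's point of view is an assumption; overtime is not covered by the slides, so the sketch rejects it.

```python
def model_key(quarter, point_differential):
    """Pick which of the six models applies to a play.

    point_differential is assumed to be offense points minus defense points.
    """
    if quarter in (1, 2, 3):
        return f"Q{quarter}"
    if quarter == 4:
        if point_differential > 0:
            return "Q4_winning"
        if point_differential < 0:
            return "Q4_losing"
        return "Q4_tied"
    # The slides list no model for overtime or other values
    raise ValueError(f"no model defined for quarter {quarter!r}")


print(model_key(2, -3))   # Q2
print(model_key(4, -3))   # Q4_losing
```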

Additional Steps to Final Model:

1) Adjust for non-linearity
2) Perform forward selection
3) Reduce overfitting with Lasso Penalty
4) Cross-Validation
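To illustrate what the Lasso penalty does to a logistic regression, here is a from-scratch Python sketch using proximal gradient descent. It is an illustration only, not the authors' R implementation: the toy data, the step size, and the two penalty values are all hypothetical, and choosing the penalty by cross-validation is not shown.

```python
import math


def soft_threshold(value, amount):
    """Shrink value toward zero by amount; exactly 0.0 inside the band."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_lasso_logistic(rows, labels, penalty, step=0.5, iterations=2000):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept is left unpenalized. Returns (intercept, weights).
    """
    n = len(rows)
    weights = [0.0] * len(rows[0])
    intercept = 0.0
    for _ in range(iterations):
        grad_b = 0.0
        grad_w = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            z = intercept + sum(w * v for w, v in zip(weights, x))
            error = sigmoid(z) - y
            grad_b += error / n
            for j, v in enumerate(x):
                grad_w[j] += error * v / n
        intercept -= step * grad_b
        # Gradient step followed by soft-thresholding (the L1 proximal step)
        weights = [soft_threshold(w - step * g, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return intercept, weights


# Hypothetical toy data: column 0 is strongly related to the label,
# column 1 only weakly. Each cell is (x0, x1, number of 1-labels out of 4).
cells = [(1, 1, 4), (1, -1, 3), (-1, 1, 1), (-1, -1, 0)]
rows, labels = [], []
for x0, x1, ones in cells:
    for i in range(4):
        rows.append([x0, x1])
        labels.append(1 if i < ones else 0)

_, strong = fit_lasso_logistic(rows, labels, penalty=0.15)
_, weak = fit_lasso_logistic(rows, labels, penalty=0.01)
print("penalty 0.15:", strong)
print("penalty 0.01:", weak)
```

With the larger penalty the weak predictor's coefficient is driven to exactly zero while the strong one survives, which is how the penalty reduces overfitting; with the smaller penalty both stay in the model.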

Page 5

Non-Linearity of Odds

Page 6

Lasso and Cross-Validation

Lasso C-V for 1st Quarter model

Page 7

Results

Tested on 20 randomly selected games between 2011-2014:

Mean Accuracy Rate: 75.4% Correct
Highest Accuracy Rate: 91.4% Correct

79.3% Correct

            Actual Pass   Actual Run
Pred Pass   59            13
Pred Run    12            37

91.4% Correct

            Actual Pass   Actual Run
Pred Pass   63            8
Pred Run    2             46
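The per-game accuracy rate is the share of plays on the diagonal of the confusion table. As a quick Python check (the dictionary layout is just one possible representation), the table with 59, 13, 12 and 37 plays gives 96 correct out of 121:

```python
def accuracy(matrix):
    """Share of plays where the predicted call matches the actual call."""
    correct = matrix["pred_pass"]["actual_pass"] + matrix["pred_run"]["actual_run"]
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


game = {
    "pred_pass": {"actual_pass": 59, "actual_run": 13},
    "pred_run": {"actual_pass": 12, "actual_run": 37},
}
print(f"{accuracy(game):.1%}")  # 79.3%
```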

Page 8

Random Forest

Tuning Parameters:

1) Number of candidate splitting variables
2) Maximum number of nodes
3) Cutoff threshold

Variable Selection

● Assess importance of variables with Gini Index
● Informal process of trial and error

Variable importance expressed as the mean decrease in Gini index
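The building block behind this importance measure is the drop in Gini impurity produced by a single split; a forest's mean decrease in Gini aggregates such drops over every split on a variable and over all trees. The Python sketch below computes the drop for one split on hypothetical play labels. It is an illustration of the quantity, not the randomForest package's code.

```python
from collections import Counter


def gini_impurity(labels):
    """1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Hypothetical split of 10 plays into two child nodes
left = ["pass"] * 4 + ["run"]
right = ["pass"] + ["run"] * 4
decrease = gini_decrease(left + right, left, right)
print(round(decrease, 2))  # 0.18
```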

Pages 9-11

Tuning Parameters

Candidate Splitting Variables:
- Compare OOB error rates
- tuneRF: {randomForest}

Maximum Nodes:
- Trees grown to maximum depth
- Determined via experimentation
- Trees are overgrown to promote variability!

Cutoff Threshold:
- Use Youden index to maximize (Specificity + Sensitivity)
- coords: {pROC}
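The Youden index is J = Sensitivity + Specificity - 1, so maximizing it is the same as maximizing (Specificity + Sensitivity). The Python sketch below scans candidate cutoffs by hand to show the idea; it is a stand-in for the pROC call named on the slide, not a port of it. The probabilities and labels are hypothetical, with 1 meaning pass, and predicting pass when the probability is at or above the cutoff is an assumed convention.

```python
def youden_cutoff(probabilities, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A play is predicted as the positive class when its probability
    is greater than or equal to the cutoff (assumed convention).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(probabilities)):
        true_pos = sum(1 for p, y in zip(probabilities, labels) if p >= cutoff and y == 1)
        true_neg = sum(1 for p, y in zip(probabilities, labels) if p < cutoff and y == 0)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical predicted pass probabilities and actual calls (1 = pass)
probs = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
calls = [0, 0, 0, 1, 1, 1, 0, 1]
cutoff, j = youden_cutoff(probs, calls)
print(cutoff, j)  # 0.4 0.75
```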

Page 12

Results

Tested on 20 randomly selected games between 2011-2014:

Mean Accuracy Rate: 74.5% Correct
Highest Accuracy Rate: 89.1% Correct

89.1% Correct

            Actual Pass   Actual Run
Pred Pass   59            7
Pred Run    6             47

78.5% Correct

            Actual Pass   Actual Run
Pred Pass   60            15
Pred Run    11            35

Page 13

Visualizing Predictions

• R Shiny visualization demo

• Plotting the predictions within one new game (try for Super Bowl 2015)

Page 14

Results and Conclusions

• You can predict NFL plays with high accuracy using simple variables!

• The two model-building processes differ substantially but achieve similar accuracy rates

Further Research

• Compare with more complex machine learning techniques (e.g., neural networks)

• Develop a loss function to account for the different consequences of each type of misclassification
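One simple form such a loss function could take is a weighted count of the two error types. The Python sketch below is only an illustration of the idea: the cost weights are hypothetical placeholders, not values proposed in the slides, and the confusion table is one of the random forest results shown earlier.

```python
def misclassification_cost(matrix, cost_false_pass, cost_false_run):
    """Weighted count of errors; the weights are hypothetical placeholders."""
    false_pass = matrix["pred_pass"]["actual_run"]   # predicted pass, offense ran
    false_run = matrix["pred_run"]["actual_pass"]    # predicted run, offense passed
    return cost_false_pass * false_pass + cost_false_run * false_run


game = {
    "pred_pass": {"actual_pass": 60, "actual_run": 15},
    "pred_run": {"actual_pass": 11, "actual_run": 35},
}
# Equal costs reduce to a plain error count; unequal costs reweight it.
print(misclassification_cost(game, 1.0, 1.0))  # 26.0
print(misclassification_cost(game, 1.0, 2.0))  # 37.0
```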