An NFL Prediction Model: Sports and math, what’s not to love?

AN NFL PREDICTION MODEL SPORTS AND MATH, WHAT’S NOT TO LOVE? By Kelly Galarneau


Transcript of An NFL Prediction Model Sports and math, what’s not to love?

Page 1: An NFL Prediction Model Sports and math, what’s not to love?

AN NFL PREDICTION MODEL: SPORTS AND MATH, WHAT’S NOT TO LOVE?

By Kelly Galarneau

Page 2: An NFL Prediction Model Sports and math, what’s not to love?

Don’t Attack Me Until the End

My probability model is not perfect, and never will be.

We will discuss the assumptions and limitations of this model.

History and Concept

NCAA tournament

Rating Systems

Page 3: An NFL Prediction Model Sports and math, what’s not to love?

Jeff Sagarin’s Ratings

Published in the USA Today since 1985.

NCAA men’s basketball tournament selections

Bowl Championship Series (BCS) selections

Secret Formula

Each team’s rating

Home field advantage (constant for all teams)

Page 4: An NFL Prediction Model Sports and math, what’s not to love?

Any Given Sunday

We know that a team’s performance is not always constant. On some days they play better, others worse.

Many things naturally vary according to the normal distribution, so perhaps we can assume NFL ratings do the same.

In an e-mail, Sagarin suggested that I use a standard deviation of 15 or 16 (perhaps 15.5). I chose 16.

Page 5: An NFL Prediction Model Sports and math, what’s not to love?

Normal Distribution

Here is a graph for the theoretical performance of the Tennessee Titans, with a mean of 32.7 and standard deviation of 16.

Page 6: An NFL Prediction Model Sports and math, what’s not to love?

Normal Curves

Now compare two teams: the Titans, with an average of 32.7, and the Lions, with an average of 7.82. We’ll pretend the Lions are playing at home and add in Sagarin’s prescribed 3.02 for home field advantage, which brings the Lions to 10.84.

Page 7: An NFL Prediction Model Sports and math, what’s not to love?

Monte Carlo Simulation

Used for simulating probability situations.

Mostly used for business and finance applications.

Allows us to vary a parameter according to whatever distribution we choose (uniform, normal, Poisson, exponential, etc.).

As the parameter is changing, we can observe the effect on other variables.

I am using free software from www.yasai.rutgers.edu (Rutgers University).

Page 8: An NFL Prediction Model Sports and math, what’s not to love?

How each trial works

We let the computer pick a random “performance rating” from the normal distribution for each team.

We see which one is greater and tally that as a win for that team.

Then repeat 1000 times.

This graph shows the results of one trial, with a win going to the Titans.
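The trial procedure described on this slide can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet used in the talk: the function name `simulate_matchup` and its parameters are hypothetical, and the standard library's `random.gauss` stands in for the normal draw.

```python
import random


def simulate_matchup(rating_a, rating_b, sd=16.0, trials=1000, seed=None):
    """Estimate P(team A beats team B) by repeated random trials.

    rating_a and rating_b are the mean ratings; any home field
    adjustment is assumed to be added to a rating before calling.
    """
    rng = random.Random(seed)  # seeded generator so runs can be repeated
    wins_a = 0
    for _ in range(trials):
        # One random "performance rating" per team for this trial.
        perf_a = rng.gauss(rating_a, sd)
        perf_b = rng.gauss(rating_b, sd)
        # Tally a win for whichever team drew the greater performance.
        if perf_a > perf_b:
            wins_a += 1
    return wins_a / trials


# Titans (35.72) against Lions (7.82), 1000 trials as in the talk.
p_titans = simulate_matchup(35.72, 7.82, trials=1000, seed=2008)
print(f"Titans {p_titans:.1%}, Lions {1 - p_titans:.1%}")
```

With only 1000 trials the estimate typically moves by about a percentage point from run to run, so different seeds will give slightly different splits.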

Page 9: An NFL Prediction Model Sports and math, what’s not to love?

The graph on the left represents a game in which the Lions perform better than average, the Titans perform worse than average, but the Titans still get the win.

The graph on the right represents a game in which the Lions pull off the upset.

Then we repeat…

…1000 times.

Page 10: An NFL Prediction Model Sports and math, what’s not to love?

Here is a graph of 1000 trials

[Histogram of the simulated performance ratings for the Lions and Titans; x-axis: Bin (-40 to 80 in steps of 5), y-axis: Frequency (0 to 140)]

Page 11: An NFL Prediction Model Sports and math, what’s not to love?

Here is my actual Excel programming:

     A   B       C       D     E                  F               G
1        Team    Rating  S.D.  Normal             If/then         Output
2        Titans  35.72   16    =gennormal(C2,D2)  =if(E2>E3,1,0)  =simoutput(F2)
3        Lions   7.82    16    =gennormal(C3,D3)  =if(E3>E2,1,0)  =simoutput(F3)

Here is one trial:

     A   B       C       D     E         F        G
1        Team    Rating  S.D.  Normal    If/then  Output
2        Titans  35.72   16    25.98819  1        1
3        Lions   7.82    16    -3.61047  0        0

This matchup gives us (1000 trials): Titans 88.8%, Lions 11.2%

Page 12: An NFL Prediction Model Sports and math, what’s not to love?

Do we need a simulation?

I tried to approach the probability calculation from a theoretical perspective.

My thought was to assign each team a random variable, X and Y, and vary them according to a normal distribution. So: X ~ N(μ₁, σ) and Y ~ N(μ₂, σ).

A statistics textbook led me to the idea of multiplying their density functions to get a 3-D probability distribution [X, Y, and P(X,Y)].

Page 13: An NFL Prediction Model Sports and math, what’s not to love?

A 3-D Probability Distribution

Here, X ~ N(35.72,16) and Y ~ N(10.84,16)

Where is X > Y?

[3-D surface plot of the joint distribution over the X and Y axes]

Titans 88.8%

Lions 11.2%

Page 14: An NFL Prediction Model Sports and math, what’s not to love?

Integrals

To find the volume under this surface over the region where X > Y, we need to evaluate a double integral.

Mathematica will not evaluate the exact integral, but it will give us a decimal approximation.
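Written out, the double integral in question might look like the following. This is a reconstruction under the assumptions already stated in the talk (independent performances, X ~ N(μ₁, σ) and Y ~ N(μ₂, σ) with a common σ); the symbol f for the joint density is introduced here for illustration.

```latex
f(x, y) = \frac{1}{2\pi\sigma^{2}}
          \exp\!\left[-\frac{1}{2}\left(\frac{x-\mu_1}{\sigma}\right)^{2}\right]
          \exp\!\left[-\frac{1}{2}\left(\frac{y-\mu_2}{\sigma}\right)^{2}\right]

P(X > Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{x} f(x, y)\, dy\, dx

P(Y > X) = \int_{-\infty}^{\infty} \int_{x}^{\infty} f(x, y)\, dy\, dx = 1 - P(X > Y)
```

In a numerical evaluation the infinite limits can be replaced by bounds many standard deviations from both means, since the density is negligible beyond them.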

Page 15: An NFL Prediction Model Sports and math, what’s not to love?

Comparing the results

Mathematica input:

NIntegrate[NIntegrate[(1/((16^2)*2*Pi))*(E^((-1/2)*((x-35.72)/16)^2))*(E^((-1/2)*((y-7.82)/16)^2)),{y,x,300}],{x,-300,300}]

Mathematica gives us: Titans 89.1% Lions 10.9%

The Monte Carlo with 1000 trials gave us: Titans 88.8% Lions 11.2%
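As a cross-check on both numbers, there is also a closed form that the talk does not use: if X and Y are independent normals with the same σ, then X − Y is normal with mean μ₁ − μ₂ and standard deviation σ√2, so P(X > Y) = Φ((μ₁ − μ₂)/(σ√2)). A minimal sketch follows, with the hypothetical function name `win_probability`.

```python
from math import erf, sqrt


def win_probability(mu_x, mu_y, sd=16.0):
    """P(X > Y) for independent X ~ N(mu_x, sd) and Y ~ N(mu_y, sd)."""
    # X - Y is normal with mean mu_x - mu_y and standard deviation sd * sqrt(2).
    z = (mu_x - mu_y) / (sd * sqrt(2))
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1 + erf(z / sqrt(2)))


p_titans = win_probability(35.72, 7.82)
print(f"Titans {p_titans:.1%}, Lions {1 - p_titans:.1%}")
```

With the ratings from the Mathematica input (35.72 and 7.82), this should agree with the numerical integral to the precision shown, and it sits within ordinary sampling noise of the 1000-trial simulation.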

Page 16: An NFL Prediction Model Sports and math, what’s not to love?

The Bracket

6 teams from each conference (AFC and NFC) make the playoffs.

The division winners are seeded 1-4 by record; the wild card teams are seeded 5 and 6 by record.

In each game the higher seed gets home field advantage.

In the “Wild Card” round of the playoffs:

The 3 seed plays the 6 seed.

The 4 seed plays the 5 seed.

The 1 and 2 seeds get a first round bye.

In the “Divisional” round the 1 seed gets to play the lower seed of the two advancing teams.

Page 17: An NFL Prediction Model Sports and math, what’s not to love?

So that means…

If the 3 seed wins: the 1 seed plays the 4 or 5 seed, and the 2 seed plays the 3 seed.

If the 6 seed wins: the 1 seed plays the 6 seed, and the 2 seed plays the 4 or 5 seed.

Page 18: An NFL Prediction Model Sports and math, what’s not to love?

Possibilities…

In a playoff bracket with 12 teams, there are 11 games to be played, with a total of 2^11=2048 possibilities.

In my Excel spreadsheet I have made use of the complement rule and referencing so that I only need to calculate 66 matchups.
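One way to picture how pairwise matchup probabilities combine into round-by-round odds is sketched below for a single conference. It is not the spreadsheet from the talk: it swaps the simulation for the closed-form normal probability, the function names are hypothetical, and the ratings and home field value are taken from the 2008 NFC table later in the slides.

```python
from itertools import product
from math import erf, sqrt

SD = 16.0         # per-team standard deviation used in the talk
HOME_EDGE = 2.58  # home field advantage listed with the 2008 table


def host_win_prob(r_host, r_visitor, sd=SD, home_edge=HOME_EDGE):
    """P(host outperforms visitor) for independent normal performances."""
    z = (r_host + home_edge - r_visitor) / (sd * sqrt(2))
    return 0.5 * (1 + erf(z / sqrt(2)))


def play(seed_a, seed_b, ratings):
    """Return [(winner, probability), ...]; the lower seed number hosts."""
    host, visitor = sorted((seed_a, seed_b))
    p = host_win_prob(ratings[host], ratings[visitor])
    return [(host, p), (visitor, 1 - p)]  # complement rule for the visitor


def conference_odds(ratings):
    """Probability that each seed reaches each round within one conference."""
    odds = {s: {"div": 0.0, "champ": 0.0, "superbowl": 0.0} for s in ratings}
    odds[1]["div"] = odds[2]["div"] = 1.0  # first round bye
    for (a, pa), (b, pb) in product(play(3, 6, ratings), play(4, 5, ratings)):
        p_wc = pa * pb
        odds[a]["div"] += p_wc
        odds[b]["div"] += p_wc
        better, worse = sorted((a, b))  # the 1 seed draws the worse seed
        for (c, pc), (d, pd) in product(play(1, worse, ratings),
                                        play(2, better, ratings)):
            p_div = p_wc * pc * pd
            odds[c]["champ"] += p_div
            odds[d]["champ"] += p_div
            for winner, pw in play(c, d, ratings):
                odds[winner]["superbowl"] += p_div * pw
    return odds


nfc_2008 = {1: 28.14, 2: 27.05, 3: 25.66, 4: 20.66, 5: 23.66, 6: 25.31}
for seed, row in conference_odds(nfc_2008).items():
    print(seed, {k: f"{v:.1%}" for k, v in row.items()})
```

The columns sum to 4, 2, and 1 across the six seeds, as in the tables from the talk. Individual entries should be close to the published ones but not identical, since those came from 1000-trial simulations.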

Page 19: An NFL Prediction Model Sports and math, what’s not to love?

Assumptions and Limitations

We are using offensive and defensive statistics from the regular season, so we are assuming that teams continue similar play into the post-season.

We are assuming that overall performance is normally distributed.

We are assuming that home field advantage is the same constant for all teams.

We are not taking anything into account for weather, injuries, matchups that might be significant, or any other variables.

Page 20: An NFL Prediction Model Sports and math, what’s not to love?

2007 trial run (AFC)

Seed  Team      Rating  Div. Game  Champ.  Superbowl  Superbowl Win
1     pats      36.42   100.0%     73.9%   48.3%      32.4%
2     colts     30.21   100.0%     58.0%   25.1%      13.7%
3     chargers  29.98   65.7%      28.6%   11.9%      6.5%
4     steelers  23.55   45.5%      12.8%   4.3%       1.9%
5     jags      27.9    54.5%      19.2%   8.0%       4.0%
6     titans    21.79   34.3%      7.4%    2.5%       1.0%
sum                     4          2       1          59.6%

Page 21: An NFL Prediction Model Sports and math, what’s not to love?

2007 trial run (NFC)

Seed  Team      Rating  Div. Game  Champ.  Superbowl  Superbowl Win
1     cowboys   27.95   100.0%     62.6%   34.9%      14.5%
2     packers   29.41   100.0%     65.9%   35.6%      15.6%
3     seahawks  21.99   57.0%      18.4%   7.3%       2.3%
4     bucs      19.83   39.9%      12.0%   4.2%       1.2%
5     giants    28.31   60.1%      26.4%   12.7%      5.1%
6     redskins  22.21   43.0%      14.7%   5.2%       1.7%
sum                     4          2       1          40.4%

Page 22: An NFL Prediction Model Sports and math, what’s not to love?

2008 Playoff Prediction

My OFFICIAL prediction won’t be available until after week 17 (end of regular season is Sunday, Dec. 28)

Over Christmas break you can access it at: kgalarneau.wikispaces.com/NFL+Prediction

Feel free to e-mail me with your comments or suggestions: [email protected]

Page 23: An NFL Prediction Model Sports and math, what’s not to love?

2008 Prediction (As of Dec. 15)

     Seed  Team       Rating  Div. Game  Champ.  Superbowl  Superbowl Win
NFC  1     Giants     28.14   100.0%     63.6%   37.7%      19.0%
     2     Panthers   27.05   100.0%     57.5%   28.1%      13.5%
     3     Vikings    25.66   55.5%      25.4%   11.8%      5.3%
     4     Cardinals  20.66   48.3%      16.3%   5.8%       2.2%
     5     Cowboys    23.66   51.7%      20.0%   8.4%       3.6%
     6     Bucs       25.31   44.5%      17.2%   8.2%       3.7%
     sum                      4          2       1          47.4%

AFC  1     Titans     29.95   100.0%     64.6%   39.6%      22.0%
     2     Steelers   29.1    100.0%     65.1%   31.8%      16.9%
     3     Jets       20.4    43.6%      14.7%   5.1%       2.0%
     4     Broncos    17.61   39.1%      9.7%    2.8%       1.0%
     5     Colts      26.39   60.9%      24.3%   10.8%      5.4%
     6     Ravens     27.21   56.4%      21.5%   9.9%       5.3%
     sum                      4          2       1          52.6%

standard dev. = 16

home field advantage = 2.58

Page 24: An NFL Prediction Model Sports and math, what’s not to love?

Questions? Comments?

Page 25: An NFL Prediction Model Sports and math, what’s not to love?

But Wait…

…There’s More!

Page 26: An NFL Prediction Model Sports and math, what’s not to love?

History of Seeds

The NFL has been using the 12 team playoff system since 1990.

Seed  Division  Champ.  Superbowl  Superbowl Win
1               28      18         8
2               27      10         5
3     25        7       2          1
4     24        5       4          2
5     12        4       1          1
6     11        1       1          1
sum   72        72      36         18

Page 27: An NFL Prediction Model Sports and math, what’s not to love?

By Percentage:

Seed  Divisional  Champ  Superbowl  Superbowl Win
1                 38.9%  50.0%      44.4%
2                 37.5%  27.8%      27.8%
3     69.4%       9.7%   5.6%       5.6%
4     66.7%       6.9%   11.1%      11.1%
5     33.3%       5.6%   2.8%       5.6%
6     30.6%       1.4%   2.8%       5.6%
sum   2           1      1          1
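The percentages can be recovered from the counts on the previous slide. The sketch below is illustrative only: the denominators are an inference from the column sums rather than something the slides state (36 wild card games per seed, 72 championship game slots, 36 Superbowl slots, and 18 Superbowls over the 18 seasons), and the name `to_percentages` is hypothetical.

```python
# Counts by seed since 1990, read from the "History of Seeds" slide:
# (Division, Champ., Superbowl, Superbowl Win). None marks a first round bye.
counts = {
    1: (None, 28, 18, 8),
    2: (None, 27, 10, 5),
    3: (25, 7, 2, 1),
    4: (24, 5, 4, 2),
    5: (12, 4, 1, 1),
    6: (11, 1, 1, 1),
}

# Assumed denominators, inferred from the column sums (not stated on the slides).
denominators = (36, 72, 36, 18)


def to_percentages(row):
    """Convert one seed's counts to percentages, rounded to one decimal."""
    return tuple(
        None if count is None else round(100 * count / total, 1)
        for count, total in zip(row, denominators)
    )


percentages = {seed: to_percentages(row) for seed, row in counts.items()}
for seed, row in percentages.items():
    print(seed, row)
```

Because the Divisional column is measured against each seed's own 36 wild card games, it sums to 2 rather than 1, which matches the sum row on the slide.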

Page 28: An NFL Prediction Model Sports and math, what’s not to love?

Superbowl Winners by Seed since 1990

[Chart of the Superbowl winner's seed by year; x-axis: Year (1990 to 2006), y-axis: Seed (1 to 6)]

First 10 years: 7 #1 seeds; 9/10 are 1 or 2 seeds

Last 8 years: only 1 #1 seed; 4/8 are 1 or 2 seeds

Page 29: An NFL Prediction Model Sports and math, what’s not to love?

The End.

Special thanks to:

Harry Geiser

Steve Havlichek