NBA Salary Prediction
Edbert Puspito
Link to code
Imagine
● You are the Lakers' GM.
● The team is at its worst right now: 16-65, last place in the Western Conference.
● Kobe will retire, and a bunch of players will have their contracts expire.
● You definitely need to rebuild, or L.A.'s top line from ticket sales and merchandise will drop hard.
Imagine
● The salary cap for the 16-17 season is projected to be 89 million.
● The remaining contracts total 26 million.
● That means you have 63 million free before you hit the salary cap.
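The cap-space arithmetic above can be checked in a couple of lines. The figures (in millions of dollars) come from the slide; the function and constant names are only illustrative.

```python
def cap_space(salary_cap, committed):
    """Room left under the salary cap, in the same units as the inputs."""
    return salary_cap - committed


# Figures from the slide, in millions of dollars.
PROJECTED_CAP_2016_17 = 89
COMMITTED_CONTRACTS = 26

free_space = cap_space(PROJECTED_CAP_2016_17, COMMITTED_CONTRACTS)
print(f"Free cap space: {free_space}M")
```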
Your challenge
● Who to re-sign?
● Who to target in free agency?
● How much should you pay them?
Data science to the rescue
● We created a model to predict a player's salary based ONLY on their on-court performance.
● Find out who is overpaid and who is underpaid relative to the salary their performance predicts.
● Find out which team's GM is the best.
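One way to read "overpaid" and "underpaid" here is as the gap between the actual salary and the salary the model predicts. The sketch below uses made-up players, teams and numbers. Ranking GMs by the average gap of their roster is an assumption for illustration, not necessarily the method used in this project.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (player, team, actual salary, predicted salary), in $M.
rows = [
    ("Player A", "Team X", 20.0, 14.0),
    ("Player B", "Team X", 2.0, 6.5),
    ("Player C", "Team Y", 8.0, 7.0),
    ("Player D", "Team Y", 1.5, 5.0),
]


def pay_gap(actual, predicted):
    """Positive: paid more than performance predicts (overpaid)."""
    return actual - predicted


# Per-player gap between actual and predicted salary.
gaps = {player: pay_gap(actual, pred) for player, _, actual, pred in rows}

# Average gap per team; a lower value means more performance per dollar.
by_team = defaultdict(list)
for _, team, actual, pred in rows:
    by_team[team].append(pay_gap(actual, pred))
team_gap = {team: mean(values) for team, values in by_team.items()}

# Assumption: the "best" GM is the one with the lowest average gap.
best_team = min(team_gap, key=team_gap.get)

for player, gap in sorted(gaps.items(), key=lambda item: item[1]):
    label = "overpaid" if gap > 0 else "underpaid"
    print(f"{player}: {gap:+.1f}M ({label})")
print("Best value roster:", best_team)
```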
Hypothesis
● Many factors can affect a player's salary:
1. Performance.
2. Ability to attract fans.
3. Market demand.
4. Luck (hype).
5. etc.
● We assume performance explains the majority of a player's salary.
● Factors 2, 3 and even 4 are also tied to factor 1.
Salary trivia (or not so)
● Highest salary ever: 33M, MJ, 97-98 season.
● The closest to the basketball god: 30 million, KB24, 12-13 and 13-14 seasons.
● In the 15-16 season, at least 10 players have a salary > 20M.
● Average: ~5M
● Median: 2.5M
● Many players are “underpaid”, others “overpaid”.
Dataset
● 4 seasons, from 2012 to 2016.
● Statistics from NBA.com, including player bio, basic and advanced stats.
● Salaries were taken from ESPN.com and adjusted for inflation.
● A total of 1,600 data points and ~50 features.
Some graphs
Straight ball or Curve ball?
Base - linear : 0.596
Ridge - poly : 0.604
ElasticNet - poly : 0.581
Random Forest : 0.654
Extra trees : 0.667
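A comparison like the one above could be set up roughly as follows. This is a sketch, not the project's code: the data is synthetic, the scores are assumed to be cross-validated R², and the degree-2 polynomial expansion and every hyperparameter are guesses.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNet, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for the player stats (X) and salaries (y).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (3 * X[:, 0] + X[:, 1] ** 2 + X[:, 2] * X[:, 3]
     + rng.normal(scale=0.5, size=300))


def poly(model):
    """Wrap a linear model with scaling and degree-2 polynomial features."""
    return make_pipeline(StandardScaler(), PolynomialFeatures(degree=2), model)


models = {
    "Base - linear": LinearRegression(),
    "Ridge - poly": poly(Ridge(alpha=1.0)),
    "ElasticNet - poly": poly(ElasticNet(alpha=0.1, max_iter=10_000)),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "Extra trees": ExtraTreesRegressor(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated R^2 for each candidate model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name:18s}: {score:.3f}")
```

The numbers printed for this toy data say nothing about the real dataset. The point is only the shape of the comparison: "straight" (linear) models against "curved" (polynomial and tree-based) ones under the same scoring.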
Economic data
We added each player's official Twitter follower count and the team's ticket sales.
The score goes up to 0.608 (0.699 with random forest regressors).
This indicates that popularity does affect players' salaries, but we focus on performance because only a small amount of popularity data could be crawled.
So, what model do we use?
● The forest of forests
● 100 extra-trees models.
● Each extra-trees model has a min leaf of 3, a depth of 12 and 50 estimators, and is trained on a different sample of the training data (70% sampled randomly).
● Score can range from 0.64 to 0.68.
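The "forest of forests" described above could be sketched with scikit-learn as below. Several details are assumptions: "min leaf" is read as `min_samples_leaf`, "depth" as `max_depth`, the 70% samples are drawn without replacement, and the final prediction is the plain average over all models. The function names and the synthetic data are illustrative only.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor


def fit_forest_of_forests(X, y, n_models=100, sample_frac=0.7, seed=0):
    """Fit n_models extra-trees regressors, each on a random 70% subsample."""
    rng = np.random.default_rng(seed)
    n_sample = int(sample_frac * len(X))
    models = []
    for i in range(n_models):
        # Assumption: rows are sampled without replacement.
        idx = rng.choice(len(X), size=n_sample, replace=False)
        model = ExtraTreesRegressor(
            n_estimators=50,      # "50 estimators"
            max_depth=12,         # "depth of 12"
            min_samples_leaf=3,   # "min leaf of 3"
            random_state=seed + i,
        )
        models.append(model.fit(X[idx], y[idx]))
    return models


def predict_forest_of_forests(models, X):
    """Assumption: the ensemble prediction is the mean over all models."""
    return np.mean([m.predict(X) for m in models], axis=0)


# Tiny synthetic demo; n_models is reduced from 100 to keep it fast.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = 2 * X[:, 0] + X[:, 1] + rng.normal(scale=0.3, size=200)

models = fit_forest_of_forests(X, y, n_models=5)
preds = predict_forest_of_forests(models, X[:10])
print(preds.round(2))
```

Because every model sees a different 70% of the rows, individual scores vary from run to run, which fits a reported range rather than a single number.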
Findings:
As the “boss” of the Lakers, Kobe indeed has all the means
to make his statistics look beautiful.
And he is really, really famous.
Findings:
Overrated? Maybe, as the model only
considers performance.
What is surprising is that, despite all the ticket
sales he “raised”, he is overpaid by just 1.5M,
suggesting he may be underpaid
performance-wise.
Findings:
FYI, this guy is considered underrated in the 15-16
season, as he is paid only 2M by the Hornets.
Findings:
● Had a breakthrough performance by successfully
defending LeBron @ 13-14 NBA Finals; got
Finals MVP + championship ring.
● Saw more playing time and got Defensive Player
of the Year @ 14-15.
● Contract re-signed @ 15-16, hence the jump in
salary (and overrated-ness… lol).
Findings:
● A “nobody”, and considered a “risky” signing
@ 12-13 (due to his injury record).
● The rest is history.
● Contract will expire at the end of the 16-17 season;
expect a rocket jump.
Actionable insight
● Assuming their salaries won't
change much, the Lakers can
sign those players.
● Maybe add some “overrated”
players who can mentor /
attract fans to games.
● The Lakers have to pay me a data
science consultancy fee to
get the full result :)
Actionable insight
● Fire the Nets' GM / whoever made the signing
decisions!
Let’s have some fun
Challenges along the journey
● Feature engineering didn't help much…
● Couldn't find features to create
● Not enough data
● No economic features
Future ideas
● Gather economic data, such as social media followers (Facebook) and activity for every player, plus team ticket and jersey sales, and see if the additional data increases the models' score.
● Is 0.66 the ceiling for salary prediction if only performance data is used?
● Find out if MJ was really overrated/overpriced.
● Is there a way to create features from the basic statistics to improve the score?