
Automatic Classification of Pick-and-Roll Plays in the NBA

Will Qiu

Introduction

The pick-and-roll is a basic and fundamental type of play frequently employed by teams in the NBA. Knowing both how to execute a pick-and-roll and how to stop its execution is an important skill that players and coaches must develop. This has led teams to develop a wide variety of pick-and-roll plays to use during games, and many teams have come to be known for their particular style of play. Recognizing when a pick-and-roll is about to occur, and its general type, is vital for a defensive team's success in stopping its execution in a timely manner. More broadly, knowing which kinds of pick-and-roll tend to be effective is also important for teams as they learn how best to utilize their players on the court. Traditionally, this task is relegated to the coaching staff and involves spending a tedious amount of time in both tape-watching sessions and scrimmages. Thus, a method that can take the large volumes of available game data and recognize the specific play-making strategies of teams in the league would help coaches better utilize their time when developing new styles of play.

[Figure: pick-and-roll diagram, via coachesclipboard.net]

Data and Features

NBA SportVU data of player and ball movements from 630 regular-season games of the 2015-2016 season, in the form of (x, y) coordinates, was used. The data is discretized into individual events and further discretized into single frames (moments) at 25 frames per second. Relevant "pick-and-roll-able" moments and their features were extracted. The data were then labeled into several categories for prediction: {no play, screened shot, pick-and-pop, classic pick-and-roll, pass to an open man}. I attempted both binary classification, {pick-and-roll, not pick-and-roll}, and multi-label classification.
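The poster does not show the extraction pipeline itself. As a minimal sketch of the discretization described above, assume each event is held in memory as an array of per-frame (x, y) positions for the ten players and the ball, sampled at 25 Hz; the array layout, the synthetic data, and the clip_moment helper are illustrative assumptions, not the author's code:

```python
import numpy as np

FPS = 25  # SportVU sampling rate (frames per second)

# Hypothetical layout: one event as an array of shape (n_frames, 11, 2),
# i.e. ten players plus the ball, each tracked as an (x, y) position on
# a 94 ft x 50 ft NBA court.
rng = np.random.default_rng(0)
event = rng.uniform([0.0, 0.0], [94.0, 50.0], size=(250, 11, 2))

def clip_moment(event: np.ndarray, start_s: float, end_s: float) -> np.ndarray:
    """Slice a candidate 'moment' out of an event by elapsed time in seconds."""
    return event[int(start_s * FPS):int(end_s * FPS)]

moment = clip_moment(event, 2.0, 6.0)  # a 4-second, 100-frame window
```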

Feature                                              Representation
Moment length (normalized to 1)                      m_end − m_start
Time of ball screen                                  t_s = argmin_t d_t(p_d, p_s)  [1]
Ball handler's starting distance to basket           d_0(p_h, b)
Speed of player i                                    S_i = ∑_t (d_t / t)
Min distance (all pairwise players and basket)       min({d_1, d_2, ..., d_t})
Max distance (all pairwise players and basket)       max({d_1, d_2, ..., d_t})
Average distance (all pairwise players and basket)   avg({d_1, d_2, ..., d_t})
Total distance traveled by player i                  sum({d_i(2−1), d_i(3−2), ..., d_i(t−(t−1))})

[Figure: scatter plot of main features (handler, screener, and on-ball defender)]
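To illustrate how features like those in the table can be computed from a moment, here is a hedged Python sketch; it reuses the assumed (n_frames, 11, 2) layout from above, and the basket coordinates and entity indices (ball_idx, handler_idx) are placeholders rather than the author's actual indexing:

```python
import numpy as np
from itertools import combinations

def moment_features(moment: np.ndarray,
                    basket=np.array([5.25, 25.0]),  # illustrative hoop position
                    handler_idx=1, ball_idx=0) -> dict:
    """Compute several of the tabled features from one (n_frames, 11, 2) moment."""
    n_frames = len(moment)
    feats = {}

    # Ball handler's starting distance to basket: d_0(p_h, b)
    feats["d0_handler_basket"] = np.linalg.norm(moment[0, handler_idx] - basket)

    # Per-frame pairwise distances among all entities plus the basket
    points = np.concatenate([moment, np.tile(basket, (n_frames, 1, 1))], axis=1)
    pair_d = np.array([
        [np.linalg.norm(f[i] - f[j])
         for i, j in combinations(range(points.shape[1]), 2)]
        for f in points
    ])
    feats["min_dist"] = pair_d.min()
    feats["max_dist"] = pair_d.max()
    feats["avg_dist"] = pair_d.mean()

    # Total distance traveled: sum of frame-to-frame displacements
    steps = np.linalg.norm(np.diff(moment, axis=0), axis=2)  # (n_frames-1, 11)
    feats["total_dist_handler"] = steps[:, handler_idx].sum()
    return feats

feats = moment_features(moment)  # applied to the sample moment above
```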

Methods

I train and assess the performance of two classifiers, gradient boosting and SVM; specifically, two separate groups of models. The first group attempts simple binary classification of pick-and-roll plays. The second group attempts a multi-class classification task in which several types of plays, in addition to the classic pick-and-roll, are predicted. The data is split 70-30 in order to compute validation errors. The training set contains n_training = 134,165 total unique plays and the test set contains n_test = 4,816 total unique plays.
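The poster does not say how the split was produced. A minimal sketch with scikit-learn, using placeholder features and labels in place of the extracted plays (the stratification and random seed are my assumptions):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the extracted play features and labels.
X = np.random.rand(1000, 8)         # 8 features per play (illustrative)
y = np.random.randint(0, 2, 1000)   # binary pick-and-roll labels

# 70-30 split; stratifying on y keeps the class balance comparable
# across the two sets (an assumption, the poster does not specify).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
```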

Gradient Boosting Classification

Gradient boosting is a boosting method that uses gradient descent to train a number of weak classifiers, which are then combined to form a single, more robust classifier. For the case of binary classification, gradient boosting is formulated in the following way:

Start with an initial model F_i for a given class i. Then, for each iteration until convergence:

1. Calculate the negative gradient for i:

   −g_i(x_i) = −∂L(y_i, F(x_i)) / ∂F(x_i)

2. Fit a regression tree h_i to −g_i(x_i).

3. Update the model:

   F_{t+1} := F_t + ρh

Here, L is a loss function and 0 < ρ ≤ 1 is the learning rate; L is chosen to be the logistic loss. To find the best classifier, I experimented with different learning rates ρ and assessed each one's performance.
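The poster does not include an implementation. The following is a minimal from-scratch sketch of the update above for labels y in {−1, +1}, using the logistic loss in Friedman's formulation, L(y, F) = log(1 + exp(−2yF)), and a depth-limited regression tree as the weak learner; the tree depth, tree count, and zero initialization are illustrative choices:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_binary(X, y, n_trees=100, rho=0.1, max_depth=3):
    """Binary gradient boosting with logistic loss; y must be in {-1, +1}."""
    F = np.zeros(len(y))  # F_0: start from a zero-score model
    trees = []
    for _ in range(n_trees):
        # Negative gradient of L(y, F) = log(1 + exp(-2yF)) w.r.t. F(x_i)
        neg_grad = 2.0 * y / (1.0 + np.exp(2.0 * y * F))
        # Fit a regression tree h to the negative gradient
        h = DecisionTreeRegressor(max_depth=max_depth).fit(X, neg_grad)
        # Update: F_{t+1} := F_t + rho * h
        F += rho * h.predict(X)
        trees.append(h)
    return trees

def boosted_predict(trees, X, rho=0.1):
    """Sign of the accumulated score gives the predicted class."""
    F = sum(rho * h.predict(X) for h in trees)
    return np.where(F >= 0.0, 1, -1)
```

Applied to the placeholder split above, the {0, 1} labels would first be mapped to ±1 (e.g. 2*y_train − 1). In practice, runs like the poster's 5000-tree models could equally be reproduced with scikit-learn's GradientBoostingClassifier(n_estimators=5000, learning_rate=rho), which implements the same scheme with additional refinements.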

Support Vector Machines

The Support Vector Machine is a more traditional method which attempts to find the maximal separating hyperplane between two sets of points. For classes y ∈ {−1, 1}, the typical SVM classifier takes the form:

h_{w,b}(x) = g(w^T x + b)

where

w^T x = ∑_i α_i y_i K(x_i, x)

b = (1/|SV|) ∑_{i∈SV} ( y_i − ∑_j α_j y_j K(x_j, x_i) )

Here, α_i are the dual coefficients, SV is the set of support vectors, and K is the kernel function. For the classification task here, the radial basis function (RBF) is used as the kernel. The RBF kernel is defined as:

K(x, x′) = exp( −||x − x′||² / (2σ²) )
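A minimal sketch of this model with scikit-learn's SVC, reusing the placeholder split from above; note that SVC parameterizes the RBF kernel as exp(−γ||x − x′||²), so γ corresponds to 1/(2σ²):

```python
from sklearn.svm import SVC

# RBF-kernel SVM; gamma = 1 / (2 * sigma**2) in the notation above.
# gamma="scale" is scikit-learn's default heuristic (my assumption here);
# probability=True enables the class scores used for ROC curves below.
svm = SVC(kernel="rbf", gamma="scale", probability=True)
svm.fit(X_train, y_train)

test_acc = svm.score(X_test, y_test)  # mean accuracy on held-out plays
```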

Results and Error Analysis

Model Training and Test Error (Training size: 134,165; Test size: 4,816)

Model                                        Training Accuracy   Test Accuracy
Binary SVM with RBF kernel                   0.9459              0.946
Multi-class SVM with RBF kernel              0.9918              0.695
Gradient boosting (binary, ρ = 0.1)          0.937               0.930
Gradient boosting (binary, ρ = 0.2)          0.941               0.933
Gradient boosting (binary, ρ = 0.5)          0.946               0.940
Gradient boosting (multi-class, ρ = 0.1)     0.939               0.931
Gradient boosting (multi-class, ρ = 0.2)     0.947               0.941
Gradient boosting (multi-class, ρ = 0.5)     0.918               0.90

[Figure: ROC curves for the different methods (binary and multi-label classifiers). Panels: binary classification (pick-and-roll vs. not pick-and-roll) plus one panel per multi-class label (No_play, Screened_shot, Pass_to_open_man_and_drive, pnp_non_roll_man, pick-n-roll, Handler_drive_to_basket). Each panel plots true positive rate against false positive rate for gradient boosting with 5000 trees at learning rates 0.1, 0.2, and 0.5, and for the RBF-kernel SVM.]
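The poster does not show how the curves were produced. A plausible sketch for the binary panel with scikit-learn, reusing the fitted SVM and placeholder test split from above (the multi-class panels would score each label one-vs-rest the same way; the plotting details are my own choices):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

# Score each held-out play and trace the ROC curve.
scores = svm.predict_proba(X_test)[:, 1]   # P(pick-and-roll)
fpr, tpr, _ = roc_curve(y_test, scores)

plt.plot(fpr, tpr, label="svm_rbf")
plt.plot([0, 1], [0, 1], linestyle="--")   # chance line
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Binary classification (pick-and-roll vs not pick-and-roll)")
plt.legend()
plt.show()
```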

Discussion and Extension

All of the methods perform very well on binary classification, and their differences are negligible. There is, however, greater variation between methods on multi-label classification, depending on the specific label being predicted. Surprisingly, the multi-label SVM had terrible performance overall compared with gradient boosting.

Gradient boosting is a much better method overall, but the high false positive rates for some of the labels suggest that the extracted features do not provide adequate predictive value for distinguishing between more nuanced types of plays.

One possible future extension is to take into account the sequential nature of the subject matter and use as input a sequence of player positions to make predictions about the next likely action or set of sequences.

References

[1] Armand McQueen, Jenna Wiens, and John Guttag. "Automatically recognizing on-ball screens". In: 2014 MIT Sloan Sports Analytics Conference, 2014.
