Download - Data-Driven Ghosting using Deep Imitation Learning...1 2017 Research Papers Competition Presented by: Data-Driven Ghosting using Deep Imitation Learning Hoang M. Le1, Peter Carr2,

1

2017ResearchPapersCompetitionPresentedby:

Data-DrivenGhostingusingDeepImitationLearning

HoangM.Le1,PeterCarr2,YisongYue1,andPatrickLucey3CaliforniaInstituteofTechnology1,DisneyResearch2,andSTATSLLC3

1. IntroductionCurrentstate-of-the-artsportsstatisticscompareplayersandteamstoleagueaverageperformance.For example,metrics such as “Wins-above-Replacement” (WAR) in baseball [1], “Expected PointValue”(EPV)inbasketball[2]and“ExpectedGoalValue”(EGV)insoccer[3]andhockey[4]arenowcommonplaceinperformanceanalysis.Suchmeasuresallowustoanswerthequestion“howdoesthisplayerorteamcomparetotheleagueaverage?”Even“personalizedmetrics”whichcananswerhowa“player’sorteam’scurrentperformancecomparestoitsexpectedperformance”havebeenusedtobetteranalyzeandimprovepredictionoffutureoutcomes[5].

Thesemeasureshaveenhancedourabilitytoanalyze,compareandvalueperformanceinsport.Butthey are inherently limited because they are tied to a discrete outcome of a specific event. Forexample,EPVforbasketballfocusesonestimatingtheprobabilityofaplayermakingashotbasedonthecurrentsituation,andislearntoffenormousamountsofhistoricaldata.Thegeneralusecaseisthentoaggregatetheseoutcomes,andcompareandrankthemtoseehowvariousplayersandteamscomparetoeachother.Incontrast,whatwe’dreallyliketoknowishowteamscreatetimeandspaceforscoringopportunitiesatthefine-grainlevel.

Withthewidespread(andgrowing)availabilityofplayerandballtrackingdatacomesthepotentialtoquantitativelyanalyzeandcomparefine-grainmovementpatterns.Anexcellentexampleofthiswasthe2013ESPNarticlewrittenbyZachLowe,whichdescribedhowtheTorontoRaptorswereusing“ghosting”toanalyzeplayerdecision-makinginSTATSSportVUtrackingdata[6].Specifically,theRaptorscreatedsoftwaretopredictwhatadefensiveplayershouldhavedoneinsteadofwhattheyactuallydid. Developingthecomputerprogramrequiredsubstantialmanualannotation,butthe insightgainedturnedheadsbecause itmadetheeffectivenessofdefensivepositioningbothameasurableandviewablequantityforthefirsttime.

Motivatedbytheoriginal“ghosting”work,weshowcaseanautomatic“data-drivenghosting”methodusingadvancedmachinelearningmethodologiesappliedtoaseason’sworthoftrackingdatafromarecent professional league in soccer. An example of our approach is depicted in Figure 1whichillustrates a scoring chance that Fulham (red) created against Swansea (blue). Suppose we areinterestedinanalyzingthedefensivemovementsofSwansea.Itmightbeusefultovisualizewhattheteamactuallydidcomparedtowhatatypicalteamintheleaguemighthavedone.Usingourapproach,we are able to generate the defensive motion pattern of the “league average” team, whichinterestinglyresultsinasimilarexpectedgoalvalue(69.1%forSwanseaand71.8%forthe“leagueaverage” ghosts -- to fully appreciate the insights revealed by data-driven ghosting,we urge thereaders to view the supplemental video at http://www.disneyresearch.com/publication/data-driven-ghosting.)

2


Figure1Ourdata-drivenghostingmethodcanbeappliedtovariousgamecontextstobetterunderstanddefensivestrategies.Inthedepictedscenario,Fulham(red)scoresagoalonSwansea(blue).Ghosts(white)representwhereSwanseadefendersshouldhavebeenaccordingtoaleagueaveragemodel(LAG)andManchesterCitymodel(MUG).

Howeverinpractice,acoachoranalystmaynotwanttocomparetheirdefensetotheleagueaverage,buttoanotherspecificteam.By“fine-tuning”ourleagueaveragemodeltothetrackingdatafromaparticular team, our data-driven ghosting technique can estimate how each team might haveapproached thesituation. Forexample, thecoach/analystmaywant to seehowManchesterCitywoulddefendthesameattackingplay.Usingourapproach,wecannowseehowtheywoulddefenddifferently(Figure1(right)),andhowmuchitchangestheEGV(69.1%to41.7%).Theotherbenefitofusingourghostingapproachisthatissavesthecoach/analystfromsearchingforsimilarplaysinothermatches(whichmaynotevenexist).

To achieve automatic ghosting, we leverage a machine learning method called “deep imitationlearning”.OurmethodologyresemblestechniquesusedtoteachcomputerstoplayAtari[7]andGo[8]. We modify standard recurrent neural network training to consider both instantaneous andfuture losses,whichenablesghostedplayerstoanticipatemovementsoftheirteammatesandtheopposition.More importantly,ourapproachavoidstheneedforman-yearsofmanualannotation.Ourghostingmodelcanbetrainedinseveralhours,afterwhichitcanghosteveryplayinreal-time.

Inthenextsections,wedescribethemethodologybehindourghostingsystem,andshowcasehowautomatedghostingcanprovide insightfulanalysesandcomparisonsof teamdefensivebehavior.Wealsoemphasizethatourapproachisgeneral,andcanbeappliedtoawiderangeofsportssuchasbasketballandfootball.

2.DeepImitationLearningforModelingDefensiveSituationsWhilethemathematicalbackgroundrequiredtoimplementourdeepimitationlearningmethodcanseemcomplicated(seeAppendixAforfulldetails),thehigh-levelintuitionisquitesimple. Inthissectionwe provide an overview of deep imitation learning, and how it can be applied to soccertrackingdata.

3


Forthispaper,weused100gamesofplayertrackingandeventdatafromarecentprofessionalsoccerleague.Asweareinterestedinmodelingdefensivesituations,weonlyfocusedonsequencesofplaywhere the opposition had control of the ball. A defensive sequence is terminatedwhen a goal isscoredagainstthedefendingteam,theballgetsoutofthepitch,dead-balleventsoccur(e.g.foul,off-side),orthedefensiveteamregainspossessionoftheball.Intotal,therewereapproximately17400sequences of attacking-defending situations (~3 million frames at 10 frames per second). Theaveragelengthofallsequencesisapproximately170frames,or17seconds.

2.1.Whatis“DeepImitationLearning?”Beforeexplainingwhat“deepimitationlearning”is,wefirstexplainwhat“imitationlearning”is.Inmanycomplexsituations,itcanbeverychallengingforahumanexperttodescribeandcodifythepolicyorstrategyduetothegranularityorfidelityofthesituation.Forsuchtasks,wecanusemachinelearningtoautomaticallylearnagoodpolicyfromobservedexpertbehavior,alsoknownasimitationlearning or learning from demonstrations, which has proven tremendously useful in control androboticsapplications[9-14].

Duetothedynamic,continuousandhighlystrategicnatureofsports likesoccer,hand-craftingormanuallydescribingstrategyata fine-grain level isequallyproblematic.Forexample inFigure1,gettingahumantodescribethelocation,velocityandaccelerationofeveryplayerattheframe-levelwouldbeprohibitivelytime-consuminganderror-riddled.Evenifahumanwereabletodescribetheplayviarules,itishighlyunlikelythatanotherhumanwouldbeabletolearnfromsuchrulesasitwouldsurelymiss some importantcontextorother information. Inpractice,ahumanwould justobservemanyexamplesuntil theycouldunderstandwhat todoataconceptual level.Teachingacomputer isnodifferent.Thekey is to firstobtain the right representationwhichcanenable thecomputertolearnfromtheobservations.Inrecentyears,deeplearninghasproventobeapowerfultoolcapableoflearningamulti-layerrepresentationhiddeninthedata,enablingautomaticfeaturediscovery that saves tremendous amount of human engineering effort. In this work, we bringtogetherelementsfromautomaticformationdiscovery[16],imitationlearningcombinedwithdeeplearningmethodstolearncomplexrelationshipsfromhigh-dimensionalspatiotempotalsportdatadomainssuchassoccer.

2.2.FormationDiscoveryandDeepImitationLearningApplicationAsthedatacomesfromdifferentteamsandplayers,onekeycomponentofthepre-processingstepisrole-alignment(ororderingtheplayers ina formwherethe computer can quickly compare strategicallysimilarplays).Weextractthedominantroleforeachplayer fromboth thedefending andattacking teambased on the centroid positions throughout thesegmentofplay,regardlessofthenominalpositionofsuch player. For example, a player whose nominalroleiscentraldefendermayfindhimselfoccupyingthedominantroleofamidfielderincertainsequenceof play. Instead of enforcing a pre-determined

formationontotheteams,thecentroidpositionsfor

4


eachsequenceareautomaticallydiscoveredfromdatabyclusteringeachrole,viaalinearassignmentalgorithm, to a role centroid represented by a mixture of Gaussian distributions, in a way thatmaximizes the self-consistency within role from one segment of play to another (resembles themethodfrom[16]).Theresultisanaverageformationacrosstheseasoncloselyresemblinga4-4-2formation.

As soccer is fundamentally a spatial game, one would expect the geometric relationship amongplayersandtheballtocontainimportantsemanticandstrategicvalues.Weformthefullinputvectortoourmodelbyincludingnotonlytheabsolutecoordinatesofplayersandball,butalsotherelativepolarcoordinates(distanceandangle)ofeachplayertowardstheball,goal,andtherolethatwetrytomodel.The full featurevectorateachtimestepcontainsgeometric features forall roles in theformationintheformofonemini-blockforeachrole.Thesemini-blocksarethenstackedinafixedorder consistentwith role alignment. In order for themodel to identify one-on-one versus zonecoverage, we also indicate which mini-block of input features correspond to the closest threepositionstotherolebeingmodeledateachtimestep.Inaddition,anextrainputvectorindicatingtheteamidentityisaddedtotheinput.Thisidentityvectorisusefulforlearningparticularstructuralandstylisticelementsassociatedwithdifferent teams,allowingus thestudy the impactof tuningourghostingmodeltodifferentteamstyles.

Weuserecurrentneuralnetworks,apopulardeep-learningtool,tolearnthefine-grainedbehaviormodelforeachroleintheformationineachtimestep.AparticulartypeofrecurrentneuralnetworkscalledLongShort-TermMemory(LSTM)wasusedduetoitspowerfulabilitytocapturelong-rangedependenciesinsequentialdata.Themodeltakesinasequenceofinputfeaturevectorsasdescribedaboveand the corresponding sequenceof eachplayer’spositionsasoutput labels.Eachplayer ismodeledbyanLSTM,whichconsistsoftwohiddenlayersofnetworkswith512hiddenunitsineachlayer.Theroleofthesehiddenunitsistocapturetheinformationfromtherecenthistoryofactionsfromallplayersandmaptheinformationtothepositionofthenexttimestep,inamanneranalogoustohowanAIprogramwastrainedfromdatatomapthehistoryofthegametothenext frameofactioninAtariandGo.

Usingthestandarddeeplearningapproach,however,provesinsufficienttolearnarobustbehaviormodel,duetothetypicallylongtemporalspanofsequencesofplayandsheerhigh-dimensionalityofthelearningproblem.Intuitively,aswithanyregressionmethod,themodel’spredictionscandeviatefromthegroundtruthlabels.Insequentialmodelingsettings,thisdeviationcancompoundovertimeandcanleadtoseriousmodelingerrors.Toaddressthisissue,weleveragetechniquesfromimitationlearning. Themain idea is thatwewant to train amodel that can learn to recover from its ownpredictionmistakesso that themodelcanberobustover longsequencesofdecisions.Weuseanimitationlearningalgorithmthatlearnstocapturenotonlythebehaviorofeachroleintheteam,butalsohowmultipleplayersineachteamjointlybehavefromoneframetothenext.Afulldescriptionofourmachinelearningapproachisgivenintheappendix.

5


3.CharacterizationofTheAverageGhostingTeam

Asafirstchecktoseeifourleague-averagemodelpassestheeye-test,wedeploythetrainedmodelon held out sequences to inspect whether the trainedmodel behaves in a sensiblemanner.Weobserve that our model learned to maintain solid defensive formation and structure, with themodeledplayersmovinginamannerthatexhibitsspatialandformationalawareness.Toillustratethis,weshowthreeexamplesofplayinFigure3.

Inthefirstexample,theghostingplayers(white)fromLiverpoolmovetogetherwiththerestoftheteammates (blue) in pressing higher up the pitch in a situationwhenManchester City (red) justregained possession in the middle of the field. To avoid clutter, we display only the ghostingtrajectories(inwhite)ofthefiveleftpositionsoftheteam.

Figure3Examplesofghostingbehaviorforour“leagueaverage”model.

Because our algorithmmodels the interactions between teammates, our ghosted players exhibithigh-level coherent team behaviors. In the middle panel of Figure 3, ghost player number 5 ofSunderlandbrokefromformationtochallengetheballcarriernumber7ofAstonVilla(red).Ghostplayer number 9 of Sunderland swaps roles with ghost number 5 and drops back to mark theattackingnumber6.

Inmanysituations,thebehaviorofthe“leagueaverage”teammaybesubstantiallydifferentfromtheactualplaysuchthattheoutcomecouldbedifferent.IntherightpaneofFigure3,theghostingplayernumber8moreproactivelyclosedthepassinglanethatcouldhavepreventedtheshotongoalbyattackingnumber6(red),whichhappenedinrealityduetothefactthatbothnumber5and8fromQPR(blue)dropmoredeeplyandyieldedopenspace.

3.1VariancefromTheAverageGhostingbyPositionsandTeam

We canquantifyhow theplayers and teamsdiffer from the “average” team.Theaverage level ofdeviationacrossallplayersandteamsisabout~4meters.Toputthisnumberincontext,notethat

6


thisisahighlyaccurateaveragelevelofprecision,asthemodelhastotakeintoaccountthearbitrarilylongsequenceofplay.Incontrast,morenaivemachinelearningapproaches(thatdonotaccountforerrorpropagationthroughtime)wouldsufferfromveryhighlevelsofpredictionerror(frequentlyinthe20-30metersrange).Wecanfurtherbreakdownthedeviationintogroupsofposition.Forthedefenderpositions,themajorityofthedeviationsarelessthan3metersapartfromtheactualplayerpositions.Thedeviationincreasesasthedominantpositionsmovefurtherupthefield.Thisreflectsincreasinglevelofvariationinhowattack-orientedplayerswoulddefend.Thedefensivebehaviorofa striker, for example, could change significantly from team to team and among specific players,leading to ahigher level of varianceby the “average” ghostingpositions.We show thisdeviationaccordingtopositioninFigure4.

Figure4:Deviationfromtheaverageghostingmodelbypositions(Central Defenders (green), Central Midfielders (blue) andForwards(red)

Forourteamanalysis,weusedall20teamsinourdataset. As we can not disclose the specificperformance of these teams, we denote them asteams A-T. At the team level, this variance canindicatewhich teamdiffers themost fromaveragebehavior in terms of defensive positioning. Wequantifythisoutofpositionratiobyusingan80-20rule:aplayeratanygivenmomentisconsideredoutofpositionifhisdeviationfromhisghostisinexcessofthe80thpercentileoftheentireleagueaverage.Abreakdown of this tendency gives us a sense of

whichteamsusenon-conventionalformations.InFigure5,theteamsaresortedbythetotalnumberofgoalsconcededovertheseasoninanincreasingorder.Weanalyzedeachteam’sbehaviorviatwogroups: i) “back line” positions (backs), and ii) “front line” positions (midfielders and forwards).While this positioning variance is only onepart of thewholepicture, note that both the top andbottomteamsintheleagueintermsofthegoalconcededwerethetwooutliers.Forthetopteam,the

deviation may come in part from the attack-oriented positions (wingers and forwards)exhibiting awider range ofmovement relative toother teams. In the caseof thebottom team,whoended the season with a relegation, bothdominantlydefensiveandoffensiveplayerstendtodriftfarfromtheleagueaverage.

Figure 5: The out of position frequency (large deviations) byteamforboththebackandforwardpositions.Teamsareorderedfrom left to right sorted by the overall goals conceded in theentireseason

7


3.2ExpectedGoalValueofGhostingTeam

Anaturalusecaseforghostingistostudyhowhypotheticalscenariosmayunfold,i.e.,counterfactualreasoning.Inadditiontoqualitativeassessmentsofghosting,wealsowishtoquantitativelyassesstheeffectofalternativedefensivereactionstothesamesituation.Asacasestudy,weanalyzedsafety-criticalplaysequences,suchasgoalscoringopportunities,toseehowtheplaymaybealteredviaghostingtoimprovethedefensivepositioning.To analyze the goal scoring opportunities, we extracted shot events from the entire season.Concretely,wefocussedonthe10-secondsegmentsleadinguptoanopen-playshotevent,eitheronorofftarget(wedidnotusepenalties,free-kicks,setpieces,andgamesequenceswithlessthan11playersfromeitherside).WequantifiedtheperformanceofeachmodelusingtheExpectedGoalValue(EGV)metric[3],whichcanestimatethegoalscoringprobabilityofeachshotbasedontherecentplayerandballpositionsandevents.SimilartothesetupinFigure1,foreachshotsequence,theaverageghostisinitializedbythecurrentpositionsofthedefensiveplayersonlyfortheveryfirstframe,andthesequenceisunfoldedthereafterbyrunningtheghostingmodelacrosstheremainingtimesteps.Notethatallotherfactorsofthegivensequenceremainedunchanged(attackingplayerpositionsandballmovement).Attheendofthesequence,theexpectedgoalprobabilityiscalculatedbasedonthecollectivebehavioroftheghostingteam.TheoverallEGVisthesumofexpectedgoal

valuesoverallsequences.TheexpectedperformanceoftheleagueaverageteamversustheactualoutcomesareshowninTable1.TheEGVfromopen-playsshot events improved compared to the total numbers of goalsconcededfromall20teams(EGVof474onactualgoalcountof494fromopen-plays),primarilyduetothe“leagueaverage”teamabletolowerthescoringchanceagainstseveraloftheweakestteamsintheleague(teamI,J,K,LandT).4.QuantifyingtheEffectofTeamStyleAnalystsandcoachesoftenwanttocompareteamperformancewithnotjusttheleagueaverage,butalsowithspecificteamsandcertain characteristics (attack-oriented, possession-based etc.).Doingsoatafine-grainedlevel,however, isdifficultandnearlyimpossiblewithdiscretestatistics.Ourghostingsystemprovidesawaytonotonlymodeltheexpectedtrajectoryofeachplayer,but also incorporate the ability to impose stylistic elements ofspecific teams in order to answer the question: how woulddifferentteamsreacttothesamesituation?Toaddressthisquestion,weemployedtechniquesfrommachinelearning for domain adaptation. Intuitively, by taking an extravectorindicatingtheidentityoftheteamasinputtothemodel,ourdeep-learningbasedmodelcanextractelements

8


relevanttoeachteam’splayingstyle(suchasspatialarrangement,aggressivenessetc.).Inanygivenghostingscenario,the“average”teammodelcanthenbeadaptedtotheplayingstyleofanyteamintheleaguebychangingtheteamidentityvector,thusallowssimulatinghowtheghostteamplayingwithadifferentteamstylewouldfareunderthesamescenarios.Thisisanalogoustorecentdeeplearningadvancesinstyletransfer,wherethestylisticelementsfrompaintingsandpicturescanbeextractedwithadatasetconsistingof,forexample,VanGogh’sworks,sothatVanGogh’spaintingstylecanbetransferredtootherimages[15].Westudyhowdifferentstylescanimpactthedefensiveperformance,comparedtotheaveragemodel.Withdomainadaptation, theaveragemodel can takeon the identitiesofeachof20 teams in theleague,onall6020open-playshoteventsacrosstheentireseason.Wethencomparehowtheaverageghostingteamandteam-specificghostsperformrelativetotheactualoutcomes,andtoeachother.

Figure6Theeffectofassigningdifferentteamstylestoaverageghostingmodelintermsoftotalexpectedgoalvaluesoverseason’sopen-playshots(green)vs.totalactualnumberofgoalsconcededbyeachteam(blue)

OurresultsaresummarizedinFigure6.Interestingly,noticethedifferenceindefensiveperformanceof each team style. Since all other factors are controlled, the difference in overall expected goalsconcededcanbeattributedtodifferentdefendingstyles.Weobservea61.8%correlationbetweenthetotalEGVcoming fromdifferentghostingstyleswiththeoverallnumberofgoalsconceded inreality.Whileluckandindividualskillsmatterforanyteam,teamF’sdefensiveperformanceseemsattributable to an efficient defensive formation (4-5-1 pressing system under a new coach. TheformationdiscoverymethodofSection2.2appliedtoteamFalsoindicatestheyfrequentlyplaywith5midfielders,withthevarianceofmidfielders’movementshigherthantherestoftheleague).Duringtheactualseason,teamFalsopossessedthesecond-lowestgoalconcededperopen-playopportunityintheleague(calculatedfromTable1),astatisticconsistentwiththeefficiencyofteamF’sghost.Ofcourse,thisisonlyonepieceofthedefensiveequation,sinceshoteventsinisolationdonotreflecthowteamsorganizetheirdefenselongpriortotheshotopportunity.Inaddition,EGVforthistypeofsituationisnotperfect,sinceitdoesnotcapturethepossibilityofdefensiveghoststerminatingtheshot opportunity altogether. Nonetheless, ghosting to different teams in thismanner is a still aninterestingandnovelwaytocomparedefensivebehaviorsacrossteamsonanequalfooting,withoutrelyingondisparatestatisticscomingfromdifferentgamesandscenarios.

9


The approach we described thus far is a data-efficient way to extract “personalized” strategicelements fromdifferent teams to facilitatebothmodelingandevaluation.Weemphasize thatourapproachisgeneralandcanbeappliedtomodelingnotjusttheaverageteamintheleague,butalsoteam-specificmodels(assumingsufficienttrackingdata).5.ExampleofTheDynamicsofGhostingNowthatwehaveabetterideaofhowourghostingmethodworks,andhowthe“leagueaverage”modelvariestoaspecificteam’smodel,wecanrevisittheexampleshowninFigure1andbreak-downthisplayingreaterdetail(aseparatevideohighlightingmanydifferentscenariosisavailableathttp://www.disneyresearch.com/publication/data-driven-ghosting).Asbefore,weconsiderthesequenceofplaybetweenFulham(Red,attackingfromleft)andSwansea(Blue,defendingonright).WecomparewhatSwanseaactuallydidwiththatoftheleagueaverageghosts(LAG-white,top)andtheManchesterCityghosts(MCG-white,bottom).Theballtrajectoriesareinyellow.Notethatwedonotmodelgoal-keeperpositions,whicharehighlighted(number1).Toavoidclutter,wesuppressirrelevanttrajectoriestohighlightkeyplayersanddynamics.

Figure7Exampleofthedynamicsofghostingacrossaplaysequencebytwodifferentghostingstyles.

10


TheactualscenarioresultedinagoalscoredagainstSwanseaafterareboundfromacounter-attackbyFulham.Asmentionedintheprevioussection,bothghostingmodelswereinitializedwiththeactualpositionsofSwanseaplayersinthefirstframe.Theghostingsystemtakesoverandmakesdecisioninreal-timeabouthoweachplayershouldpositionhimselfforeveryframethereafter. Red#6 and Red#7 open the counter-attack with a “give-and-go pass” to get behind Blue#4 andBlue#7.Instage1(leftcolumn),bothLAGandMCGbehavesimilarlyastheplayunfolds,andcloselyresembletheactualSwanseadefense.TheonlyexceptionisLAG#8.ComparedtoBlue#8andMCG#8, LAG#8 proactively challenges the dribble by Red#6,who in reality drew in Blue#5 andcreatedawide-openpasstoRed#10,leadingintostage2(middlecolumn).Thisisthekeymomentin the play. In the actual sequence, Red#10 is left unmarked and gets an uncontested shot ongoalkeeperBlue#1.HerebothMCG#5andLAG#5positionedthemselvesfurtherbackandwereabletogettoRed#10intimetocontesttheshot.Suchattempthypotheticallycouldhavepreventedthereboundthatledtothegoal.Asthesequenceofeventsunfoldtostage3,thereboundfoundRed#9,whoisleftuncoveredbyactualBlue#9,leadingtothegoal(rightcolumn).ThenuanceddifferenceinpositioningbyMCG#2andLAG#2endedupmakingamajordifferenceinthescoringchance.SimilartoBlue#2,LAG#2failedtocoverRed#9.MCG#2,however,rushedbackdeeperintothebacklinefromearliermomentsoftheattackandwasinapositiontocontesttheshotontheopennet,andcouldhave covered for an attempt by both Red#7 and Red#9. The end resultwas a reduction in goalprobabilityfrom~70%(bothactualsequenceandLAGsequence)to~40%byMCG.Thefine-grainedsimulationandevaluationofdefensivescenariospresentedherewouldnothavebeenpossibleusingonlydiscretestatisticsaspopularlyusedinsportanalytics.Ourmethodisalsogeneral and has a rich potential to enable team-specific modeling for both coaching and mediaanalysispurposes.6.SummaryTheongoingexplosionoftrackingdatahasnowmadeitpossibletoapplypowerfulmodernmachinelearning techniques tobuild increasingly fine-grainedmodelsofplayerand teambehavior. Withdata-drivenghosting,we cannow, for the first time, scalablyquantify, analyzeand compare fine-graineddefensivebehavior.Inthispaper,wehavedemonstratedthevalueofourapproachinarangeofcasestudies.Weemphasizethatourapproachisalsoapplicabletosportsbeyondsoccer,suchasbasketballandfootball.

11


References[1]http://www.fangraphs.com/library/misc/war/[2]D.Cervone,A.D’Amour,L.BornnandK.Goldsberry,“POINTWISE:PredictingPointsandValuingDecisionsinRealTimewithNBAOpticalTrackingData”atMITSSAC,2014.[3]P.Lucey,A.Bialkowski,M.Monfort,P.CarrandI.Matthews,“QualityvsQuantity:ImprovedShotPredictioninSoccerusingStrategicFeaturesfromSpatiotemporalData”,inMITSSAC,2015.[4]B.Macdonald,“AnExpectedGoalsModelforEvaluatingNHLTeamsandPlayers”,inMITSSAC,2012.[5]X.Wei,P.Lucey,S.Morgan,M.ReidandS.Sridharan,““TheThinEdgeoftheWedge”:AccuratelyPredictingShotOutcomesinTennisusingStyleandContextPriors”,inMITSSAC,2016.[6]Z.Lowe.“Lights,Cameras,Revolution”.Grantland.19Mar2013.[7]R.McMillan.“Google’sAIisNowSmartEnoughtoPlayAtariLikethePros”.WIRED.25Feb2015.[8]C.Metz.“WhattheAIBehindAlphaGoCanTeachUsAboutBeingHuman”.WIRED.19May2016.[9]P.Abbeel,andA.Ng.“Apprenticeshiplearningviainversereinforcementlearning.”InInternationalConferenceonMachineLearning(ICML),2004.[10]H.Le,A.Kang,Y.YueandP.Carr,“SmoothImitationLearningforOnlineSequencePrediction”.InInternationalConferenceonMachineLearning(ICML),2016.[11]B.Argall,S.Chernova,M.Veloso,andB.Browning.“Asurveyofrobotlearningfromdemonstration”.Roboticsandautonomoussystems,57(5):469–483,2009.[12]S.RossandD.Bagnell.“Efficientreductionsforimitationlearning”.InConferenceonArtificialIntelligenceandStatistics(AISTATS),2010.[13]S.Ross,G.GordonandA.Bagnell.“Areductionofimitationlearningandstructuredpredictiontono-regretonlinelearning”.InConferenceonArtificialIntelligenceandStatistics(AISTATS),2011.[14]A.Jain,B.WojcikandT.Joachims,andA.Saxena.“Learningtrajectorypreferencesformanipulatorsviaiterativeimprovement”.InNeuralInformationProcessingSystems(NIPS),2013.[15]L.Gatys,A.EckerandM.Bethge.“Aneuralalgorithmofartisticstyle”.InNeuralInformationProcessingSystems(NIPS),2015[16]A.Bialjowski,P.Lucey,P.Carr,Y.Yue,S.SridharanandI.Matthews.“Large-scaleanalysisofsoccermatchesusingspatiotemporaltrackingdata”.In2014IEEEInternationalConferenceonDataMining(ICDM),2014

12


AppendixDataPreparationDuetothelackofhighqualityannotateddata,wefocusedonathirdoftheseasonworthofmatches(~100games).Forthepurposeofmodelingsoccerdefense,wefurthersegmentthegameeventsintopossessionsequences.Ateamisdefinedtobeinadefensivesituationwhenitisnotincontroloftheball.Adefensivesequenceisterminatedwhenagoalisscoredagainstthedefendingteam,theballgetsoutofthepitch,dead-balleventsoccur(e.g.foul,off-side),orthedefensiveteamregainspossessionoftheballafterhaving2consecutivetouches.Thispre-processingstepresultsinapproximately17400sequencesofattacking-defendingsituations(~3millionframesat10framespersecond).

Onekeycomponentofthepre-processingofthedataisrole-alignment.Role-alignmentisnecessaryforreducingthedimensionalityofthelearningproblem,essentiallyprovidingadditionalcontexttoimposeorderingonthetraininginput.Atthesimplestlevel,onecanviewthelearningproblemasamappingfromthepreviouspositionsof22playersandtheball,tothepositionofaparticularplayeratthecurrenttimestep.Onekeyissuewithlearningthiskindofmappingwithoutimposinganorderingisthatthedatarequirementtolearna“permutation-invariant”behaviorwillneedtoincreasebyafactorofbillions((10!)2specifically).Toreducethisdataburden,wefirstextractthedominantroleforeachplayerfromboththedefendingandattackingteambasedonthecentroidpositionthroughoutthesegmentofplay,regardlessofthenominalpositionofsuchplayer.Assuch,aplayerwhosenominalroleiscentraldefendermayfindhimselfoccupythedominantroleofamidfielderincertainsequenceofplay.Theleagueaverageapproximatesa4-4-2formation.Ineachsegmentofpossession,weorderthetrainingdatabasedonthedominantroles,wheretheassignmentofroleattemptstomatchthisaverage4-4-2formation,usingtheHungarianalgorithm(forminimumcostassignment,wherecostmeasurethedistanceofeachroletoeachofthe10possibleGaussiandistributionsofspatialcoveragelearnedfromdata).

Figure8Statefeaturevectorextractedfromanorderedlistofplayerpositionateachtimestep

13


Assoccerisaspatialgame,onewouldexpectthegeometricrelationshipamongplayersandtheballtocontainimportantsemanticandstrategicvalues.Weformthefullinputvectorforthepurposeoftrainingbyincludingnotonlytheabsolutecoordinatesofplayersandball,butalsotherelativepolarcoordinatesofeachplayertowardstheball,goal,andtherolethatwetrytomodel.Figure1describesthestructureofafullinputvectorateachtimestep.Heretherolebeingmodelistheleftbackposition.Thefullinputfeaturevectorrepresentingtheleftbackpositionconsistofmini-blocksoffeaturesforeachrolefromthedefendingandattackingteam,sortedbyafixedorder.Themini-blockforthegoal-keeper,forexample,containsabsolutepositionandvelocity,thedistanceandanglerelativetotheleft-backposition,thedistanceandangleofthegoal-keepertothedefendinggoal,andthedistanceandangletotheballatthecurrenttimestep.Inadditiontostackingthemini-blocksoffeaturestogether,wealsoduplicatethefeaturesfromthemini-blockscorrespondingtotheclosest3positionstotheleft-backateachtimesteps.Thisresultsinafullinputfeatureofdimension399foreachroleateachframe.Wecallthisinputvectorthestatefeaturevectorforeachrole.

DeepMulti-agentImitationLearningTheimitationlearningtaskistomapthestatefeaturevectorateachtimesteptothecorrespondingactionoftheplayerbeingmodel,whereactionisdefinedastheplayerpositionatthefollowingtimesteps.Inprinciple,thisisanonlinesequencepredictionproblem,wherethemodeloutputsactionofaplayerconditionedonthestateofsuchplayer,asrepresentedbytherecenthistoryofactionsofthemodeledplayer,aswellasotherplayers.Anaturalcandidatetoaddresssuchonlinesequencepredictionproblemisrecurrentneuralnetworks(RNN).AparticularlypopularclassofRNNs,LongShort-TermMemory(LSTM)hasbeensuccessfullyusedinrecentdeeplearningapplicationssuchasmachinetranslation,speechrecognition,handwritingsynthesis,etc.TwomajordifferencesbetweenoursettingandpreviousapplicationsofLSTMare(i)playersneedtomakedecisioninreal-time,renderingthepopularencoder-decoderapproachtosequence-to-sequencemodelingimpractical(ii)ourlearningset-upbelongstotheclassofdynamicalsystemlearningwhereactioncouldaltersubsequentstatedistribution,potentiallycausingamismatchbetweentrainingandinferencethatcouldseverelylimittheperformanceoftraditionalLSTMmodels.

Figure9Exampleofphase1ofalgorithmwithk=2

14


Ourproposedmethodjointlycombinestrainingandinferencetoaddressbothoftheseissues.Thismethodsimulatesonlinepredictioninanofflinefashion,whileallowingthemodeltograduallylearnlonger-rangeprediction.Phase1ofthealgorithmlearnsamodelforeachofthe10rolesofthe“average”defendingteam(seefigure9foranillustration).Inphase2,weusedthesepre-trainedmodelslearnedfromphase1toscaleupthetrainingofsingleplayerintojointtrainingofmultipleplayerstomodelcollaborativemulti-agentlearning(figure10showcasesthejointtrainingof2players).

DeepImitationLearningAlgorithm:Todescribeourlearningalgorithminmoredetails,wegenericallydenoteasequenceofstate-actionpairsas{(s0,a0),(s1,a1),…,(sT,aT)}.WeuseT=50inforourghostingapplicationtosoccerdefense.

Phase1:Learnsingleplayermodelforeachrolejin{1,2,..,10}o Initializearecurrentnetworkmodelgiventheground-truthdataset{(s0,a0),

(s1,a1),…,(sT,aT)}forafewiterationso Fork=1,2,…,T:

▪ Fort=0,k,2k,..,T:● Fori=0,1,..,k-1:

o Applythemodeltost+itoobtainactiona’t+io Usetheactiona’t+itoupdatethenextstatest+i+1(similarto

structureinfigure1withthenewactiona’t+i)● Fori=0,1,…,k-1:

o Usethelossbetweenpredictiona’t+iandgroundtruthactionat+itoupdatetherecurrentnetworkmodelusingstochasticgradientdescent

Phase2:Learnmultiplayermodelsimultaneouslyforallrolejin{1,2,..,10}o Fork=1,2,…,T:

▪ Fort=0,k,2k,…,T:● Fori=0,1,…,k-1:

o Applythepreviouslytrainedmodelsforeachrolejtostatevectors(j)t+itoobtainactiona’(j)t+i

o Usingpredictedactiona’(j)t+itoupdatestatefeaturevectorforthenexttimesteps(j)t+i+1forrolejands(j’)t+i+1forallotherrolej’

● Fori=0,1,…,k-1:o Usethelossbetweenpredictiona’(j)t+iandgroundtruthaction

a(j)t+itoupdatetherecurrentnetworkmodelforrolejusingstochasticgradientdescent

15


Figure10Illustrationofphase2ofalgorithmwith2playersandk=1