Data-Driven Ghosting using Deep Imitation Learning

2016ResearchPapersCompetitionPresentedby:

1

Data-DrivenGhostingusingDeepImitationLearning

PaperTrack:OtherSports(Soccer)Paper1671

1. IntroductionCurrent state-of-the-art sports statistics compare players and teams to league averageperformance. For example, metrics such as “Wins-above-Replacement” (WAR) in baseball [1],“ExpectedPointValue”(EPV) inbasketball [2]and“ExpectedGoalValue”(EGV) insoccer[3]andhockey[4]arenowcommonplaceinperformanceanalysis.Suchmeasuresallowustoanswerthequestion“howdoesthisplayerorteamcomparetotheleagueaverage?”Even“personalizedmetrics”which can answer how a “player’s or team’s current performance compares to its expectedperformance”havebeenusedtobetteranalyzeandimprovepredictionoffutureoutcomes[5].

Thesemeasures have enhanced our ability to analyze, compare and value performance in sport.Buttheyareinherentlylimitedbecausetheyaretiedtoadiscreteoutcomeofaspecificevent.Forexample,EPVforbasketballfocusesonestimatingtheprobabilityofaplayermakingashotbasedonthecurrentsituation,andislearntoffenormousamountsofhistoricaldata.Thegeneralusecaseis thentoaggregatetheseoutcomes,andcompareandrankthemtoseehowvariousplayersandteamscomparetoeachother.Incontrast,whatwe’dreallyliketoknowishowteamscreatetimeandspaceforscoringopportunitiesatthefine-grainlevel.

Withthewidespread(andgrowing)availabilityofplayerandballtrackingdatacomesthepotentialtoquantitativelyanalyzeandcomparefine-grainmovementpatterns.Anexcellentexampleofthiswas the2013GrantlandarticlewrittenbyZachLowe,whichdescribedhowtheTorontoRaptorswere using “ghosting” to analyze player decision-making in STATS SportVU tracking data [6].Specifically, the Raptors created software to predict what a defensive player should have doneinsteadofwhattheyactuallydid. Developingthecomputerprogramrequiredsubstantialmanualannotation, but the insight gained turned heads because it made the effectiveness of defensivepositioningbothameasurableandviewablequantityforthefirsttime.

Motivated by the original “ghosting” work, we showcase an automatic “data-driven ghosting”method using advancedmachine learningmethodologies applied to a season’sworth of trackingdatafromarecentprofessionalleagueinsoccer.AnexampleofourapproachisdepictedinFigure1whichillustratesascoringchancethatFulham(red)createdagainstSwansea(blue). Supposeweare interested in analyzing the defensivemovements of Swansea. Itmight be useful to visualizewhattheteamactuallydidcomparedtowhatatypicalteamintheleaguemighthavedone.Usingourapproach,weareabletogeneratethedefensivemotionpatternofthe“leagueaverage”team,whichinterestinglyresultsinasimilarexpectedgoalvalue(69.1%forSwanseaand71.8%forthe“leagueaverage”ghosts--tofullyappreciatetheinsightsrevealedbydata-drivenghosting,weurgethereaderstoviewthesupplementalvideoathttps://youtu.be/N6x-iRgXLEo.)


2

Figure1Ourdata-drivenghostingmethodcanbeappliedtovariousgamecontextstobetterunderstanddefensivestrategies.Inthedepictedscenario,Fulham(red)scoresagoalonSwansea(blue).Ghosts(white)representwhereSwanseadefendersshouldhavebeenaccordingtoaleagueaveragemodel(LAG)andManchesterCitymodel(MUG).

However in practice, a coach or analyst may not want to compare their defense to the leagueaverage, but to another specific team. By “fine-tuning” our league averagemodel to the trackingdatafromaparticularteam,ourdata-drivenghostingtechniquecanestimatehoweachteammighthaveapproachedthesituation. Forexample,thecoach/analystmaywanttoseehowManchesterCitywoulddefendthesameattackingplay. Usingourapproach,wecannowseehowtheywoulddefenddifferently(Figure1(right)),andhowmuchitchangestheEGV(69.1%to41.7%).Theotherbenefitofusingourghostingapproachisthatissavesthecoach/analystfromsearchingforsimilarplaysinothermatches(whichmaynotevenexist).

To achieve automatic ghosting, we leverage a machine learning method called “deep imitationlearning”.OurmethodologyresemblestechniquesusedtoteachcomputerstoplayAtari[7]andGo[8]. We modify standard recurrent neural network training to consider both instantaneous andfuture losses,whichenablesghostedplayerstoanticipatemovementsoftheirteammatesandtheopposition.More importantly,ourapproachavoidstheneedforman-yearsofmanualannotation.Ourghostingmodelcanbetrainedinseveralhours,afterwhichitcanghosteveryplayinreal-time.

Inthenextsections,wedescribethemethodologybehindourghostingsystem,andshowcasehowautomatedghostingcanprovide insightfulanalysesandcomparisonsof teamdefensivebehavior.Wealsoemphasizethatourapproachisgeneral,andcanbeappliedtoawiderangeofsportssuchasbasketballandfootball.

2.DeepImitationLearningforModelingDefensiveSituationsWhile themathematical background required to implement our deep imitation learningmethodcanseemcomplicated(seeAppendixAforfulldetails),thehigh-level intuitionisquitesimple. Inthissectionweprovideanoverviewofdeepimitationlearning,andhowitcanbeappliedtosoccertrackingdata.


3

For this paper,weused100 games of player tracking and event data froma recent professionalsoccerleague.Asweareinterestedinmodelingdefensivesituations,weonlyfocusedonsequencesof playwhere theoppositionhad control of theball.Adefensive sequence is terminatedwhenagoalisscoredagainstthedefendingteam,theballgetsoutofthepitch,dead-balleventsoccur(e.g.foul, off-side), or the defensive team regains possession of the ball. In total, there wereapproximately17400sequencesofattacking-defendingsituations(~3millionframesat10framespersecond).Theaveragelengthofallsequencesisapproximately170frames,or17seconds.

2.1.Whatis“DeepImitationLearning?”Beforeexplainingwhat“deepimitationlearning”is,wefirstexplainwhat“imitationlearning”is.Inmanycomplexsituations,itcanbeverychallengingforahumanexperttodescribeandcodifythepolicy or strategy due to the granularity or fidelity of the situation. For such tasks, we can usemachinelearningtoautomaticallylearnagoodpolicyfromobservedexpertbehavior,alsoknownas imitation learning or learning from demonstrations, which has proven tremendously useful incontrolandroboticsapplications[9-14].

Duetothedynamic,continuousandhighlystrategicnatureofsports likesoccer,hand-craftingormanuallydescribingstrategyata fine-grain level isequallyproblematic.Forexample inFigure1,gettingahumantodescribethelocation,velocityandaccelerationofeveryplayerattheframe-levelwouldbeprohibitively time-consuminganderror-riddled.Even ifahumanwereable todescribetheplayviarules,itishighlyunlikelythatanotherhumanwouldbeabletolearnfromsuchrulesasitwouldsurelymisssomeimportantcontextorotherinformation.Inpractice,ahumanwouldjustobservemanyexamplesuntil theycouldunderstandwhat todoataconceptual level.Teachingacomputer isnodifferent.Thekey is to firstobtain the right representationwhichcanenable thecomputer to learn from the observations. In recent years, deep learning has proven to be apowerful tool capable of learning a multi-layer representation hidden in the data, enablingautomatic feature discovery that saves tremendous amount of human engineering effort. In thiswork, we bring together elements from automatic formation discovery [16], imitation learningcombined with deep learning methods to learn complex relationships from high-dimensionalspatiotempotalsportdatadomainssuchassoccer.

2.2.FormationDiscoveryandDeepImitationLearningApplicationAsthedatacomesfromdifferentteamsandplayers,one key component of the pre-processing step isrole-alignment (or ordering the players in a formwhere the computer can quickly comparestrategicallysimilarplays).Weextractthedominantrole for each player from both the defending andattacking team based on the centroid positionsthroughout the segment of play, regardless of thenominal position of such player. For example, aplayerwhosenominalroleiscentraldefendermay


4

find himself occupying the dominant role of amidfielder in certain sequence of play. Instead ofenforcingapre-determinedformationontotheteams,thecentroidpositionsforeachsequenceareautomaticallydiscoveredfromdatabyclusteringeachrole,viaalinearassignmentalgorithm,toarolecentroidrepresentedbyamixtureofGaussiandistributions,inawaythatmaximizestheself-consistencywithinrolefromonesegmentofplaytoanother(resemblesthemethodfrom[16]).Theresultisanaverageformationacrosstheseasoncloselyresemblinga4-4-2formation.

As soccer is fundamentally a spatial game, one would expect the geometric relationship amongplayers and the ball to contain important semantic and strategic values.We form the full inputvectortoourmodelbyincludingnotonlytheabsolutecoordinatesofplayersandball,butalsotherelativepolarcoordinates (distanceandangle)ofeachplayer towards theball,goal,and therolethatwe try tomodel.The full featurevectorateach timestepcontainsgeometric features forallroles in the formation in the form of one mini-block for each role. These mini-blocks are thenstackedinafixedorderconsistentwithrolealignment.Inorderforthemodeltoidentifyone-on-oneversus zone coverage,wealso indicatewhichmini-blockof input features correspond to theclosest three positions to the role being modeled at each time step. In addition, an extra inputvectorindicatingtheteamidentityisaddedtotheinput.Thisidentityvectorisusefulforlearningparticular structuralandstylisticelementsassociatedwithdifferent teams,allowingus thestudytheimpactoftuningourghostingmodeltodifferentteamstyles.

Weuserecurrentneuralnetworks,apopulardeep-learningtool,tolearnthefine-grainedbehaviormodel for each role in the formation in each time step. A particular type of recurrent neuralnetworkscalledLongShort-TermMemory(LSTM)wasusedduetoitspowerfulabilitytocapturelong-rangedependenciesinsequentialdata.Themodeltakesinasequenceofinputfeaturevectorsas described above and the corresponding sequence of each player’s positions as output labels.Each player ismodeled by an LSTM,which consists of two hidden layers of networkswith 512hiddenunits in each layer.The role of thesehiddenunits is to capture the information from therecenthistoryofactionsfromallplayersandmaptheinformationtothepositionofthenexttimestep,inamanneranalogoustohowanAIprogramwastrainedfromdatatomapthehistoryofthegametothenextframeofactioninAtariandGo.

Usingthestandarddeeplearningapproach,however,provesinsufficienttolearnarobustbehaviormodel,duetothetypicallylongtemporalspanofsequencesofplayandsheerhigh-dimensionalityof the learning problem. Intuitively, aswith any regressionmethod, themodel’s predictions candeviatefromthegroundtruthlabels.Insequentialmodelingsettings,thisdeviationcancompoundover timeandcan lead toseriousmodelingerrors.Toaddress this issue,we leverage techniquesfromimitation learning.Themain idea is thatwewant to trainamodel thatcan learntorecoverfromitsownpredictionmistakessothatthemodelcanberobustoverlongsequencesofdecisions.Weuseanimitationlearningalgorithmthatlearnstocapturenotonlythebehaviorofeachroleintheteam,butalsohowmultipleplayersineachteamjointlybehavefromoneframetothenext.Afulldescriptionofourmachinelearningapproachisgivenintheappendix.


5

3.CharacterizationofTheAverageGhostingTeam

Asafirstchecktoseeifourleague-averagemodelpassestheeye-test,wedeploythetrainedmodelon held out sequences to inspect whether the trainedmodel behaves in a sensiblemanner.Weobserve that our model learned to maintain solid defensive formation and structure, with themodeledplayersmovinginamannerthatexhibitsspatialandformationalawareness.Toillustratethis,weshowthreeexamplesofplayinFigure3.

Inthefirstexample,theghostingplayers(white)fromLiverpoolmovetogetherwiththerestoftheteammates (blue) in pressing higher up the pitch in a situationwhenManchester City (red) justregained possession in the middle of the field. To avoid clutter, we display only the ghostingtrajectories(inwhite)ofthefiveleftpositionsoftheteam.

Figure3Examplesofghostingbehaviorforour“leagueaverage”model.

Because our algorithmmodels the interactions between teammates, our ghosted players exhibithigh-level coherent team behaviors. In the middle panel of Figure 3, ghost player number 5 ofSunderlandbrokefromformationtochallengetheballcarriernumber7ofAstonVilla(red).Ghostplayer number 9 of Sunderland swaps roles with ghost number 5 and drops back to mark theattackingnumber6.

Inmanysituations,thebehaviorofthe“leagueaverage”teammaybesubstantiallydifferentfromtheactualplaysuchthattheoutcomecouldbedifferent.IntherightpaneofFigure3,theghostingplayernumber8moreproactivelyclosed thepassing lane thatcouldhaveprevented theshotongoalbyattackingnumber6(red),whichhappenedinrealityduetothefactthatbothnumber5and8fromQPR(blue)dropmoredeeplyandyieldedopenspace.

3.1VariancefromTheAverageGhostingbyPositionsandTeam

We canquantifyhow theplayers and teamsdiffer from the “average” team.Theaverage level ofdeviationacrossallplayersandteamsisabout~4meters.Toputthisnumberincontext,notethat


6

this is a highly accurate average level of precision, as the model has to take into account thearbitrarilylongsequenceofplay.Incontrast,morenaivemachinelearningapproaches(thatdonotaccountforerrorpropagationthroughtime)wouldsufferfromveryhighlevelsofpredictionerror(frequently in the20-30meters range).We can furtherbreakdown thedeviation into groupsofposition. For the defender positions, themajority of the deviations are less than 3meters apartfromtheactualplayerpositions.Thedeviationincreasesasthedominantpositionsmovefurtherupthe field. This reflects increasing level of variation in how attack-oriented playerswould defend.Thedefensivebehaviorofastriker,forexample,couldchangesignificantlyfromteamtoteamandamongspecificplayers,leadingtoahigherlevelofvariancebythe“average”ghostingpositions.WeshowthisdeviationaccordingtopositioninFigure4.

Figure 4: Deviation from the average ghosting model bypositions (Central Defenders (green), CentralMidfielders (blue)andForwards(red)

Forourteamanalysis,weusedall20teamsinourdataset. As we can not disclose the specificperformance of these teams, we denote them asteams A-T. At the team level, this variance canindicatewhich teamdiffers themost fromaveragebehavior in terms of defensive positioning. Wequantifythisoutofpositionratiobyusingan80-20rule: a player at any given moment is consideredoutofposition ifhisdeviation fromhisghost is inexcess of the 80th percentile of the entire leagueaverage. A breakdown of this tendency gives us a

sense ofwhich teams use non-conventional formations. In Figure 5, the teams are sorted by thetotalnumberofgoals concededover theseason inan increasingorder.Weanalyzedeach team’sbehaviorviatwogroups:i)“backline”positions(backs),andii)“frontline”positions(midfieldersandforwards).Whilethispositioningvarianceisonlyonepartofthewholepicture,notethatboththetopandbottomteamsintheleagueintermsofthegoalconcededwerethetwooutliers.Forthe

topteam,thedeviationmaycomeinpartfromtheattack-oriented positions (wingers and forwards)exhibiting awider range ofmovement relative toother teams. In the caseof thebottom team,whoended the season with a relegation, bothdominantly defensive and offensive players tendtodriftfarfromtheleagueaverage.

Figure 5: The out of position frequency (large deviations) byteam for both the back and forward positions. Teams areordered fromleft torightsortedbytheoverallgoalsconcededintheentireseason


7

3.2ExpectedGoalValueofGhostingTeam

A natural use case for ghosting is to study how hypothetical scenarios may unfold, i.e.,counterfactual reasoning. In addition to qualitative assessments of ghosting, we also wish toquantitatively assess theeffectof alternativedefensive reactions to the same situation.As a casestudy, weanalyzedsafety-criticalplaysequences, suchasgoal scoringopportunities, to seehowtheplaymaybealteredviaghostingtoimprovethedefensivepositioning.To analyze the goal scoring opportunities, we extracted shot events from the entire season.Concretely,wefocussedonthe10-secondsegments leadinguptoanopen-playshotevent,eitheronorofftarget(wedidnotusepenalties,free-kicks,setpieces,andgamesequenceswithlessthan11playersfromeitherside).WequantifiedtheperformanceofeachmodelusingtheExpectedGoalValue(EGV)metric[3],whichcanestimatethegoalscoringprobabilityofeachshotbasedontherecentplayerandballpositionsandevents.SimilartothesetupinFigure1,foreachshotsequence,theaverageghost is initializedbythecurrentpositionsof thedefensiveplayersonly for theveryfirst frame, and the sequence is unfolded thereafter by running the ghosting model across theremaining time steps. Note that all other factors of the given sequence remained unchanged(attacking player positions and ball movement). At the end of the sequence, the expected goalprobability iscalculatedbasedonthecollectivebehavioroftheghostingteam.TheoverallEGVis

thesumofexpectedgoalvaluesoverallsequences.The expected performance of the league average team versustheactualoutcomesareshowninTable1.TheEGVfromopen-plays shot events improved compared to the total numbers ofgoals conceded from all 20 teams (EGV of 474 on actual goalcount of 494 from open-plays), primarily due to the “leagueaverage” teamable to lower thescoringchanceagainstseveraloftheweakestteamsintheleague(teamI,J,K,LandT).4.QuantifyingtheEffectofTeamStyleAnalystsandcoachesoftenwanttocompareteamperformancewith not just the league average, but also with specific teamsand certain characteristics (attack-oriented, possession-basedetc.). Doing so at a fine-grained level, however, is difficult andnearly impossiblewith discrete statistics. Our ghosting systemprovidesawaytonotonlymodeltheexpectedtrajectoryofeachplayer, but also incorporate the ability to impose stylisticelementsofspecificteamsinordertoanswerthequestion:howwoulddifferentteamsreacttothesamesituation?To address this question, we employed techniques frommachine learning for domain adaptation. Intuitively, by takinganextravectorindicatingtheidentityoftheteamasinputtothemodel,ourdeep-learningbasedmodelcanextractelements


8

relevant to each team’s playing style (such as spatial arrangement, aggressiveness etc.). In anygivenghostingscenario,the“average”teammodelcanthenbeadaptedtotheplayingstyleofanyteamintheleaguebychangingtheteamidentityvector,thusallowssimulatinghowtheghostteamplayingwithadifferentteamstylewouldfareunderthesamescenarios.Thisisanalogoustorecentdeeplearningadvances instyletransfer,wherethestylisticelementsfrompaintingsandpicturescanbeextractedwithadatasetconsistingof, forexample,VanGogh’sworks, so thatVanGogh’spaintingstylecanbetransferredtootherimages[15].We study how different styles can impact the defensive performance, compared to the averagemodel.Withdomainadaptation,theaveragemodelcantakeontheidentitiesofeachof20teamsintheleague,onall6020open-playshoteventsacrosstheentireseason.Wethencomparehowtheaverage ghosting team and team-specific ghosts perform relative to the actual outcomes, and toeachother.

Figure6Theeffectofassigningdifferentteamstylestoaverageghostingmodelintermsoftotalexpectedgoalvaluesoverseason’sopen-playshots(green)vs.totalactualnumberofgoalsconcededbyeachteam(blue)

Our results are summarized in Figure 6. Interestingly, notice the difference in defensiveperformance of each team style. Since all other factors are controlled, the difference in overallexpected goals conceded can be attributed to different defending styles. We observe a 61.8%correlationbetweenthetotalEGVcomingfromdifferentghostingstyleswiththeoverallnumberofgoalsconcededinreality.While luckandindividualskillsmatterforanyteam,teamF’sdefensiveperformanceseemsattributabletoanefficientdefensiveformation(4-5-1pressingsystemunderanew coach. The formationdiscoverymethod of Section2.2 applied to teamF also indicates theyfrequently playwith 5midfielders,with the variance ofmidfielders’movements higher than therest of the league). During the actual season, team F also possessed the second-lowest goalconcededperopen-playopportunity inthe league(calculatedfromTable1),astatisticconsistentwith the efficiency of team F’s ghost. Of course, this is only one piece of the defensive equation,sinceshoteventsinisolationdonotreflecthowteamsorganizetheirdefenselongpriortotheshotopportunity.Inaddition,EGVforthistypeofsituationisnotperfect,sinceitdoesnotcapturethepossibilityofdefensiveghosts terminating theshotopportunityaltogether.Nonetheless,ghostingto different teams in this manner is a still an interesting and novel way to compare defensivebehaviors across teams on an equal footing,without relying on disparate statistics coming fromdifferentgamesandscenarios.


9

The approach we described thus far is a data-efficient way to extract “personalized” strategicelements fromdifferent teams to facilitatebothmodelingandevaluation.Weemphasize thatourapproachisgeneralandcanbeappliedtomodelingnotjusttheaverageteamintheleague,butalsoteam-specificmodels(assumingsufficienttrackingdata).5.ExampleofTheDynamicsofGhostingNowthatwehaveabetterideaofhowourghostingmethodworks,andhowthe“leagueaverage”modelvariestoaspecificteam’smodel,wecanrevisittheexampleshowninFigure1andbreak-downthisplayingreaterdetail(aseparatevideohighlightingmanydifferentscenariosisavailableathttps://youtu.be/N6x-iRgXLEo).As before, we consider the sequence of play between Fulham (Red, attacking from left) andSwansea(Blue,defendingonright).WecomparewhatSwanseaactuallydidwiththatoftheleagueaverageghosts(LAG-white,top)andtheManchesterCityghosts(MCG-white,bottom).Theballtrajectoriesareinyellow.Notethatwedonotmodelgoal-keeperpositions,whicharehighlighted(number 1). To avoid clutter, we suppress irrelevant trajectories to highlight key players anddynamics.


10

TheactualscenarioresultedinagoalscoredagainstSwanseaafterareboundfromacounter-attackbyFulham.Asmentionedintheprevioussection,bothghostingmodelswereinitializedwiththeactualpositionsofSwanseaplayersinthefirstframe.Theghostingsystemtakesoverandmakesdecisioninreal-timeabouthoweachplayershouldpositionhimselfforeveryframethereafter. Red#6 and Red#7 open the counter-attack with a “give-and-go pass” to get behind Blue#4 andBlue#7.Instage1(leftcolumn),bothLAGandMCGbehavesimilarlyastheplayunfolds,andcloselyresembletheactualSwanseadefense.TheonlyexceptionisLAG#8.ComparedtoBlue#8andMCG#8, LAG#8 proactively challenges the dribble by Red#6,who in reality drew in Blue#5 andcreatedawide-openpasstoRed#10,leadingintostage2(middlecolumn).Thisisthekeymomentin the play. In the actual sequence, Red#10 is left unmarked and gets an uncontested shot ongoalkeeper Blue#1. Here bothMCG#5 and LAG#5 positioned themselves further back andwereabletogettoRed#10intimetocontesttheshot.Suchattempthypotheticallycouldhavepreventedthe rebound that led to thegoal.As the sequenceof eventsunfold to stage3, the rebound foundRed#9,who is left uncovered by actual Blue#9, leading to the goal (right column). The nuanceddifferenceinpositioningbyMCG#2andLAG#2endedupmakingamajordifferenceinthescoringchance.SimilartoBlue#2,LAG#2failedtocoverRed#9.MCG#2,however,rushedbackdeeperintothebackline fromearliermomentsof theattackandwas inaposition to contest the shoton theopennet,andcouldhavecoveredforanattemptbybothRed#7andRed#9.Theendresultwasareduction in goal probability from~70% (both actual sequence and LAG sequence) to~40%byMCG.Thefine-grainedsimulationandevaluationofdefensivescenariospresentedherewouldnothavebeenpossibleusingonlydiscretestatisticsaspopularlyusedinsportanalytics.Ourmethodisalsogeneral and has a rich potential to enable team-specific modeling for both coaching and mediaanalysispurposes.6.SummaryThe ongoing explosion of tracking data has now made it possible to apply powerful modernmachinelearningtechniquestobuildincreasinglyfine-grainedmodelsofplayerandteambehavior.Withdata-drivenghosting,wecannow, for the first time, scalablyquantify,analyzeandcomparefine-graineddefensivebehavior.Inthispaper,wehavedemonstratedthevalueofourapproachinarangeofcasestudies.Weemphasizethatourapproachisalsoapplicabletosportsbeyondsoccer,suchasbasketballandfootball.


11

References[1]http://www.fangraphs.com/library/misc/war/[2]D.Cervone,A.D’Amour,L.BornnandK.Goldsberry,“POINTWISE:PredictingPointsandValuingDecisionsinRealTimewithNBAOpticalTrackingData”atMITSSAC,2014.[3]P.Lucey,A.Bialkowski,M.Monfort,P.CarrandI.Matthews,“QualityvsQuantity:ImprovedShotPredictioninSoccerusingStrategicFeaturesfromSpatiotemporalData”,inMITSSAC,2015.[4]B.Macdonald,“AnExpectedGoalsModelforEvaluatingNHLTeamsandPlayers”,inMITSSAC,2012.[5]X.Wei,P.Lucey,S.Morgan,M.ReidandS.Sridharan,““TheThinEdgeoftheWedge”:AccuratelyPredictingShotOutcomesinTennisusingStyleandContextPriors”,inMITSSAC,2016.[6]Z.Lowe.“Lights,Cameras,Revolution”.Grantland.19Mar2013.[7] R.McMillan. “Google’s AI is Now Smart Enough to Play Atari Like the Pros”.WIRED. 25 Feb2015.[8]C.Metz.“WhattheAIBehindAlphaGoCanTeachUsAboutBeingHuman”.WIRED.19May2016.[9]P.Abbeel,andA.Ng.“Apprenticeshiplearningviainversereinforcementlearning.”InInternationalConferenceonMachineLearning(ICML),2004.[10]H.Le,A.Kang,Y.YueandP.Carr,“SmoothImitationLearningforOnlineSequencePrediction”.InInternationalConferenceonMachineLearning(ICML),2016.[11]B.Argall,S.Chernova,M.Veloso,andB.Browning.“Asurveyofrobotlearningfromdemonstration”.Roboticsandautonomoussystems,57(5):469–483,2009.[12]S.RossandD.Bagnell.“Efficientreductionsforimitationlearning”.InConferenceonArtificialIntelligenceandStatistics(AISTATS),2010.[13]S.Ross,G.GordonandA.Bagnell.“Areductionofimitationlearningandstructuredpredictiontono-regretonlinelearning”.InConferenceonArtificialIntelligenceandStatistics(AISTATS),2011.[14]A.Jain,B.WojcikandT.Joachims,andA.Saxena.“Learningtrajectorypreferencesformanipulatorsviaiterativeimprovement”.InNeuralInformationProcessingSystems(NIPS),2013.[15]L.Gatys,A.EckerandM.Bethge.“Aneuralalgorithmofartisticstyle”.InNeuralInformationProcessingSystems(NIPS),2015[16]A.Bialjowski,P.Lucey,P.Carr,Y.Yue,S.SridharanandI.Matthews.“Large-scaleanalysisofsoccermatchesusingspatiotemporaltrackingdata”.In2014IEEEInternationalConferenceonDataMining(ICDM),2014


12

AppendixDataPreparationDuetothelackofhighqualityannotateddata,wefocusedonathirdoftheseasonworthofmatches(~100games).Forthepurposeofmodelingsoccerdefense,wefurthersegmentthegameeventsintopossessionsequences.Ateamisdefinedtobeinadefensivesituationwhenitisnotincontroloftheball.Adefensivesequenceisterminatedwhenagoalisscoredagainstthedefendingteam,theballgetsoutofthepitch,dead-balleventsoccur(e.g.foul,off-side),orthedefensiveteamregainspossessionoftheballafterhaving2consecutivetouches.Thispre-processingstepresultsinapproximately17400sequencesofattacking-defendingsituations(~3millionframesat10framespersecond).

Onekeycomponentofthepre-processingofthedataisrole-alignment.Role-alignmentisnecessaryforreducingthedimensionalityofthelearningproblem,essentiallyprovidingadditionalcontexttoimposeorderingonthetraininginput.Atthesimplestlevel,onecanviewthelearningproblemasamappingfromthepreviouspositionsof22playersandtheball,tothepositionofaparticularplayeratthecurrenttimestep.Onekeyissuewithlearningthiskindofmappingwithoutimposinganorderingisthatthedatarequirementtolearna“permutation-invariant”behaviorwillneedtoincreasebyafactorofbillions((10!)2specifically).Toreducethisdataburden,wefirstextractthedominantroleforeachplayerfromboththedefendingandattackingteambasedonthecentroidpositionthroughoutthesegmentofplay,regardlessofthenominalpositionofsuchplayer.Assuch,aplayerwhosenominalroleiscentraldefendermayfindhimselfoccupythedominantroleofamidfielderincertainsequenceofplay.Theleagueaverageapproximatesa4-4-2formation.Ineachsegmentofpossession,weorderthetrainingdatabasedonthedominantroles,wheretheassignmentofroleattemptstomatchthisaverage4-4-2formation,usingtheHungarianalgorithm(forminimumcostassignment,wherecostmeasurethedistanceofeachroletoeachofthe10possibleGaussiandistributionsofspatialcoveragelearnedfromdata).

Figure8Statefeaturevectorextractedfromanorderedlistofplayerpositionateachtimestep

Assoccerisaspatialgame,onewouldexpectthegeometricrelationshipamongplayersandtheballto


13

containimportantsemanticandstrategicvalues.Weformthefullinputvectorforthepurposeoftrainingbyincludingnotonlytheabsolutecoordinatesofplayersandball,butalsotherelativepolarcoordinatesofeachplayertowardstheball,goal,andtherolethatwetrytomodel.Figure1describesthestructureofafullinputvectorateachtimestep.Heretherolebeingmodelistheleftbackposition.Thefullinputfeaturevectorrepresentingtheleftbackpositionconsistofmini-blocksoffeaturesforeachrolefromthedefendingandattackingteam,sortedbyafixedorder.Themini-blockforthegoal-keeper,forexample,containsabsolutepositionandvelocity,thedistanceandanglerelativetotheleft-backposition,thedistanceandangleofthegoal-keepertothedefendinggoal,andthedistanceandangletotheballatthecurrenttimestep.Inadditiontostackingthemini-blocksoffeaturestogether,wealsoduplicatethefeaturesfromthemini-blockscorrespondingtotheclosest3positionstotheleft-backateachtimesteps.Thisresultsinafullinputfeatureofdimension399foreachroleateachframe.Wecallthisinputvectorthestatefeaturevectorforeachrole.

DeepMulti-agentImitationLearningTheimitationlearningtaskistomapthestatefeaturevectorateachtimesteptothecorrespondingactionoftheplayerbeingmodel,whereactionisdefinedastheplayerpositionatthefollowingtimesteps.Inprinciple,thisisanonlinesequencepredictionproblem,wherethemodeloutputsactionofaplayerconditionedonthestateofsuchplayer,asrepresentedbytherecenthistoryofactionsofthemodeledplayer,aswellasotherplayers.Anaturalcandidatetoaddresssuchonlinesequencepredictionproblemisrecurrentneuralnetworks(RNN).AparticularlypopularclassofRNNs,LongShort-TermMemory(LSTM)hasbeensuccessfullyusedinrecentdeeplearningapplicationssuchasmachinetranslation,speechrecognition,handwritingsynthesis,etc.TwomajordifferencesbetweenoursettingandpreviousapplicationsofLSTMare(i)playersneedtomakedecisioninreal-time,renderingthepopularencoder-decoderapproachtosequence-to-sequencemodelingimpractical(ii)ourlearningset-upbelongstotheclassofdynamicalsystemlearningwhereactioncouldaltersubsequentstatedistribution,potentiallycausingamismatchbetweentrainingandinferencethatcouldseverelylimittheperformanceoftraditionalLSTMmodels.

Figure9Exampleofphase1ofalgorithmwithk=2

Ourproposedmethodjointlycombinestrainingand


14

inferencetoaddressbothoftheseissues.Thismethodsimulatesonlinepredictioninanofflinefashion,whileallowingthemodeltograduallylearnlonger-rangeprediction.Phase1ofthealgorithmlearnsamodelforeachofthe10rolesofthe“average”defendingteam(seefigure9foranillustration).Inphase2,weusedthesepre-trainedmodelslearnedfromphase1toscaleupthetrainingofsingleplayerintojointtrainingofmultipleplayerstomodelcollaborativemulti-agentlearning(figure10showcasesthejointtrainingof2players).

DeepImitationLearningAlgorithm:Todescribeourlearningalgorithminmoredetails,wegenericallydenoteasequenceofstate-actionpairsas{(s0,a0),(s1,a1),…,(sT,aT)}.WeuseT=50inforourghostingapplicationtosoccerdefense.

Phase1:Learnsingleplayermodelforeachrolejin{1,2,..,10}o Initializearecurrentnetworkmodelgiventheground-truthdataset{(s0,a0),

(s1,a1),…,(sT,aT)}forafewiterationso Fork=1,2,…,T:

▪ Fort=0,k,2k,..,T:● Fori=0,1,..,k-1:

o Applythemodeltost+itoobtainactiona’t+io Usetheactiona’t+itoupdatethenextstatest+i+1(similarto

structureinfigure1withthenewactiona’t+i)● Fori=0,1,…,k-1:

o Usethelossbetweenpredictiona’t+iandgroundtruthactionat+itoupdatetherecurrentnetworkmodelusingstochasticgradientdescent

Phase2:Learnmultiplayermodelsimultaneouslyforallrolejin{1,2,..,10}o Fork=1,2,…,T:

▪ Fort=0,k,2k,…,T:● Fori=0,1,…,k-1:

o Applythepreviouslytrainedmodelsforeachrolejtostatevectors(j)t+itoobtainactiona’(j)t+i

o Usingpredictedactiona’(j)t+itoupdatestatefeaturevectorforthenexttimesteps(j)t+i+1forrolejands(j’)t+i+1forallotherrolej’

● Fori=0,1,…,k-1:o Usethelossbetweenpredictiona’(j)t+iandgroundtruthaction

a(j)t+itoupdatetherecurrentnetworkmodelforrolejusingstochasticgradientdescent


15

Figure10Illustrationofphase2ofalgorithmwith2playersandk=1

Data-Driven Ghosting using Deep Imitation Learning

Documents

Transcript of Data-Driven Ghosting using Deep Imitation Learning