RoboCup Standard Platform League: Strategies and Challenges
Aris Valtazanos
School of Informatics
Structure and Synthesis of Robot Motion
March 3, 2011
slide 1 of 33 www.inf.ed.ac.uk
Talk overview
• The RoboCup Standard Platform League
• Team EdInferno
  • Overall framework
  • Novel techniques and algorithms
• Future endeavours
Humanoid league
• Custom-made robots, focus on hardware and control
Middle Size league
• More standardised design
• Fully autonomous
• On-board sensing and omnidirectional vision
• Only ball is colour-coded - not even the goals!
Small Size league
• Very fast wheeled robots
• Can probably already beat humans!
• But not fully autonomous - off-board, overhead vision system
Simulation league
• Two categories: 2-D and 3-D league
• 2-D: focus on multi-agent coordination, team strategies, etc.
• 3-D: simulated matches between teams of NAO robots, (basic) modelling of dynamics
So why a Standard Platform League?
• A league that provides a testbed with realistic constraints . . .
• . . . without the need to invest too much effort in hardware and dynamic primitives
• Moreover, all teams use the same robot platform!
• So, success depends solely on algorithmic merit
• Various domains of interest:
  • Physical actions (locomotion, kicking)
  • Decision making algorithms
  • Multi-robot communication and cooperation
  • Vision-based localisation
  • Belief estimation
  • . . . and several more
Platform
• 2000(?)-2007: SONY Aibo (4-legged league)
• 2008-present: Aldebaran NAO (SPL replaces the 4-legged league)
History
• 2008: Total disaster (according to eyewitnesses)
• 2009-2010: Improvement and expansion
• Winners: B-Human (x2)
• But state-of-the-art still consists of:
  • Fast, robust locomotion
  • Good, strong kicks
  • Good enough vision-based localisation
• Very little in the way of:
  • Team cooperation (e.g. passing)
  • Team coordination (e.g. role assignment)
  • . . . and anything else you would label as “artificial intelligence”
Some technicalities
• Pitch size: 6x4m
• Team size: 4 robots (as of 2011 - previously 3)
• Visual cues:
  • Goalmouths: one blue, one yellow (localisation)
  • Localisation beacons
  • Ball: orange
  • Waistbands: pink for one team, light blue for the other (swap at half time)
  • Lines, boxes, penalty spots, etc.
Some more technicalities - NAO robot
• Height: ∼ 60cm
• Built-in closed loop walking engine - max speed: 9.5cm/s (some teams have their own, faster engines)
• Two cameras - top & bottom (normally only use the latter)
  • Field of view: 58° (diagonal)
  • To change their FoV, robots can either move or turn their head
• Two sonar sensors across chestboard - range up to 2m
• Various other sensors: touch, force-sensitive etc.
Team EdInferno
• Sep. 2009: team established (i.e. first robots arrived)
• Sep. 2009 - Mar. 2010: familiarisation with the platform - walking engine did not exist at that time, lots of frustrating problems
• Apr. 2010 - Aug. 2010: first serious attempt to create a complete framework, 2 more robots acquired
• Sep. 2010 - Dec. 2010: main development period, qualification for RoboCup
• Jan. 2011 - present: 5 more robots acquired, work towards full integration of all modules
Behaviour module
• Basically everything that doesn’t involve vision, localisation, or inter-robot communication
• Main functionalities:
  • Belief estimation and sensor fusion
  • Role assignment and decision making
  • Path planning and action execution
• Required inputs: locations of salient objects in field of view, communicated info from teammates, other sensor readings (sonar)
(Main) Behaviour module components
• Vision “helper”
• Belief estimator
• Decision maker
• Path target selector
• Path planner
• Action executor
Vision “helper”
• Basic trigonometric functions
  • E.g. convert image coordinates to real world distances through robot’s kinematics
• Ball tracker
• Field of view bounds calculation
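The image-to-world conversion mentioned above can be sketched as follows. This is only an illustration under stated assumptions: a pinhole camera, a flat pitch, and a camera height and downward pitch angle that would in practice come from the robot's kinematic chain. The function and parameter names are hypothetical, and the vertical field of view is an input here (the 58° figure quoted for the NAO is diagonal, so it would have to be converted first).

```python
import math

def pixel_to_ground_distance(pixel_row, image_height, vertical_fov_deg,
                             camera_height_m, camera_pitch_deg):
    """Ground distance from the camera to the point seen at pixel_row.

    Assumes a pinhole camera, a flat pitch, and a pitch angle measured
    downwards from the horizontal. All names here are hypothetical.
    """
    # Focal length in pixels, derived from the vertical field of view.
    focal_px = (image_height / 2) / math.tan(math.radians(vertical_fov_deg) / 2)
    # Angle of the pixel ray below the optical axis (rows grow downwards).
    ray_offset = math.atan((pixel_row - image_height / 2) / focal_px)
    angle_below_horizon = math.radians(camera_pitch_deg) + ray_offset
    if angle_below_horizon <= 0:
        return math.inf  # the ray points at or above the horizon
    return camera_height_m / math.tan(angle_below_horizon)
```

For a pixel on the optical axis the result reduces to camera height divided by the tangent of the camera pitch, which is a quick sanity check for any real implementation.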
Belief estimator
• Preliminaries:
  • Observation: Anything the robot sees or senses, e.g. “a robot at location (0.5, 0.5)”
  • Belief: A confidence-based deduction based on a history of observations, e.g. “I am 80% confident that the robot at (0.5, 0.5) is teammate #1”.
• Sensor fusion: combine vision and sonar readings into a single set of observations
• Information sharing: update these observations from corresponding teammate observations
• Observation assignment: for each current observation, find best matching past belief
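One simple way to realise the observation assignment step is greedy nearest-neighbour matching with a distance gate. The sketch below assumes that observations and beliefs are plain (x, y) positions in metres and that `max_distance` is a hand-tuned gate; both the data layout and the names are hypothetical, and the talk does not say which matching rule the team uses.

```python
import math

def assign_observations(observations, beliefs, max_distance=0.5):
    """Greedy nearest-neighbour matching of (x, y) observations to beliefs.

    beliefs maps a label (e.g. "teammate_1") to its last estimated (x, y).
    Returns a dict from observation index to label, or None when unmatched.
    """
    assignment = {}
    unused = dict(beliefs)  # each belief may explain at most one observation
    for index, obs in enumerate(observations):
        best_label, best_dist = None, max_distance
        for label, position in unused.items():
            d = math.dist(obs, position)
            if d <= best_dist:
                best_label, best_dist = label, d
        assignment[index] = best_label
        if best_label is not None:
            del unused[best_label]
    return assignment
```

An unmatched observation (value `None`) would typically start a new belief rather than be discarded.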
Belief estimator (cont.)
• Particle filtering: for each teammate/adversary, maintain a set of hypotheses (particles) over their possible states
• Two main steps:
  • Predict: Given a (probabilistic) motion model, estimate how each particle might next move
  • Update: Compute the likelihood of each particle based on the current sensor readings
• Subject to consistent observations, particles may converge
• Role assignment: egocentrically determine each team member’s role (e.g. who should go kick the ball)
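Egocentric role assignment can be illustrated with a closest-to-ball rule: each robot runs the same function on its own beliefs and reads off its own role. The rule, the role names and the tie-break on robot id are assumptions made for this sketch, not a description of the team's actual policy.

```python
import math

def assign_role(my_id, believed_positions, ball):
    """Return this robot's role, given its own beliefs about team and ball.

    believed_positions maps robot id to a believed (x, y) position.
    Ties are broken by the lower robot id so that robots with identical
    beliefs reach the same answer.
    """
    kicker = min(believed_positions,
                 key=lambda rid: (math.dist(believed_positions[rid], ball), rid))
    return "kicker" if kicker == my_id else "supporter"
```

Because every robot decides from its own beliefs, two robots can disagree when their beliefs differ, which is one reason the information sharing step matters.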
Particle filter toy example
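A toy version of the predict/update cycle can be written in a few lines. The sketch below tracks a single stationary robot on a 6x4m pitch; the constant-velocity motion model, the Gaussian observation likelihood and all numeric values are assumptions chosen for illustration only.

```python
import math
import random

def predict(particles, velocity, dt, noise_std, rng):
    """Move each (x, y) particle by a noisy constant-velocity motion model."""
    vx, vy = velocity
    return [(x + vx * dt + rng.gauss(0, noise_std),
             y + vy * dt + rng.gauss(0, noise_std)) for x, y in particles]

def update(particles, observation, sensor_std, rng):
    """Weight particles by a Gaussian observation likelihood, then resample."""
    weights = [math.exp(-math.dist(p, observation) ** 2 / (2 * sensor_std ** 2))
               for p in particles]
    if sum(weights) == 0:
        return particles  # observation inconsistent with every particle
    return rng.choices(particles, weights=weights, k=len(particles))

def mean(particles):
    n = len(particles)
    return (sum(x for x, _ in particles) / n, sum(y for _, y in particles) / n)

# Start with no idea where the robot is, then observe it repeatedly.
rng = random.Random(0)
particles = [(rng.uniform(-3, 3), rng.uniform(-2, 2)) for _ in range(500)]
for _ in range(10):
    particles = predict(particles, velocity=(0.0, 0.0), dt=0.1,
                        noise_std=0.05, rng=rng)
    particles = update(particles, observation=(1.0, 0.5),
                       sensor_std=0.3, rng=rng)
estimate = mean(particles)
```

With consistent observations the particle cloud contracts around the observed position, which is the convergence behaviour referred to on the previous slide.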
Decision maker
• Based on own inferred role and current beliefs, determine the appropriate action
• Possible actions: move(dx, dy, dθ), kick(type, speed), scan(dyaw, dpitch), getup(front/back)
• Choice of action should depend on belief confidence (e.g. if we’re not sure where the ball is, scanning should be the highest priority)
• Also requires fine-tuned thresholds, e.g. for kicking
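The priority ordering described above might look like the following rule-based sketch. The action names follow the slide, but the thresholds, the role label and the argument values are placeholders invented for the example; in practice they would be the fine-tuned values the last bullet refers to.

```python
def choose_action(role, ball_confidence, ball_distance_m, fallen=None,
                  confidence_threshold=0.6, kick_distance_m=0.2):
    """Return an (action, arguments) pair; thresholds are placeholder values."""
    if fallen is not None:
        # Getting up takes priority over everything else.
        return ("getup", {"side": fallen})
    if ball_confidence < confidence_threshold:
        # Not sure where the ball is: scanning is the highest priority.
        return ("scan", {"dyaw": 0.5, "dpitch": 0.0})
    if role == "kicker" and ball_distance_m <= kick_distance_m:
        return ("kick", {"type": "forward", "speed": 1.0})
    # The move target is chosen afterwards by the path target selector.
    return ("move", {})
```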
Path target selector
• Invoked if selected action == move
• Chooses an appropriate target for path planning
• More challenging than it sounds! E.g., for kickers:
  • Determine where we would like the ball to eventually be, from a list of candidate affordable locations, and subject to a set of constraints
  • Compute best kicking position and posture that will allow us to kick ball to this desired location
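The two kicker steps can be sketched geometrically. The example assumes a single constraint (keep the ball's destination clear of opponents), a straight kick, and a fixed stand-off distance behind the ball; these choices and all names are hypothetical simplifications of whatever constraint set the real selector uses.

```python
import math

def choose_ball_target(candidates, opponents, goal, clearance_m=0.5):
    """Pick the candidate nearest the opponent goal that no opponent is close to."""
    safe = [c for c in candidates
            if all(math.dist(c, o) >= clearance_m for o in opponents)]
    return min(safe, key=lambda c: math.dist(c, goal)) if safe else None

def kicking_pose(ball, desired_ball_location, standoff_m=0.15):
    """Pose (x, y, theta) from which a straight kick sends the ball to the target."""
    dx = desired_ball_location[0] - ball[0]
    dy = desired_ball_location[1] - ball[1]
    theta = math.atan2(dy, dx)                   # kick direction in radians
    x = ball[0] - standoff_m * math.cos(theta)   # stand behind the ball,
    y = ball[1] - standoff_m * math.sin(theta)   # on the ball-target line
    return (x, y, theta)
```

The pose returned by `kicking_pose` is what would be handed to the path planner as its target.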
Path planner
• As with path target selector, invoked only if selected action == move
• As name suggests, plans a path that will lead robot to desired location
• Two cases:
  • If no objects (e.g. other robots, goal posts) in view, simply plan a straight path
  • Else, plan a path, every point of which is at least some safety distance from each obstacle
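The two cases can be illustrated with a deliberately naive planner: try the straight line, and if any waypoint violates the safety distance, try detours through a single sideways waypoint. The detour search is a stand-in chosen for brevity, not the planner used by the team, and the safety distance and step size are placeholder values.

```python
import math

def straight_path(start, goal, step_m=0.1):
    """Evenly spaced waypoints from start to goal, both included."""
    n = max(1, math.ceil(math.dist(start, goal) / step_m))
    return [(start[0] + (goal[0] - start[0]) * i / n,
             start[1] + (goal[1] - start[1]) * i / n) for i in range(n + 1)]

def is_safe(path, obstacles, safety_m):
    """True if every waypoint keeps the safety distance from every obstacle."""
    return all(math.dist(p, o) >= safety_m for p in path for o in obstacles)

def plan_path(start, goal, obstacles, safety_m=0.4, step_m=0.1):
    """Straight path if clear; otherwise a detour through one side waypoint."""
    direct = straight_path(start, goal, step_m)
    if not obstacles or is_safe(direct, obstacles, safety_m):
        return direct
    length = math.dist(start, goal)
    if length == 0:
        return None  # already at the goal but too close to an obstacle
    # Unit normal to the start-goal line, used to push a waypoint sideways.
    nx, ny = -(goal[1] - start[1]) / length, (goal[0] - start[0]) / length
    mid = ((start[0] + goal[0]) / 2, (start[1] + goal[1]) / 2)
    for offset in (0.5, -0.5, 1.0, -1.0, 1.5, -1.5):
        via = (mid[0] + offset * nx, mid[1] + offset * ny)
        path = straight_path(start, via, step_m) + straight_path(via, goal, step_m)[1:]
        if is_safe(path, obstacles, safety_m):
            return path
    return None  # no safe detour among the candidates tried
```

Only the first step of the returned path would be executed before replanning, in line with the action executor described next.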
Action executor
• Simply executes the selected action!
• If selected action == move, executes first step of computed trajectory
• May also execute two moves at once, e.g. move and scan
Research contributions (in progress!)
• Reachable sets: improve particle filtering algorithm by accounting for the physical capabilities of the adversaries
• Intent inference, escape, deceit: synthesise more intelligent behaviours that exploit the observability constraints and strategic limitations of the adversaries
• Bringing the above together in a closed-loop sense
Reachable sets
Composable reachable sets
• Initial idea: composable reachable sets
• Compute different sets for each capability hypothesis for the adversary offline
• Online, always pick the one that most closely matches the adversary’s observed behaviour, based on particle filter estimates
• Result: more flexible decision making that adapts locally, in the face of noisy observations
Composable reachable sets
• Offers some performance improvement
• But sensory information is too noisy to allow accurate estimation of velocities
• Need a more flexible approach that adapts to adversary over time
• New approach: use the reachable set as a proposal distribution inside the particle filter
• State estimation is still probabilistic and data-driven, but with additional physical constraints
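To make the proposal-distribution idea concrete, the sketch below replaces the usual noisy motion model in the predict step with a draw from the set of positions the adversary could physically reach. It assumes the crudest possible reachable set, a disc of radius `max_speed * dt` around the particle; the sets used in the actual work are presumably richer, so treat this purely as an illustration.

```python
import math
import random

def propose_within_reachable_set(particle, max_speed, dt, rng):
    """Sample a successor uniformly from the disc reachable within dt seconds.

    The disc-shaped reachable set is an assumption made for this sketch.
    """
    # The square root makes the samples uniform over the disc's area.
    radius = max_speed * dt * math.sqrt(rng.random())
    angle = rng.uniform(0, 2 * math.pi)
    return (particle[0] + radius * math.cos(angle),
            particle[1] + radius * math.sin(angle))
```

Every proposed particle then respects the physical constraint by construction, while the update step still weights particles by the sensor data.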
Intent inference, escape and deceit
• Very difficult for robots to execute complicated strategies (passing, attack formations etc)
• But they can be strategic in different ways!
• Approach: form flexible probabilistic models of the adversaries, through which their capabilities may be exploited
Intent inference
• Decompose adversary’s behaviour into a set of coarse classes (intent templates), and define a probability distribution over them
• E.g. {Move towards ball, move towards me, move randomly, stand still}
• At time t:
  • Compute the expected moves for each template
  • Pick a template randomly (proportionally to its weight)
• At time t + 1, adjust intent template weights based on the actual move of the robot
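The loop above can be sketched with the four example templates. The exponential scoring rule, the fixed score given to the "move randomly" template (which has no single expected move) and the step size are all assumptions introduced for the example; the slide does not specify how the weights are adjusted.

```python
import math
import random

def expected_moves(adversary, ball, me, step_m=0.1):
    """Expected displacement under each deterministic intent template."""
    def towards(target):
        d = math.dist(adversary, target)
        if d == 0:
            return (0.0, 0.0)
        return ((target[0] - adversary[0]) * step_m / d,
                (target[1] - adversary[1]) * step_m / d)
    return {"towards_ball": towards(ball), "towards_me": towards(me),
            "stand_still": (0.0, 0.0)}

def sample_template(weights, rng):
    """Pick a template at random, proportionally to its weight."""
    labels = list(weights)
    return rng.choices(labels, weights=[weights[k] for k in labels], k=1)[0]

def update_weights(weights, predictions, actual_move,
                   scale_m=0.05, random_score=0.1):
    """Reweight templates by how well they predicted the actual move."""
    scored = {}
    for label, weight in weights.items():
        if label in predictions:
            error = math.dist(predictions[label], actual_move)
            scored[label] = weight * math.exp(-error / scale_m)
        else:
            scored[label] = weight * random_score  # e.g. "move_randomly"
    total = sum(scored.values())
    return {label: value / total for label, value in scored.items()}
```

Starting from uniform weights, a few steps in which the adversary keeps heading for the ball shift most of the probability mass onto the "towards_ball" template.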
Escape strategies
• Idea: Robots are faced with strong sensory limitations. . .
• . . . but this is also true of their opponents!
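• Select actions and trajectories so they exploit these capabilities and hide information from the adversary:

\[ \hat{\beta} = \arg\max_{\beta \in B_T} \frac{1}{|\beta|} \sum_{k=1}^{|\beta|} \mathrm{dist}(\beta_k, \mathit{vbs}_{ij}) \tag{1} \]

\[ \hat{\rho} = \arg\max_{\rho \in R_T} \frac{1}{|\rho|} \sum_{k=1}^{|\rho|} \mathrm{dist}(\rho_k, \mathit{sbs}_{ij}) \tag{2} \]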
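Equations (1) and (2) share one shape: among a set of candidate trajectories, pick the one whose points have the largest mean distance to a reference quantity. The sketch below implements that shape only. It treats the reference (written vbs and sbs on the slide, and not defined in this transcript) as a single (x, y) point, which is an assumption, and all names are hypothetical.

```python
import math

def mean_distance(trajectory, reference_point):
    """Average distance from the trajectory's points to the reference point."""
    return sum(math.dist(p, reference_point) for p in trajectory) / len(trajectory)

def best_escape(candidates, reference_point):
    """argmax over candidate trajectories of the mean point-to-reference distance."""
    return max(candidates, key=lambda traj: mean_distance(traj, reference_point))
```

For example, with candidates `[(0, 0), (1, 0)]` and `[(0, 0), (0, 2)]` and a reference point at `(1, 0)`, the second trajectory is selected because it moves away from the reference rather than onto it.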
Deceit
• Escape strategies are one-step predictive
• Can we extend this to greater time horizons?
• Deceptive move: maximise deviation from the move the adversary expects you to do, while minimising the distance to your own goal:

\[ \widehat{dm} = \arg\min_{m \in DM} \; w_D D^t_\mu(m) + w_U U^t_\mu(m) \tag{3} \]

where

\[ D^t_\mu(dm) = -\mathrm{dist}(dm, E^t_\mu), \tag{4} \]

\[ U^t_\mu(dm) = \mathrm{dist}(dm, G^t_\mu) \tag{5} \]
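Equations (3) to (5) translate directly into a small cost minimisation. In this sketch the expected move E and the goal G are taken to be (x, y) points and the candidate set DM is a plain list, following the wording of the bullet above; the default weights are arbitrary placeholders.

```python
import math

def deceptive_move(candidates, expected_move, goal, w_d=1.0, w_u=1.0):
    """argmin over candidate moves of w_D * D(m) + w_U * U(m)."""
    def cost(m):
        deviation_term = -math.dist(m, expected_move)  # D: reward surprise
        goal_term = math.dist(m, goal)                 # U: penalise distance to goal
        return w_d * deviation_term + w_u * goal_term
    return min(candidates, key=cost)
```

Raising `w_d` relative to `w_u` favours surprising the adversary over making progress towards the goal, and setting `w_u` to zero selects the candidate farthest from the expected move.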
Regret minimisation
• Well-studied game-theoretic concept
• Aim: learn and adapt to adversary’s strategic model
• Our approach: adjust weight distributions for intent templates and deceit online, based on difference between expected and actual moves
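As a point of reference, one standard regret-minimising scheme is the exponential-weights (Hedge) update, sketched below. It is named here as a textbook stand-in and is not claimed to be the algorithm used in this work. The loss of each option would be some measure of the difference between its expected move and the actual move, and the learning rate is a placeholder.

```python
import math

def hedge_update(weights, losses, learning_rate=0.5):
    """One exponential-weights step: w_i <- w_i * exp(-eta * loss_i), renormalised."""
    scaled = [w * math.exp(-learning_rate * loss)
              for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]
```

Options that keep predicting the adversary well lose weight more slowly than the others, so the distribution gradually concentrates on them.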
Regret minimisation algorithm
Complete decision making algorithm