Nicola Basilico, Nicola Gatti, Thomas Rossi, Sofia Ceppi, and Francesco Amigoni
DEI, Politecnico di Milano
Extending Algorithms for Mobile Robot Patrolling in the Presence of Adversaries to More Realistic Settings
Nicola Basilico, Nicola Gatti, Thomas Rossi, Sofia Ceppi, and Francesco Amigoni
{basilico,ngatti,ceppi,amigoni}@elet.polimi.it, [email protected]
Outline
• Background
  – State of the art
  – Basic model
  – Solving algorithm
• Contributions
  – Modeling intruder’s movements
  – Modeling intruder’s visibility limitations
  – Complexity reduction techniques
  – Experimental results
• Conclusions and Future Works
Part 1: Background
Related Works
• The patrolling strategy problem:
  • The patrolling strategy drives the robot in the patrolling task
  • Problem: given an environment, compute the best patrolling strategy
• Approaches:
  • Not considering a model of the adversary (the intruder)
    • Frequency/coverage-based approaches
  • Explicitly considering a model of the adversary (the intruder)
    • It can provide better strategies (Amigoni et al., IAT 2008)
• Model of the adversary:
  • Without preferences (Agmon et al., AAMAS 2008, perimeter-like environments)
  • With preferences (Paruchuri et al., AAMAS 2008, fully connected environments)
Patrolling Setting
• Patroller:
  • Equipped with sensors to detect intrusions in the patrolling setting
  • It can move between adjacent vertices in one time unit
• Intruder:
  • It observes the patroller while remaining hidden outside the environment
  • It can decide to enter the environment at any turn
• For each target T, the intruder must spend time d_T to successfully attack the target
• When attempting to attack target T at time t, the intruder can be detected during [t, t + d_T)
• Time is discretized in turns
• Grid map composed of free cells (white), obstacle cells (black), and targets (green circles)
• Targets: cells with some value for both players
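As a minimal sketch of the setting above, a grid map can be turned into a graph whose vertices are the non-obstacle cells and whose edges connect adjacent cells. The character encoding ('.', '#', 'T'), the 4-connectivity, the sample map, and the function name are illustrative assumptions, not taken from the slides.

```python
def grid_to_graph(rows):
    """Build an adjacency map from a character grid.

    Hypothetical encoding: '.' is a free cell, '#' an obstacle, 'T' a target.
    Cells are (row, col) tuples; moves are assumed to be 4-connected.
    """
    free = {(r, c)
            for r, line in enumerate(rows)
            for c, ch in enumerate(line) if ch != '#'}
    graph = {}
    for (r, c) in free:
        neighbours = [(r + dr, c + dc)
                      for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        # Keep only neighbours that are inside the map and not obstacles.
        graph[(r, c)] = sorted(n for n in neighbours if n in free)
    targets = sorted((r, c) for (r, c) in free if rows[r][c] == 'T')
    return graph, targets


# Tiny hypothetical map: two targets and one obstacle.
GRID = ["T.#",
        "..T"]
graph, targets = grid_to_graph(GRID)
```

The patroller then moves along the edges of `graph`, one edge per turn.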
Patrolling Strategy
• Patrolling strategy: it specifies the next move of the patroller at each turn
• Randomized strategy: a probability distribution over the next move; it can be the only effective strategy against an observing intruder
• Objective: finding the optimal randomized patrolling strategy while considering a model of the adversary (the intruder)
• Strongest intruder: a rational agent that knows the patrolling strategy and considers it when deciding its action
• Approach: to study the interactions between patroller and intruder agents within a game-theoretical framework
The Patrolling Game
Game Outcomes
• At turn k the intruder enters cell T when the patroller is in cell G: enter-when(T,G)
  • If the patroller does not sense cell T in the interval [k, k + d_T), the intruder wins
  • Otherwise the intruder is captured and the patroller wins
• The intruder never enters: stay-out
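[Figure: game tree, one level per turn, in which the patroller (P) chooses a move, e.g. move(7), move(10), or move(12), and the intruder (I) chooses wait or an enter action, e.g. enter(1) or enter(13); shown next to a grid map with numbered cells]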
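The outcome of enter-when(T,G) can be illustrated with a small sketch that computes the probability that the patroller reaches T in time. It assumes, beyond what the slides state, that the patroller follows a fixed randomized strategy stored as nested dictionaries, that it senses only the cell it occupies, and that it makes one move per turn; the function name and the three-cell example are hypothetical.

```python
def capture_probability(strategy, start, target, d_t):
    """Probability that the patroller occupies `target` at some turn in
    [k, k + d_t), given that it is in `start` at turn k.

    `strategy[i][j]` is the probability of moving from cell i to cell j.
    """
    # dist[c]: probability of being in c without having visited target yet.
    dist = {start: 1.0}
    captured = 0.0
    for _ in range(d_t):
        # Mass sitting on the target at this turn means capture.
        captured += dist.pop(target, 0.0)
        nxt = {}
        for cell, p in dist.items():
            for succ, q in strategy[cell].items():
                nxt[succ] = nxt.get(succ, 0.0) + p * q
        dist = nxt
    return captured


# Hypothetical corridor 0 - 1 - 2 with a random walk in the middle cell.
PATROL = {0: {1: 1.0}, 1: {0: 0.5, 2: 0.5}, 2: {1: 1.0}}
p_capture = capture_probability(PATROL, start=0, target=2, d_t=3)
```

Under these assumptions the intruder wins with probability `1 - p_capture`.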
Solving the Game
• The patrolling problem can be modeled as a leader-follower game
  • Two players
  • The leader commits to a strategy
  • The follower observes such commitment and acts as a best responder
• Patroller’s strategy: A = {α_{i,j}}, where α_{i,j} is the probability of doing move(j) when i is the current node
• Intruder’s strategy: enter-when(T,G), i.e., enter target T when the patroller is in cell G
• The optimal A can be derived by computing the equilibrium of the leader-follower game, which amounts to solving a bilevel optimization problem (Conitzer and Sandholm, 2006)
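A minimal sketch of how the patroller's strategy A could be stored and executed follows. The nested-dictionary layout, the three-cell graph, and the function names are assumptions made for illustration; the slides only define the probabilities themselves.

```python
import random

# Hypothetical corridor 0 - 1 - 2 and a strategy over it:
# A[i][j] is the probability of doing move(j) when i is the current node.
GRAPH = {0: [1], 1: [0, 2], 2: [1]}
A = {0: {1: 1.0}, 1: {0: 0.5, 2: 0.5}, 2: {1: 1.0}}


def is_valid_strategy(strategy, graph, tol=1e-9):
    """Check that every row is a distribution over adjacent cells."""
    for cell, moves in strategy.items():
        if abs(sum(moves.values()) - 1.0) > tol:
            return False
        if any(p < 0 or nxt not in graph[cell] for nxt, p in moves.items()):
            return False
    return True


def next_move(strategy, cell, rng):
    """Sample the next cell according to the row of the current cell."""
    moves = strategy[cell]
    return rng.choices(list(moves), weights=list(moves.values()), k=1)[0]


rng = random.Random(0)  # seeded only to make the sketch repeatable
walk = [1]
for _ in range(5):
    walk.append(next_move(A, walk[-1], rng))
```

The strategy depends only on the current cell, which matches the definition of the probabilities above.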
Solving the Game
For every intruder’s action a_i:
• Find A’_i such that EU_p is maximum, subject to a_i being a best response to A’_i
This yields one candidate pair per action: (A’_1, a_1), (A’_2, a_2), (A’_3, a_3), …, (A’_n, a_n)
The pair with the maximum EU_p is the leader-follower equilibrium (A*, a*); A* is the optimal patrolling strategy
If the intruder’s action is to attack target T, the patroller’s expected utility is computed as:
EU_p = P(intrusion T) · X_T + (1 − P(intrusion T)) · X_0
P(intrusion T) depends on:
• the attacked target
• the position of the patroller
• the patrolling strategy
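The solving scheme above (one optimization problem per follower action, then the maximum over the candidates) can be sketched on a toy two-by-two game. This is not the patrolling game and not the authors' solver: the payoff matrices are a hypothetical textbook-style example, the leader is assumed to have exactly two actions, and a plain grid search over the leader's mixed strategy stands in for the mathematical programming solver.

```python
def solve_by_enumeration(leader_payoff, follower_payoff, steps=100):
    """For each follower action a, search for the leader commitment that
    maximizes the leader's expected utility subject to a being a best
    response; return the best candidate as (utility, p, a), where p is
    the probability of the leader's first action.
    """
    n_follower = len(follower_payoff[0])
    candidates = []
    for a in range(n_follower):
        best_for_a = None
        for s in range(steps + 1):
            p = s / steps
            mix = (p, 1.0 - p)
            f_util = [mix[0] * follower_payoff[0][j]
                      + mix[1] * follower_payoff[1][j]
                      for j in range(n_follower)]
            if f_util[a] < max(f_util) - 1e-9:
                continue  # a is not a best response to this commitment
            eu = mix[0] * leader_payoff[0][a] + mix[1] * leader_payoff[1][a]
            if best_for_a is None or eu > best_for_a[0]:
                best_for_a = (eu, p, a)
        if best_for_a is not None:
            candidates.append(best_for_a)
    return max(candidates)


# Hypothetical payoffs: rows are leader actions, columns follower actions.
LEADER = [[2, 4], [1, 3]]
FOLLOWER = [[1, 0], [0, 1]]
result = solve_by_enumeration(LEADER, FOLLOWER)
```

In this toy game the leader does better by committing to a mixed strategy than by playing its dominant first action, which is the reason commitment to a randomized strategy matters.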
Part 2: Contributions
Objective
• The basic model is general, but it makes many simplifying assumptions
  • E.g., the intruder can directly enter any target
• We introduce two different extensions in order to model a more realistic patrolling scenario
• We refine the intruder’s model by considering aspects that are not addressed in the game-theoretical patrolling literature
• We experimentally evaluate the computational complexity of the extended model and provide techniques to reduce it
Intruder’s Movements
• Basic model assumption: the intruder can directly enter any target
  • The intruder’s strategy is represented as enter-when(T,C)
• The environment can be accessed through access areas
• As soon as the patroller is in C, the intruder:
  • Enters from an access area
  • Follows a path P from the access area to a target T, and then stays there for d_T turns to complete the intrusion attempt
• Now the intruder’s strategy is represented as enter-when(P,C)
• The intrusion probability of an intruder’s strategy has to be computed differently from the basic model
[Figure: map marking target T and patroller cell C]
Reduction
• We can reduce the computational burden by discarding the players’ dominated actions
• An action a is dominated by an action b if the player prefers to undertake b independently of the opponent’s strategy
• Patroller’s actions reduction:
  • Smaller setting, fewer variables
  • Forcing the patroller to cover shortest paths between targets
• Intruder’s actions reduction:
  • Fewer optimization problems, fewer constraints for each optimization problem
Reduction
• Identify the minimal set of paths that a rational intruder would consider in its enter-when(P,C) actions
• Obviously, enter-when(P1, *) dominates enter-when(P2, *)
• P3 is not dominated: there can be a patrolling strategy such that P3 is better than P1
• We select all irreducible paths, i.e., those paths that do not strictly contain any other path
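The selection of irreducible paths can be sketched as a simple filter. The slides do not spell out what "contain" means, so this sketch assumes one possible reading: a path contains another when its set of visited cells is a strict superset of the other's, with all paths leading to the same target. The cell labels and the function name are hypothetical.

```python
def irreducible_paths(paths):
    """Keep the paths whose cell set does not strictly contain the cell
    set of any other path (assumed reading of 'irreducible')."""
    cell_sets = [frozenset(path) for path in paths]
    keep = []
    for i, path in enumerate(paths):
        contains_other = any(
            cell_sets[j] < cell_sets[i]  # proper-subset test on frozensets
            for j in range(len(paths)) if j != i
        )
        if not contains_other:
            keep.append(path)
    return keep


# Hypothetical paths from access areas to the same target 'T'.
P1 = ['a', 'b', 'T']
P2 = ['a', 'b', 'c', 'b', 'T']   # detour through c: contains P1
P3 = ['d', 'e', 'T']
selected = irreducible_paths([P1, P2, P3])
```

Here the detour P2 is discarded, while P1 and P3 survive because neither contains the other.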
Reduction
• Identify the minimal set of cells {C} that a rational intruder would consider in its enter-when(P,C) actions
• Obviously, enter-when(P1, C) is dominated by stay-out
• enter-when(P1, C1) is dominated by enter-when(P1, C2): from C2 the patroller always has to cover a longer distance than from C1 to reach the target within d_T turns
• For every irreducible path we find the set {C} by means of a tree-based search technique
[Figure: map marking cells C, C1, and C2]
Intruder’s Limited Observation Capabilities
• Basic model: the intruder can observe the patroller and derive a correct belief on the patrolling strategy
• Limited visibility: when acting, the intruder has limited knowledge about the current position of the patroller
• H_i is the set of hidden cells when entering from access area i
• Actions enter-when(T,G) cannot be performed if G is a hidden cell belonging to H_i
• We introduce a state of the game s = <G,O> where:
  • G is the last cell where the intruder saw the patroller
  • O is the number of turns elapsed since that last observation
• Example: s = <G,3>
[Figure: map marking cell G and the hidden cells (?) where the patroller may currently be]
• The intruder can compute a probability distribution over the patroller’s position using the strategy it knows
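One way the intruder's belief in state s = <G,O> might be computed is sketched below. It assumes, beyond what the slides state, that the patroller follows a Markovian strategy stored as nested dictionaries and that not having seen the patroller for O turns means it stayed inside hidden cells, so the distribution is conditioned on that event. The cell names, the strategy, and the function name are hypothetical.

```python
def belief_after(strategy, hidden, last_seen, turns):
    """Distribution over the patroller's cell, `turns` turns after it was
    last seen in `last_seen`, given that it has stayed hidden since then.

    `strategy[i][j]` is the probability of moving from cell i to cell j.
    """
    dist = {last_seen: 1.0}
    for _ in range(turns):
        nxt = {}
        for cell, p in dist.items():
            for succ, q in strategy[cell].items():
                if succ in hidden:  # unseen, so it moved to a hidden cell
                    nxt[succ] = nxt.get(succ, 0.0) + p * q
        dist = nxt
    total = sum(dist.values())
    if total == 0.0:
        return {}  # the observation history is impossible under strategy
    return {cell: p / total for cell, p in dist.items()}


# Hypothetical map: G and v are visible, h1..h3 are hidden.
STRATEGY = {
    'G': {'h1': 0.5, 'v': 0.5},
    'v': {'G': 1.0},
    'h1': {'G': 0.2, 'h2': 0.4, 'h3': 0.4},
    'h2': {'h1': 1.0},
    'h3': {'h1': 1.0},
}
HIDDEN = {'h1', 'h2', 'h3'}
belief = belief_after(STRATEGY, HIDDEN, last_seen='G', turns=2)
```

In this example the belief after two unseen turns is split evenly between `h2` and `h3`.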
Intruder’s Limited Visibility
• Now the intruder’s strategy is represented as enter-when(T,s), where s is a state
• To determine non-dominated actions we have to compute the minimal set of states {s}
  • Compute the minimal set of cells {c} to consider as patroller positions (as in the previous case)
  • For every c in {c}:
    • If c is not hidden, then s = <c,0> has to be considered
    • If c is hidden, we consider the cell c’ from which the patroller can reach c without passing through any non-hidden cell
    • We consider the state s = <c’,k> such that the probability of the patroller being in c, starting from c’, is maximum k turns after it disappeared
    • We can find it by resorting to properties of Markov chains; in the example, s = <c’,3>
[Figure: map marking the hidden cell c and the cell c’]
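The choice of k for a hidden cell c can be sketched by stepping the patroller's Markov chain forward from c’ and recording the turn at which the probability of being in c peaks. This is an illustrative reading, not the authors' procedure: it assumes that only trajectories through hidden cells count, that unnormalized probabilities are compared, and that the search stops at a finite horizon `max_k`. The corridor and all names are hypothetical.

```python
def most_likely_delay(strategy, hidden, entry, cell, max_k):
    """Return (k, probability) maximizing the probability that the
    patroller is in `cell` exactly k turns after leaving `entry`,
    moving through hidden cells only. Returns (None, 0.0) if `cell`
    is never reached within `max_k` turns.
    """
    dist = {entry: 1.0}
    best_k, best_p = None, 0.0
    for k in range(1, max_k + 1):
        nxt = {}
        for cur, p in dist.items():
            for succ, q in strategy[cur].items():
                if succ in hidden:
                    nxt[succ] = nxt.get(succ, 0.0) + p * q
        dist = nxt
        if dist.get(cell, 0.0) > best_p:
            best_k, best_p = k, dist[cell]
    return best_k, best_p


# Hypothetical corridor c' - h1 - h2 - c, where only c' is visible.
CORRIDOR = {
    "c'": {'h1': 1.0},
    'h1': {"c'": 0.5, 'h2': 0.5},
    'h2': {'h1': 0.5, 'c': 0.5},
    'c': {'h2': 1.0},
}
k, p = most_likely_delay(CORRIDOR, {'h1', 'h2', 'c'}, "c'", 'c', max_k=6)
```

The returned k would then define the state s = <c’,k> used in the intruder's action.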
Experimental Results
[Figure: bar charts for the two extensions, intruder's paths and intruder's limited visibility, each reporting total time (seconds) and number of optimization problems for the original setting (without reduction), the partially reduced setting, and the fully reduced setting]
Conclusions and Future Works
• Conclusions:
  • We presented a game-theoretical model to find the best patrolling strategy in a patrolling setting, together with some extensions to capture more realistic situations
• Future Works:
  • Further extensions to refine the model of the patroller
  • Real / simulated robot implementation
  • Multi-patroller scenarios