Nicola Basilico, Nicola Gatti, Thomas Rossi, Sofia Ceppi, and Francesco Amigoni

DEI, Politecnico di Milano

Extending Algorithms for Mobile Robot Patrolling in the Presence of Adversaries to More Realistic Settings

{basilico,ngatti,ceppi,amigoni}@elet.polimi.it, [email protected]


Outline

• Background
  – State of the art
  – Basic model
  – Solving algorithm

• Contributions
  – Modeling intruder’s movements
  – Modeling intruder’s visibility limitations
  – Complexity reduction techniques
  – Experimental results

• Conclusions and Future Works


Part 1: Background


Related Works

• The patrolling strategy problem:
  • The patrolling strategy drives the robot in the patrolling task
  • Problem: given an environment, compute the best patrolling strategy

• Approaches:
  • Not considering a model of the adversary (the intruder)
    • Frequency/coverage based approaches
  • Explicitly considering a model of the adversary (the intruder)
    • It can provide better strategies (Amigoni et al., IAT 2008)

• Model of the adversary
  • Without preferences (Agmon et al., AAMAS 2008, perimeter-like environments)
  • With preferences (Paruchuri et al., AAMAS 2008, fully connected environments)


Patrolling Setting

• Patroller:
  • Equipped with sensors to detect intrusions in the patrolling setting
  • It can move between adjacent vertices in one time unit

• Intruder:
  • It observes the patroller while remaining hidden outside the environment
  • It can decide to enter the environment at any turn

• For each target T, the intruder must spend time dT to successfully attack the target

• When attempting to attack target T at time t, the intruder can be detected during [t, t+dT)

• Time is discretized in turns

• Grid map composed of free cells (white), obstacle cells (black) and targets (green circles)

• Targets: cells with some value for both players


Patrolling Strategy

• Patrolling strategy: it specifies the next move of the patroller at each turn

• Randomized strategy: a probability distribution over the next move; it can be the only effective strategy against an observing intruder

• Objective: finding the optimal randomized patrolling strategy while considering a model of the adversary (the intruder)

• Strongest intruder: a rational agent that knows the patrolling strategy and considers it when deciding its action

• Approach: to study the interactions between patroller and intruder agents within a game-theoretical framework
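A randomized strategy can be pictured as a table of move probabilities from which the patroller draws its next cell at every turn. The following is a minimal sketch, not the authors' implementation: the three-cell corridor, the probabilities, and the names `STRATEGY`, `next_move` and `simulate` are assumptions made for illustration.

```python
import random

# Hypothetical strategy on a 3-cell corridor 0 - 1 - 2:
# STRATEGY[i][j] is the probability of moving to cell j when in cell i.
STRATEGY = {
    0: {1: 1.0},
    1: {0: 0.5, 2: 0.5},
    2: {1: 1.0},
}

def next_move(strategy, cell, rng=random):
    """Draw the patroller's next cell from the distribution attached to `cell`."""
    successors = list(strategy[cell].keys())
    weights = list(strategy[cell].values())
    return rng.choices(successors, weights=weights, k=1)[0]

def simulate(strategy, start, turns, rng=random):
    """Return the cells visited over `turns` moves, starting from `start`."""
    route = [start]
    for _ in range(turns):
        route.append(next_move(strategy, route[-1], rng))
    return route
```

An intruder that knows `STRATEGY` still cannot predict the exact route, only its distribution, which is what makes randomization useful against an observer.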


The Patrolling Game

Game Outcomes
• At turn k the intruder enters cell T when the patroller is in cell G: enter-when(T,G)
  • If the patroller does not sense cell T in the interval [k, k+dT) the intruder wins
  • Otherwise the intruder is captured and the patroller wins
• The intruder never enters: stay-out

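[Figure: game tree alternating patroller nodes (P), with moves such as move(7), move(10), move(12), and intruder nodes (I), with actions wait, enter(1), enter(13); each level corresponds to 1 turn. The tree is shown next to a grid map with numbered cells.]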


Solving the Game

• The patrolling problem can be modeled as a leader-follower game
  • Two players
  • The leader commits to a strategy
  • The follower observes such commitment and acts as a best responder

• Patroller’s strategy: A = {αi,j}, where αi,j is the probability of doing move(j) when i is the current node

• Intruder’s strategy: enter-when(T,G), i.e., enter target T when the patroller is in cell G

• The optimal A can be derived by computing the equilibrium of the leader-follower game, formulated as a bilevel optimization problem (Conitzer and Sandholm, 2006)
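To make the commitment idea concrete, here is a small sketch on a toy bimatrix game. It is a brute-force grid search over the leader's mixed strategies, used as a stand-in for the bilevel optimization mentioned above, not a reproduction of it. The payoff matrices, the function name `commit_by_grid_search`, and the tie-breaking rule in the leader's favour are all assumptions.

```python
def commit_by_grid_search(leader_payoff, follower_payoff, steps=100):
    """Approximate the leader's best commitment in a 2-row bimatrix game.

    Rows are the leader's two actions, columns the follower's actions.
    Returns (leader_value, (p_row0, p_row1), follower_column).
    """
    n_cols = len(leader_payoff[0])
    best = None
    for s in range(steps + 1):
        p = s / steps
        mix = (p, 1.0 - p)
        # Follower's expected payoff for each column against this commitment.
        f_vals = [sum(mix[r] * follower_payoff[r][c] for r in range(2))
                  for c in range(n_cols)]
        top = max(f_vals)
        for c in range(n_cols):
            # Assumption: ties between best responses favour the leader.
            if f_vals[c] >= top - 1e-12:
                l_val = sum(mix[r] * leader_payoff[r][c] for r in range(2))
                if best is None or l_val > best[0]:
                    best = (l_val, mix, c)
    return best

# Hypothetical payoffs: LEADER[r][c] and FOLLOWER[r][c].
LEADER = [[2, 4], [1, 3]]
FOLLOWER = [[1, 0], [0, 1]]
value, mix, column = commit_by_grid_search(LEADER, FOLLOWER)
```

On this toy game, committing to the first row with certainty gives the leader 2, while the mixed commitment found by the search gives 3.5. A grid search does not scale to patrolling settings; it only illustrates why a randomized commitment can be better than a deterministic one.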


Solving the Game

For every intruder’s action ai:
  Find A’i such that EUp is maximum, s.t. ai is a best response to A’i

This gives one candidate pair per intruder’s action: (A’1,a1), (A’2,a2), (A’3,a3), …, (A’n,an)

max EUp over the candidates gives the leader-follower equilibrium (A*,a*); A* is the optimal patrolling strategy

If the intruder’s action is to attack target T, the patroller’s expected utility is computed as:

P(intrusion T) * XT + (1 - P(intrusion T)) * X0

P(intrusion T) depends on
• the attacked target
• the position of the patroller
• the patrolling strategy
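The expected-utility formula can be evaluated numerically once P(intrusion T) is known. The sketch below rests on several assumptions that the slides do not state: P(intrusion T) is read as the probability that the attack succeeds, XT and X0 as the patroller's payoffs when T is violated and when the intruder is captured, "sensing" a cell as the patroller being in it, and the entry turn as the first of the dT turns. The strategy table and all function names are hypothetical.

```python
# Hypothetical Markov strategy on a corridor 0 - 1 - 2.
STRATEGY = {
    0: {1: 1.0},
    1: {0: 0.5, 2: 0.5},
    2: {1: 1.0},
}

def success_probability(strategy, start, target, d):
    """Probability that the patroller, in `start` at the entry turn,
    never occupies `target` during the d turns of the attack."""
    if start == target:
        return 0.0
    # alive[c]: probability of being in c without having visited target yet.
    alive = {start: 1.0}
    for _ in range(d - 1):  # d turns including the entry turn = d - 1 moves
        nxt = {}
        for cell, p in alive.items():
            for succ, q in strategy[cell].items():
                if succ != target:
                    nxt[succ] = nxt.get(succ, 0.0) + p * q
        alive = nxt
    return sum(alive.values())

def patroller_expected_utility(p_success, x_target, x_capture):
    """P(intrusion T) * XT + (1 - P(intrusion T)) * X0."""
    return p_success * x_target + (1.0 - p_success) * x_capture

p = success_probability(STRATEGY, start=0, target=2, d=3)
utility = patroller_expected_utility(p, x_target=0.0, x_capture=1.0)
```

With these illustrative numbers, the patroller reaches cell 2 in two moves with probability 0.5, so the attack succeeds with probability 0.5.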


Part 2: Contributions


Objective

• The basic model is general but it makes many simplifying assumptions
  • E.g., the intruder can directly enter any target

• We introduce two different extensions in order to model a more realistic patrolling scenario

• We refine the intruder’s model considering aspects that are not addressed in game theoretical patrolling literature

• We experimentally evaluate the computational complexity of the extended model and provide techniques to reduce it


Intruder’s Movements

• Basic model assumption: the intruder can directly enter any target
  • The intruder’s strategy is represented as: enter-when(T,C)

• The environment can be accessed through access areas

• As soon as the patroller is in C, the intruder:
  • Enters from an access area
  • Follows a path P from the access area to a target T, and then stays there for dT turns to complete the intrusion attempt

• Now the intruder’s strategy is represented as: enter-when(P,C)

• The intrusion probability of an intruder’s strategy has to be computed in a different way with respect to the basic model


Reduction

• We can reduce the computational burden by discarding players’ dominated actions

• An action a is dominated by an action b if the player prefers to undertake b independently of the opponent’s strategy

• Patroller’s actions reduction:
  • Smaller setting, fewer variables
  • Forcing the patroller to cover shortest paths between targets

• Intruder’s actions reduction:
  • Fewer optimization problems, fewer constraints for each optimization problem


Reduction

• Identify the minimal set of paths that a rational intruder would consider in its actions enter-when(P,C)

• Obviously enter-when(P1, *) dominates enter-when(P2, *)

• P3 is not dominated: there can be a patrolling strategy such that P3 is better than P1

• We select all irreducible paths, i.e., those paths that do not strictly contain any other path
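The selection of irreducible paths can be sketched as a simple filter. The slides do not define "strictly contain" precisely, so this example assumes that all candidate paths share the same access area and target, and that a path strictly contains another when its set of visited cells is a strict superset of the other's. The cell labels and the function name `irreducible_paths` are hypothetical.

```python
def irreducible_paths(paths):
    """Keep the paths whose set of cells does not strictly contain
    the set of cells of any other candidate path."""
    cell_sets = [frozenset(p) for p in paths]
    kept = []
    for i, path in enumerate(paths):
        contains_other = any(
            j != i and cell_sets[j] < cell_sets[i]  # strict subset test
            for j in range(len(paths))
        )
        if not contains_other:
            kept.append(path)
    return kept

# Hypothetical paths from access area "a" to target "T".
P1 = ["a", "b", "T"]
P2 = ["a", "b", "x", "b", "T"]   # P1 plus a detour through x
P3 = ["a", "c", "d", "T"]        # a different route
selected = irreducible_paths([P1, P2, P3])
```

Here P2 is discarded because it covers every cell of P1 plus a detour, while P3 is kept because it does not contain any other path.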


Reduction

• Identify the minimal set of cells {C} that a rational intruder would consider in its actions enter-when(P,C)

• Obviously enter-when(P1, C) is dominated by stay-out

• enter-when(P1, C1) is dominated by enter-when(P1, C2): from C2 the patroller always has to cover a longer distance than from C1 to reach the target within dT turns

• For every irreducible path we find the set {C} by a tree-based search technique


Intruder’s Limited Observation Capabilities

• Basic model: the intruder can observe the patroller and derive a correct belief on the patrolling strategy

• Limited visibility: when acting, the intruder has limited knowledge about the current position of the patroller

• Hi is the set of hidden cells when entering from access area i

• Actions enter-when(T,G) cannot be performed if G is a hidden cell belonging to Hi

• We introduce a state of the game s = <G,O> where:
  • G is the last cell where the intruder saw the patroller
  • O is the number of turns elapsed since that last observation

• Example: s = <G,3>

• The intruder can compute a probability distribution over the patroller’s position using the strategy it knows.
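One way to obtain such a distribution, given that the strategy assigns move probabilities per cell, is to propagate the last observed position forward for O turns. The sketch below is only illustrative: the strategy table and the name `position_distribution` are assumptions, and for simplicity it does not condition on the patroller having stayed out of sight during those turns.

```python
# Hypothetical Markov strategy on a corridor 0 - 1 - 2.
STRATEGY = {
    0: {1: 1.0},
    1: {0: 0.5, 2: 0.5},
    2: {1: 1.0},
}

def position_distribution(strategy, last_seen, turns):
    """Distribution over the patroller's cell `turns` moves after it
    was observed in `last_seen`, i.e. for the state <last_seen, turns>."""
    dist = {last_seen: 1.0}
    for _ in range(turns):
        nxt = {}
        for cell, p in dist.items():
            for succ, q in strategy[cell].items():
                nxt[succ] = nxt.get(succ, 0.0) + p * q
        dist = nxt
    return dist

belief = position_distribution(STRATEGY, last_seen=0, turns=2)
```

For the state <0,2> on this corridor, the belief puts probability 0.5 on cell 0 and 0.5 on cell 2.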


Intruder’s Limited Visibility

• Now the intruder’s strategy is represented as: enter-when(T,s), where s is a state

• To determine non-dominated actions we have to compute the minimal set of states {s}

• Compute the minimal set of cells {c} to consider as patroller positions (as in the previous case)

• For every c in {c}:
  • If c is not hidden, then s = <c,0> has to be considered
  • If c is hidden, we consider the cells c’ from which the patroller can reach c without passing through any non-hidden cell
  • We consider the state s = <c’,k> such that the probability of the patroller being in c, k turns after it disappeared from c’, is maximum
  • We can find it by resorting to Markov chain properties; in the example s = <c’,3>
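The search for the state <c’,k> can be sketched by propagating the patroller's position from c’ while discarding any route that leaves the hidden area. This is a plain forward iteration over a bounded horizon, offered as an assumption about what "Markov chain properties" might amount to in practice; the map, the strategy, the bound `max_turns`, and the name `best_hidden_delay` are hypothetical.

```python
# Hypothetical map: "v" is visible, "h1" and "h2" are hidden cells.
STRATEGY = {
    "v": {"h1": 1.0},
    "h1": {"v": 0.5, "h2": 0.5},
    "h2": {"h1": 1.0},
}
HIDDEN = {"h1", "h2"}

def best_hidden_delay(strategy, c_prime, c, hidden, max_turns):
    """Return (k, prob): the number of turns k <= max_turns after leaving
    c_prime that maximizes the probability of being in hidden cell c,
    counting only routes that stay inside `hidden`."""
    dist = {c_prime: 1.0}
    best_k, best_p = 0, 0.0
    for k in range(1, max_turns + 1):
        nxt = {}
        for cell, p in dist.items():
            for succ, q in strategy[cell].items():
                if succ in hidden:  # drop routes through visible cells
                    nxt[succ] = nxt.get(succ, 0.0) + p * q
        dist = nxt
        if dist.get(c, 0.0) > best_p:
            best_k, best_p = k, dist.get(c, 0.0)
    return best_k, best_p

k, prob = best_hidden_delay(STRATEGY, "v", "h2", HIDDEN, max_turns=6)
```

In this toy map the patroller is most likely to be in h2 two turns after disappearing from v, so the state to consider would be <v,2>.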


Experimental Results

[Figures: two groups of charts, one for the model with intruder’s paths and one for the model with intruder’s limited visibility. Each group reports total time (seconds) and optimization problems, for the original model and the extended model, with the series without reduction, partially reduced, and fully reduced.]


Conclusions and Future Works

• Conclusions:
  • We presented a game theoretical model to find the best patrolling strategy in a patrolling setting, together with some extensions to capture more realistic situations

• Future Works:
  • Further extensions to refine the model of the patroller
  • Real / simulated robot implementation
  • Multi-patroller scenarios
