Adaptive Importance Sampling for Estimation in Structured Domains L.E. Ortiz and L.P. Kaelbling.

Contents

Notations
Importance Sampling
Adaptive Importance Sampling
Empirical Results

Notations

Bayesian network (BN) and influence diagram (ID) (A: decision node, U: utility node)

Probabilities of interest

$$P(O = o) = \sum_{Z} P(Z, O = o)$$

(O: variables of interest, Z: remaining ones)

Best strategy: the strategy with the highest expected utility. For evidence o (the values of the parents of the action node), it selects the action a that maximizes the associated value

$$V_o(a) = \sum_{Z} P(Z, O = o \mid A = a)\, U(Z, O = o, A = a)$$

Importance sampling is used to estimate the above summations.

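Importance Sampling

Quantity of interest:

$$G = \sum_{Z} g(Z)$$

Importance sampling distribution, Z ~ f(Z), with weights w = g / f:

$$G = \sum_{Z} f(Z)\, \frac{g(Z)}{f(Z)} = \sum_{Z} f(Z)\, w(Z)$$

Estimation of G (the samples $z^{(l)}$ are drawn from f):

$$\hat{G} = \frac{1}{N} \sum_{l=1}^{N} w(z^{(l)})$$

Cf. estimation of $E_Z[g(Z)]$.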
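The estimator on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the three-state target `g`, the uniform proposal `f_uniform`, and the function name are all hypothetical choices made for the example.

```python
import random

def importance_sampling_estimate(g, f_probs, n_samples, rng):
    """Estimate G = sum_z g(z) by averaging w(z) = g(z) / f(z) over draws z ~ f."""
    states = list(f_probs)
    probs = [f_probs[z] for z in states]
    draws = rng.choices(states, weights=probs, k=n_samples)
    return sum(g[z] / f_probs[z] for z in draws) / n_samples

# Hypothetical target g over three states (assumption for illustration).
g = {"a": 0.1, "b": 0.2, "c": 0.3}
# Hypothetical proposal f: uniform over the same states.
f_uniform = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}

rng = random.Random(0)
G_exact = sum(g.values())
G_hat = importance_sampling_estimate(g, f_uniform, 20000, rng)
```

Any proposal that puts nonzero mass wherever g is nonzero gives an unbiased estimate; the choice of f only affects the variance.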

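BN: likelihood weighting

$$g(Z) = P(Z, O = o) = \left. \prod_{i=1}^{n_1} P(Z_i \mid Pa(Z_i)) \prod_{j=1}^{n_2} P(O_j \mid Pa(O_j)) \right|_{O = o}$$

$$f(Z) = \left. \prod_{i=1}^{n_1} P(Z_i \mid Pa(Z_i)) \right|_{O = o} \quad \text{(prior)}$$

$$w(Z) = \left. \prod_{j=1}^{n_2} P(O_j \mid Pa(O_j)) \right|_{O = o} \quad \text{(likelihood)}$$

ID:

$$g(Z) = P(Z, O = o \mid A = a)\, U(Z, O = o, A = a), \qquad V_o(a) = \sum_{Z} g(Z)$$

$$f(Z) = \left. \prod_{i=1}^{n_1} P(Z_i \mid Pa(Z_i)) \right|_{O = o,\, A = a}$$

$$w(Z) = \left. \prod_{j=1}^{n_2} P(O_j \mid Pa(O_j))\, U(Z, O, A) \right|_{O = o,\, A = a}$$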
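As a rough sketch of likelihood weighting in the BN case, consider a hypothetical two-node network Z -> O with binary variables. The probability tables below are invented for illustration; the unobserved variable is sampled from its prior and each sample is weighted by the likelihood of the evidence.

```python
import random

# Hypothetical two-node network Z -> O (all numbers are assumptions).
p_z = {0: 0.7, 1: 0.3}                      # P(Z)
p_o_given_z = {0: {0: 0.9, 1: 0.1},         # P(O | Z = 0)
               1: {0: 0.2, 1: 0.8}}         # P(O | Z = 1)

def likelihood_weighting(o, n_samples, rng):
    """Estimate P(O = o) by sampling Z from the prior and averaging the likelihood."""
    total = 0.0
    for _ in range(n_samples):
        z = 1 if rng.random() < p_z[1] else 0    # z ~ f(Z) = P(Z)
        total += p_o_given_z[z][o]               # w(z) = P(O = o | Z = z)
    return total / n_samples

# Exact value by summing out Z, for comparison.
exact = sum(p_z[z] * p_o_given_z[z][1] for z in p_z)
estimate = likelihood_weighting(1, 20000, random.Random(1))
```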

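E.g.

$$G = \sum_{x_1, x_2, x_3, x_6, x_7} P(X_1, X_2, X_3, X_6, X_7, X_4 = x_4, X_5 = x_5 \mid A = a)\, U(X_1, X_2, X_3, X_6, X_7, X_4 = x_4, X_5 = x_5, A = a)$$

$$g(Z) = P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_1)\, P(X_6 \mid X_2, A = a)\, P(X_7 \mid X_3, X_6)\, P(X_4 = x_4 \mid X_2)\, P(X_5 = x_5 \mid X_2, X_3)\, U(X_7, A = a)$$

$$f(Z) = P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_1)\, P(X_6 \mid X_2, A = a)\, P(X_7 \mid X_3, X_6)$$

$$w(Z) = P(X_4 = x_4 \mid X_2)\, P(X_5 = x_5 \mid X_2, X_3)\, U(X_7, A = a)$$

G can be estimated by averaging sampled values of w.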
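The example can be made concrete with a small sketch that uses the same graph structure (X1 through X7, action A, evidence on X4 and X5). Everything numeric here is an assumption: the variables are taken to be binary, the conditional probability tables and the utility table are filled with random values, and the evidence and action values are arbitrary.

```python
import itertools
import random

rng = random.Random(7)

def cpt(n_parents):
    """Hypothetical CPT: P(X = 1 | parents) for every binary parent assignment."""
    return {pa: rng.uniform(0.1, 0.9)
            for pa in itertools.product((0, 1), repeat=n_parents)}

def prob(table, x, *parents):
    p1 = table[tuple(parents)]
    return p1 if x == 1 else 1.0 - p1

P1, P2, P3 = cpt(0), cpt(1), cpt(1)     # P(X1), P(X2|X1), P(X3|X1)
P6, P7 = cpt(2), cpt(2)                 # P(X6|X2,A), P(X7|X3,X6)
P4, P5 = cpt(1), cpt(2)                 # P(X4|X2), P(X5|X2,X3)
U = {(x7, act): rng.uniform(0, 10) for x7 in (0, 1) for act in (0, 1)}
a, x4, x5 = 1, 1, 0                     # action and evidence (arbitrary values)

def f(x1, x2, x3, x6, x7):
    """Sampling distribution: the factors of the unobserved variables."""
    return (prob(P1, x1) * prob(P2, x2, x1) * prob(P3, x3, x1)
            * prob(P6, x6, x2, a) * prob(P7, x7, x3, x6))

def w(x1, x2, x3, x6, x7):
    """Weight: evidence likelihoods times the utility."""
    return prob(P4, x4, x2) * prob(P5, x5, x2, x3) * U[(x7, a)]

# Exact G by enumerating the five unobserved variables.
G_exact = sum(f(*z) * w(*z) for z in itertools.product((0, 1), repeat=5))

def draw(table, *parents):
    return 1 if rng.random() < table[tuple(parents)] else 0

def sample_weight():
    """Draw one sample from f in topological order and return its weight."""
    x1 = draw(P1)
    x2 = draw(P2, x1)
    x3 = draw(P3, x1)
    x6 = draw(P6, x2, a)
    x7 = draw(P7, x3, x6)
    return w(x1, x2, x3, x6, x7)

N = 20000
G_hat = sum(sample_weight() for _ in range(N)) / N
```

Enumeration is only feasible here because the toy network has 32 joint states; the sampled estimate is what scales to larger networks.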

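Variance of the weights:

$$\mathrm{Var}(w(Z)) = \sum_{Z} f(Z)\, w(Z)^2 - G^2$$

Minimum variance importance sampling distribution (obtained by taking the derivative of the variance above):

$$f^*(Z) = \frac{g(Z)}{\sum_{Z'} g(Z')}$$

The weights have zero variance in this case (w = G).

f(Z) must have a "fat tail":

$$\mathrm{Var}(w(Z)) \to \infty \quad \text{as} \quad f(Z) \to 0$$

for at least one value of Z with g(Z) ≠ 0.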
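A short numeric check of these statements, on a hypothetical three-state target (all values are assumptions chosen for illustration): the variance is computed exactly from the formula for a uniform proposal, for the optimal proposal proportional to g, and for a proposal that puts very little mass on the state where g is largest.

```python
def weight_variance(g, f):
    """Var(w(Z)) = sum_z f(z) w(z)^2 - G^2 with w = g / f, computed exactly."""
    G = sum(g.values())
    return sum(f[z] * (g[z] / f[z]) ** 2 for z in g) - G ** 2

# Hypothetical nonnegative target g (assumption).
g = {"a": 0.1, "b": 0.2, "c": 0.3}
G = sum(g.values())

f_uniform = {z: 1 / 3 for z in g}
f_star = {z: g[z] / G for z in g}                 # proportional to g
f_thin = {"a": 0.495, "b": 0.495, "c": 0.01}      # thin tail where g is largest

var_uniform = weight_variance(g, f_uniform)
var_star = weight_variance(g, f_star)
var_thin = weight_variance(g, f_thin)
```

With these numbers the optimal proposal gives a variance of zero up to rounding, while the thin-tailed proposal inflates it by more than two orders of magnitude relative to the uniform one.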

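Adaptive Importance Sampling

Parameterizing the importance sampling distribution (tabularizing):

$$f(Z \mid \theta) = \prod_{i=1}^{n} \prod_{j} \prod_{k} \theta_{ijk}^{\, I(Z_i = k,\, Pa(Z_i) = j \,\mid\, Z)}$$

(j: configurations of Pa(Z_i), k: values of Z_i)

Update rules based on gradient descent:

$$\theta^{(t+1)} = \theta^{(t)} - \alpha(t)\, \nabla^{p} e(\theta^{(t)})$$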
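The tabular parameterization can be illustrated on a hypothetical two-variable chain Z1 -> Z2 with binary variables. The nested-list layout `theta[i][j][k]` and the helper `parent_config` are assumptions made for this sketch; the product over all (i, j, k) with an indicator exponent is written out literally to mirror the formula, even though a direct table lookup would be more efficient.

```python
import itertools

# theta[i][j][k] = f(Z_i = k | Pa(Z_i) = j); values are hypothetical.
# Z1 has no parents, so it has a single parent configuration j = 0.
theta = [
    [[0.6, 0.4]],                   # i = 0: Z1
    [[0.9, 0.1], [0.3, 0.7]],       # i = 1: Z2, rows indexed by the value of Z1
]

def parent_config(i, z):
    """Index j of the parent configuration of variable i under assignment z."""
    return 0 if i == 0 else z[0]

def f(z, theta):
    """Product over i, j, k of theta_ijk raised to the indicator of (Z_i = k, Pa(Z_i) = j)."""
    value = 1.0
    for i, table in enumerate(theta):
        for j, row in enumerate(table):
            for k, t in enumerate(row):
                indicator = 1 if (z[i] == k and parent_config(i, z) == j) else 0
                value *= t ** indicator
    return value

assignments = list(itertools.product((0, 1), repeat=2))
total = sum(f(z, theta) for z in assignments)
```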

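Three different forms of gradient:

minimize variance directly
minimize distance between the current sampling distribution and the approximate optimal sampling distribution
minimize distance between the current sampling distribution and the empirical optimal distribution

$$\nabla e(\theta)_{ijk} = -\sum_{Z} f(Z \mid \theta)\, \frac{I(Z_i = k,\, Pa(Z_i) = j \mid Z)}{\theta_{ijk}}\, \delta(Z, \theta) = -E_f\!\left[ \frac{I(Z_i = k,\, Pa(Z_i) = j \mid Z)}{\theta_{ijk}}\, \delta(Z, \theta) \right]$$

$$\hat{\nabla} e(\theta^{(t)})_{ijk} = -\frac{1}{N(t)} \sum_{l=1}^{N(t)} \frac{I(Z_i = k,\, Pa(Z_i) = j \mid Z = z^{(t,l)})}{\theta^{(t)}_{ijk}}\, \delta(z^{(t,l)}, \theta^{(t)})$$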
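The following sketch shows one way a sampled gradient could drive the update, for the simplest case of a single discrete variable with no parents and the variance of the weights as the error function. Several things are assumptions of this example rather than details from the slides: the target `g`, the constant step size, subtracting the mean of the gradient so the parameters keep summing to one, and the small floor that keeps every probability positive.

```python
import random

def weight_variance(g, theta):
    """Exact Var(w(Z)) = sum_k g_k^2 / theta_k - G^2 for a single discrete variable."""
    G = sum(g)
    return sum(gk * gk / tk for gk, tk in zip(g, theta)) - G * G

def adapt(g, theta, n_iters, n_samples, step, rng, floor=1e-3):
    """Gradient descent on the variance using gradients estimated from samples."""
    K = len(g)
    theta = list(theta)
    for _ in range(n_iters):
        draws = rng.choices(range(K), weights=theta, k=n_samples)
        # Sampled gradient of the variance: -(1/N) sum_l I(z_l = k) w(z_l)^2 / theta_k
        grad = [0.0] * K
        for z in draws:
            w = g[z] / theta[z]
            grad[z] -= (w * w / theta[z]) / n_samples
        # Assumption: remove the mean so the step stays on sum(theta) = 1.
        mean = sum(grad) / K
        theta = [max(t - step * (d - mean), floor) for t, d in zip(theta, grad)]
        total = sum(theta)
        theta = [t / total for t in theta]
    return theta

g = [0.05, 0.1, 0.25, 0.6]                 # hypothetical target, G = 1
theta0 = [0.25, 0.25, 0.25, 0.25]          # start from a uniform proposal
theta_T = adapt(g, theta0, n_iters=300, n_samples=1000, step=0.01,
                rng=random.Random(5))
var_before = weight_variance(g, theta0)
var_after = weight_variance(g, theta_T)
```

In this toy setting the adapted parameters move toward g / G, which is the zero-variance proposal, so `var_after` ends up far below `var_before`.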

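Minimizing variance:

$$e_{Var}(\theta) = \mathrm{Var}(w(Z \mid \theta)) = \sum_{Z} f(Z \mid \theta)\, w(Z \mid \theta)^2 - G^2$$

$$\delta_{Var}(Z, \theta) = w(Z \mid \theta)^2$$

via approximate optimal distribution:

$$\hat{f}^{(t)}(Z) = g(Z) / \hat{G}^{(t)}$$

$$e_{L_2}(\theta) = \frac{1}{2} \sum_{Z} \left( f(Z \mid \theta) - f^*(Z) \right)^2$$

$$\delta_{L_2}(Z, \theta) = f^*(Z) - f(Z \mid \theta) \approx f(z^{(t,l)} \mid \theta^{(t)}) \left( w(z^{(t,l)} \mid \theta^{(t)}) / \hat{G}^{(t)} - 1 \right)$$

$$e_{KL_1}(\theta) = \sum_{Z} f^*(Z) \log\!\left( f^*(Z) / f(Z \mid \theta) \right)$$

$$\delta_{KL_1}(Z, \theta) = f^*(Z) / f(Z \mid \theta) \approx w(z^{(t,l)} \mid \theta^{(t)}) / \hat{G}^{(t)}$$

$$e_{KL_2}(\theta) = \sum_{Z} f(Z \mid \theta) \log\!\left( f(Z \mid \theta) / f^*(Z) \right)$$

$$\delta_{KL_2}(Z, \theta) = \log\!\left( f^*(Z) / f(Z \mid \theta) \right) - 1 \approx \log\!\left( w(z^{(t,l)} \mid \theta^{(t)}) / \hat{G}^{(t)} \right) - 1$$

$$e_{KL_s} = \frac{1}{2} e_{KL_1} + \frac{1}{2} e_{KL_2}$$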
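To compare how the four per-sample terms react to a weight, they can be written as plain functions of the weight w, the sampling probability f of the drawn sample, and the current estimate G_hat. This is a sketch under assumed forms: the term is called `delta` here, with delta_Var = w^2, delta_L2 = f (w / G_hat - 1), delta_KL1 = w / G_hat, and delta_KL2 = log(w / G_hat) - 1. The function names and the common signature are choices made for the example.

```python
import math

def delta_var(w, f, G_hat):
    """Variance criterion: grows with the square of the weight."""
    return w ** 2

def delta_l2(w, f, G_hat):
    """L2 criterion: linear in the weight, scaled by the sampling probability."""
    return f * (w / G_hat - 1.0)

def delta_kl1(w, f, G_hat):
    """KL criterion (optimal relative to current): linear in the weight."""
    return w / G_hat

def delta_kl2(w, f, G_hat):
    """KL criterion (current relative to optimal): logarithmic in the weight."""
    return math.log(w / G_hat) - 1.0

# Hypothetical sample with sampling probability 0.25 and current estimate G_hat = 1.
for w in (0.5, 1.0, 2.0):
    print(w, delta_var(w, 0.25, 1.0), delta_l2(w, 0.25, 1.0),
          delta_kl1(w, 0.25, 1.0), delta_kl2(w, 0.25, 1.0))
```

The L2 term changes sign exactly at w = G_hat, while the variance and first KL terms are always nonnegative for nonnegative weights.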

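via parameterized empirical distribution:

$$\hat{\theta}^{(t)}_{ijk} = \frac{\sum_{l=1}^{N(t)} I(Z_i = k,\, Pa(Z_i) = j \mid z^{(t,l)})\, w(z^{(t,l)} \mid \theta^{(t)})}{\sum_{l=1}^{N(t)} I(Pa(Z_i) = j \mid z^{(t,l)})\, w(z^{(t,l)} \mid \theta^{(t)})}$$

($\hat{\theta}^{(t)}_{ijk} = \theta^{(t)}_{ijk}$ if RHS = 0)

$$\nabla e(\theta)_{ijk} = -\delta(\hat{\theta}_{ijk}, \theta_{ijk}), \qquad \hat{\nabla} e(\theta^{(t)})_{ijk} = -\delta(\hat{\theta}^{(t)}_{ijk}, \theta^{(t)}_{ijk})$$

$$e_{L_2}(\theta) = \frac{1}{2} \sum_{i,j,k} \left( \theta_{ijk} - \hat{\theta}_{ijk} \right)^2, \qquad \delta_{L_2}(\hat{\theta}_{ijk}, \theta_{ijk}) = \hat{\theta}_{ijk} - \theta_{ijk}$$

$$e_{KL_1}(\theta) = \sum_{i,j,k} \hat{\theta}_{ijk} \log\!\left( \hat{\theta}_{ijk} / \theta_{ijk} \right), \qquad \delta_{KL_1}(\hat{\theta}_{ijk}, \theta_{ijk}) = \hat{\theta}_{ijk} / \theta_{ijk}$$

$$e_{KL_2}(\theta) = \sum_{i,j,k} \theta_{ijk} \log\!\left( \theta_{ijk} / \hat{\theta}_{ijk} \right), \qquad \delta_{KL_2}(\hat{\theta}_{ijk}, \theta_{ijk}) = \log\!\left( \hat{\theta}_{ijk} / \theta_{ijk} \right) - 1$$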
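For a single discrete variable with no parents, the weighted empirical estimate reduces to the share of total weight that falls on each value. The sketch below assumes that simplified case; the target `g`, the uniform starting parameters, and the fallback to the current parameters when every weight is zero are illustrative choices.

```python
import random

def empirical_theta(draws, weights, theta):
    """theta_hat_k = sum_l I(z_l = k) w_l / sum_l w_l (single variable, no parents)."""
    num = [0.0] * len(theta)
    for z, w in zip(draws, weights):
        num[z] += w
    den = sum(num)
    if den == 0:
        # Assumed fallback: keep the current parameters when no weight is available.
        return list(theta)
    return [n / den for n in num]

g = [0.05, 0.1, 0.25, 0.6]          # hypothetical target, G = 1
theta = [0.25] * 4                  # current sampling distribution
rng = random.Random(3)
draws = rng.choices(range(4), weights=theta, k=20000)
weights = [g[z] / theta[z] for z in draws]
theta_hat = empirical_theta(draws, weights, theta)
```

Because each value k is drawn with probability theta_k and carries weight g_k / theta_k, the estimate concentrates around g_k / G, the optimal sampling distribution for this toy target.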

Remarks

The δ's are proportional to the square, a linear function, or the logarithm of the weights.

$\delta_{L_2}$ is positive if w/G > 1 (underestimation of g: the sample has f < g/G).

The size and sign of δ are related to underestimation or overestimation of g.

Empirical Results

Problem: calculate $V_{MP(t)}(A)$ for A = 2, MP(t) = 1 in the computer mouse problem.

Evaluation: MSE between the true value and the estimate from each sampling method.

Var and L2 are better than LW (likelihood weighting, the traditional method).

L2 is more stable than the other methods.
