© Enn Tyugu

Algorithms of Artificial Intelligence

Lecture 4: Search

E. Tyugu


Backtracking on and-or-tree

• Let us have the following specific functions describing a search tree:

succ(p) - set of immediate descendants of the node p

terminal(p) - true iff p is a terminal node.

• We assume that the root of a search tree is an and-node.

• The following algorithm produces a subtree of an and-or tree given by its root p. This subtree is empty if the search is not successful. The algorithm starts with an empty state given as its second parameter.


Backtracking on and-or-tree

A.2.17: ANDOR(p, state) =
  if terminal(p) and good(p) then state := p; success(state)
  elif terminal(p) and not good(p) then failure()
  else
    for x ∈ succ(p) do
      L: for y ∈ succ(x) do
           if empty(ANDOR(y, state)) then continue
           else state := (state, ANDOR(y, state)); break L
           fi
         od;
         failure()          -- reached only if no alternative y of x succeeded
    od;
    success(state)
  fi
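The control flow of A.2.17 can be sketched in Python. The tree encoding (nested ("and", ...) / ("or", ...) tuples) and the good predicate are assumptions made for this sketch; instead of the success()/failure() signals, the function returns the solved subtree or None for an empty result.

```python
# A sketch of A.2.17: backtracking on an and-or tree. An internal node is
# ("and", [subgoals]) whose subgoals are ("or", [alternatives]) nodes;
# leaves are plain values tested by good(). This encoding is an assumption.

def terminal(p):
    return not isinstance(p, tuple)

def andor(p, good):
    """Return a solved subtree of the and-or tree rooted at p, or None."""
    if terminal(p):
        return p if good(p) else None
    _, subgoals = p            # p is an and-node: every subgoal must be solved
    solution = []
    for subgoal in subgoals:   # each subgoal is an or-node: one alternative suffices
        _, alternatives = subgoal
        for alt in alternatives:
            sub = andor(alt, good)
            if sub is not None:        # first alternative that works
                solution.append(sub)
                break
        else:
            return None                # no alternative worked: backtrack
    return solution

# Example: both subgoals must be solved; leaves are numbers, good = even.
tree = ("and", [("or", [1, 2, 3]), ("or", [5, 4])])
print(andor(tree, lambda v: v % 2 == 0))   # [2, 4]
```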


Dependency-directed backtracking

[Figure: a search tree whose nodes, numbered 0–14 in order of traversal, assign the alternatives A, B, C, D, E, F level by level. The recorded contradictions are A & C and B & E; when one of them appears, the search jumps back directly to the assignment that caused it instead of undoing the intermediate choices.]


Branch-and-bound search

select(M) -- selects a set to be split from the set of sets M
split(A) -- splits the set A into a number of smaller sets
prune(M) -- finds the dominated sets in M and prunes them out
card(M) -- gives the maximum of the cardinalities of the sets in M
bestin(M) -- finds the best element in M

A.2.18: M := {X};
while card(M) > 1 do
  A := select(M);
  M := (M \ {A}) ∪ split(A);
  M := prune(M)
od;
bestin(M)


Branch-and-bound search continued

A.2.18-a:
lb(X) – lower bound of the values of elements in X
ub(X) – upper bound of the values of elements in X

prune(M):
  T := selectFrom(M);
  for X ∈ M do
    if lb(X) > ub(T) then T := X fi
  od;
  for X ∈ M do
    if lb(T) > ub(X) then M := M \ {X} fi
  od
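Taken together, A.2.18 and the lb/ub pruning of A.2.18-a can be sketched in Python. The concrete problem (maximizing a concave f over an integer interval) and the bound definitions are illustrative assumptions; for a concave f, evaluating f at the point nearest the peak gives a valid upper bound, and the endpoints give an attained lower bound.

```python
# A sketch of A.2.18 with the pruning of A.2.18-a, maximizing a concave
# function f over integer intervals (a, b). The bounds are assumptions
# valid for this concave f.

def f(x):
    return x * (10 - x)            # concave, maximum at x = 5

def lb(s):                         # a value actually attained in s
    a, b = s
    return max(f(a), f(b))

def ub(s):                         # upper bound: f at the point nearest the peak
    a, b = s
    return f(min(max(5, a), b))

def split(s):                      # split the interval into two halves
    a, b = s
    mid = (a + b) // 2
    return [(a, mid), (mid + 1, b)]

def prune(M):
    best = max(M, key=lb)          # T in A.2.18-a
    return [s for s in M if ub(s) >= lb(best)]

def branch_and_bound(X):
    M = [X]
    while any(a < b for a, b in M):    # card(M) > 1 in the abstract scheme
        A = max((s for s in M if s[0] < s[1]), key=ub)   # select()
        M.remove(A)
        M.extend(split(A))
        M = prune(M)
    return max(lb(s) for s in M)       # bestin(M)

print(branch_and_bound((0, 10)))   # 25, attained at x = 5
```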


Stochastic branch-and-bound search

It may be difficult to estimate the maximal and minimal values of a goal function f on a set. One can use the following idea:

instead of a deterministic search, perform a stochastic search in which the probability of testing a point is higher in those areas of the search space where the values of f found so far are higher.

This gives a good optimization algorithm for nonlinear multimodal optimization problems (see Strongin).


Stochastic branch-and-bound search

distr - probability distribution for selecting a point in the search space
random(distr) - generator of random points with distribution distr in the search space
modify(distr, x, f(x)) - procedure of adjustment of the distribution that increases the probability of selecting a point with a higher value of f

A.2.19: distr := even;
x := random(distr);
while not good(x) do
  x := random(distr);
  distr := modify(distr, x, f(x))
od
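A minimal sketch of A.2.19, assuming a concrete model of distr: a Gaussian whose mean drifts toward the best point found so far and whose spread shrinks. This particular modify rule (and the extra best-so-far bookkeeping it needs) is an assumption made for the sketch, not part of the lecture's scheme.

```python
# A sketch of A.2.19 on a one-dimensional search space. The adaptive
# Gaussian "distribution" and its update rule are illustrative assumptions.
import random

def f(x):
    return -(x - 3.0) ** 2             # maximum at x = 3

def sample(distr):
    mean, sigma = distr
    return random.gauss(mean, sigma)

def modify(distr, x, fx, best):
    mean, sigma = distr
    if fx > best[1]:
        best[0], best[1] = x, fx       # remember the best point so far
        return (x, sigma * 0.95)       # concentrate the search around it
    return (mean, sigma * 0.999)       # otherwise shrink only slightly

def stochastic_search(good, steps=10000):
    random.seed(0)                     # deterministic demo run
    distr = (0.0, 5.0)                 # "even": a wide initial distribution
    best = [0.0, f(0.0)]
    x = sample(distr)
    while not good(x) and steps > 0:
        x = sample(distr)
        distr = modify(distr, x, f(x), best)
        steps -= 1
    return x

x = stochastic_search(lambda x: f(x) > -1e-2)   # stop when |x - 3| < 0.1
print(abs(x - 3.0) < 0.1)
```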


Binary search as branch-and-bound

The functions left(p), right(p) and val(p) are for selecting the left or right subtree and taking the value of the node p (of the root of the tree).

A.2.19: binsearch(x, p) =
  if empty(p) then failure()
  elif val(p) = x then success()
  elif val(p) < x then binsearch(x, right(p))
  else binsearch(x, left(p))
  fi
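A minimal rendering of binsearch in Python, with nodes as (value, left, right) tuples and None for the empty tree; the function returns True/False instead of success()/failure().

```python
# A sketch of the binsearch algorithm on a binary search tree encoded as
# (value, left, right) tuples; None plays the role of the empty tree.

def binsearch(x, p):
    if p is None:
        return False                  # failure: x is not in the tree
    val, left, right = p
    if val == x:
        return True                   # success
    if val < x:
        return binsearch(x, right)    # x can only be in the right subtree
    return binsearch(x, left)         # x can only be in the left subtree

#        5
#       / \
#      2   8
tree = (5, (2, None, None), (8, None, None))
print(binsearch(8, tree))   # True
print(binsearch(3, tree))   # False
```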


Alpha-beta pruning

This algorithm is for heuristic search of the best moves on game trees of two-player games, assuming that both players apply the best tactics, i.e. they make the best moves possible on the basis of the available knowledge. It is a concretisation of branch-and-bound search.

This algorithm gives the same result as the minimax procedure for two-player games, and it underlies many game-playing programs for chess and other games.


Example

[Figure: a game tree with alternating levels labelled maximize, minimize, maximize; the nodes are numbered 1, 2, 11, 12, 21, 22, 111, ..., and the leaf values are 4 1 8 5 1 2 7 5.]


Example of alpha-beta pruning

[Figure: the same game tree evaluated bottom-up. Maximizing over the leaves 111–222 with values 4 1 8 5 1 2 7 5 gives the values 4, 8, 2, 7 at the nodes 11, 12, 21, 22; minimizing gives 4 and 2 at the nodes 1 and 2; the root value is 4. Alpha-beta pruning skips the branches that cannot change this result.]


Alpha-beta pruning

alpha - minimal value of the end result that can be obtained for certain by the maximizing player;
beta - maximal value of the end result that must be given away for certain by the minimizing player.

A.2.20: alphabeta(p, alpha, beta) =
  if empty(succ(p)) then return(f(p))
  else
    m := alpha;
    for x ∈ succ(p) do
      t := -alphabeta(x, -beta, -m);
      if t > m then m := t fi;
      if m ≥ beta then break fi       -- cutoff: the rest of succ(p) is pruned
    od;
    if m > beta then return(beta) else return(m) fi
  fi
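A.2.20 can be sketched in Python in the same negamax form. The tree encoding (nested lists with numeric leaves) and the color argument are assumptions made for the sketch: leaves store static values from the maximizing player's point of view, and color converts them to the side to move, which is one way to read f(p).

```python
# A sketch of A.2.20 in negamax form on a tree of nested lists. The
# `color` convention for reading leaf values is an assumption.
INF = float("inf")

def alphabeta(p, alpha, beta, color=1):
    if not isinstance(p, list):        # empty(succ(p)): a leaf
        return color * p               # f(p) for the side to move
    m = alpha
    for x in p:
        t = -alphabeta(x, -beta, -m, -color)
        if t > m:
            m = t
        if m >= beta:                  # cutoff: the opponent avoids this branch
            return beta
    return m

# The game tree of the example slide: leaf values 4 1 8 5 1 2 7 5.
tree = [[[4, 1], [8, 5]], [[1, 2], [7, 5]]]
print(alphabeta(tree, -INF, INF))   # 4, the same value minimax computes
```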


A* algorithm

The problem it solves is finding a shortest path between two nodes of a graph where, besides the lengths of the arcs of the graph, an estimate of the lower bound of the distance to the end node is given for every node of the graph. We shall use the following notations:

s -- given initial node of the path
g -- given end node of the path
p(n,i) -- length of the arc from the node n to the node i
l(n) -- length of the shortest path from the initial node s to the node n
h(n) -- length of the shortest path from the node n to the end node g
H(n) -- estimate of the lower bound of h(n)
Open -- set of nodes to be checked
succ(n) -- successor nodes of n


A* algorithm continued

A.2.21: l(s) := 0; Open := {s};
while not empty(Open) do
  x := select(Open); n := x;
  for i ∈ Open \ {x} do
    if l(i) + H(i) < l(n) + H(n) then n := i fi
  od;
  if n = g then success() else
    Open := (Open ∪ succ(n)) \ {n};
    for i ∈ succ(n) do
      l(i) := l(n) + p(n,i)
    od
  fi
od; failure()
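A.2.21 selects from Open the node minimizing l(n) + H(n); a common realization of that selection is a priority queue, as in this Python sketch. The graph and heuristic below are illustrative assumptions.

```python
# A sketch of A.2.21 using a priority queue ordered by l(n) + H(n).
# The example graph and heuristic table are assumptions for illustration.
import heapq

def astar(graph, H, s, g):
    """graph[n] = list of (neighbour, arc length); H = lower-bound estimates."""
    l = {s: 0}                         # best known distance from s
    open_heap = [(H[s], s)]            # entries ordered by l(n) + H(n)
    while open_heap:
        fn, n = heapq.heappop(open_heap)
        if n == g:
            return l[g]                # success: shortest path length
        if fn > l[n] + H[n]:
            continue                   # stale entry: l[n] improved meanwhile
        for i, p_ni in graph[n]:
            if i not in l or l[n] + p_ni < l[i]:
                l[i] = l[n] + p_ni
                heapq.heappush(open_heap, (l[i] + H[i], i))
    return None                        # failure: g is unreachable

graph = {
    "s": [("a", 1), ("b", 4)],
    "a": [("b", 2), ("g", 5)],
    "b": [("g", 1)],
    "g": [],
}
H = {"s": 4, "a": 3, "b": 1, "g": 0}   # admissible: never above the true distance
print(astar(graph, H, "s", "g"))       # 4  (the path s -> a -> b -> g)
```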


A* continued

The A* algorithm has some interesting properties:

1. Under the condition that H(n) ≤ h(n) for each node n of the graph, the algorithm gives a precise answer -- it finds an actual shortest path from s to g.

2. Among the algorithms that use the same heuristic estimate H, this is the fastest algorithm that solves the given problem precisely.


Unification

A substitution is a tuple of variable-expression pairs, for example:

((x,A), (y,F(B)), (z,W)).

We shall additionally require that no variable of a substitution can appear in an expression of the same substitution. A substitution describes a transformation of expressions where the variables of an expression are replaced by their corresponding expressions. Application of a substitution s to an expression E is denoted by E°s.

Example: P(x,x,y,v)°((x,A),(y,F(B)),(z,W)) = P(A,A,F(B),v).


Unification continued

Substitutions s and t are independent if their replaceable variables are different. In this case we can build a composition s*t of the substitutions s and t, which is a substitution consisting of the pairs of both s and t.

If E, ..., F are expressions, s is a substitution, and E°s = ... = F°s, then s is a unifier of E, ..., F. The most general unifier of E, ..., F, denoted by mgu(E, ..., F), is the unifier s of E, ..., F with the property that for any unifier t of E, ..., F there exists a substitution r such that E°s*r = E°t, ..., F°s*r = F°t. This unifier s is in a sense the least unifier of E, ..., F. If mgu(E, ..., F) exists for the expressions E, ..., F, then it is unique. This follows immediately from the definition of mgu.


Examples

Unification, i.e. finding the mgu, can be directly used for answering queries to knowledge bases. For example, if we have a knowledge base about people which contains the facts:

married(Peter, Ann)
married(Jan, Linda)

one can ask a question about Linda's marriage by asking to unify the formula married(x, Linda) with the facts of the knowledge base. The answer ((x,Jan)) shows that Linda is married to Jan.


Unification algorithm

Let us have the functions:

var(x) -- true, if x is a variable
const(x) -- true, if x is a constant
MguVar(x,y) -- gives the mgu of the variable x and the expression y (not a variable), if it exists, and fails if x occurs in y
includes(x,y) -- true, if y includes x

Then the unification algorithm is as follows.


Unification algorithm

A.2.22: Mgu(x,y) =
  if x = y then success(())
  elif var(x) then success(MguVar(x,y))
  elif var(y) then success(MguVar(y,x))
  elif (const(x) ∨ const(y)) & x ≠ y ∨ length(x) ≠ length(y) then failure()
  else
    g := ();
    for i := 1 to length(x) do
      s := Mgu(part(x,i), part(y,i));
      if s = failure then failure() fi;
      g := compose(g,s);
      x := subst(x,g);
      y := subst(y,g)
    od;
    success(g)
  fi

MguVar(x,y) = if includes(x,y) then failure()
              else ((x,y)) fi
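A.2.22 can be sketched in Python. The term encoding is an assumption made for the sketch: compound expressions are tuples whose first element is the functor, variables are single lowercase letters, and everything else is a constant; failure is signalled by None instead of failure().

```python
# A sketch of A.2.22. Variables are single lowercase letters ("x", "y", ...);
# compound terms are tuples like ("married", "x", "Linda"). These
# conventions are assumptions for the sketch.

def is_var(x):
    return isinstance(x, str) and len(x) == 1 and x.islower()

def subst(x, g):                      # apply the substitution g to the term x
    if is_var(x):
        return subst(g[x], g) if x in g else x
    if isinstance(x, tuple):
        return tuple(subst(part, g) for part in x)
    return x

def includes(x, y):                   # occurs check: does y contain x?
    if x == y:
        return True
    return isinstance(y, tuple) and any(includes(x, p) for p in y)

def mgu(x, y, g=None):
    g = dict(g or {})
    x, y = subst(x, g), subst(y, g)
    if x == y:
        return g
    if is_var(x):                     # MguVar with occurs check
        return None if includes(x, y) else {**g, x: y}
    if is_var(y):
        return None if includes(y, x) else {**g, y: x}
    if not (isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y)):
        return None                   # clashing constants or arities
    for px, py in zip(x, y):          # unify part by part, composing g
        g = mgu(px, py, g)
        if g is None:
            return None
    return g

# The query from the Examples slide: married(x, Linda) vs married(Jan, Linda).
print(mgu(("married", "x", "Linda"), ("married", "Jan", "Linda")))  # {'x': 'Jan'}
```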


Dictionary search

Find a given word in a dictionary where the words are ordered lexicographically. The frequency of occurrence of the i-th letter of the alphabet is p_i. The total frequency of occurrence of the letters preceding the i-th letter of the alphabet is

q_i = Σ_{j=1}^{i-1} p_j.

If the dictionary contains l pages, then an approximate value of the page number h(w) of the word w consisting of the letters with the alphabet numbers i1, i2, ..., in can be computed from the frequencies of these letters as follows:

h(w) = 1 + l·(q_{i1} + p_{i1}·(q_{i2} + ... + p_{i_{n-1}}·q_{i_n}) ... ).
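A worked instance of the page-number formula, evaluated by Horner's scheme from the innermost term outwards. The three-letter alphabet and its frequencies are invented for illustration.

```python
# A worked instance of h(w) = 1 + l*(q_{i1} + p_{i1}*(q_{i2} + ...)).
# The alphabet and frequencies below are made-up example values.

p = {"a": 0.5, "b": 0.3, "c": 0.2}   # letter frequencies, summing to 1
letters = ["a", "b", "c"]            # the lexicographic order

def q(letter):                       # total frequency of the preceding letters
    i = letters.index(letter)
    return sum(p[l] for l in letters[:i])

def h(word, pages):
    """Estimated page number of `word` in a dictionary of `pages` pages."""
    acc = 0.0
    for letter in reversed(word):    # Horner evaluation of the nested formula
        acc = q(letter) + p[letter] * acc
    return 1 + pages * acc

# Half of all words start with "a", so "ba..." lands near the middle.
print(round(h("ba", 100)))   # 51
```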


Dictionary search

• Let any word w be selected on a given page k and its expected page number h(w) be computed according to the formula given above. This page number is the value of a new function denoted by f(k). In order to find the page number j of the word v, we have to solve the equation f(j) = h(v) with respect to j (h(v) is constant for the word v).

• We use the following variables:
  – j - currently expected page number of the word v (j is equal to h(v) at the beginning of the search)
  – a - number of the first page to search
  – b - number of the last page to search

• The algorithm recursively computes a better approximation to j until the right value is found. Here good(j) means that the given word is on the page j.


Dictionary search

A.2.23: dict(j, a, b, v) =
  if good(j) then j
  elif j = a then dict(j+1, a+1, b, v)
  elif j = b then dict(j-1, a, b-1, v)
  else s := f(j);
    if s > h(v) then dict(j - (j-a)·(s-h(v))/(s-f(a)), a, j, v)
    else dict(j + (b-j)·(h(v)-s)/(f(b)-s), j, b, v)
    fi
  fi
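The same interpolation step as in A.2.23 can be sketched as interpolation search in a sorted array of distinct keys: the array plays the role of f, the next probe is estimated proportionally, and the j = a / j = b cases guarantee progress at the boundaries. The iterative form and the clamping of the probe are assumptions made for the sketch.

```python
# A sketch of the interpolation idea of A.2.23 as interpolation search
# in a sorted array of distinct integers.

def interp_search(arr, v):
    a, b = 0, len(arr) - 1
    j = (a + b) // 2
    while a <= b:
        if arr[j] == v:
            return j                       # good(j)
        if j == a:
            a += 1; j += 1; continue       # dict(j+1, a+1, b, v)
        if j == b:
            b -= 1; j -= 1; continue       # dict(j-1, a, b-1, v)
        if arr[j] > v:                     # probe proportionally to the left
            b = j
            j = max(a, j - max(1, (j - a) * (arr[j] - v) // (arr[j] - arr[a])))
        else:                              # probe proportionally to the right
            a = j
            j = min(b, j + max(1, (b - j) * (v - arr[j]) // (arr[b] - arr[j])))
    return None                            # v is not present

arr = [2, 3, 5, 8, 13, 21, 34, 55]
print(interp_search(arr, 21))   # 5
print(interp_search(arr, 4))    # None
```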


Viterbi algorithm

This is an algorithm of so-called compound decision theory that uses the dynamic programming paradigm. It is a word recognition algorithm, specifically for finding the most likely word for a sequence of patterns x1, ..., xm which represent symbols (letters). The goal is to choose the assignment w1, ..., wm of letters from a given alphabet L which has the maximum probability over all r^m possible assignments (r is the number of letters of the alphabet L).

Assuming the first-order Markov property of the sequence of patterns, i.e. that the probability of appearance of a pattern in the sequence depends only on the preceding pattern, we get an algorithm with time complexity O(m·r²) instead of O(r^m). An improved version of the algorithm, called the DVA (Dictionary Viterbi Algorithm), also takes a dictionary into account: only words from the dictionary are accepted.


Viterbi algorithm

p1(x,w) - probability of the letter w with the pattern x being the first letter in a word
p(x,w,u) - probability of the letter w with the pattern x following the letter u in a word
p_{w,i} – probability of the best prefix of length i ending with the letter w
pref_{w,i} – best prefix of length i ending with the letter w
selectFrom(L) - produces an element of L

A.2.24:
for w ∈ L do p_{w,1} := p1(x_1, w); pref_{w,1} := (w) od;
for i = 2 to m do
  for w ∈ L do
    zz := selectFrom(L); p := p_{zz,i-1} · p(x_i, w, zz);
    for z ∈ L do
      if p_{z,i-1} · p(x_i, w, z) > p then zz := z; p := p_{z,i-1} · p(x_i, w, z) fi
    od;
    p_{w,i} := p; pref_{w,i} := (pref_{zz,i-1}, w)
  od
od
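A.2.24 can be sketched in Python. The pattern-letter probability tables p1 and p below are made-up assumptions standing in for a real recognizer; the brute-force check at the end confirms that the O(m·r²) dynamic program agrees with maximizing over all r^m assignments.

```python
# A sketch of A.2.24. The probability functions are invented stubs playing
# the roles of p1(x, w) and p(x, w, u) described above.
import itertools

L = ["a", "b"]                       # the alphabet
X = [0, 1, 1, 0]                     # the observed pattern sequence

def p1(x, w):                        # made-up: how well pattern x fits letter w
    return 0.9 if (x == 0) == (w == "a") else 0.1

def p(x, w, u):                      # made-up dependence on the previous letter
    bonus = 1.2 if w != u else 0.8
    return p1(x, w) * bonus

def viterbi(X, L):
    prob = {w: p1(X[0], w) for w in L}          # p_{w,1}
    pref = {w: (w,) for w in L}                 # pref_{w,1}
    for x in X[1:]:
        nprob, npref = {}, {}
        for w in L:                             # best predecessor z for w
            z = max(L, key=lambda z: prob[z] * p(x, w, z))
            nprob[w] = prob[z] * p(x, w, z)
            npref[w] = pref[z] + (w,)
        prob, pref = nprob, npref
    best = max(L, key=lambda w: prob[w])
    return pref[best]

def brute(X, L):                     # exhaustive check over all r^m words
    def score(word):
        s = p1(X[0], word[0])
        for x, u, w in zip(X[1:], word, word[1:]):
            s *= p(x, w, u)
        return s
    return max(itertools.product(L, repeat=len(X)), key=score)

print(viterbi(X, L))                         # ('a', 'b', 'b', 'a')
print(tuple(brute(X, L)) == viterbi(X, L))   # True
```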


Dictionary Viterbi Algorithm

inDictionary(z) = true, iff there is a word in the dictionary beginning with z
selectFrom(pref, L) - gives a letter that suits the prefix pref, so that the prefix is always in the dictionary

A.2.25:
for w ∈ L do p_{w,1} := p1(x_1, w); pref_{w,1} := (w) od;
for i = 2 to m do
  for w ∈ L do
    zz := selectFrom(pref_{w,i-1}, L); p := p_{zz,i-1} · p(x_i, w, zz);
    for z ∈ L do
      if p_{z,i-1} · p(x_i, w, z) > p and inDictionary((pref_{z,i-1}, w))
        then zz := z; p := p_{z,i-1} · p(x_i, w, z) fi
    od;
    p_{w,i} := p; pref_{w,i} := (pref_{zz,i-1}, w)
  od
od


Hierarchy of search methods

[Figure: a diagram relating the search methods of this lecture: search; depth-first; breadth-first; best first; beam search; backtracking; dependency-directed backtracking; branch and bound; stochastic branch-and-bound; A*; binary search; dictionary search; unification; Viterbi search; search-space modifications.]