A * Search A* (pronounced "A star") is a best first, graph search algorithm that finds the...

A* Search

A* (pronounced "A star") is a best-first graph search algorithm that finds the least-cost path from a given initial node to one goal node out of one or more possible goals.

Definitions

A* uses a distance-plus-estimate heuristic function, denoted f(x), to determine the order in which the search visits nodes in the tree induced by the search. The distance-plus-estimate heuristic is the sum of two functions, f(x) = g(x) + h(x):

• the path-cost function, denoted g(x), which is the cost of the path from the start node to the current node, and

• an admissible "heuristic estimate" of the distance to the goal, denoted h(x).

• An admissible h(x) must not overestimate the distance to the goal. For an application like routing, h(x) might represent the straight-line distance to the goal, since that is physically the smallest possible distance between any two points (or nodes, for that matter).
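The definitions above can be sketched as a small generic search routine. This is only an illustrative sketch, not the presenter's implementation: the function name `a_star`, its parameters, and the toy graph and heuristic table are all assumptions made for the example. It orders the frontier by f = g + h and allows a node to be re-expanded when a cheaper path to it is found, so it only relies on h being admissible.

```python
import heapq

def a_star(start, is_goal, neighbors, h):
    """Return (cost, path) for a least-cost path from start to a goal, or None.

    neighbors(node) yields (next_node, edge_cost) pairs; h(node) is assumed
    admissible (it never overestimates the remaining cost).
    """
    best_g = {start: 0}          # g(x): cheapest known cost from start to x
    parent = {start: None}
    tie = 0                      # tie-breaker so the heap never compares nodes
    frontier = [(h(start), tie, 0, start)]   # entries are (f, tie, g, node)
    while frontier:
        _, _, g, node = heapq.heappop(frontier)
        if g > best_g[node]:
            continue             # stale entry: a cheaper path was found later
        if is_goal(node):
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return g, path[::-1]
        for nxt, cost in neighbors(node):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = node
                tie += 1
                heapq.heappush(frontier, (new_g + h(nxt), tie, new_g, nxt))
    return None

# Hypothetical toy graph and admissible heuristic values, for illustration only.
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 1)], "G": []}
h_table = {"S": 3, "A": 2, "B": 1, "G": 0}
result = a_star("S", lambda n: n == "G", lambda n: graph[n], h_table.get)
```

In this toy graph the cheapest route goes S, A, B, G with total cost 4, even though the direct edge from A to G reaches the goal in fewer steps.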

An A* algorithm for Edit Distance

Edit Distance DE(X,Y) measures how close string X is to string Y.

DE(X,Y) is the cost of the minimum-cost transformation t : X → Y, where t is a sequence of operations (insertion, equal substitution, unequal substitution, and deletion). The cost of t is the sum of the operation costs, where each operation costs 1 except for equal substitution, which costs 0.

A B B A C

B A A C A

The cost of this transformation is 3, which happens to be minimal.

Dynamic Programming Solution (an O(mn) solution)

Decomposition : Last operation is Delete, Substitute, or Insert

Atomic Problems : X prefix or Y prefix empty

Table : Rows 0 .. M for X prefix characters, Columns 0 .. N for Y prefix characters

Table Entry : DE(X_i, Y_j)

Composition : δ = cost(Substitution) = 1 if x_i != y_j and 0 otherwise.

DE(X_i, Y_j) = min{ DE(X_{i-1}, Y_j) + 1, DE(X_{i-1}, Y_{j-1}) + δ, DE(X_i, Y_{j-1}) + 1 }
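The table-filling scheme on this slide might be coded roughly as follows. It is a minimal sketch under the slide's cost model (insertion, deletion, and unequal substitution cost 1; equal substitution costs 0); the function name `edit_distance` and the variable names are chosen for the example and do not come from the slides.

```python
def edit_distance(x: str, y: str) -> int:
    """Edit distance between x and y via the O(mn) prefix table."""
    m, n = len(x), len(y)
    # table[i][j] holds the distance between the i-prefix of x and the j-prefix of y.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    # Atomic problems: one of the prefixes is empty.
    for i in range(m + 1):
        table[i][0] = i          # delete all i characters of the x prefix
    for j in range(n + 1):
        table[0][j] = j          # insert all j characters of the y prefix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub_cost = 0 if x[i - 1] == y[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,             # last operation: deletion
                table[i - 1][j - 1] + sub_cost,  # last operation: substitution
                table[i][j - 1] + 1,             # last operation: insertion
            )
    return table[m][n]
```

Calling `edit_distance("ABBAC", "BAACA")` reproduces the cost of 3 quoted for the earlier example pair.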

Edit Distance as a Shortest Path Problem

Define a transformation graph G_XY = (V,E) as follows:

• The set V of nodes (vertices) = {0 .. M} × {0 .. N}, where node n_{p,q} represents the state of transforming a p-length prefix of X into a q-length prefix of Y.

• The set E of edges represents the operations of

• deletion, connecting node n_{p,q} to n_{p+1,q} with length 1

• substitution, connecting node n_{p,q} to n_{p+1,q+1} with length 0 or 1 depending on whether x_{p+1} = y_{q+1} or not

• insertion, connecting node n_{p,q} to n_{p,q+1} with length 1

The start and goal nodes are n_{0,0} and n_{M,N}.
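To make the graph view concrete, the sketch below generates the outgoing edges of a node on demand and runs Dijkstra's algorithm (a plain uninformed shortest-path search, used here in place of A* purely for illustration) from the start node to the goal node. The names `edges` and `shortest_path_cost` are hypothetical, and nodes are represented as `(p, q)` tuples.

```python
import heapq

def edges(p, q, x, y):
    """Yield ((p2, q2), length) for each edge leaving node (p, q)."""
    m, n = len(x), len(y)
    if p < m:
        yield (p + 1, q), 1                              # deletion of x[p]
    if p < m and q < n:
        yield (p + 1, q + 1), 0 if x[p] == y[q] else 1   # substitution
    if q < n:
        yield (p, q + 1), 1                              # insertion of y[q]

def shortest_path_cost(x, y):
    """Length of the shortest path from node (0, 0) to node (len(x), len(y))."""
    start, goal = (0, 0), (len(x), len(y))
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            return d
        if d > dist[node]:
            continue                                     # stale heap entry
        for nxt, length in edges(*node, x, y):
            if d + length < dist.get(nxt, float("inf")):
                dist[nxt] = d + length
                heapq.heappush(heap, (d + length, nxt))
    raise ValueError("goal unreachable")  # cannot happen for this graph
```

Because every path from the start node to the goal node spells out one sequence of edit operations, the value returned here agrees with the dynamic programming table entry for the full strings.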

Introduction

Edit Distance – Based on Single Character Edit Operations

Insertion : ε → a Inserts an “a” into target without affecting the source; cost = 1

Equal Substitution : a → a Substitutes an “a” into target for an “a” in source; cost = 0

Unequal Substitution : a → b Substitutes a “b” into target for an “a” in source; cost = 1

Deletion : a → ε Deletes an “a” from source without affecting the target; cost = 1

Example of a Transformation Graph

The vertices of T correspond to prefix pairs of X and Y. The edges of T are directed and correspond to the single character edit operations which would transform one prefix pair into another.

• X = abbab

• Y = bbaba

Page 8: A * Search A* (pronounced "A star") is a best first, graph search algorithm that finds the least-cost path from a given initial node to one goal node out.

DE(X,Y) = cost of the shortest path from start vertex to goal vertex = 2

A frequency based Lower Bound function h

• Let Xi be the suffix of X beginning with the ith character and Yj be similarly defined.

• If X = abbab and Y = bbaba

• X2 = bbab and Y2 = baba

• Excess(X2,Y2,a) = 0

• Def(X2,Y2,a) = 1

• Excess(X2,Y2,b) = 1

• Def(X2,Y2,b) = 0

• Excess(X2,Y2) is the sum of excesses over the alphabet and Def(X2,Y2) is the sum of deficiencies.

• h(X2,Y2) = max{Excess(X2,Y2), Def(X2,Y2)} is a lower bound to the length of the shortest path from the corresponding vertex to the goal.
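The frequency-based bound can be computed directly from character counts. The following is a small sketch, assuming the suffixes are passed in as ordinary strings; the function names `excess_and_deficiency` and `h` are chosen for the example. It relies on the fact that subtracting two `collections.Counter` objects keeps only the positive differences.

```python
from collections import Counter

def excess_and_deficiency(x_suffix: str, y_suffix: str):
    """Return (total excess, total deficiency) of x_suffix relative to y_suffix."""
    fx, fy = Counter(x_suffix), Counter(y_suffix)
    excess = sum((fx - fy).values())      # characters x_suffix has too many of
    deficiency = sum((fy - fx).values())  # characters x_suffix has too few of
    return excess, deficiency

def h(x_suffix: str, y_suffix: str) -> int:
    """Frequency-based lower bound on the remaining edit distance."""
    return max(excess_and_deficiency(x_suffix, y_suffix))

# For the slide's suffixes, h("bbab", "baba") evaluates to 1.
```

The bound ignores character order entirely, so it can be 0 for strings that are not equal, as with `h("abbab", "bbaba")`.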

Classification and Strings

Applications of Edit Distance

• DNA analysis

• Classification of heart beats.

• Handwriting recognition.

• Spelling correction.

• Error correction of variable length codes.

• Speech recognition.

Discrete Directional Alphabet

Mapping EKGs to Strings

Classification as Path Problem

• LB(Start,Goal-1) = 0

• LB(Start,Goal-2) = 3

Lower Bounds to Edit Distance

Lower Bound Based on Frequency

Let fa(X) and fa(Y) be the frequencies of a in X and Y.

Define Ex(a,X,Y) = fa(X) - fa(Y) if fa(X) > fa(Y) else 0

Define Def(a,X,Y) = fa(Y) - fa(X) if fa(Y) > fa(X) else 0

For any a, both Ex(a,X,Y) and Def(a,X,Y) ≤ D(X,Y)

Ex(a,X,Y) + Ex(b,X,Y) ≤ D(X,Y).

max { Σa Ex(a,X,Y), Σa Def(a,X,Y) } ≤ D(X,Y)

LB(i,j,X,Y) computed for the ith suffix of X and the jth suffix of Y is a lower bound to the remaining distance after having computed the edit distance for the ith and jth prefixes of X and Y.
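One way to see why the frequency bound holds is to track how a single edit operation can change the total excess and total deficiency of the string being transformed, measured against Y. The summary below is a reasoning sketch added for illustration; the shorthand Ex and Def for the two totals (summed over the alphabet) and the Δ notation for the change caused by one operation are assumptions of this note, not notation from the slides.

```latex
\begin{align*}
\text{deletion (cost 1):} &\quad \Delta\mathrm{Ex} \ge -1, \quad \Delta\mathrm{Def} \ge 0 \\
\text{insertion (cost 1):} &\quad \Delta\mathrm{Ex} \ge 0, \quad \Delta\mathrm{Def} \ge -1 \\
\text{unequal substitution (cost 1):} &\quad \Delta\mathrm{Ex} \ge -1, \quad \Delta\mathrm{Def} \ge -1 \\
\text{equal substitution (cost 0):} &\quad \Delta\mathrm{Ex} = 0, \quad \Delta\mathrm{Def} = 0 \\[4pt]
\Rightarrow\quad D(X,Y) &\ge \max\{\mathrm{Ex}(X,Y),\ \mathrm{Def}(X,Y)\}
\end{align*}
```

Both totals must reach 0 once X has been turned into Y, and only unit-cost operations can lower either of them, by at most 1 each time, so at least max{Ex, Def} unit-cost operations are needed.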

Lower Bounds to Edit Distance

Lower Bound Based on Frequency

• Since X has a deficiency of 1 b with Y1 as a target, 1 is a lower bound to D(X,Y1).

• Since X has a deficiency of 2 a’s with Y2 as a target and an excess of 1 b, 2 is a lower bound to D(X,Y2).

• Since X has a deficiency of 3 b’s with Y3 as a target and an excess of 2 a’s, 3 is a lower bound to D(X,Y3).

• Consequently the initial vertices of the 3 transformation graphs are organized into a priority queue as shown to the left.

A* Search for Closest Target

f = h + g

Keep track of the last operation, since an insertion cannot be followed by a deletion and vice versa (such a pair could be replaced by a single substitution at no greater cost).

A* Search for Closest Target

• Finds distance of 1 to Y1 in 3 steps.

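• Y1 must be a closest goal since its bnd + dist (that is, h + g) is the minimum in the priority queue, and every other entry's bnd + dist is a lower bound on the distance to its own target.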
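The closest-target search can be sketched by placing the start vertices of all transformation graphs in one priority queue ordered by f = h + g and expanding whichever vertex currently has the smallest f. This is a hedged illustration, not the presenter's code: the names `freq_bound` and `closest_target` are made up, the last-operation pruning mentioned on the previous slide is omitted for brevity, and the example strings in the usage note are hypothetical ones chosen only so that their initial bounds come out as 1, 2, and 3.

```python
import heapq
from collections import Counter

def freq_bound(xs: str, ys: str) -> int:
    """Frequency-based lower bound on the edit distance between xs and ys."""
    fx, fy = Counter(xs), Counter(ys)
    return max(sum((fx - fy).values()), sum((fy - fx).values()))

def closest_target(x: str, targets):
    """Return (index, distance) of a target closest to x, or None if no targets."""
    best_g = {}
    heap = []   # entries are (f, g, k, p, q): node (p, q) in the graph of target k
    for k, y in enumerate(targets):
        best_g[(k, 0, 0)] = 0
        heapq.heappush(heap, (freq_bound(x, y), 0, k, 0, 0))
    while heap:
        _, g, k, p, q = heapq.heappop(heap)
        y = targets[k]
        if g > best_g[(k, p, q)]:
            continue                      # stale entry: a cheaper path exists
        if p == len(x) and q == len(y):
            return k, g                   # first goal popped has minimal f = g
        moves = []
        if p < len(x):
            moves.append((p + 1, q, 1))                              # deletion
        if q < len(y):
            moves.append((p, q + 1, 1))                              # insertion
        if p < len(x) and q < len(y):
            moves.append((p + 1, q + 1, 0 if x[p] == y[q] else 1))   # substitution
        for p2, q2, cost in moves:
            g2 = g + cost
            if g2 < best_g.get((k, p2, q2), float("inf")):
                best_g[(k, p2, q2)] = g2
                f2 = g2 + freq_bound(x[p2:], y[q2:])
                heapq.heappush(heap, (f2, g2, k, p2, q2))
    return None
```

With the hypothetical inputs `x = "aabb"` and `targets = ["aabbb", "aaaab", "bbbbb"]`, the initial bounds are 1, 2, and 3, and the search settles on the first target at distance 1 without ever expanding the other two start vertices.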