Answer Set Programming: A new Paradigm for Knowledge Representation and Constraint Programming


Russell and Norvig 10.7

Lecture Notes for Cmput 366

(Some slides, especially those with pictures, are taken from C. Baral’s talk at AAAI’05.)

Intelligent Agent

• Can acquire knowledge through various means such as learning from experience, observations, reading, etc., and

• Can reason with this knowledge to make plans, explain observations, achieve goals, etc.

To learn knowledge and to reason with it

• we need to know how to represent knowledge in a computer readable format.

• McCarthy 1959 in Programs with commonsense:

“In order for a program to be capable of learning something it must first be capable of being told it.”

Importance of KR

• KR is the starting point of building intelligent entities (or AI systems), and leads to the next steps of acquiring knowledge and reasoning with knowledge.

What does KR entail?

• We need languages and corresponding methodologies to represent various kinds of knowledge.

Importance of inventing suitable KR languages

Development of a suitable knowledge representation language and methodology is as important to AI systems

as

Calculus is to Physics and Engineering.

Historical perspective

• AI pioneers (especially McCarthy and Minsky) realized the importance of KR to AI.

• McCarthy 1959: Programs with commonsense

(perhaps the first paper on logical AI).

• Minsky 1974: A framework for representing knowledge.

John McCarthy

Marvin Minsky

What are the properties of a good KR language?

• To start with: it should be non-monotonic, i.e., allow conclusions to be revised in the presence of new knowledge.

– Hayes 1973 (Computation and Deduction) mentions monotonicity (calls it “extension property”) and notes that rules of default do not satisfy it.

– Minsky 1974 (A framework for representing knowledge) criticizes monotonicity of logistic systems.

Pat Hayes

Marvin Minsky

Inadequacy of first order logic

• It is monotonic: the more information one has, the more consequences one gets.

• Human communication is typically based on the closed world assumption.

An Example of Closed World Assumption

ground-wet ← watering. ground-wet ← raining.

• In an open world, there could be other causes of ground-wet (we simply don’t know them, or have not stated them).

• But in a closed world, what we said is all that we know. For Horn clauses, this is captured by the Clark completion:

ground-wet ↔ watering ∨ raining

Problem with Clark Completion

a ← a.

When completed, it becomes

a ↔ a

Two models {a} and { }. The first model doesn’t seem to make sense (how can we have a?)

The desired model should be { } – since there is no way to establish a, hence a is (believed to be) false.

Transitive Closure – graph reachability

• reach(a).

• reach(X) ← reach(Y), edge(Y,X).

• a and b are reachable, but c and d are not. Can we infer the latter?

edge(a,b). edge(c,d). edge(d,c).

(Figure: directed graph over the vertices a, b, c, d with the edges listed above.)

Reachability – Clark completion (cf. page 355 of R&N)

∀X ∀Y edge(X,Y) ↔ ((X = a ∧ Y = b) ∨ (X = c ∧ Y = d) ∨ (X = d ∧ Y = c))

∀X reach(X) ↔ (X = a ∨ ∃Y (reach(Y) ∧ edge(Y,X)))

• Equality axioms.

• {edge(a,b), edge(c,d), edge(d,c), reach(a), reach(b) } is a model.

• But so is {edge(a,b), edge(c,d), edge(d,c), reach(a), reach(b), reach(c), reach(d) }.

• Hence one cannot conclude ~reach(c) or ~reach(d).

• Need to go beyond first order logic.
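The intended reading of the reachability rules, where only what can actually be derived is true, can be illustrated with a small Python sketch. It iterates the rule reach(X) ← reach(Y), edge(Y,X) until nothing new is derived. The function name least_reach and the encoding of edges as tuples are assumptions made for illustration, not part of the slides.

```python
# Facts from the slide, encoded as (from, to) tuples (an illustrative encoding).
edges = {("a", "b"), ("c", "d"), ("d", "c")}

def least_reach(edges, start="a"):
    """Apply reach(X) <- reach(Y), edge(Y,X) until a fixpoint is reached."""
    reach = {start}                      # the fact reach(a).
    changed = True
    while changed:
        changed = False
        for (y, x) in edges:
            if y in reach and x not in reach:
                reach.add(x)
                changed = True
    return reach

reachable = least_reach(edges)
print(sorted(reachable))
```

Under this reading only a and b are derived, so c and d are taken to be unreachable. The completion-based reading above cannot rule out the larger model in which c and d are reachable too.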

Pre-1980 history of non-monotonic logics (from Minker’s 1993 survey)

• THNOT in PLANNER [Hewitt in 1969]

• Prolog [Colmerauer et al. 1973]

• Circumscription [McCarthy 1977]

• Default Reasoning [Reiter 1978]

• Closed World Assumption (CWA) [Reiter 1978]

• Negation as failure [Clark 1978]

• Truth maintenance systems [Doyle 1979]

• AIJ Volume 13, 1980, a special issue

Circumscription

Only consider minimal models for the circumscribed predicates

E.g.

bird(X) ∧ ~ab(X) → flies(X)

To circumscribe predicate ab, we can assume ~ab(X) unless ab(X) is known to be true. Thus, in the absence of information about a bird being abnormal, we conclude that it flies.

Circumscription

bird(X) ∧ ~ab(X) → flies(X)

bird(tweety)

Models (after propositionalizing):

M1 = {bird(tweety), ab(tweety), flies(tweety)}

M2 = {bird(tweety), ab(tweety)}

M3 = {bird(tweety), flies(tweety)}

M3 is “smaller” than the others w.r.t. predicate ab. Thus, flies(tweety) follows from the given formulas under circumscription.
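The model comparison above can be reproduced by brute force. The following Python sketch enumerates all truth assignments for the three ground atoms about tweety, keeps those satisfying the two formulas, and then keeps the models whose extension of ab is minimal. It assumes that the other predicates are allowed to vary while ab is minimized. The helper names is_model and ab_extension are illustrative.

```python
from itertools import product

ATOMS = ["bird", "ab", "flies"]  # each atom is about tweety

def is_model(m):
    # bird(tweety) must hold, and bird ∧ ~ab → flies must hold.
    rule_ok = (not (m["bird"] and not m["ab"])) or m["flies"]
    return m["bird"] and rule_ok

models = [dict(zip(ATOMS, vals)) for vals in product([False, True], repeat=3)]
models = [m for m in models if is_model(m)]

def ab_extension(m):
    # The set of individuals that are abnormal in model m.
    return {"tweety"} if m["ab"] else set()

# Keep the models for which no other model has a strictly smaller ab extension.
minimal = [m for m in models
           if not any(ab_extension(n) < ab_extension(m) for n in models)]

print(len(models), minimal)
print(all(m["flies"] for m in minimal))
```

The three models found correspond to M1, M2 and M3, and flies holds in every ab-minimal model.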

Default Logic

We write default rules.

E.g.

bird(X) : ~ab(X)
-----------------------
flies(X)

Reads: if X is a bird, and it can be consistently assumed that it is not abnormal, then it flies.

Have we invented “calculus” of KR yet?

• What basic properties should it have?

– have a simple and intuitive syntax and semantics;

– be non-monotonic;

– allow us to represent and reason with incomplete information; and

– allow us to express and answer problem solving queries such as planning queries, explanation queries and diagnostic queries.

Have we invented “calculus” of KR yet? - continued.

• What properties will make it useful?

– should have building block results;

– should have interpreters for reasoning with the language;

– should have existing applications; and

– should have systems that can learn knowledge in this language.

Is ASP a good candidate?

• An ASP program (late 1980s) is a collection of rules of the form:

A0 or … or Al ← B1, …, Bm, not C1, …, not Cn.

where Ais, Bjs and Cks are literals.

Michael Gelfond

Jack Minker

Vladimir Lifschitz

Ray Reiter

Is ASP a good candidate?

• Its syntax uses the intuitive If-then form.

• It is non-monotonic.

• Can express defaults and their exceptions.

• Can represent and reason with incomplete information.

• Can express and answer problem solving queries.

• Large body of building block results.

• Various implementations: Smodels, DLV, Prolog.

• Many applications built using it.

• Learning systems: Progol.

• Its initial paper is among the top 5 AI source documents in terms of CiteSeer citations.

How ASP differs from …

• Prolog: ordering matters in Prolog; it cannot handle cycles with “not”; it has extra-logical features; it does not have disjunction and classical negation; and it is not declarative.

• Logic Programming: is a class of languages and many different semantics are proposed for “not”.

• Classical Logic:

– Classical logic is monotonic.

– The ← in AnsProlog, which helps in expressing causality, is not reverse implication.

– The disjunction symbol “or” in AnsProlog is non-classical.

– The negation as failure symbol “not” in AnsProlog is non-classical.

Normal program

A normal program in ASP is a collection of rules of the form:

A ← B1, …, Bm, not C1, …, not Cn.

where A, Bjs and Cks are function-free atoms.

If the body is empty, we write A ←. or simply A.

Semantics

A function-free program can be grounded (called propositionalization in the textbook)

p(X) ← q(X), not s(X). % Function-free

p(X) ← q(f(X)), not s(X). % Not function-free

Semantics

Suppose we have constants a, b, c in our program. The rule

p(X) ← q(X), not s(X).

is a compact representation of three ground rules:

p(a) ← q(a), not s(a).
p(b) ← q(b), not s(b).
p(c) ← q(c), not s(c).
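Grounding as described above amounts to substituting every combination of constants for the variables of a rule. The Python sketch below does this at the string level for rules written in the ASCII ":-" form. The function ground is a hypothetical helper for illustration only. It assumes variables start with an uppercase letter and constants with a lowercase one, and it is not how Lparse is implemented.

```python
from itertools import product
import re

def ground(rule, constants):
    """Return all ground instances of a function-free rule given as a string."""
    # Variables are identifiers starting with an uppercase letter.
    variables = sorted(set(re.findall(r"\b[A-Z]\w*", rule)))
    grounded = []
    for values in product(constants, repeat=len(variables)):
        instance = rule
        for var, val in zip(variables, values):
            instance = re.sub(rf"\b{var}\b", val, instance)
        grounded.append(instance)
    return grounded

rules = ground("p(X) :- q(X), not s(X).", ["a", "b", "c"])
for r in rules:
    print(r)
```

A rule with k variables over n constants yields n^k ground instances, which is why the size of the grounding matters in practice.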

Semantics

Informally, a stable model M of a ground program P is a set of ground atoms such that

• Every rule is satisfied, i.e., for any rule in P

A ← B1, …, Bm, not C1, …, not Cn.

if the Bjs are satisfied (the Bjs are in M) and the not Cks are also satisfied (not Ck is satisfied if Ck is not in M), then A is in M.

• Every A ∈ M can be derived from a rule by non-circular reasoning.

Examples

P1 = {a ← a.} M = {a} is not a stable model, but M = {} is.

P2 = {a ← not b.} {a} is the only stable model.

P3 = {a ← not a.} It has no stable model.

Examples

P4 = {a ← not b. b ← not a.}

Two stable models: {a} and {b}.

P5 = {a ← not b. b ← not a. a ← not a.}

{a} is the only stable model.

Does tweety fly?

• fly(X) ← bird(X), not ab(X).
ab(X) ← penguin(X).
bird(X) ← penguin(X).
bird(tweety).

– We conclude fly(tweety).

• But if we add

– penguin(tweety).

– We can no longer conclude fly(tweety), and instead conclude ~fly(tweety), by virtue of CWA.

Constraints for disallowing …

The head of a rule may be empty:

← B1, …, Bm, not C1, …, not Cn.

It says that no stable model may contain all of the Bjs and none of the Cks.

Generate-and-constrain: first generate

To specify both possibilities: a is in a solution or not, we can use a dummy a’

a ← not a’.

a’ ← not a.

Two stable models {a}, {a’}; the latter represents that a is not in solution

Generate-and-constrain: first generate

To specify all subsets of {a,b,c}, we can write

a ← not a’. b ← not b’. c ← not c’.

a’ ← not a. b’ ← not b. c’ ← not c.

Eight stable models each corresponding to a subset, e.g. {a, b’,c’} represents that a is in it, but not b, nor c.

Generate-and-constrain: then constrain

Any subset of {a,b,c} such that a and b cannot be together.

a ← not a’. b ← not b’. c ← not c’.

a’ ← not a. b’ ← not b. c’ ← not c.

← a, b.

• What if we want to say “whenever a is in a stable model, so is b”?
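The generate-and-constrain pattern can be mimicked in ordinary code. The Python sketch below is an analogy only and does not compute stable models: it first generates every subset of {a,b,c}, then discards the candidates that violate the constraint that a and b cannot be together. The variable names are illustrative.

```python
from itertools import combinations

atoms = ["a", "b", "c"]

# Generate: every subset of {a, b, c}, mirroring the eight stable models
# produced by the rules with the dummy atoms a', b', c'.
candidates = [set(c)
              for r in range(len(atoms) + 1)
              for c in combinations(atoms, r)]

# Constrain: drop every candidate containing both a and b,
# mirroring the constraint with the empty head.
solutions = [s for s in candidates if not {"a", "b"} <= s]

print(len(candidates), len(solutions))
```

In ASP the same separation is expressed declaratively: the first group of rules opens up the search space, and each constraint prunes it.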

Hamiltonian Cycle

Given a set of facts defining the vertices and edges of a directed graph and a starting vertex v0, find a path that visits every vertex exactly once.

Hamiltonian Cycle

Any edge could be on such a path. We use in(U,V) to represent that edge(U,V) is on such a path.

in(U,V) ← edge(U,V), not out(U,V).

out(U,V) ← edge(U,V), not in(U,V).

out(U,V) is a dummy representing edge(U,V) is not on such a path.

Hamiltonian Cycle

A path must be chained to form a sequence over the edges on it:

reachable(V) ← in(v0,V).

reachable(V) ← reachable(U), in(U,V).

Hamiltonian Cycle

A vertex cannot be visited more than once.

• This can be stated as: no more than one edge on such a path goes into any vertex (and similarly, no more than one goes out of any vertex):

← edge(U,V), in(U,V), edge(W,V), in(W,V), U ≠ W.

← edge(U,V), in(U,V), edge(U,W), in(U,W), V ≠ W.

Hamiltonian Cycle

Don’t forget to say that every vertex must be reached.

← vertex(U), not reachable(U).
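The Hamiltonian cycle encoding can be read as a guess-and-check procedure, sketched below in Python. The code guesses a set of in edges, rejects guesses with two chosen edges into or out of the same vertex, computes reachable from v0 over the chosen edges, and requires every vertex to be reached. The function name hamiltonian_cycles and the sample graph are assumptions for illustration. The exhaustive search is exponential and is meant only to mirror the rules, not to reflect how an ASP solver works.

```python
from itertools import combinations

def hamiltonian_cycles(vertices, edges, v0):
    """Return every set of chosen edges that passes all checks of the encoding."""
    found = []
    edges = sorted(edges)
    for r in range(len(edges) + 1):
        for chosen in combinations(edges, r):
            ins = set(chosen)                      # the guessed in(U,V) atoms
            heads = [v for (_, v) in ins]
            tails = [u for (u, _) in ins]
            # At most one chosen edge into, and out of, each vertex.
            if len(heads) != len(set(heads)) or len(tails) != len(set(tails)):
                continue
            # reachable(V) <- in(v0,V).  reachable(V) <- reachable(U), in(U,V).
            reachable = set()
            frontier = [v for (u, v) in ins if u == v0]
            while frontier:
                v = frontier.pop()
                if v not in reachable:
                    reachable.add(v)
                    frontier.extend(w for (u, w) in ins if u == v)
            # <- vertex(U), not reachable(U).
            if reachable == set(vertices):
                found.append(ins)
    return found

vertices = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]
cycles = hamiltonian_cycles(vertices, edges, "a")
print(cycles)
```

Because v0 itself must be reachable, the chosen edges have to return to v0, which is what makes the result a cycle rather than just a path.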

3-colorability

Whether 3 colors, say red, blue, and yellow, are sufficient to color a map

A map is represented by a graph, with facts about nodes and arcs as given, e.g.,

vertex(a). vertex(b). arc(a,b).

3-colorability

Every vertex must be colored with exactly one color:

color(V,r) ← vertex(V), not color(V,b), not color(V,y).
color(V,b) ← vertex(V), not color(V,r), not color(V,y).
color(V,y) ← vertex(V), not color(V,b), not color(V,r).

No adjacent vertices may be colored with the same color:

← vertex(V), vertex(U), arc(V,U), col(C), color(V,C), color(U,C).

Of course, we need to say what colors are:

col(r). col(b). col(y).

3-colorability

A different encoding:

color(V,C) ← node(V), col(C), not otherColor(V,C).

otherColor(V,C) ← node(V), col(C), not color(V,C).

← node(V), col(C1), col(C2), color(V,C1), color(V,C2), C1 ≠ C2.

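Every node must get at least one color (colored is a helper predicate):

colored(V) ← node(V), col(C), color(V,C).

← node(V), not colored(V).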

← node(V), node(U), V ≠ U, arc(V,U), col(C), color(V,C), color(U,C).
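What the coloring encodings are meant to accept can be checked independently with a brute-force sketch in Python. It assigns exactly one of the three colors to each vertex and keeps the assignments in which no arc joins two vertices of the same color. The function name colorings and the two test graphs (a triangle and the complete graph on four vertices) are illustrative assumptions.

```python
from itertools import product

def colorings(vertices, arcs, colors=("r", "b", "y")):
    """Enumerate all proper colorings; each vertex gets exactly one color."""
    result = []
    for assignment in product(colors, repeat=len(vertices)):
        color = dict(zip(vertices, assignment))
        # No adjacent vertices may share a color.
        if all(color[v] != color[u] for (v, u) in arcs):
            result.append(color)
    return result

triangle = colorings(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])

k4_vertices = ["a", "b", "c", "d"]
k4_arcs = [(u, v) for i, u in enumerate(k4_vertices) for v in k4_vertices[i + 1:]]
k4 = colorings(k4_vertices, k4_arcs)

print(len(triangle), len(k4))
```

A graph is 3-colorable exactly when the list is non-empty, which corresponds to the ASP program having at least one stable model.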

So, what exactly is a stable model of a normal program P?

Idea: you guess a set of atoms and verify it is indeed exactly the set of atoms that can be derived (page 357 of textbook)

Reduct of P w.r.t. M = { h ← b1, …, bm | h ← b1, …, bm, not c1, …, not cn is in P and no ci is in M }

M is a stable model of P iff the set of (atomic) consequences of the reduct of P is precisely M

Stable model

P: a ← not b. b ← not a.

M = {a} is a stable model, since the reduct of P w.r.t. M is {a ←.}, and its set of (atomic) consequences is precisely M itself.

Stable model

Why does

a ← not a.

have no stable model?

• The empty set {} is not a stable model. (Why?)

• If M = {a} were a stable model, the reduct of the program w.r.t. {a} would be the empty set, whose set of (atomic) consequences is also empty, and hence not the same as M.
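The guess-and-verify definition of a stable model translates almost directly into code. The Python sketch below represents a ground normal rule as a triple (head, positive body, negative body), computes the reduct with respect to a guessed set M, derives the atomic consequences of the reduct by forward chaining, and accepts M when the two sets coincide. The representation and the function names are assumptions for illustration. Constraints with empty heads are not handled, and the enumeration of all guesses is exponential.

```python
from itertools import combinations

def reduct(program, m):
    """Drop rules with some 'not c' where c is in m; strip 'not' from the rest."""
    return [(h, pos) for (h, pos, neg) in program if not (set(neg) & m)]

def consequences(definite_program):
    """Least set of atoms closed under the negation-free rules."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for h, pos in definite_program:
            if h not in derived and set(pos) <= derived:
                derived.add(h)
                changed = True
    return derived

def stable_models(program):
    atoms = sorted({h for h, _, _ in program}
                   | {a for _, pos, neg in program for a in list(pos) + list(neg)})
    result = []
    for r in range(len(atoms) + 1):
        for guess in combinations(atoms, r):
            m = set(guess)
            if consequences(reduct(program, m)) == m:
                result.append(m)
    return result

# The example programs P1 to P5 from the earlier slides.
P1 = [("a", ["a"], [])]
P2 = [("a", [], ["b"])]
P3 = [("a", [], ["a"])]
P4 = [("a", [], ["b"]), ("b", [], ["a"])]
P5 = P4 + [("a", [], ["a"])]

for name, p in [("P1", P1), ("P2", P2), ("P3", P3), ("P4", P4), ("P5", P5)]:
    print(name, stable_models(p))
```

Systems such as Smodels avoid enumerating every guess. This sketch is only meant to make the definition concrete.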

Extensions: Cardinality constraint

A cardinality constraint is of the form

L {a1, …, am, not b1, …, not bk} U

The constraint is satisfied in a model if the number of its literals satisfied by the model is between the integers L and U, inclusive.

A cardinality constraint can be used anywhere in a rule.

E.g. P = { 0{a, b, not d}2 ←. }

{a} is a stable model, but is {a,b} a stable model?
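The satisfaction condition for a cardinality constraint is a simple count, as the following Python sketch shows. It checks only whether a given set of atoms satisfies the bounds and does not compute stable models. Literals are written as strings such as "a" or "not d", and the function names are illustrative.

```python
def count_satisfied(literals, model):
    """Count the literals of the constraint that hold in the model."""
    count = 0
    for lit in literals:
        if lit.startswith("not "):
            count += lit[4:] not in model   # 'not b' holds if b is absent
        else:
            count += lit in model           # 'a' holds if a is present
    return count

def satisfies(lower, literals, upper, model):
    return lower <= count_satisfied(literals, model) <= upper

constraint = ["a", "b", "not d"]
print(count_satisfied(constraint, {"a"}), satisfies(0, constraint, 2, {"a"}))
print(count_satisfied(constraint, {"a", "b"}), satisfies(0, constraint, 2, {"a", "b"}))
```

Note that not d counts as a satisfied literal whenever d is absent, which is easy to overlook when checking the upper bound.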

Cardinality constraint

Generate all subsets of {a,b,c,d} such that whenever a is in it so is b:

0{a, b, c, d}4 ←.

b ← a.

As 4 is the maximum number of literals that may be satisfied, you may omit it for simplicity:

0{a, b, c, d} ←.

Cardinality constraint

Generate all subsets of {a,b,c,d} such that if a is not in it, then b is in it.

0{a, b, c, d} ←.

b ← not a.

Are they stable models?

M1= {a,b,c} M2 = {b,c,d,e}

ASP Systems

• Smodels (Helsinki Univ. of Tech.)

• DLV (Vienna Univ. of Tech.)

• ASSAT (HK Univ. of Sci. and Tech.)

• Cmodels (U. of Texas at Austin)

The Smodels System

An efficient system for computing answer sets of normal programs (later extended for disjunctive programs).

Consists of two parts

• Lparse: grounds a program

• Smodels: computes the stable models of the grounded program, based on DPLL.

Smodels

• Syntax largely borrowed from Prolog.

a :- not b.

b :- not a.

:- a.

• A number of language constructs for convenience

Conditional Literals in Smodels

A shorthand to express a set: it takes the form l : d, where l is an atom and d a domain predicate.

E.g. Set a vertex v to exactly one color among red, blue and yellow:

1 {setColor(v,C) : color(C)} 1.
color(red). color(blue). color(yellow).

is equivalent to

1 {setColor(v,red), setColor(v,blue), setColor(v,yellow)} 1.

N-colorability

% every vertex is colored with exactly one color.

1 {setColor(V,C) : col(C) } 1 :- vertex(V).

% facts representing colors

col(1..colors).

% no adjacent vertices are colored with the same color

:- vertex(V), vertex(U), arc(V,U), col(C), U != V, setColor(V,C), setColor(U,C).

% Typical command line
% lparse -c colors=3 coloring.lp | smodels

Conditional literals in Smodels

Example

1 { p(I,J) : d(I,J) } 1.

d(I,J) :- d(I),d(J).

d(1..2).

The first rule above is equivalent to

1 {p(1,1),p(1,2),p(2,1),p(2,2) } 1.

Conditional literals in Smodels

Note the difference with the following program:

1 { p(I,J) : d(I) } 1 :- d(J).
d(1..2).

The first rule above is equivalent to

1 { p(I,1) : d(I) } 1 :- d(1).
1 { p(I,2) : d(I) } 1 :- d(2).

which are equivalent to

1 {p(1,1),p(2,1)} 1 :- d(1).
1 {p(1,2),p(2,2)} 1 :- d(2).
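The two-step expansion above can be imitated at the string level. In the Python sketch below, the variable J, which occurs in the rule body, is instantiated once per ground rule, while the conditional literal p(I,J) : d(I) is expanded inside the braces for each value of J. The helper expand is hypothetical and only reproduces the textual result; it says nothing about how Lparse performs grounding internally.

```python
def expand(j, domain):
    """Build the ground rule for one value j of the body variable J."""
    # Expand the conditional literal p(I,j) : d(I) over all I in the domain.
    atoms = ",".join(f"p({i},{j})" for i in domain)
    return f"1 {{{atoms}}} 1 :- d({j})."

d = [1, 2]                       # the facts d(1..2).
rules = [expand(j, d) for j in d]
for r in rules:
    print(r)
```

In the earlier program both I and J are local to the conditional literal, so a single rule with four atoms is produced instead of two rules with two atoms each.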

Hamiltonian Cycle Revisited

Any subset of edges can be on such a path

in(U,V) ← edge(U,V), not out(U,V).

out(U,V) ← edge(U,V), not in(U,V).

Now can be programmed as:

0 {in(U,V) : edge(U,V) }.

Wumpus World

• There is exactly one wumpus:

1 {at(I,J,wumpus) : room(I,J)} 1.

room(I,J) :- col(I), row(J).

For a 4 by 4 grid, this is equivalent to exactly one atom being true in the set of 16:

1 {at(1,1,wumpus), at(1,2,wumpus),…….} 1.

Wumpus World

One or more adjacent rooms have a pit if there is a breeze in the current room (cf. Assignment 3):

1 {at(Ni,Nj,pit): adjacent(I,J,Ni,Nj)} :-

room(I,J),

sensor(I,J,none,breeze).

N-queens problem

#hide.
#show q(X,Y).
d(1..queens).
1 {q(X,Y):d(Y)} 1 :- d(X).
:- d(X), d(Y), d(X1), q(X,Y), q(X1,Y), X1 != X.
:- d(X), d(Y), d(Y1), q(X,Y), q(X,Y1), Y1 != Y.
:- d(X), d(Y), d(X1), d(Y1), q(X,Y), q(X1,Y1), X != X1, Y != Y1, abs(X - X1) == abs(Y - Y1).
:- d(X), not hasq(X).
hasq(X) :- d(X), d(Y), q(X,Y).

% Typical command line
% lparse -c queens=8 queens.lp | smodels

Weight constraints

We can replace cardinality by weights

L {l1 = w1, …, lm = wm} U

where each li is an atom or a not-atom (an atom preceded by not). It is satisfied when the sum of the weights of the satisfied li’s is between L and U.

When all wi = 1, it becomes a cardinality constraint.

(We don’t need to use weight constraints in this course)

Classical Negation

safe ← ¬train. vs. safe ← not train.

Use a new name, e.g., no_train, to represent ¬train:

safe ← no_train.

Of course, train and no_train cannot both be in a stable model:

← train, no_train.
