Artificial Intelligence Applications by - Department of ...csphil/CS345/chapts1-4.pdf ·...

Artificial Intelligence Applications by Phil Grant Department of Computer Science University of Wales Swansea



Chapter 1

Introduction

In these notes we are interested in applications of AI techniques. This will include Expert Systems or Knowledge-based Systems and what might be termed Soft Computing. Soft Computing covers such topics as:

• Fuzzy Logic and Fuzzy Systems

• Genetic and Evolutionary Algorithms

• Genetic Programming

• Artificial Life

• Molecular/DNA computation

• Artificial Neural Nets

Only the last topic will not be covered in this course as it appears in SCAs.

Knowledge-based systems can solve problems for humans or help produce solutions. They can advise users on what to do in certain situations or suggest other possible lines of attack to the problem. The successful systems are not general purpose and so are restricted to certain problem domains: medical diagnosis, car repair, financial advice etc. It is true to say that KBSs are now used in many commercial and industrial environments.

Knowledge-based systems contain some knowledge of a restricted domain, in some format, together with some facility to reason about and ask questions concerning this stored knowledge.

1.1 AI Programming Languages

The main programming languages for prototyping AI systems have been LISP and Prolog. Both are based on mathematical principles: LISP (John McCarthy) on recursive function theory and lambda calculus, and Prolog (Kowalski and Colmerauer) on Horn clause logic.

In both languages, programs are easily expressed in the data structures of the language (S-expressions in LISP and terms in Prolog). It is then much easier to write meta-programs (programs manipulating programs) than in conventional procedural languages such as Pascal or C.

Nowadays, most expert systems are written in an expert system shell language (e.g. OPS5, Leonardo, flex or CLIPS). We shall be using the system CLIPS to build simple expert systems.

1.2 Expert Systems

An Expert System (ES) is a computer system (program) designed to solve problems in a domain in which there is human expertise. The knowledge built into the system is usually obtained from experts in the field.

It will consist of a knowledge base, where the experts’ knowledge is stored in some form of representation, together with an inference engine used for deducing answers from the knowledge base. There are numerous ways to represent the knowledge and undertake the inference.

There are many reasons one may wish to build an expert system:

• The human expert may not always be available or even in the location.

• A system does not tire.

• An ES can be used for training and passing on the knowledge.

• An ES will be more consistent.

• By pooling knowledge of many experts an ES may be better than any one human in its overall performance.

• An ES may produce answers faster than a human.

• The ES may be cheaper in the long term.

1.2.1 Examples

Thousands of expert systems have been constructed in the last few years. We list a few (early) examples of expert systems in Table 1.1.


System            Description
DENDRAL           A system developed at Stanford to interpret mass spectrograms.
Drilling Advisor  A system developed by Elf to help determine why a drill sticks.
MYCIN             A medical diagnosis system.
XCON              A computer configuration system developed by DEC for VAX computers.
LENDING ADVISOR   Used for evaluating the risks on possible loans.
CARMA             Advising on real estate.
SYNCHEM2          For advice on the synthesis of organic chemicals.
PROUST            For finding semantic bugs in novice Pascal programmers’ code.
SewEx             System for sewage works management.

Table 1.1: Examples of Expert Systems

Expert systems have been applied in many areas, such as business, chemistry, education, finance, law, mathematics, medicine, mining and space technology.

They are used for control, design, diagnosis, prediction, planning, simulation etc. Thousands of systems have been developed and are in use throughout the world. Expert systems have moved from the research labs to the general market place and industry. This has resulted from the better understanding of the technology and the production of tools for building such systems.

1.2.2 Implementation

Most ESs were originally programmed in the AI languages such as LISP, Prolog and OPS5. However, nowadays it is usual to develop them using expert system shells. It is estimated that about a half of systems have been produced using shells and many are designed to run on PCs.


Chapter 2

Overview of Expert Systems

In this chapter we outline the basic features to be found in expert systems.

2.1 Architecture

We can illustrate the underlying architecture of an expert system as in Figure 2.1.

Figure 2.1: Expert System Architecture (components shown: Knowledge Base, Working Memory, Inference Engine, User)

The main components are

• Knowledge Base.

• Working Memory.

• Inference Engine.

• User Interface.

We shall consider each in turn.

2.1.1 Knowledge Base

This contains the domain knowledge of the system. It is typically represented as a collection of IF/THEN rules. An example rule from an expert system designed for medical prescription might be

IF the patient has flu

THEN

give the patient aspirin and send to bed

The rules must be extracted from a human expert in the field.

2.1.2 Working Memory

During the execution of an ES new facts are discovered. These, together with information entered by the user, are stored in the working memory of the system. For example in the above, if it is discovered that the patient has flu, then "give the patient aspirin and send to bed" is stored in the working memory.

2.1.3 Inference Engine

The inference engine is the part of the system which performs the deduction of new facts from previously derived facts and rules in the knowledge base. So if the premise of a rule matches some facts in the working memory, then the conclusion of the rule is added to the working memory.

Example

If we have the rules

RULE 1

IF X is a man AND X is a lawyer

THEN

X is rich


RULE 2

IF X is rich

THEN

X is happy

then from the assertions

bill is a man

bill is a lawyer

we can first of all prove bill is rich and then bill is happy.

2.1.4 User Interface

The interaction with an ES is usually through natural language, nowadays with a graphical (WIMP) user interface. Question design is important to obtain the required answers and information from the user. Facilities may be required by the user to edit the contents of the working memory.

2.1.5 Explanation

An ES should provide a means of explaining its behaviour. It should offer facilities for explaining:

• HOW: how did the ES arrive at its answer? This involves replaying the proof or inference in some form.

Expert: Give the patient aspirin.

User: HOW

Expert: Because I know the patient has flu.

• WHY: why did the system ask a particular query? In this case the ES can indicate that it needs to know the premises of a certain rule are true so that the conclusion can be inferred.

Expert: Does the patient have a temperature?

User: WHY

Expert: Because then the patient may have flu.
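A HOW facility can be sketched by recording, for every derived fact, the rule and premises that produced it. The following minimal Python sketch uses the lawyer rules of Section 2.1.3 with X already instantiated to bill; the data layout and the names infer and how are assumptions made for illustration, not the mechanism of any particular shell.

```python
# Rules are (name, premises, conclusion); facts are plain strings.
# The layout and the helper names are illustrative assumptions.
RULES = [
    ("RULE 1", ["bill is a man", "bill is a lawyer"], "bill is rich"),
    ("RULE 2", ["bill is rich"], "bill is happy"),
]

def infer(facts, rules):
    """Fire rules until nothing new is added; record why each fact holds."""
    reasons = {fact: None for fact in facts}  # None marks a user-supplied fact
    changed = True
    while changed:
        changed = False
        for name, premises, conclusion in rules:
            if conclusion not in reasons and all(p in reasons for p in premises):
                reasons[conclusion] = (name, premises)
                changed = True
    return reasons

def how(fact, reasons):
    """Replay the inference that produced a fact, one line per step."""
    if reasons[fact] is None:
        return [f"{fact}: given by the user"]
    name, premises = reasons[fact]
    lines = [f"{fact}: by {name} from {', '.join(premises)}"]
    for premise in premises:
        lines.extend(how(premise, reasons))
    return lines

reasons = infer(["bill is a man", "bill is a lawyer"], RULES)
explanation = how("bill is happy", reasons)
print("\n".join(explanation))
```

Keeping the justification next to each fact means the explanation is a by-product of inference rather than a separate computation.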

2.1.6 Separation of Knowledge & Control.

The inference engine and the actual knowledge base are separate modules of the system. So change in the knowledge is achieved by adding/retracting rules from this module. The inference module does not need to be touched.


2.1.7 Heuristics

Human experts often use rules of thumb or heuristics in their reasoning. It is the job of the knowledge engineer to elicit this kind of information from the expert so that it can be coded up as a rule in the knowledge base.

Heuristic: Red sky at night, sunny in the morning.

RULE: IF sky is red at night

THEN sunny next morning

2.1.8 Inexact Reasoning

It may well be that rules/heuristics people use may not always be correct or applicable. In this case a probability or certainty factor is attached to the rule which reflects the confidence the expert has in this rule. For example, the area of medical diagnosis is not absolutely precise and so an expert system for this domain would probably use inexact reasoning. The inference engine must also calculate the confidence the system has in the resulting computed answers.

Consider the following inexact rule:

RULE: IF the man is hot

THEN the man will buy a coke 0.8

If we only believe that the man is hot with probability 0.9 then we might conclude that he will buy a coke with probability 0.8 * 0.9 = 0.72.
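The calculation above can be written as a short Python sketch. It assumes, as the text does, that the two confidences are simply multiplied; the function name conclusion_confidence is made up for this illustration, and real shells may combine certainty factors differently.

```python
def conclusion_confidence(premise_confidence, rule_confidence):
    """Confidence in a rule's conclusion when the premise confidence and
    the rule confidence are treated as probabilities and multiplied."""
    return premise_confidence * rule_confidence

# The man is hot with probability 0.9; the rule holds with confidence 0.8.
confidence = conclusion_confidence(0.9, 0.8)
print(round(confidence, 2))
```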

Task           Activities
Assessment     Study feasibility. Isolate goals. Determine source of expertise.
Elicitation    Obtain knowledge from experts. Use prototype ES later.
Design         Represent knowledge. Decide on processing technique.
Testing        Validate system with expert or known correct results.
Documentation  Documentation for system. Description of knowledge.
Maintenance    Update system in the light of discovered bugs or requests from users.

Table 2.1: Knowledge Engineer’s Tasks


2.2 Knowledge Engineering

The Knowledge Engineer’s job is to produce an expert system. The tasks involved are summarised in Table 2.1.

The personnel involved in the construction of an ES are:

• Domain expert: has the expertise; can communicate knowledge; sympathetic to project.

• Knowledge Engineer: has expert system programming skills; recognises the relevant rules; codes the rules into the ES.

• End User: influences UI design; helps in knowledge acquisition.


Chapter 3

Knowledge Representation

In order to store knowledge in a machine for manipulation and computation we have to have some sort of concrete representation. Different sorts of knowledge are summarised in Table 3.1:

Type         Examples
Procedural   Strategies, Agendas, Procedures
Declarative  Rules, Assertions, Objects
Meta         Knowledge about knowledge, Rules about rules
Heuristics   Rules of thumb, Empirical

Table 3.1: Knowledge Types

3.1 Methods of Representation

3.1.1 Object-Attribute-Value

Many objects can be represented as an object-attribute-value triple. For example, the make of a car is Ford could be denoted by (make car ford) or make(car,ford) or some other syntactic variation. (The former would be the preferred notation in LISP based languages and the latter in Prolog.) Pictorially we can represent the o-a-v triple as in Figure 3.1.

A probability or certainty factor could be attached to the o-a-v representation if the fact is not known precisely, e.g. (make car ford .9) or make(car,ford,.9). In practice, we will often use a representation which is not strictly a triple.
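As a rough illustration, an o-a-v triple with an optional certainty factor might be held in Python as follows. The Triple type, the lookup helper and the colour fact are hypothetical and only serve the example.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    obj: str
    attribute: str
    value: str
    certainty: float = 1.0  # 1.0 means the fact is known precisely

working_memory = [
    Triple("car", "make", "ford", 0.9),   # corresponds to (make car ford .9)
    Triple("car", "colour", "red"),       # hypothetical extra fact
]

def lookup(memory, obj, attribute):
    """Return the (value, certainty) pairs stored for an object's attribute."""
    return [(t.value, t.certainty) for t in memory
            if t.obj == obj and t.attribute == attribute]

print(lookup(working_memory, "car", "make"))
```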

Figure 3.1: Object-attribute-value (the object car is linked to the value ford by the attribute make)

3.1.2 Rules

The other main form of representation used in the knowledge base is the rule. This consists of a number of antecedents (or the premise) so that if we know they are all true, then we can infer the conclusion of the rule. These are normally called IF/THEN rules. It is also possible to include disjunctions in the premise. When the premise is known to be true with respect to the working memory, the rule is said to be active. If it is then selected by some strategy then it is said to fire.

IF

antecedents/premises

THEN

consequents/conclusions

Examples

IF joe has his toy car

AND (mary comes to play OR bill comes to play)

THEN

joe is happy

Rules may also contain some procedural information, which will involve some explicit calculation.

IF X is a square

AND (edge of X is E)

THEN

area A is E^2

AND

circumference C is 4*E


IF X is a rectangle

AND width of X is W

AND depth of X is D

THEN

area A is W*D

AND

circumference C is 2*(W+D)

Types of rules

We can classify various rules in an ES as illustrated by Table 3.2 (not necessarily disjoint, so some rules could be in several categories).

Relationship    IF X is father of Y
                THEN X is parent of Y

Recommendation  IF patient has flu
                THEN send patient to bed

Directive       IF horn doesn’t work
                THEN check fuse two

Strategy        IF car stalls
                THEN check points first then carburettor

Heuristic       IF car is a mini
                AND car won’t start
                THEN check starter motor

Table 3.2: Types of Rules


3.1.3 Meta rules

These are rules about rules.

Example

IF suspect patient has AIDS

THEN load AIDS knowledge base

This is giving instructions to load a new rule set for a particular application.

3.1.4 Frames

The use of frames is very similar to object-oriented programming, where a frame instance is equivalent to an object and a frame class equivalent to a class. There are methods called facets which can control the values of certain properties.

Figure 3.2: Frame Structure (labels shown: Frame Name, Class, Properties with entries Prop1, Prop2, ... and values Val1, Val2, ..., Obj1, Obj2)

A class represents the characteristics of a collection of objects. For example Figure 3.3 represents the class of events which have a name Event and properties/slots time, date and place. Some of these properties could have default values. Below, the date has default value 1/1/95.

An instance frame is then a particular object from a class. In Figure 3.4 we see an object from the class Event.

Classes can be organised in hierarchies. We say that one class is a subclass of another class. Properties of frames lower in the hierarchy are inherited from ancestors or possibly overwritten. Figure 3.5 illustrates a hierarchy of classes which are subclasses of the Event frame.

Figure 3.3: Event Class (Frame Name: Event; Properties: Time of type TimeType, Date with default 1/1/95, Place of type String)

Figure 3.4: Frame Instance (Frame Name: event1; Class: Event; Properties: Time 10:30, Date 12/3/94, Place Swansea)

Constraints on values of properties of a given frame or other frames can be specified by giving facets. These are methods which tell the system how to evaluate a property under certain conditions. For example in CLIPS default values themselves have to be specified by means of a default facet.

A facet may be of the form IF-CHANGED which indicates how a property of some frame may be affected if the property of a given frame is changed. For example in the hierarchy in Figure 3.5, the Hosts slot in Celebration should be constrained by the Name slot in Birthday or the Bride and Groom in Wedding. If for instance we change the name of the person having the birthday then the Hosts slot should be changed to this singleton list.
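Since frames are close to classes in object-oriented programming, the Event frames and the IF-CHANGED behaviour just described can be sketched with Python classes. This is only an analogy under stated assumptions: the attribute person stands in for the Name slot of Birthday, and the instance party1 with its values is invented for the example.

```python
class Event:
    """Frame class with slots time, date and place; date has a default."""
    def __init__(self, name, time=None, date="1/1/95", place=None):
        self.name, self.time, self.date, self.place = name, time, date, place

class Celebration(Event):
    """Subclass frame: inherits Event's slots and adds hosts and guests."""
    def __init__(self, name, hosts=None, guests=None, **slots):
        super().__init__(name, **slots)
        self.hosts = list(hosts or [])
        self.guests = list(guests or [])

class Birthday(Celebration):
    """The person attribute plays the role of the Name slot."""
    def __init__(self, name, person, age=None, **slots):
        super().__init__(name, **slots)
        self.age = age
        self.person = person  # goes through the setter below

    @property
    def person(self):
        return self._person

    @person.setter
    def person(self, value):
        # IF-CHANGED behaviour: a new name resets Hosts to a singleton list.
        self._person = value
        self.hosts = [value]

event1 = Event("event1", time="10:30", date="12/3/94", place="Swansea")
party = Birthday("party1", person="joe", age=7, place="Swansea")
party.person = "mary"   # triggers the IF-CHANGED update of hosts
print(party.hosts, party.date)
```

The property setter is one way to attach behaviour to a slot; a frame system would normally declare such a facet on the slot itself.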

Figure 3.5: Frame Hierarchy

Event (Time: TimeType, Date: 1/1/95, Place: String)
    Disaster (Damage, Fatalities: Integer)
        Flood (River: String, Height: Integer)
        Earthquake (Fault: String, Mag: Integer)
    Celebration (Hosts: List, Guests: List)
        Wedding (Bride: String, Groom: String)
        Birthday (Name: String, Age: Integer)

3.1.5 Logic

Propositional and predicate logic provide a very general method for representing knowledge, but are usually too low level in practice. The language Prolog is based on a subset of predicate logic.

Propositional logic formulas are built up from the constants true or false, variables which can only take the Boolean values true or false and the connectives →, ∨, ∧ and ¬. These denote implication, or, and and not and are interpreted in the standard manner.

As propositional logic is of limited power we shall go on to describe predicate logic (or first order logic).

Predicate Calculus

The language will contain symbols for constants, relations (predicates), functions and variables. Terms are constructed from functions, constants and variables and are the expressions which can denote objects. For example

father(bill); mother(father(joe)); child(mary,joseph)

We can then have atomic formulas of the form p(t1, . . . , tn) where p is a predicate or relational symbol and the ti are terms. Formulas can then be built up from atomic formulas using the boolean connectives and the quantifiers ∀ and ∃.

Examples

∀M((∃W married(M, W) ∧ man(M)) → happy(M))

has intended meaning all married men are happy.

∀P((person(P) ∧ lives_in(P, LA)) → ∃C(car(C) ∧ own(P, C)))

has intended meaning everybody who lives in LA owns a car.

Once one has a representation of some knowledge in predicate calculus then one can use some standard proof technique to reason from this knowledge. For example, from the above, if we know lives_in(bill, LA) and person(bill) then we can infer own(bill, car1) ∧ car(car1) for some object car1 (using the proof rule modus ponens; see later chapters).


Chapter 4

Methods of Inference

In this chapter, we outline the basic methods used by expert systems to infer results from a knowledge base. Results are proved by reference to the knowledge base and the working memory using proof rules applied under some control strategy.

4.1 Human Reasoning

In this section, we outline some of the reasoning strategies employed by humans (and possibly by machine) to obtain some kind of conclusion.

4.1.1 Deductive

This is the method of using standard proof rules, such as modus ponens, to infer new facts. This was illustrated at the end of the last chapter.

4.1.2 Inductive

This is the process of generalising from some particular instances or examples. For example

Observed Facts: Oak trees have green leaves

Pine trees have green leaves

Inferred Conclusion: All trees have green leaves

So from the particular we have inferred something more general. Of course this might not always be valid.

4.1.3 Analogical

The process of reasoning from a solved problem which seems similar to the problem to be solved. For example:

Observed/Known Fact: In UNIX ls p* lists all files beginning with p

Conclusion: ls x* will list all files beginning with x


4.1.4 Common Sense

The process of making use of knowledge which would normally be expected to be known by most humans. Example:

A ball thrown in the air will fall back to earth

Humans are usually less than 100 years old and shorter than 3m

This sort of knowledge has been naturally acquired over the years by a human. It has to be programmed into a computer system in some explicit form. The CYC project, led by Lenat, is at present trying to put a great deal of common sense knowledge into the system being produced.

4.1.5 Non-monotonic

In the classic form of deduction if new results are derived then they will not contradict results already derived. This is called monotonic reasoning. In contrast, non-monotonic reasoning can produce results which can contradict facts previously derived.

One type of non-monotonic reasoning can occur when dealing with temporal facts. Suppose we are dealing with an airline reservation system. Initially, the airplane has vacant seats which we can denote by seats_available(plane1). Eventually it is filled and we have not seats_available(plane1). But suppose we have a cancellation, then we have again seats_available(plane1). So we see that the truth value of seats_available(plane1) has changed over time. In this simple example we can avoid this problem by explicitly introducing an extra parameter denoting time, but this then complicates the representation.

Another classic example of non-monotonic reasoning is when dealing with default reasoning. We present the standard example.

Example

We have the default rule for dealing with birds:

bird(X) → fly(X)

all birds fly, and the exception rule

bird(X) ∧ penguin(X) → ¬fly(X)

penguins don’t fly.

Now suppose we know bird(penny); then using the default rule we’d conclude fly(penny). But once we have the extra information penguin(penny) we now conclude ¬fly(penny). The truth value of fly(penny) has changed.
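The change of conclusion can be reproduced with a small Python sketch. Giving the exception rule priority over the default rule is an assumption of this sketch (one common way to resolve the conflict), and the function name flies and the tuple encoding of facts are illustrative.

```python
def flies(individual, facts):
    """Apply the exception rule first, then the default rule."""
    if ("bird", individual) in facts and ("penguin", individual) in facts:
        return False          # exception rule: penguins don't fly
    if ("bird", individual) in facts:
        return True           # default rule: birds fly
    return None               # nothing known either way

facts = {("bird", "penny")}
before = flies("penny", facts)      # conclusion from the default rule only
facts.add(("penguin", "penny"))
after = flies("penny", facts)       # conclusion once the new fact is known
print(before, after)
```

Adding a fact has withdrawn an earlier conclusion, which is exactly what monotonic reasoning rules out.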

4.2 Machine Inference

An ES will use machine inference to deduce new facts/results from the knowledge base and information in the working memory. This is illustrated in Figure 4.1.

Figure 4.1: Inference Engine (labels shown: Knowledge Base, Working Memory, Inference Engine, Rules, Facts)

The actual operation of the inference engine can be quite complex. There are essentially two basic forms: forward and backward chaining engines. The former has its basis in natural deduction where the main rule is modus ponens and the latter is exemplified by the resolution proof method.

4.2.1 Modus Ponens

As previously stated, the rule of modus ponens is

FROM:

A and A → B

INFER

B

We can repeatedly use modus ponens to derive new results from some facts and rules. Asimple example follows:

student(S) ∧ takes(S, ai) → takes(S, prolog)

student(T) ∧ takes(T, expsys) → takes(T, ai)

student(joe)

takes(joe, expsys)


From the facts and the rules with two applications of modus ponens we can derive

takes(joe, prolog)

This is the type of proof method used if we wish to generate new information from some given facts.
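The two applications of modus ponens can be traced with a short Python sketch. Atoms are encoded as tuples and any term starting with an upper-case letter is treated as a variable; this encoding and the helper names match, bindings_for and modus_ponens are assumptions made for the illustration.

```python
def match(pattern, fact, binding):
    """Extend binding so that pattern equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    binding = dict(binding)
    for p, f in zip(pattern, fact):
        if p[0].isupper():                 # variable, e.g. "S"
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:                       # constant or predicate name
            return None
    return binding

def bindings_for(premises, facts, binding=None):
    """Yield every binding that makes all premises match known facts."""
    binding = binding or {}
    if not premises:
        yield binding
        return
    for fact in facts:
        extended = match(premises[0], fact, binding)
        if extended is not None:
            yield from bindings_for(premises[1:], facts, extended)

def modus_ponens(rule, facts):
    """Return the conclusions obtained by one application of the rule."""
    premises, conclusion = rule
    return {tuple(b.get(term, term) for term in conclusion)
            for b in bindings_for(premises, facts)}

rule1 = ([("student", "S"), ("takes", "S", "ai")], ("takes", "S", "prolog"))
rule2 = ([("student", "T"), ("takes", "T", "expsys")], ("takes", "T", "ai"))
facts = {("student", "joe"), ("takes", "joe", "expsys")}

facts |= modus_ponens(rule2, facts)   # first application adds takes(joe, ai)
facts |= modus_ponens(rule1, facts)   # second application adds takes(joe, prolog)
print(sorted(facts))
```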

4.2.2 Resolution

If we just wish to try and prove a particular goal from some facts and rules, then the method of resolution is more directed. The simplest form of resolution is for propositional clauses. Given two clauses of the form

A1 ∨ A2 ∨ . . . ∨ An ∨ B and ¬B ∨ C1 ∨ . . . ∨ Cm

a resolvent of these clauses is

A1 ∨ A2 ∨ . . . ∨ An ∨ C1 ∨ . . . ∨ Cm

We can think of the B’s cancelling out.

To introduce the idea of resolution for clauses with variables, we must first define substitutions. A substitution is an assignment to the variables in a clause, and if θ is a substitution then Cθ is the clause obtained by the simultaneous substitution of the variables throughout.

Given two clauses of the form

A1 ∨ A2 ∨ . . . ∨ An ∨ B and D ∨ C1 ∨ . . . ∨ Cm

so that there is a θ which makes Bθ and Dθ negations of each other, then a resolvent of the clauses is

(A1 ∨ A2 ∨ . . . ∨ An ∨ C1 ∨ . . . ∨ Cm)θ

i.e. the elimination is carried out after the substitution has been made.

Now if we have two clauses C1 and C2 which have a resolvent R, then if C1 and C2 are both satisfied in some structure, then R will also be satisfied in that structure. The method of resolution theorem proving is then based on the following idea:

Take the negation ¬G of the goal to be proved. Keep on forming resolvents using this negated goal and the rest of the facts and rules, Ax, and any resolvents previously generated. If we arrive at the empty clause, which cannot be satisfiable, then it means that Ax together with ¬G cannot be satisfied. If ϕ is the composition of the substitutions then we know Gϕ must be true in any structure which satisfies Ax.

This type of proof is called proof by refutation.

Example

We can re-express the previous example as clauses:

¬student(S) ∨ ¬takes(S, ai) ∨ takes(S, prolog)

¬student(T) ∨ ¬takes(T, expsys) ∨ takes(T, ai)

student(joe)

takes(joe, expsys)

Now suppose we want to find a solution to the goal takes(S, prolog). Then we first take its negation

¬takes(S, prolog)

Resolving with clause 1 we get

¬student(S) ∨ ¬takes(S, ai)

This can then be resolved with clause 2 (eliminating the repeated occurrence of ¬student(S))

¬student(S) ∨ ¬takes(S, expsys)

Resolving this with clause 4 gives

¬student(joe)

and finally resolving with clause 3 yields the empty clause. So we have proved

takes(S, prolog)

for some S, and furthermore obtained joe as a solution for S.

The method of resolution is the basis of Logic Programming and Prolog in particular. For an actual implementation of a resolution theorem prover, it would be necessary to give details of the control provided by the inference engine, for example which clause is chosen for resolution when several are possible.
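To make the refutation above concrete, here is a minimal Python sketch. It is one possible control strategy, not the one prescribed by these notes: it always resolves the leftmost literal of the current goal clause against the positive literal of a clause, trying the clauses in the order listed (roughly what Prolog does). As a result the literals are eliminated in a different order from the hand derivation, but the same answer is found. The names `is_var`, `walk`, `unify`, `rename`, `refute` and `CLAUSES` are our own, and the unifier assumes terms are plain variables (capitalised) or constants, with no function symbols.

```python
import itertools

# Each clause is (positive literal, [atoms of its negated literals]).
CLAUSES = [
    (("takes", "S", "prolog"), [("student", "S"), ("takes", "S", "ai")]),
    (("takes", "T", "ai"), [("student", "T"), ("takes", "T", "expsys")]),
    (("student", "joe"), []),
    (("takes", "joe", "expsys"), []),
]
fresh = itertools.count()


def is_var(term):
    """Variables start with an upper-case letter, as in Prolog."""
    return term[0].isupper()


def walk(term, theta):
    """Follow variable bindings in the substitution theta."""
    while is_var(term) and term in theta:
        term = theta[term]
    return term


def unify(atom1, atom2, theta):
    """Return theta extended so that both atoms agree, or None if impossible."""
    if atom1[0] != atom2[0] or len(atom1) != len(atom2):
        return None
    theta = dict(theta)
    for s, t in zip(atom1[1:], atom2[1:]):
        s, t = walk(s, theta), walk(t, theta)
        if s == t:
            continue
        if is_var(s):
            theta[s] = t
        elif is_var(t):
            theta[t] = s
        else:
            return None  # two different constants
    return theta


def rename(clause):
    """Give the clause fresh variable names so separate uses do not clash."""
    n = next(fresh)

    def ren(atom):
        return (atom[0],) + tuple(f"{a}_{n}" if is_var(a) else a for a in atom[1:])

    head, body = clause
    return ren(head), [ren(b) for b in body]


def refute(goals, theta):
    """Yield substitutions under which the negated goals reach the empty clause."""
    if not goals:
        yield theta  # empty clause derived
        return
    first, rest = goals[0], goals[1:]
    for clause in CLAUSES:
        head, body = rename(clause)
        new_theta = unify(first, head, theta)
        if new_theta is not None:
            yield from refute(body + rest, new_theta)


answer = next(refute([("takes", "S", "prolog")], {}))
print(walk("S", answer))  # joe
```

The composition of substitutions mentioned earlier is held in the single dictionary `theta`, and `walk` reads the final binding of S out of it.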

4.3 Forward chaining

The forward chaining inference strategy proceeds by matching the premises in a rule with facts in the working memory (by instantiating certain variables) and then adding the instantiated conclusion to the working memory. When a match of the premises is made and the conclusion added to the WM, we say the rule fires. The actual control could be given by the following algorithm:

add new information to working memory
REPEAT
    rule <- first rule
    WHILE NOT(new match of premise with working memory) AND more rules
        rule <- next rule
    IF new match of premise with working memory
        THEN add conclusion to working memory
UNTIL NOT(new match of premise with working memory)

Example

1 IF patient has sore throat
  AND suspect bacterial infection
  THEN patient has strep throat

2 IF patient temp > 38 C
  THEN patient has fever

3 IF patient sick over 1 month
  AND patient has fever
  THEN suspect bacterial infection

If we add the following information to the working memory:

patient temp = 39 C

patient sick for 2 months

patient has sore throat

then three pieces of new information are deduced:

patient has fever

suspect bacterial infection

patient has strep throat

The forward chaining method produces all facts which are deducible from the knowledge base and the initial working memory. In this sense it is not goal-directed and can produce results in which the user has no interest.

If the user is only interested to see if a particular goal is deducible, then the iteration can be stopped when and if the goal arises. However, there could still be many unwanted results inferred.
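The control algorithm and the patient example above can be sketched in Python as follows. This is a simplified illustration with several assumptions of our own: the rules contain no variables, so a "new match" is approximated by "the premises hold and the conclusion is not yet in the working memory"; the working memory is a dictionary; and the key names such as `"patient temp (C)"` and the function name `forward_chain` are hypothetical.

```python
def forward_chain(rules, initial_facts):
    """Fire rules until none gives a new match; return (working memory, firing order)."""
    wm = dict(initial_facts)
    fired = []
    while True:                                      # REPEAT
        for number, premise, conclusion in rules:    # scan from the first rule
            # "New match": premises hold and the conclusion is not yet in WM.
            if conclusion not in wm and premise(wm):
                wm[conclusion] = True                # the rule fires
                fired.append(number)
                break
        else:                                        # UNTIL no rule gave a new match
            return wm, fired


# (rule number, premise test on the working memory, conclusion)
RULES = [
    (1, lambda wm: wm.get("patient has sore throat")
        and wm.get("suspect bacterial infection"),
     "patient has strep throat"),
    (2, lambda wm: wm.get("patient temp (C)", 0) > 38,
     "patient has fever"),
    (3, lambda wm: wm.get("patient sick (months)", 0) > 1
        and wm.get("patient has fever"),
     "suspect bacterial infection"),
]

INITIAL = {
    "patient temp (C)": 39,
    "patient sick (months)": 2,
    "patient has sore throat": True,
}

wm, fired = forward_chain(RULES, INITIAL)
print(fired)  # [2, 3, 1]
```

The firing order 2, 3, 1 corresponds to the three deductions listed above: fever first, then the suspected infection, then strep throat.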

4.3.1 Conflict Resolution

At any time during a deduction, it is possible that several rules could fire. In the previous section, one control strategy was given which determined which rule should fire next (basically, search the KB from top to bottom until a rule is found which can fire and whose match has not already been used). The choice of firing rule can lead to different orders in the generation of the results. It can also lead to conflicting results or advice.

Example

1 IF temperature > 40
  AND steam from outlet
  THEN open valve A

2 IF pressure < 20
  AND inlet open
  THEN close valve A

If all the premises are true (which seems possible) then we have a conflict: to open and close valve A!

One possible solution is to stop the deductions when a required goal has first been established. The ordering of the rules will then determine which rules fire. This is not always desirable and there are many other methods of conflict resolution.

For example a priority can be attached to rules, which reflects their importance. The rules with the higher priorities are then fired first if there are several matches.

Given some procedure for resolving conflicts, the inference engine cycles through the rules processing as follows:

1. Match premises against WM and collect rules which can fire.

2. Use the conflict resolution strategy to select a rule from this set.

3. Fire this rule, adding conclusion to WM.

Some conflict strategies are listed below:

1. Use first match.

2. Highest priority.

3. Most specific.

4. Use rule referring to most recent addition to WM.

5. Don’t reuse a rule.

6. Fire all rules and maintain tree of possible deductions.

7. Random.

Strategies 1 and 2 have already been mentioned. Strategy 3 means use the rule with the most premises, as it appears to make use of more information. Strategy 4 assumes that the newer information will be more important in firing rules. Strategy 5 prevents looping. Strategy 6 fires all the rules and creates separate working memories for each choice, and 7 just chooses the rule at random. CLIPS provides about seven different strategies.
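The match, select, fire cycle with a pluggable conflict resolution strategy might be sketched as below, using the two valve rules from the example. This is an illustration only: the class `Rule`, the functions `first_match`, `highest_priority`, `most_specific` and `run`, and in particular the priority values 5 and 10, are our own assumptions and do not come from the notes or from CLIPS. Strategy 5 (don't reuse a rule) is built into the cycle to guarantee termination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    premises: frozenset
    conclusion: str
    priority: int = 0   # hypothetical importance; larger fires first


def first_match(conflict_set):
    """Strategy 1: take the first rule in KB order."""
    return conflict_set[0]


def highest_priority(conflict_set):
    """Strategy 2: take the rule with the largest priority."""
    return max(conflict_set, key=lambda rule: rule.priority)


def most_specific(conflict_set):
    """Strategy 3: take the rule with the most premises."""
    return max(conflict_set, key=lambda rule: len(rule.premises))


def run(rules, facts, select):
    """Cycle: match premises, select one rule from the conflict set, fire it."""
    wm = set(facts)
    fired = []
    while True:
        # 1. Match: rules whose premises are all in WM and which have not fired yet.
        conflict_set = [r for r in rules if r.premises <= wm and r.name not in fired]
        if not conflict_set:
            return wm, fired
        rule = select(conflict_set)     # 2. Conflict resolution
        wm.add(rule.conclusion)         # 3. Fire
        fired.append(rule.name)


RULES = [
    Rule("rule 1", frozenset({"temperature > 40", "steam from outlet"}),
         "open valve A", priority=5),
    Rule("rule 2", frozenset({"pressure < 20", "inlet open"}),
         "close valve A", priority=10),
]
FACTS = {"temperature > 40", "steam from outlet", "pressure < 20", "inlet open"}

print(run(RULES, FACTS, first_match)[1])       # ['rule 1', 'rule 2']
print(run(RULES, FACTS, highest_priority)[1])  # ['rule 2', 'rule 1']
```

Note that the strategy only decides the order of firing. Both conclusions still end up in the working memory here, so a real system would also need a stopping condition or a way of retracting the superseded advice.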

4.4 Backward chaining

We start with a goal to be proved. If it is not already in the working memory, then we look for a rule whose then-part matches the goal. The premises are then launched as subgoals (and when proved added to the working memory). We continue in this recursive manner until a subgoal is found which is not in the working memory and there is no rule whose then-part matches it. The system then asks the user to determine its validity (possibly instantiating some free variables). In essence a tree is generated with the goal at the root and subgoals at descendant nodes. At the leaves are nodes which are in the working memory or have been justified by asking the user.

Example

1 IF temperature too high
  AND pressure too low
  AND valve stuck
  THEN shutdown reactor

2 IF gauge1 red
  AND gauge2 yellow
  THEN temperature too high

3 IF gauge3 blue
  AND gauge2 yellow
  THEN pressure too low

4 IF red light on
  THEN valve stuck

If we wish to know whether to shut down the reactor we use backward chaining from the goal shutdown reactor. This produces the proof tree

shutdown reactor
    temperature too high
        gauge1 red
        gauge2 yellow
    pressure too low
        gauge3 blue
        gauge2 yellow
    valve stuck
        red light on

Figure 4.2: Proof Tree

The truth values of the leaves in Figure 4.2 are provided by the user.
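A minimal Python sketch of backward chaining on the reactor rules follows. It relies on assumptions that go beyond the notes: each conclusion is produced by exactly one rule (a general system would try alternative rules for the same goal), the rules contain no variables, and both true and false results are cached in the working memory so the user is never asked the same question twice. The names `RULES`, `prove` and `ask_user` are hypothetical, and the user's answers are canned so that the example runs without interaction.

```python
# then-part -> list of premises, taken from rules 1 to 4 above
RULES = {
    "shutdown reactor": ["temperature too high", "pressure too low", "valve stuck"],
    "temperature too high": ["gauge1 red", "gauge2 yellow"],
    "pressure too low": ["gauge3 blue", "gauge2 yellow"],
    "valve stuck": ["red light on"],
}


def prove(goal, wm, ask):
    """Return True if goal can be established, recording results in wm."""
    if goal in wm:                       # already known
        return wm[goal]
    if goal in RULES:                    # launch the premises as subgoals
        result = all(prove(sub, wm, ask) for sub in RULES[goal])
    else:                                # a leaf: no rule concludes it, so ask
        result = ask(goal)
    wm[goal] = result
    return result


# Canned answers standing in for the user.
answers = {"gauge1 red": True, "gauge2 yellow": True,
           "gauge3 blue": True, "red light on": True}
asked = []


def ask_user(question):
    asked.append(question)
    return answers[question]


print(prove("shutdown reactor", {}, ask_user))  # True
print(asked)  # ['gauge1 red', 'gauge2 yellow', 'gauge3 blue', 'red light on']
```

The questions correspond to the leaves of the proof tree, and gauge2 yellow is asked only once because its answer is already in the working memory when the second subgoal needs it. Since `all` stops at the first failing subgoal, a false answer cuts the questioning short.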

4.4.1 Goal Agenda

A goal agenda is a sequence of goals to be launched in a given order. The behaviour to be followed once a goal has been found true or false can be determined by the user. So for example we could have:

1. Try and prove all goals on the agenda;

2. Stop when first true goal found.

Example

Suppose we have the following agenda:

1. Change spark plugs

2. Change points

3. Clean carburettor

Then in the first case we would find out, for every item on the agenda, whether it should be carried out, and in the second case we would just perform the first one which succeeds.

More complicated agendas can be given, where the items are arranged in a hierarchy.
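The two behaviours for a flat agenda can be sketched as follows. The function name `run_agenda`, its `stop_at_first_true` flag and the canned outcomes are our own inventions for illustration; in a real system `prove` would be a call to the backward chainer.

```python
def run_agenda(agenda, prove, stop_at_first_true=False):
    """Launch the goals in order and return the outcome of each goal tried."""
    results = {}
    for goal in agenda:
        results[goal] = prove(goal)
        if stop_at_first_true and results[goal]:
            break                      # behaviour 2: stop at the first true goal
    return results


AGENDA = ["change spark plugs", "change points", "clean carburettor"]

# Hypothetical outcomes standing in for a real inference engine.
outcomes = {"change spark plugs": False,
            "change points": True,
            "clean carburettor": True}

print(run_agenda(AGENDA, outcomes.get))
print(run_agenda(AGENDA, outcomes.get, stop_at_first_true=True))
```

With these outcomes the first call reports on all three goals, while the second stops after change points and never considers the carburettor.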

4.5 Advantages & Disadvantages

There are pros and cons for both approaches which we now consider.

Forward Chaining

Pros

• Natural approach when we start by collecting information and deducing results from it.

• May obtain large amount of information from small KB.

• Ideal for planning, monitoring or control.

Cons

• Not directed by goal and so can generate irrelevant queries and results.

Backward Chaining

Pros

• Natural approach when starting from a hypothesis to be proved.

• The search is focussed on a goal. Questions seem relevant.

• Only searches tree relevant to proving the goal.

• Ideal for diagnosis, prescription and debugging.

Cons

• Can continue down a branch of the tree when the search should have been abandoned.

4.6 Choice of Approach

As a rough guide to choosing the method, consider how the human expert works.

• If data is collected and then results inferred, use forward chaining.

• If a hypothesis is created and then proved, use backward chaining.

It is possible to combine the two approaches. The system may be separated into modules so that modules use their own preferred proof technique. This could use meta-rules to select the appropriate module.
