
Inductive Inference in the Variable Valued Predicate Logic System VL21: Methodology and Computer Implementation

by

James B. Larson

Department of Computer Science University of Illinois

Urbana, Illinois

May 1977

This work was submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate College of the University of Illinois and was supported in part by the National Science Foundation, Grant No. NSF MCS 74-03514.

INDUCTIVE INFERENCE IN THE VARIABLE VALUED PREDICATE LOGIC SYSTEM VL21: METHODOLOGY AND COMPUTER IMPLEMENTATION

James Burton Larson, Ph.D.

Department of Computer Science

University of Illinois at Urbana-Champaign, 1977

A formal methodology and computer program are presented for the transformation of a set of user supplied logical decision rules into a new, generalized set of decision rules which is near optimal according to a user supplied criterion. The VL21 logic system (a multi-valued version of a first order predicate calculus) is used as the framework for defining and expressing decision rules and transformations on decision rules. The program INDUCE_1, which implements certain inductive inference rules using a graphical representation of VL21 expressions, is described and some examples of inductive problems solved by the program are given.

ACKNOWLEDGEMENTS

I would like to express my gratitude to the following people for their help on this thesis: first, Professor R. S. Michalski for his inspiration, challenging problems, and many significant suggestions, especially regarding meta-functions; and my committee, Professor D. Plaisted and Dr. Don Friesen, for encouragement and many helpful discussions. I especially wish to recognize the valuable support of Richard Chilausky during the early stages of this research, Barr Segal for proofreading of the manuscript, and the many predecessors in the area of inductive inference and variable valued logic whose efforts have been much appreciated in the development of this thesis. I am grateful for the financial support of the National Science Foundation, the Department of Computer Science, the Research Board of the University of Illinois, and the Computing Services Offices, the latter for supplying the computer time to implement the text formatter used for this paper.

Finally, without the moral encouragement of my wife, Rhonda, this work surely would have ended years ago from utter frustration and discouragement.

PREFACE

This paper was prepared on a CYBER 175 computer at the University of Illinois using a text formatter written by the author. Since several special characters were not available on the print train of the printer, the following character combinations have special meaning:

Character Meaning

logical disjunction

logical conjunction

set intersection

the existential quantifier

the universal quantifier


TABLE OF CONTENTS

CHAPTER

1. Introduction
   1.1 The Problem
   1.2 Previous Specific Applications
   1.3 Formal Systems for Inductive Inference
   1.4 Overview of the Following Chapters

2. Representing Decisions in the VL2 System
   2.1 VL System Structure
   2.2 Selector Formation and Interpretation Rules
   2.3 VL Formation Rule
   2.4 Interpretation Rules
   2.5 VL Decision Rules

3. VL Transformation Rules
   3.1 Equivalence Transformation
   3.2 Generalizing Transformations
   3.3 Specializing Transformations
   3.4 Transformation Rules Involving the Decision Part
   3.5 Example of Application of Transformation Rules
   3.6 Efficient Application of Generalization Rules

4. Computer Representation of VL Decision Rules

5. Algorithms and Computer Implementation
   5.1 Input to the Program
   5.2 Program Output
   5.3 Formation of a Complete Generalization
   5.4 Determine Cover and Intersection of 2 Formulas
   5.5 Trimming a Set of c-formulas
   5.6 Formation of a Set of Consistent Generalizations
   5.7 Extending the References of a Consistent c2-formula
   5.8 Adding New Functions and Predicates to c-formulas

6. Examples of Decision Rule Generation using INDUCE_1
   6.1 Figures Example (EX 1)
   6.2 Arch Example (EX 2)
   6.3 Trains (EX 3)
   6.4 Textures (EX 4)

7. Current Limitations and Possible Extensions

LIST OF REFERENCES

APPENDIX A

APPENDIX B

VITA

1. Introduction

An important problem which is often presented to computer systems is that of extracting relevant information from complex data in order to gain a better understanding of the meaning behind such data. Most current methods are incapable of adequately describing highly structured situations and produce results which are difficult to interpret. A selection of those systems which overcome these difficulties is given in sections 1.2 and 1.3.

The following chapters deal with finding useful (as defined by the user with an optimality criterion), generalized information about sets of situations represented as logical VL decision rules. A decision rule is a form

CONDITION => DECISION

where CONDITION describes some set of situations and DECISION describes some new situation or action which is indicated if an existing situation satisfies the description in CONDITION. If no situation satisfies the CONDITION in a decision rule, the rule makes a NULL decision. The descriptions in CONDITION and DECISION are represented in the VL2 logic system. This system is a variable valued first order predicate calculus with a rich set of operators and the facility for allowing user defined domain sizes and structures for variables and functions appropriate for the problem at hand. The approach taken here is to apply inductive inference rules to logical decision rules which express some decisions made with sets of situations in order to form new, near-optimal decision rules which retain the decision making capabilities of the original rules.

1.1 The Problem

The specific induction problem being investigated is as follows:

Given a set of decision rules:

$$
\begin{array}{llll}
C_{1,1} \Rightarrow D_1, & C_{1,2} \Rightarrow D_1, & \ldots, & C_{1,t_1} \Rightarrow D_1 \\
C_{2,1} \Rightarrow D_2, & C_{2,2} \Rightarrow D_2, & \ldots, & C_{2,t_2} \Rightarrow D_2 \\
\vdots & & & \\
C_{n,1} \Rightarrow D_n, & C_{n,2} \Rightarrow D_n, & \ldots, & C_{n,t_n} \Rightarrow D_n
\end{array}
\tag{1.1}
$$

where $C_{i,j}$ and $D_i$ are expressions in the VL2 system which represent the CONDITION and DECISION parts of decision rules respectively, then find, through an application of generalization rules, a set of VL decision rules:

$$
\begin{array}{l}
C_1 \Rightarrow D_1 \\
C_2 \Rightarrow D_2 \\
\vdots \\
C_n \Rightarrow D_n
\end{array}
\tag{1.2}
$$

which are, with regard to the rules (1.1):

1) consistent

2) complete

3) optimal with regard to a user supplied optimality criterion.

The new rules are consistent if, for any situation for which the new rules assign a decision (a non-NULL decision), the initial rules assign the same decision or a NULL decision. They are complete if, for any situation for which the initial rules assign a decision, the new rules assign a decision. From the initial rules, it is usually possible to derive many sets of rules which are consistent and complete. Therefore, a criterion of optimality (defined by a user according to his problem) is used to select a few alternatives which are most desirable according to the specific induction problem. Attention is restricted to sets of rules which make only one decision for a given situation.
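The two requirements can be illustrated with a small sketch. It is not taken from INDUCE_1: the representation of a situation as a dictionary, of a rule as a (condition, decision) pair, and the names `decide`, `is_consistent` and `is_complete` are all assumptions made for illustration, and the check simply enumerates a finite list of situations.

```python
from typing import Callable, Hashable, List, Optional, Tuple

# Hypothetical encoding: a rule is (CONDITION predicate, DECISION value).
Rule = Tuple[Callable[[dict], bool], Hashable]


def decide(rules: List[Rule], situation: dict) -> Optional[Hashable]:
    """Return the decision of the first satisfied rule, or None for a NULL decision."""
    for condition, decision in rules:
        if condition(situation):
            return decision
    return None


def is_consistent(new_rules: List[Rule], initial_rules: List[Rule], situations: List[dict]) -> bool:
    """Wherever the new rules decide, the initial rules agree or make a NULL decision."""
    for s in situations:
        new, old = decide(new_rules, s), decide(initial_rules, s)
        if new is not None and old is not None and new != old:
            return False
    return True


def is_complete(new_rules: List[Rule], initial_rules: List[Rule], situations: List[dict]) -> bool:
    """Wherever the initial rules decide, the new rules also decide."""
    return all(
        decide(new_rules, s) is not None
        for s in situations
        if decide(initial_rules, s) is not None
    )


# Illustrative data (not from the thesis): one attribute "size" with values 1..4.
situations = [{"size": k} for k in range(1, 5)]
initial = [(lambda s: s["size"] == 1, "small"), (lambda s: s["size"] == 4, "large")]
generalized = [(lambda s: s["size"] <= 2, "small"), (lambda s: s["size"] >= 3, "large")]
too_general = [(lambda s: s["size"] <= 4, "small")]

print(is_consistent(generalized, initial, situations), is_complete(generalized, initial, situations))
print(is_consistent(too_general, initial, situations))
```

Here `generalized` extends both initial rules without contradicting them, while `too_general` assigns "small" to a situation that the initial rules call "large" and is therefore rejected as inconsistent.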

1.2 Previous Specific Applications

Inductive inference is used here to describe a formal method for rewriting or generalizing available data in order to give new information about a problem and make new decisions which could not be obtained before. Statistical methods are probably the most widely used forms of inductive inference. These methods require a great deal of a-priori knowledge, including the availability of a large set of data, knowledge about the interdependence of variables, and an understanding of the type of underlying distribution of the data [Croft 71]. In addition, statistical results may be difficult to read and interpret [Larson 76] (e.g. a conditional probability matrix).

The first approach to automated inductive inference using logic was most likely developed by Hunt [Hunt 66]. He described a number of different schemes for generating decision trees which can be used to distinguish between sets of letter sequences. Although a decision tree produces an elegant procedure which can be easily executed on a computer, it lacks the flexibility necessary to represent more general concepts.

The HEURISTIC-DENDRAL program [Buchanan et.al. 69] provides a model appropriate for representing the structure of chemical compounds and some transformations representing possible chemical reactions which can be applied to the compound representations under certain known physical constraints. The program finds a set of possible structures of a compound, knowing its empirical formula and mass spectrometer data, by suggesting various structures for the compound and applying transformations to the structures under the guidance of a set of heuristics based on the mass spectrometer data. The meta-DENDRAL program [Buchanan et.al. 72] finds a general mechanism or theory which explains the transformations which take place, relying on the knowledge of those transformations which are plausible and those which are forbidden.

Computer aided medical diagnosis is another area in which logical inductive inference methods have been suggested. Pople et al. [Pople et.al. 72] have suggested a graph structure representation of biomedical facts and an approach to forming theories by finding common subgraphs using user supplied suggestions. Of particular note is work in the area of computer aided medical diagnosis and plant pathology [Michalski 73, 74, Chilausky et.al. 76, Larson 76] which uses the VL1 system (a variable valued logic system which is the precursor to the VL2 system used here) and the program AQVAL/1-AQ7 to infer descriptions of classes of liver diseases and soybean diseases. The latter work is a specific application of the VL1 logic system to several problems.

Winston [Winston 70] demonstrates specific procedures which discover descriptions from examples in the toy blocks world. A description which is discriminant (i.e. can be used to distinguish one object or set of objects from other sets) is formed by matching the similar parts of the object under consideration with another object (near-miss) and then isolating the structures which are different between the two objects. This differs from the approach taken in the following chapters in that matching is only done here in order to adequately describe some specific feature which distinguishes between two objects. (E.g. to specify the second part from the top in an object, one may have to include some predicates which define the second from the top in terms of other descriptors. If the distinguishing feature involves this part, then the definition of the part used in the description must be common to both objects.) Winston also uses modifiers such as 'must', 'may', 'must not' in descriptions. A form of these modifiers is inherent in the VL2 approach (e.g. discriminant descriptions involve the 'must' modifier; descriptive descriptions involving only a set of objects yield descriptions involving a type of 'may' modifier).

The program ARITHMETIC [Bongard 70] finds an algebraic rule which explains sample relationships. The program is given sets of tables, each table containing 3-tuples of a ternary relation. A set of 33 predicates is used in the program (although not explicitly given in the reference); for example, a predicate may be true if the quotient of the first two elements of the 3-tuple is positive. For each row of a table, a set of features is generated by finding Boolean combinations of the predicates applied to the row. A feature describing the table is generated for each Boolean combination by finding the product of Boolean combinations for all rows. The set of features which appear to be most useful in distinguishing one table from the others is selected as the description of each table.

1.3 Formal Systems for Inductive Inference

A criticism of some of the above endeavors is that they are problem specific. More general systems which have many possible application areas have emphasis in two types of approaches: 1) generation of descriptions of sets of objects represented in a logic system of some kind, and 2) creation of new concepts in a sequential manner by generating and modifying hypotheses. A summary of several types of learning systems can be found in [Banerji 75]. Of particular interest here is the work of Morgan [Morgan 72] in which a formal system based on the first order predicate calculus with falsehood preserving transformations is presented. Briefly, the idea that inductive inference can be described as backwards reasoning is apparently not sufficient for a practical system. For example, if $E_2$ was derived from the assertions $E_1$ and $\neg E_1 \lor E_2$, then backward reasoning would somehow have to generate the two assertions above given only $E_2$. There are far too many expressions $E_1$ which can be applied to a situation such as this to yield a practical system. Instead, Morgan defines a falsehood preserving transformation of a deductive inference rule into an inductive inference rule (in Morgan's notation):

$$E_1 \vdash_F E_2$$

where $E_2$ is false for every interpretation in which $E_1$ is false. A set of transformations (F-rules) can be created to convert a deductive logic system into an inductive logic system (e.g. if $E_1$ and $E_2$ are atomic expressions, then

$$E_1 \lor E_2 \vdash_F E_1 E_2,$$

i.e., from the disjunction $E_1 \lor E_2$, one may infer the conjunction $E_1 E_2$ while preserving falsehood). With this system, theorem proving techniques using falsehood preserving rules may be applied to inductive problems. (In later chapters, the symbol |< is used to denote such a 'generalizing' transformation.)
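The falsehood preserving property can be checked mechanically for propositional expressions by enumerating interpretations. The following is only a sketch under that assumption; the function name `preserves_falsehood` and the encoding of expressions as Python functions are illustrative and do not come from Morgan's system.

```python
from itertools import product


def preserves_falsehood(premise, conclusion, n_atoms):
    """True if the conclusion is false in every interpretation where the premise is false."""
    for values in product([False, True], repeat=n_atoms):
        if not premise(*values) and conclusion(*values):
            return False
    return True


def disjunction(e1, e2):
    return e1 or e2


def conjunction(e1, e2):
    return e1 and e2


# From the disjunction one may infer the conjunction while preserving falsehood ...
print(preserves_falsehood(disjunction, conjunction, 2))
# ... but the reverse direction (ordinary deduction) does not preserve falsehood.
print(preserves_falsehood(conjunction, disjunction, 2))
```

The first check succeeds because the disjunction is false only when both atoms are false, in which case the conjunction is false as well.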

A number of authors have presented systems which use a graph structure representation of expressions in a type of logic system. The 'parameterized structure representation' of Hayes-Roth [Hayes-Roth 76] is used in inductive tasks which learn descriptions of sets of objects and transformations from one set of objects to another set of objects from examples. The problems addressed are closely akin to the work in the following chapters of this paper, one of the objectives being to find the common properties of all examples of one class which are available to the system using a graphical representation. The number of possible alternatives in Hayes-Roth's method is limited by a fixed utility function which evaluates intermediate results and discards hypotheses of low utility. The work differs from that presented in the following chapters in that only independent descriptions are sought using a fixed utility criterion (these are called descriptive descriptions in chapter 6). The structure used for representation does not take into account specific domain structures which are inherent in the VL system and there is no facility for generating new descriptors within the system.

A formal approach using the predicate logic system [Vere 75] produces the largest common set of descriptors of a set of examples by representing examples using a graph structure and finding the largest common subgraph of these structures. Such methods suffer from the NP-complete nature of the subgraph isomorphism problem (this problem is addressed in the following chapters by finding the smallest useful subgraphs of graphs of examples instead of the largest subgraph). Neither Vere nor Hayes-Roth uses negative examples heavily in his implementation.

Hedrick [Hedrick 74] uses a semantic net to represent examples and to build and modify hypotheses as new examples are given to the program. The semantic net supports only binary relations but otherwise looks similar to the graph structure of Vere and Hayes-Roth.

Kochen [Kochen 74] presents a different type of system with a set of initial events containing state variables, actions and relations between the actions, and a learning program which applies certain transformations to events at various time steps. At each time step, weighted hypotheses are formed which reduce the set of states stored in memory (i.e. those states which are explained by the hypotheses).

Production systems provide a rich tool for the introduction of many inference techniques (recall that VL decision rules are similar to production rules). Briefly, a production system architecture contains a working memory, a set of productions which modify working memory, and a recognize-act cycle with conflict resolution to dictate the order in which productions are applied to memory and to add new productions when necessary.
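The architecture described above can be sketched in a few lines. This is a generic illustration, not the design of any of the systems cited here: working memory is assumed to be a set of symbols, a production is assumed to be a pair (condition symbols, symbols to add), conflict resolution is assumed to be "first applicable production in list order", and the addition of new productions is omitted.

```python
def run(productions, memory, max_cycles=100):
    """Repeat the recognize-act cycle until no production can change working memory."""
    memory = set(memory)
    for _ in range(max_cycles):
        # Recognize: productions whose condition is in memory and whose action adds something new.
        conflict_set = [
            (condition, additions)
            for condition, additions in productions
            if condition <= memory and not additions <= memory
        ]
        if not conflict_set:
            break
        # Conflict resolution (assumed): take the first production in list order.
        _, additions = conflict_set[0]
        # Act: modify working memory.
        memory |= additions
    return memory


productions = [
    (frozenset({"a"}), frozenset({"b"})),
    (frozenset({"b"}), frozenset({"c"})),
]
print(sorted(run(productions, {"a"})))
```

Starting from a memory containing only "a", the first production adds "b", which in turn enables the second production to add "c".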

Waterman [Waterman 70, 74, 75] uses production systems to solve several problems. A program which plays poker has been developed [Waterman 70] which designs betting strategies in terms of production rules. More recently [Waterman 75], the approach has been applied to recognizing letter sequences with success. Rychener [Rychener 76] has applied production systems to chess end games and natural language input of a toy blocks world. These two authors use distinct production system architectures which differ in the ordering of working memory, the ordering of productions, and the way in which new productions are added to an existing set of productions to correct errors made by the system.

As a final note with regard to production systems, the MYCIN system [Shortliffe 74] has been shown to be a very powerful tool in aiding physicians with regard to antimicrobial therapy selection. In the MYCIN system, deductive inference using a multi-valued truth model is applied to expert supplied productions and a data base consisting of a patient record.

1.4 Overview of the Following Chapters

The VL21 system (a subset of the VL2 system) is described in chapter 2. The subset used here uses only the truth values TRUE, FALSE, and UNKNOWN instead of the multi-valued truth domain of the VL2 system. Also, only a very small subset of the operators available in VL2 is used. The inductive rules used to transform initial decision rules (1.1) into new decision rules (1.2) are given in chapter 3. The rules involve selecting the most significant features of the initial condition ($C_{i,j}$ in 1.1), extending the value set which each feature may assume under the domain structure constraints, and adding new global functions which describe certain characteristics of the condition ($C_{i,j}$). Chapter 4 contains a graphical representation of VL2 rules. A subset of the graph structure is used in the computer program INDUCE_1 described in chapter 5. The program accepts as input:

1) a set of decision rules representing certain examples of sets of decisions,

2) a problem environment description in the form of VL decision rules which describes certain characteristics of functions and domains which arise from the particular application, and

3) a set of control parameters which supply the optimality criterion and certain parameters which limit the number of alternatives generated at various points in the program.

The output from the program contains a set of complete, consistent decision rules. Chapter 6 gives results of the program as applied to some specific situations. Chapter 7 describes some limitations and possible extensions of the program. Two appendices are given: appendix A, providing a listing of the detailed output of the program applied to one example of chapter 6, and appendix B, giving a brief review of the precursor to this work (the program AQVAL/1-AQ7) which is used as a procedure in INDUCE_1.

2. Representing Decisions in the VL2 System

Much of the information in this chapter is found in [Larson et. al. 77, Michalski 74b]. It is included here to give the reader a familiarity with the VL2 system. The complete VL2 system contains a very rich set of operators and domains. A subset of VL2 called VL21, which contains a basic set of operators and domains, is used here. Only the VL21 system is described, with notes indicating the extensions which are possible in the full VL2 system. In later chapters, the notation VL2 is used to refer to the system VL21.

The logic system VL21 is a language for describing situations (e.g. objects, classes of objects) and expressing decision and inference rules. The language provides for a compact expression of descriptions which is both easily readable and sufficiently precise to facilitate formal manipulation (possibly by a computer).

There are two major differences between VL21 and the first order predicate calculus:

1. Instead of predicates, selectors are used which can be viewed as tests for membership of values of predicates and functions in a certain set.

2. Each variable, predicate and function symbol is assigned a domain (or value set) together with a characterization of the structure of the domain. (This feature facilitates the process of rule generalization and allows for the application of different generalization transformations according to the structure of the domain.)

There are three types of domains currently distinguished:

1. Unordered or nominal

Elements of the domain are considered to be independent entities; no structure is assumed to relate them. A variable or function symbol with this domain is called nominal or cartesian (e.g. blood type, names of objects, etc.).

2. Linearly ordered or interval

The domain is a linearly ordered set. A variable or function symbol with this domain is called interval (e.g. military rank, temperature, size).

3. Tree ordered or structured

Elements of the domain are ordered into a tree structure. A predecessor node in the tree represents a concept which is more general than the concepts represented by the descendant nodes (e.g., the predecessor of the nodes 'triangle', 'rectangle', 'pentagon' may be a 'polygon'). A variable or function symbol with such a domain is called structured.
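To make concrete why the domain structure matters for generalization, the sketch below shows one plausible way two observed values might be generalized under each of the three domain types. These three functions are illustrative assumptions only and are not the transformation rules of chapter 3; the domain contents (`RANKS`, `SHAPES`, including the nodes 'shape' and 'circle') are hypothetical.

```python
from typing import Dict, List, Optional, Set


def generalize_nominal(a: str, b: str) -> Set[str]:
    """Unordered domain: no structure relates the values, so only their set can be formed."""
    return {a, b}


def generalize_interval(order: List[str], a: str, b: str) -> List[str]:
    """Linearly ordered domain: close the gap between the two values."""
    i, j = sorted((order.index(a), order.index(b)))
    return order[i:j + 1]


def generalize_tree(parent: Dict[str, str], a: str, b: str) -> Optional[str]:
    """Tree ordered domain: climb to the most specific common predecessor."""
    ancestors = []
    node: Optional[str] = a
    while node is not None:
        ancestors.append(node)
        node = parent.get(node)
    node = b
    while node is not None and node not in ancestors:
        node = parent.get(node)
    return node


# Hypothetical domains for illustration.
RANKS = ["private", "sergeant", "captain", "major"]
SHAPES = {
    "triangle": "polygon",
    "rectangle": "polygon",
    "pentagon": "polygon",
    "polygon": "shape",
    "circle": "shape",
}

print(generalize_nominal("A", "O"))
print(generalize_interval(RANKS, "major", "sergeant"))
print(generalize_tree(SHAPES, "triangle", "pentagon"))
```

With these assumed domains, 'triangle' and 'pentagon' generalize to their predecessor 'polygon', whereas two military ranks generalize to the range of ranks between them.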

2.1 VL System Structure

The VL21 system used is a 5-tuple (V, F, S, R, I) where:

V - is a set of variable symbols. Each variable symbol is associated with a domain $D(x_i)$. A group of variables which have the same domain are labelled with the same variable symbol but a different subscript (e.g. $x_1, x_2, \ldots, x_k$, $y_1, y_2, \ldots, y_l$ are specifications of variables in two variable groups which assume values from two domains denoted $D(x)$ and $D(y)$ or alternatively $D(x_i)$ and $D(y_i)$).

F - is a set of n-ary function and predicate symbols. Each n-ary function symbol represents a mapping from an argument space into a domain. For a function $f(x_1, x_2, \ldots, x_n)$, this is a mapping:

$$D(x_1) \times D(x_2) \times \cdots \times D(x_n) \rightarrow D(f)$$

where $D(x_1), D(x_2), \ldots, D(x_n), D(f)$ represent the domains of the variables $x_1, x_2, \ldots, x_n$ and the domain of the function $f$, respectively. A predicate is a function whose domain is the set [TRUE, FALSE]. Included in the domains of all function and variable symbols is the value NA (not applicable).

S - is a set of symbols including:

( ) [ ] = < > , .

R - is a set of formation rules described in section 2.3

I - is a set of interpretation rules described in section 2.4

2.2 Selector Formation and Interpretation Rules

A well formed VL2 formula (wff) is composed of quantifier forms, selectors, and logical connective symbols. A selector is a form:

[L # R] or [L']

where

L, L' - each called the referee, are atomic forms. An atomic form is a variable symbol or a function or predicate symbol followed optionally by a list of atomic forms enclosed in parentheses. In the above forms, the atomic form L' must be a predicate symbol with arguments following in parentheses. If L contains a function symbol, then the related function is called an atomic function.

R - the reference is a set of values in the domain of the atomic function of L. R may be in several forms:

Reference Example    Description

a                    a constant in the domain of the atomic function of L
a,b                  a list of values in the domain of L separated by commas
a..b                 a pair of values in the domain of L separated by (..)
*                    the symbol (*) representing all values in the domain of L (except NA)
NA                   the value NA (not applicable)

\# - is one of the symbol combinations

= <= >= < > ¬=

If R is a set of values, then L is related to R by # if

when # is = or ¬=: L has a value (does not have a value) in the set R

when # is <=, >=, < or >: L has a value related to every value of R by #.

The selector is interpreted as a unit of information about a situation with value or truth-status TRUE if the relation L # R holds, FALSE if the relation does not hold, or UNKNOWN, in which case the selector is interpreted as a question about the situation which must be answered in order to determine if the selector is satisfied. If some variables in the atomic form of the selector are quantified, these quantifiers must be considered when determining the truth-status of a selector.

If R is *, then L is related to R for any value of L except NA (in this case, # is always =).

Below are some examples of a selector:

Selector                        Interpretation: truth-status TRUE

$[color(wall_1) = white]$ - the color of the wall represented by $wall_1$ is white.

$[length(box_1) \geq 1]$ - the length of the box represented by $box_1$ is greater than or equal to 1.

$[box_1 = 2..5]$ - the variable $box_1$ may have a value between 2 and 5 inclusive. The values of $box_1$ may represent various boxes in a situation. The selector restricts the range of values of the variable $box_1$ to the values 2 through 5.

$[ontop(x_1, x_2)]$ - the part represented by $x_1$ is on top of the part represented by $x_2$.
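A selector can be read as a membership or ordering test of the referee's value against the reference. The sketch below illustrates that reading under several assumptions that are not part of the thesis: the referee has already been evaluated to a single value, the UNKNOWN truth-status is encoded as `None`, references are Python lists, the relation "not equal" is written `"!="`, a range a..b is expanded over integer codes, and an NA value is taken not to satisfy any ordering relation.

```python
import operator

NA = "NA"        # the value "not applicable"
ALL = "*"        # reference standing for all values of the domain except NA
UNKNOWN = None   # hypothetical encoding of the UNKNOWN truth-status

_ORDER = {"<=": operator.le, ">=": operator.ge, "<": operator.lt, ">": operator.gt}


def value_range(low, high):
    """Expand the reference low..high over an integer-coded interval domain."""
    return list(range(low, high + 1))


def selector(value, relation, reference):
    """Truth-status of [L relation R] given the value of the referee L."""
    if value is UNKNOWN:
        return UNKNOWN                      # the selector remains a question
    if reference == ALL:
        return value != NA                  # any value except NA satisfies *
    if relation == "=":
        return value in reference
    if relation == "!=":
        return value not in reference
    if value == NA:
        return False                        # assumption: NA is not ordered
    # Ordering relations must hold against every value of the reference.
    return all(_ORDER[relation](value, r) for r in reference)


print(selector("white", "=", ["white"]))       # [color(wall1) = white]
print(selector(3, ">=", [1]))                  # [length(box1) >= 1]
print(selector(4, "=", value_range(2, 5)))     # [box1 = 2..5]
print(selector(UNKNOWN, "=", ["white"]))       # color not yet known
```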


2.3 VL Formation Rule

Formulas in the VL2 logic system are used to describe situations, and also to express decision rules and inference rules. The VL formulas are defined by the following formation rules:

1. A selector is a VL formula (wff).

2. If $V$, $V_1$ and $V_2$ are wff, then so are:

$(V)$ - a formula in parentheses

$\neg V$ - inverse

$V_1 \& V_2$ or $V_1 V_2$ - conjunction (the symbol & is used to represent conjunction)

$V_1 \lor V_2$ - disjunction

$V_1 \mathbin{!} V_2$ - exclusive disjunction

$V_1 \sim V_2$ - exception

$V_1 \rightarrow V_2$ - $V_1$ implies $V_2$

$V_1 \leftrightarrow V_2$ - $V_1$ is equivalent to $V_2$

$E\, x_1, x_2, \ldots, x_k\, (V)$ - existentially quantified formula ($E$ is used to represent the existential quantifier).

$E.\, x_1, x_2, \ldots, x_k\, (V)$ - distinctly existentially quantified formula

$A\, x_1, x_2, \ldots, x_k\, (V)$ - universally quantified formula ($A$ is used to represent the universal quantifier).

Not all of these forms are considered in the following chapters. In chapter 3 (VL inference rules) only conjunction, disjunction, and quantifiers are considered. Chapter 4 (Graph Representation) presents a graph structure representation which includes all of these forms, but the types of formulas actually included in the algorithm and the implementation involve only conjunction and distinct existential quantification.

2.4 Interpretation Rules

A VL formula may have truth-status TRUE, FALSE, or UNKNOWN. In the full VL2 system, a truth-status domain with interval structure may be defined, but here only the values above are considered. The connectives ($\neg$, $\lor$, $\&$) are interpreted in the normal manner:

VL formula        Interpretation

$\neg V$ - FALSE if $V$ is TRUE, TRUE if $V$ is FALSE, UNKNOWN if $V$ is UNKNOWN.

$V_1 \lor V_2$ - TRUE if either $V_1$ or $V_2$ is TRUE, UNKNOWN if both $V_1$ and $V_2$ are UNKNOWN or one is UNKNOWN and the other FALSE, FALSE otherwise.

$V_1 \& V_2$ - UNKNOWN if both $V_1$ and $V_2$ are UNKNOWN or one is TRUE and the other is UNKNOWN, TRUE if both $V_1$ and $V_2$ are TRUE, FALSE otherwise.

The remaining connectives may be rewritten in equivalent forms:

VL formula        Equivalent form

$V_1 \rightarrow V_2$ - $\neg V_1 \lor V_2$

$V_1 \leftrightarrow V_2$ - $(V_1 \rightarrow V_2) \& (V_2 \rightarrow V_1)$

, ............. V , ... V

1 2 1 2

$V_1 \mathbin{!} V_2$ - $\neg V_1 V_2 \lor V_1 \neg V_2$
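The three-valued interpretation of the connectives can be written out directly. The sketch below is an illustration only: truth-statuses are encoded as strings, the function names are hypothetical, and the last case of the disjunction (both operands FALSE) is read as FALSE. Implication and equivalence are obtained from the equivalent forms rather than defined separately.

```python
TRUE, FALSE, UNKNOWN = "TRUE", "FALSE", "UNKNOWN"


def v_not(v):
    return {TRUE: FALSE, FALSE: TRUE, UNKNOWN: UNKNOWN}[v]


def v_or(v1, v2):
    if TRUE in (v1, v2):
        return TRUE
    if UNKNOWN in (v1, v2):
        return UNKNOWN      # both UNKNOWN, or one UNKNOWN and the other FALSE
    return FALSE            # both FALSE


def v_and(v1, v2):
    if FALSE in (v1, v2):
        return FALSE
    if UNKNOWN in (v1, v2):
        return UNKNOWN      # both UNKNOWN, or one UNKNOWN and the other TRUE
    return TRUE             # both TRUE


def v_implies(v1, v2):
    """V1 -> V2 rewritten as (not V1) or V2."""
    return v_or(v_not(v1), v2)


def v_equiv(v1, v2):
    """V1 <-> V2 rewritten as (V1 -> V2) & (V2 -> V1)."""
    return v_and(v_implies(v1, v2), v_implies(v2, v1))


for a in (TRUE, FALSE, UNKNOWN):
    for b in (TRUE, FALSE, UNKNOWN):
        print(a, b, v_or(a, b), v_and(a, b), v_implies(a, b))
```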

A VL system is used to describe a set of situations. In order to effectively apply a formula to a set of situations, the VL2 system should contain variables, functions, and predicates which adequately characterize the situations. To determine the truth-status of a formula with regard to a specific situation, an event is created (an event may be viewed as an interpretation of a situation in the VL2 system). An event is a sequence of assignments to variables, functions and predicates in the system which characterize a specific situation. Quantified variables may be assigned a set of values. One function assignment may be made to a given set of values of arguments if the value of the function is known. If a function does not have an assignment for a given set of values, then the value NA (not applicable) is assumed.

A selector [L # R] (or [L']) is satisfied by an event if there is a set of assignments to variables and functions (or predicates) in L (or L') such that L is related to R by # (or L' has the value TRUE).

A VL2 formula is satisfied by an event if it has truth-status TRUE when applied to the event.

The quantified formulas are interpreted:

The truth-status of

$E\, x_1, x_2, \ldots, x_n\, (V)$ - is TRUE (or FALSE) in a given situation if there exist (or do not exist) values for $x_1, x_2, \ldots, x_n$ in the event assignments which make the truth-status of the formula $V$ equal to TRUE; ? if it is not known whether there exist values ...

$E.\, x_1, x_2, \ldots, x_n\, (V)$ - is TRUE (or FALSE) in a given situation if there exist (or do not exist) (different) values for $x_1, x_2, \ldots, x_n$ in the event assignments which make the truth-status of the formula equal to TRUE. This obviates the need for extra predicates in an expression like $x_1 \neq x_2$, $x_2 \neq x_3$, $x_1 \neq x_3$, etc.; ? if it is not known whether there exist values ...

$A\, x_1, x_2, \ldots, x_n\, (V)$ - is TRUE (or FALSE) in a given situation if for all assignments to the variables $x_1, x_2, \ldots, x_n$, the formula $V$ has truth-status equal to TRUE.

A formula is a description of a situation if every event which can be derived from the situation satisfies the VL2 formula and every event which satisfies the formula is also an interpretation of the situation.
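The difference between the ordinary and the distinct existential quantifier is easiest to see by enumeration. The sketch below assumes finite value sets for the quantified variables and a two-valued formula (the UNKNOWN case is left out); the function names are illustrative and not taken from the program.

```python
from itertools import product


def exists(domains, formula):
    """E x1,...,xn (V): some assignment of values makes V TRUE."""
    return any(formula(*values) for values in product(*domains))


def exists_distinct(domains, formula):
    """E. x1,...,xn (V): some assignment of pairwise different values makes V TRUE."""
    return any(
        formula(*values)
        for values in product(*domains)
        if len(set(values)) == len(values)
    )


def for_all(domains, formula):
    """A x1,...,xn (V): every assignment of values makes V TRUE."""
    return all(formula(*values) for values in product(*domains))


def same(x1, x2):
    return x1 == x2


domains = [[0, 1], [0, 1]]
print(exists(domains, same))           # satisfied by x1 = x2 = 0
print(exists_distinct(domains, same))  # no pair of different values is equal
print(for_all(domains, same))
```

With the distinct quantifier, the requirement that the variables differ is built into the enumeration, so no additional inequality selectors have to be written into the formula.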

2.5 VL Decision Rules

If $V_1$ and $V_2$ are VL2 formulas, a general form of a VL2 decision rule is

$$V_1 \Rightarrow V_2 \tag{2.5.1}$$

The formula $V_1$ is called the condition part and $V_2$ is the decision part. A restricted form of the VL2 decision rule will be used in the following chapters.

(In the computer implementation, the formula $V_2$ is assumed to be a product of selectors which contain 0-ary functions in the referee. The terminology is relaxed in this case to allow the function symbol which appears in the decision part to be called a decision variable.)

A decision rule in the form 2.5.1 may be applied to a
set of situations as follows: If the condition part of the
decision rule (V1) is given truth-status TRUE, then the
decision part of the decision rule (V2) also assumes the
truth-status TRUE. For each event (assignment e:=L) which
satisfies V1, a new set of assignments is made to the event
using the decision part of the rule to form the set of all
events which satisfy the conjunction V1 & V2. For example,
given the decision rule:

E.x1,x2 [p(x1,x2)] => [D=1], (2.5.2)

and the event

e: x1:=0,1; x2:=0,1; p(0,1):=TRUE (2.5.3)

with variables, functions and predicates x, D, p, q with
domains D(x)=[0,1], D(D)=[0,1], D(p),D(q)=[TRUE,FALSE], a new
assignment is made to e to give:

e: x1:=0,1; x2:=0,1; p(0,1):=TRUE; D:=1. (2.5.4)

A decision rule:

(2.5.5)

applied to the event 2.5.3 gives one of the two new events:

e1: x1:=0,1; x2:=0,1; p(0,1):=TRUE; q(0,1):=TRUE;

e2: x1:=0,1; x2:=0,1; p(0,1):=TRUE; q(1,0):=TRUE;

Note that q(1,1) and q(0,0) are not given status TRUE since
the quantifier E. insists that the two variables
have different values.
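
The difference between the plain and the distinct existential quantifier can be checked mechanically. The following is a minimal sketch for illustration only, not a fragment of the program described later; the function names `exists` and `exists_distinct`, the set encoding of a predicate, and the second predicate `q` are assumptions made for this example.

```python
from itertools import permutations, product

def exists(domain, n, formula):
    """Plain existential quantifier: any n values, repeats allowed."""
    return any(formula(*vals) for vals in product(domain, repeat=n))

def exists_distinct(domain, n, formula):
    """Distinct existential quantifier (E.): the n values must all differ."""
    return any(formula(*vals) for vals in permutations(domain, n))

# Event 2.5.3: x1 and x2 range over [0,1] and only p(0,1) is TRUE.
p_true = {(0, 1)}
def p(a, b):
    return (a, b) in p_true

# Hypothetical predicate that holds only for equal arguments.
q_true = {(0, 0), (1, 1)}
def q(a, b):
    return (a, b) in q_true

print(exists_distinct([0, 1], 2, p))  # True: x1=0, x2=1 are different values
print(exists([0, 1], 2, q))           # True: x1=x2=0 is allowed
print(exists_distinct([0, 1], 2, q))  # False: equal values are excluded
```

With the distinct quantifier, a separate selector such as x1≠x2 is not needed, which is the point made in the interpretation of E. above.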

Given a set of decision rules each in the form 2.5.1,
the set may be applied to a set of events. Initially, the
condition parts of all decision rules have value UNKNOWN. If
an event satisfies the condition part of a decision rule, new
assignments are made to the event according to the decision
part of the satisfied rule and the condition part of the rule
returns to truth status UNKNOWN.

In the remaining chapters, events are only used as a
formal basis for defining certain concepts. Since the number
of events necessary to completely describe a situation is
quite large, only the VL21 formulas themselves are manipulated
by the algorithms.

3. VL21 Transformation Rules

From one set of decision rules (R1), a new set of
decision rules (R2) is obtained by applying certain
transformation rules (t-rules). For now, we will restrict our
attention to rules which transform the condition part of a
rule. These t-rules may be grouped into three types of rules
based on the events which satisfy the condition part.

Given two rules:

R1: V1 => D

R2: V2 => D

V1 is more general than V2 if every event satisfying V2 also
satisfies V1. If the converse is also true, then V1 is
equivalent to V2. Below, R1 and R2 are the input and output
of a t-rule. The three types of t-rules can be expressed as
follows:

1. An equivalence transformation, denoted R1 |= R2, if the
condition part V1 of R1 is equivalent to the condition
part V2 of R2.

2. A generalizing transformation, denoted R1 |< R2, if
the condition part V2 of R2 is more general than the
condition part V1 of R1.

Rules 1. and 2. are called inductive inference rules.

3. A specializing transformation (deductive inference
rule), denoted R1 |> R2, if the condition part V1 is
more general than the condition part V2.

Here, we are most interested in the first two types
of t-rules (i.e., inductive inference rules).

3.1 Equivalence Transformation

An equivalence transformation rewrites a VL21 formula
into a different form, either using equivalent VL21 operators
or introducing new functions which represent some information
already in the rule in a different manner. Below are some
examples of equivalence transformations. The symbols L1 and
L2 represent atomic forms, V and V' represent VL21 formulas, D
represents a VL21 formula which has no variables in common
with V or V', and |= is used to represent an
equivalence transformation.

E1. Equivalent VL21 forms.

V([L = 1] v [L = 2]) => D |= V[L = 1,2] => D

V[L ≠ 3,4] => D

(assuming that the domain of L is the interval [0..5]
and has nominal structure).

V([L1 = 1][L2 = 1]) => D |= V([L1.L2 = 1]) => D

(assuming that L1 and L2 have the same domain size
and structure). The dot operator (.) between two atomic
forms is called 'internal conjunction'; thus the expression
on the right above is read: 'If L1 and L2 both have the
value 1 and V is satisfied, then make decision D.'

E2. Internal Conjunction of Arguments

VV' => D, V' = [f(x1) = i][f(x2) = i] |= V[f'(x1.x2) = i] => D

This rule introduces a new predicate f' which
has the domain [TRUE,FALSE] and two arguments. The
(.) operator instead of (,) indicates that the order
of arguments to f' is irrelevant. The function f'
assumes the value TRUE if f has the value i for both
arguments.

E3. Introducing New Predicates.

VV' => D

V' = [f(x1) = i][f(x2) = j]

where i is related to j by REL. For example,

i1, i>=j would result in new predicates LT_f,

E4. Splitting the Condition Part.

V v V' => D |= V => D, V' => D

This rule is used to form a set of decision
rules with product condition parts from decision
rules in disjunctive normal form.

3.2 Generalizing Transformations

This type of transformation usually produces not only
a more general decision rule from a set of decision rules but
also a 'simpler' one than the original. Some rules are
applied in context. That is, one rule is actually
transformed but the context consisting of rules with a
related decision is used to obtain a more optimal result.
(These transformations may also be interpreted as
transformations from a set of rules into a new, more general
rule. The approach of focusing on one rule in the 'context'
of others is taken here to more closely reflect the approach
taken in the implementation.) In the following rules, the
symbol |< is used to indicate an inductive inference rule.

G1. Dropping a Selector.

V[L = R] => D |< V => D

Although this rule is interesting in a formal
sense, it should be applied with care since the
number of generalizations possible with successive
applications of this rule is very large. More is
said about this problem in section 3.6.

G2. Extending the Reference

Nominal domain structure

V[L = a] => D |< V[L = a,b] => D
in context [L = b] => D

Interval domain structure

V[L = a] => D |< V[L = a..b] => D
in context [L = b] => D

Tree structured domain

V[L = a] => D |< V[L = c] => D
in context [L = b] => D

c is a predecessor of both a and b in the
generalization structure of the domain of L.

G3. Extension Against

Nominal domain

V1[L = R1] => D |< V1[L ≠ R2] => D
in context: V2[L = R2] => ~D
assuming R1 ∩ R2 = null. (The symbol ∩
denotes set intersection.)

Interval domain structure

V1[L = a..b] => D |< V1[L = e..f] => D
in context: V2[L = c..d] => ~D
assuming [a..b] ∩ [c..d] = null
if b < c then e = 0, f = c-1
if a > d then e = d+1, f = h
(0 and h are the minimum and maximum elements in the
domain of L)

Tree structured domain

V1[L = a] => D |< V1[L = c] => D
in context: V2[L = b] => ~D
assuming a ∩ b = null
the constant (c) is the most distant
ancestor of (a) which is not an ancestor of (b)
(c may be equal to a).
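
The interval case of the extension against rule is simple arithmetic on the two references. The sketch below illustrates it under stated assumptions: the function name `extend_against_interval` is hypothetical, and the domain minimum is passed as a parameter `lo` instead of being fixed at 0 as in the rule above.

```python
def extend_against_interval(a, b, c, d, lo, hi):
    """Return (e, f) for rule G3 on an interval domain [lo..hi].

    [a..b] is the reference of the rule being generalized and
    [c..d] is the reference of the context rule with decision ~D.
    """
    if not (lo <= a <= b <= hi and lo <= c <= d <= hi):
        raise ValueError("references must lie inside the domain")
    if b < c:              # the rule lies entirely below the context
        return lo, c - 1
    if a > d:              # the rule lies entirely above the context
        return d + 1, hi
    raise ValueError("references intersect; G3 does not apply")

# Domain [0..5]: [L = 1..2] extended against [L = 4..5]
print(extend_against_interval(1, 2, 4, 5, 0, 5))  # (0, 3)
# Domain [0..5]: [L = 4..5] extended against [L = 1..2]
print(extend_against_interval(4, 5, 1, 2, 0, 5))  # (3, 5)
```

In both calls the new reference is the largest interval containing [a..b] that still excludes every value of [c..d].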

G4. Replace A with a constant

(Axi V(xi)) => D |< V(xi)[xi = a] => D

a - an element in the domain of xi. (The
symbol A represents the universal quantifier.) For
example:

Ax1 [f(x1) = 1] => D |< [f(x1) = 1][x1 = 2] => D

The left side of the expression requires that f have
the value 1 for all values of x1 before a decision be
made. The generalized expression (right side) only
requires examination of one value in the domain of
x1, namely the value 2. All other values in the
domain of x1 are irrelevant to the decision rule.

G5. Replace a constant with E

V(xi)[xi = a] => D |< Exi V(xi) => D

a - an element in the domain of xi. (The
symbol E represents the existential quantifier.) For
example:

[f(x1) = 1][x1 = 3] |< Ex1 [f(x1) = 1]

The left side of the expression makes a decision only
if f has the value 1 for x1 with value 3. The
generalized expression makes a decision if f has the
value 1 for any value of x1.

G6. Move E to the Right

|<

3.3 Specializing Transformations

A specializing transformation may be used to apply a
decision rule to a new situation or to add certain
restrictions to decision rules. Below, the symbol |>
represents a deductive inference rule.

R1. Adding Restrictions

VV' => D |> VV'V" => D
in context V' => V"

This rule is used to add restrictions to
descriptions and generalizations where the
restrictions represent some structure which is
imposed on a function or some relationship between
functions which always holds. For example, the
transitivity or symmetry of a function may be
introduced using a restriction.

R2. Dropping Product Rule

V1 v V2 => D |> V1 => D

This rule may be used to obtain a set of
decision rules with a product in the condition part
from a rule in disjunctive normal form.

3.4 Transformation Rules Involving the Decision Part

It is clear that reversing the roles of the input and
output of an inductive t-rule gives a deductive t-rule and
conversely. Therefore, each of the rules G1-G6 could be
inverted to get a deductive rule. The rules in sections
3.1-3.3 were based on the events satisfying the condition
part of a decision rule. Similar rules may be applied to the
decision part of a rule to obtain equivalent, more general or
more restricted rules.

Given two rules:

R1: V => D1

R2: V => D2

R2 is more general than R1 and R1 is more restricted than R2
if every assignment of event values made by R1 is also made
by R2. If the converse is also true, then R1 is equivalent
to R2. To transform the decision part of a decision rule,
apply G1-G6 to the decision part of a rule to obtain a
deductive rule and R1-R2 to obtain an inductive rule,
interchanging the roles of the condition and decision parts
in the transformation rules. A few examples are given
below.

DP1. Dropping Selector Rule Applied to the Decision Part

V => [L1 = R1][L2 = R2] |> V => [L1 = R1]

Though this is a generalization rule when
applied to the condition part, it is a deductive rule
when applied to the decision part of a decision
rule.

DP2. Splitting the Decision Part

V => [L1 = R1]D |= V => [L1 = R1], V => D

This equivalence preserving rule is used to
produce a set of decision rules which involve only
one decision variable.

DP3. Replace a Constant with A in the Decision Part

V => [L(x1) = R1][x1 = a] |>

3.5 Example of Application of Transformation Rules

In this section, selected transformations will be
examined in more detail with respect to a specific example.
Consider the situation in Figure 3.1. There are four objects
classified according to two decision variables DA and DB.
Each object is described in terms of the following VL21
functions and predicates:

p1, p2: variables each representing a part in an object
with domain: P = [0,1] with nominal structure

ontop: a predicate mapping P x P into [TRUE,FALSE].
The predicate is TRUE if the part represented
by the first argument is on top of the part
represented by the second argument.

shape: a function mapping P into [Triangle(T),
Circle(C), Rectangle(R), Ellipse(E), Polygon(P),
Curved Figure(CF)] with generalizations:
[shape=T,R]=>[shape=P] and [shape=C,E]=>[shape=CF].
The function specifies the shape of the part
represented by the argument. The domain is a tree
structured domain where Polygon is an ancestor (a
generalization) of Triangle and Rectangle and Curved
Figure is an ancestor of Circle and Ellipse.

DA,DB: decision variables with domain [1,2]

            DA = 1          DA = 2

DB = 1   1  Circle       2  Circle
            Triangle        Circle

DB = 2   3  Ellipse      4  Rectangle
            Rectangle       Rectangle

Figure 3.1 Set of Objects

Each object in Figure 3.1 may be described in the VL21
system. The descriptions of objects 1 and 4 are:

[ontop(p1,p2)][shape(p1)=C][shape(p2)=T]
=> [DA=1][DB=1] (3.5.1)

[ontop(p1,p2)][shape(p1)=R][shape(p2)=R]
=> [DA=2][DB=2] (3.5.2)

The variables p1 and p2 are assumed to be quantified with the
operator E. (distinct existential quantifier). Using the
rule DP1, the decision rule 3.5.1 can be transformed into two
new rules:

[ontop(p1,p2)][shape(p1)=C][shape(p2)=T] => [DA=1] (3.5.3)

[ontop(p1,p2)][shape(p1)=C][shape(p2)=T] => [DB=1] (3.5.4)

Concentrating now on 3.5.4, the dropping selector
rule may be applied twice in succession to obtain

[shape(p1)=C] => [DB=1]. (3.5.5)

This rule describes all objects in Figure 3.1 with the
decision DB=1. In addition, it does not describe any object
with DB=2; so it is a complete, consistent generalization of
the objects with DB=1. (Note that completeness and
consistency are in no way guaranteed by the application of
the dropping selector rule. These conditions can however be
checked after each application of the rule.) Applying the
extension against rule to 3.5.3 in the context

[ontop(p1,p2)][shape(p1)=R][shape(p2)=R] => [DA=2] (3.5.6)

focusing on the selectors [shape(p1)=C] in 3.5.3 and
[shape(p1)=R] in 3.5.6, one may obtain

[shape(p1)=CF][ontop(p1,p2)][shape(p2)=T] => [DA=1]. (3.5.7)

Applying extension against now to 3.5.7 in the context

[ontop(p1,p2)][shape(p1)=C][shape(p2)=C] => [DA=2] (3.5.8)

focusing on the selectors [shape(p2)=T] in 3.5.7 and
[shape(p2)=C] in 3.5.8, one obtains

[shape(p1)=CF][ontop(p1,p2)][shape(p2)=P] => [DA=1]. (3.5.9)

This rule is a consistent and complete generalization of all
rules in Figure 3.1 with DA=1 in the decision part.
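
The two extension against steps that lead to 3.5.7 and 3.5.9 can be reproduced with a small tree walk. This is a sketch under assumptions: the dictionary `PARENT` encodes the generalization structure of the shape domain given above, and the function names `ancestors` and `extend_against_tree` are hypothetical, chosen for this example only.

```python
# Parent links of the shape domain from this section (value -> generalization).
PARENT = {"T": "P", "R": "P", "C": "CF", "E": "CF"}

def ancestors(value):
    """Return [value, parent, grandparent, ...] up to the top of its tree."""
    chain = [value]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def extend_against_tree(a, b):
    """Most distant ancestor of a (possibly a itself) that is not an ancestor of b."""
    blocked = set(ancestors(b))      # b together with all of its ancestors
    if a in blocked:
        raise ValueError("a and b intersect; extension against does not apply")
    result = a
    for node in ancestors(a)[1:]:    # climb from a toward the top
        if node in blocked:
            break
        result = node
    return result

print(extend_against_tree("C", "R"))  # CF, the step from 3.5.3 to 3.5.7
print(extend_against_tree("T", "C"))  # P, the step from 3.5.7 to 3.5.9
print(extend_against_tree("C", "E"))  # C, since CF is also an ancestor of E
```

The last call shows the case in which no generalization is possible: Circle and Ellipse share the ancestor Curved Figure, so the constant stays unchanged.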

Looking now at the description of object 4 (3.5.2),
two rules are obtained by applying t-rule R1:

[ontop(p1,p2)][shape(p1)=R][shape(p2)=R] => [DA=2] (3.5.10)

[ontop(p1,p2)][shape(p1)=R][shape(p2)=R] => [DB=2] (3.5.11)

An application of the dropping selector rule to 3.5.11 twice
in succession gives a complete, consistent rule:

[shape(p1)=R] => [DB=2]. (3.5.12)

An application of t-rule E3 to 3.5.10 with a relation equal
produces:

[ontop(p1,p2)][shape(p1)=R][shape(p2)=R][EQ-shape(p1.p2)]
=> [DA=2] (3.5.13)

The predicate EQ-shape specifies that the value of
the function shape is the same for arguments p1 and p2. The
(.) separating the arguments p1 and p2 of EQ-shape signifies
that the order of the arguments is irrelevant. (In the
implementation, a selector with this type of predicate is
written [shape(p1.p2)=same].) An application of the dropping
selector rule three times in succession to 3.5.13 gives:

[EQ-shape(p1.p2)] => [DA=2] (3.5.14)

which is a complete, consistent generalization of 3.5.10.

In summary, the new rules which were obtained are:

[ontop(p1,p2)][shape(p1)=CF][shape(p2)=P] => [DA=1]

[EQ-shape(p1.p2)] => [DA=2] (3.5.15)

[shape(p1)=C] => [DB=1]

[shape(p1)=R] => [DB=2]

In the above discussion, the resulting simple
generalizations depended on knowing the proper rules to apply
and the proper portions of the decision rules to modify. For
larger problems, this approach is infeasible because of the
large number of possible generalizations which could be
made. Therefore, a more efficient approach is required. Such
an approach is given in the next section and described in
detail in the chapter on basic procedures (chapter 5).

3.6 Efficient Application of Generalization Rules

There are two significant problems with using the
procedure described in section 3.5 to apply t-rules to a set
of decision rules. The first is the large number of new
rules which can be generated from one rule. This problem
could be circumvented by trimming the intermediate lists of
new rules, selecting only the most promising set of rules
according to a user specified criterion before further
application of t-rules (see section 5.3 for a description of
a trimming procedure). A second problem, which is not
surmounted as easily, is the complexity involved in
determining whether a VL21 formula is consistent and in
evaluating the optimality cost functions during trimming. To
determine whether one VL21 formula is a generalization of
another or is consistent with respect to another formula, one
must determine whether all or any of the events which satisfy
one VL21 formula also satisfy another VL21 formula.

Using the current implementation, this involves
determining whether one graph structure representation is a
subgraph of another (see section 5.4 for a description of a
subgraph isomorphism algorithm). This problem is exponential
in nature (i.e., the time to determine whether one graph is a
subgraph of another using a depth-first-search is
proportional to m raised to the power of n, where m is the
number of nodes in the larger graph and n is the number of
nodes in the smaller graph). Actually, it is not quite so
time consuming since the graphs have relatively few edges and
the edges and nodes are labelled. It is, however, important
to form simple generalizations which correspond to small
graph structures.

Since the cost criterion normally includes a cost
function which minimizes the number of selectors in a
product, the consistency and optimality of optimal rules
should be easily calculated. In the example of section 3.5,
the original rules were made smaller by dropping selectors.
An alternate approach is to grow the generalized rule
beginning with single selectors and adding new selectors
until a consistent rule or set of alternative rules is
created. A very general algorithm follows; chapter 5 gives a
complete description of the algorithm in the current
implementation.

1. Given a set of decision rules in disjunctive normal
form, create a new set of rules with a product in the
condition part and one selector in the decision part
of each rule.

2. Select a rule which involves one value of a decision
variable. Add to the rule new selectors which
represent functional relations as specified by a
user (i.e., multiply the condition part of the rule
by selectors containing new functional relations).

3. Find a consistent generalization of the rule from 2
by locating the most promising selectors of the rule
from 2 and adding new selectors to each of these
selectors until a set of consistent generalizations
of the rule from 2 is obtained.

4. Apply the extension against rule using an AQVAL/1
procedure to all consistent rules.

5. Select the best generalization from this set and
remove rules from the set produced in step 1 for
which this is a generalization. Repeat steps 2-5
until no more rules remain which involve the decision
variable and value of step 2.

6. Continue by selecting another value of the current
decision variable or selecting another decision
variable until all decisions have been considered.

4. Computer Representation of VL21 Decision Rules

Some of the information in this section appears in
[Larson et al. 77] and is given here as a background for the
description of the computer implementation. A VL21 decision
rule can be represented as a graph with labelled nodes and
directed labelled edges. The labels on the nodes can be: a)
a selector containing k-ary descriptors without argument
lists, b) a k-ary descriptor without arguments, c) a
quantified variable with an optional subrange of values, d) a
logical operator. (From here on, a node is referred to by
its label, e.g., a selector node means a node with a selector
label.) The edges are labelled with integers 0,1,....
Edges not labelled 0 refer to the position of an argument in
the label at the head of the edge. (Edges have non-zero
labels only if the position in the argument list of the head
node is important. Labels of 0 may be dropped for
convenience.)

Several different types of relations may be
represented by edges. The type of relation is determined by
the label on the node at each end of the edge. The types of
relations are:

1. Functional dependence - The label of the head node of
the edge has a k-ary descriptor. The value
represented by the edge is the value of the atomic
form in the tail if the tail is a selector node, a
descriptor value if the tail is a descriptor node, or
one or all of a set of descriptor values if the tail
is a quantified variable. The edge label specifies
which argument of the head node assumes this value
(Figure 4.1).

Figure 4.1. Functional Dependence: Ex1,x2 ([g(x1)=1..2][f(g(x1),x2) = 1])

2. Logical dependence - The head node is a logical
operator (e.g. v, &, =>) and the tail node is a
selector node, or a logical operator node. If the
tail node is a selector, then the value represented
by the edge is the truth value of the selector at the
tail (Figure 4.2).

Figure 4.2. Logical Dependence: Ex1 ([f(x1) = 1] v [g(x1) = 2])

3. Implicit variable dependence - The labels of the head
and tail nodes are quantified variables. This type
of dependence represents the implicit function (which
can be represented by a Skolem function) (Figure
4.3).

Figure 4.3. Implicit Variable Dependence: Ax1 Ex2 [p(x1,x2) = 1]

4. Scope of variables - The head node is a logical
operator and the tail is a quantified variable. This
type of dependence may be necessary for certain
binary logical operators such as => (Figure 4.4).

Figure 4.4. Scope of Variables: Ex1,x2 ([p(x1) = 1] => [q(x2) = 1])

The graph of a more complex decision rule is given in
Figure 4.5. The value of x3 is dependent in an unspecified
way on the value of x2 (the edge labelled 1). The disjunction
(v) depends on the values of x2 and x3, but this is clearly
specified by the functional dependence of f and g on x2 and
x3. Finally, observe that the decision operator (=>) does not
explicitly depend on the specific values of x1, x2, or x3,
but instead depends on the truth value of the entire premise
using some set of value assignments for x1, x2, and x3.

Figure 4.5. Graph Structure Example (a rule over x1, x2, and x3 with decision [d = 1])

5. Algorithms and Computer Implementation

A computer program INDUCE_1 has been written to find
a generalization of a set of decision rules. The algorithms
are described in the remaining sections of this chapter;
examples of generalizations produced by the program are given
in chapter 6 and a sample session with the program is given
in appendix A. The program does not perform all of the
transformations given in chapter 3 and does not accept the
full VL21 language given in chapter 2. The restrictions on
the form of the input and a description of the output are
given in the next two sections.

5.1 Input to the Program

The program accepts as input: 1) a set of decision

rules, 2) a problem environment description including a set

of restrictions, domain definitions, variable costs etc., and

3) a set of parameters which control certain aspects of the

program operation. Decision rules, restrictions, and domain
structures are entered as VL21 type formulas in the following
format:

Decision Rules

Decision rules must satisfy the following
grammar:

quantified by the operator E. (distinct existential
quantifier - note 3)

In this example, the function p is assumed to
have two values (i.e., it is a predicate - note 6).
The selector containing p is satisfied if it has the
value 1. The second selector restricts the possible
values of x2 to the set of values [2,4] (note 3).

Several observations can be made about this grammar:

1. The condition part is a single product and the
decision part involves only one variable. It is
assumed that one decision variable has been selected
to be studied by the user. Also, the equivalence
rule has been applied which splits a condition part
of a formula in disjunctive normal form into a set of
decision rules with condition parts which are single
products (t-rules E4 and DP2).

2. Each atomic form is a function symbol with a list of
single variable arguments. It is assumed that the
user has converted forms such as

[g(f(xi))]

into a form

by introducing a new predicate p for each
function symbol f which is in an argument list and a
variable y for each occurrence of the function f(xi)
in an argument list. The predicate p is assumed to
have the value TRUE if y = f(xi) and FALSE
otherwise. For example:

[shape(part(x1)) = 1]

is assumed to be transformed into e.g. an expression

3. All variables (arguments) are assumed to be
existentially quantified. Variables with the same
function symbol part are assumed to have the same
domain. Furthermore, variables with values from the
same domain are assumed to take on distinct values
from the domain. Variables may be restricted to a
subrange of values by using the third option in the
selector definition. Using this method, a constant
may be specified as an argument to a function.

4. If a reference of the second form is specified at
least once in the input (production 16) (e.g.
[f(x1)=2..2] or [f(x1)=2..5]), a domain of type
interval is assumed; otherwise, the domain of a
variable or function is assumed to be of nominal type
or tree-structured if such a structure is specified.

5. The last form of the reference uses a value (*). This
specifies the entire domain of the associated
function symbol. (E.g. [p=*] means that the
selector is satisfied for any value of p other than
the value NA.) If a selector is omitted from the
decision rule entirely, the program assumes that the
function has the value NA (not applicable). Such a
domain value has no generalization (other than NA
itself). Although NA is not specified in an input
rule, it can be a valid value of a variable.

6. If the second form of the selector is used
(production 13), the program
assumes that the function symbol has the value 1.
This may be used to specify a TRUE value for a
predicate (predicates are treated as functions with
domain [0,1]). In general, to simplify the
expressions, only positive values of predicates are
specified. The program then uses only positive
instances of the predicate in the generalizations.
If negative values of a predicate are desired in the
generalizations, these relations should be included
in the initial decision rule specification (see the
arch example in chapter 6). (E.g. [ontop(p1,p2)]
specifies that p1 is on top of p2; [ontop(p1,p2)=0]
specifies that p1 is not on top of p2.)

7. If the fourth form of the selector is used (e.g.
[p(x1.x2)]), then the order of the arguments is
assumed to be irrelevant.

Restrictions:

Restrictions must satisfy the following
production:

<REST> ::= <CONDITION> => <SELECTOR>

where the arguments of the selector part must all
appear in the condition part. CONDITION and SELECTOR
are the same as in the decision rule grammar. For
example:

[left(s1,s2)][left(s2,s3)] => [left(s1,s3)].

Restrictions extend or modify decision rule
specifications. For every occurrence of the CONDITION
of the restriction in the condition part of the
decision rule, the selector is added to the condition
part of the decision rule if it is not already
there. If the SELECTOR is already in the condition
part of a decision rule, then the values of the
SELECTOR replace the values in the occurrence of
SELECTOR in the decision rule. Using this feature,
the transitive closure of a transitive function may
be calculated. A restriction which adds equivalence
type predicates is included by default. Each
restriction is applied to each decision rule as it is
entered into the program.
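
The transitivity restriction in the example can be illustrated with a short fixed-point loop. This is only a sketch: the tuple encoding of selectors, the function name `apply_transitivity`, and the choice to repeat the restriction until nothing new is added are assumptions made for this example and are not taken from the program.

```python
def apply_transitivity(selectors, pred):
    """Add pred(x, z) whenever pred(x, y) and pred(y, z) are present.

    `selectors` is a set of (predicate, arg1, arg2) tuples standing for the
    selectors in the condition part of one decision rule. The restriction is
    reapplied until no new selector appears.
    """
    result = set(selectors)
    changed = True
    while changed:
        changed = False
        pairs = [(x, y) for (p, x, y) in result if p == pred]
        for x, y in pairs:
            for y2, z in pairs:
                new = (pred, x, z)
                if y == y2 and new not in result:
                    result.add(new)
                    changed = True
    return result

rule = {("left", "s1", "s2"), ("left", "s2", "s3"), ("left", "s3", "s4")}
closed = apply_transitivity(rule, "left")
for selector in sorted(closed):
    print(selector)
```

For the three selectors given, the loop adds left(s1,s3), left(s2,s4), and left(s1,s4), so the condition part ends up with the transitive closure of left over s1..s4.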

Domain Generalization Structure Specification

This specification must satisfy the
production:

::= [<FN-SYM> = <REF>] => [<FN-SYM> = <REF>]

where the two function symbols are the same and
FN-SYM and REF are as in the decision rule grammar.
For example:

[s = 1,3,4] => [s = 6].

[s = 0,2] => [s = 7].

[s = 4] => [s = 8].

[s = 6,8] => [s = 9].

Bnterin

program output is given in appendix A. Although input
decision rules may be any formula which satisfies the grammar
given in the previous section, the program only searches for
generalizations which can be represented by a connected graph
structure with functional dependence edges. The user may
construct decision rules to satisfy this constraint by
including new predicates which link products which have no
arguments in common. (In general, it can be tested to see if
a product of selectors in a decision rule has a connected
graph structure by determining whether there is a
partitioning of selectors into two products which have no
argument in common. If there is such a partitioning, then
the graph structure is not connected.)
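
The partitioning test in the parenthetical remark can be written as a reachability check over shared arguments. This is a minimal sketch, not the program's own procedure; the function name `is_connected` and the representation of each selector as the collection of variable names it mentions are assumptions made for this example.

```python
def is_connected(selectors):
    """True if the selectors cannot be split into two groups with no argument in common.

    Each selector is given as an iterable of the variable names it mentions.
    """
    selectors = [set(s) for s in selectors]
    if not selectors:
        return True
    reached = set(selectors[0])     # variables linked to the first selector
    remaining = selectors[1:]
    progress = True
    while remaining and progress:
        progress = False
        for s in list(remaining):
            if s & reached:         # shares an argument with the linked group
                reached |= s
                remaining.remove(s)
                progress = True
    return not remaining

# [ontop(p1,p2)][shape(p1)=C][shape(p2)=T]: ontop links both parts
print(is_connected([{"p1", "p2"}, {"p1"}, {"p2"}]))  # True
# [shape(p1)=C][shape(p2)=T]: no selector links p1 with p2
print(is_connected([{"p1"}, {"p2"}]))                # False
```

In the second case the product splits into two groups with no common argument, which is the situation the user is asked to avoid by adding a linking predicate.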

Different generalizations may be obtained by varying
the program parameters and reordering the decision rules. In
general, increasing any of the MAXSTAR, ALTER, or NCONSIST
parameters (described below) will cause the program to
require more memory and time but the resulting
generalizations may be more optimal. Rearranging the
optimality criteria may also produce a different result. A
higher tolerance associated with a particular cost function
will reduce the selection based on that function. The
default optimality criteria (or optimality cost functions) in
the order of application are the following:

1. Minimize the inconsistencies of a rule with tolerance
0.30, i.e., the number of events covered by the rule
which are not supposed to be covered by the rule
(this is cost function number 3 in section 5.5). This
allows the program to produce consistent
generalizations quickly. The high tolerance removes
highly inconsistent rules while leaving selection of
nearly consistent rules to the remaining cost
functions.

2. Minimize the number of products in the complete
generalization with tolerance 0.00 (this is cost
function number 1 in section 5.5).

3. Minimize the number of selectors in each product
produced by the program (function 2 in section 5.5).

4. Minimize the cost of functions in each product
(function 4 in section 5.5). If costs are specified,
this criterion may be moved forward. If the user
wishes certain functions to appear in the resulting
products, the costs for these functions may be
specified (given negative cost). Similarly, functions
which are very difficult or costly to measure may be
given appropriate positive costs.

5. Maximize the intersection of resulting rules (cost
function 5 in section 5.5). In real situations, the
separate products produced by the program may
represent a large number of common input decision
rules along with some peculiarities of specific
situations which arise. Use of this cost function
will favor the selection of a more representative
result as opposed to one which describes only a
particular set of situations. Appendix B contains a
more detailed discussion of a similar cost function
in the VL1 system.

The program contains about 40 procedures.
Five major tasks which are performed by some groups
of procedures are described below.

5.3 ~ormation of a Complete Generalization

1 generalization is found of a set of decision rules

is found containing a specified value I in the decision

part. Tvo sets of products are generated: a set P1 vhich

contains all products in the CONDITIO. parts of rules with a

decision value of I, and a set PO which contains all other

65

products. Each product is called a c-formula

(con1unctive-formula).

One c-formula E1 of P1 is selected at random and a connected-conjunctive-formula (c2-formula) is generated which is a generalization of E1, consistent with respect to the set P0, and near optimal with respect to a user defined criterion. A c-formula is connected if its graph structure representation is weakly connected by functional dependence. A c-formula is consistent with respect to a set of c-formulas P0 if it does not intersect with any element of the set P0 (i.e., there is no event which satisfies both c-formulas).

Once a generalization of E1 is found, it is saved in a set CQ and all elements of P1 which are covered by this generalization are removed from P1. One c-formula E1 covers another c-formula E0 if E1 is a generalization of E0. Another element of the new set P1 is selected and the procedure repeated. When there are no more elements in P1, the complete, consistent generalization of the set of c-formulas P1 is the disjunction of all c2-formulas in CQ.
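
The covering loop of this section can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code of the program: `generalize` and `covers` are hypothetical callables standing in for the star generation of sections 5.6 and 5.7 and the cover test of section 5.4, the toy formula model (a dictionary from attribute to allowed values) is invented for the example, and the first remaining element is taken instead of a random one so that the result is reproducible.

```python
def complete_generalization(p1, p0, generalize, covers):
    """Return the list CQ whose disjunction covers every element of p1.

    generalize(e1, remaining, p0) -> formula covering e1, consistent with p0
    covers(g, e)                  -> True if g is a generalization of e
    Both callables are placeholders (assumptions of this sketch).
    """
    remaining = list(p1)
    cq = []
    while remaining:
        e1 = remaining[0]            # the text selects this element at random
        g = generalize(e1, remaining, p0)
        if not covers(g, e1):
            raise ValueError("a generalization must cover its seed")
        cq.append(g)
        # Remove every element of the positive set covered by g.
        remaining = [e for e in remaining if not covers(g, e)]
    return cq


# Toy model (assumption): a formula is {attribute: set of allowed values};
# an event is a formula whose value sets are singletons.
def toy_covers(g, e):
    return all(attr in e and e[attr] <= vals for attr, vals in g.items())


def toy_generalize(e1, remaining, p0):
    # Try single selectors of the seed that exclude every negative event.
    for attr, vals in e1.items():
        g = {attr: vals}
        if not any(toy_covers(g, neg) for neg in p0):
            return g
    return dict(e1)                  # fall back to the seed itself
```

The loop terminates because each generalization covers at least its own seed, so the remaining set shrinks on every pass.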

5.4 Determine Cover and Intersection of 2 Formulas

Two similar procedures are described here. The test to determine whether a c2-formula E covers a c-formula E' is used when E' is an element of the set P1. The test to determine whether E intersects with E' is used when E' is an element of P0 (i.e., to determine if E and E' are consistent). The procedure uses the graph structure representations of E and E' (G and G' with nodes and edges V, E, V', E' respectively). The graph G is assumed to be weakly connected. E covers E' if there is a specializing isomorphism (s-isomorphism) from G to a subgraph of G'. The reverse mapping (from a subgraph of G' to G) is called a generalizing isomorphism (g-isomorphism). E intersects with E' if there is an intersecting isomorphism (i-isomorphism) between G and a subgraph of G'. Each isomorphism from G to a subgraph of G' is a 1-to-1 correspondence between nodes and edges of G and a subset of nodes and edges of G' where the correspondence (or matching) of nodes and edges is defined as follows:

A node n of G matches a node n' of G' if:

1. They are both selector nodes or both quantified variable nodes.

and 2. If they are selector nodes, then the function symbols in both nodes are the same. If they are variable nodes, they are of the same group of variables.

and 3. With an s-isomorphism or g-isomorphism, the set of values associated with n is a generalization of the set of values associated with n'. (The sets of values may be equal.) With an i-isomorphism, the sets of values intersect. In the case of selector nodes, these values are the elements in the reference of the selector. In the case of quantified variable nodes, these values are the subranges of the variables.

An edge of G matches an edge of G' if:

1. They have the same label

and 2. The respective head nodes match and the tail nodes match.
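
The two matching definitions above translate directly into predicates. The sketch below is an illustration only: the `Node` record, the `mode` argument ("s" for the s- or g-isomorphism case, "i" for the i-isomorphism case) and the representation of an edge as a (tail, label, head) triple are assumptions made for the example, and "generalization" of a value set is read as set inclusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str          # "selector" or "variable"
    symbol: str        # function symbol, or variable group for variable nodes
    values: frozenset  # reference of the selector, or subrange of the variable


def nodes_match(n, n_prime, mode):
    """Node n of G against node n_prime of G'."""
    if n.kind != n_prime.kind:        # condition 1: same kind of node
        return False
    if n.symbol != n_prime.symbol:    # condition 2: same function or group
        return False
    if mode == "s":                   # condition 3: values of n include those of n'
        return n.values >= n_prime.values
    if mode == "i":                   # condition 3: value sets intersect
        return bool(n.values & n_prime.values)
    raise ValueError("mode must be 's' or 'i'")


def edges_match(edge, edge_prime, mode):
    """An edge is a (tail_node, label, head_node) triple in this sketch."""
    tail, label, head = edge
    tail_p, label_p, head_p = edge_prime
    return (label == label_p
            and nodes_match(head, head_p, mode)
            and nodes_match(tail, tail_p, mode))
```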

To speed rejection, a quick scan through the nodes is made to see if there is a correspondence between nodes of G and a subset of nodes of G' (ignoring links between nodes). If there is a possible correspondence, a procedure is invoked which locates a subgraph of G' which is isomorphic to G and assigns each node of G to a corresponding node of G'. The procedure is as follows:


1. Select a starting node (n_0) of G which contains the most labelled incoming edges. (This is the selector node with the largest number of arguments.) Selecting a node of this type ensures that there is a minimum of backtracking through the starting node.

2. A rooted directed acyclic graph G* with nodes and edges V* and E* is constructed from G by copying all nodes and edges of G to G* and assigning a direction to each edge of G* so that G* has no cycles and for each node x in V* - {n*_0}, there is a path from n*_0 to x. (n*_0 is the node of G* which corresponds to n_0 in G.) A traversal of the graph G* is the list of edges and nodes visited in a preorder traversal of G* with root n*_0. A preorder traversal of a subgraph with root x visits the node x, visits each outgoing edge of x and traverses the subgraph which has as the root, the head node of the traversed edge.

3. The graph G is traversed in the order of the traversal of corresponding nodes and edges of G*. At each step of the traversal of G, a node and new edge of G' is found which match the node and edge of G. If two nodes match, they are assigned to each other and a record of the matching nodes and edges is kept for each assignment in a backtrack list. To establish a 1-to-1 correspondence, nodes of one graph which are previously assigned can only match corresponding assigned nodes of the other graph.

4. If there is no node and edge of G' which matches a node and edge of G, the procedure backtracks to the previous nodes and edges on the backtrack list, erasing the last nodes and edges on the backtrack list and the assignments associated with the nodes. Another node and edge of G' are selected which match the last node and edge of G on the backtrack list and the traversal of G continues. If no node and edge of G' can be found at this point, then the procedure again backtracks until a new match is found or the backtrack list is exhausted.

5. If the traversal of G is complete, then G covers (intersects) G'. If the backtrack list is exhausted, then G does not cover (intersect) G'.

A feature is included which finds all subgraphs of G' which are isomorphic to G. This feature is used in section 5.7 and in adding restrictions to c-formulas.

6. If the traversal of G is complete, then the current set of assignments is the desired mapping. To find the next isomorphism, the procedure returns to step 4 assuming that the last nodes and edges on the backtrack list did not match.
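
A compact way to see how steps 1 through 6 fit together is the following Python sketch. It is a simplified illustration and not the procedure of the program: the traversal order is obtained by a depth-first search that ignores edge direction instead of constructing G* explicitly, recursion plays the role of the backtrack list, and the data layout (dictionaries of nodes, sets of (tail, label, head) edges, a caller supplied `node_ok` test) is assumed for the example. Resuming the generator after a result corresponds to step 6.

```python
def iter_isomorphisms(g_nodes, g_edges, h_nodes, h_edges, node_ok):
    """Yield each mapping {node of G: node of G'} that preserves labelled edges.

    g_nodes, h_nodes : dict node_id -> node data
    g_edges, h_edges : set of (tail_id, label, head_id)
    node_ok(a, b)    : node matching test supplied by the caller
    G is assumed to be non-empty and weakly connected.
    """
    # Step 1: start at the node with the most incoming edges.
    incoming = {n: 0 for n in g_nodes}
    for _, _, head in g_edges:
        incoming[head] += 1
    start = max(g_nodes, key=lambda n: incoming[n])

    # Step 2 (simplified): fix a depth-first preorder over the nodes of G.
    neighbours = {n: [] for n in g_nodes}
    for tail, _, head in g_edges:
        neighbours[tail].append(head)
        neighbours[head].append(tail)
    order, stack, seen = [], [start], set()
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        order.append(n)
        stack.extend(reversed(neighbours[n]))

    def edges_ok(n, candidate, assigned):
        # Each edge of G joining n to an assigned node (or to itself) needs
        # a counterpart in G' with the same label and direction.
        for tail, label, head in g_edges:
            if tail == n and (head in assigned or head == n):
                image = candidate if head == n else assigned[head]
                if (candidate, label, image) not in h_edges:
                    return False
            elif head == n and tail in assigned:
                if (assigned[tail], label, candidate) not in h_edges:
                    return False
        return True

    def extend(i, assigned):
        if i == len(order):                 # steps 5 and 6: traversal complete
            yield dict(assigned)
            return
        n = order[i]
        for candidate in h_nodes:           # step 3: look for a match in G'
            if candidate in assigned.values():
                continue                    # keep the mapping 1-to-1
            if (node_ok(g_nodes[n], h_nodes[candidate])
                    and edges_ok(n, candidate, assigned)):
                assigned[n] = candidate
                yield from extend(i + 1, assigned)
                del assigned[n]             # step 4: backtrack

    return extend(0, {})
```

A cover (or intersection) test only needs the first result, for example `next(iter_isomorphisms(...), None)`, while exhausting the generator lists every isomorphic subgraph.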

5.5 Trimming a Set of c-formulas

Trimming is the process of selecting the MAXSTAR best elements of a set of c-formulas with regard to a user defined criterion. The user specifies the cost functions which are to be used, the order in which they should be applied, and the tolerance associated with each cost function. Implemented cost functions are:

1. The number of events of the current set P1 which are covered by a c2-formula. (The negative of this quantity is used to obtain a cost.) This function minimizes the number of c-formulas in CQ.

2. The number of selectors in a c2-formula. The function minimizes the number of selectors in the c-formula.

3. The number of events of P0 which intersect with a c2-formula. This function leads more rapidly to consistent c-formulas.

4. The total cost of all functions contained in a c2-formula.

5. The number of events of the original set P1 which are covered by a c2-formula. (The negative of this quantity is used to obtain a cost.) This function finds the most representative c-formulas.

A set of c-formulas is trimmed using n cost functions (cf_1, cf_2, ..., cf_n) and a relative tolerance for each cost function (tol_1, tol_2, ..., tol_n). The costs are applied in the order specified by the user (cf_1 first, cf_2 second, etc.). For each cost function cf_i, the MAXSTAR best c2-formulas along with all c2-formulas equivalent in cost to the MAXSTAR best c-formulas are passed to the evaluation using the next cost function cf_{i+1}. Other c2-formulas in the set of c-formulas are discarded. With the last specified cost function (cf_n), only the MAXSTAR best c-formulas are retained.

For each cost function cf_i, i=1,2,...,n, equivalence of two c-formulas in cost is defined using an absolute tolerance (T_i). Suppose the set of c-formulas P is composed of a list p_1, p_2, ..., p_m. After the values of cost function cf_i have been evaluated (cf_i(p_j)) for each c-formula, the maximum and minimum cost function values, cf_i(p_max) and cf_i(p_min), are determined. An absolute tolerance (T_i) is calculated using the user specified tolerance tol_i as follows:

$$T_i = \mathrm{tol}_i \cdot \bigl(\mathrm{cf}_i(p_{\max}) - \mathrm{cf}_i(p_{\min})\bigr)$$

The MAXSTAR c-formulas of least cost are determined and the list reordered (p_1, p_2, ..., p_MAXSTAR, ..., p_m). If i < n, the c-formulas p_j with j > MAXSTAR which are equivalent in cost to the MAXSTAR best (that is, within the absolute tolerance T_i) are retained together with p_1, ..., p_MAXSTAR; if i = n, only p_1, ..., p_MAXSTAR are retained. The set of c2-formulas which remains is the desired trimmed set of c2-formulas.
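
The trimming rule can be illustrated with a short Python sketch. It is written under assumptions that the text does not settle: "equivalent in cost" is taken to mean within the absolute tolerance of the cost of the MAXSTAR-th best formula, a set that already has MAXSTAR or fewer elements is returned unchanged, and the function and parameter names are hypothetical.

```python
def trim(formulas, cost_functions, tolerances, maxstar):
    """Keep the maxstar best formulas under cost functions applied in order.

    cost_functions : list of callables formula -> number (lower is better)
    tolerances     : relative tolerances, one per cost function
    """
    survivors = list(formulas)
    last = len(cost_functions) - 1
    for i, (cf, tol) in enumerate(zip(cost_functions, tolerances)):
        if len(survivors) <= maxstar:
            break                               # nothing left to discard
        # Sort by cost; the index breaks ties so the order is stable.
        scored = sorted((cf(p), k) for k, p in enumerate(survivors))
        costs = [c for c, _ in scored]
        # Absolute tolerance = relative tolerance * (max cost - min cost).
        absolute_tol = tol * (costs[-1] - costs[0])
        cutoff = costs[maxstar - 1]             # cost of the MAXSTAR-th best
        if i == last:
            keep = scored[:maxstar]             # last function: exactly MAXSTAR
        else:
            # Assumption: equivalence is measured against the cutoff cost.
            keep = [(c, k) for c, k in scored if c <= cutoff + absolute_tol]
        survivors = [survivors[k] for _, k in keep]
    return survivors
```

With a tolerance of zero, only exact ties with the MAXSTAR-th best formula are passed on to the next cost function; larger tolerances let later cost functions choose among more candidates.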


5.6 Formation of a Set of Consistent Generalizations

A star (denoted by MQ) is formed which covers E1. (A star which covers E1 is a set of consistent c2-formulas which cover E1.) The procedure begins by forming a partial star (P) which contains a set of c-formulas each consisting of one selector of E1. (The partial star may contain c2-formulas which are not consistent with respect to P0.) This partial star is trimmed according to the user supplied optimality criterion. The conjunction in each c-formula which remains after trimming is multiplied by each selector of E1 which is directly connected to it to form a new partial star. Consistent c2-formulas are placed in MQ. The partial star is again trimmed and new selectors added to each product until the desired set MQ of c2-formulas is obtained. Several parameters control the sizes of sets in this procedure:

MAXSTAR -- the number of c-formulas in a partial star after trimming

NCONSIST -- the minimum number of consistent c2-formulas which must be in MQ

ALTER -- the maximum number of new alternatives which may be formed by adding selectors to an element of a partial star.

In the following discussion, equivalence type selectors (i.e., selectors of the form [f(x_1,x_2)=same]) are treated differently from selectors involving a function symbol and a set of values in the reference.

1. A partial star P is formed which contains all selectors of E1 with unary functions.

2. P is trimmed to contain only the best MAXSTAR c2-formulas. Consistent c-formulas are placed into MQ. If fewer than NCONSIST elements are in MQ, then step 3 is executed. Otherwise, the AQ procedure is applied to the elements of MQ (as described in section 5.7).

3. A new partial star P' is formed from the old one (P). For each element p_i in P, a list of all variables (i.e., arguments of selectors of p_i) is formed. All arguments of equivalence type selectors which occur in the corresponding selector of E1 are also included in the list.

4. A list of all selectors of E1 which are not already in p_i and which have at least one argument in the variable list (found in step 3) is created. If there are more than ALTER elements in this list, the best ALTER selectors are retained (using as a criterion the cost of functions in the selectors).

5. For each of these selectors, a new c2-formula is formed which contains the original selectors in p_i and the new selector. If the new c-formula contains an equivalence selector with only one argument, then the new c-formula is discarded; otherwise, it is placed in P'. Steps 2 through 5 are repeated, setting P = P' in step 2 until NCONSIST elements are in MQ or until no new elements are in the new partial star P'.
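
The growth of the partial star in steps 1 through 5 can be sketched as follows. This Python fragment is a simplified illustration under stated assumptions: a selector is modelled as a (function symbol, argument tuple) pair, a c-formula as a frozenset of selectors, the special handling of equivalence type selectors is omitted, and `is_consistent`, `trim` and `selector_cost` are hypothetical callables supplied by the caller.

```python
def grow_star(e1, is_consistent, trim, nconsist, alter,
              selector_cost=lambda s: 0):
    """Return the list of consistent formulas built from selectors of e1.

    e1 : list of selectors, each a (function_symbol, argument_tuple) pair
    """
    mq = []
    # Step 1: one formula for each selector with a unary function.
    partial = [frozenset([s]) for s in e1 if len(s[1]) == 1]
    while partial:
        partial = trim(partial)                          # step 2
        for p in partial:
            if is_consistent(p) and p not in mq:
                mq.append(p)
        if len(mq) >= nconsist:
            break
        new_partial = []
        for p in partial:
            # Step 3: variables appearing as arguments in p.
            variables = {a for _, args in p for a in args}
            # Step 4: selectors of e1 connected to p, cheapest first.
            candidates = [s for s in e1
                          if s not in p and any(a in variables for a in s[1])]
            candidates = sorted(candidates, key=selector_cost)[:alter]
            # Step 5: extend p by one selector at a time.
            for s in candidates:
                extended = p | {s}
                if extended not in new_partial:
                    new_partial.append(extended)
        partial = new_partial
    return mq
```

Every pass adds one selector to each surviving formula, so the loop stops either when enough consistent formulas have been collected or when no connected selector remains to be added.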

5.7 Extending the References of a Consistent c2-formula

Each consistent c2-formula of MQ (obtained in section 5.6 step 2) contains an alternative, near optimal conjunction of selectors of E1 which distinguishes E1 from any c-formula of P0. Using some methods developed for the program AQ7, the reference of each of these selectors may be generalized to obtain a consistent c2-formula which will possibly cover more c-formulas of P1.

Given a graph G of a consistent c2-formula mq in MQ, a c-structure is created (G*) by replacing all references of nodes of G with * (the complete set of values for the function in the selector). The nodes of G* are enumerated n*_i (i=1,2,...,m) and a VL1 system is created with each VL1 variable x_i related to a node n*_i of G*. The domain (denoted D_i) of variable x_i is the same as the domain of the function or variable in the node n*_i.

The VL1 event space may be defined:

$$E = D_1 \times D_2 \times \cdots \times D_m$$

Two sets of VL1 complexes L and L' are formed from the events of the current set P1 and the set P0 respectively. Individual complexes in these sets are denoted l_i and l'_i. Each complex covers a set of points in the space E. For each element of P1 and P0 all isomorphisms from the c-structure G* to the graph representation G of a c-formula in P1 or P0 are determined. For each isomorphism obtained from P1 and P0, a VL1 complex is created (l_i or l'_i). Denoting the value sets of the nodes in a subgraph of G which is isomorphic to G* as R_1, R_2, ..., R_m, the corresponding VL1 complex may be written:

$$[x_1 = R_1][x_2 = R_2] \cdots [x_m = R_m]$$

This complex covers the VL1 events:

$$R_1 \times R_2 \times \cdots \times R_m$$

in the event space E.
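
As a small illustration of this product form, the Python sketch below enumerates the events covered by a complex and tests whether two complexes over the same variables share an event. The representation of a complex as a list of value sets and the function names are assumptions made for the example.

```python
from itertools import product


def events_covered(complex_):
    """complex_ is a list of value sets, one per variable.

    Returns the set of events (tuples of values) covered by the complex,
    i.e. the Cartesian product of its value sets.
    """
    return set(product(*complex_))


def disjoint(c1, c2):
    """Two complexes over the same variables share no event exactly when
    the value sets of at least one variable do not intersect."""
    return any(not (set(a) & set(b)) for a, b in zip(c1, c2))
```

The second function avoids enumerating the event space, which matters when the domains are large.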

First, the complex l_1 which results from extracting the values from the nodes of the graph of mq is generated. Then all other isomorphisms from G* to c-formulas of P1 are determined and a complex added to L for each isomorphism which results in a new complex that is not already in L. The set L' is created in a similar manner by generating all distinct complexes resulting from isomorphisms from G* to c-formulas of P0.

Since the c-formula mq is consistent with regard to P0, the complex l_1 is disjoint from all complexes in L'. (That is, there is no point in the VL1 event space E which is in both l_1 and a complex of L'.) A near optimal extension of l_1 against L' in E may be calculated us
