Inductive Inference in the Variable Valued Predicate Logic
System VL21: Methodology and Computer Implementation
by
James B. Larson
Department of Computer Science University of Illinois
Urbana, Illinois
May 1977
This work was submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate College of the University of Illinois and was supported in part by the National Science Foundation, Grant No. NSF MCS 74-03514.
INDUCTIVE INFERENCE IN THE VARIABLE VALUED PREDICATE LOGIC
SYSTEM VL21: METHODOLOGY AND COMPUTER IMPLEMENTATION
James Burton Larson, Ph.D.
Department of Computer Science
University of Illinois at Urbana-Champaign, 1977
A formal methodology and computer program are
presented for the transformation of a set of user supplied
logical decision rules into a new, generalized set of
decision rules which is near optimal according to a user
supplied criterion. The VL21 logic system (a multi-valued
version of a first order predicate calculus) is used as the
framework for defining and expressing decision rules and
transformations on decision rules. The program INDUCE_1,
which implements certain inductive inference rules using a
graphical representation of VL21 expressions, is described and
some examples of inductive problems solved by the program are
given.
ACKNOWLEDGEMENTS
I would like to express my gratitude to the following
people for their help on this thesis: First, Professor R.
S. Michalski for his inspiration, challenging problems, and
many significant suggestions especially regarding
meta-functions; my committee, Professor D. Plaisted and Dr.
Don Friesen, for encouragement and many helpful discussions.
I especially wish to recognize the valuable support of
Richard Chilausky during the early stages of this research,
Barr Segal for proofreading of the manuscript, and the many
predecessors in the area of inductive inference and variable
valued logic whose efforts have been much appreciated in the
development of this thesis. I am grateful for the financial
support of the National Science Foundation, the Department of
Computer Science, the Research Board of the University of
Illinois, and the Computing Services Offices, the latter for
supplying the computer time to implement the text formatter
used for this paper.
Finally, without the moral encouragement of my wife,
Rhonda, this work surely would have ended years ago from
utter frustration and discouragement.
PREFACE
This paper was prepared on a CYBER 175 computer at
the University of Illinois using a text formatter written by
the author. Since several special characters were not
available on the print train of the printer, the following
character combinations have special meaning:
Character    Meaning
             logical disjunction
             logical conjunction
             set intersection
             the existential quantifier
             the universal quantifier
TABLE OF CONTENTS
CHAPTER
1. Introduction
1.1 The Problem
1.2 Previous Specific Applications
1.3 Formal Systems for Inductive Inference
1.4 Overview of the Following Chapters
2. Representing Decisions in the VL2 System
2.1 VL System Structure
2.2 Selector Formation and Interpretation Rules
2.3 VL Formation Rule
2.4 Interpretation Rules
2.5 VL Decision Rules
3. VL Transformation Rules
3.1 Equivalence Transformation
3.2 Generalizing Transformations
3.3 Specializing Transformations
3.4 Transformation Rules Involving the Decision Part
3.5 Example of Application of Transformation Rules
3.6 Efficient Application of Generalization Rules
4. Computer Representation of VL Decision Rules
5. Algorithms and Computer Implementation
5.1 Input to the Program
5.2 Program Output
5.3 Formation of a Complete Generalization
5.4 Determine Cover and Intersection of 2 Formulas
5.5 Trimming a Set of c-formulas
5.6 Formation of a Set of Consistent Generalizations
5.7 Extending the References of a Consistent c2-formula
5.8 Adding New Functions and Predicates to c-formulas
6. Examples of Decision Rule Generation using INDUCE_1
6.1 Figures Example (EX 1)
6.2 Arch Example (EX 2)
6.3 Trains (EX 3)
6.4 Textures (EX 4)
7. Current Limitations and Possible Extensions
LIST OF REFERENCES
APPENDIX A
APPENDIX B
VITA
1. Introduction
An important problem which is often presented to
computer systems is that of extracting relevant information
from complex data in order to gain a better understanding of
the meaning behind such data. Most current methods are
incapable of adequately describing highly structured
situations and produce results which are difficult to
interpret. A selection of those systems which overcome these
difficulties is given in sections 1.2 and 1.3.
The following chapters deal with finding useful (as
defined by the user with an optimality criterion),
generalized information about sets of situations represented
as logical VL decision rules. A decision rule is a form
CONDITION => DECISION
where CONDITION describes some set of situations and DECISION
describes some new situation or action which is indicated if
an existing situation satisfies the description in
CONDITION. If no situation satisfies the CONDITION in a
decision rule, the rule makes a NULL decision. The
descriptions in CONDITION and DECISION are represented in the
VL2 logic system. This system is a variable valued first
order predicate calculus with a rich set of operators and
the facility for allowing user defined domain sizes and
structures for variables and functions appropriate for the
problem at hand. The approach taken here is to apply
inductive inference rules to logical decision rules which
express some decisions made with sets of situations in order
to form new, near-optimal decision rules which retain the
decision making capabilities of the original rules.
1.1 The Problem
The specific induction problem being investigated is
as follows:
Given a set of decision rules:
\[
\begin{array}{llll}
C_{1,1} \Rightarrow D_1, & C_{1,2} \Rightarrow D_1, & \ldots, & C_{1,t_1} \Rightarrow D_1 \\
C_{2,1} \Rightarrow D_2, & C_{2,2} \Rightarrow D_2, & \ldots, & C_{2,t_2} \Rightarrow D_2 \\
\vdots & & & \\
C_{n,1} \Rightarrow D_n, & C_{n,2} \Rightarrow D_n, & \ldots, & C_{n,t_n} \Rightarrow D_n
\end{array}
\qquad (1.1)
\]
where $C_{i,j}$ and $D_i$ are expressions in the VL2 system which
represent the CONDITION and DECISION parts of decision rules
respectively, then find, through an application of
generalization rules, a set of VL decision rules:
\[
\begin{array}{l}
C_1 \Rightarrow D_1 \\
C_2 \Rightarrow D_2 \\
\vdots \\
C_n \Rightarrow D_n
\end{array}
\qquad (1.2)
\]
which are with regard to the rules (1.1):
1) consistent
2) complete
3) optimal with regard to a user supplied optimality
criterion.
The new rules are consistent if for any situation for which
the new rules assign a decision (a non-NULL decision), the
initial rules assign the same decision or a NULL decision.
They are complete if for any situation for which the initial
rules assign a decision, the new rules assign a decision.
From the initial rules, it is usually possible to derive many
sets of rules which are consistent and complete. Therefore,
a criterion of optimality (defined by a user according to his
problem) is used to select a few alternatives which are most
desirable according to the specific induction problem.
Attention is restricted to sets of rules which make only one
decision for a given situation.
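The two definitions above can be checked mechanically when the set of
situations is small enough to enumerate. The following Python sketch is
only an illustration under assumptions that are not in the thesis: rules
are encoded as (condition, decision) pairs with the condition given as a
Python predicate, None stands for the NULL decision, and the helper names
decide, is_consistent and is_complete are hypothetical.

```python
from typing import Callable, Hashable, Iterable, List, Optional, Tuple

# Assumed encoding (not from the thesis): a rule is a (condition, decision)
# pair; the condition is a predicate over situations.
Rule = Tuple[Callable[[Hashable], bool], Hashable]


def decide(rules: List[Rule], situation: Hashable) -> Optional[Hashable]:
    """Return the decision of the first satisfied rule, or None for NULL."""
    for condition, decision in rules:
        if condition(situation):
            return decision
    return None


def is_consistent(new: List[Rule], initial: List[Rule],
                  situations: Iterable[Hashable]) -> bool:
    """New rules never contradict a non-NULL decision of the initial rules."""
    for s in situations:
        d_new, d_old = decide(new, s), decide(initial, s)
        if d_new is not None and d_old is not None and d_new != d_old:
            return False
    return True


def is_complete(new: List[Rule], initial: List[Rule],
                situations: Iterable[Hashable]) -> bool:
    """New rules decide every situation that the initial rules decide."""
    return all(decide(new, s) is not None
               for s in situations if decide(initial, s) is not None)


# Toy data: situations are the integers 0..9.
initial = [(lambda s: s in (1, 2), "small"), (lambda s: s in (8, 9), "large")]
generalized = [(lambda s: s <= 4, "small"), (lambda s: s >= 6, "large")]
situations = range(10)

consistent = is_consistent(generalized, initial, situations)
complete = is_complete(generalized, initial, situations)
```

In this toy case the generalized rules cover more situations than the
initial ones without contradicting them, which is the behaviour the
definitions are meant to allow.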
1.2 Previous Specific Applications
Inductive inference is used here to describe a formal
method for rewriting or generalizing available data in order
to give new information about a problem and make new
decisions which could not be obtained before. Statistical
methods are probably the most widely used forms of inductive
inference. These methods require a great deal of a-priori
knowledge including the availability of a large set of data,
knowledge about the interdependence of variables,
and an understanding of the type of underlying distribution
of the data [Croft-71]. In addition, statistical results may
be difficult to read and interpret [Larson 76] (e.g. a
conditional probability matrix).
The first approach to automated inductive inference
using logic was most likely developed by Hunt [Hunt 66]. He
described a number of different schemes for generating
decision trees which can be used to distinguish between sets
of letter sequences. Although a decision tree produces an
elegant procedure which can be easily executed on a computer,
it lacks the flexibility necessary to represent more general
concepts.
The HEURISTIC-DENDRAL program [Buchanan et.al. 69]
provides a model appropriate for representing the structure
of chemical compounds and some transformations representing
possible chemical reactions which can be applied to the
compound representations under certain known physical
constraints. The program finds a set of possible structures
of a compound knowing its empirical formula and mass
spectrometer data by suggesting various structures for the
compound and applying transformations to the structures under
the guidance of a set of heuristics based on the mass
spectrometer data. The meta-DENDRAL program [Buchanan
et.al. 72] finds a general mechanism or theory which explains
the transformations which take place relying on the knowledge
of those transformations which are plausible and those which
are forbidden.
Computer aided medical diagnosis is another area in
which logical inductive inference methods have been
suggested. Pople et al. [Pople et.al. 72] have suggested a graph
structure representation of biomedical facts and an approach
to forming theories by finding common subgraphs using user
supplied suggestions. Of particular note is work in the area
of computer aided medical diagnosis and plant pathology
[Michalski 73, 74, Chilausky et.al. 76, Larson 76] which uses
the VL1 system (a variable valued logic system which is the
precursor to the VL2 system used here) and the program
AQVAL/1-AQ7 to infer descriptions of classes of liver diseases
and soybean diseases. The latter work is a specific
application of the VL1 logic system to several problems.
Winston [Winston 70] demonstrates specific procedures
which discover descriptions from examples in the toy blocks
world. A description which is discriminant (i.e. can be used
to distinguish one object or set of objects from other sets)
is formed by matching the similar parts of the object under
consideration with another object (near-miss) and then
isolating the structures which are different between the two
objects. This differs from the approach taken in the
following chapters in that matching is only done here in
order to adequately describe some specific feature which
distinguishes between two objects (e.g. to specify the
second part from the top in an object, one may have to
include some predicates which define the second from the top
in terms of other descriptors. If the distinguishing feature
involves this part, then the definition of the part used in
the description must be common to both objects). Winston also
uses modifiers such as 'must', 'may', 'must not' in
descriptions. A form of these modifiers is inherent in the
VL2 approach (e.g. discriminant descriptions involve the
'must' modifier, descriptive descriptions involving only a set
of objects yield descriptions involving a type of 'may'
modifier).
The program ARITHMETIC [Bongard 70] finds an
algebraic rule which explains sample relationships. The
program is given sets of tables, each table containing
3-tuples of a ternary relation. A set of 33 predicates is
used in the program (although not explicitly given in the
reference), where a predicate may be, e.g., if the quotient of
the first two elements of the 3-tuple is positive, then the
predicate is true. For each row of a table, a set of
features is generated by finding Boolean combinations of the
predicates applied to the row. A feature describing the
table is generated for each Boolean combination by finding
the product of Boolean combinations for all rows. The set of
features which appear to be most useful in distinguishing one
table from the others is selected as the description of each
table.
1.3 Formal Systems for Inductive Inference
A criticism of some of the above endeavors is that
they are problem specific. More general systems which have
many possible application areas have emphasis in two types of
approaches: 1) generation of descriptions of sets of objects
represented in a logic system of some kind, and 2) creation
of new concepts in a sequential manner by generating and
modifying hypotheses. A summary of several types of learning
systems can be found in [Banerji 75]. Of particular interest
here is the work of Morgan [Morgan 72] in which a formal
system based on the first order predicate calculus with
falsehood preserving transformations is presented. Briefly,
the idea that inductive inference can be described as
backwards reasoning is apparently not sufficient for a
practical system. For example, if $E_2$ was derived from the
assertions $E_1$ and $\neg E_1 \lor E_2$, then backward reasoning would
somehow have to generate the two assertions above given only
$E_2$. There are far too many expressions $E_1$ which can be applied
to a situation such as this to yield a practical system.
Instead, Morgan defines a falsehood preserving transformation
of a deductive inference rule into an inductive inference
rule (in Morgan's notation):
\[ E_1 \vdash_F E_2 \]
where $E_2$ is false for every interpretation in which $E_1$ is
false. A set of transformations (F-rules) can be created to
convert a deductive logic system into an inductive logic
system (e.g. if $E_1$ and $E_2$ are atomic expressions, then
\[ E_1 \lor E_2 \vdash_F E_1 E_2 \]
i.e., from the disjunction $E_1 \lor E_2$, one may infer the
conjunction $E_1 E_2$ while preserving falsehood). With this
system, theorem proving techniques using falsehood preserving
rules may be applied to inductive problems. (In later
chapters, the symbol |< is used to denote such a
'generalizing' transformation.)
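For propositional expressions, falsehood preservation can be verified by
exhaustive enumeration of interpretations. The Python sketch below is an
illustration only; the function name falsehood_preserving and the encoding
of expressions as Python functions of truth values are assumptions and are
not taken from Morgan's system.

```python
from itertools import product


def falsehood_preserving(premise, conclusion, n_atoms):
    """True if the conclusion is false in every interpretation
    in which the premise is false."""
    for values in product([False, True], repeat=n_atoms):
        if not premise(*values) and conclusion(*values):
            return False
    return True


def disjunction(e1, e2):
    return e1 or e2


def conjunction(e1, e2):
    return e1 and e2


# From the disjunction E1 v E2 infer the conjunction E1 E2.
f_rule_ok = falsehood_preserving(disjunction, conjunction, 2)

# The reverse direction is truth preserving but not falsehood preserving.
reverse_ok = falsehood_preserving(conjunction, disjunction, 2)
```

The first check succeeds because the disjunction is false only when both
atoms are false, in which case the conjunction is false as well.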
A number of authors have presented systems which use
a graph structure representation of expressions in a type of
logic system. The 'parameterized structure representation'
of Hayes-Roth [Hayes-Roth 76] is used in inductive tasks
which learn descriptions of sets of objects and
transformations from one set of objects to another set of
objects from examples. The problems addressed are closely
akin to the work in the following chapters of this paper, one
of the objectives being to find the common properties of all
examples of one class which are available to the system using
a graphical representation. The number of possible
alternatives in Hayes-Roth's method is limited by a fixed
utility function which evaluates intermediate results and
discards hypotheses of low utility. The work differs from
that presented in the following chapters in that only
independent descriptions are sought using a fixed utility
criterion (these are called descriptive descriptions in
chapter 6). The structure used for representation does not
take into account specific domain structures which are
inherent in the VL system and there is no facility for
generating new descriptors within the system.
A formal approach using the predicate logic system
[Vere 75] produces the largest common set of descriptors of a
set of examples by representing examples using a graph
structure and finding the largest common subgraph of these
structures. Such methods suffer from the NP-complete nature
of graph isomorphism algorithms (this problem is addressed in
the following chapters by finding the smallest useful
subgraphs of graphs of examples instead of the largest
subgraph). Neither Vere nor Hayes-Roth uses negative examples
heavily in their implementations.
Hedrick [Hedrick 74] uses a semantic net to represent
examples and to build and modify hypotheses as new examples
are given to the program. The semantic net supports only
binary relations but otherwise looks similar to the graph
structure of Vere and Hayes-Roth.
Kochen [Kochen 74] presents a different type of
system with a set of initial events containing state
variables, actions and relations between the actions, and a
learning program which applies certain transformations to
events at various time steps. At each time step, weighted
hypotheses are formed which reduce the set of states stored
in memory (i.e. those states which are explained by the
hypotheses).
Production systems provide a rich tool for the
introduction of many inference techniques (recall that VL
decision rules are similar to production rules). Briefly, a
production system architecture contains a working memory, a
set of productions which modify working memory, and a
recognize-act cycle with conflict resolution to dictate the
order in which productions are applied to memory and to add
new productions when necessary.
Waterman [Waterman 70, 74, 75] uses these to solve
several problems. A program which plays poker has been
developed [Waterman 70] which designs betting strategies in
terms of production rules. More recently [Waterman 75], the
approach has been applied to recognizing letter sequences
with success. Rychener [Rychener 76] has applied production
systems to chess end games and natural language input of a
toy blocks world. These two authors use distinct production
system architectures which differ in the ordering of working
memory, ordering of productions and in the way in which new
productions are added to an existing set of productions to
correct errors made by the system.
As a final note with regard to production systems,
the MYCIN system [Shortliffe 74] has been shown to be a very
powerful tool in aiding physicians with regard to
antimicrobial therapy selection. In the MYCIN system,
deductive inference using a multi-valued truth model is
applied to expert supplied productions and a data base
consisting of a patient record.
1.4 Overview of the Following Chapters
The VL21 system (a subset of the VL2 system) is
described in chapter 2. The subset used here uses only the
truth values TRUE, FALSE, and UNKNOWN instead of the
multi-valued truth domain of the VL2 system. Also, only a
very small subset of the operators available in VL2 is
used. The inductive rules used to transform initial decision
rules (1.1) into new decision rules (1.2) are given in
chapter 3. The rules involve selecting the most significant
features of the initial condition ($C_{i,j}$ in 1.1), extending
the value set which each feature may assume under the domain
structure constraints, and adding new global functions which
describe certain characteristics of the condition ($C_{i,j}$).
Chapter 4 contains a graphical representation of VL2 rules.
A subset of the graph structure is used in the computer
program INDUCE_1 described in chapter 5. The program accepts
as input:
1) a set of decision rules representing certain examples
of sets of decisions,
2) a problem environment description in the form of VL
decision rules which describes certain
characteristics of functions and domains which arise
from the particular application, and
3) a set of control parameters which supply the
optimality criterion and certain parameters which
limit the number of alternatives generated at various
points in the program.
The output from the program contains a set of complete,
consistent decision rules. Chapter 6 gives results of the
program as applied to some specific situations. Chapter 7
describes some limitations and possible extensions of the
program. Two appendices are given: appendix A, providing a
listing of the detailed output of the program applied to one
example of chapter 6, and appendix B, giving a brief review
of the precursor to this work (the program AQVAL/1-AQ7) which
is used as a procedure in INDUCE_1.
2. Representing Decisions in the VL2 System
Much of the information in this chapter is found in
[Larson et.al. 77, Michalski 74b]. It is included here to
give the reader a familiarity with the VL2 system. The
complete VL2 system contains a very rich set of operators and
domains. A subset of VL2 called VL21 which contains a basic
set of operators and domains is used here. Only the VL21
system is described with notes indicating the extensions
which are possible in the full VL2 system. In later
chapters, the notation VL2 is used to refer to the system
VL21.
The logic system VL21 is a language for describing
situations (e.g. objects, classes of objects) and expressing
decision and inference rules. The language provides for a
compact expression of descriptions which is both easily
readable and sufficiently precise to facilitate formal
manipulation (possibly by a computer).
There are two major differences between VL21 and the
first order predicate calculus:
1. Instead of predicates, selectors are used which can
be viewed as tests for membership of values of
predicates and functions in a certain set.
2. Each variable, predicate and function symbol is
assigned a domain (or value set) together with a
characterization of the structure of the domain.
(This feature facilitates rule
generalization and allows for the application of
different generalization transformations according to
the structure of the domain.)
There are three types of domains currently
distinguished:
1. Unordered or nominal
Elements of the domain are considered to be
independent entities; no structure is assumed to
relate them. A variable or function symbol with this
domain is called nominal or cartesian (e.g. blood
type, names of objects, etc.).
2. Linearly ordered or interval
The domain is a linearly ordered set. A
variable or function symbol with this domain is
called interval (e.g. military rank, temperature,
size).
3. Elements of the domain are ordered into a
tree structure. A predecessor node in the tree
represents a concept which is more general than the
concepts represented by the descendant nodes (e.g.,
the predecessor of the nodes 'triangle', 'rectangle',
'pentagon' may be a 'polygon'). A variable or
function symbol with such a domain is called
2.1 VL System Structure
The VL21 system used is a 5-tuple (V, F, S, R, I) where:
V - is a set of variable symbols. Each variable symbol
is associated with a domain $D(x_i)$. A group of
variables which have the same domain are labelled
with the same variable symbol but a different
subscript (e.g. $x_1, x_2, \ldots, x_k, y_1, y_2, \ldots, y_l$ are
specifications of variables in two variable groups
which assume values from two domains denoted $D(x)$ and
$D(y)$ or alternatively $D(x_i)$ and $D(y_i)$).
F - is a set of n-ary function and predicate symbols.
Each n-ary function symbol represents a mapping from
an argument space into a domain. For a function
$f(x_1, x_2, \ldots, x_n)$, this is a mapping:
\[ D(x_1) \times D(x_2) \times \cdots \times D(x_n) \rightarrow D(f) \]
where $D(x_1), D(x_2), \ldots, D(x_n), D(f)$ represent the
domains of the variables $x_1, x_2, \ldots, x_n$ and the
domain of the function $f$, respectively. A predicate
is a function whose domain is the set [TRUE, FALSE].
Included in the domains of all function and variable
symbols is the value NA (not applicable).
S - is a set of symbols including:
( ) [ ] = < > , .
R - is a set of formation rules described in section 2.3
I - is a set of interpretation rules described in section 2.4
2.2 Selector Formation and Interpretation Rules
A well formed VL2 formula (wff) is composed of
quantifier forms, selectors, and logical connective symbols.
A selector is a form:
[L # R] or [L']
where
L, L' - each called the referee, are atomic forms. An atomic
form is a variable symbol or a function or predicate
symbol followed optionally by a list of atomic forms
enclosed in parentheses. In the above forms, the
atomic form L' must be a predicate symbol with
arguments following in parentheses. If L contains a
function symbol, then the related function is called
an atomic function.
R - the reference, is a set of values in the domain of the
atomic function of L. R may be in several forms:
Reference example    Description
a                    a constant in the domain of
                     the atomic function of L
a,b                  a list of values in the
                     domain of L separated by
                     commas
a..b                 a pair of values in the
                     domain of L separated by
                     (..)
*                    the symbol (*) representing
                     all values in the domain of
                     L (except NA)
NA                   the value NA (not
                     applicable)
# - is one of the symbol combinations
=  <=  >=  <  >  ¬=
If R is a set of values, then L is related to
R by # if
when # is = or ¬=         L has a value (does not have
                          a value) in the set R
when # is <= >= < >       L has a value related to
                          every value of R by #.
The selector is interpreted as a unit of
information about a situation with value or
truth-status TRUE if the relation L # R holds,
FALSE if the relation does not hold, or UNKNOWN, in
which case the selector is interpreted as a question
about the situation which must be answered in order
to determine if the selector is satisfied. If some
variables in the atomic form of the selector are
quantified, these quantifiers must be considered when
determining the truth-status of a selector.
If R is *, then L is related to R for any
value of L except NA (in this case, # is always =).
Below are some examples of a selector:
Selector                     Interpretation: truth-status TRUE
$[color(wall_1) = white]$    The color of the wall represented
                             by $wall_1$ is white.
$[length(box_1) >= 1]$       The length of the box represented
                             by $box_1$ is greater than or equal
                             to 1.
$[box_1 = 2..5]$             The variable $box_1$ may have a value
                             between 2 and 5 inclusive. The
                             values of $box_1$ may represent
                             various boxes in a situation. The
                             selector restricts the range of
                             values of the variable $box_1$ to the
                             values 2 through 5.
$[ontop(x_1, x_2)]$          The part represented by $x_1$ is on
                             top of the part represented by
                             $x_2$.
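The interpretation of a selector [L # R] described above can be sketched
in Python. This is a minimal illustration under stated assumptions, not
the INDUCE_1 implementation: the value of the referee L is passed in
directly, None stands for the truth-status UNKNOWN, the string "!=" stands
for the not-equal relation, the reference is any iterable of values (or
the string "*"), and NA is assumed to satisfy no ordering relation. The
function name selector is hypothetical.

```python
import operator

NA = "NA"    # the "not applicable" value
ALL = "*"    # reference standing for every value except NA

# Ordering relations; L must be related to every value of R.
_ORDER = {"<": operator.lt, "<=": operator.le,
          ">": operator.gt, ">=": operator.ge}


def selector(value, relation, reference):
    """Evaluate [L # R] for a known value of L.

    Returns True, False, or None (UNKNOWN, when the value of L is unknown).
    """
    if value is None:
        return None
    if isinstance(reference, str) and reference == ALL:
        return value != NA
    ref = set(reference)
    if relation == "=":
        return value in ref
    if relation == "!=":
        return value not in ref
    if value == NA:
        return False  # assumption: NA satisfies no ordering relation
    return all(_ORDER[relation](value, r) for r in ref)


white_wall = selector("white", "=", {"white"})   # [color(wall_1) = white]
long_box = selector(2, ">=", [1])                # [length(box_1) >= 1]
box_in_range = selector(3, "=", range(2, 6))     # [box_1 = 2..5]
```

The reference 2..5 is modelled here by range(2, 6), which assumes an
integer-valued interval domain.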
2.3 VL Formation Rule
Formulas in the VL2 logic system are used to describe
situations, and also to express decision rules and inference
rules. The VL formulas are defined by the following formation
rules:
1. A selector is a VL formula (wff).
2. If $V$, $V_1$ and $V_2$ are wff, then so are:
$(V)$                           a formula in parentheses
$\neg V$                        inverse
$V_1 \& V_2$ or $V_1 V_2$       conjunction (the symbol & is
                                used to represent conjunction)
$V_1 \lor V_2$                  disjunction
$V_1 \,!\, V_2$                 exclusive disjunction
$V_1 \sim V_2$                  exception
$V_1 \rightarrow V_2$           $V_1$ implies $V_2$
$V_1 \leftrightarrow V_2$       $V_1$ is equivalent to $V_2$
$E x_1, x_2, \ldots, x_k (V)$   Existentially quantified formula
                                (E is used to represent the
                                existential quantifier).
$E. x_1, x_2, \ldots, x_k (V)$  Distinctly existentially
                                quantified formula
$A x_1, x_2, \ldots, x_k (V)$   Universally quantified formula
                                (A is used to represent the
                                universal quantifier).
Not all of these forms are considered in the
following chapters. In chapter 3 (VL inference
rules) only conjunction, disjunction, and quantifiers
are considered. Chapter 4 (Graph Representation)
presents a graph structure representation which
includes all of these forms but the types of formulas
actually included in the algorithm and the
implementation involve only conjunction and distinct
existential quantification.
2.4 Interpretation Rules
A VL formula may have truth-status TRUE, FALSE, or
UNKNOWN. In the full VL2 system, a truth-status domain with
interval structure may be defined, but here only the values
above are considered. The connectives ($\neg$, $\lor$, &) are
interpreted in the normal manner:
VL formula         Interpretation
$\neg V$           FALSE if $V$ is TRUE, TRUE if $V$ is
                   FALSE, UNKNOWN if $V$ is UNKNOWN.
$V_1 \lor V_2$     TRUE if either $V_1$ or $V_2$ is TRUE,
                   UNKNOWN if both $V_1$ and $V_2$ are
                   UNKNOWN or one is UNKNOWN and
                   the other FALSE, FALSE
                   otherwise.
$V_1 \& V_2$       UNKNOWN if both $V_1$ and $V_2$ are
                   UNKNOWN or one is TRUE and the
                   other is UNKNOWN, TRUE if both
                   $V_1$ and $V_2$ are TRUE, FALSE
                   otherwise.
The remaining connectives may be rewritten in
equivalent forms:
VL formula                   Equivalent form
$V_1 \rightarrow V_2$        $\neg V_1 \lor V_2$
$V_1 \leftrightarrow V_2$    $(V_1 \rightarrow V_2) \& (V_2 \rightarrow V_1)$
$V_1 \sim V_2$               ...
$V_1 \,!\, V_2$              $\neg V_1 V_2 \lor V_1 \neg V_2$
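The three-valued interpretation of the connectives can be written down
directly. The Python sketch below is an illustration only: None is assumed
to stand for UNKNOWN, and the function names are hypothetical. Implication
and equivalence are defined through the equivalent forms listed above.

```python
from typing import Optional

Truth = Optional[bool]  # True, False, or None for UNKNOWN


def vl_not(v: Truth) -> Truth:
    """Inverse: UNKNOWN stays UNKNOWN."""
    return None if v is None else not v


def vl_or(a: Truth, b: Truth) -> Truth:
    """Disjunction: TRUE if either operand is TRUE."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def vl_and(a: Truth, b: Truth) -> Truth:
    """Conjunction: FALSE if either operand is FALSE."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def vl_implies(a: Truth, b: Truth) -> Truth:
    """V1 -> V2 rewritten as (not V1) or V2."""
    return vl_or(vl_not(a), b)


def vl_equiv(a: Truth, b: Truth) -> Truth:
    """V1 <-> V2 rewritten as (V1 -> V2) & (V2 -> V1)."""
    return vl_and(vl_implies(a, b), vl_implies(b, a))
```

For example, vl_or(None, False) yields None (UNKNOWN), while
vl_and(False, None) yields False, matching the table entries.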
A VL system is used to describe a set of situations.
In order to effectively apply a formula to a set of
situations, the VL2 system should contain variables,
functions, and predicates which adequately characterize the
situations. To determine the truth-status of a formula with
regard to a specific situation, an event is created (an event
may be viewed as an interpretation of a situation in the VL2
system). An event is a sequence of assignments to variables,
functions and predicates in the system which characterize a
specific situation. Quantified variables may be assigned a
set of values. One function assignment may be made to a
given set of values of arguments if the value of the function
is known. If a function does not have an assignment for a
given set of values, then the value NA (not applicable) is
assumed.
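One possible way to hold an event in a program is a pair of dictionaries,
one for variable assignments and one for function assignments. The Python
sketch below is an assumption made for illustration (the class name Event
and its layout are hypothetical and are not the data structure used by
INDUCE_1); it shows the NA default for unassigned function values.

```python
NA = "NA"  # the "not applicable" value


class Event:
    """A set of assignments to variables and to function applications."""

    def __init__(self, variables, functions):
        # Each (possibly quantified) variable is assigned a set of values.
        self.variables = {name: set(vals) for name, vals in variables.items()}
        # Function assignments are keyed by (function name, argument tuple).
        self.functions = dict(functions)

    def value(self, function, *args):
        """Return the assigned value, or NA if there is no assignment."""
        return self.functions.get((function, args), NA)


# Variables x1 and x2 range over {0, 1}; only p(0,1) has an assignment.
e = Event({"x1": {0, 1}, "x2": {0, 1}}, {("p", (0, 1)): True})

known = e.value("p", 0, 1)      # assigned value
missing = e.value("p", 1, 0)    # no assignment, so NA is assumed
```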
A selector [L # R] (or [L']) is satisfied by an event
if there is a set of assignments to variables and functions
(or predicates) in L (or L') such that L is related to R by #
(or L' has the value TRUE).
A VL2 formula is satisfied by an event if it has
truth status TRUE when applied to the event.
The quantified formulas are interpreted:
The truth status of
$E x_1, x_2, \ldots, x_n (V)$
    is TRUE (or FALSE) in a given
    situation if there exist (or
    do not exist) values for
    $x_1, x_2, \ldots, x_n$ in the event
    assignments which make the
    truth-status of the formula $V$
    equal to TRUE
    ? if it is not known whether
    there exist values ...
$E. x_1, x_2, \ldots, x_n (V)$
    is TRUE (or FALSE) in a given
    situation if there exist (or
    do not exist) distinct
    (different) values for
    $x_1, x_2, \ldots, x_n$ in the event
    assignments which make the
    truth-status of the formula
    equal to TRUE. This obviates
    the need for extra predicates in
    an expression like $x_1 \neg= x_2$,
    $x_2 \neg= x_3$, $x_1 \neg= x_3$, etc.
    ? if it is not known whether
    there exist values ...
$A x_1, x_2, \ldots, x_n (V)$
    is TRUE (or FALSE) in a given
    situation if for all assignments
    to the variables $x_1, x_2, \ldots, x_n$,
    the formula $V$ has (or does not have)
    truth-status equal to TRUE.
A formula is a description of a situation if
every event which can be derived from the situation satisfies
the VL2 formula and every event which satisfies the formula
is also an interpretation of the situation.
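The difference between the ordinary and the distinct existential
quantifier can be illustrated by enumerating value tuples over finite
domains. The Python sketch below is a simplification under stated
assumptions: only TRUE and FALSE are handled (the UNKNOWN case marked with
? above is omitted), formulas are Python predicates, and the names exists
and for_all are hypothetical.

```python
from itertools import product


def exists(domains, formula, distinct=False):
    """E x1..xn (V); with distinct=True this models E. x1..xn (V),
    which only considers tuples of pairwise different values."""
    for values in product(*domains):
        if distinct and len(set(values)) != len(values):
            continue
        if formula(*values):
            return True
    return False


def for_all(domains, formula):
    """A x1..xn (V): V must hold for every tuple of values."""
    return all(formula(*values) for values in product(*domains))


# Hypothetical predicate p that holds only for equal arguments.
p_true = {(0, 0), (1, 1)}


def p(a, b):
    return (a, b) in p_true


D = [0, 1]
plain = exists([D, D], p)                   # some pair satisfies p
strict = exists([D, D], p, distinct=True)   # no pair of different values does
```

With the distinct quantifier there is no need to add selectors such as
$x_1 \neg= x_2$ to the quantified formula by hand.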
2.5 VL21 Decision Rules
     If V1 and V2 are VL21 formulas, a general form of a VL21
decision rule is
          V1 => V2                                  (2.5.1)
The formula V1 is called the condition part and V2 is the
decision part. A restricted form of the VL21 decision rule
will be used in the following chapters.
(In the computer implementation, the formula V2 is
assumed to be a product of selectors which contain 0-ary
functions in the referee. The terminology is relaxed in this
case to allow the function symbol which appears in the
decision part to be called a decision variable.)
     A decision rule in the form 2.5.1 may be applied to a
set of situations as follows: If the condition part of the
decision rule (V1) is given truth-status TRUE, then the
decision part of the decision rule (V2) also assumes the
truth-status TRUE. For each event (assignment e:=L) which
satisfies V1, a new set of assignments is made to the event
using the decision part of the rule to form the set of all
events which satisfy the conjunction V1 & V2. For example,
given the decision rule:
          E.x1,x2[p(x1,x2)] => [D=1],                (2.5.2)
and the event
          e: x1:=0,1; x2:=0,1; p(0,1):=TRUE          (2.5.3)
with variables, functions and predicates x, D, p, q with
domains D(x)=[0,1], D(D)=[0,1], D(p),D(q)=[TRUE,FALSE], a new
assignment is made to e to give:
          e: x1:=0,1; x2:=0,1; p(0,1):=TRUE; D:=1.   (2.5.4)
A decision rule:
                                                     (2.5.5)
applied to the event 2.5.3 gives one of the two new events:
          e1: x1:=0,1; x2:=0,1; p(0,1):=TRUE; q(0,1):=TRUE;
          e2: x1:=0,1; x2:=0,1; p(0,1):=TRUE; q(1,0):=TRUE;
Note that q(1,1) and q(0,0) are not given status TRUE since
the quantifier E. insists that the two variables
have different values.
     Given a set of decision rules each in the form 2.5.1,
the set may be applied to a set of events. Initially, the
condition parts of all decision rules have value UNKNOWN. If
an event satisfies the condition part of a decision rule, new
assignments are made to the event according to the decision
part of the satisfied rule and the condition part of the rule
returns to truth status UNKNOWN.
     In the remaining chapters, events are only used as a
formal basis for defining certain concepts. Since the number
of events necessary to completely describe a situation is
quite large, only the VL21 formulas themselves are manipulated
by the algorithms.
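
     The application of a decision rule to an event, as in
the example 2.5.2-2.5.4, can be illustrated with a small
sketch. It is not part of the program described in this
thesis: the dictionary encoding of an event and the names
apply_rules and exists_distinct_p are assumptions made only
for this illustration.

```python
def apply_rules(event, rules):
    """Return a copy of `event` extended by every rule whose condition holds."""
    result = dict(event)
    for condition, decision in rules:
        if condition(result):          # condition part has truth-status TRUE
            result.update(decision)    # decision part then also becomes TRUE
    return result


def exists_distinct_p(ev, domain=(0, 1)):
    """Condition part of rule 2.5.2: E. x1,x2 [p(x1,x2)] with x1 != x2."""
    return any(ev.get(("p", a, b)) is True
               for a in domain for b in domain if a != b)


# Event 2.5.3 (hypothetical encoding): only p(0,1) is recorded as TRUE.
event = {("p", 0, 1): True}
rules = [(exists_distinct_p, {"D": 1})]
new_event = apply_rules(event, rules)   # corresponds to event 2.5.4
```

An event that does not satisfy the condition part is returned
unchanged, which mirrors the condition part remaining UNKNOWN.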
3. VL21 Transformation Rules
     From one set of decision rules (R1), a new set of
decision rules (R2) is obtained by applying certain
transformation rules (t-rules). For now, we will restrict our
attention to rules which transform the condition part of a
rule. These t-rules may be grouped into three types of rules
based on the events which satisfy the condition part.
     Given two rules:
          R1: V1 => D1
          R2: V2 => D2
V1 is more general than V2 if every event satisfying V2 also
satisfies V1. If the converse is also true, then V1 is
equivalent to V2. Below, R1 and R2 are the input and output
of a t-rule. The three types of t-rules can be expressed as
follows:
     1. An equivalence transformation (denoted R1 |= R2), if
        the condition part V1 of R1 is equivalent to the
        condition part V2 of R2.
     2. A generalizing transformation (denoted R1 |< R2), if
        the condition part V2 of R2 is more general than the
        condition part V1 of R1.
        Rules 1. and 2. are called inductive inference rules.
     3. A specializing transformation (deductive inference
        rule denoted R1 |> R2), if the condition part V1 is
        more general than the condition part V2.
     Here, we are most interested in the first two types
of t-rules (i.e., inductive inference rules).
3.1 Equivalence Transformation
     An equivalence transformation rewrites a VL21 formula
into a different form either using equivalent VL21 operators
or introducing new functions which represent some information
already in the rule in a different manner. Below are some
examples of equivalence transformations. The symbols L1 and
L2 represent atomic forms, V and V' represent VL21 formulas,
D represents a VL21 formula which has no variables in common
with V or V', and |= is used to represent an
equivalence transformation.
E1. Equivalent VL21 forms.
          V([L = 1] v [L = 2]) => D  |=  V[L = 1,2] => D
                                     |=  V[L ≠ 3,4] => D
     (assuming that the domain of L is the interval [0..5]
     and has nominal structure).
          V([L1 = 1][L2 = 1]) => D  |=  V([L1.L2 = 1]) => D
     (assuming that L1 and L2 have the same domain size
     and structure). The dot operator (.) between two
     atomic forms is called 'internal conjunction'; thus
     the expression on the right above is read: 'If L1
     and L2 both have the value 1 and V is satisfied,
     then make decision D.'
E2. Internal Conjunction of Arguments
          VV' => D  |=  V[f'(x1.x2) = i] => D
               V' = [f(x1) = i][f(x2) = i]
          This rule introduces a new predicate f' which
     has the domain [TRUE,FALSE] and two arguments. The
     (.) operator instead of (,) indicates that the order
     of arguments to f' is irrelevant. The function f'
     assumes the value TRUE if f has the value i for both
     arguments.
E3. Introducing New Predicates
          VV' => D
               V' = [f(x1) = i][f(x2) = j]
     where i is related to j by Rel. For example,
     i<j, i>=j would result in new predicates LT_f, ...
E4. Splitting the Condition Part.
          V v V' => D  |=  V => D, V' => D
          This rule is used to form a set of decision
     rules with product condition parts from decision
     rules in disjunctive normal form.
3.2 Generalizing Transformations
     This type of transformation usually produces not only
a more general decision rule from a set of decision rules but
also a 'simpler' one than the original. Some rules are
applied in context. That is, one rule is actually
transformed but the context consisting of rules with a
related decision is used to obtain a more optimal result.
(These transformations may also be interpreted as
transformations from a set of rules into a new, more general
rule. The approach of focusing on one rule in the 'context'
of others is taken here to more closely reflect the approach
taken in the implementation.) In the following rules, the
symbol |< is used to indicate an inductive inference rule.
G1. Dropping a Selector.
          V[L = R] => D  |<  V => D
          Although this rule is interesting in a formal
     sense, it should be applied with care since the
     number of generalizations possible with successive
     applications of this rule is very large. More is
     said about this problem in section 3.6.
G2. Extending the Reference
     Nominal domain structure
          V[L = a] => D  |<  V[L = a,b] => D
               in context [L = b] => D
     Interval domain structure
          V[L = a] => D  |<  V[L = a..b] => D
               in context [L = b] => D
     Tree structured domain
          V[L = a] => D  |<  V[L = c] => D
               in context [L = b] => D
          c is a predecessor of both a and b in the
          generalization structure of the domain of L.
G3. Extension Against
     Nominal domain
          V1[L = R1] => D  |<  V1[L ≠ R2] => D
               in context: V2[L = R2] => ~D
               assuming R1 ∩ R2 = null. (The symbol ∩
               denotes set intersection.)
     Interval domain structure
          V1[L = a..b] => D  |<  V1[L = e..f] => D
               in context: V2[L = c..d] => ~D
               assuming [a..b] ∩ [c..d] = null
               if b < c then e = 0, f = c-1
               if a > d then e = d+1, f = h
               (0 and h are the minimum and maximum
               elements in the domain of L)
     Tree structured domain
          V1[L = a] => D  |<  V1[L = c] => D
               in context: V2[L = b] => ~D
               assuming a ∩ b = null
               the constant (c) is the most distant
               ancestor of (a) which is not an ancestor
               of (b) (c may be equal to a).
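
     The three cases of the extension against rule can be
sketched as follows. This is an illustration only, not the
procedure used in the implementation. The function names, the
parent dictionary used to encode a tree structured domain, and
the parameters lo and hi (the minimum and maximum elements of
the domain, written 0 and h above) are assumptions of the
sketch. A node is treated as one of its own ancestors, in
line with the remark that c may be equal to a.

```python
def extend_against_nominal(domain, r1, r2):
    """Nominal case: [L = R1] becomes [L != R2], i.e. every value outside R2."""
    if set(r1) & set(r2):
        raise ValueError("R1 and R2 intersect; the rule does not apply")
    return set(domain) - set(r2)


def extend_against_interval(a, b, c, d, lo, hi):
    """Interval case: widen [a..b] as far as possible without touching [c..d]."""
    if b < c:
        return lo, c - 1
    if a > d:
        return d + 1, hi
    raise ValueError("intervals intersect; the rule does not apply")


def ancestors(parent, node):
    """Chain node, parent(node), ... up to the root of the tree."""
    chain = [node]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def extend_against_tree(parent, a, b):
    """Tree case: most distant ancestor of a that is not an ancestor of b."""
    blocked = set(ancestors(parent, b))
    best = None
    for node in ancestors(parent, a):
        if node in blocked:
            break
        best = node
    if best is None:
        raise ValueError("a lies on the path from b to the root")
    return best
```

With a tree in which T and R generalize to P, and C and E
generalize to CF, extending C against R yields CF, while
extending T against R leaves T unchanged.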
G4. Replace A with a constant
          (Axi V(xi)) => D  |<  V(xi)[xi = a] => D
               a - an element in the domain of xi. (The
               symbol A represents the universal
               quantifier.) For example:
          Ax1[f(x1) = 1] => D  |<  [f(x1) = 1][x1 = 2] => D
     The left side of the expression requires that f have
     the value 1 for all values of x1 before a decision be
     made. The generalized expression (right side) only
     requires examination of one value in the domain of
     x1, namely the value 2. All other values in the
     domain of x1 are irrelevant to the decision rule.
G5. Replace a constant with E
          V(xi)[xi = a] => D  |<  Exi V(xi) => D
               a - an element in the domain of xi. (The
               symbol E represents the existential
               quantifier.) For example:
          [f(x1) = 1][x1 = 3]  |<
     The left side of the expression makes a decision only
     if f has the value 1 for x1 with value 3. The
     generalized expression makes a decision if f has the
     value 1 for any value of x1.
G6. Move E to the Right
          |<
3.3 Specializing Transformations
     A specializing transformation may be used to apply a
decision rule to a new situation or to add certain
restrictions to decision rules. Below, the symbol |>
represents a deductive inference rule.
R1. Adding Restrictions
          VV' => D  |>  VV'V'' => D
               in context V' => V''
          This rule is used to add restrictions to
     descriptions and generalizations where the
     restrictions represent some structure which is
     imposed on a function or some relationship between
     functions which always holds. For example, the
     transitivity or symmetry of a function may be
     introduced using a restriction.
R2. Dropping Product Rule
          V1 v V2 => D  |>  V1 => D
          This rule may be used to obtain a set of
     decision rules with a product in the condition part
     from a rule in disjunctive normal form.
3.4 Transformation Rules Involving the Decision Part
     It is clear that reversing the roles of the input and
output of an inductive t-rule gives a deductive t-rule and
conversely. Therefore, each of the rules G1-G6 could be
inverted to get a deductive rule. The rules in sections
3.1-3.3 were based on the events satisfying the condition
part of a decision rule. Similar rules may be applied to the
decision part of a rule to obtain equivalent, more general or
more restricted rules.
     Given two rules:
          R1: V => D1
          R2: V => D2
R2 is more general than R1 and R1 is more restricted than R2
if every assignment of event values made by R1 is also made
by R2. If the converse is also true, then R1 is equivalent
to R2. To transform the decision part of a decision rule,
apply G1-G6 to the decision part of a rule to obtain a
deductive rule and R1-R2 to obtain an inductive rule,
interchanging the roles of the condition and decision parts
in the transformation rules. A few examples are given
below.
DP1. Dropping Selector Rule Applied to the Decision Part
          V => [L1 = R1][L2 = R2]  |>  V => [L1 = R1]
          Though this is a generalization rule when
     applied to the condition part, it is a deductive rule
     when applied to the decision part of a decision
     rule.
DP2. Splitting the Decision Part
          V => [L1 = R1]D  |=  V => [L1 = R1], V => D
          This equivalence preserving rule is used to
     produce a set of decision rules which involve only
     one decision variable.
DP3. Replace a Constant with A in the Decision Part
          V => [L(x1) = R1][x1 = a]  |>
3.5 Example of Application of Transformation Rules
     In this section, selected transformations will be
examined in more detail with respect to a specific example.
Consider the situation in Figure 3.1. There are four objects
classified according to two decision variables DA and DB.
Each object is described in terms of the following VL21
functions and predicates:
     p1,p2: variables each representing a part in an object
          with domain: P = [0,1] with nominal structure
     ontop: a predicate mapping P x P into [TRUE,FALSE].
          The predicate is TRUE if the part represented
          by the first argument is on top of the part
          represented by the second argument.
     shape: a function mapping P into [Triangle(T),
          Circle(C), Rectangle(R), Ellipse(E), Polygon(P),
          Curved Figure(CF)] with generalizations:
          [shape=T,R]=>[shape=P] and [shape=C,E]=>[shape=CF].
          The function specifies the shape of the part
          represented by the argument. The domain is a tree
          structured domain where Polygon is an ancestor (a
          generalization) of Triangle and Rectangle and Curved
          Figure is an ancestor of Circle and Ellipse.
     DA,DB: decision variables with domain [1,2]

                    DA = 1              DA = 2
     DB = 1     1   Circle          2   Circle
                    Triangle            Circle
     DB = 2     3   Ellipse         4   Rectangle
                    Rectangle           Rectangle

               Figure 3.1 Set of Objects
     Each object in fig 3.1 may be described in the VL21
system. The descriptions of objects 1 and 4 are:
     [ontop(p1,p2)][shape(p1)=C][shape(p2)=T]
          => [DA=1][DB=1]                            (3.5.1)
     [ontop(p1,p2)][shape(p1)=R][shape(p2)=R]
          => [DA=2][DB=2]                            (3.5.2)
The variables p1 and p2 are assumed to be quantified with the
operator E. (distinct existential quantifier). Using the
rule DP1, the decision rule 3.5.1 can be transformed into two
new rules:
     [ontop(p1,p2)][shape(p1)=C][shape(p2)=T] => [DA=1]  (3.5.3)
     [ontop(p1,p2)][shape(p1)=C][shape(p2)=T] => [DB=1]  (3.5.4)
     Concentrating now on 3.5.4, the dropping selector
rule may be applied twice in succession to obtain
     [shape(p1)=C] => [DB=1].                        (3.5.5)
This rule describes all objects in figure 3.1 with the
decision DB=1. In addition, it does not describe any object
with DB=2; so it is a complete, consistent generalization of
the objects with DB=1. (Note that completeness and
consistency are in no way guaranteed by the application of
the dropping selector rule. These conditions can however be
checked after each application of the rule.) Applying the
extension against rule to 3.5.3 in the context
     [ontop(p1,p2)][shape(p1)=R][shape(p2)=R] => [DA=2]  (3.5.6)
focusing on the selectors [shape(p1)=C] in 3.5.3 and
[shape(p1)=R] in 3.5.6 one may obtain
     [shape(p1)=CF][ontop(p1,p2)][shape(p2)=T] => [DA=1]. (3.5.7)
Applying extension against now to 3.5.7 in the context
     [ontop(p1,p2)][shape(p1)=C][shape(p2)=C] => [DA=2]  (3.5.8)
focusing on the selectors [shape(p2)=T] in 3.5.7 and
[shape(p2)=C] in 3.5.8, one obtains
     [shape(p1)=CF][ontop(p1,p2)][shape(p2)=P] => [DA=1]. (3.5.9)
This rule is a consistent and complete generalization of all
rules in figure 3.1 with DA=1 in the decision part.
     Looking now at the description of object 4 (3.5.2),
two rules are obtained by applying t-rule R1:
     [ontop(p1,p2)][shape(p1)=R][shape(p2)=R] => [DA=2]  (3.5.10)
     [ontop(p1,p2)][shape(p1)=R][shape(p2)=R] => [DB=2]  (3.5.11)
An application of the dropping selector rule to 3.5.11 twice
in succession gives a complete, consistent rule:
     [shape(p1)=R] => [DB=2].                        (3.5.12)
An application of t-rule E3 to 3.5.10 with a relation equal
produces:
     [ontop(p1,p2)][shape(p1)=R][shape(p2)=R][EQ-shape(p1.p2)]
          => [DA=2]                                  (3.5.13)
     The predicate EQ-shape specifies that the value of
the function shape is the same for arguments p1 and p2. The
(.) separating the arguments p1 and p2 of EQ-shape signifies
that the order of the arguments is irrelevant. (In the
implementation, a selector with this type of predicate is
written [shape(p1.p2)=same].) An application of the dropping
selector rule three times in succession to 3.5.13 gives:
     [EQ-shape(p1.p2)] => [DA=2]                     (3.5.14)
which is a complete, consistent generalization of 3.5.10.
     In summary, the new rules which were obtained are:
     [ontop(p1,p2)][shape(p1)=CF][shape(p2)=P] => [DA=1]
     [EQ-shape(p1.p2)] => [DA=2]                     (3.5.15)
     [shape(p1)=C] => [DB=1]
     [shape(p1)=R] => [DB=2]
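
     Completeness and consistency of a generalized rule can
be checked mechanically against the objects of Figure 3.1.
The following sketch is an illustration only. The table
OBJECTS is one reading of Figure 3.1 (top part, bottom part,
DA, DB), and the names is_a, covered and with_decision are
hypothetical. A rule is complete and consistent for a
decision exactly when the set of objects it covers equals the
set of objects carrying that decision.

```python
PARENT = {"T": "P", "R": "P", "C": "CF", "E": "CF"}   # shape generalizations

# (top shape, bottom shape, DA, DB) for objects 1-4, as read from Figure 3.1.
OBJECTS = {
    1: ("C", "T", 1, 1),
    2: ("C", "C", 2, 1),
    3: ("E", "R", 1, 2),
    4: ("R", "R", 2, 2),
}


def is_a(shape, target):
    """True if `target` equals `shape` or is an ancestor of it in the tree."""
    while shape != target and shape in PARENT:
        shape = PARENT[shape]
    return shape == target


def covered(condition):
    """Numbers of the objects whose parts satisfy the condition part."""
    return {k for k, (top, bottom, _, _) in OBJECTS.items()
            if condition(top, bottom)}


def with_decision(index, value):
    """Numbers of the objects whose decision (index 2 = DA, 3 = DB) is value."""
    return {k for k, obj in OBJECTS.items() if obj[index] == value}


def rule_3_5_9(top, bottom):      # [shape(p1)=CF][ontop(p1,p2)][shape(p2)=P]
    return is_a(top, "CF") and is_a(bottom, "P")


def rule_3_5_14(top, bottom):     # [EQ-shape(p1.p2)]
    return top == bottom


def rule_3_5_5(top, bottom):      # [shape(p1)=C]
    return top == "C"
```

For example, covered(rule_3_5_9) can be compared with
with_decision(2, 1) to test rule 3.5.9 against the decision
DA=1.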
     In the above discussion, the resulting simple
generalizations depended on knowing the proper rules to apply
and the proper portions of the decision rules to modify. For
larger problems, this approach is infeasible because of the
large number of possible generalizations which could be
made. Therefore, a more efficient approach is required. Such
an approach is given in the next section and described in
detail in the chapter on basic procedures (chapter 5).
3.6 Efficient Application of Generalization Rules
     There are two significant problems with using the
procedure described in section 3.5 to apply t-rules to a set
of decision rules. The first is the large number of new
rules which can be generated from one rule. This problem
could be circumvented by trimming the intermediate lists of
new rules, selecting only the most promising set of rules
according to a user specified criterion before further
application of t-rules (see section 5.3 for a description of
a trimming procedure). A second problem which is not
surmounted as easily is the complexity involved in
determining whether a VL21 formula is consistent and in
evaluating the optimality cost functions during trimming. To
determine whether one VL21 formula is a generalization of
another or is consistent with respect to another formula, one
must determine whether all or any of the events which satisfy
one VL21 formula also satisfy another VL21 formula.
     Using the current implementation, this involves
determining whether one graph structure representation is a
subgraph of another (see section 5.4 for a description of a
subgraph isomorphism algorithm). This problem is exponential
in nature (i.e., the time to determine whether one graph is a
subgraph of another using a depth-first-search is
proportional to m raised to the power of n where m is the
number of nodes in the larger graph and n is the number of
nodes in the smaller graph). Actually, it is not quite so
time consuming since the graphs have relatively few edges and
the edges and nodes are labelled. It is, however, important
to form simple generalizations which correspond to small
graph structures.
     Since the cost criterion normally includes a cost
function which minimizes the number of selectors in a
product, the consistency and optimality of optimal rules
should be easily calculated. In the example of section 3.5,
the original rules were made smaller by dropping selectors.
An alternate approach is to grow the generalized rule
beginning with single selectors and adding new selectors
until a consistent rule or set of alternative rules is
created. A very general algorithm follows; chapter 5 gives a
complete description of the algorithm in the current
implementation.
     1. Given a set of decision rules in disjunctive normal
        form, create a new set of rules with a product in the
        condition part and one selector in the decision part
        of each rule.
     2. Select a rule which involves one value of a decision
        variable. Add to the rule new selectors which
        represent functional relations as specified by a
        user (i.e., multiply the condition part of the rule
        by selectors containing new functional relations).
     3. Find a consistent generalization of the rule from 2
        by locating the most promising selectors of the rule
        from 2 and adding new selectors to each of these
        selectors until a set of consistent generalizations
        of the rule from 2 is obtained.
     4. Apply the extension against rule using an AQVAL/1
        procedure to all consistent rules.
     5. Select the best generalization from this set and
        remove rules from the set produced in step 1 for
        which this is a generalization. Repeat steps 2-5
        until no more rules remain which involve the decision
        variable and value of step 2.
     6. Continue by selecting another value of the current
        decision variable or selecting another decision
        variable until all decisions have been considered.
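
     The growing strategy of steps 2, 3 and 5 can be sketched
for a much simplified setting. The sketch below is not the
algorithm of chapter 5: it assumes flat attribute-value
examples instead of VL21 formulas, tries all selector
combinations of a given size instead of only the most
promising ones, and omits the new functional relations of
step 2 and the extension against step 4. The names covers,
consistent_generalizations and cover_positives are
hypothetical.

```python
from itertools import combinations


def covers(rule, example):
    """A rule is a tuple of (attribute, value) selectors, read as a product."""
    return all(example.get(attr) == value for attr, value in rule)


def consistent_generalizations(seed, negatives):
    """Smallest products of the seed's selectors that cover no negative example."""
    selectors = sorted(seed.items())
    for size in range(1, len(selectors) + 1):
        found = [combo for combo in combinations(selectors, size)
                 if not any(covers(combo, neg) for neg in negatives)]
        if found:
            return found
    return []


def cover_positives(positives, negatives):
    """Repeat: grow a consistent rule from a seed, keep the best, drop what it covers."""
    remaining = list(positives)
    result = []
    while remaining:
        candidates = consistent_generalizations(remaining[0], negatives)
        if not candidates:
            raise ValueError("seed cannot be separated from the negatives")
        best = max(candidates,
                   key=lambda rule: sum(covers(rule, ex) for ex in remaining))
        result.append(best)
        remaining = [ex for ex in remaining if not covers(best, ex)]
    return result
```

Because every candidate is built from selectors of the seed,
the seed itself is always covered and the loop terminates.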
4. Computer Representation of VL21 Decision Rules
     Some of the information in this section appears in
[Larson et al. 77] and is given here as a background for the
description of the computer implementation. A VL21 decision
rule can be represented as a graph with labelled nodes and
directed labelled edges. The labels on the nodes can be: a)
a selector containing k-ary descriptors without argument
lists, b) a k-ary descriptor without arguments, c) a
quantified variable with an optional subrange of values, d) a
logical operator. (From here on, a node is referred to by
its label, e.g., a selector node means a node with a selector
label.) The edges are labelled with integers from 0,1,....
Edges not labelled 0 refer to the position of an argument in
the label at the head of the edge. (Edges have non-zero
labels only if the position in the argument list of the head
node is important. Labels of 0 may be dropped for
convenience.)
     Several different types of relations may be
represented by edges. The type of relation is determined by
the label on the node at each end of the edge. The types of
relations are:
     1. Functional dependence - The label of the head node of
        the edge has a k-ary descriptor. The value
        represented by the edge is the value of the atomic
        form in the tail if the tail is a selector node, a
        descriptor value if the tail is a descriptor node, or
        one or all of a set of descriptor values if the tail
        is a quantified variable. The edge label specifies
        which argument of the head node assumes this value
        (Figure 4.1).

        Functional Dependence:
             Ex1,x2([g(x1)=1..2][f(g(x1),x2) = 1])
                         Figure 4.1

     2. Logical dependence - The head node is a logical
        operator (e.g. v, &, =>) and the tail node is a
        selector node, or a logical operator node. If the
        tail node is a selector, then the value represented
        by the edge is the truth value of the selector at the
        tail (Figure 4.2).

        Logical Dependence: Ex1([f(x1) = 1] v [g(x1) = 2])
                         Figure 4.2

     3. Implicit variable dependence - The labels of the head
        and tail nodes are quantified variables. This type
        of dependence represents the implicit function (which
        can be represented by a Skolem function) (Figure
        4.3).

        Implicit Variable Dependence: Ax1 Ex2 [p(x1,x2) = 1]
                         Figure 4.3
     4. Scope of variables - The head node is a logical
        operator and the tail is a quantified variable. This
        type of dependence may be necessary for certain
        binary logical operators such as (=>, ...

        Scope of Variables: Ex1,x2([p(x1) = 1] => [q(x2) = 1])
                         Figure 4.4

     The graph of a more complex decision rule is given in
Figure 4.5. The value of x3 is dependent in an unspecified
way on the value of x2 (the edge labelled 1). The disjunction
(v) depends on the values of x2 and x3, but this is clearly
specified by the functional dependence of f and g on x2 and
x3. Finally, observe that the decision operator (=>) does not
explicitly depend on the specific values of x1, x2, or x3,
but instead depends on the truth value of the entire premise
using some set of value assignments for x1, x2, and x3.

        Graph Structure Example:
             ... x1 Ax2 Ex3 (([f(x1,x2) = 1] v [ ... => [d = 1]
                         Figure 4.5
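
     A labelled graph of this kind can be held in a very
small data structure. The sketch below is an assumption made
for illustration and is not the storage layout of the
program: nodes carry a kind and a label, and edges are
triples (tail, head, label) in which label 0 means that the
argument position is unimportant. The class and method names
are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str       # "selector", "descriptor", "variable" or "operator"
    label: str


@dataclass
class RuleGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)   # (tail index, head index, label)

    def add_node(self, kind, label):
        self.nodes.append(Node(kind, label))
        return len(self.nodes) - 1

    def add_edge(self, tail, head, label=0):
        self.edges.append((tail, head, label))

    def arguments(self, head):
        """Labels of the tail nodes feeding `head`, ordered by argument position."""
        incoming = [(label, tail) for tail, h, label in self.edges
                    if h == head and label != 0]
        return [self.nodes[tail].label for _, tail in sorted(incoming)]


# The functional dependence example [g(x1)=1..2][f(g(x1),x2) = 1].
graph = RuleGraph()
x1 = graph.add_node("variable", "x1")
x2 = graph.add_node("variable", "x2")
g = graph.add_node("selector", "[g = 1..2]")
f = graph.add_node("selector", "[f = 1]")
graph.add_edge(x1, g, 1)    # x1 is the first argument of g
graph.add_edge(x2, f, 2)    # x2 is the second argument of f
graph.add_edge(g, f, 1)     # the value of g(x1) is the first argument of f
```

Calling graph.arguments(f) returns the argument labels of f
in positional order, independent of the order in which the
edges were added.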
5. Algorithms and Computer Implementation
     A computer program INDUCE_1 has been written to find
a generalization of a set of decision rules. The algorithms
are described in the remaining sections of this chapter;
examples of generalizations produced by the program are given
in chapter 6 and a sample session with the program is given
in appendix A. The program does not perform all of the
transformations given in chapter 3 and does not accept the
full VL21 language given in chapter 2. The restrictions on
the form of the input and a description of the output are
given in the next two sections.
5.1 Input to the Program
     The program accepts as input: 1) a set of decision
rules, 2) a problem environment description including a set
of restrictions, domain definitions, variable costs etc., and
3) a set of parameters which control certain aspects of the
program operation. Decision rules, restrictions, and domain
structures are entered as VL21 type formulas in the following
format:
Decision Rules
          Decision rules must satisfy the following
     grammar:
     ...
     quantified by the operator E. (distinct existential
     quantifier - note 3).
          In this example, the function p is assumed to
     have two values (i.e., it is a predicate - note 6).
     The selector containing p is satisfied if it has the
     value 1. The second selector restricts the possible
     values of x2 to the set of values [2..4] (note 3).
Several observations can be made about this grammar:
     1. The condition part is a single product and the
        decision part involves only one variable. It is
        assumed that one decision variable has been selected
        to be studied by the user. Also, the equivalence
        rule has been applied which splits a condition part
        of a formula in disjunctive normal form into a set of
        decision rules with condition parts which are single
        products (t-rules E4 and DP2).
     2. Each atomic form is a function symbol with a list of
        single variable arguments. It is assumed that the
        user has converted forms such as
             [g(f(xi))]
        into a form
             ...
        by introducing a new predicate for each
        function symbol f which is in an argument list and a
        variable yj for each occurrence of the function f(xi)
        in an argument list. The predicate is assumed to
        have the value TRUE if yj = f(xi) and FALSE
        otherwise. For example:
             [shape(part(x1)) = 1]
        is assumed to be transformed into e.g. an expression
             ...
     3. All variables (arguments) are assumed to be
        existentially quantified. Variables with the same
        function symbol part are assumed to have the same
        domain. Furthermore, variables with values from the
        same domain are assumed to take on distinct values
        from the domain. Variables may be restricted to a
        subrange of values by using the third option in the
        selector definition. Using this method, a constant
        may be specified as an argument to a function.
     4. If a reference of the second form is specified at
        least once in the input (production 16) (e.g.
        [f(x1)=2..2] or [f(x1)=2..5]), a domain of type
        interval is assumed; otherwise, the domain of a
        variable or function is assumed to be of nominal type
        or tree-structured if such a structure is specified.
     5. The last form of the reference uses a value (*). This
        specifies the entire domain of the associated
        function symbol. (E.g. [p=*] means that the
        selector is satisfied for any value of p other than
        the value NA.) If a selector is omitted from the
        decision rule entirely, the program assumes that the
        function has the value NA (not applicable). Such a
        domain value has no generalization (other than NA
        itself). Although NA is not specified in an input
        rule, it can be a valid value of a variable.
     6. If the second form of the selector is used
        (production 13), the program
        assumes that the function symbol has the value 1.
        This may be used to specify a TRUE value for a
        predicate (predicates are treated as functions with
        domain [0,1]). In general, to simplify the
        expressions, only positive values of predicates are
        specified. The program then uses only positive
        instances of the predicate in the generalizations.
        If negative values of a predicate are desired in the
        generalizations, these relations should be included
        in the initial decision rule specification (see the
        arch example in chapter 6). (E.g. [ontop(p1,p2)]
        specifies that p1 is on top of p2; [ontop(p1,p2)=0]
        specifies that p1 is not on top of p2.)
     7. If the fourth form of the selector is used (e.g.
        [p(x1.x2)]), then the order of the arguments is
        assumed to be irrelevant.
Restrictions:
          Restrictions must satisfy the following
     production:
          <REST> ::= <CONDITION> => <SELECTOR>
     where the arguments of the selector part must all
     appear in the condition part. CONDITION and SELECTOR
     are the same as in the decision rule grammar. For
     example:
          [left(s1,s2)][left(s2,s3)] => [left(s1,s3)].
          Restrictions extend or modify decision rule
     specifications. For every occurrence of the CONDITION
     of the restriction in the condition part of the
     decision rule, the selector is added to the condition
     part of the decision rule if it is not already
     there. If the SELECTOR is already in the condition
     part of a decision rule, then the values of the
     SELECTOR replace the values in the occurrence of
     SELECTOR in the decision rule. Using this feature,
     the transitive closure of a transitive function may
     be calculated. A restriction which adds equivalence
     type predicates is included by default. Each
     restriction is applied to each decision rule as it is
     entered into the program.
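
          The effect of a transitivity restriction such
     as the left example can be sketched as repeated
     application until no new selector is added. The
     sketch assumes that the left selectors of one
     decision rule are given as a set of argument pairs;
     this encoding and the name apply_transitivity are
     hypothetical and are not taken from the program.

```python
def apply_transitivity(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until nothing changes."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# left(s1,s2), left(s2,s3), left(s3,s4) as found in one decision rule.
left = {("s1", "s2"), ("s2", "s3"), ("s3", "s4")}
closed = apply_transitivity(left)
```

          Each pass corresponds to one application of the
     restriction to the rule; the loop stops when the
     condition part no longer changes.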
Domain Generalization Structure Specification
          This specification must satisfy the
     production:
          ::= [<FN-SYM> = <REF>] => [<FN-SYM> = <REF>]
     where the two function symbols are the same and
     FN-SYM and REF are as in the decision rule grammar.
     For example:
(s = 1, 3,"] => (s :: 6].
          [s = 0,2] => [s = 7].
          [s = 4] => [s = 8].
          [s = 6,8] => [s = 9].
Bnterin
program output is given in appendix A. Although input
decision rules may be any formula which satisfies the grammar
given in the previous section, the program only searches for
generalizations which can be represented by a connected graph
structure with functional dependence edges. The user may
construct decision rules to satisfy this constraint by
including new predicates which link products which have no
arguments in common. (In general, it can be tested to see if
a product of selectors in a decision rule has a connected
graph structure by determining whether there is a
partitioning of selectors into two products which have no
argument in common. If there is such a partitioning, then
the graph structure is not connected.)
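
     The partitioning test can be sketched as follows. The
sketch assumes that each selector is given only by the tuple
of its argument names and that every selector has at least
one argument; the name is_connected is hypothetical. Starting
from the first selector, selectors sharing an argument with
those already reached are absorbed; if some selectors are
never reached, the product splits into two parts with no
argument in common.

```python
def is_connected(product):
    """`product` is a list of selectors, each a tuple of argument names."""
    if not product:
        return True
    reached = set(product[0])
    pending = list(product[1:])
    progress = True
    while pending and progress:
        progress = False
        for args in list(pending):
            if reached & set(args):
                reached |= set(args)
                pending.remove(args)
                progress = True
    return not pending
```

For the product [ontop(p1,p2)][shape(p1)=C][shape(p2)=T] the
argument tuples are ("p1","p2"), ("p1",) and ("p2",), and the
test succeeds; without the ontop selector it fails.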
     Different generalizations may be obtained by varying
the program parameters and reordering the decision rules. In
general, increasing any of the MAXSTAR, ALTER, or NCONSIST
parameters (described below) will cause the program to
require more memory and time but the resulting
generalizations may be more optimal. Rearranging the
optimality criteria may also produce a different result. A
higher tolerance associated with a particular cost function
will reduce the selection based on that function. The
default optimality criteria (or optimality cost functions) in
the order of application are the following:
     1. Minimize the inconsistencies of a rule with tolerance
        0.30, i.e., the number of events covered by the rule
        which are not supposed to be covered by the rule
        (this is cost function number 3 in section 5.5). This
        allows the program to produce consistent
        generalizations quickly. The high tolerance removes
        highly inconsistent rules while leaving selection of
        nearly consistent rules to the remaining cost
        functions.
     2. Minimize the number of products in the complete
        generalization with tolerance 0.00 (this is cost
        function number 1 in section 5.5).
     3. Minimize the number of selectors in each product
        produced by the program (function 2 in section 5.5).
     4. Minimize the cost of functions in each product
        (function 4 in section 5.5). If costs are specified,
        this criterion may be moved forward. If the user
        wishes certain functions to appear in the resulting
        products, the costs for these functions may be
        specified (given negative cost). Similarly, functions
        which are very difficult or costly to measure may be
        given appropriate positive costs.
     5. Maximize the intersection of resulting rules (cost
        function 5 in section 5.5). In real situations, the
        separate products produced by the program may
        represent a large number of common input decision
        rules along with some peculiarities of specific
        situations which arise. Use of this cost function
        will favor the selection of a more representative
        result as opposed to one which describes only a
        particular set of situations. Appendix B contains a
        more detailed discussion of a similar cost function
        in the VL1 system.
     The program contains about 40 procedures.
Five major tasks which are performed by some groups
of procedures are described below.
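
     The ordered application of cost functions with
tolerances can be sketched as a lexicographic filter. The
precise meaning of a tolerance is not stated in this section;
the sketch assumes that a candidate survives a criterion when
its cost is no more than the best cost plus the tolerance
times the spread between best and worst cost. This
interpretation, the function name lexicographic_select and
the dictionary fields used in the usage lines are assumptions
of the illustration.

```python
def lexicographic_select(candidates, criteria):
    """criteria: list of (cost_function, tolerance) pairs, applied in order.

    `candidates` must be non-empty. A tolerance of 0.0 keeps only the
    candidates with the minimum cost for that criterion.
    """
    kept = list(candidates)
    for cost, tolerance in criteria:
        costs = [cost(c) for c in kept]
        best, worst = min(costs), max(costs)
        limit = best + tolerance * (worst - best)
        kept = [c for c, value in zip(kept, costs) if value <= limit]
        if len(kept) == 1:
            break
    return kept


# Hypothetical candidate rules described by two of the cost measures.
rule_a = {"name": "a", "inconsistent": 0, "selectors": 3}
rule_b = {"name": "b", "inconsistent": 1, "selectors": 1}
rule_c = {"name": "c", "inconsistent": 10, "selectors": 1}
criteria = [(lambda r: r["inconsistent"], 0.30),
            (lambda r: r["selectors"], 0.00)]
selected = lexicographic_select([rule_a, rule_b, rule_c], criteria)
```

Under this reading a higher tolerance lets more candidates
pass a criterion, so that the later criteria decide among
them.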
5.3 Formation of a Complete Generalization
     A generalization is found of a set of decision rules
containing a specified value I in the decision
part. Two sets of products are generated: a set P1 which
contains all products in the CONDITION parts of rules with a
decision value of I, and a set P0 which contains all other
products. Each product is called a c-formula
(conjunctive-formula).
     One c-formula E1 of P1 is selected at random and a
connected-conjunctive-formula (c2-formula) is generated which
is a generalization of E1, consistent with respect to the set
P0, and near optimal with respect to a user defined
criterion. A c-formula is connected if its graph structure
representation is weakly connected by functional dependence
edges. A c-formula is consistent with respect to a set of
c-formulas P0 if it does not intersect with any element of
the set P0 (i.e., there is no event which satisfies both
c-formulas).
     Once a generalization of E1 is found, it is saved in
a set CQ and all elements of P1 which are covered by this
generalization are removed from P1. One c-formula E1 covers
another c-formula E0 if E1 is a generalization of E0.
Another element of the new set P1 is selected and the
procedure repeated. When there are no more elements in P1,
the complete, consistent generalization of the set of
c-formulas P1 is the disjunction of all c2-formulas in CQ.
5.4 Determine Cover and Intersection of 2 Formulas
     Two similar procedures are described here. The test
to determine whether a c2-formula E covers a c-formula E' is
used when E' is an element of the set P1. The test to
determine whether E intersects with E' is used when E' is an
element of P0 (i.e., to determine if E and E' are
consistent). The procedure uses the graph structure
representations of E and E' (G and G' with nodes and edges
V,E,V',E' respectively). The graph G is assumed to be weakly
connected. E covers E' if there is a specializing isomorphism
(s-isomorphism) from G to a subgraph of G'. The reverse
mapping (from a subgraph of G' to G) is called a generalizing
isomorphism (g-isomorphism). E intersects with E' if there is
an intersecting isomorphism (i-isomorphism) between G and a
subgraph of G'. Each isomorphism from G to a subgraph of G'
is a 1-to-1 correspondence between nodes and edges of G and a
subset of nodes and edges of G' where the correspondence (or
matching) of nodes and edges is defined as follows:
A node n of G matches a node n' of G' if:
1. They are both selector nodes or both quantified
variable nodes.
and 2. If they are selector nodes, then the function symbols
in both nodes are the same. If they are variable
nodes, they are of the same group of variables.
and 3. With an s-isomorphism or g-isomorphism, the set of
values associated with n is a generalization of the
set of values associated with n'. (The sets of values
may be equal.) With an i-isomorphism, the sets of
values intersect. In the case of selector nodes,
these values are the elements in the reference of the
selector. In the case of quantified variable nodes,
these values are the subranges of the variables.
An edge of G matches an edge of G' if:
1. They have the same label
and 2. The respective head nodes match and the tail nodes
match.
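The matching conditions above can be expressed as two small predicates. This is a minimal sketch under stated assumptions: the `Node` record and the `mode` argument ("s", "g" or "i") are hypothetical names, and "generalization" of a value set is taken to mean set inclusion.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    kind: str            # "selector" or "variable"
    symbol: str          # function symbol, or name of the variable group
    values: frozenset    # reference of a selector, or subrange of a variable

def nodes_match(n, n_prime, mode):
    """Conditions 1-3 for a node n of G and a node n' of G'."""
    if n.kind != n_prime.kind:               # condition 1
        return False
    if n.symbol != n_prime.symbol:           # condition 2
        return False
    if mode in ("s", "g"):                   # condition 3, cover test
        return n.values >= n_prime.values    # assumed: superset or equal
    if mode == "i":                          # condition 3, intersection test
        return bool(n.values & n_prime.values)
    raise ValueError(mode)

def edges_match(e, e_prime, mode):
    """Edges are (tail node, label, head node) triples."""
    (tail, label, head), (tail_p, label_p, head_p) = e, e_prime
    return (label == label_p
            and nodes_match(tail, tail_p, mode)
            and nodes_match(head, head_p, mode))

wide = Node("selector", "color", frozenset({"red", "blue"}))
narrow = Node("selector", "color", frozenset({"red"}))
other = Node("selector", "color", frozenset({"green"}))
```

With these values, `nodes_match(wide, narrow, "s")` holds, while `nodes_match(wide, other, "i")` does not because the two references share no value.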
To speed rejection, a quick scan through the nodes is
made to see if there is a correspondence between nodes of G
and a subset of nodes of G' (ignoring links between nodes).
If there is a possible correspondence, a procedure is invoked
which locates a subgraph of G' which is isomorphic to G and
assigns each node of G to a corresponding node of G'. The
procedure is as follows:
1. Select a starting node ($n_0$) of G which contains the
most labelled incoming edges. (This is the selector
node with the largest number of arguments.) Selecting
a node of this type ensures that there is a minimum of
backtracking through the starting node.
2. A rooted directed acyclic graph G* with nodes and
edges V* and E* is constructed from G by copying all
nodes and edges of G to G* and assigning a direction
to each edge of G* so that G* has no cycles and for
each node x in (V* - $n^*_0$), there is a path from
$n^*_0$ to x. ($n^*_0$ is the node of G* which
corresponds to $n_0$ in G.) A traversal of the graph
G* is the list of edges and nodes visited in a
preorder traversal of G* with root $n^*_0$. A preorder
traversal of a subgraph with root x visits the node x,
visits each outgoing edge of x and traverses the
subgraph which has as the root, the head node of the
traversed edge.
3. The graph G is traversed in the order of the
traversal of corresponding nodes and edges of G*. At
each step of the traversal of G, a node and new edge
of G' is found which match the node and edge of G.
If two nodes match, they are assigned to each other
and a record of the matching nodes and edges is kept
for each assignment in a backtrack list. To
establish a 1-to-1 correspondence, nodes of one graph
which are previously assigned can only match
corresponding assigned nodes of the other graph.
4. If there is no node and edge of G' which matches a
node and edge of G, the procedure backtracks to the
previous nodes and edges on the backtrack list,
erasing the last nodes and edges on the backtrack
list and the assignments associated with the nodes.
Another node and edge of G' are selected which match
the last node and edge of G on the backtrack list and
the traversal of G continues. If no node and edge of
G' can be found at this point, then the procedure
again backtracks until a new match is found or the
backtrack list is exhausted.
5. If the traversal of G is complete, then G covers
(intersects) G'. If the backtrack list is exhausted,
then G does not cover (intersect) G'.
A feature is included which finds all subgraphs of G'
which are isomorphic to G. This feature is used in section
5.7 and in adding restrictions to c-formulas.
6. If the traversal of G is complete, then the current
set of assignments is the desired mapping. To find
the next isomorphism, the procedure returns to step 4
assuming that the last nodes and edges on the
backtrack list did not match.
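The search for one or all isomorphisms can be illustrated with a short backtracking generator. This is a simplified sketch, not the procedure of INDUCE_1: it orders the nodes of G breadth-first from the node with the most incoming edges instead of traversing the edges of G*, and the recursion stack plays the role of the backtrack list. The graph encoding (a dictionary of nodes, a set of (tail, label, head) edge triples) and the callback `node_ok` are assumptions made for the example.

```python
def find_isomorphisms(g_nodes, g_edges, h_nodes, h_edges, node_ok):
    """Yield each 1-to-1 mapping of the nodes of G into G' that keeps edges."""
    indeg = {n: 0 for n in g_nodes}
    nbrs = {n: set() for n in g_nodes}
    for tail, _label, head in g_edges:
        indeg[head] += 1
        nbrs[tail].add(head)
        nbrs[head].add(tail)
    start = max(g_nodes, key=lambda n: indeg[n])   # most incoming edges
    order, seen = [start], {start}
    for n in order:                                # G assumed weakly connected
        for m in sorted(nbrs[n] - seen):
            seen.add(m)
            order.append(m)

    def edges_ok(assign):
        return all((assign[t], lab, assign[h]) in h_edges
                   for t, lab, h in g_edges if t in assign and h in assign)

    def extend(i, assign):
        if i == len(order):
            yield dict(assign)                     # traversal complete
            return
        n = order[i]
        for cand in h_nodes:
            if cand in assign.values() or not node_ok(g_nodes[n], h_nodes[cand]):
                continue
            assign[n] = cand
            if edges_ok(assign):
                yield from extend(i + 1, assign)
            del assign[n]                          # backtrack

    yield from extend(0, {})

var = ("var", frozenset())
ontop = ("ontop", frozenset({"true"}))
g_nodes = {"s": ontop, "x1": var, "x2": var}
g_edges = {("x1", "1", "s"), ("x2", "2", "s")}
h_nodes = {"a": var, "b": var, "c": var, "s1": ontop, "s2": ontop}
h_edges = {("a", "1", "s1"), ("b", "2", "s1"),
           ("b", "1", "s2"), ("c", "2", "s2")}
node_ok = lambda n, n2: n[0] == n2[0] and n[1] >= n2[1]
maps = list(find_isomorphisms(g_nodes, g_edges, h_nodes, h_edges, node_ok))
```

For a cover or intersection test only the first mapping is needed, so a caller could stop after `next(...)`; exhausting the generator corresponds to the feature of step 6.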
5.5 Trimming a Set of c-formulas
Trimming is the process of selecting the MAXSTAR best
elements of a set of c-formulas with regard to a user defined
criterion. The user specifies the cost functions which are to
be used, the order in which they should be applied, and the
tolerance associated with each cost function. Implemented
cost functions are:
1. The number of events of the current set P1 which are
covered by a c2-formula. (The negative of this
quantity is used to obtain a cost.) This function
minimizes the number of c-formulas in CQ.
2. The number of selectors in a c2-formula. The
function minimizes the number of selectors in the
c-formula.
3. The number of events of P0 which intersect with a
c2-formula. This function leads more rapidly to
consistent c-formulas.
4. The total cost of all functions contained in a
c2-formula.
5. The number of events of the original set P1 which are
covered by a c2-formula. (The negative of this
quantity is used to obtain a cost.) This function
finds the most representative c-formulas.
A set of c-formulas is trimmed using n cost functions
($cf_1, cf_2, \ldots, cf_n$) and a relative tolerance for each
cost function ($tol_1, tol_2, \ldots, tol_n$). The costs are
applied in the order specified by the user ($cf_1$ first,
$cf_2$ second, etc). For each cost function $cf_i$, the
MAXSTAR best c2-formulas along with all c2-formulas
equivalent in cost to the MAXSTAR best c-formulas are passed
to the evaluation using the next cost function $cf_{i+1}$.
Other c2-formulas in the set of c-formulas are discarded.
With the last specified cost function ($cf_n$), only the
MAXSTAR best c-formulas are retained.
For each cost function $cf_i$, $i = 1, 2, \ldots, n$,
equivalence of two c-formulas in cost is defined using an
absolute tolerance ($T_i$). Suppose the set of c-formulas P is
composed of a list $p_1, p_2, \ldots, p_m$. After values for
cost function $cf_i$ have been evaluated ($cf_i(p_j)$ for each
c-formula), the maximum and minimum cost function values are
determined: $cf_i(p_{max})$ and $cf_i(p_{min})$. An absolute
tolerance ($T_i$) is calculated using the user specified
tolerance $tol_i$ as follows:
$T_i = tol_i \cdot (cf_i(p_{max}) - cf_i(p_{min}))$
The MAXSTAR c-formulas of least cost are determined and the
list reordered ($p_1, p_2, \ldots, p_{MAXSTAR}, \ldots, p_m$).
A c-formula $p_j$ with $j >$ MAXSTAR is retained only if its
cost is equivalent, within the tolerance $T_i$, to that of
the MAXSTAR best c-formulas.
The set of c2-formulas which remains is the desired trimmed
set of c2-formulas.
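The trimming procedure can be sketched as a sequence of sort-and-filter passes. This is an illustrative sketch only; the function `trim` and the tuple layout of the formulas are hypothetical, and one detail is an assumption: a formula beyond the MAXSTAR best is treated as equivalent when its cost does not exceed the cost of the MAXSTAR-th best formula by more than the absolute tolerance.

```python
def trim(formulas, cost_functions, tolerances, maxstar):
    """Keep the MAXSTAR best formulas under an ordered list of cost functions."""
    survivors = list(formulas)
    last = len(cost_functions) - 1
    for i, (cf, tol) in enumerate(zip(cost_functions, tolerances)):
        survivors.sort(key=cf)                    # least cost first (stable)
        if len(survivors) <= maxstar:
            continue
        if i == last:
            survivors = survivors[:maxstar]       # last function: hard cut
        else:
            costs = [cf(p) for p in survivors]
            t = tol * (costs[-1] - costs[0])      # absolute tolerance
            cutoff = costs[maxstar - 1] + t       # assumed equivalence rule
            survivors = [p for p, c in zip(survivors, costs) if c <= cutoff]
    return survivors

# Each formula: (name, events of P1 covered, number of selectors).
formulas = [("a", 5, 3), ("b", 5, 1), ("c", 4, 2), ("d", 1, 1)]
covered_cost = lambda p: -p[1]     # negative coverage used as a cost
selector_cost = lambda p: p[2]
best = trim(formulas, [covered_cost, selector_cost], [0.25, 0.0], maxstar=1)
```

Here the first pass keeps "a", "b" and "c" (the absolute tolerance is 0.25 times the cost range of 4), and the second pass selects "b", the survivor with the fewest selectors.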
5.6 Formation of a Set of Consistent Generalizations
A star (denoted by MQ) is formed which covers E1. (A
star which covers E1 is a set of consistent c2-formulas which
cover E1.) The procedure begins by forming a partial star (P)
which contains a set of c-formulas each consisting of one
selector of E1. (The partial star may contain c2-formulas
which are not consistent with respect to P0.) This partial
star is trimmed according to the user supplied optimality
criterion. The conjunction in each c-formula which remains
after trimming is multiplied by each selector of E1 which is
directly connected to it to form a new partial star.
Consistent c2-formulas are placed in MQ. The partial star is
again trimmed and new selectors added to each product until
the desired set MQ of c2-formulas is obtained. Several
parameters control the sizes of sets in this procedure:
MAXSTAR -- the number of c-formulas in a partial star after
trimming
NCONSIST -- the minimum number of consistent c2-formulas
which must be in MQ
ALTER -- the maximum number of new alternatives which may be
formed by adding selectors to an element of a partial
star.
In the following discussion, equivalence type
selectors (i.e., selectors of the form
$[f(x_1, x_2) = same]$) are treated differently from
selectors involving a function symbol and a set of values in
the reference.
1. A partial star P is formed which contains all
selectors of E1 with unary functions.
2. P is trimmed to contain only the best MAXSTAR
c2-formulas. Consistent c-formulas are placed into
MQ. If fewer than NCONSIST elements are in MQ, then
step 3 is executed. Otherwise, the AQ procedure is
applied to the elements of MQ (as described in
section 5.7).
3. A new partial star P' is formed from the old one (P).
For each element $p_i$ in P, a list of all variables
(i.e., arguments of selectors of $p_i$) is formed. All
arguments of equivalence type selectors which occur
in the corresponding selector of E1 are also included
in the list.
4. A list of all selectors of E1 which are not already
in $p_i$ and which have at least one argument in the
variable list (found in step 3) is created. If there
are more than ALTER elements in this list, the best
ALTER selectors are retained (using as a criterion
the cost of functions in the selectors).
5. For each of these selectors, a new c2-formula is
formed which contains the original selectors in $p_i$
and the new selector. If the new c-formula contains
an equivalence selector with only one argument, then
the new c-formula is discarded; otherwise, it is
placed in P'. Steps 2 through 5 are repeated, setting
P = P' in step 2 until NCONSIST elements are in MQ or
until no new elements are in the new partial star P'.
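Steps 1 through 5 amount to a bounded best-first growth of conjunctions. The sketch below is a simplified illustration under several assumptions: the `Selector` tuple, the function `form_star` and its callback parameters are hypothetical; trimming is reduced to a single cost function; equivalence type selectors are not handled; and formulas already found consistent are not extended further.

```python
from collections import namedtuple

Selector = namedtuple("Selector", "func args value")

def form_star(e1, is_consistent, formula_cost, selector_cost,
              maxstar, nconsist, alter):
    """Grow conjunctions of selectors of e1 until nconsist consistent ones exist."""
    mq = []
    partial = [frozenset([s]) for s in e1 if len(s.args) == 1]      # step 1
    while partial:
        partial = sorted(partial, key=formula_cost)[:maxstar]       # step 2
        for p in partial:
            if is_consistent(p) and p not in mq:
                mq.append(p)
        if len(mq) >= nconsist:
            break
        new_partial = []
        for p in partial:
            if p in mq:
                continue                   # assumption: do not extend further
            variables = {a for s in p for a in s.args}              # step 3
            candidates = [s for s in e1                             # step 4
                          if s not in p and variables & set(s.args)]
            for s in sorted(candidates, key=selector_cost)[:alter]:
                q = p | {s}                                         # step 5
                if q not in new_partial:
                    new_partial.append(q)
        partial = new_partial
    return mq

e1 = [Selector("shape", ("x1",), "circle"),
      Selector("color", ("x1",), "red"),
      Selector("ontop", ("x1", "x2"), "true"),
      Selector("shape", ("x2",), "square")]
# Toy consistency test: only color together with ontop excludes all of P0.
is_consistent = lambda p: {"color", "ontop"} <= {s.func for s in p}
mq = form_star(e1, is_consistent, len, lambda s: 0,
               maxstar=3, nconsist=1, alter=2)
```

In this toy run no single selector is consistent, so the partial star is extended once, and the conjunction of the color and ontop selectors is the first consistent formula placed in the result.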
5.7 Extending the References of a Consistent c2-formula
Each consistent c2-formula of MQ (obtained in section
5.6 step 2) contains an alternative, near optimal conjunction
of selectors of E1 which distinguishes E1 from any c-formula
of P0. Using some methods developed for the program AQ7, the
reference of each of these selectors may be generalized to
obtain a consistent c2-formula which will possibly cover
more c-formulas of P1.
Given a graph G of a consistent c2-formula mq in MQ, a
c-structure is created (G*) by replacing all references of
nodes of G with * (the complete set of values for the
function in the selector). The nodes of G* are enumerated
$n^*_i$ ($i = 1, 2, \ldots, m$) and a $VL_1$ system is created
with each $VL_1$ variable $x_i$ related to a node $n^*_i$ of
G*. The domain (denoted $D_i$) of variable $x_i$ is the same
as the domain of the function or variable in the node
$n^*_i$.
The $VL_1$ event space may be defined:
$E = D_1 \times D_2 \times \cdots \times D_m$
Two sets of $VL_1$ complexes L and L' are formed from the
events of the current set P1 and the set P0 respectively.
Individual complexes in these sets are denoted $l_i$ and
$l'_i$. Each complex covers a set of points in the space E.
For each element of P1 and P0 all isomorphisms from the
c-structure G* to the graph representation G of a c-formula
in P1 or P0 are determined. For each isomorphism obtained
from P1 and P0, a $VL_1$ complex is created ($l_i$ or
$l'_i$).
Denoting the value sets of the nodes in a subgraph of G which
is isomorphic to G* as $R_1, R_2, \ldots, R_m$, the
corresponding $VL_1$ complex may be written:
$[x_1 = R_1][x_2 = R_2] \cdots [x_m = R_m]$
This complex covers the $VL_1$ events:
$R_1 \times R_2 \times \cdots \times R_m$
in the event space E.
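A complex and the events it covers can be illustrated in a few lines. The sketch assumes a hypothetical encoding: `order` lists the nodes of G* in the enumeration order of the variables, `mapping` is one isomorphism from G* into the graph of a c-formula, and `values` gives the value set stored at each node of that graph.

```python
from itertools import product

def complex_from_isomorphism(order, mapping, values):
    """Collect the value sets R_1..R_m of the nodes matched by one isomorphism."""
    return [frozenset(values[mapping[n]]) for n in order]

def events_covered(refs):
    """All points of R_1 x R_2 x ... x R_m, each as a tuple of values."""
    return set(product(*refs))

order = ["n1", "n2"]                       # nodes of G*, i.e. variables x_1, x_2
mapping = {"n1": "a", "n2": "b"}           # one isomorphism into a c-formula
values = {"a": {"red", "blue"}, "b": {"big"}}
refs = complex_from_isomorphism(order, mapping, values)
events = events_covered(refs)
```

The complex here is [x_1 = red,blue][x_2 = big], and it covers the two events (red, big) and (blue, big).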
First, the complex $l_1$ which results from extracting
the values from the nodes of the graph of mq is generated.
Then all other isomorphisms from G* to c-formulas of P1 are
determined and a complex added to L for each isomorphism
which results in a new complex that is not already in L. The
set L' is created in a similar manner by generating all
distinct complexes resulting from isomorphisms from G* to
c-formulas of P0.
Since the c-formula mq is consistent with regard to
P0, the complex $l_1$ is disjoint from all complexes in L'.
(That is, there is no point in the $VL_1$ event space E which
is in both $l_1$ and a complex of L'.) A near optimal
extension of $l_1$ against L' in E may be calculated us