
The Alkemy Source Book

Version Sep 2009

Kee Siong Ng
The Australian National University

Contents

1 High-level System Description
    1.1 Module Guide
    1.2 System Dynamics

2 Types, Terms and Escher
    2.1 Terms
        2.1.1 Term Representation
        2.1.2 Memory Management
        2.1.3 Sharing of Nodes
        2.1.4 Free and Bound Variables
        2.1.5 Variable Renaming
        2.1.6 Term Substitution
    2.2 Term Rewriting
        2.2.1 Internal Rewrite Routines
        2.2.2 Computing and Reducing Candidate Redexes
        2.2.3 Pattern Matching
    2.3 Types
        2.3.1 Unification
        2.3.2 Type Checking
    2.4 Function Symbol Table

3 Predicate Construction
    3.1 Introduction
    3.2 Transformations and Standard Predicates
    3.3 Regular Predicates
    3.4 Predicate Rewrite Systems
    3.5 Predicate Enumeration
    3.6 Miscellaneous Functions
    3.7 Module Structure

4 Training and Test Individuals

5 A Decision-Tree Learner
    5.1 Introduction
    5.2 Data Structures
        5.2.1 struct distribution
        5.2.2 struct treenode_t
        5.2.3 struct olnode
    5.3 Alkemy Module Structure
    5.4 Learning Wrappers
    5.5 Tree Induction
        5.5.1 Initialise Learner
        5.5.2 Clean Up Learner
        5.5.3 Learn
    5.6 Tree-Induction Algorithms
        5.6.1 BuildTree
        5.6.2 Build Decision Lists
    5.7 Predicate Evaluation via Table Lookup
    5.8 Error Complexity Pruning
    5.9 Boosting

6 User Interface
    6.1 The System Parser
        6.1.1 Data Declaration
        6.1.2 Training Examples
        6.1.3 Transformations
        6.1.4 Predicate Rewrites
        6.1.5 Learning Options
        6.1.6 Term Schemas
        6.1.7 Types
    6.2 The Scanner

7 Administration Overhead
    7.1 File Structure
    7.2 Learning Options
    7.3 IO Facilities

8 A Listing of the Code Chunks

References

Chapter 1

High-level System Description

Nothing is particularly hard if you divide it into small jobs.
Henry Ford

1.1 Module Guide

Comment 1.1.1. The system can be logically partitioned into six main modules. Each of them, together with its secondary modules in the form of actual C++ files, is described briefly in the following.

Types, Terms, and Escher: This module is concerned with the underlying knowledge representation language and computational model of Alkemy. The language is based on higher-order logic [Llo03], which is in turn based on Church’s λ calculus [Chu40, Bar87]. The computational model is based on Escher [Llo99], a functional logic programming language similar to Haskell with extra facilities for supporting logic programming idioms. We discuss the representation of types and terms as defined in higher-order logic. We also describe how equational reasoning in the style of Escher works.

terms.cc: This module describes how terms are represented and the different operations for manipulating them.

types.cc: This module describes how types are represented and the different operations for manipulating them.

pattern-match.cc: This module describes the pattern-matching algorithm for terms. This operation plays an important role in equational reasoning.

unification.cc: This module describes the unification algorithm for types.

Predicate Construction: This module is responsible for more functionality than its name suggests. In broad terms, it is concerned with all things predicate, from data structures for representing predicates, to predicate evaluation, to predicate construction through rewriting. The four submodules are as follows:

predicate.cc: This module defines and implements the class representations for transformations and predicates. For ease of manipulation, surrogate classes for the two are implemented.

evaluator.cc: This module is responsible for performing predicate evaluation on individuals in the data set.


rewrite.cc: This module is concerned with predicate construction. It implements a public function, used by the decision-tree learner, that takes as input a predicate and returns all its children obtainable through rewrites. The data structures used for the storage and manipulation of predicate rewrite systems are also implemented in this file.

A Decision-tree Learner (alkemy.cc): This is the module responsible for constructing and displaying decision trees. A greedy algorithm is used to recursively partition the training instances into purer and purer nodes. At each stage, with the help of the predicate constructor and the training set, the learner explores a hypothesis space using a best-first search based on an accuracy-guided heuristic. By default, a tree post-pruner based on the error-complexity algorithm introduced in CART [BFOS84] is used to prune the induced tree at the end. Both classification and regression trees can be induced from data.

Datasets Management (trainset.cc): This module is primarily concerned with the maintenance of the training, validation and test sets in support of the decision-tree learner. Among other things, it implements public access functions to the individuals, indexed by their identifiers; the random partitioning of the data set into a training set and a validation set; and the cleaning up of the data set at the end of learning.

User Interface: The system takes as input a problem specification file that describes in formal terms what the learning task is. Among other things, it specifies what the individuals in the data set look like, and the form and the structure of the hypothesis space. This module, implemented in lex and yacc [LMB92], is responsible for the specification of the grammar of the input file, and the implementation of the associated actions for the different directives and statements in the file.

inputproc.l: This is the lex file implementing the scanner for use in support of the system parser.

inputproc.y: This is the yacc file implementing the system parser.

Administration Overhead: This module is concerned with all the administrative overheads associated with the running of the system.

global.cc: This is the common room of the different modules. Located in this file are some data structures recording information needed by almost all the modules, including learning options and information related to types, transformations and constants. Also implemented in this file are miscellaneous support mechanisms like arbitrary-precision integers and memory management facilities.

main.cc: This file implements the main function. Its job is to process the input specification file and initialise all the different modules and data stores accordingly, before firing up the decision-tree learner to do the actual tree induction.

1.2 System Dynamics

Comment 1.2.1. We now look at system dynamics. The system first initialises the various data stores, including the training and test examples and the type and transformation data stores in the global space. Then it initialises the decision-tree learner before starting off the learning process. Finally, upon completion of learning, it reports the results obtained and cleans up all the data stores and temporary data structures. The detailed steps are now described. But first, the main module.


2 〈main.cc 2〉≡
#include <iostream>
#include <map>
#ifndef __APPLE__
#ifndef __sun
#include <getopt.h>
#endif
#endif
#include <string.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include "global.h"
#include "trainset.h"
#include "rewrite.h"
#include "alkemy.h"

using namespace std;

extern FILE * yyin; extern int yyparse();
void out_of_memory() { cerr << "Memory exhausted." << endl; exit(1); }

int main(int argc, char ** argv) {
    for (int i=1; i!=argc; i++) commandline = commandline + argv[i] + " ";
    〈main::System Initialisation and Startup 3a〉
    〈main::Clean Up 8〉
    return 0;
}

Defines:
    main, never used.
    out_of_memory, used in chunk 3a.

Uses alkemy.h 157, commandline 234 235, global.h 232, rewrite.h 131b, trainset 135a, and trainset.h 133.

Comment 1.2.2. The external variable yyin and function yyparse serve as the interface to the system parser.

Comment 1.2.3. In the initialisation phase, we process the learning options and the spec file in turn.

3a 〈main::System Initialisation and Startup 3a〉≡ (2)

set_new_handler(out_of_memory);
〈main::Process Learning Options 3b〉
initialise_constants();
〈main::Process Specification File 7b〉
assert(getTrSize() <= 1000000); // 1 million is the initial value of purity

Uses getTrSize 134 140b, initialise_constants 243c, and out_of_memory 2.

3b 〈main::Process Learning Options 3b〉≡ (3a)

〈main::initialise with default options 4a〉
〈main::change to specified options 5〉


Comment 1.2.4. The different options are discussed later in Comment 7.2.1.

4a 〈main::initialise with default options 4a〉≡ (3b)

options.verbosity = 0; options.strategy = LR;
options.i_prune = 0; options.prune = 0;
options.stump = false; options.cutout = -1;
options.ollength = 0; options.enumSpace = false;
options.test_percentage = 0; options.exp_count = 1;
options.crossvalidate = -5; options.valid = -5;
options.balance = 0; options.recursive = false;
options.FAPtable_length = 1000000; options.FAPtable_entry_length = 30;
options.postprune = false; options.seed = 0;
options.boostN = 0; options.purity = 1000000;
options.foldnumber = -5; options.one_redex = false;
options.learning_mode = CLASSIFICATION;
options.decision_list = false;
options.margin = -1;
options.pos_only = false;
options.typeCheck = true;

Uses CLASSIFICATION 234, LR 234, and options 234 235.

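Comment 1.2.5. This is the option string, the const char * argument to the getopt function. Each letter names an option, and a letter followed by a colon takes an argument.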

4b 〈getopt argument 4b〉≡ (5)

"p:vhnc:t:s:l:xreF:d:S:PC:z:V:m:B:a:o:DM:OT"


5 〈main::change to specified options 5〉≡ (3b)

int c; // int rather than char: getopt returns an int, and EOF need not fit in a char
int num; unsigned int unum; float fnum;
while ((c = getopt(argc, argv, 〈getopt argument 4b〉)) != EOF) {
    switch (c) {
    〈GMP sensitive options 6a〉
    case 'h': 〈command line help menu 7a〉
    case 't': if (sscanf(optarg, "%d", &num) == 1) {
                  assert(num > 0 && num < 100);
                  options.test_percentage = num; }
              break;
    case 'o': if (sscanf(optarg,"%d",&num)==1) options.ollength = num; break;
    case 's': if (sscanf(optarg,"%u",&unum)==1) options.seed = unum; break;
    case 'n': options.stump = true; break;
    case 'l': if (sscanf(optarg,"%d",&num)==1) options.balance=num; break;
    case 'p': if ((sscanf(optarg, "%f", &fnum) == 1) &&
                  (fnum >= -1.0) && (fnum <= 101.0)) {
                  options.prune = fnum; options.i_prune = fnum; }
              else { cerr<<"Prune must be between -1 and 101\n"; exit(1); }
              break;
    case 'v': options.verbosity++; break;
    case 'r': options.recursive = true; break;
    case 'c': if (sscanf(optarg, "%d", &num) == 1) options.cutout = num; break;
    case 'F': if (sscanf(optarg, "%d", &num) == 1)
                  options.FAPtable_length = num;
              break;
    case 'd': if (sscanf(optarg, "%d", &num) == 1)
                  options.FAPtable_entry_length = num;
              break;
    case 'P': options.postprune = true; break;
    case 'C': if (sscanf(optarg, "%d", &num) == 1) {
                  if (num > 1) options.crossvalidate = num;
                  else { cerr << "Cross validation value must be >1.\n";
                         exit(1); } }
              break;
    case 'z': if (sscanf(optarg, "%d", &num) == 1) options.foldnumber=num; break;
    case 'V': if ((sscanf(optarg, "%d", &num) == 1) &&
                  (num >= 1) && (num <= 101))
                  options.valid = num;
              else { cerr<<"Valid must be between 1 and 100\n"; exit(1); }
              break;
    case 'm': if (sscanf(optarg, "%u", &unum) == 1) options.exp_count = unum;
              break;
    case 'B': if (sscanf(optarg, "%u", &num) == 1) options.boostN = num; break;
    case 'a': if (sscanf(optarg, "%u", &num) == 1) options.purity = num; break;
    case 'D': options.decision_list = true; break;
    case 'M': if (sscanf(optarg, "%u", &num) == 1) options.margin = num; break;
    case 'O': options.pos_only = true; break;
    case 'T': options.typeCheck = false; break;
    }
}
〈main::incompatible options 6b〉

Uses options 234 235.
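As a hedged illustration of the option string in chunk 4b and the loop above, the following standalone sketch parses three options with POSIX getopt. The names demo_options and parse_demo_options are hypothetical and not part of Alkemy; only the convention that a letter followed by a colon takes an argument is taken from getopt itself.

```cpp
#include <cassert>
#include <cstdio>
#include <unistd.h>   // POSIX getopt, optarg, optind

// Hypothetical subset of the learner's options, for illustration only.
struct demo_options {
    int verbosity = 0;
    int test_percentage = 0;
    bool stump = false;
};

// Parses argv with the option string "vnt:": -v and -n are plain flags,
// while -t takes an argument. Returns false on an unknown option or a
// malformed number.
bool parse_demo_options(int argc, char **argv, demo_options &opts) {
    optind = 1;  // start scanning at the first argument
    int c;       // getopt returns an int; -1 marks the end of the options
    while ((c = getopt(argc, argv, "vnt:")) != -1) {
        int num;
        switch (c) {
        case 'v': opts.verbosity++; break;
        case 'n': opts.stump = true; break;
        case 't':
            if (std::sscanf(optarg, "%d", &num) != 1) return false;
            opts.test_percentage = num;
            break;
        default: return false;
        }
    }
    return true;
}
```

Repeating -v (for example -vv) increments the verbosity once per occurrence, which is how the help menu's "-vv very verbose" behaviour arises.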


6a 〈GMP sensitive options 6a〉≡ (5)

case 'e': options.enumSpace = true;
#ifdef NO_GMP
          cerr << "This option is not available "
                  "because GMP is not installed.\n";
          exit(1);
#endif
          break;
case 'S': if (sscanf(optarg, "%d", &num) == 1) {
              assert(num >= 0 && num <= 1);
              options.strategy = num;
#ifdef NO_GMP
              if (num == 1) {
                  cerr << "The specified strategy is not "
                          "supported because GMP is not installed.\n";
                  exit(1); }
#endif
          }
          break;


6b 〈main::incompatible options 6b〉≡ (5)

if (options.i_prune < 0 && options.FAPtable_length != 0) {
    cerr << "You have turned off pruning. You may wish to set "
            "FAPtable_length to zero as well since the correct "
            "working of the table lookup algorithm uses the same "
            "assumptions underlying the pruning mechanism.\n";
    exit(1);
}
if (options.postprune && options.valid == -5) {
    cerr << "Hhmm... you specified that I should perform a tree post"
            "-pruning at the end of learning, but you did not specify "
            "the size of the validation set. As a default, I have set "
            "that to 10% of the training set. Have a look at the -V flag "
            "if this is not what you want.\n";
    options.valid = 10;
}
// cross validation + test set is not allowed
assert(!(options.crossvalidate != -5 && options.test_percentage > 0));



7a 〈command line help menu 7a〉≡ (5)

cerr << "Usage : " << argv[0] << " [options] <spec file>\n"

"Basic Options:\n"

" -h : Help \n"

" -v : Level of verbosity (-v verbose; -vv very verbose)\n"

" -p : Set the prune parameter. Default is 0.\n"

" -n : Learn a decision stump.\n"

" -c : Set the cutout parameter.\n"

" -P : Do tree post-pruning.\n"

" | -V : Choose n% of trainset as validation set.\n"

" -C : Do a n-fold cross-validation.\n"

" | -z : Do the z-th fold only.\n"

" -s : Set the seed for partitioning.\n"

" -t : Set the percentage of data to use as test set. Default is 0.\n"

" | -m : The number of times to do a leave-n-out run.\n"

" -e : Compute the size of the search space.\n"

" -B : Learn by boosting up to m trees.\n"

" -D : Learn a decision list.\n"

" -M : Margin needed for node splitting.\n"

" -O : Learning from one class only.\n"

" -T : Do not type check.\n"

"Incompatible options"

" -C and -t\n"

"Advanced Options:\n"

" -F : Set the max FAP table size.\n"

" -d : Set the max size of predicate depth in the FAP table.\n"

" -o : Set the max openlist length. Default is unlimited.\n"

" -S : Change the predicate enumeration strategy. LR: 0, Expected: 1\n"

" -l : Set the balance parameter.\n"

; exit(0);

Uses LR 234, openlist 157, options 234 235, trainset 135a, and verbose 242.

Comment 1.2.6. Here, we call the parser to process the input specification file. On the successful return of the function call yyparse(), we would have available four kinds of information.

1. Type objects would have been created for all the declared types. These are accessible via get-type() calls from the global data area.

2. All the training examples would have been stored in the individuals module.

3. The rewrites would have been properly set up.

4. The learning options would have been initialised based on the input specification file.

The parser will start off the decision-tree learning process after it has finished parsing the input specification file. See Comment 6.1.2.

7b 〈main::Process Specification File 7b〉≡ (3a)

FILE * specfile = fopen(argv[argc-1], "r"); assert(specfile);
yyin = specfile;
int succ = yyparse(); if (succ == 1) exit(1);
fclose(specfile);
// cout << "Train set:" << endl; printTrainset();
// cout << "Test set:" << endl; printTestset();

Uses printTestset 134 142b and printTrainset 134 142b.


Comment 1.2.7. Upon completion of learning, we clean up the data stores, freeing allocated memory as necessary.

8 〈main::Clean Up 8〉≡ (2)

cleanup_trainset();
cleanup_rewrite();
cleanup_statements();
cleanup_trans_table();
cleanup_typeobjs();

Uses cleanup_rewrite 116a 118c, cleanup_statements 243b, cleanup_trainset 134 143, cleanup_trans_table 237b, and cleanup_typeobjs 238b.

Chapter 2

Types, Terms and Escher

2.1 Terms

2.1.1 Term Representation

Comment 2.1.1. We begin with the data structure for representing terms.

9 〈terms.h 9〉≡
#ifndef _TERM_H
#define _TERM_H

#include <iostream>
#include <string>
#include <vector>
#include <utility>
#include <cassert>
#include <stdlib.h>
#include <ctype.h>
#include <math.h>
#include "io.h"
#include "tables.h"

#define uint unsigned int

struct term_schema;

〈term-schema::definitions 16b〉
〈term-schema::supporting types 16c〉
typedef vector<int> occurrence;
〈term-schema::type defs 10b〉

struct term_schema {
    〈term-schema parts 10c〉
    〈term-schema::constructors 12a〉
    〈term-schema::function declarations 11a〉
};
〈term-schema::memory management 17a〉
〈term-schema::external functions 13d〉

#endif

Defines:
    occurrence, never used.
    term_schema, used in chunks 11–15, 17–25, 27–32, 34–40, 42–44, 46–55, 57–60, 62–68, 70, 71a, 75, 85b, 90a, 95c, 96a, 103, 105, 132, 153b, 154a, 172b, 189–91, 212a, 215, 216, 220, 222–25, and 241.
    terms.h, used in chunks 10a, 75c, 85b, 95c, 130c, 132, 189b, 206, 232, 236a, and 241.

Uses io.h 246 and tables.h 96b.

10a 〈terms.cc 10a〉≡
#include "terms.h"

〈terms.cc::local functions 14a〉
〈term-schema::function definitions 11b〉

Uses terms.h 9.

Comment 2.1.2. We use a standard approach to represent terms. A term is a graph of nodes, where each node is a term-schema as defined. One possible optimization is to distinguish between boxed and unboxed fields [Pey87, pg. 190]. For a discussion on term representations, see [Pey87, Chap. 10].

Comment 2.1.3. A term schema can be any one of the following: a syntactical variable (SV), a variable (V), a function symbol (F), a data constructor (D), an application (APP), an abstraction (ABS), or a product (PROD). This information is recorded in tag.

10b 〈term-schema::type defs 10b〉≡ (9) 31a ⊲

enum kind { SV, V, F, D, APP, ABS, PROD };

10c 〈term-schema parts 10c〉≡ (9) 10d ⊲

kind tag;

Defines:
    tag, used in chunks 10–16, 19b, 20b, 23–28, 30, 32, 34, 56–59, 65–68, 70c, 71a, 77a, 79–82, 84c, 85a, 91c, 190a, and 223a.

Comment 2.1.4. Syntactic variables, variables, functions and data constructors have names.

10d 〈term-schema parts 10c〉+≡ (9) ⊳ 10c 11c ⊲

string name;

Comment 2.1.5. This is used to check that the correct kind of terms have names.

10e 〈term has name 10e〉≡ (10f 19b)

tag >= SV && tag <= D

Uses tag 10c.

Comment 2.1.6. Terms with names are called atomic terms. Terms that do not have names are called composite terms.

10f 〈atomic term 10f〉≡
〈term has name 10e〉
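The range test in chunk 10e works only because the four named kinds are declared first in the enumeration. The sketch below makes that dependency explicit. The free functions is_atomic and is_composite are hypothetical names used for illustration; the source expresses the same test as a code chunk over the tag field.

```cpp
#include <cassert>

// Same ordering as chunk 10b: the four named kinds come first.
enum kind { SV, V, F, D, APP, ABS, PROD };

// Hypothetical free-function version of the "term has name" test.
// It relies on SV..D being contiguous at the start of the enum.
bool is_atomic(kind tag) { return tag >= SV && tag <= D; }

// Composite terms are exactly the remaining kinds.
bool is_composite(kind tag) { return !is_atomic(tag); }
```

Reordering the enumerators would silently change the result of this test, so the range check and the enum declaration have to be kept in step.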


11a 〈term-schema::function declarations 11a〉≡ (9) 11d ⊲

bool isF() { return (tag == F); }
bool isF(string f);
bool isApp() { return (tag == APP); }
bool isD() { return (tag == D); }
bool isD(string d) { return (tag == D && name == d); }
bool isNotD() { return (tag != D); }
bool isVar() { return (tag == V); }
bool isVar(string v) { return (tag == V && name == v); }
bool isAbs() { return (tag == ABS); }
bool isProd() { return (tag == PROD); }

Defines:
    isAbs, used in chunks 36d, 44b, and 94a.
    isApp, used in chunks 13, 49a, 51b, 54c, 61a, 62c, and 92b.
    isD, used in chunks 40a, 43, 51c, 54d, 57a, 91a, and 190a.
    isF, used in chunks 13c, 49a, 51, 54, 58c, 59b, 91a, and 216d.
    isNotD, used in chunk 38b.
    isProd, used in chunks 37a and 94c.
    isVar, used in chunks 36, 47b, 53b, 91c, 92a, and 94a.

Uses tag 10c.

Comment 2.1.7. Underscore is used as a wildcard that stands for anything.

11b 〈term-schema::function definitions 11b〉≡ (10a) 11e ⊲

bool term_schema::isF(string f) { return (tag == F && (f == "_" || name == f)); }

Defines:
    isF, used in chunks 13c, 49a, 51, 54, 58c, 59b, 91a, and 216d.
Uses tag 10c and term_schema 9.
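The wildcard behaviour can be seen in isolation with a cut-down node type. In this minimal sketch, mini_term is a hypothetical stand-in for term_schema that keeps only the tag and the name; the body of isF follows chunk 11b.

```cpp
#include <cassert>
#include <string>

enum kind { SV, V, F, D, APP, ABS, PROD };

// Hypothetical stand-in for term_schema: only the tag and the name.
struct mini_term {
    kind tag;
    std::string name;

    // True if this node is a function symbol named f.
    // The name "_" acts as a wildcard and matches any function symbol.
    bool isF(const std::string &f) const {
        return tag == F && (f == "_" || name == f);
    }
};
```

Note that the wildcard relaxes only the name comparison: a node that is not tagged F never matches, whatever name is asked for.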

Comment 2.1.8. The parameters tag and kind do not have default values. They are initialized in the constructor code with passed-in values.

Comment 2.1.9. Application, abstraction and product terms have subterms. These are captured in the vector fields.

11c 〈term-schema parts 10c〉+≡ (9) ⊳ 10d 12b ⊲

vector<term_schema *> fields;

Defines:
    fields, used in chunks 11, 13–15, 19, 20b, 23–25, 27–30, 32b, 33d, 36–38, 40a, 44b, 46a, 48, 49, 51–55, 57–59, 61b, 64–68, 73, and 94.

Uses term_schema 9.

11d 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 11a 12f ⊲

term_schema * leftc() { /* assert(tag == APP); */ return fields[0]; }
term_schema * rightc() { /* assert(tag == APP); */ return fields[1]; }
void setSubterm(term_schema * t, uint i);
void insert(term_schema * t) { fields.push_back(t); }

Defines:
    insert, used in chunks 14, 49b, 79a, 80c, 103b, 105b, 123c, 147b, 190a, 212a, 215a, 216c, and 222–25.
    leftc, used in chunks 13, 14, 36a, 40a, 43, 44b, 46–49, 51–55, 59, 60, 62c, 72, 92b, 93a, and 105b.
    rightc, used in chunks 36a, 40a, 43, 44b, 46, 47, 49, 51–55, 59, 60, 62c, 72, 92b, and 93b.
    setSubterm, used in chunk 60.

Uses fields 11c, tag 10c, and term_schema 9.

11e 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 11b 13a ⊲

void term_schema::setSubterm(term_schema * t, uint i)
    { assert(i < fields.size()); fields[i] = t; }

Defines:
    setSubterm, used in chunk 60.

Uses fields 11c and term_schema 9.


12a 〈term-schema::constructors 12a〉≡ (9)

term_schema(kind k) { tag = k; 〈term-schema initializations 12c〉 }
term_schema(kind k, char * n)
    { tag = k; assert(n); name = n; 〈term-schema initializations 12c〉 }
term_schema(kind k, string n)
    { tag = k; name = n; 〈term-schema initializations 12c〉 }

Uses tag 10c and term_schema 9.

Comment 2.1.10. Certain basic data constructors like numbers can best be dealt with in their original machine representations. (Otherwise, a lot of conversions from and to strings are needed.) The fields numi and numf replace the name field for numbers.

12b 〈term-schema parts 10c〉+≡ (9) ⊳ 11c 15b ⊲

bool isfloat, isint;
long int numi;
double numf;
bool isstring;

Defines:
    isfloat, used in chunks 12, 15, 18a, 19b, 38a, 40, 41, 43, 70c, and 91a.
    isint, used in chunks 12, 15, 18a, 19b, 38a, 40, 41, 43, 70c, and 91a.
    isstring, used in chunks 12, 14d, 18a, 70c, and 91a.
    numf, used in chunks 12, 15, 18a, 19b, 38a, 40, 41, 43, and 70c.
    numi, used in chunks 12, 15, 18a, 19b, 38a, 40, 41, 43, and 70c.

12c 〈term-schema initializations 12c〉≡ (12a) 15c ⊲

isfloat = false; isint = false; isstring = false;

Uses isfloat 12b, isint 12b, and isstring 12b.

12d 〈term-schema clone parts 12d〉≡ (19b) 16f ⊲

if (tag == D) {
    isfloat = ret->isfloat; isint = ret->isint;
    numi = ret->numi; numf = ret->numf;
    isstring = ret->isstring; }

Uses isfloat 12b, isint 12b, isstring 12b, numf 12b, numi 12b, and tag 10c.

12e 〈term-schema replace parts 12e〉≡ (20b) 16g ⊲

if (t->tag == D) {
    isfloat = t->isfloat; isint = t->isint;
    numi = t->numi; numf = t->numf;
    isstring = t->isstring; }

Uses isfloat 12b, isint 12b, isstring 12b, numf 12b, numi 12b, and tag 10c.

Comment 2.1.11. A term of the form (· · · ((t1 t2) t3) · · · tn) can be visualized as taking the shape of a spine. (Draw it!) The (leftmost) term t1 is called the tip of the spine. At different places throughout a computation, we need to access the leftmost term in a nested application node, and the following two functions provide this service. The input x to the second function is assigned the value n − 1. We currently perform a (linear) traversal down the spine. It is possible to make this go faster if necessary.

12f 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 11d 13b ⊲

term_schema * spineTip();
term_schema * spineTip(int & x);

Defines:
    spineTip, used in chunk 216d.

Uses term_schema 9.


13a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 11e 13c ⊲

term_schema * term_schema::spineTip() {
    if (tag != APP) return this;
    term_schema * p = fields[0];
    while (p->isApp()) p = p->fields[0];
    return p;
}

term_schema * term_schema::spineTip(int & x) {
    if (tag != APP) { x = 0; return this; }
    x = 1;
    term_schema * p = fields[0];
    while (p->isApp()) { p = p->fields[0]; x++; }
    return p;
}

Defines:
    spineTip, used in chunk 216d.

Uses fields 11c, isApp 11a, tag 10c, and term_schema 9.
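To make the traversal concrete, here is a minimal sketch of the second spineTip variant on a cut-down node type. The type mini_term and the free function spine_tip are hypothetical stand-ins; the loop is written slightly differently from chunk 13a but visits the same nodes and yields the same count.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum kind { SV, V, F, D, APP, ABS, PROD };

// Hypothetical stand-in for term_schema with just enough structure.
struct mini_term {
    kind tag;
    std::string name;
    std::vector<mini_term *> fields;   // subterms; not owned here
};

// Follows the left child of nested applications until a non-application
// node is reached. On return, x holds the number of application nodes
// passed, that is, the number of arguments applied to the tip.
mini_term *spine_tip(mini_term *t, int &x) {
    x = 0;
    while (t->tag == APP) {
        t = t->fields[0];
        x++;
    }
    return t;
}
```

For the term ((f a) b) the tip is f and x is 2, which matches n − 1 for a spine of n = 3 terms.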

Comment 2.1.12. We do not currently use spineTip during computation. For efficiency reasons, the spine tips of every subterm of the current query term are at present calculated up front using the labelVariables function (see Comment 2.1.38). The function spineTip is only called by the parser for error checking.

Comment 2.1.13. The following function checks whether the current term has the general form ((f t1) t2), where f is given as input. If spinetip has already been computed, we can do things slightly faster.

13b 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 12f 14b ⊲

bool isFunc2Args(string f);

Defines:
    isFunc2Args, used in chunks 47, 52–55, and 61a.

13c 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 13a 14d ⊲

bool term_schema::isFunc2Args(string f) {
    if (spinetip)
        if (spinelength == 3 && spinetip->isF(f))
            return true;
    if (isApp() && leftc()->isApp() && leftc()->leftc()->isF(f))
        return true;
    return false;
}

Defines:
    isFunc2Args, used in chunks 47, 52–55, and 61a.

Uses isApp 11a, isF 11a 11b, leftc 11d, spinelength 21, spinetip 21, and term_schema 9.

Comment 2.1.14. The following function creates a new term having the form ((f t1) t2), where f (given) is a function symbol of arity two. The arguments t1 and t2 need to be initialized by the calling function.

13d 〈term-schema::external functions 13d〉≡ (9) 29b ⊲

extern term_schema * newT2Args(kind k, string f);

Uses newT2Args 14a and term_schema 9.


14a 〈terms.cc::local functions 14a〉≡ (10a) 17b ⊲

term_schema * newT2Args(kind k, string f) {
    term_schema * ret = new_term(APP);
    ret->insert(new_term(APP));
    ret->leftc()->insert(new_term(k, f));
    return ret;
}

Defines:
    newT2Args, used in chunks 13d, 37a, 38a, 49b, and 225.

Uses insert 11d, leftc 11d, new_term 17b, and term_schema 9.

Comment 2.1.15. The following function initializes the two arguments of a term created using newT2Args.

14b 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 13b 14c ⊲

void initT2Args(term_schema * t1, term_schema * t2)
    { leftc()->insert(t1); insert(t2); }

Defines:
    initT2Args, used in chunks 37a, 38a, and 225.

Uses insert 11d, leftc 11d, and term_schema 9.
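The two-step construction (build the skeleton, then fill in the arguments) can be sketched as follows. This is an illustration under assumptions: mini_term, new_t2_args and init_t2_args are hypothetical names, and std::unique_ptr is used for ownership here, whereas the source allocates nodes through new_term and its own memory management.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum kind { SV, V, F, D, APP, ABS, PROD };

// Hypothetical stand-in for term_schema that owns its subterms.
struct mini_term {
    kind tag;
    std::string name;
    std::vector<std::unique_ptr<mini_term>> fields;

    explicit mini_term(kind k, std::string n = "")
        : tag(k), name(std::move(n)) {}

    mini_term *leftc() { return fields[0].get(); }
    mini_term *rightc() { return fields[1].get(); }
    void insert(std::unique_ptr<mini_term> t) { fields.push_back(std::move(t)); }
};

// Builds the skeleton ((f _) _): an outer application whose left child
// is an inner application that so far holds only the symbol f.
std::unique_ptr<mini_term> new_t2_args(kind k, const std::string &f) {
    auto ret = std::make_unique<mini_term>(APP);
    ret->insert(std::make_unique<mini_term>(APP));
    ret->leftc()->insert(std::make_unique<mini_term>(k, f));
    return ret;
}

// Fills in the arguments: t1 completes the inner application and
// t2 completes the outer one.
void init_t2_args(mini_term &t, std::unique_ptr<mini_term> t1,
                  std::unique_ptr<mini_term> t2) {
    t.leftc()->insert(std::move(t1));
    t.insert(std::move(t2));
}
```

Until the second step has run, the skeleton has applications with a single child, so rightc must not be called on it.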

Comment 2.1.16. The following function checks whether two terms are equal to each other. This is currently only used in debugging code.

14c 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 14b 15d ⊲

bool equal(term_schema * t);

Defines:
    equal, used in chunks 63d and 71a.

Uses term_schema 9.

14d 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 13c 15e ⊲

bool term_schema::equal(term_schema * t) {
    if (tag != t->tag) return false;
    if (name != t->name) return false;
    if (isstring != t->isstring) return false;
    〈term schema::equal::numbers 15a〉
    uint size1 = fields.size();
    uint size2 = t->fields.size();
    if (size1 != size2) return false;
    for (uint i=0; i!=size1; i++)
        if (fields[i]->equal(t->fields[i]) == false)
            return false;
    return true;
}

Defines:
    equal, used in chunks 63d and 71a.

Uses fields 11c, isstring 12b, tag 10c, and term_schema 9.

Comment 2.1.17. We treat numbers in a slightly peculiar way. We will equate an integer and a floating-point number (even though the types do not agree) if they are the same number. We do this because the internal arithmetic of Escher can add, subtract, multiply and divide integers and floating-point numbers to produce another floating-point number. See Comment 2.2.12.


15a 〈term schema::equal::numbers 15a〉≡ (14d)

if (isint && t->isint && numi != t->numi) return false;
if (isint && t->isfloat && (double)numi != t->numf) return false;
if (isfloat && t->isint && numf != (double)t->numi) return false;
if (isfloat && t->isfloat && numf != t->numf) return false;

Uses isfloat 12b, isint 12b, numf 12b, and numi 12b.
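The mixed integer and floating-point comparison can be tried out on its own. The sketch below is a simplification that assumes both operands are numeric; the type number and the helpers make_int, make_float, value_of and same_number are hypothetical names, not part of the source.

```cpp
#include <cassert>

// Hypothetical stand-in for the numeric fields of a data-constructor node.
struct number {
    bool isint = false, isfloat = false;
    long numi = 0;
    double numf = 0.0;
};

number make_int(long i) { number n; n.isint = true; n.numi = i; return n; }
number make_float(double d) { number n; n.isfloat = true; n.numf = d; return n; }

// The node's value as a double, whichever representation it uses.
double value_of(const number &n) {
    return n.isint ? static_cast<double>(n.numi) : n.numf;
}

// Two numeric nodes are equal when they denote the same number, even if
// one is stored as an integer and the other as a floating-point value.
bool same_number(const number &a, const number &b) {
    if (a.isint && b.isint) return a.numi == b.numi;  // exact integer compare
    return value_of(a) == value_of(b);                // compare as doubles
}
```

With these definitions the integer 2 and the float 2.0 compare equal, while 2 and 2.5 do not.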

Comment 2.1.18. This is used for marking and printing redexes.

15b 〈term-schema parts 10c〉+≡ (9) ⊳ 12b 16d ⊲

bool redex;

Defines:
    redex, used in chunks 15, 61b, 63, 72b, and 130b.

15c 〈term-schema initializations 12c〉+≡ (12a) ⊳ 12c 16e ⊲

redex = false;

Uses redex 15b.

Comment 2.1.19. The variable redex does not really play a part during cloning and reusing.

Comment 2.1.20. A term is printed in the way it is represented. The redex (if one exists) is marked out using square brackets. Shared nodes are also marked with their reference count.

15d 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 14c 19a ⊲

void print();

15e 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 14d 19b ⊲

void term_schema::print() {
    if (getSelector() == SILENT) return;
    if (redex) ioprint(" [[[ ");
    // if (refcount > 1) ioprint("_s_");
    if (tag >= SV && tag <= F) ioprint(name);
    else if (tag == D) {
        if (isfloat) ioprint(numf);
        else if (isint) ioprint(numi);
        else ioprint(name);
    } else if (tag == APP) {
        ioprint("("); fields[0]->print(); ioprint(" ");
        fields[1]->print(); ioprint(")");
    } else if (tag == ABS) {
        ioprint("\\"); fields[0]->print(); ioprint(".");
        fields[1]->print();
    } else if (tag == PROD) {
        int size = fields.size();
        if (size == 0) { ioprint("()"); return; }
        ioprint("(");
        for (int i=0; i!=size-1; i++) {
            fields[i]->print(); ioprint(","); }
        fields[size-1]->print(); ioprint(")");
    } else { 〈print error handling 16a〉 }
    if (redex) ioprint(" ]]] ");
}

Uses fields 11c, getSelector 246 247a, ioprint 246 247a, isfloat 12b, isint 12b, numf 12b, numi 12b, redex 15b, refcount 20c, SILENT 246, tag 10c, and term_schema 9.

16a 〈print error handling 16a〉≡ (15e)

cerr ≪ "Printing untagged term.\ttag = " ≪ tag ≪ endl;
assert(false);

Uses tag 10c.

2.1.1.1 Constraints for Syntactic Variables

Comment 2.1.21. We have a (limited) syntax for specifying constraints on what sort of terms a syntactical variable can range over. (See the grammar for Escher.) Four types of constraints are supported at present. The constraint CVAR forces a syntactical variable to range over variables only; CCONST forces a syntactical variable to range over constants only. The constraint CEQUAL dictates that the value of one syntactical variable must be equal to the value of one other. The constraint CNOTEQUAL dictates that the value of one syntactical variable must not be equal to the value of one other. For details on how these constraints are implemented, see Comment 2.2.99.

16b 〈term-schema::definitions 16b〉≡ (9) 39b ⊲

#define CVAR 1

#define CCONST 2

#define CEQUAL 3

#define CNOTEQUAL 4

Defines:
CCONST, used in chunks 71a and 223a.
CEQUAL, used in chunks 71a and 223a.
CNOTEQUAL, used in chunks 71a and 223a.
CVAR, used in chunks 71a and 223a.

16c 〈term-schema::supporting types 16c〉≡ (9)

struct condition { int tag; string name; };

Defines:
condition, used in chunks 16, 62b, 71a, 203, and 223.

Uses tag 10c.

16d 〈term-schema parts 10c〉+≡ (9) ⊳ 15b 20c ⊲

condition ∗ cond; // only applies to SV

Defines:
cond, used in chunks 16, 19c, 71a, 222, and 223.

Uses condition 16c.

16e 〈term-schema initializations 12c〉+≡ (12a) ⊳ 15c 20d ⊲

cond = NULL;

Uses cond 16d.

16f 〈term-schema clone parts 12d〉+≡ (19b) ⊳ 12d 26c ⊲

if (cond) {
    assert(tag ≡ SV);
    ret→cond = new condition;
    ret→cond→name = cond→name; ret→cond→tag = cond→tag;
}

Uses cond 16d, condition 16c, and tag 10c.

16g 〈term-schema replace parts 12e〉+≡ (20b) ⊳ 12e 22c ⊲

if (cond) delete cond;
cond = t→cond;

Uses cond 16d.

2.1.2 Memory Management

Comment 2.1.22. We look at some memory management issues in this section. A naive scheme relying on new and delete is in use at the moment. It is not clear to the author whether a separate heap-allocating scheme would make the system go a whole lot faster.

Comment 2.1.23. We put wrappers around new and delete to collect some statistics. The procedure mem_report shows the total number of terms allocated and subsequently freed. This is used to check whether there is a memory leak.

17a 〈term-schema::memory management 17a〉≡ (9)

extern term_schema ∗ new_term(kind k);
extern term_schema ∗ new_term(kind k, char ∗ s);
extern term_schema ∗ new_term(kind k, string s);
extern term_schema ∗ new_term_int(int x);
extern term_schema ∗ new_term_float(float x);
extern term_schema ∗ new_term_string(string x);
extern void mem_report();

Uses mem_report 18c, new_term 17b, new_term_float 18a, new_term_string 18a, and term_schema 9.

17b 〈terms.cc::local functions 14a〉+≡ (10a) ⊳ 14a 18a ⊲

#ifdef DEBUG_MEM
static long int allocated = 0;
static long int freed = 0;
#endif
// #define DEBUG_MEM
term_schema ∗ new_term(kind k) {
    〈memory debugging code 18b〉
    return new term_schema(k);
}
term_schema ∗ new_term(kind k, char ∗ s) {
    〈memory debugging code 18b〉
    return new term_schema(k, s);
}
term_schema ∗ new_term(kind k, string s) {
    〈memory debugging code 18b〉
    return new term_schema(k, s);
}

Defines:
new_term, used in chunks 14a, 17a, 19b, 36–38, 43, 49b, 52–54, 103b, 105b, 190a, 216c, and 222–25.

Uses term_schema 9.

18a 〈terms.cc::local functions 14a〉+≡ (10a) ⊳ 17b 18c ⊲

term_schema ∗ new_term_int(int x) {
    〈memory debugging code 18b〉
    term_schema ∗ ret = new term_schema(D);
    ret→isint = true; ret→numi = x; return ret;
}
term_schema ∗ new_term_int(long int x) {
    〈memory debugging code 18b〉
    term_schema ∗ ret = new term_schema(D);
    ret→isint = true; ret→numi = x; return ret;
}
term_schema ∗ new_term_float(float x) {
    〈memory debugging code 18b〉
    term_schema ∗ ret = new term_schema(D);
    ret→isfloat = true; ret→numf = x; return ret;
}
term_schema ∗ new_term_string(string x) {
    〈memory debugging code 18b〉
    term_schema ∗ ret = new term_schema(D);
    ret→isstring = true; ret→name = x; return ret;
}

Defines:
new_term_float, used in chunks 17a, 19b, 40, 41, and 222.
new_term_string, used in chunks 17a and 222.

Uses isfloat 12b, isint 12b, isstring 12b, numf 12b, numi 12b, and term_schema 9.

18b 〈memory debugging code 18b〉≡ (17b 18a)

#ifdef DEBUG_MEM
allocated++;
#endif

18c 〈terms.cc::local functions 14a〉+≡ (10a) ⊳ 18a 29c ⊲

void delete_term(term_schema ∗ x) {
#ifdef DEBUG_MEM
    freed++;
#endif
    delete x;
}
void mem_report() {
#ifdef DEBUG_MEM
    cout ≪ "\n\nReport from Memory Manager:\n";
    cout ≪ "\tAllocated " ≪ allocated ≪ endl;
    cout ≪ "\tFreed " ≪ freed ≪ endl;
    cout ≪ "\tUnaccounted " ≪ allocated - freed ≪ endl ≪ endl;
#endif
}

Defines:
delete_term, used in chunk 19c.
mem_report, used in chunk 17a.

Uses term_schema 9.
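The counting idea behind these wrappers can be illustrated on its own. This is a minimal sketch under assumptions: `Node`, `new_node`, `delete_node` and `unaccounted` are hypothetical names, and the counters are always enabled here rather than guarded by DEBUG_MEM.

```cpp
#include <cassert>

// Hypothetical node type; only the counting wrappers matter here.
struct Node {
    int tag;
    explicit Node(int t) : tag(t) {}
};

static long allocated = 0;  // bumped by every new_node call
static long freed = 0;      // bumped by every delete_node call

Node *new_node(int tag) { ++allocated; return new Node(tag); }
void delete_node(Node *x) { ++freed; delete x; }

// Number of nodes still live; a non-zero value at exit suggests a leak.
long unaccounted() { return allocated - freed; }
```

The check only works if every allocation and deallocation goes through the wrappers, which is presumably why the sources route all term creation through new_term and its variants.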

19a 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 15d 20a ⊲

term_schema ∗ clone();
void freememory();

Defines:
clone, used in chunks 33b, 60, 63d, 77–79, 81–84, 86–88, 91a, 92a, 94–96, 102–105, 107–109, 122, 123b, 151a, 154c, 173a, 177–80, 190a, 195, 196a, 203, 208a, 215b, 217c, 219b, 227a, and 243c.
freememory, used in chunks 20b, 33b, 36b, 52c, 54e, 55b, 60, 65–67, 102–105, 115b, 121–30, 132, 143, 151b, 153–55, 163b, 173c, 174a, 177–81, 183, 190, 195–97, 205, 236a, 237b, 239, 241, and 243b.

Uses term_schema 9.

Comment 2.1.24. Cloning of a term with shared nodes will result in an identical term without shared nodes.

19b 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 15e 19c ⊲

term_schema ∗ term_schema::clone() {
    term_schema ∗ ret;
    if (isfloat) ret = new_term_float(numf);
    else if (isint) ret = new_term_int(numi);
    else if (〈term has name 10e〉) ret = new_term(tag, name);
    else ret = new_term(tag);
    〈term-schema clone parts 12d〉
    int size = fields.size();
    for (int i=0; i≠size; i++)
        ret→fields.push_back(fields[i]→clone());
    return ret;
}

Defines:
clone, used in chunks 33b, 60, 63d, 77–79, 81–84, 86–88, 91a, 92a, 94–96, 102–105, 107–109, 122, 123b, 151a, 154c, 173a, 177–80, 190a, 195, 196a, 203, 208a, 215b, 217c, 219b, 227a, and 243c.

Uses fields 11c, isfloat 12b, isint 12b, new_term 17b, new_term_float 18a, numf 12b, numi 12b, tag 10c, and term_schema 9.

Comment 2.1.25. We explicitly free memory instead of relying on destructors. The function freememory must take node sharing into account. A term is in use while its reference count is still non-zero.

19c 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 19b 20b ⊲

void term_schema::freememory() {
    refcount−−;
    〈freememory error checking 19d〉
    if (refcount ≠ 0) return;
    int size = fields.size();
    for (int i=0; i≠size; i++)
        if (fields[i]) fields[i]→freememory();
    if (cond) delete cond;
    delete_term(this);
}

Defines:
freememory, used in chunks 20b, 33b, 36b, 52c, 54e, 55b, 60, 65–67, 102–105, 115b, 121–30, 132, 143, 151b, 153–55, 163b, 173c, 174a, 177–81, 183, 190, 195–97, 205, 236a, 237b, 239, 241, and 243b.

Uses cond 16d, delete_term 18c, fields 11c, refcount 20c, and term_schema 9.

19d 〈freememory error checking 19d〉≡ (19c)

if (refcount < 0) {
    setSelector(STDERR); print(); ioprintln();
    cerr ≪ "refcount = " ≪ refcount ≪ endl;
}
assert(refcount ≥ 0);

Uses ioprintln 246 247a, refcount 20c, setSelector 246 247a, and STDERR 246.

Comment 2.1.26. This function overwrites the root of the current term with the input term t. We need to do this if the current node is shared (see §2.1.3) or when the current term is the root term (with no parent). The procedure is simple. The information on the root of t is copied, and all the child nodes of t are reused. We first reuse the child nodes of t because we could be replacing the current term with its children, in which case t can get deleted before we can reuse its child nodes if we are not careful.

20a 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 19a 20e ⊲

void replace(term_schema ∗ t);

Defines:
replace, used in chunks 32a, 34, 36b, 54e, and 60.

Uses term_schema 9.

20b 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 19c 22e ⊲

void term_schema::replace(term_schema ∗ t) {
    tag = t→tag; name = t→name;
    〈term-schema replace parts 12e〉
    int tsize = t→fields.size();
    for (int i=0; i≠tsize; i++) t→fields[i]→reuse();

    int size = fields.size();
    for (int i=0; i≠size; i++) if (fields[i]) fields[i]→freememory();
    fields.clear();
    for (int i=0; i≠tsize; i++) fields.push_back(t→fields[i]);
}

Defines:
replace, used in chunks 32a, 34, 36b, 54e, and 60.

Uses clear 145b, fields 11c, freememory 19a 19c, reuse 20e, tag 10c, and term_schema 9.

2.1.3 Sharing of Nodes

Comment 2.1.27. We use reference counting to implement sharing of nodes.

20c 〈term-schema parts 10c〉+≡ (9) ⊳ 16d 21 ⊲

int refcount;

Defines:
refcount, used in chunks 15e, 19, 20, 23a, and 46a.

20d 〈term-schema initializations 12c〉+≡ (12a) ⊳ 16e 22b ⊲

refcount = 1;

Uses refcount 20c.

Comment 2.1.28. A cloned object of a shared term would have refcount 1. Also, after replacing, the term retains its original refcount value.

20e 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 20a 20f ⊲

term_schema ∗ reuse() { refcount++; return this; }

Defines:
reuse, used in chunks 20b, 33b, 37a, 38a, 44b, 46a, 49b, 51c, 52c, 54e, 55a, and 65–67.

Uses refcount 20c and term_schema 9.

20f 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 20e 22d ⊲

bool shared() { return (refcount > 1); }

Defines:
shared, used in chunk 33b.

Uses refcount 20c.
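The interplay between reuse, shared and freememory can be seen on a toy node type. The sketch below is an illustration only: `Node` and the global counter `live` are hypothetical, and the node carries none of the other term fields.

```cpp
#include <cassert>
#include <vector>

static int live = 0; // hypothetical counter of nodes not yet deleted

struct Node {
    int refcount = 1;            // a fresh node has exactly one owner
    std::vector<Node *> fields;  // child nodes
    Node() { ++live; }
    ~Node() { --live; }

    Node *reuse() { ++refcount; return this; }  // register one more owner
    bool shared() const { return refcount > 1; }

    // Drop one owner; delete the node and release its children
    // only when the last owner is gone.
    void freememory() {
        --refcount;
        assert(refcount >= 0);
        if (refcount != 0) return;
        for (Node *child : fields)
            if (child) child->freememory();
        delete this;
    }
};
```

If two parents hold the same leaf (the second obtained through `reuse()`), freeing the first parent leaves the leaf alive with a reference count of 1, and only freeing the second parent deletes it.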

Comment 2.1.29. A few notes on sharing. One of the biggest advantages of sharing is that common subexpressions need only be evaluated once. Sharing of nodes can, however, interfere with a few basic operations in Escher.

Firstly, I believe the operation of checking for free-variable capture, a test we need to do frequently during pattern matching (see §2.2.3) and term substitution (see §2.1.6), cannot be done efficiently if a variable that occurs both free and bound in a term is shared. The algorithm described in Comment 2.1.38 will not work. It is hard to imagine an (easy) labelling scheme that would work.

Second, sharing of nodes is not always safe. Some statements in the booleans module, especially the ones that support logic programming (see for example Comment 2.2.15), can potentially change shared nodes in destructive ways. The extensive use of such sharing-unfriendly statements in Escher is the primary reason I gave up on sharing.

In the absence of sharing, the computational saving that can be obtained from common subexpression evaluation can be achieved using (intelligent) caching.

Having said all that, sharing does have at least one important use in our interpreter; see Comment 2.2.80.

2.1.4 Free and Bound Variables

Comment 2.1.30. One must be careful when dealing with free and bound variables. This is something that is not difficult to get right, but incredibly easy to get wrong! So please pay some attention.

Definition 2.1.31. An occurrence of a variable x in a term is bound if it occurs within a subterm of the form λx.t.

Definition 2.1.32. An occurrence of a variable in a term is free if it is not a bound occurrence.

Fact 2.1.33. A variable is free in a term iff it has a free occurrence.

Comment 2.1.34. We have the following simple method for determining free and bound variables. Given a term t, we label (using parameter label) every subterm of t (see Comment 2.2.53) from left to right with increasing numbers starting from 0. We then label (using parameter bindinglabel) every bound variable in t by the abstraction that binds it. To check whether a variable x is free inside a term u (where both x and u are subterms of t), we simply check whether the binding label of x is less than the label of u. The label of a composite subterm referred to is simply the label of its leftmost symbol (recorded in spinetip).

Example 2.1.35. Here is an example of a labelled term.

Term:           (    f    λx.  (    (    ∧    (    (    ∧    x)   λy.  λx.  (    y    x))) y))
Label:          0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
Binding label:                                               2                   10   11

The second row gives the labels of subterms. The third row gives the binding labels of bound variables.
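The labelling of the example term can be reproduced with a few lines of code. What follows is a sketch under assumptions, not the implementation in this book: `Term`, `leaf`, `node` and `labelAll` are hypothetical names, tuples are left out, and the freeness test compares against the label of the enclosing node itself rather than the label recorded through spinetip.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum Kind { VAR, CONST, ABS, APP };

struct Term {
    Kind kind;
    std::string name;                         // variable or constant name
    std::vector<std::unique_ptr<Term>> kids;  // ABS: {binder, body}; APP: {fun, arg}
    int label = -1;
    int bindinglabel = -1;                    // -1 means "not bound"
    Term(Kind k, std::string n) : kind(k), name(std::move(n)) {}
};
using TermPtr = std::unique_ptr<Term>;

TermPtr leaf(Kind k, const std::string &n) { return TermPtr(new Term(k, n)); }
TermPtr node(Kind k, TermPtr a, TermPtr b) {
    TermPtr t(new Term(k, ""));
    t->kids.push_back(std::move(a));
    t->kids.push_back(std::move(b));
    return t;
}

// Number every visited node from left to right and record, for each bound
// variable, the label of the nearest enclosing abstraction that binds it.
// The binder of an abstraction is not visited and gets no label of its own.
void labelAll(Term &t, int &marker, std::vector<const Term *> binders) {
    t.label = marker++;
    if (t.kind == VAR) {
        for (auto it = binders.rbegin(); it != binders.rend(); ++it)
            if ((*it)->kids[0]->name == t.name) {
                t.bindinglabel = (*it)->label;
                return;
            }
    } else if (t.kind == ABS) {
        binders.push_back(&t);
        labelAll(*t.kids[1], marker, binders);
    } else if (t.kind == APP) {
        labelAll(*t.kids[0], marker, binders);
        labelAll(*t.kids[1], marker, binders);
    }
}

// A variable is free inside `enclosing` if it is unbound, or if its binder
// was labelled before `enclosing` and therefore lies outside it.
bool isFreeInside(const Term &v, const Term &enclosing) {
    return v.bindinglabel == -1 || v.bindinglabel < enclosing.label;
}
```

Building the term of Example 2.1.35 with `leaf` and `node` and calling `labelAll` with a marker starting at 0 assigns the labels 0 to 15, with binding labels 2, 10 and 11 on the three bound variable occurrences; the final occurrence of y keeps binding label -1 and is free.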

21 〈term-schema parts 10c〉+≡ (9) ⊳ 20c 24e ⊲

int label;
int bindinglabel;
term_schema ∗ spinetip;
int spinelength;

Defines:
bindinglabel, used in chunks 22–24.
label, used in chunks 22–24, 60, 61, 132, 142a, 147b, 152a, 166a, 196b, 201c, 202b, 204, 205, and 212.
spinelength, used in chunks 13c, 22–24, 38b, 58c, and 62.
spinetip, used in chunks 13c, 22–24, 38, 57a, 58c, 61a, and 62a.

Uses term_schema 9.

22a 〈labelVariables initialization values 22a〉≡ (22)

label = -1;
bindinglabel = -1;
spinetip = NULL;
spinelength = -1;

Uses bindinglabel 21, label 21, spinelength 21, and spinetip 21.

22b 〈term-schema initializations 12c〉+≡ (12a) ⊳ 20d 25a ⊲

〈labelVariables initialization values 22a〉

Comment 2.1.36. All these values become obsolete on replacing.

22c 〈term-schema replace parts 12e〉+≡ (20b) ⊳ 16g 25b ⊲

〈labelVariables initialization values 22a〉

Comment 2.1.37. The labels and binding labels of a term t usually need to be recalculated if any part of t changes through (non-trivial) term substitutions. (Renaming of bound variables is fine, however.) Cloning and replacing are usually done as part of term rewriting, which involves term substitutions. For this reason, we will not try to reuse calculated values of the parameters during cloning and reusing, but will instead demand that they be recalculated when necessary.

Comment 2.1.38. The following function implements the algorithm described in Comment 2.1.34. We calculate the value of spinetip and the length of the spine here as well. This saves explicit calls to the function spineTip (both versions) later on in the computation. We also reinitialize the variable freevars_computed. (See Comment 2.1.46.)

The actual algorithm is slightly more clever than the naïve version described earlier in that we only compute the labels of those subterms that strictly need recalculating. The very first time a term t is labelled, we need to traverse the whole term to label each subterm. If a subterm s of t is subsequently modified, we only need to recompute the labels of all subterms of t at and to the right of s. The input parameter changed records the location of s.

Comment 2.1.39. The correctness of labelVariables rests on the assumption that there is no node sharing in the current term.

22d 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 20f 24b ⊲

void labelVariables(int changed);
void labelVariables(int & marker, int changed, vector<term_schema ∗> bvars);

Defines:
labelVariables, used in chunks 23b, 24a, and 190a.

Uses marker 125a and term_schema 9.

22e 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 20b 23a ⊲

void term_schema::labelVariables(int changed) {
    int marker = 0; vector<term_schema ∗> bvars;
    labelVariables(marker, changed, bvars);
}

Uses marker 125a and term_schema 9.

Comment 2.1.40. When we call labelVariables on a term, we know that either the subterm s labelled by changed resides in the current term or the current term resides to the right of s. This means that if the current term t is an atomic term, we need to recompute its label because it is either s or is to the right of s. If t is an abstraction, then s is either in the body of t or t is to the right of s. In both cases, we need to recurse onto the body of t. These are the straightforward cases. We will look at application and tuple terms in Comment 2.1.42.

23a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 22e 24c ⊲

void term_schema::labelVariables(int & marker, int changed,
                                 vector<term_schema ∗> bvars) {
    assert(refcount ≡ 1);
    validfree = false; freevars_computed = false; label = marker++;
    if (tag ≡ D ∨ tag ≡ F ∨ tag ≡ SV) {
        spinetip = this; spinelength = 1;
    } else if (tag ≡ V) {
        spinetip = this; spinelength = 1;
        int size = bvars.size();
        for (int i=size-1; i≠-1; i−−)
            if (name ≡ bvars[i]→fields[0]→name) {
                bindinglabel = bvars[i]→label; return; }
    } else if (tag ≡ ABS) {
        spinetip = this; spinelength = 1;
        bvars.push_back(this);
        fields[1]→labelVariables(marker, changed, bvars);
    } else if (tag ≡ PROD) {
        〈labelVariables::PROD 24a〉
    } else if (tag ≡ APP) {
        〈labelVariables::APP 23b〉
    }
}

Uses bindinglabel 21, fields 11c, freevars_computed 24e, label 21, marker 125a, refcount 20c, spinelength 21, spinetip 21, tag 10c, term_schema 9, and validfree 26a.

Comment 2.1.41. In the tag == V case, we search bvars from right to left in accordance with the normal scoping rules.

Comment 2.1.42. Now the interesting cases. We first look at application nodes. If the current term t is the term s labelled changed or is to the right of s, that is, label ≥ changed, then we need to relabel both child nodes. Otherwise if s is inside t, then we need to call labelVariables recursively on both child nodes only when s appears in the left child node. This condition is true when changed is less than the label of the right child node. In the case when s is in the right child node, we can skip a recursive call on the left child node but we must remember to update the marker field before we recurse on the right child.

23b 〈labelVariables::APP 23b〉≡ (23a)

if (label ≥ changed ∨ fields[1]→label > changed)
    fields[0]→labelVariables(marker, changed, bvars);
else marker = fields[1]→label;
fields[1]→labelVariables(marker, changed, bvars);
spinetip = fields[0]→spinetip;
spinelength = fields[0]→spinelength + 1;

Uses fields 11c, label 21, labelVariables 22d 22e 23a, marker 125a, spinelength 21, and spinetip 21.

Comment 2.1.43. The same kind of reasoning used for application terms applies here for product terms.

24a 〈labelVariables::PROD 24a〉≡ (23a)

spinetip = this; spinelength = 1;
int size = fields.size();
for (int i=0; i≠size-1; i++) {
    if (label ≥ changed ∨ fields[i+1]→label > changed) {
        assert(fields[i+1]→label);
        fields[i]→labelVariables(marker, changed, bvars);
    } else { marker = fields[i+1]→label; continue; }
}
fields[size-1]→labelVariables(marker, changed, bvars);

Uses fields 11c, label 21, labelVariables 22d 22e 23a, marker 125a, spinelength 21, and spinetip 21.

Comment 2.1.44. This next function checks whether the current term x, which should be a variable inside the term enclosed, has a free occurrence inside enclosed. The test is particularly simple. The variable x has a free occurrence iff either the binding label of x is -1 (initialization value) or the (non-negative) binding label of x is less than the label of enclosed.

24b 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 22d 24d ⊲

bool isFreeInside(term_schema ∗ enclosed);

Defines:
isFreeInside, used in chunk 25d.

Uses term_schema 9.

24c 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 23a 25c ⊲

bool term_schema::isFreeInside(term_schema ∗ enclosed) {
    assert(tag ≡ V);
    if (bindinglabel ≡ -1) return true;
    assert(enclosed→spinetip);
    return (enclosed→spinetip→label > bindinglabel);
}

Defines:
isFreeInside, used in chunk 25d.

Uses bindinglabel 21, label 21, spinetip 21, tag 10c, and term_schema 9.

Comment 2.1.45. The following function returns all the free variables inside a term. It is assumed that we have called labelVariables on the term to initialize all the labels and binding labels.

24d 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 24b 25e ⊲

void getFreeVars();
void getFreeVars(term_schema ∗ root);

Defines:
getFreeVars, used in chunks 28a and 29a.

Uses term_schema 9.

Comment 2.1.46. Computed free variables are cached in the vector myfreevars. The flag freevars_computed tells us whether myfreevars has been initialized. A vector instead of a set is used to store the free variables. This means free variables with multiple occurrences will be recorded multiple times.

24e 〈term-schema parts 10c〉+≡ (9) ⊳ 21 26a ⊲

bool freevars_computed;
vector<string> myfreevars;

Defines:
freevars_computed, used in chunks 23a, 25, and 30c.
myfreevars, used in chunks 25d, 28a, and 29a.

25a 〈term-schema initializations 12c〉+≡ (12a) ⊳ 22b 26b ⊲

freevars_computed = false;

Uses freevars_computed 24e.

Comment 2.1.47. These values become obsolete on replacing.

25b 〈term-schema replace parts 12e〉+≡ (20b) ⊳ 22c 26d ⊲

freevars_computed = false;

Uses freevars_computed 24e.

25c 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 24c 25d ⊲

void term_schema::getFreeVars() {
    if (freevars_computed) return;
    getFreeVars(this);
}

Defines:
getFreeVars, used in chunks 28a and 29a.

Uses freevars_computed 24e and term_schema 9.

25d 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 25c 27a ⊲

void term_schema::getFreeVars(term_schema ∗ root) {
    if (myfreevars.size()) myfreevars.clear();
    freevars_computed = true;
    if (tag ≡ D ∨ tag ≡ F) return;
    if (tag ≡ V) {
        if (isFreeInside(root)) myfreevars.push_back(name);
        return; }
    if (tag ≡ ABS) {
        fields[1]→getFreeVars(root);
        myfreevars = fields[1]→myfreevars; return; }
    int size = fields.size();
    for (int i=0; i≠size; i++) {
        fields[i]→getFreeVars(root);
        int size2 = fields[i]→myfreevars.size();
        for (int j=0; j≠size2; j++)
            myfreevars.push_back(fields[i]→myfreevars[j]);
    }
}

Defines:
getFreeVars, used in chunks 28a and 29a.

Uses clear 145b, fields 11c, freevars_computed 24e, isFreeInside 24b 24c, myfreevars 24e, tag 10c, and term_schema 9.

Comment 2.1.48. The function labelVariables caters for the case where the term in consideration undergoes changes during computation. For terms that stay unchanged throughout the whole computation (e.g. program statements), freeness checking of variables can be done (slightly) more efficiently by flagging each bound variable in the term directly up front. This is achieved using the following function labelStaticBoundVars().

At present, we only call this function on the head of program statements.

25e 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 24d 26e ⊲

void labelStaticBoundVars();
void labelBound(string x);

Defines:
labelBound, used in chunk 27a.
labelStaticBoundVars, used in chunks 32a, 34, and 216e.

Comment 2.1.49. We first look at the free parameter. To ensure safe use, the free parameter is only valid if the validfree parameter is true. (The function labelStaticBoundVars is responsible for setting this latter parameter. Its value will get set to false during cloning and replacing.)

26a 〈term-schema parts 10c〉+≡ (9) ⊳ 24e 56c ⊲

bool free;
bool validfree;

Defines:
free, used in chunks 26, 27, and 193.
validfree, used in chunks 23a, 26, and 27.

26b 〈term-schema initializations 12c〉+≡ (12a) ⊳ 25a 56d ⊲

validfree = false;

Uses validfree 26a.

Comment 2.1.50. If the whole term t on which labelStaticBoundVars is called is to be cloned, then the existing value of the free parameter would remain correct. However, if only a subterm t1 of t is to be cloned, then some variables that are bound in t can become free in t1. Variables that are free in t would remain free in t1 though. However, if t (respectively t1) is then subsequently substituted into another term (using the mechanism of syntactical variables), then free variables in t (respectively t1) can become bound. For all these reasons, we will not attempt to recycle values of free parameters during cloning and replacing.

26c 〈term-schema clone parts 12d〉+≡ (19b) ⊳ 16f 56e ⊲

ret→validfree = false;

Uses validfree 26a.

Comment 2.1.51. Ditto for replacing. Free variables can become bound after replacing while bound variables remain bound after replacing. The trouble here is that we do not really want to traverse the input graph to label the variables. At present, we only use the free parameters inside the head of a program statement during pattern matching. We will just mark in the replace code that proper handling of the free parameter is not yet implemented.

26d 〈term-schema replace parts 12e〉+≡ (20b) ⊳ 25b 56f ⊲

validfree = false;

Uses validfree 26a.

26e 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 25e 27c ⊲

bool isFree() { assert(tag ≡ V ∧ validfree); return free; }

Defines:
isFree, used in chunks 33c, 66–68, and 72a.

Uses free 26a, tag 10c, and validfree 26a.

Comment 2.1.52. A straightforward tree traversal is used to label the bound variables. Bound variables inside a lambda term are marked before the free variables. Hence the way labelling is done inside the (tag == V) case.

27a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 25d 27b ⊲

void term_schema::labelStaticBoundVars() {
    if (tag ≡ F ∨ tag ≡ D ∨ tag ≡ SV) return;
    if (tag ≡ V) {
        if (¬validfree) { validfree = true; free = true; }
        return; }
    if (tag ≡ ABS) {
        fields[0]→validfree = true; fields[0]→free = false;
        fields[1]→labelBound(fields[0]→name);
        fields[1]→labelStaticBoundVars();
        return;
    }
    int size = fields.size();
    for (int i=0; i≠size; i++) fields[i]→labelStaticBoundVars();
}

Defines:
labelStaticBoundVars, used in chunks 32a, 34, and 216e.

Uses fields 11c, free 26a, labelBound 25e 27b, tag 10c, term_schema 9, and validfree 26a.

27b 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 27a 28a ⊲

void term_schema::labelBound(string x) {
    if (tag ≡ F ∨ tag ≡ D ∨ tag ≡ SV) return;
    if (tag ≡ V) {
        if (name ≡ x) { validfree = true; free = false; }
        return; }
    if (tag ≡ ABS) { fields[1]→labelBound(x); return; }
    int size = fields.size();
    for (int i=0; i≠size; i++) fields[i]→labelBound(x);
}

Defines:
labelBound, used in chunk 27a.

Uses fields 11c, free 26a, tag 10c, term_schema 9, and validfree 26a.

Comment 2.1.53. The functions isFree and isFreeInside discussed above allow one to check whether a subterm s of a term t occurs free inside t. Sometimes we want to check whether a variable x has a free occurrence inside another term. The following functions allow us to do that. Some occurrences of the input variable could be bound. We return upon seeing the first free occurrence of the input variable.

There are two versions of this function. The first, occursFree, uses getFreeVars to compute all the free variables in a term and then checks whether var is inside this set. If occursFree is called repeatedly by the same term, this caching of computed free variables is beneficial. The second, occursFreeNaive, performs a simple traversal of the term to check whether var occurs free.

27c 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 26e 28d ⊲

bool occursFree(string var);
bool occursFreeNaive(string var);
bool occursFreeNaive(string var, vector<string> boundv);

Defines:
occursFree, used in chunks 47b, 49a, and 53b.
occursFreeNaive, used in chunks 28c and 48a.

28a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 27b 28b ⊲

bool term_schema::occursFree(string var) {
    getFreeVars();
    int size = myfreevars.size();
    for (int i=0; i≠size; i++)
        if (myfreevars[i] ≡ var) return true;
    return false;
}

Defines:
occursFree, used in chunks 47b, 49a, and 53b.

Uses getFreeVars 24d 25c 25d, myfreevars 24e, and term_schema 9.

28b 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 28a 28c ⊲

bool term_schema::occursFreeNaive(string var) {
    vector<string> boundv;
    return occursFreeNaive(var, boundv);
}

Defines:
occursFreeNaive, used in chunks 28c and 48a.

Uses term_schema 9.

28c 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 28b 29a ⊲

bool term_schema::occursFreeNaive(string var, vector<string> boundv) {
    if (tag ≡ F ∨ tag ≡ D) return false;
    if (tag ≡ V) {
        if (name ≡ var ∧ inVector(name,boundv) ≡ false) return true;
        return false;
    }
    if (tag ≡ ABS) {
        boundv.push_back(fields[0]→name);
        return fields[1]→occursFreeNaive(var, boundv);
    }
    int size = fields.size();
    for (int i=0; i≠size; i++)
        if (fields[i]→occursFreeNaive(var, boundv)) return true;
    return false;
}

Uses fields 11c, inVector 241 242, occursFreeNaive 27c 28b, tag 10c, and term_schema 9.
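The naive traversal is easy to experiment with outside the interpreter. Here is a minimal sketch, assuming a cut-down term type: `Term`, `var`, `lam` and `app` are hypothetical helpers, and `std::find` plays the role of inVector.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

enum Kind { VAR, CONST, ABS, APP };

struct Term {
    Kind kind;
    std::string name;                         // used by VAR and CONST
    std::vector<std::shared_ptr<Term>> kids;  // ABS: {binder, body}; APP: {fun, arg}
};
using TermPtr = std::shared_ptr<Term>;

TermPtr var(const std::string &n) { return std::make_shared<Term>(Term{VAR, n, {}}); }
TermPtr lam(const std::string &x, TermPtr body) {
    return std::make_shared<Term>(Term{ABS, "", {var(x), body}});
}
TermPtr app(TermPtr f, TermPtr a) { return std::make_shared<Term>(Term{APP, "", {f, a}}); }

// True if `v` has at least one free occurrence in `t`. The vector `bound`
// is passed by value, so binders pushed in one branch do not leak into
// sibling branches.
bool occursFreeNaive(const Term &t, const std::string &v,
                     std::vector<std::string> bound = {}) {
    if (t.kind == CONST) return false;
    if (t.kind == VAR)
        return t.name == v &&
               std::find(bound.begin(), bound.end(), t.name) == bound.end();
    if (t.kind == ABS) {
        bound.push_back(t.kids[0]->name);
        return occursFreeNaive(*t.kids[1], v, bound);
    }
    for (const TermPtr &k : t.kids)
        if (occursFreeNaive(*k, v, bound)) return true;
    return false;
}
```

For instance, x occurs free in λy.x and in ((λx.x) x), but not in λx.x.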

Comment 2.1.54. This function checks whether any free variable inside the calling term is captured by at least one of the bound variables in bvars. The index (into bvars) of the capturing abstraction is recorded in captd. We store pointers to binding abstraction terms instead of strings for two reasons. First, we sometimes need to change the name of a binding variable when a free variable is captured. This happens, for example, during term substitution. Having a pointer to the abstraction term allows us to jump straight to the offending term. Second, in terms of memory usage, storing pointers to terms is cheaper. If we want to use a set instead of a vector to store the binding variables (maybe for efficiency reasons), it is easy to put a wrapper around term_schema * and define a pointer p to be less than q iff p->fields[0]->name < q->fields[0]->name.

28d 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 27c 29d ⊲

bool captured(vector<term_schema ∗> & bvars, int & captd);

Defines:
captured, used in chunks 33d, 72b, and 73d.

Uses term_schema 9.

29a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 28c 30a ⊲

bool term_schema::captured(vector<term_schema ∗> & bvars, int & captd) {
    if (bvars.empty()) return false;
    getFreeVars();
    int fsize = myfreevars.size();
    int bsize = bvars.size();
    for (int i=0; i≠fsize; i++)
        for (int j=0; j≠bsize; j++)
            if (myfreevars[i] ≡ bvars[j]→fields[0]→name) {
                captd = j; return true; }
    return false;
}

Defines:
captured, used in chunks 33d, 72b, and 73d.

Uses fields 11c, getFreeVars 24d 25c 25d, myfreevars 24e, and term_schema 9.

Comment 2.1.55. For small terms, the use of vector for both myfreevars and bvars is probably okay. For larger terms, the use of set (red-black trees) could be much better.
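The set-based alternative mentioned in Comment 2.1.54 could look roughly as follows. This is a sketch only: `Node`, `ByBinderName`, `BinderSet` and `findBinder` are hypothetical names, and the comparator is passed to `std::set` directly instead of wrapping the pointer in a class.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct Node {                    // hypothetical cut-down term node
    std::string name;
    std::vector<Node *> fields;  // for an abstraction: {binder, body}
};

// Orders abstraction nodes by the name of the variable they bind.
struct ByBinderName {
    bool operator()(const Node *p, const Node *q) const {
        return p->fields[0]->name < q->fields[0]->name;
    }
};
using BinderSet = std::set<Node *, ByBinderName>;

// Looks up the abstraction that binds `var`, or returns nullptr.
Node *findBinder(const BinderSet &bvars, const std::string &var) {
    Node binder{var, {}};
    Node probe{"", {&binder}};   // temporary abstraction used only as a key
    auto it = bvars.find(&probe);
    return it == bvars.end() ? nullptr : *it;
}
```

One consequence of this ordering is that two abstractions binding the same name compare as equivalent, so a plain set keeps only one of them; nested binders with equal names would need a multiset or a different key.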

2.1.5 Variable Renaming

Comment 2.1.56. Different forms of variable renaming are required in performing computations. We discuss these operations in this section.

Comment 2.1.57. This function creates new previously unused variable names.

29b 〈term-schema::external functions 13d〉+≡ (9) ⊳ 13d

extern string newVar();

Uses newVar 29c.

29c 〈terms.cc::local functions 14a〉+≡ (10a) ⊳ 18c 59c ⊲

#include <stdlib.h>
#include "global.h"

static int varInt = 0;
string newVar() {
    string ret = "pve";
    ret += numtostring(varInt);
    varInt++;
    assert(varInt ≠ 0); // check for overflow
    return ret;
}

Defines:
newVar, used in chunks 29b, 33d, 46a, 48a, and 73c.

Uses global.h 232 and numtostring 241.
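A standalone variant of the name generator is sketched below, using `std::to_string` in place of numtostring. It is an assumption-laden illustration rather than the code above: the overflow check is done before the increment against `std::numeric_limits<int>::max()`, since detecting wrap-around after incrementing a signed integer relies on behaviour the C++ standard leaves undefined.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Returns "pve0", "pve1", ...; each call yields a name not handed out before.
// Freshness with respect to user-written variables is assumed, not checked.
std::string newVar() {
    static int counter = 0;
    assert(counter != std::numeric_limits<int>::max()); // refuse to wrap around
    return "pve" + std::to_string(counter++);
}
```

Two successive calls always return different names, which is all the renaming routines in this section require of it.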

Comment 2.1.58. This function renames all occurrences of a variable var1 inside the current term to var2. Note that both free and bound occurrences are renamed. This is okay since the function is only called (sensibly) as a subroutine by the other variable-renaming functions in this section.

29d 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 28d 30b ⊲

void rename(string var1, string var2);

Defines:rename, used in chunk 30c.

30a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 29a 30c ⊲

void term_schema::rename(string var1, string var2) {
    if (tag ≡ SV ∨ tag ≡ F ∨ tag ≡ D) return;
    if (tag ≡ V) { if (name ≡ var1) name = var2; return; }
    int size = fields.size();
    for (int i=0; i≠size; i++)
        fields[i]→rename(var1, var2);
}

Defines:
rename, used in chunk 30c.

Uses fields 11c, tag 10c, and term_schema 9.

Comment 2.1.59. This function renames one particular lambda variable in a term. This is used in term substitutions in the case when a free variable capture occurs.

30b 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 29d 31b ⊲

void renameLambdaVar(string var1, string var2);

Defines:renameLambdaVar, used in chunk 33d.

30c 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 30a 32a ⊲

void term_schema::renameLambdaVar(string var1, string var2) {
    freevars_computed = false;
    if (tag ≡ SV ∨ tag ≡ V ∨ tag ≡ F ∨ tag ≡ D) return;
    if (tag ≡ ABS) {
        if (fields[0]→name ≡ var1) {
            fields[0]→name = var2;
            fields[1]→rename(var1, var2);
        }
        // if lambda variables are distinct, this is not needed
        fields[1]→renameLambdaVar(var1, var2);
        return;
    }
    int size = fields.size();
    for (int i=0; i≠size; i++)
        fields[i]→renameLambdaVar(var1, var2);
}

Defines:
renameLambdaVar, used in chunk 33d.

Uses fields 11c, freevars_computed 24e, rename 29d 30a, tag 10c, and term_schema 9.

2.1.6 Term Substitution

Definition 2.1.60. [Llo03, pg. 55] A term substitution is a finite set of the form $\{x_1/t_1, \ldots, x_n/t_n\}$ where each $x_i$ is a variable, each $t_i$ is a term distinct from $x_i$, and $x_1, \ldots, x_n$ are distinct.

Comment 2.1.61. Each pair $x_i/t_i$ is represented as a structure as follows.

31a 〈term-schema::type defs 10b〉+≡ (9) ⊳ 10b

struct substitution {
    string first;
    term_schema ∗ second;
    substitution() { second = NULL; }
    substitution(string v, term_schema ∗ t) { first = v; second = t; }
};

Defines:
substitution, used in chunks 31, 32, 34, 44b, 46a, 51a, 54e, 55b, 60, 70, 72a, 74, and 75.

Uses term_schema 9.

Definition 2.1.62. [Llo03, pg. 56] Let $t$ be a term and $\theta = \{x_1/t_1, \ldots, x_n/t_n\}$ a term substitution. The instance $t\theta$ of $t$ by $\theta$ is the well-formed expression defined as follows.

1. If $t$ is a variable $x_i$ for some $i \in \{1, \ldots, n\}$, then $x_i\theta = t_i$.
   If $t$ is a variable $y$ distinct from all the $x_i$, then $y\theta = y$.

2. If $t$ is a constant $C$, then $C\theta = C$.

3. If $t$ is an abstraction $\lambda x_i.s$, for some $i \in \{1, \ldots, n\}$, then

   $(\lambda x_i.s)\theta = \lambda x_i.(s\{x_1/t_1, \ldots, x_{i-1}/t_{i-1}, x_{i+1}/t_{i+1}, \ldots, x_n/t_n\})$.

   If $t$ is an abstraction $\lambda y.s$, where $y$ is distinct from all the $x_i$,

   (a) if for some $i \in \{1, \ldots, n\}$, $y$ is free in $t_i$ and $x_i$ is free in $s$, then

       $(\lambda y.s)\theta = \lambda z.(s\{y/z\}\theta)$

       where $z$ is a new variable;

   (b) else $(\lambda y.s)\theta = \lambda y.(s\theta)$.

4. If $t$ is an application $(u\ v)$, then $(u\ v)\theta = (u\theta\ v\theta)$.

5. If $t$ is a tuple $(t_1, \ldots, t_n)$, then $(t_1, \ldots, t_n)\theta = (t_1\theta, \ldots, t_n\theta)$.
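The case analysis of Definition 2.1.62 can be sketched on a toy term type. Everything below is an illustration under stated assumptions, not the term_schema code: the names Term, Subst, mk, isFree, freshVar, substTerm and show are hypothetical, terms are immutable and rebuilt rather than edited in place, and the capture check of case 3(a) is applied uniformly after the binding for the bound variable has been dropped. The sketch needs C++17.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy term type (hypothetical; unrelated to the term_schema class).
struct Term;
using TermP = std::shared_ptr<const Term>;
struct Term {
    enum Kind { Var, Const, Abs, App, Tuple } kind;
    std::string name;         // Var/Const: its name; Abs: the bound variable
    std::vector<TermP> kids;  // Abs: {body}; App: {u, v}; Tuple: components
};
using Subst = std::map<std::string, TermP>;  // the set {x1/t1, ..., xn/tn}

TermP mk(Term::Kind k, std::string n, std::vector<TermP> ks = {}) {
    return std::make_shared<Term>(Term{k, std::move(n), std::move(ks)});
}

// Does variable x occur free in t?
bool isFree(const std::string& x, const TermP& t) {
    switch (t->kind) {
        case Term::Var:   return t->name == x;
        case Term::Const: return false;
        case Term::Abs:   return t->name != x && isFree(x, t->kids[0]);
        default:
            for (const TermP& k : t->kids) if (isFree(x, k)) return true;
            return false;
    }
}

// Assumption: names of the form pveN are not used by the input terms.
std::string freshVar() {
    static int counter = 0;
    return "pve" + std::to_string(counter++);
}

TermP substTerm(const TermP& t, const Subst& th) {
    switch (t->kind) {
        case Term::Var: {                          // case 1
            auto it = th.find(t->name);
            return it == th.end() ? t : it->second;
        }
        case Term::Const: return t;                // case 2
        case Term::Abs: {                          // case 3
            Subst inner = th;
            inner.erase(t->name);                  // the bound variable is never replaced
            const TermP& body = t->kids[0];
            bool capture = false;
            for (const auto& [x, ti] : inner)
                if (isFree(t->name, ti) && isFree(x, body)) capture = true;
            if (!capture)                          // case 3(b)
                return mk(Term::Abs, t->name, {substTerm(body, inner)});
            std::string z = freshVar();            // case 3(a): rename y to z first
            TermP renamed = substTerm(body, Subst{{t->name, mk(Term::Var, z)}});
            return mk(Term::Abs, z, {substTerm(renamed, inner)});
        }
        default: {                                 // cases 4 and 5
            std::vector<TermP> ks;
            for (const TermP& k : t->kids) ks.push_back(substTerm(k, th));
            return mk(t->kind, t->name, ks);
        }
    }
}

// Render a term using the backslash notation for lambda.
std::string show(const TermP& t) {
    switch (t->kind) {
        case Term::Var: case Term::Const: return t->name;
        case Term::Abs: return "\\" + t->name + "." + show(t->kids[0]);
        case Term::App: return "(" + show(t->kids[0]) + " " + show(t->kids[1]) + ")";
        default: {
            std::string s = "(";
            for (std::size_t i = 0; i < t->kids.size(); ++i)
                s += (i ? ", " : "") + show(t->kids[i]);
            return s + ")";
        }
    }
}
```

Applying {x/y} to \y.(x y) triggers case 3(a) and renames the bound variable, whereas applying {x/c} for a constant c leaves the binder alone.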

Comment 2.1.63. Term substitutions are performed by the function subst. There are two versions of it: one deals with singleton sets, the other with non-singleton sets. In both cases, the real work is done by the function subst2.

31b 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 30b 35 ⊲

void subst(vector<substitution> & subs);
void subst(substitution & sub);
void subst2(vector<substitution> & subs, vector<term_schema ∗> bv,
            term_schema ∗∗ pointer);

Defines:
subst, used in chunks 44b, 46a, 52c, 54e, 55b, 60, and 63d.
subst2, used in chunks 32a and 34.

Uses substitution 31a and term_schema 9.


Comment 2.1.64. A single traversal of the tree achieves the desired parallel-instantiation-of-variables effect.

Comment 2.1.65. Given t and θ, the function subst will handle the special case where t is a variable (and thus free) or a syntactical variable. All other cases are handled by subst2. Before calling subst2, we call labelStaticBoundVars to label the variables. The free values computed are safe for use here because they are read only once by subst2 and changes introduced by subst2 are all localized on the spots where free variables live in the term.

Comment 2.1.66. Pointers to terms in subs are all pointers to subterms in an existing structure that will be deleted after the term substitution. For that reason, these pointers can be safely reused once, but not more than that.

For the special case where the current term is a variable or a syntactical variable, we need to make the term replacement in place using replace.

32a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 30c 32b ⊲

void term_schema::subst(vector<substitution> & subs) {
    if (tag ≡ V ∨ tag ≡ SV) {
        int size = subs.size();
        for (int i=0; i≠size; i++)
            if (name ≡ subs[i].first) { this→replace(subs[i].second); return; }
        return;
    }
    labelStaticBoundVars();
    vector<term_schema ∗> bindingAbs;
    subst2(subs, bindingAbs, NULL);
}

Defines:
subst, used in chunks 44b, 46a, 52c, 54e, 55b, 60, and 63d.

Uses labelStaticBoundVars 25e 27a, replace 20a 20b, subst2 31b 32b, substitution 31a, tag 10c, and term_schema 9.

Comment 2.1.67. All the complications in Definition 2.1.62 are in the abstraction case. Operationally, checking all those conditions every time we encounter an abstraction is expensive. We can perform these checks only when strictly necessary by delaying them until just before we apply a substitution, that is, until we see a free variable in t that matches one of the variables in θ.

32b 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 32a 34 ⊲

void term_schema::subst2(vector<substitution> & subs,
                         vector<term_schema ∗> bindingAbs,
                         term_schema ∗∗ pointer) {
    if (tag ≡ SV) { 〈subst2::case of SV 33a〉 }
    if (tag ≡ V) { 〈subst2::case of V 33c〉 }
    if (tag ≡ F ∨ tag ≡ D) return;
    if (tag ≡ ABS) {
        bindingAbs.push_back(this);
        fields[1]→subst2(subs, bindingAbs, &fields[1]);
        return;
    }
    int size = fields.size();
    for (int i=0; i≠size; i++)
        fields[i]→subst2(subs, bindingAbs, &fields[i]);
}

Defines:subst2, used in chunks 32a and 34.

Uses fields 11c, substitution 31a, tag 10c, and term_schema 9.


Comment 2.1.68. Term substitution is not formally defined for syntactical variables. It should behave like a free variable (see Comment 2.1.70), except that we do not have to worry about free variable capture.

33a 〈subst2::case of SV 33a〉≡ (32b)

int size = subs.size();
for (int i=0; i≠size; i++)
    if (name ≡ subs[i].first) { 〈subst2::replace by ti 33b〉 return; }
return;

Comment 2.1.69. See the first part of Comment 2.1.66 for why we do what we do here. The parent pointer must exist because the case where it does not exist is handled by subst.

33b 〈subst2::replace by ti 33b〉≡ (33)

assert(pointer);
this→freememory();
if (subs[i].second→shared()) ∗pointer = subs[i].second→clone();
else ∗pointer = subs[i].second→reuse();

Uses clone 19a 19b, freememory 19a 19c, reuse 20e, and shared 20f.

Comment 2.1.70. We now look at the tag == V case. If the current term is a bound variable in t, then the first part of Definition 2.1.62 (3) applies and nothing changes. If the current term is a free variable in t and does not occur in θ, the second part of Definition 2.1.62 (1) applies and again nothing happens. If the current term is a free variable in t that matches an xi in θ, then the first part of Definition 2.1.62 (1) applies and we substitute the current term with ti. Before we do that, however, we check whether any free variable in ti is captured by any λ abstraction that encloses the current term. If yes, part (a) of Definition 2.1.62 (3) applies and we must rename the offending λ variable before replacing the current term with ti. Otherwise, part (b) of Definition 2.1.62 (3) applies and we can just go ahead and replace the current term with ti.

33c 〈subst2::case of V 33c〉≡ (32b)

if (isFree() ≡ false) return;
int size = subs.size();
for (int i=0; i≠size; i++) {
    if (name ≠ subs[i].first) continue;
    〈subst2::free variable captured 33d〉
    〈subst2::replace by ti 33b〉
    return;
}
return;

Uses isFree 26e.

33d 〈subst2::free variable captured 33d〉≡ (33c)

int k;
while (subs[i].second→captured(bindingAbs,k))
    bindingAbs[k]→renameLambdaVar(bindingAbs[k]→fields[0]→name, newVar());

Uses captured 28d 29a, fields 11c, newVar 29c, and renameLambdaVar 30b 30c.

Comment 2.1.71. The use of captured (hence the use of cached computed free variables) here warrants some caution. If subs[i].second does not remain unchanged throughout the term substitution process, errors can creep in. We now argue that subs[i].second stays unchanged throughout.


Term substitution is only used in two places in Escher. The first place is in the construction of body instances after successful pattern matching on the head of a statement. (See Comment 2.2.70.) The use of subst has no problem here because all the terms in θ are in the matched redex, whereas we only do surgery on the (cloned) body of a statement.

The other place term substitutions take place is in some of the internal simplification routines described in §2.2.1. Such uses only ever involve a single pair x/t. In all routines except beta reduction, t will remain unchanged because of the requirement that x does not occur free in t. In beta reduction (see Comment 2.2.15), it is easy to see that t remains unchanged since substitution is a one-off operation. That is, even if x occurs free in t, it will never be substituted. (Otherwise, we would have an infinite recursion.)

Comment 2.1.72. The following is the version of subst that handles singleton term substitutions. We make a vector out of the single pair and use subst2 to do the job.

34 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 32b 36a ⊲

void term_schema::subst(substitution & sub) {
    if (tag ≡ V ∨ tag ≡ SV) {
        if (name ≡ sub.first) this→replace(sub.second);
        return;
    }
    labelStaticBoundVars();
    vector<term_schema ∗> bindingAbs;
    vector<substitution> subs; subs.push_back(sub);
    subst2(subs, bindingAbs, NULL);
}

Defines:
subst, used in chunks 44b, 46a, 52c, 54e, 55b, 60, and 63d.

Uses labelStaticBoundVars 25e 27a, replace 20a 20b, subst2 31b 32b, substitution 31a, tag 10c, and term_schema 9.


Comment 2.1.73. A correct implementation of substitution should get the following right. Given the statement

(func z) = \x.\y.\x.(&& z (|| x y)).

and the query

: (func (f x y)),

Escher should produce the following

\pve0.\pve1.\pve0.(&& (f x y) (|| pve0 pve1)).

Notice that two free variables got captured along the way.

2.2 Term Rewriting

2.2.1 Internal Rewrite Routines

Comment 2.2.1. To capture precisely and completely the statement schemas in Escher’s booleans module, some of which have complicated side conditions on syntactical variables, we implement them as algorithms. These algorithms form the internal rewrite module of Escher, and they are called before any other program statements.

Comment 2.2.2. This next function implements the following equality statements:

$= \;:\; a \to a \to \Omega$

$(C\ x_1 \ldots x_n = C\ y_1 \ldots y_n) = (x_1 = y_1) \wedge \cdots \wedge (x_n = y_n)$

% where $C$ is a data constructor of arity $n$.

$(C\ x_1 \ldots x_n = D\ y_1 \ldots y_m) = \bot$

% where $C$ and $D$ are data constructors of arity $n$ and $m$ respectively, and $C \neq D$.

$(() = ()) = \top$

$((x_1, \ldots, x_n) = (y_1, \ldots, y_n)) = (x_1 = y_1) \wedge \cdots \wedge (x_n = y_n)$

% where $n = 2, 3, \ldots$.

35 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 31b 39a ⊲

bool simplifyEquality(term_schema ∗ parent, uint id);

Defines:simplifyEquality, used in chunks 37c and 61a.

Uses term_schema 9.


36a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 34 40a ⊲

bool term_schema::simplifyEquality(term_schema ∗ parent, uint id) {
    bool changed = false;
    term_schema ∗ ret = this;
    term_schema ∗ t1 = leftc()→rightc(), ∗ t2 = rightc();
    〈simplifyEquality::local variables 37d〉

    〈simplifyEquality::identical variables 36c〉
    〈simplifyEquality::irrelevant cases 36d〉
    〈simplifyEquality::case of products 37a〉
    〈simplifyEquality::case of applications 38a〉

simplifyEquality_cleanup:
    if (changed) { 〈simplify update pointers 36b〉 }
    return changed;
}

Defines:
simplifyEquality, used in chunks 37c and 61a.

Uses leftc 11d, rightc 11d, and term_schema 9.

Comment 2.2.3. The pointer ret initially points to the current term under consideration. If changed is true by the end of the operation, then the equality at the current term has been simplified and ret points to the result. Otherwise, the term stays the same as it was before the function was called. Assuming the term has been changed, we have two cases to consider. If the current term is the root term (parent == NULL), then we overwrite the current term with ret. Otherwise, we simply redirect the pointer parent->fields[id] to ret. Note that this code chunk is used in the other simplification routines as well.
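The two cases of the pointer update can be sketched with a toy node type. This is only an illustration under assumptions: Node, mkNode and installResult are hypothetical names, and std::unique_ptr ownership stands in for the reference counting and freememory calls of the actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy node type (hypothetical; the real class is term_schema).
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> fields;
};

std::unique_ptr<Node> mkNode(const std::string& name) {
    auto p = std::make_unique<Node>();
    p->name = name;
    return p;
}

// Install ret in place of the rewritten term `current`.
void installResult(Node& current, Node* parent, std::size_t id,
                   std::unique_ptr<Node> ret) {
    if (parent == nullptr) {
        // Root term: callers hold its address, so its contents are overwritten.
        current = std::move(*ret);
    } else {
        // Inner term: the old subterm is released and the parent's
        // pointer is redirected to the new term.
        parent->fields[id] = std::move(ret);
    }
}
```

In the root case the address of the term is preserved, which is why an in-place overwrite is used there instead of a pointer swap.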

36b 〈simplify update pointers 36b〉≡ (36a 40a 43 44b 49a 51a 54d 55a)

if (parent ≡ NULL) { this→replace(ret); ret→freememory(); }
else { parent→fields[id]→freememory(); parent→fields[id] = ret; }

Uses fields 11c, freememory 19a 19c, and replace 20a 20b.

36c 〈simplifyEquality::identical variables 36c〉≡ (36a)

if (t1→isVar() ∧ t2→isVar() ∧ t1→name ≡ t2→name) {
    changed = true;
    ret = new_term(D, "True");
    goto simplifyEquality_cleanup;
}

Uses isVar 11a and new_term 17b.

Comment 2.2.4. This simplification does not apply when one of the terms is a variable. We also do not handle equality of abstractions. That is done using statements in the booleans module.

36d 〈simplifyEquality::irrelevant cases 36d〉≡ (36a)

if (t1→isVar() ∨ t2→isVar()) return false;
if (t1→isAbs()) return false;

Uses isAbs 11a and isVar 11a.

Comment 2.2.5. We need to check that both t1 and t2 are products before proceeding because one of them can be a (nullary) function symbol that stands for another product. However, once we have done that, we only have to check the dimension of t1 because the type checker would have made sure that t2 has the same dimension. Given (x1, . . . , xn) = (y1, . . . , yn), we create a term of the form ((· · · ((x1 = y1) ∧ (x2 = y2)) · · · ) ∧ (xn = yn)).
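The left-nested shape of the term built for products can be sketched on strings. The function expandTupleEquality below is a hypothetical illustration that renders the result as text in Escher's prefix-free notation; it does not build term_schema nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Expand (x1,...,xn) == (y1,...,yn) into a left-nested conjunction of
// componentwise equalities, rendered as text (illustration only).
std::string expandTupleEquality(const std::vector<std::string>& xs,
                                const std::vector<std::string>& ys) {
    assert(xs.size() == ys.size());   // guaranteed by the type checker in Escher
    if (xs.empty()) return "True";    // () == () evaluates to True
    assert(xs.size() >= 2);           // tuples of dimension one are not handled
    auto eq = [&](std::size_t i) { return "(" + xs[i] + " == " + ys[i] + ")"; };
    std::string ret = "(" + eq(0) + " && " + eq(1) + ")";
    // Each further component wraps the conjunction built so far on the left.
    for (std::size_t i = 2; i < xs.size(); ++i)
        ret = "(" + ret + " && " + eq(i) + ")";
    return ret;
}
```

For the triples (a, b, c) and (p, q, r) this yields (((a == p) && (b == q)) && (c == r)), so the first two equalities end up deepest in the tree.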


37a 〈simplifyEquality::case of products 37a〉≡ (36a)

if (t1→isProd() ∧ t2→isProd()) {
    changed = true;
    uint t1_args = t1→fields.size();

    〈simplifyEquality::case of products::empty tuples 37b〉
    〈simplifyEquality::case of products::error handling 37c〉

    term_schema ∗ eq1 = newT2Args(F, "==");
    eq1→initT2Args(t1→fields[0]→reuse(), t2→fields[0]→reuse());
    term_schema ∗ eq2 = newT2Args(F, "==");
    eq2→initT2Args(t1→fields[1]→reuse(), t2→fields[1]→reuse());

    ret = newT2Args(F, "&&");
    ret→initT2Args(eq1, eq2);

    for (uint i=0; i≠t1_args-2; i++) {
        term_schema ∗ eqi = newT2Args(F, "==");
        eqi→initT2Args(t1→fields[i+2]→reuse(),
                       t2→fields[i+2]→reuse());
        term_schema ∗ temp = newT2Args(F, "&&");
        temp→initT2Args(ret, eqi);
        ret = temp;
    }
    goto simplifyEquality_cleanup;
}

Uses fields 11c, initT2Args 14b, isProd 11a, newT2Args 14a, reuse 20e, and term_schema 9.

Comment 2.2.6. The booleans module as it stands in [Llo03] does not handle the expression () = (). We will cater for that case here, which should of course evaluate to ⊤.

37b 〈simplifyEquality::case of products::empty tuples 37b〉≡ (37a)

if (t1_args ≡ 0) { ret = new_term(D, "True"); goto simplifyEquality_cleanup; }

Uses new_term 17b.

Comment 2.2.7. Besides the empty tuple, we handle all finite-length tuples of dimension at least two. It does not make a great deal of sense to have a tuple of dimension one.

37c 〈simplifyEquality::case of products::error handling 37c〉≡ (37a)

if (t1_args ≠ t2→fields.size() ∨ t1_args < 2) {
    setSelector(STDERR);
    ioprint("Error in simplifyEquality:products\n");
    t1→print(); ioprintln(); t2→print(); ioprintln();
}
assert(t1_args ≡ t2→fields.size() ∧ t1_args ≥ 2);

Uses fields 11c, ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, simplifyEquality 35 36a, and STDERR 246.

37d 〈simplifyEquality::local variables 37d〉≡ (36a)

int t1_arity = 0, t2_arity = 0;


38a 〈simplifyEquality::case of applications 38a〉≡ (36a)

〈simplifyEquality::check whether we have data constructors 38b〉
changed = true;

if (t1_arity ≡ 0 ∧ t2_arity ≡ 0) {
    if (t1→spinetip→isfloat ∧ t2→spinetip→isfloat) {
        if (t1→spinetip→numf ≡ t2→spinetip→numf)
            ret = new_term(D,"True");
        else ret = new_term(D, "False");
        goto simplifyEquality_cleanup;
    } else if (t1→spinetip→isint ∧ t2→spinetip→isint) {
        if (t1→spinetip→numi ≡ t2→spinetip→numi)
            ret = new_term(D,"True");
        else ret = new_term(D, "False");
        goto simplifyEquality_cleanup;
    }
    if (t1→spinetip→name ≡ t2→spinetip→name)
        ret = new_term(D,"True");
    else ret = new_term(D, "False");
    goto simplifyEquality_cleanup;
}
if (t1_arity ≠ t2_arity ∨ t1→spinetip→name ≠ t2→spinetip→name) {
    ret = new_term(D,"False"); goto simplifyEquality_cleanup;
}
ret = newT2Args(F, "==");
ret→initT2Args(t1→fields[1]→reuse(), t2→fields[1]→reuse());
t1_arity−−;
while (t1_arity ≠ 0) {
    t1 = t1→fields[0]; t2 = t2→fields[0];
    term_schema ∗ temp = newT2Args(F, "==");
    temp→initT2Args(t1→fields[1]→reuse(), t2→fields[1]→reuse());

    term_schema ∗ temp2 = newT2Args(F, "&&");
    temp2→initT2Args(temp, ret);
    ret = temp2;
    t1_arity−−;
}

Uses fields 11c, initT2Args 14b, isfloat 12b, isint 12b, new_term 17b, newT2Args 14a, numf 12b, numi 12b, reuse 20e, spinetip 21, and term_schema 9.

Comment 2.2.8. We need to check whether the leftmost symbol of both t1 and t2 is a data constructor. If we get past this point, t1 and t2 have the right form for comparison. The parameter spinetip should have been initialized by labelVariables by this stage.

38b 〈simplifyEquality::check whether we have data constructors 38b〉≡ (38a)

assert(t1→spinetip ∧ t2→spinetip);
t1_arity = t1→spinelength-1;
if (t1→spinetip→isNotD()) return false;
t2_arity = t2→spinelength-1;
if (t2→spinetip→isNotD()) return false;

Uses isNotD 11a, spinelength 21, and spinetip 21.

Comment 2.2.9. This function implements the different arithmetic operations.


39a 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 35 42a ⊲

bool simplifyArithmetic(term_schema ∗ parent, uint id);

Defines:simplifyArithmetic, used in chunk 61a.

Uses term_schema 9.

Comment 2.2.10. We currently support the following functions on integers. More can be added if necessary.

39b 〈term-schema::definitions 16b〉+≡ (9) ⊳ 16b 42b ⊲

#define ADD 1

#define SUB 2

#define MAX 3

#define MIN 4

#define MUL 5

#define DIV 6

#define MOD 7

Defines:
ADD, used in chunk 40a.
DIV, used in chunk 40a.
MAX, used in chunk 40a.
MIN, used in chunk 40a.
MOD, used in chunk 40a.
MUL, used in chunk 40a.
SUB, used in chunk 40a.


Comment 2.2.11. In the following, we also record the operator name op as an integer code in function, because comparing integers is a lot cheaper than comparing strings.

40a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 36a 43 ⊲

bool term_schema::simplifyArithmetic(term_schema ∗ parent, uint id) {
    if (¬(rightc()→isD() ∧ leftc()→rightc()→isD()))
        return false;
    string op = fields[0]→leftc()→name;
    int function;
    if (op ≡ "add") function = ADD;
    else if (op ≡ "sub") function = SUB;
    else if (op ≡ "max") function = MAX;
    else if (op ≡ "min") function = MIN;
    else if (op ≡ "mul") function = MUL;
    else if (op ≡ "div") function = DIV;
    else if (op ≡ "mod") function = MOD;
    else return false;

    term_schema ∗ t1, ∗ t2;
    t1 = leftc()→rightc();
    t2 = rightc();

    term_schema ∗ ret = NULL;
    if (function ≡ ADD) { 〈simplifyArithmetic::add 40b〉 }
    else if (function ≡ SUB) { 〈simplifyArithmetic::sub 41a〉 }
    else if (function ≡ MUL) { 〈simplifyArithmetic::mul 41b〉 }
    else if (function ≡ DIV) { 〈simplifyArithmetic::div 41c〉 }
    else if (function ≡ MAX) { 〈simplifyArithmetic::max 41d〉 }
    else if (function ≡ MIN) { 〈simplifyArithmetic::min 41e〉 }
    else if (function ≡ MOD) {
        assert(t1→isint ∧ t2→isint);
        ret = new_term_float(t1→numi % t2→numi);
    }
    〈simplify update pointers 36b〉
    return true;
}

Defines:
simplifyArithmetic, used in chunk 61a.

Uses ADD 39b, DIV 39b, fields 11c, isD 11a, isint 12b, leftc 11d, MAX 39b, MIN 39b, MOD 39b, MUL 39b, new_term_float 18a, numi 12b, rightc 11d, SUB 39b, and term_schema 9.

Comment 2.2.12. We overload the basic addition, subtraction, multiplication and division operations to act on numbers, be they integers or floating-point numbers. The definitions are fairly standard: when one of the arguments is a floating-point number, the result is a floating-point number. When both arguments are integers, the result is an integer, except when we are dividing two integers, in which case the result is a floating-point number.
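The promotion rules stated in this comment can be sketched with a tagged numeric type. The sketch follows the prose of the comment (integer division always yields a floating-point number), which is an assumption and not a transcription of chunk 41c; the names Num, asDouble, bothInt, arith, add, mul and divide are hypothetical. It needs C++17.

```cpp
#include <cassert>
#include <variant>

// Hypothetical stand-in for a numeric term with isint/isfloat flags.
using Num = std::variant<long, double>;

double asDouble(const Num& n) {
    return std::holds_alternative<long>(n)
               ? static_cast<double>(std::get<long>(n))
               : std::get<double>(n);
}

bool bothInt(const Num& a, const Num& b) {
    return std::holds_alternative<long>(a) && std::holds_alternative<long>(b);
}

// Integer result only when both arguments are integers; otherwise promote.
template <class Op>
Num arith(const Num& a, const Num& b, Op op) {
    if (bothInt(a, b)) return op(std::get<long>(a), std::get<long>(b));
    return op(asDouble(a), asDouble(b));
}

Num add(const Num& a, const Num& b) {
    return arith(a, b, [](auto x, auto y) { return x + y; });
}

Num mul(const Num& a, const Num& b) {
    return arith(a, b, [](auto x, auto y) { return x * y; });
}

// Division is the exception: the result is floating-point even for two integers.
Num divide(const Num& a, const Num& b) {
    return asDouble(a) / asDouble(b);
}
```

With these definitions, adding the integers 2 and 3 gives the integer 5, adding 2 and 0.5 gives the float 2.5, and dividing the integers 7 and 2 gives the float 3.5.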

40b 〈simplifyArithmetic::add 40b〉≡ (40a)

if (t1→isfloat ∧ t2→isfloat) ret = new_term_float(t1→numf + t2→numf);
else if (t1→isfloat ∧ t2→isint) ret = new_term_float(t1→numf + t2→numi);
else if (t1→isint ∧ t2→isfloat) ret = new_term_float(t1→numi + t2→numf);
else if (t1→isint ∧ t2→isint) ret = new_term_int(t1→numi + t2→numi);
else return false;

Uses isfloat 12b, isint 12b, new_term_float 18a, numf 12b, and numi 12b.


41a 〈simplifyArithmetic::sub 41a〉≡ (40a)

if (t1→isfloat ∧ t2→isfloat) ret = new_term_float(t1→numf - t2→numf);
else if (t1→isfloat ∧ t2→isint) ret = new_term_float(t1→numf - t2→numi);
else if (t1→isint ∧ t2→isfloat) ret = new_term_float(t1→numi - t2→numf);
else if (t1→isint ∧ t2→isint) ret = new_term_int(t1→numi - t2→numi);
else return false;


41b 〈simplifyArithmetic::mul 41b〉≡ (40a)

if (t1→isfloat ∧ t2→isfloat) ret = new_term_float(t1→numf ∗ t2→numf);
else if (t1→isfloat ∧ t2→isint) ret = new_term_float(t1→numf ∗ t2→numi);
else if (t1→isint ∧ t2→isfloat) ret = new_term_float(t1→numi ∗ t2→numf);
else if (t1→isint ∧ t2→isint) ret = new_term_int(t1→numi ∗ t2→numi);
else return false;


41c 〈simplifyArithmetic::div 41c〉≡ (40a)

if (t1→isfloat ∧ t2→isfloat) ret = new_term_float(t1→numf ÷ t2→numf);
else if (t1→isfloat ∧ t2→isint) ret = new_term_float(t1→numf ÷ t2→numi);
else if (t1→isint ∧ t2→isfloat) ret = new_term_float(t1→numi ÷ t2→numf);
else if (t1→isint ∧ t2→isint) {
    double res = (double)t1→numi ÷ (double)t2→numi;
    if (res ≡ fabs(res))
        ret = new_term_int(t1→numi ÷ t2→numi);
    else ret = new_term_float(res);
}
else return false;


41d 〈simplifyArithmetic::max 41d〉≡ (40a)

if (t1→isfloat ∧ t2→isfloat) {
    if (t1→numf ≥ t2→numf)
        ret = new_term_float(t1→numf);
    else ret = new_term_float(t2→numf);
}
else if (t1→isint ∧ t2→isint) {
    if (t1→numi ≥ t2→numi)
        ret = new_term_int(t1→numi);
    else ret = new_term_int(t2→numi);
}
else return false;


41e 〈simplifyArithmetic::min 41e〉≡ (40a)

if (t1→isfloat ∧ t2→isfloat) {
    if (t1→numf ≤ t2→numf)
        ret = new_term_float(t1→numf);
    else ret = new_term_float(t2→numf);
}
else if (t1→isint ∧ t2→isint) {
    if (t1→numi ≤ t2→numi)
        ret = new_term_int(t1→numi);
    else ret = new_term_int(t2→numi);
}
else return false;

Uses isfloat 12b, isint 12b, new_term_float 18a, numf 12b, and numi 12b.


Comment 2.2.13. This function implements the different inequalities. It has the same overall structure as simplifyArithmetic.

42a 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 39a 44a ⊲

bool simplifyInequalities(term_schema ∗ parent, uint id);

Defines:
simplifyInequalities, used in chunk 61a.

Uses term_schema 9.

Comment 2.2.14. These work on integers. They are really all we need; the relations > and ≥ (on integers) can be obtained by swapping the arguments to < and ≤.
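The remark about swapping arguments amounts to the following small sketch, in which lt, leq, gt and geq are hypothetical helper names on plain integers and not Escher functions.

```cpp
#include <cassert>

bool lt(long a, long b)  { return a < b; }
bool leq(long a, long b) { return a <= b; }

// a > b holds exactly when b < a; a >= b holds exactly when b <= a.
bool gt(long a, long b)  { return lt(b, a); }
bool geq(long a, long b) { return leq(b, a); }
```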

42b 〈term-schema::definitions 16b〉+≡ (9) ⊳ 39b

#define MYLT 1

#define MYLEQ 2

#define MYGT 3

#define MYGEQ 4

Defines:
MYGEQ, used in chunk 43.
MYGT, used in chunk 43.
MYLEQ, used in chunk 43.
MYLT, used in chunk 43.


43 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 40a 44b ⊲

bool term_schema::simplifyInequalities(term_schema ∗ parent, uint id) {
    if (¬(rightc()→isD() ∧ leftc()→rightc()→isD())) return false;
    string rel = leftc()→leftc()→name;
    int relation;
    if (rel ≡ "<") relation = MYLT;
    else if (rel ≡ "<=") relation = MYLEQ;
    else if (rel ≡ ">") relation = MYGT;
    else if (rel ≡ ">=") relation = MYGEQ;
    else return false;

    term_schema ∗ t1 = leftc()→rightc();
    term_schema ∗ t2 = rightc();

    if (t1→isfloat ≠ t2→isfloat ∨ t1→isint ≠ t2→isint) return false;

    term_schema ∗ ret = NULL;
    if (relation ≡ MYLT) {
        if (t1→isint ∧ t2→isint) {
            if (t1→numi < t2→numi) ret = new_term(D,"True");
            else ret = new_term(D,"False");
        } else if (t1→isfloat ∧ t2→isfloat) {
            if (t1→numf < t2→numf) ret = new_term(D,"True");
            else ret = new_term(D,"False");
        }
    } else if (relation ≡ MYLEQ) {
        if (t1→isint ∧ t2→isint) {
            if (t1→numi ≤ t2→numi) ret = new_term(D,"True");
            else ret = new_term(D,"False");
        } else if (t1→isfloat ∧ t2→isfloat) {
            if (t1→numf ≤ t2→numf) ret = new_term(D,"True");
            else ret = new_term(D,"False");
        }
    } else if (relation ≡ MYGT) {
        if (t1→isint ∧ t2→isint) {
            if (t1→numi > t2→numi) ret = new_term(D,"True");
            else ret = new_term(D,"False");
        } else if (t1→isfloat ∧ t2→isfloat) {
            if (t1→numf > t2→numf) ret = new_term(D,"True");
            else ret = new_term(D,"False");
        }
    } else if (relation ≡ MYGEQ) {
        if (t1→isint ∧ t2→isint) {
            if (t1→numi ≥ t2→numi) ret = new_term(D,"True");
            else ret = new_term(D,"False");
        } else if (t1→isfloat ∧ t2→isfloat) {
            if (t1→numf ≥ t2→numf) ret = new_term(D,"True");
            else ret = new_term(D,"False");
        }
    }
    〈simplify update pointers 36b〉
    return true;
}

Defines:
simplifyInequalities, used in chunk 61a.

Uses isD 11a, isfloat 12b, isint 12b, leftc 11d, MYGEQ 42b, MYGT 42b, MYLEQ 42b, MYLT 42b, new_term 17b, numf 12b, numi 12b, rightc 11d, and term_schema 9.

Comment 2.2.15. The β-reduction rule (λx.u t) = u{x/t} in the booleans module is not really a valid program statement. (The leftmost symbol on the LHS of the equation is not a function symbol.) It should therefore be thought of as a part of the internal simplification routine of Escher. This rule is also the first among a few we will encounter where sharing of nodes in the current term is not safe because of the appearance of term substitutions on the RHS of the equation. (See Comments 2.2.16, 2.2.33 and 2.2.44 for the other such rules. The existence and (heavy) use of such rules in Escher is one important reason I gave up on sharing of nodes. See Comment 2.1.29 for a more detailed discussion on the advantages and disadvantages of sharing.) In a typical program statement h = b without term substitutions in the body, rewriting a subterm of the current term that is α-equivalent to h (see §2.2.3 for the exact details) with b involves only the creation and destruction of terms and the redirection of pointers to terms. No modification to an atomic term embedded inside the current term takes place, which means sharing is always safe. This is no longer true when term substitutions appear in the body of statements.

44a 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 42a 45 ⊲

bool betaReduction(term_schema ∗ parent, uint id);

Defines:
betaReduction, used in chunk 61a.

Uses term_schema 9.

44b 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 43 46a ⊲

bool term_schema::betaReduction(term_schema ∗ parent, uint id) {
    if (leftc()→isAbs() ≡ false) return false;

    substitution bind(leftc()→fields[0]→name, rightc());
    leftc()→fields[1]→subst(bind);
    term_schema ∗ ret = leftc()→fields[1]→reuse();
    〈simplify update pointers 36b〉
    return true;
}

Defines:
betaReduction, used in chunk 61a.

Uses fields 11c, isAbs 11a, leftc 11d, reuse 20e, rightc 11d, subst 31b 32a 34, substitution 31a, and term_schema 9.

Comment 2.2.16. This function implements the following conjunction rule:

$u \wedge (x = t) \wedge v = u\{x/t\} \wedge (x = t) \wedge v\{x/t\}$. (2.1)

Here, t is not a variable and x is a variable free in u or v or both, but not free in t. The LHS of the equation is supposed to capture every term that has a subterm (x = t) embedded conjunctively (see Definition 2.2.18) inside it. All the variables in the rule are syntactical variables because a subterm that pattern matches with the LHS of the equation can occur inside a term that binds the variable x, in which case the standard term substitution routine will not give us what we want.

The condition that t is not a variable is important. If t is a free variable and we interpret (x = t) to stand for either (x = t) or (t = x), which I think is the correct interpretation, then loops can result from repeated application of the rule. In [Llo99], the statement

(t = x) = (x = t) where x is a variable, and t is not a variable

is used to capture the symmetry between x and t in the rule. In the current implementation, we do away with the swapping rule and implement the symmetry directly to gain better efficiency.


The condition that t is not a variable does not appear in [Llo03]; this suggests that the rule as it appears in the book is either loopy or incomplete, depending on how one interprets the rule.

There is another small problem with the rule. Note that I have been calling it a rule, not a statement. Why? In any instantiation of the rule, the variable x must occur free in at least two places, which means the instantiation cannot be a statement because of the no-repeated-free-variables condition. This error appears in every description of Escher before 22 Sep 2005, the day it was discovered. The use of this rule, among other things, affects the "run-time type checking is unnecessary" result (see Proposition 5.1.3 in [Llo03]). This is not as bad as it sounds; we only have to type-check every time we use the conjunction rule, not after every computation step.

Problem 2.2.17. What is the cost, in terms of expressiveness, of omitting this rule?

Definition 2.2.18. A term t is embedded conjunctively in t and, if t is embedded conjunctively in r (or s), then t is embedded conjunctively in r ∧ s.
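The inductive definition translates directly into a recursive check. The sketch below uses a hypothetical toy tree (Node, leaf, conj) in which a node labelled "&&" is a conjunction and every other node is treated as opaque; subterms are compared by identity, which is an assumption made for simplicity.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy term tree (hypothetical): "&&" marks a conjunction, anything else is opaque.
struct Node {
    std::string label;
    std::shared_ptr<Node> left, right;
};
using NodeP = std::shared_ptr<Node>;

NodeP leaf(const std::string& s) {
    return std::make_shared<Node>(Node{s, nullptr, nullptr});
}
NodeP conj(NodeP l, NodeP r) {
    return std::make_shared<Node>(Node{"&&", std::move(l), std::move(r)});
}

// True if t (compared by identity) is embedded conjunctively in s.
bool embeddedConjunctively(const NodeP& t, const NodeP& s) {
    if (t == s) return true;             // base case: t is embedded in t
    if (s->label != "&&") return false;  // only conjunctions are descended into
    return embeddedConjunctively(t, s->left) ||
           embeddedConjunctively(t, s->right);
}
```

An equation sitting under a disjunction or any other non-conjunctive node is therefore not embedded conjunctively in the enclosing term, even though it is a subterm of it.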

Comment 2.2.19. We could implement the rule completely using the following set of statements.

$((x = t) \wedge u) = ((x = t) \wedge u\{x/t\})$

$(u \wedge (x = t)) = ((x = t) \wedge u)$

where u does not have the form (y = s) for some terms y and s.

$(((x = t) \wedge u) \wedge v) = ((x = t) \wedge (u \wedge v))$

$(u \wedge ((x = t) \wedge v)) = ((x = t) \wedge (u \wedge v))$

where u does not have the form (y = s) for some terms y and s.

The last three statements can bring conjunctively embedded equations out to the front of the term, which can then be simplified using the first statement. A loop can occur if the side conditions in the second and fourth statements are not imposed.

Comment 2.2.20. Notice that we do not need the parent pointer for this particular rewriting.

45 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 44a 46b ⊲

bool simplifyConjunction();

Defines:
simplifyConjunction, used in chunk 61a.


Comment 2.2.21. In the following, we first check that the current term has the right form, then we find (using findEq (see Comment 2.2.23)) a variable-instantiating equation inside the current term. By a variable-instantiating equation I mean a (sub)term having the form (x = t) embedded conjunctively inside the current term which satisfies all the side conditions of Equation 2.1. (If there is more than one variable-instantiating equation, the leftmost is selected. Subsequent calls to findEq on the current term (rewritten using Equation 2.1) will find the remaining variable-instantiating equations in left-to-right order.) If no such equation exists, findEq returns a null pointer. We rename the x in (x = t) temporarily so that it does not get substituted with t by subst. Since we will not call freememory on the current term, we need to reuse the term p->fields[1] when creating bind to make sure the term substitution works as expected.

46a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 44b 47a ⊲

bool term_schema::simplifyConjunction() {
    term_schema ∗ p = findEq(this);
    if (p ≡ NULL) return false;

    term_schema ∗ varp = p→leftc()→rightc();
    string oldvar = varp→name; varp→name = newVar();

    substitution bind(oldvar, p→fields[1]→reuse());
    subst(bind); p→fields[1]→refcount−−;
    varp→name = oldvar;

    return true;
}

Defines:
simplifyConjunction, used in chunk 61a.

Uses fields 11c, findEq 46b 47a, leftc 11d, newVar 29c, refcount 20c, reuse 20e, rightc 11d, subst 31b 32a 34, substitution 31a, and term_schema 9.

Comment 2.2.22. The function findEq seeks a variable-instantiating equation inside the root term with the help of isEq.

46b 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 45 48c ⊲

term_schema ∗ findEq(term_schema ∗ root);
bool isEq(term_schema ∗ root);

Defines:
findEq, used in chunk 46a.
isEq, used in chunks 47a and 52–54.

Uses term_schema 9.


Comment 2.2.23. The function findEq assumes that the calling term is a conjunction of the form t1 ∧ t2. (See Definition 2.2.18.) If t1 is a variable-instantiating equation, we return. Otherwise, we recurse on t1 if it has the right (conjunctive) form. Then we do the same on t2. This gives us the left-to-right selection order.
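The left-to-right selection order can be sketched on a toy tree. The type T and the function findEqSketch are hypothetical, and the side conditions checked by isEq are deliberately replaced by a simple test on the node kind, so the sketch only illustrates the traversal order.

```cpp
#include <cassert>
#include <string>

// Toy term (hypothetical): an atom, an equation, or a conjunction l && r.
struct T {
    enum Kind { Atom, Eq, And } kind;
    std::string text;     // display text for atoms and equations
    const T* l = nullptr; // children, used only when kind == And
    const T* r = nullptr;
};

// Returns the leftmost equation embedded conjunctively in conjTerm, or
// nullptr if there is none. Assumes conjTerm->kind == T::And.
const T* findEqSketch(const T* conjTerm) {
    const T* t1 = conjTerm->l;
    if (t1->kind == T::Eq) return t1;
    if (t1->kind == T::And) {
        if (const T* p = findEqSketch(t1)) return p;
    }
    const T* t2 = conjTerm->r;
    if (t2->kind == T::Eq) return t2;
    if (t2->kind == T::And) {
        if (const T* p = findEqSketch(t2)) return p;
    }
    return nullptr;
}
```

The whole left conjunct is exhausted before the right conjunct is inspected, so an equation nested deep on the left is preferred over one sitting directly on the right.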

47a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 46a 47b ⊲

term_schema ∗ term_schema::findEq(term_schema ∗ root) {
    term_schema ∗ p = NULL;
    term_schema ∗ t1 = leftc()→rightc();
    if (t1→isEq(root)) return t1;
    if (t1→isFunc2Args("&&")) { p = t1→findEq(root); if (p) return p; }

    term_schema ∗ t2 = rightc();
    if (t2→isEq(root)) return t2;
    if (t2→isFunc2Args("&&")) { p = t2→findEq(root); if (p) return p; }
    return NULL;
}

Defines:
findEq, used in chunk 46a.

Uses isEq 46b 47b 52d 53b, isFunc2Args 13b 13c, leftc 11d, rightc 11d, and term_schema 9.

Problem 2.2.24. Getting findEq to run fast is an interesting search problem. The first question is whether left-to-right is the right search order. We can implement top-to-bottom search by calling isEq on t1 and t2 first, followed by recursion into each of them. Would that be better? Another question is whether we can improve search time by representing conjunctive terms differently, for example in a flat representation. Or, if we do stick with the tree representation, can we augment nodes (in the spirit of binary search algorithms) to make the search go faster?

Comment 2.2.25. This function checks whether the current term is a variable-instantiating term, that is, whether it has the form (t1 = t2), where t1 is a variable, t2 is a (non-variable) term such that t1 does not occur free in it, and t1 occurs free elsewhere in the term root. Note the symmetry between t1 and t2. We must check both because either one of them can turn out to be the variable that satisfies all the conditions.


bool term_schema::isEq(term_schema ∗ root) {
    if (isFunc2Args("==") ≡ false) return false;
    term_schema ∗ t1 = leftc()→rightc(), ∗ t2 = rightc();
    if (t1→isVar() ∧ t2→isVar() ≡ false) {
        string varname = t1→name;
        if (t2→occursFree(varname) ≡ false) {
            term_schema ∗ temp = t1; 〈isEq::common code 48a〉
            if (ret) return ret;
        }
    }
    if (t2→isVar() ∧ t1→isVar() ≡ false) {
        string varname = t2→name;
        if (t1→occursFree(varname) ≡ false) {
            term_schema ∗ temp = t2; 〈isEq::common code 48a〉
            if (ret) { 〈isEq::switch t1 and t2 48b〉 return ret; }
        }
    }
    return false;
}

Defines:
isEq, used in chunks 47a and 52–54.

Uses isFunc2Args 13b 13c, isVar 11a, leftc 11d, occursFree 27c 28a, rightc 11d, and term_schema 9.


Comment 2.2.26. We use occursFreeNaive for root here because the temporary variable renaming we do affects the correctness of the caching of computed free variables.

48a 〈isEq::common code 48a〉≡ (47b)

bool ret = false;
temp→name = newVar();
if (root→occursFreeNaive(varname)) ret = true;
temp→name = varname;

Uses newVar 29c and occursFreeNaive 27c 28b.
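The temporary-renaming trick in the common code can be illustrated on a toy term type. This is a minimal sketch under stated assumptions: Term, var, cst, app, occurs and occursElsewhere are hypothetical names, the toy terms have no binders (so "occurs" and "occurs free" coincide), and the fixed placeholder "__fresh" stands in for whatever newVar would return.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy term without binders.
struct Term {
    enum Kind { Var, Const, App } kind;
    std::string name;        // variable, constant, or function name
    std::vector<Term> kids;  // arguments, for App only
};

Term var(const std::string& n) { Term t; t.kind = Term::Var; t.name = n; return t; }
Term cst(const std::string& n) { Term t; t.kind = Term::Const; t.name = n; return t; }
Term app(const std::string& f, std::vector<Term> kids) {
    Term t; t.kind = Term::App; t.name = f; t.kids = std::move(kids); return t;
}

// Does the variable v occur anywhere in t?
bool occurs(const Term& t, const std::string& v) {
    if (t.kind == Term::Var) return t.name == v;
    for (const Term& k : t.kids)
        if (occurs(k, v)) return true;
    return false;
}

// Does the variable stored in `occurrence` (a node inside root) occur
// anywhere else in root? The occurrence is hidden by a temporary rename,
// root is searched for the old name, and the name is then restored.
bool occursElsewhere(const Term& root, Term& occurrence) {
    const std::string saved = occurrence.name;
    occurrence.name = "__fresh";  // assumed not to clash with any real name
    const bool found = occurs(root, saved);
    occurrence.name = saved;
    return found;
}
```

Because the node is renamed and restored in place, any cached set of free variables computed in between would be stale, which is one reading of why the source uses the uncached occursFreeNaive here.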

Comment 2.2.27. We need to swap t1 and t2 because procedures that call findEq expect the variable that satisfies all the conditions to be on the LHS of the equation.

48b 〈isEq::switch t1 and t2 48b〉≡ (47b 53b)

term_schema ∗ temp = t1; leftc()→fields[1] = t2; fields[1] = temp;

Uses fields 11c, leftc 11d, and term_schema 9.

Comment 2.2.28. Example execution of simplifyConjunction.

Query: ((&& y) ((&& ((== x) T1)) ((&& ((== T2) y)) x)))

Time = 1 Answer: ((&& y) ((&& ((== x) T1)) ((&& ((== T2) y)) T1)))

Time = 2 Answer: ((&& T2) ((&& ((== x) t1)) ((&& ((== y) T2)) T1)))

There are two variable-instantiating equations in the query. It is easy to get this wrong if one is not careful.

Comment 2.2.29. We next look at the implementation of the rules

u ∧ (∃x1. · · · ∃xn.v) = ∃x1. · · · ∃xn.(u ∧ v) (2.2)

(∃x1. · · · ∃xn.v) ∧ u = ∃x1. · · · ∃xn.(v ∧ u) (2.3)

Note that the convention on syntactic variables dictates that none of the variables xi can appear free in u. The two rules can be captured by repeated applications of the following two special cases of the rules

u ∧ (∃x.v) = ∃x.(u ∧ v) (2.4)

(∃x.v) ∧ u = ∃x.(v ∧ u) (2.5)

and these are what we will actually implement. We choose to implement these easier rules because checking that no xi occurs free in u would be an expensive exercise.

Interestingly, it is actually quite important to get the order of u and v right in the conjunction. For example, implementing

(∃x.v) ∧ u = ∃x.(u ∧ v)

instead of Statement 2.5 will seriously slow down the predicate permute.

48c 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 46b 50 ⊲

bool simplifyConjunction2(term_schema ∗ parent, uint id);

Defines:
    simplifyConjunction2, used in chunk 61a.
Uses term_schema 9.


49a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 47b 51a ⊲

bool term_schema::simplifyConjunction2(term_schema ∗ parent, uint id) {
    term_schema ∗ t1 = leftc()→rightc(), ∗ t2 = rightc();
    term_schema ∗ sigma, ∗ other;
    if (t2→isApp() ∧ t2→leftc()→isF("sigma")) {
        sigma = t2; other = t1;
    } else if (t1→isApp() ∧ t1→leftc()→isF("sigma")) {
        sigma = t1; other = t2;
    } else return false;

    string var = sigma→rightc()→fields[0]→name;
    if (other→occursFree(var)) return false;

    〈simplifyConjunction2::create body 49b〉
    〈simplify update pointers 36b〉
    return true;
}

Defines:
    simplifyConjunction2, used in chunk 61a.
Uses fields 11c, isApp 11a, isF 11a 11b, leftc 11d, occursFree 27c 28a, rightc 11d, and term_schema 9.

Comment 2.2.30. We could recycle ∃x but choose not to.

49b 〈simplifyConjunction2::create body 49b〉≡ (49a)

term_schema ∗ con = newT2Args(F, "&&");
if (sigma ≡ t2) {
    con→leftc()→insert(other→reuse());
    con→insert(sigma→rightc()→fields[1]→reuse());
} else {
    con→leftc()→insert(sigma→rightc()→fields[1]→reuse());
    con→insert(other→reuse());
}
term_schema ∗ abs = new_term(ABS);
abs→insert(new_term(V, var)); abs→insert(con);
term_schema ∗ ret = new_term(APP);
ret→insert(new_term(F, "sigma")); ret→insert(abs);

Uses fields 11c, insert 11d, leftc 11d, new_term 17b, newT2Args 14a, reuse 20e, rightc 11d, and term_schema 9.


Comment 2.2.31. Example execution of simplifyConjunction2.

Query: ((&& (sigma \x1.(sigma \x2.v))) u)

Time = 1 Answer: (sigma \x1.((&& u) (sigma \x2.v)))

Time = 2 Answer: (sigma \x1.(sigma \x2.((&& u) v)))

Comment 2.2.32. The use of Statements 2.4 and 2.5 introduces a peculiar behaviour into Escher in that the same query, when asked using two different variable names, can result in two different computation sequences. To illustrate this, consider the following statement:

f : Int × (Int → Ω) → Ω

f(x, s) = (x ≤ 8) ∧ ∃z.(z ∈ s ∧ (prime z)).

Now, if we ask Escher to compute the value of f(y, {2, 3}), we get

f(y, {2, 3}) = (y ≤ 8) ∧ ∃z.(z ∈ {2, 3} ∧ (prime z))

= ∃z.((y ≤ 8) ∧ z ∈ {2, 3} ∧ (prime z))

= . . .

= ∃z.((y ≤ 8) ∧ (z = 2))

= (y ≤ 8).

However, if we ask Escher to compute the value of f(z, {2, 3}), we will get

f(z, {2, 3}) = (z ≤ 8) ∧ ∃z.(z ∈ {2, 3} ∧ (prime z))

= . . .

= (z ≤ 8) ∧ ∃z.(z = 2)

= (z ≤ 8) ∧ ⊤

= (z ≤ 8).

The two computation sequences are different from the second step onwards. The second sequence takes one step longer than the first. Just about the only comforting thing is that the end results are equivalent. The questions then are

1. Should we retain Statements 2.4 and 2.5 in the booleans module? Judging from the (limited set of) test programs I have, there is no actual need for the two statements. In general, I believe they make certain computations go faster, although one can also show instances where they actually make things go slightly slower.

2. If we retain the two statements, should we modify the convention so that a variable renaming is done to make them applicable in cases where they cannot be applied?
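The dependence on variable names can be reproduced on a toy term language. The following is a minimal sketch, not the Alkemy implementation: T, atom, conj, ex, occursFree, pullExists and show are hypothetical names, atoms double as variables, and pullExists performs one step of Statement 2.4 or 2.5, refusing when the bound variable occurs free in the other conjunct.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct T;
typedef std::shared_ptr<T> P;

// Hypothetical toy term: an atom, a conjunction a ∧ b, or ∃name.a
struct T {
    enum Kind { Atom, And, Exists } kind;
    std::string name;  // atom name, or the bound variable of Exists
    P a, b;
};

P mk(T::Kind k, const std::string& n, P a, P b) {
    P t(new T());
    t->kind = k; t->name = n; t->a = a; t->b = b;
    return t;
}
P atom(const std::string& n) { return mk(T::Atom, n, P(), P()); }
P conj(P a, P b) { return mk(T::And, "", a, b); }
P ex(const std::string& x, P body) { return mk(T::Exists, x, body, P()); }

bool occursFree(const P& t, const std::string& x) {
    if (t->kind == T::Atom) return t->name == x;
    if (t->kind == T::And) return occursFree(t->a, x) || occursFree(t->b, x);
    return t->name != x && occursFree(t->a, x);  // Exists binds its own variable
}

// One step of u ∧ (∃x.v) = ∃x.(u ∧ v) or (∃x.v) ∧ u = ∃x.(v ∧ u).
// Returns a null pointer when the rule does not apply.
P pullExists(const P& t) {
    if (t->kind != T::And) return P();
    P sigma, other;
    if (t->b->kind == T::Exists) { sigma = t->b; other = t->a; }
    else if (t->a->kind == T::Exists) { sigma = t->a; other = t->b; }
    else return P();
    if (occursFree(other, sigma->name)) return P();  // blocked: x is free in u
    return sigma == t->b ? ex(sigma->name, conj(other, sigma->a))
                         : ex(sigma->name, conj(sigma->a, other));
}

std::string show(const P& t) {
    if (t->kind == T::Atom) return t->name;
    if (t->kind == T::And) return "(" + show(t->a) + " & " + show(t->b) + ")";
    return "exists " + t->name + "." + show(t->a);
}
```

With y in place of the free variable the quantifier is pulled outwards; with z the free-variable check blocks the rule, so the toy term is left unchanged and, in Escher, the computation has to proceed by other statements.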

Comment 2.2.33. This function implements the following existential rules:

∃x1. · · · ∃xn.⊤ = ⊤ (2.6)

∃x1. · · · ∃xn.⊥ = ⊥ (2.7)

∃x1. · · · ∃xn.(x ∧ (x1 = u) ∧ y) = ∃x2. · · · ∃xn.(x{x1/u} ∧ ⊤ ∧ y{x1/u}). (2.8)

The other rules are implemented in the Booleans module. I suppose they can be implemented here if we really need to maximise efficiency at the price of complicated code.

50 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 48c 52d ⊲

bool simplifyExistential(term_schema ∗ parent, uint id);

Defines:simplifyExistential, used in chunk 61a.

Uses term_schema 9.


Comment 2.2.34. We first check whether the current term starts with ∃x1 · · · ∃xn. We then move to the subterm after ∃x1 · · · ∃xn and perform surgery on it if possible.

51a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 49a 53a ⊲

bool term_schema::simplifyExistential(term_schema ∗ parent, uint id) {
    if (fields[0]→isF("sigma") ≡ false) return false;

    term_schema ∗ ret = NULL; term_schema ∗ p = NULL;
    string var = rightc()→fields[0]→name;
    substitution bind;

    〈simplifyExistential::move to the body 51b〉

    〈simplifyExistential::case one and two 51c〉
    〈simplifyExistential::tricky case 52a〉

simplifyExistential_cleanup:
    〈simplify update pointers 36b〉
    return true;
}

Defines:
    simplifyExistential, used in chunk 61a.
Uses fields 11c, isF 11a 11b, rightc 11d, substitution 31a, and term_schema 9.

Comment 2.2.35. The following allows us to move past the remaining ∃xi to get to the body of the term.

51b 〈simplifyExistential::move to the body 51b〉≡ (51a)

term_schema ∗ body = fields[1]→fields[1]; // term_schema * bdparent=fields[1];
while (body→isApp() ∧ body→leftc()→isF("sigma")) {
    body = body→rightc()→fields[1]; // bdparent = body->rightc();
}

Uses fields 11c, isApp 11a, isF 11a 11b, leftc 11d, rightc 11d, and term_schema 9.

Comment 2.2.36. This handles Statements 2.6 and 2.7. Completeness of specification is not an issue here. The two statements can be captured by repeated application of the statements

∃x.⊤ = ⊤ and ∃x.⊥ = ⊥.

Making them part of the internal simplification routine gives us efficiency advantages.

51c 〈simplifyExistential::case one and two 51c〉≡ (51a)

if (body→isD() ∧ (body→name ≡ "True" ∨ body→name ≡ "False")) {
    ret = body→reuse(); goto simplifyExistential_cleanup;
}

Uses isD 11a and reuse 20e.

Comment 2.2.37. We next discuss Statement 2.8. The pattern in the head of Statement 2.8 should be interpreted in the same way as the corresponding pattern in the conjunction rule described in Comment 2.2.16. Note that the statement is slightly different from that given in [Llo03], which takes the following form:

∃x1. · · · ∃xn.(x ∧ (xi = u) ∧ y) = ∃x1. · · · ∃xi−1.∃xi+1. · · · ∃xn.(x{xi/u} ∧ y{xi/u}).

First, restricting xi to be x1 as we did incurs a small computational cost in that we need to move to the subterm starting with ∃xi during pattern matching to apply Statement 2.8. In return, we can write simpler code. The second change is that instead of dropping the term (x1 = u), we put a ⊤ in its place. The two expressions are equivalent, of course. The advantage is the same: we can write simpler code. Another advantage of this latter change is that, unlike the original statement, we do end up with a natural special case. (See Comment 2.2.38.)

52a 〈simplifyExistential::tricky case 52a〉≡ (51a)

〈simplifyExistential::tricky case::special case 52b〉〈simplifyExistential::tricky case::general case 52c〉

Comment 2.2.38. A special case of Statement 2.8 is the following:

∃x1. · · · ∃xn.(x1 = u) = ∃x2. · · · ∃xn.⊤. (2.9)

The body of the statement can be further simplified to ⊤, of course.

52b 〈simplifyExistential::tricky case::special case 52b〉≡ (52a)

if (body→isEq(var)) {
    ret = new_term(D, "True");
    goto simplifyExistential_cleanup;
}

Uses isEq 46b 47b 52d 53b and new_term 17b.

Comment 2.2.39. In the general case, we first check that the body has the overall form t1 ∧ t2. Then we attempt to find in the body an equation that instantiates the first quantified variable and replace it with ⊤. (Both steps are performed at the same time by replaceEq.) If that operation is successful, we perform term substitutions on the body and then get rid of the first quantification.

52c 〈simplifyExistential::tricky case::general case 52c〉≡ (52a)

if (body→isFunc2Args("&&") ≡ false) return false;
p = body→replaceEq(var); if (p ≡ NULL) return false;

bind.first = p→leftc()→rightc()→name; bind.second = p→fields[1];
body→subst(bind);
p→freememory();

ret = rightc()→fields[1]→reuse();

Uses fields 11c, freememory 19a 19c, isFunc2Args 13b 13c, leftc 11d, replaceEq 52d 53a, reuse 20e, rightc 11d, and subst 31b 32a 34.

Comment 2.2.40. The function replaceEq finds a subterm of the form (x = t) embedded conjunctively inside a term (with the help of isEq), replaces it with ⊤ and then returns a pointer to (x = t).

52d 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 50 54a ⊲

term_schema ∗ replaceEq(string var);
bool isEq(string var);

Defines:
    isEq, used in chunks 47a and 52–54.
    replaceEq, used in chunks 52c and 55b.

Uses term_schema 9.

Comment 2.2.41. We assume that the calling term is a conjunction of the form t1 ∧ t2. If t1 is a variable-instantiating equation, we return. Otherwise, we recurse on t1 if it has the right (conjunctive) form. Then we do the same on t2. (See also Comment 2.2.23.)


53a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 51a 53b ⊲

term_schema ∗ term_schema::replaceEq(string var) {
    term_schema ∗ p = NULL;
    term_schema ∗ t1 = leftc()→rightc();
    if (t1→isEq(var)) {
        leftc()→fields[1] = new_term(D, "True"); return t1;
    }
    if (t1→isFunc2Args("&&")) { p = t1→replaceEq(var); if (p) return p; }

    term_schema ∗ t2 = rightc();
    if (t2→isEq(var)) { fields[1] = new_term(D, "True"); return t2; }
    if (t2→isFunc2Args("&&")) { p = t2→replaceEq(var); if (p) return p; }
    return NULL;
}

Defines:
    replaceEq, used in chunks 52c and 55b.
Uses fields 11c, isEq 46b 47b 52d 53b, isFunc2Args 13b 13c, leftc 11d, new_term 17b, rightc 11d, and term_schema 9.

Comment 2.2.42. This function checks whether the current term has the form (x = t) where x is the input variable and t is a term such that x does not occur free in t. If the current term has the form (t = x) where x does not occur free in t, we need to swap the two arguments because procedures that call replaceEq expect the variable x to be on the LHS of the equation.

53b 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 53a 54b ⊲

bool term_schema::isEq(string x) {
    if (isFunc2Args("==") ≡ false) return false;
    term_schema ∗ t1 = leftc()→rightc(), ∗ t2 = rightc();

    if (t1→isVar(x) ∧ t2→occursFree(x) ≡ false) return true;
    if (t2→isVar(x) ∧ t1→occursFree(x) ≡ false) {
        〈isEq::switch t1 and t2 48b〉
        return true;
    }
    return false;
}

Defines:
    isEq, used in chunks 47a and 52–54.
Uses isFunc2Args 13b 13c, isVar 11a, leftc 11d, occursFree 27c 28a, rightc 11d, and term_schema 9.

Comment 2.2.43. Example execution of simplifyExistential.

Query: (sigma \x3.(sigma \x2.(sigma \x1.((== x3) t1))))

Time = 1 Answer: True

Query: (sigma \x3.(sigma \x.(sigma \y.(&& y (&& (== x T1) (&& (== t2 y) x))))))

Time = 1 Answer: (sigma \x3.(sigma \y.(&& y (&& True (&& (== t2 y) T1)))))

Time = 2 Answer: (sigma \x3.(&& t2 (&& True (&& True T1))))

Time = 3 Answer: (sigma \x3.(&& t2 (&& True T1)))

Time = 4 Answer: (sigma \x3.(&& t2 T1))
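The first two steps of the second example can be replayed on a flattened body. This is a minimal sketch under strong simplifying assumptions, not the term_schema code: Conjunct and eliminate are hypothetical names, the body is a flat list of conjuncts rather than a tree, and every conjunct is either an atom or an equation between two names, so substitution is plain name replacement.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical flat conjunct: an equation lhs = rhs, or an atom stored in lhs.
struct Conjunct {
    bool isEq;
    std::string lhs, rhs;
};

// Toy version of Statement 2.8 for the quantified variable x: find an
// equation x = u (or u = x), put True in its place, and replace x by u in
// the remaining conjuncts. Returns false if no such equation exists.
bool eliminate(const std::string& x, std::vector<Conjunct>& body) {
    std::string u;
    bool found = false;
    for (Conjunct& c : body) {
        if (!c.isEq) continue;
        if (c.lhs == x && c.rhs != x) u = c.rhs;
        else if (c.rhs == x && c.lhs != x) u = c.lhs;  // swapped form (u = x)
        else continue;
        c = Conjunct{false, "True", ""};  // the equation becomes True
        found = true;
        break;
    }
    if (!found) return false;
    for (Conjunct& c : body) {  // apply the substitution {x/u}
        if (c.lhs == x) c.lhs = u;
        if (c.isEq && c.rhs == x) c.rhs = u;
    }
    return true;
}
```

Starting from the conjuncts y, (x = T1), (t2 = y), x, eliminating x and then y leaves t2, True, True, T1, which matches the bodies shown at Time = 1 and Time = 2 above; eliminating x3 fails because no equation mentions it.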

Comment 2.2.44. This function implements the following universal rules:

∀x1. · · · ∀xn.(⊥ → u) = ⊤ (2.10)

∀x1. · · · ∀xn.(x ∧ (x1 = u) ∧ y → v) = ∀x2. · · · ∀xn.((x ∧ ⊤ ∧ y → v){x1/u}). (2.11)

Statement 2.11 is equivalent to the following rule given in [Llo03]:

∀x1. · · · ∀xn.(x ∧ (xi = u) ∧ y → v) = ∀x1. · · · ∀xi−1.∀xi+1. · · · ∀xn.((x ∧ y → v){xi/u}).


54a 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 52d 56b ⊲

bool simplifyUniversal(term_schema ∗ parent, uint id);

Defines:simplifyUniversal, used in chunk 61a.

Uses term_schema 9.

54b 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 53b 57a ⊲

bool term_schema::simplifyUniversal(term_schema ∗ parent, uint id) {
    if (leftc()→isF("pi") ≡ false) return false;

    string var = rightc()→fields[0]→name;
    〈simplifyUniversal::check the form of body 54c〉
    〈simplifyUniversal::true statement 54d〉
    〈simplifyUniversal::special case 54e〉
    〈simplifyUniversal::general case 55b〉
}

Defines:
    simplifyUniversal, used in chunk 61a.
Uses fields 11c, isF 11a 11b, leftc 11d, rightc 11d, and term_schema 9.

Comment 2.2.45. We move past the remaining ∀s to get to the body and check whether it has the form t1 → t2. If so, we move to t1.

54c 〈simplifyUniversal::check the form of body 54c〉≡ (54b)

term_schema ∗ body = rightc()→fields[1];
while (body→isApp() ∧ body→leftc()→isF("pi"))
    body = body→rightc()→fields[1];
if (body→isFunc2Args("implies") ≡ false) return false;
term_schema ∗ t1 = body→leftc()→rightc();

Uses fields 11c, isApp 11a, isF 11a 11b, isFunc2Args 13b 13c, leftc 11d, rightc 11d, and term_schema 9.

Comment 2.2.46. This code chunk implements Statement 2.10.

54d 〈simplifyUniversal::true statement 54d〉≡ (54b)

if (t1→isD("False")) {
    term_schema ∗ ret = new_term(D, "True");
    〈simplify update pointers 36b〉
    return true;
}

Uses isD 11a, new_term 17b, and term_schema 9.

Comment 2.2.47. A special case of Statement 2.11 is the following:

∀x1. · · · ∀xn.((x1 = u) → v) = ∀x2. · · · ∀xn.(⊤ → v){x1/u} = ∀x2. · · · ∀xn.v{x1/u}.

54e 〈simplifyUniversal::special case 54e〉≡ (54b)

if (t1→isEq(var)) {
    term_schema ∗ t2 = body→rightc();
    substitution bind(t1→leftc()→rightc()→name, t1→rightc());
    t2→subst(bind);
    body→replace(t2→reuse());
    t2→freememory();
    〈simplifyUniversal::change end game 55a〉
}

Uses freememory 19a 19c, isEq 46b 47b 52d 53b, leftc 11d, replace 20a 20b, reuse 20e, rightc 11d, subst 31b 32a 34, substitution 31a, and term_schema 9.


Comment 2.2.48. After changing the body, we remove the quantifier of x1 and return.

55a 〈simplifyUniversal::change end game 55a〉≡ (54e 55b)

term_schema ∗ ret = rightc()→fields[1]→reuse();
〈simplify update pointers 36b〉
return true;

Uses fields 11c, reuse 20e, rightc 11d, and term_schema 9.

Comment 2.2.49. We first check whether the LHS of → has the form t3 ∧ t4. If so, we seek to find an equation instantiating the first quantified variable and replace it with ⊤. (This is again done using replaceEq.) Then we make the necessary term substitutions and return.

55b 〈simplifyUniversal::general case 55b〉≡ (54b)

if (t1→isFunc2Args("&&") ≡ false) return false;
term_schema ∗ p = t1→replaceEq(var); if (p ≡ NULL) return false;

substitution bind(p→leftc()→rightc()→name, p→fields[1]);
body→subst(bind);
p→freememory();

〈simplifyUniversal::change end game 55a〉

Uses fields 11c, freememory 19a 19c, isFunc2Args 13b 13c, leftc 11d, replaceEq 52d 53a, rightc 11d, subst 31b 32a 34, substitution 31a, and term_schema 9.

Comment 2.2.50. Example execution of simplifyUniversal.

Query: (pi \x2.(pi \x1.(pi \x3.((implies ((== x1) t1)) ((&& x1) x1)))))

Time = 1 Answer: (pi \x2.(pi \x3.((&& t1) t1)))

Query: (pi \x3.(pi \x1.(pi \x2.((implies ((&& ((== True) x2))

((&& ((== x1) True)) ((&& x1) x2)))) t1))))

Time = 1 Answer: (pi \x3.(pi \x2.((implies ((&& ((== True) x2))

((&& True) ((&& True) x2)))) t1)))

Time = 2 Answer: (pi \x3.((implies ((&& True) ((&& True) (&& True True)))) t1))

Time = 3 Answer: (pi \x3.((implies ((&& True) ((&& True) True))) t1))

Time = 4 Answer: (pi \x3.((implies ((&& True) True)) t1))

Time = 5 Answer: (pi \x3.((implies True) t1))

Time = 6 Answer: (pi \x3.t1)

2.2.2 Computing and Reducing Candidate Redexes

Comment 2.2.51. We now describe the function reduce that dynamically computes the candidate redexes inside a term (in the leftmost outermost order) and tries to reduce them.

Definition 2.2.52. A redex of a term t is an occurrence of a subterm of t that is α-equivalent to an instance of the head of a statement.

Comment 2.2.53. Informally, given a term t, every term s represented by a subtree of the syntax tree representing t, with the exception of the variable directly following a λ, is a subterm of t. The path expression leading from the root of the syntax tree representing t to the root of the syntax tree representing s is called the occurrence of s. For exact formal definitions of these concepts, see [Llo03, pp. 46].

Comment 2.2.54. There is an easy way to count the number of subterms in a term t. A token is either a left bracket '(', a variable, a constant, or an expression of the form λx for some variable x. The number of subterms in a term t is simply the number of tokens in (the string representation) of t. For example, the term ((f (1, (2, 3), 4)) λx.(g x)) has 13 subterms.
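The token-counting rule is easy to check mechanically. The following is a minimal sketch, assuming an ASCII rendering in which λx is written \x (as in the example executions above) and variables and constants are alphanumeric; countSubterms is a hypothetical helper, not part of Alkemy.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Count the tokens of a term string: each '(' is a token, each alphanumeric
// run (variable or constant) is a token, and a binder "\x" is a single token.
int countSubterms(const std::string& s) {
    int count = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c == '(') {
            ++count;
            ++i;
        } else if (c == '\\' || std::isalnum(c)) {
            if (c == '\\') ++i;  // the backslash and the bound name form one token
            while (i < s.size() && std::isalnum(static_cast<unsigned char>(s[i]))) ++i;
            ++count;
        } else {
            ++i;  // ')', ',', '.' and spaces are not tokens
        }
    }
    return count;
}
```

Applied to the string "((f (1, (2, 3), 4)) \x.(g x))" the function counts five left brackets, seven variables and constants, and one binder, giving the 13 subterms stated in the comment.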


Comment 2.2.55. There are obviously many subterms. For redex testing, it is important that we rule out as many of these as possible up front. The following result is a start.

Proposition 2.2.56. Let t be a term. A subterm r of t cannot be a redex if any one of the following is true:

1. r is a variable;

2. r = λx.t for some variable x and term t;

3. r = D t1 . . . tn, n ≥ 0, where D is a data constructor of arity m ≥ n, and each ti is a term;

4. r = (t1, . . . , tn) for some n ≥ 0.

Proof. Consider any statement h = b in the program. By definition, h has the form f t1 . . . tn, n ≥ 0 for some function f. In each of the cases above, r ≠ hθ for any θ and therefore r cannot be a redex.

56a 〈cannot possibly be a redex 56a〉≡ (59a) 57b ⊲

if (tag ≡ V ∨ tag ≡ D) return false;
if (tag ≡ ABS ∨ tag ≡ PROD ∨ isData()) goto not_a_redex;

Uses isData 56b 57a and tag 10c.

Comment 2.2.57. This function checks whether the current term has the form D t1 . . . tn, n ≥ 1, where D is a data constructor of arity m ≥ n and each ti is a term.

56b 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 54a 57c ⊲

bool isData();

Defines:isData, used in chunk 56a.

Comment 2.2.58. When we see a term t = D t1 . . . tn, n ≥ 1, where D is a data constructor of arity n and each ti is a term, we can immediately deduce that any prefix of t cannot be a redex. The variable is_data is used to store this information.

56c 〈term-schema parts 10c〉+≡ (9) ⊳ 26a 57d ⊲

bool is_data;

Defines:is_data, used in chunks 56 and 57a.

56d 〈term-schema initializations 12c〉+≡ (12a) ⊳ 26b 57e ⊲

is_data = false;

Uses is_data 56c.

56e 〈term-schema clone parts 12d〉+≡ (19b) ⊳ 26c 58a ⊲

ret→is_data = is_data;

Uses is_data 56c.

Comment 2.2.59. We can probably sometimes recycle t->is_data here, but decided to always use the safe false value instead.

56f 〈term-schema replace parts 12e〉+≡ (20b) ⊳ 26d 58b ⊲

is_data = false;

Uses is_data 56c.


Comment 2.2.60. If the current term is a data term, then the left subterm of the current term is also a data term.

57a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 54b 58c ⊲

bool term_schema::isData() {
    if (tag ≠ APP) return false;
    if (is_data) { fields[0]→is_data = true; return true; }
    if (¬spinetip) {
        setSelector(STDERR);
        ioprint("term = "); print(); ioprintln();
    }
    assert(spinetip);
    if (spinetip→isD()) {
        is_data = true; fields[0]→is_data = true; return true;
    }
    return false;
}

Defines:
    isData, used in chunk 56a.
Uses fields 11c, ioprint 246 247a, ioprintln 246 247a, is_data 56c, isD 11a, setSelector 246 247a, spinetip 21, STDERR 246, tag 10c, and term_schema 9.

Comment 2.2.61. Proposition 2.2.56 allows us to focus on terms of the form (f t1 . . . tn), n ≥ 0, in finding redexes. What else do we know that can be used to rule out as potential redexes subterms of this form?

Given a function symbol f, we define the effective arity of f to be the number of argument(s) f is applied to in the head of any statement in the program. Clearly, given a term t = (f t1 . . . tn), n ≥ 0, if n is not equal to the effective arity of f, then t cannot possibly be a redex.

57b 〈cannot possibly be a redex 56a〉+≡ (59a) ⊳ 56a

if (tag ≡ F ∧ getFuncEArity(name) ≠ 0) return false;
if (isFuncNotRightArgs()) goto not_a_redex;

Uses getFuncEArity 98c, isFuncNotRightArgs 57c 58c, and tag 10c.

Comment 2.2.62. This function checks whether the current term, which is an application node, is a function applied to the right number of arguments (its effective arity). The number of arguments can be more than the effective arity of the leftmost function symbol. The term (((remove s) t) x) is one such example.

57c 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 56b 58e ⊲

bool isFuncNotRightArgs();

Defines:isFuncNotRightArgs, used in chunk 57b.

Comment 2.2.63. This is used to capture the fact that every prefix of a function application term that does not have enough arguments will also not have enough arguments.

57d 〈term-schema parts 10c〉+≡ (9) ⊳ 56c 65c ⊲

bool notEnoughArgs;

Defines:notEnoughArgs, used in chunks 57–59.

57e 〈term-schema initializations 12c〉+≡ (12a) ⊳ 56d

notEnoughArgs = false;

Uses notEnoughArgs 57d.


58a 〈term-schema clone parts 12d〉+≡ (19b) ⊳ 56e

ret→notEnoughArgs = notEnoughArgs;


Comment 2.2.64. We can probably safely recycle t->notEnoughArgs here.

58b 〈term-schema replace parts 12e〉+≡ (20b) ⊳ 56f 67e ⊲

notEnoughArgs = false;


Comment 2.2.65. If we have an excess of arguments, return true. Otherwise, if we have an undersupply of arguments, mark the notEnoughArgs flag of the left subterm and then return true.


58c 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 57a 59a ⊲

bool term_schema::isFuncNotRightArgs() {
    if (tag ≠ APP) return false;
    if (notEnoughArgs) { fields[0]→notEnoughArgs = true; return true; }
    assert(spinetip);
    int numargs = spinelength-1;
    if (spinetip→isF() ≡ false) return false;
    int arity = getFuncEArity(spinetip→name);
    〈isFuncNotRightArgs::error handling 58d〉
    if (arity > numargs) {
        notEnoughArgs = true; fields[0]→notEnoughArgs = true;
        return true;
    }
    return (arity < numargs);
}

Defines:
    isFuncNotRightArgs, used in chunk 57b.
Uses fields 11c, getFuncEArity 98c, isF 11a 11b, notEnoughArgs 57d, spinelength 21, spinetip 21, tag 10c, and term_schema 9.

Comment 2.2.66. If the function is unknown, then we just return true as a conservative measure.

58d 〈isFuncNotRightArgs::error handling 58d〉≡ (58c)

if (arity ≡ -1) return true;
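The arity comparison on its own, stripped of the flag caching, can be sketched as follows. This is an illustration under assumptions, not the Alkemy code: ArityTable, effectiveArity, ArgCheck, checkArgs and notRightArgs are hypothetical names, and the effective arity of 2 used for remove in the usage note is assumed for the sake of the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical table mapping function symbols to their effective arity.
typedef std::map<std::string, int> ArityTable;

// Returns -1 for an unknown function, mirroring the convention in chunk 58d.
int effectiveArity(const ArityTable& table, const std::string& f) {
    ArityTable::const_iterator it = table.find(f);
    return it == table.end() ? -1 : it->second;
}

enum class ArgCheck { Right, TooFew, TooMany, Unknown };

// Compare the number of arguments a function is applied to with its
// effective arity.
ArgCheck checkArgs(const ArityTable& table, const std::string& f, int numargs) {
    const int arity = effectiveArity(table, f);
    if (arity == -1) return ArgCheck::Unknown;
    if (arity > numargs) return ArgCheck::TooFew;
    if (arity < numargs) return ArgCheck::TooMany;
    return ArgCheck::Right;
}

// Boolean view: anything other than an exact match rules the term out as a
// candidate redex; an unknown function is conservatively ruled out too.
bool notRightArgs(const ArityTable& table, const std::string& f, int numargs) {
    return checkArgs(table, f, numargs) != ArgCheck::Right;
}
```

With remove registered at arity 2, the term (((remove s) t) x) has three arguments and is classified as TooMany, while its prefix ((remove s) t) is classified as Right and remains a candidate.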

Comment 2.2.67. We now describe the reduce function. We compute the subterms one by one in the left-to-right, outermost to innermost order. For each subterm, we first determine whether it can possibly be a candidate redex. If not, we proceed to the next subterm. Otherwise, we attempt to match and reduce it using try_match_n_reduce. If this is successful, we return true. Otherwise, we proceed to the next subterm. The parameter tried records the total number of candidate redexes actually tried by this function. All the other parameters are needed only by try_match_n_reduce.

58e 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 57c 64b ⊲

bool reduce(term_schema ∗ parent, uint cid, term_schema ∗ root, int & tried, int & changed);

Defines:reduce, used in chunks 59b, 62c, and 190a.

Uses term_schema 9.


59a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 58c 64c ⊲

bool term_schema::reduce(term_schema ∗ parent, uint cid, term_schema ∗ root,
                         int & tried, int & changed) {
    〈cannot possibly be a redex 56a〉
    tried++;
    if (try_match_n_reduce(this, parent, cid, root, tried, changed))
        return true;
not_a_redex:
    if (tag ≡ ABS) return fields[1]→reduce(this, 1, root, tried, changed);
    if (tag ≡ APP) {
        〈reduce::small APP optimization 59b〉
        if (leftc()→reduce(this, 0, root, tried, changed)) return true;
        return rightc()→reduce(this, 1, root, tried, changed);
    }
    if (tag ≡ PROD) {
        uint dimension = fields.size();
        for (uint i=0; i≠dimension; i++)
            if (fields[i]→reduce(this, i, root, tried, changed))
                return true;
        return false;
    }
    setSelector(STDERR);
    cerr ≪ "term = "; print(); ioprintln();
    cerr ≪ "tag = " ≪ tag ≪ endl;
    assert(false); return false;
}

Defines:
    reduce, used in chunks 59b, 62c, and 190a.
Uses fields 11c, ioprintln 246 247a, leftc 11d, rightc 11d, setSelector 246 247a, STDERR 246, tag 10c, term_schema 9, and try_match_n_reduce 60.

Comment 2.2.68. When we see a term of the form (f t) where f has effective arity greater than 1, we can immediately deduce that f cannot be a redex. This would have been picked out if we recurse on f, but we can save a call to getFuncEArity by having a special case here.

59b 〈reduce::small APP optimization 59b〉≡ (59a)

if (leftc()→isF() ∧ notEnoughArgs)
    return rightc()→reduce(this, 1, root, tried, changed);

Uses isF 11a 11b, leftc 11d, notEnoughArgs 57d, reduce 58e 59a, and rightc 11d.

Comment 2.2.69. It is easy to add code to calculate the occurrence of each subterm if this information is desired.
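One possible shape for such code is sketched below on a toy term type. It is an assumption-laden illustration, not part of Alkemy: Term, leaf, node and occurrences are hypothetical names, an occurrence is rendered as a dot-separated list of child indices (the same indices reduce passes as cid), and the traversal order is the left-to-right, outermost-first order used by reduce.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy term mirroring the tags used by reduce.
struct Term {
    enum Tag { V, F, APP, ABS, PROD } tag;
    std::string name;          // for V and F
    std::vector<Term> fields;  // children for APP, ABS and PROD
};

Term leaf(Term::Tag tag, const std::string& name) {
    Term t; t.tag = tag; t.name = name; return t;
}
Term node(Term::Tag tag, std::vector<Term> fields) {
    Term t; t.tag = tag; t.fields = std::move(fields); return t;
}

// Record the occurrence of every subterm, outermost first and left to right.
// For ABS, field 0 is the bound variable, which is not a subterm.
void occurrences(const Term& t, const std::string& path, std::vector<std::string>& out) {
    out.push_back(path.empty() ? std::string("root") : path);
    const std::size_t first = (t.tag == Term::ABS) ? 1 : 0;
    for (std::size_t i = first; i < t.fields.size(); ++i) {
        const std::string child =
            path.empty() ? std::to_string(i) : path + "." + std::to_string(i);
        occurrences(t.fields[i], child, out);
    }
}
```

For the term ((f x) \y.y) the occurrences come out as root, 0, 0.0, 0.1, 1, 1.1: six entries, which agrees with the token count of Comment 2.2.54.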

Comment 2.2.70. The function reduce uses the following function to try and match and reduce a candidate redex. The function try_match_n_reduce works as follows. Given a candidate redex, we first examine whether it can be simplified using the internal simplification routines of Escher. If so, we are done and can return. Otherwise, we try to pattern match (using redex_match) the candidate redex with the head of suitable statements in the program. If the head of a statement h = b is found to match with candidate using some term substitution θ, then we construct bθ and replace candidate with bθ. Depending on whether candidate has a parent, we either only need to redirect a pointer or we need to replace in place.

59c 〈terms.cc::local functions 14a〉+≡ (10a) ⊳ 29c

#include "global.h"

#include "pattern-match.h"

〈try match 60〉

Uses global.h 232 and pattern-match.h 75c.


60 〈try match 60〉≡ (59c)

bool try_match_n_reduce(term_schema ∗ candidate, term_schema ∗ parent, uint cid,
                        term_schema ∗ root, int & tried, int & changed) {
    〈try match::different simplifications 61a〉
    〈debug matching 1 63e〉

    vector<substitution> theta;
    int bm_size = statements.size();
    for (int j=0; j≠bm_size; j++) {
        〈try match::find special cases where no matching is required 62a〉
        theta.clear();
        term_schema ∗ head = statements[j].stmt→leftc()→rightc();
        term_schema ∗ body = statements[j].stmt→rightc();
        〈debug matching 2 63f〉
        if (redex_match(head, candidate, theta)) {
            〈try match::side conditions on types 62b〉
            〈try match::eager statements 62c〉
            changed = candidate→label;
            〈try match::unimportant things 63a〉
            term_schema ∗ temp = body→clone();
            temp→subst(theta);
            temp→label = changed;

            if (parent) {
                parent→setSubterm(temp, cid);
                candidate→freememory();
            } else { candidate→replace(temp); temp→freememory(); }
            〈try match::output answer 63c〉
            return true;
        }
        〈debug matching 4 64a〉
    }
    return false;
}

Defines:
    try_match_n_reduce, used in chunk 59a.
Uses clear 145b, clone 19a 19b, freememory 19a 19c, label 21, leftc 11d, redex_match 70a 70b 70c, replace 20a 20b, rightc 11d, setSubterm 11d 11e, statements 242, subst 31b 32a 34, substitution 31a, and term_schema 9.


Comment 2.2.71. The different simplification routines described in 2.2.1 are used here. We check the form of candidate before attempting to apply suitable routines.

61a 〈try match::different simplifications 61a〉≡ (60)

int olabel = candidate→label;if (candidate→isFunc2Args("_"))

string f = candidate→spinetip→name;if (f ≡ "==")

if (candidate→simplifyEquality(parent, cid)) ioprint("Equalities simplification\n");〈simpl output 61b〉

else if (f ≡ "&&") if (candidate→simplifyConjunction())

ioprint("And rule simplification\n");〈simpl output 61b〉

if (candidate→simplifyConjunction2(parent, cid)) ioprint("And2 rule simplification\n");〈simpl output 61b〉

if (candidate→simplifyInequalities(parent, cid))

ioprint("Inequalities simplification\n"); 〈simpl output 61b〉

if (candidate→simplifyArithmetic(parent, cid)) ioprint("Arithmetic simplification\n"); 〈simpl output 61b〉

if (candidate→isApp())

if (candidate→simplifyExistential(parent, cid)) ioprint("Existential rule simplification\n");〈simpl output 61b〉

if (candidate→simplifyUniversal(parent, cid)) ioprint("Universal rule simplification\n"); 〈simpl output 61b〉

if (candidate→betaReduction(parent, cid)) ioprint("Beta reduction\n"); 〈simpl output 61b〉

Uses betaReduction 44a 44b, ioprint 246 247a, isApp 11a, isFunc2Args 13b 13c, label 21, simplifyArithmetic 39a 40a, simplifyConjunction 45 46a, simplifyConjunction2 48c 49a, simplifyEquality 35 36a, simplifyExistential 50 51a, simplifyInequalities 42a 43, simplifyUniversal 54a 54b, and spinetip 21.

Comment 2.2.72. The redex is marked out in the answer. We do not print the term before the simplification; that would be too messy.

61b 〈simpl output 61b〉≡ (61a)

ltime++;if (verbose) ioprint("Time = "); ioprintln(ltime);

candidate→redex = true;ioprint("Answer: "); root→print(); ioprint("\n\n");candidate→redex = false;

changed = olabel;if (parent) parent→fields[cid]→label = olabel;else candidate→label = olabel;return true;

Uses fields 11c, ioprint 246 247a, ioprintln 246 247a, label 21, ltime 242, redex 15b, and verbose 242.


Comment 2.2.73. We do not want to waste energy pattern matching if we can tell up front that a term and the head of a statement are not going to match. The following proposition provides a simple check for this.

Proposition 2.2.74. Let r = (f r1 . . . rn), n ≥ 0, be a term where f is a function symbol and let h = (g t1 . . . tm), m ≥ 0, be the head of a statement. If n ≠ m or f ≠ g, then r cannot be a redex.

Proof. In both cases there is no θ such that hθ = r.

62a 〈try match::find special cases where no matching is required 62a〉≡ (60)

if (candidate→spinelength -1 ≠ statements[j].numargs) continue;
if (candidate→spinetip→name ≠ statements[j].anchor) continue;

Uses spinelength 21, spinetip 21, and statements 242.
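The effect of this pre-filter can be seen in isolation with a small table of statements. The sketch below is an illustration only: the field names anchor and numargs follow the chunk above, but the Statement struct, the plausible function and the sample entries are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of a statement head: leftmost function symbol and
// the number of arguments it is applied to.
struct Statement {
    std::string anchor;
    int numargs;
};

// Return the indices of the statements that survive the check of
// Proposition 2.2.74 for a candidate with the given spine tip and length.
// The spine length counts the function symbol plus its arguments.
std::vector<std::size_t> plausible(const std::string& spinetip, int spinelength,
                                   const std::vector<Statement>& statements) {
    std::vector<std::size_t> result;
    for (std::size_t j = 0; j != statements.size(); ++j) {
        if (spinelength - 1 != statements[j].numargs) continue;
        if (spinetip != statements[j].anchor) continue;
        result.push_back(j);
    }
    return result;
}
```

Only the surviving indices need to go through the comparatively expensive redex_match; the integer comparison is done first, so most mismatches are rejected without a string comparison.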

Comment 2.2.75. Some (overloaded) statements have side conditions on the type of subterms appearing in the head. We will see how this is handled now.

62b 〈try match::side conditions on types 62b〉≡ (60)

if (statements[j].tycond) {
    int sterm_index = statements[j].tycond→sterm;
    int osel = getSelector(); setSelector(STDOUT);
    ioprint("Checking type side condition: ");
    vector<pair<string, type ∗> > slns;
    bool res = unify(slns, stat_term_types[j][sterm_index].second,
                     statements[j].tycond→dtype);
    res ? ioprint(" True\n") : ioprint(" False\n");
    setSelector(osel);
    for (uint j=0; j≠slns.size(); j++) delete_type(slns[j].second);
    slns.clear();
    if (¬res) return false;
}

Uses clear 145b, condition 16c, delete_type 77b 77c, getSelector 246 247a, ioprint 246 247a, setSelector 246 247a, stat_term_types 242, statements 242, STDOUT 246, and unify 88.

Comment 2.2.76. We now look at how eager statements are handled. When we match a subterm of the form (f t1 . . . tn) with the head of a statement that is to be evaluated eagerly, we proceed to evaluate the arguments t1 to tn first. The whole expression can only be rewritten if none of the ti's contain a redex.

62c 〈try match::eager statements 62c〉≡ (60)

if (statements[j].eager ∧ candidate→isApp()) {
    // try reduce the arguments first, return true if any one can be reduced
    for (int i=candidate→spinelength-1; i≠0; i−−) {
        // go to argument spinelength - i
        term_schema ∗ arg = candidate;
        for (int j=1; j≠i; j++) arg = arg→leftc();
        if (arg→rightc()→reduce(arg, 1, root, tried, changed))
            return true;
    }
}

Uses isApp 11a, leftc 11d, reduce 58e 59a, rightc 11d, spinelength 21, statements 242, and term_schema 9.
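The index arithmetic in this loop is easier to follow on a small curried application. The following is a minimal sketch, not Alkemy code: Node, leaf, app and argumentOrder are hypothetical names, and arguments are restricted to leaves so that their names can be collected.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical toy term: a named leaf, or a binary application node.
struct Node {
    std::string name;                   // set for leaves only
    std::shared_ptr<Node> left, right;  // set for application nodes only
};
typedef std::shared_ptr<Node> P;

P leaf(const std::string& n) { P p(new Node()); p->name = n; return p; }
P app(P l, P r) { P p(new Node()); p->left = l; p->right = r; return p; }

// Visit the arguments of (f t1 ... tn) with the same loop structure as the
// eager-statement chunk and record the order in which they are reached.
// spinelength counts f plus its n arguments.
std::vector<std::string> argumentOrder(const P& candidate, int spinelength) {
    std::vector<std::string> order;
    for (int i = spinelength - 1; i != 0; i--) {
        const Node* arg = candidate.get();
        for (int j = 1; j != i; j++) arg = arg->left.get();  // walk down the spine
        order.push_back(arg->right->name);
    }
    return order;
}
```

For (((f a) b) c) with spine length 4, the loop reaches a first, then b, then c: the arguments are tried from t1 to tn, at the cost of re-walking the spine from the top for each argument.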

Comment 2.2.77. We are done talking about important things. We now list the not-so-important things like reporting and debugging checks.


63a 〈try match::unimportant things 63a〉≡ (60)

〈try match::debugging code 1 63d〉
〈debug matching 3 63g〉
〈try match::output pattern matching information 63b〉

63b 〈try match::output pattern matching information 63b〉≡ (63a)

    ltime++;
    if (verbose) {
        ioprint("Time = "); ioprintln(ltime);
        ioprint("Matched "); head->print(); ioprintln();
        // ioprint(" and ");
        // candidate->print();
        // ioprint("\nReplacing with "); body->print(); ioprint(' ');
        // printTheta(theta);
        candidate->redex = true;
        ioprint("Query: "); root->print(); ioprintln();
        candidate->redex = false;
    }

Uses ioprint 246 247a, ioprintln 246 247a, ltime 242, printTheta 74a 74b, redex 15b, and verbose 242.

63c 〈try match::output answer 63c〉≡ (60)

    if (verbose) { ioprint("Answer: "); root->print(); ioprint("\n\n"); }

Uses ioprint 246 247a and verbose 242.

Comment 2.2.78. This is a simple check to make sure the candidate redex and the instantiated head are really α-equivalent.

63d 〈try match::debugging code 1 63d〉≡ (63a)

    #ifdef MAIN_DEBUG1
    term_schema * head1 = head->clone();
    head1->subst(theta);
    head1->applySubst();
    assert(head1->equal(candidate));
    #endif

Uses clone 19a 19b, equal 14c 14d, subst 31b 32a 34, and term_schema 9.

Comment 2.2.79. This code allows us to track what is going on during matching.

63e 〈debug matching 1 63e〉≡ (60)

    if (verbose == 3) {
        setSelector(STDOUT);
        ioprint("Trying to redex match "); candidate->print(); ioprintln();
    }

Uses ioprint 246 247a, ioprintln 246 247a, redex 15b, setSelector 246 247a, STDOUT 246, and verbose 242.

63f 〈debug matching 2 63f〉≡ (60)

    if (verbose == 3) { ioprint("\tand "); head->print(); ioprint(" ... "); }

Uses ioprint 246 247a and verbose 242.

63g 〈debug matching 3 63g〉≡ (63a)

    if (verbose == 3) ioprint("\t[succeed]\n");

Uses ioprint 246 247a and verbose 242.


64a 〈debug matching 4 64a〉≡ (60)

    if (verbose == 3) ioprint("\t[failed]\n");

Uses ioprint 246 247a and verbose 242.

2.2.3 Pattern Matching

2.2.3.1 Preprocessing of Statements

Comment 2.2.80. During pattern matching, the names of bound variables in the head of a program statement s = t need to be changed repeatedly. The corresponding variables in t must be changed accordingly to preserve the original meaning of the statement. As this is a key operation that needs to be done a great many times, an efficient algorithm is needed. The key idea here is that we can use the same variable node to represent corresponding variables in s and t. This way, when we change a variable in s during pattern matching, all corresponding variables in s and t get changed automatically. The term representations produced by the parser are trees without shared nodes. The following function collectSharedVars implements this kind of sharing. The procedure is simple. We first collect together all the shared variables in s and t separately using shareLambdaVars. Then we redirect shared variables in t to their corresponding variables in s using shareHeadLambdaVars.

Note that only bound variables are shared by this operation. The correctness of the function labelStaticBoundVars (see Comment 2.1.48) is thus not affected.
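The sharing idea of Comment 2.2.80 can be illustrated in miniature. What follows is only a sketch under simplifying assumptions: Node, leaf, node, shareOccurrences and show are hypothetical names, std::shared_ptr stands in for the reuse/freememory reference counting used by term_schema, and shadowing by inner binders is ignored. It shows that once every occurrence points at one node, a single assignment renames the variable everywhere.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature term node; not the book's term_schema.
struct Node {
    std::string name;                         // variable or function name
    std::vector<std::shared_ptr<Node>> kids;  // children (empty for leaves)
};
using NodePtr = std::shared_ptr<Node>;

NodePtr leaf(const std::string& name) {
    auto n = std::make_shared<Node>();
    n->name = name;
    return n;
}

NodePtr node(const std::string& name, std::vector<NodePtr> kids) {
    auto n = std::make_shared<Node>();
    n->name = name;
    n->kids = std::move(kids);
    return n;
}

// Redirect every leaf below t whose name equals var->name to the node var.
// Assumption: no inner binder shadows the variable.
void shareOccurrences(const NodePtr& t, const NodePtr& var) {
    for (auto& k : t->kids) {
        if (k->kids.empty() && k->name == var->name) k = var;
        else shareOccurrences(k, var);
    }
}

// Render a term as text, e.g. (g x y).
std::string show(const NodePtr& t) {
    if (t->kids.empty()) return t->name;
    std::string s = "(" + t->name;
    for (const auto& k : t->kids) s += " " + show(k);
    return s + ")";
}
```

After calling shareOccurrences on both the head and the body with the same variable node, assigning a new name to that node changes every occurrence in both terms without any further traversal.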

64b 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 58e 64d ⊲

    void collectSharedVars();

Defines:
    collectSharedVars, used in chunk 216e.


    void term_schema::collectSharedVars() {
        term_schema * head = fields[0]->fields[1];
        term_schema * body = fields[1];
        vector<term_schema *> headlvars;
        head->shareLambdaVars(headlvars, true);
        body->shareLambdaVars(headlvars, false);
        body->shareHeadLambdaVars(headlvars);
    }

Defines:
    collectSharedVars, used in chunk 216e.
Uses fields 11c, shareHeadLambdaVars 66a, shareLambdaVars 65a, and term_schema 9.

Comment 2.2.81. Here are the auxiliary functions for collectSharedVars.

64d 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 64b 66b ⊲

    void shareLambdaVars(vector<term_schema *> & lvars, bool use);
    void shareVar(term_schema * var, term_schema * parent, uint id);
    void shareHeadLambdaVars(vector<term_schema *> & hlvars);

Uses shareHeadLambdaVars 66a, shareLambdaVars 65a, shareVar 65b, and term_schema 9.

Comment 2.2.82. The input vector lvars is used to collect all the lambda variables in a term. We only need to do this for the head. The input parameter use controls this. The procedure of shareLambdaVars is as follows: every time we see a term of the form λx.t, we use shareVar to redirect all occurrences of x in t to point to the x straight after the λ sign.


65a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 64c 65b ⊲

    void term_schema::shareLambdaVars(vector<term_schema *> & lvars, bool use) {
        if (tag == ABS) {
            if (use) lvars.push_back(fields[0]);
            fields[1]->shareVar(fields[0], this, 1);
            fields[1]->shareLambdaVars(lvars, use);
            return;
        }
        uint size = fields.size();
        for (uint i=0; i!=size; i++)
            fields[i]->shareLambdaVars(lvars, use);
    }

Defines:
    shareLambdaVars, used in chunk 64.
Uses fields 11c, shareVar 65b, tag 10c, and term_schema 9.

Comment 2.2.83. The procedure shareVar with input variable x is only ever called within the correct scope t of a term λx.t. (If a subterm λx.t2 occurs inside t, we will skip that subterm.) This guarantees that all the variables that get redirected in the (tag == V) case are exactly those variables bound by the input variable var. The pointer parent->fields[id] points to the current term.


    void term_schema::shareVar(term_schema * var, term_schema * parent, uint id) {
        if (tag == SV || tag == D || tag == F) return;
        if (tag == ABS) {
            if (var->name == fields[0]->name) return;
            fields[1]->shareVar(var, this, 1); return;
        }
        if (tag == V) {
            if (name == var->name) {
                parent->fields[id] = var->reuse();
                var->parents.push_back(&parent->fields[id]);
                this->freememory();
            }
            return;
        }
        uint size = fields.size();
        for (uint i=0; i!=size; i++) fields[i]->shareVar(var, this, i);
    }

Defines:
    shareVar, used in chunks 64d and 65a.
Uses fields 11c, freememory 19a 19c, reuse 20e, tag 10c, and term_schema 9.

Comment 2.2.84. Pointers to term schema pointers that got redirected in shareVar are stored in parents. These are then used for further redirection in shareHeadLambdaVars. At present, this is the only place where parents is used. The parents parameter need not be initialized during term construction. It need not be copied during cloning. Its value also does not get affected during replacing.

65c 〈term-schema parts 10c〉+≡ (9) ⊳ 57d 67d ⊲

    vector<term_schema **> parents;

Uses term_schema 9.
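The role of parents in Comment 2.2.84 can be sketched with a small stand-alone example. The names Node, redirect and replaceEverywhere are hypothetical, and memory management (reuse, freememory) is left out on purpose. The point is only that recording the address of each redirected slot lets a later pass repoint all of them without traversing the term again.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node type for illustration; ownership is not modelled.
struct Node {
    std::string name;
    std::vector<Node*> fields;    // child slots
    std::vector<Node**> parents;  // addresses of slots that point at this node
};

// Make *slot point at var and remember the slot's address on var.
void redirect(Node** slot, Node* var) {
    *slot = var;
    var->parents.push_back(slot);
}

// Repoint every recorded slot of oldVar at newVar (second-stage redirection).
void replaceEverywhere(Node* oldVar, Node* newVar) {
    for (Node** slot : oldVar->parents) *slot = newVar;
    oldVar->parents.clear();
}
```

One caveat of this design is that the stored addresses point into the fields vector of the parent node, so they stay valid only as long as that vector is not resized.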

Comment 2.2.85. The procedure for shareHeadLambdaVars is as follows. Every time we see a term of the form λx.t, we check whether x is in hlvars. If it is, we redirect x and all occurrences of x in t pointing to it (these are recorded in parents). We then recurse on t.


66a 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 65b 66c ⊲

    void term_schema::shareHeadLambdaVars(vector<term_schema *> & hlvars) {
        if (hlvars.empty()) return;
        if (tag == ABS) {
            string vname = fields[0]->name;
            int size = hlvars.size();
            for (int i=0; i!=size; i++) {
                if (vname != hlvars[i]->name) continue;
                int psize = fields[0]->parents.size();
                for (int j=0; j!=psize; j++) {
                    *(fields[0]->parents[j]) = hlvars[i]->reuse();
                    fields[0]->freememory();
                }
                fields[0]->freememory();
                fields[0] = hlvars[i]->reuse();
                break;
            }
            fields[1]->shareHeadLambdaVars(hlvars);
            return;
        }
        uint size = fields.size();
        for (uint i=0; i!=size; i++)
            fields[i]->shareHeadLambdaVars(hlvars);
    }

Defines:
    shareHeadLambdaVars, used in chunk 64.
Uses fields 11c, freememory 19a 19c, reuse 20e, tag 10c, and term_schema 9.

Comment 2.2.86. Free variables in program statements may also need to be changed during pattern matching. To do away with the need for tree traversal, we employ the same trick to share corresponding free variables in the head and body of statements. This is done in a preprocessing step. The following function performs this task. It works as follows. Every time we see a free variable x in the head, we traverse the body to redirect all free occurrences of x to the one in the head. Redirection is accomplished using shareFreeVar. We assume that labelStaticBoundVars has been called to label the variables.

66b 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 64d 67a ⊲

    void collectFreeVars(term_schema * bodyparent, uint id);

Defines:
    collectFreeVars, used in chunk 216e.
Uses term_schema 9.

66c 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 66a 67b ⊲

    void term_schema::collectFreeVars(term_schema * bodyparent, uint id) {
        if (tag == V && isFree())
            bodyparent->fields[id]->shareFreeVar(this, bodyparent, id);
        if (tag == SV || tag == D || tag == F) return;
        if (tag == ABS)
            fields[1]->collectFreeVars(bodyparent, id);
        int size = fields.size();
        for (int i=0; i!=size; i++)
            fields[i]->collectFreeVars(bodyparent, id);
    }

Defines:
    collectFreeVars, used in chunk 216e.
Uses fields 11c, isFree 26e, shareFreeVar 67a 67b, tag 10c, and term_schema 9.


Comment 2.2.87. The return value of shareFreeVar can be used to implement the idea described in Comment 2.2.89.

67a 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 66b 67c ⊲

    bool shareFreeVar(term_schema * v, term_schema * parent, uint id);

Defines:
    shareFreeVar, used in chunk 66c.
Uses term_schema 9.

67b 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 66c 68 ⊲

    bool term_schema::shareFreeVar(term_schema * v, term_schema * parent, uint id) {
        if (tag == V && isFree() && name == v->name) {
            freememory(); parent->fields[id] = v->reuse(); return true;
        }
        if (tag == SV || tag == D || tag == F) return false;
        if (tag == ABS) return fields[1]->shareFreeVar(v, this, 1);
        bool ret = false;
        int size = fields.size();
        for (int i=0; i!=size; i++)
            if (fields[i]->shareFreeVar(v, this, i)) ret = true;
        return ret;
    }

Defines:
    shareFreeVar, used in chunk 66c.
Uses fields 11c, freememory 19a 19c, isFree 26e, reuse 20e, tag 10c, and term_schema 9.

Comment 2.2.88. The following function pre-computes all the free variables inside a subterm and puts them in the vector preFVars. Pointers to terms instead of strings are used to allow us to rename free variables directly without doing another traversal.

67c 〈term-schema::function declarations 11a〉+≡ (9) ⊳ 67a

void precomputeFreeVars();

Defines:
    precomputeFreeVars, used in chunk 216e.

67d 〈term-schema parts 10c〉+≡ (9) ⊳ 65c

vector<term_schema ∗> preFVars;

Uses term_schema 9.

67e 〈term-schema replace parts 12e〉+≡ (20b) ⊳ 58b

preFVars.clear();

Uses clear 145b.


68 〈term-schema::function definitions 11b〉+≡ (10a) ⊳ 67b

    void term_schema::precomputeFreeVars() {
        if (tag == SV || tag == D || tag == F) return;
        if (tag == V && isFree()) preFVars.push_back(this);
        if (tag == ABS) {
            fields[1]->precomputeFreeVars();
            preFVars = fields[1]->preFVars; return;
        }
        uint size = fields.size();
        for (uint i=0; i!=size; i++) {
            fields[i]->precomputeFreeVars();
            int size2 = fields[i]->preFVars.size();
            for (int j=0; j!=size2; j++)
                preFVars.push_back(fields[i]->preFVars[j]);
        }
    }

Defines:
    precomputeFreeVars, used in chunk 216e.
Uses fields 11c, isFree 26e, tag 10c, and term_schema 9.

Comment 2.2.89. UNIMPLEMENTED IDEA: A variable that occurs in the head but not in the body of a statement can be flagged so that we do not have to put its substitution in θ.

2.2.3.2 Redex Determination

Definition 2.2.90. A redex of a term t is an occurrence of a subterm of t that is α-equivalent to an instance of the head of a statement.

Fact 2.2.91. Two α-equivalent terms can only differ in the names of their bound variables. (See also [Llo03, pp. 71].)
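Fact 2.2.91 suggests a direct way to test α-equivalence: compare two terms structurally, treating bound variables by the position of their binder and free variables by name. The following is a minimal sketch of that idea on a hypothetical three-constructor term type (Term, var, lam, app, lookup and alphaEq are illustrative names, not part of Alkemy, whose own check is term_schema::equal).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Term;
using TermPtr = std::shared_ptr<Term>;

// Hypothetical lambda-term type, for illustration only.
struct Term {
    enum Kind { Var, Abs, App } kind;
    std::string name;     // Var: its name; Abs: the bound variable
    TermPtr left, right;  // Abs: body in left; App: function and argument
};

TermPtr var(const std::string& n) {
    return std::make_shared<Term>(Term{Term::Var, n, nullptr, nullptr});
}
TermPtr lam(const std::string& n, TermPtr body) {
    return std::make_shared<Term>(Term{Term::Abs, n, std::move(body), nullptr});
}
TermPtr app(TermPtr f, TermPtr a) {
    return std::make_shared<Term>(Term{Term::App, "", std::move(f), std::move(a)});
}

// Index of the innermost enclosing binder for name, or -1 if name is free.
int lookup(const std::vector<std::string>& env, const std::string& name) {
    for (int i = static_cast<int>(env.size()) - 1; i >= 0; --i)
        if (env[i] == name) return i;
    return -1;
}

bool alphaEq(const TermPtr& a, const TermPtr& b,
             std::vector<std::string>& envA, std::vector<std::string>& envB) {
    if (a->kind != b->kind) return false;
    switch (a->kind) {
    case Term::Var: {
        int ia = lookup(envA, a->name), ib = lookup(envB, b->name);
        if (ia != ib) return false;            // different binders, or only one is free
        return ia >= 0 || a->name == b->name;  // free variables must agree by name
    }
    case Term::Abs: {
        envA.push_back(a->name);
        envB.push_back(b->name);
        bool ok = alphaEq(a->left, b->left, envA, envB);
        envA.pop_back();
        envB.pop_back();
        return ok;
    }
    case Term::App:
        return alphaEq(a->left, b->left, envA, envB) &&
               alphaEq(a->right, b->right, envA, envB);
    }
    return false;
}

bool alphaEq(const TermPtr& a, const TermPtr& b) {
    std::vector<std::string> envA, envB;
    return alphaEq(a, b, envA, envB);
}
```

Under this sketch λx.(g x) and λy.(g y) compare equal, while λy.x and λy.y do not, because x is free in the first and y is bound in the second.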

Algorithm 2.2.92. To determine whether a term t is a redex with respect to the head h of a statement, we need to determine whether there exists a term substitution θ such that hθ is α-equivalent to t. There is a simple algorithm for doing that:

    θ ← {};
    while (hθ ≠ t) do
        o ← leftmost innermost occurrence in t such that o is also in h and hθ|o ≠ t|o;
        if hθ|o and t|o are both λ-terms then
            change name of bound variable in hθ|o to that in t|o, renaming free
            variables in hθ|o to avoid free-variable capture whenever necessary;
        else if hθ|o is a free occurrence of a variable x in h and no free variable in t|o
                would be captured by the substitution {x/t|o} then
            θ ← θ ∪ {x/t|o};
        else return failure;
    return θ;

Comment 2.2.93. The no-free-variable-capture condition in the else if case is needed to prevent matching on statements like h = λy.x and t = λy.(g y). Without the condition, we would bind (g y) to x, but the end result of doing h{x/(g y)} is actually λz.(g y), which is not equal to t. (See Definition 2.5.3 in [Llo03].) If this kind of matching is desired, a syntactical variable must be used.


Comment 2.2.94. Algorithm 2.2.92 does not take syntactical variables into account. Conceptually, given an equation with syntactical variables in it, we should first instantiate the syntactical variables to obtain a valid statement. This will then allow us to use Algorithm 2.2.92 to do pattern matching on it. In practice, we do the instantiation of syntactical variables and pattern matching at the same time. The following modified algorithm is used.

Algorithm 2.2.95. Given a term h with syntactical variables in it and a candidate redex t, the algorithm decides whether there exists θ such that hθ is α-equivalent to t.

    θ ← {};
    while (hθ ≠ t) do
        o ← leftmost innermost occurrence in t such that o is also in h and hθ|o ≠ t|o;
        if hθ|o and t|o are both λ-terms then
            change name of bound variable in hθ|o to that in t|o, renaming free
            variables in hθ|o to avoid free-variable capture whenever necessary;
        else if hθ|o is a free occurrence of a variable x in h and no free variable in t|o
                would be captured by the substitution {x/t|o} then
            θ ← θ ∪ {x/t|o};
        else if hθ|o is a syntactical variable x in h then
            θ ← θ ∪ {x/t|o};
        else return failure;
    return θ;

Provided syntactical variables only ever occur at places where a (normal) variable can appear, I think the algorithm is complete in the sense that if there is a way to instantiate the syntactical variables so that a matching can occur, Algorithm 2.2.95 will find it.

Comment 2.2.96. Algorithm 2.2.95 renames variables as necessary when both hθ|o and t|o are λ-terms. Renaming of free variables in h is safe only because the head of a statement cannot contain more than one occurrence of a free variable.

Comment 2.2.97. Typically, the h considered in Algorithm 2.2.95 is the head of a statement h = b. When we rename free variables in h, we also need to rename the corresponding variables in b so as not to change the original meaning of the statement. How about the bound variables? When we change a bound variable in h, do we need to rename its corresponding variables in b?

In the presence of syntactical variables, the answer is a definite yes. Consider the statement (f λx.u) = λx.u. Given candidate redex (f λy.(g y)), we will get the incorrect answer λx.(g y) if we do not rename the x in the body of the statement during pattern matching. Efficient algorithms for doing such renaming of variables are described in Comments 2.2.80 and 2.2.86.

Comment 2.2.98. There is a simple way to realise Algorithm 2.2.95. Start with two pointers pt and ph pointing, respectively, at t and h. Denote by [pt] and [ph] the subterms of t and h pointed to by pt and ph. Move the pointers forward one step at a time to the next subterm in the left-to-right, outermost-to-innermost order. At each time step, if [ph] ≠ [pt] then:

1. if [ph] is a syntactical variable, add [ph]/[pt] to θ;

2. else if [ph] is a variable free in h and the free variable capture condition does not occur, add [ph]/[pt] to θ;

3. else if [pt] and [ph] are both lambda terms and xt and xh are the corresponding lambda variables, then set all occurrences of xh in [ph] to xt, renaming as necessary free variables that get captured as a result;

4. else return failure.
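The four steps above can be sketched as a recursive matcher over a small, hypothetical term type. This is not the book's redex_match: Term, mk, Theta, freeVars and match are illustrative names, θ is a std::map, and instead of renaming the head in place the sketch records, for each lambda entered, the pair (head binder, term binder). It also relies on the assumption stated in Comment 2.2.96 that a free variable occurs at most once in the head.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Term;
using TermPtr = std::shared_ptr<Term>;

// Hypothetical term type; the book's term_schema is considerably richer.
struct Term {
    enum Kind { Var, SynVar, Fun, App, Abs } kind;
    std::string name;     // Var, SynVar, Fun: the name; Abs: the bound variable
    TermPtr left, right;  // App: function and argument; Abs: body in left
};

TermPtr mk(Term::Kind k, const std::string& n, TermPtr l = nullptr, TermPtr r = nullptr) {
    return std::make_shared<Term>(Term{k, n, std::move(l), std::move(r)});
}

using Theta = std::map<std::string, TermPtr>;
using Binders = std::vector<std::pair<std::string, std::string>>;

// Collect the free variables of t; bound holds the binders currently in scope.
void freeVars(const TermPtr& t, std::vector<std::string>& bound, std::set<std::string>& out) {
    if (t->kind == Term::Var) {
        bool isBound = false;
        for (const auto& b : bound) if (b == t->name) isBound = true;
        if (!isBound) out.insert(t->name);
    } else if (t->kind == Term::Abs) {
        bound.push_back(t->name);
        freeVars(t->left, bound, out);
        bound.pop_back();
    } else if (t->kind == Term::App) {
        freeVars(t->left, bound, out);
        freeVars(t->right, bound, out);
    }
}

// binders pairs each enclosing head binder with the term binder it corresponds to.
bool match(const TermPtr& h, const TermPtr& t, Theta& theta, Binders binders = {}) {
    if (h->kind == Term::SynVar) { theta[h->name] = t; return true; }  // step 1
    if (h->kind == Term::Var) {
        int hi = -1, ti = -1;  // innermost binder positions for h and for t
        for (int i = 0; i < static_cast<int>(binders.size()); ++i) {
            if (binders[i].first == h->name) hi = i;
            if (t->kind == Term::Var && binders[i].second == t->name) ti = i;
        }
        if (hi >= 0)  // h is bound: t must be the corresponding bound variable
            return t->kind == Term::Var && ti == hi;
        // step 2: h is free; fail if a free variable of t would be captured
        std::vector<std::string> none;
        std::set<std::string> fv;
        freeVars(t, none, fv);
        for (const auto& b : binders)
            if (fv.count(b.second)) return false;
        theta[h->name] = t;
        return true;
    }
    if (h->kind != t->kind) return false;  // step 4
    if (h->kind == Term::Fun) return h->name == t->name;
    if (h->kind == Term::App)
        return match(h->left, t->left, theta, binders) &&
               match(h->right, t->right, theta, binders);
    // step 3 (Abs): pair the head binder with the term binder and descend
    binders.emplace_back(h->name, t->name);
    return match(h->left, t->left, theta, binders);
}
```

With this sketch, matching the head (f λy.x) against (f λy.(g y y)) fails on the capture condition, while replacing x by a syntactical variable makes the match succeed; both outcomes agree with the test cases discussed in Comment 2.2.108.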


70a 〈pattern-match::function declarations 70a〉≡ (75c) 74a ⊲

    bool redex_match(term_schema * head, term_schema * term,
                     vector<substitution> & theta);

    bool redex_match(term_schema * head, term_schema * term,
                     vector<substitution> & theta, vector<term_schema *> boundVars);

Defines:
    redex_match, used in chunks 60, 72, and 73.


70b 〈pattern-match::functions 70b〉≡ (75d) 70c ⊲

    bool redex_match(term_schema * head, term_schema * term,
                     vector<substitution> & theta) {
        vector<term_schema *> boundVars;
        return redex_match(head, term, theta, boundVars);
    }

Defines:
    redex_match, used in chunks 60, 72, and 73.
Uses substitution 31a and term_schema 9.

70c 〈pattern-match::functions 70b〉+≡ (75d) ⊳ 70b 74b ⊲

    bool redex_match(term_schema * head, term_schema * term,
                     vector<substitution> & theta, vector<term_schema *> boundVars) {
        kind head_tag = head->tag;
        kind term_tag = term->tag;

        if (head_tag == SV) { 〈redex-match::case of SV 70d〉 }
        if (head_tag == V) { 〈redex-match::case of V 72a〉 }
        if (head_tag != term_tag) return false;

        if (head_tag == F) return (head->name != term->name) ? false : true;

        if (head_tag == D) {
            if (head->isfloat && term->isfloat)
                return (head->numf != term->numf) ? false : true;
            else if (head->isint && term->isint)
                return (head->numi != term->numi) ? false : true;
            return (head->name != term->name) ? false : true;
            // isstring is covered in the default case
        }
        if (head_tag == APP) { 〈redex-match::case of APP 72c〉 }
        if (head_tag == PROD) { 〈redex-match::case of PROD 73a〉 }
        if (head_tag == ABS) { 〈redex-match::case of ABS 73b〉 }
        assert(false); return false;
    }

Defines:
    redex_match, used in chunks 60, 72, and 73.
Uses isfloat 12b, isint 12b, isstring 12b, numf 12b, numi 12b, substitution 31a, tag 10c, and term_schema 9.

Comment 2.2.99. Here we consider matching on syntactical variables. A syntactical variable matches anything, provided all of its constraints are satisfied.

70d 〈redex-match::case of SV 70d〉≡ (70c)

    〈redex-match::case of SV::check constraints 71a〉
    substitution sub(head->name, term);
    theta.push_back(sub);
    return true;

Uses substitution 31a.


Comment 2.2.100. The constraint /VAR/ means that the term bound to the current syntactical variable must be a variable. The constraint /CONST/ means that the term bound to the current syntactical variable must be a data constructor. The constraint /EQUAL,x_SV/, where x_SV is another syntactical variable appearing before the current one, means that the term bound to the current syntactical variable must be equal to the term bound to x_SV.

71a 〈redex-match::case of SV::check constraints 71a〉≡ (70d)

    condition * constraint = head->cond;
    if (constraint) {
        int ctag = constraint->tag;
        string cname = constraint->name;

        if (ctag == CVAR && term_tag != V) return false; // problematic?
        if (ctag == CCONST && term_tag != D) return false;
        if (ctag == CEQUAL) {
            // if (term_tag != D && term_tag != V) return false;
            term_schema * bound = findBinding(cname, theta);
            〈error handling::get previously bound 71b〉
            if (term->equal(bound) == false) return false;
        }
        if (ctag == CNOTEQUAL) {
            // if (term_tag != D) return false;
            term_schema * bound = findBinding(cname, theta);
            〈error handling::get previously bound 71b〉
            if (term->equal(bound) == true) return false;
        }
        // assert(ctag != CVAR);
        assert(ctag != CNOTEQUAL);
    }

Uses CCONST 16b, CEQUAL 16b, CNOTEQUAL 16b, cond 16d, condition 16c, CVAR 16b, equal 14c 14d, findBinding 75a 75b, tag 10c, and term_schema 9.

71b 〈error handling::get previously bound 71b〉≡ (71a)

    if (bound == NULL) {
        setSelector(STDERR);
        ioprint("The constraint EQUAL or NOTEQUAL on syntactical "
                "variables is used incorrectly; it appears before "
                "its argument is instantiated.\n");
        assert(false);
    }

Uses EQUAL 221, ioprint 246 247a, NOTEQUAL 221, setSelector 246 247a, and STDERR 246.
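The constraint semantics of Comment 2.2.100 can be summarised in a small stand-alone sketch. Everything here is hypothetical (Kind, Term, Constraint, satisfies, and a text field standing in for term structure); unlike the code above, which aborts when the referenced syntactical variable has no binding yet, the sketch simply reports failure in that case.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Var, Data, Other };          // hypothetical term kinds
struct Term { Kind kind; std::string text; };  // text stands in for term structure

enum class Constraint { None, IsVar, IsConst, Equal };
using Theta = std::vector<std::pair<std::string, Term>>;

// Linear search for the term bound to name; nullptr if there is none.
const Term* findBinding(const std::string& name, const Theta& theta) {
    for (const auto& s : theta)
        if (s.first == name) return &s.second;
    return nullptr;
}

// True if term may be bound under constraint c. For Equal, other names the
// syntactical variable that must already be bound earlier in theta.
bool satisfies(Constraint c, const std::string& other,
               const Term& term, const Theta& theta) {
    switch (c) {
    case Constraint::None:    return true;
    case Constraint::IsVar:   return term.kind == Kind::Var;
    case Constraint::IsConst: return term.kind == Kind::Data;
    case Constraint::Equal: {
        const Term* bound = findBinding(other, theta);
        return bound != nullptr && bound->kind == term.kind &&
               bound->text == term.text;
    }
    }
    return false;
}
```

Because the Equal check looks the other variable up in θ, the order in which syntactical variables are matched matters: the referenced variable has to be matched first.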


Comment 2.2.101. We next examine the case of variables. We do not have to do anything if head is identical to term. If head is a bound variable, then term must be identical to head for matching to succeed.

72a 〈redex-match::case of V 72a〉≡ (70c)

    string head_name = head->name;
    if (term_tag == V && head_name == term->name) return true;
    if (head->isFree() == false) return false;

    〈redex-match::case of V::check free variable capture condition 72b〉

    substitution sub(head_name, term);
    theta.push_back(sub);
    return true;

Uses isFree 26e and substitution 31a.

Comment 2.2.102. We need to check that no free variable in term would be captured by the substitution {head/term}.

72b 〈redex-match::case of V::check free variable capture condition 72b〉≡ (72a)

    int captd;
    if (term->captured(boundVars, captd)) {
        setSelector(STDERR);
        cerr << " ** Matching Failed: Free variable capture in redex-match.\n";
        ioprint("head = "); head->print(); ioprintln();
        ioprint("term = "); term->print(); ioprintln();
        return false;
    }

Uses captured 28d 29a, ioprint 246 247a, ioprintln 246 247a, redex 15b, setSelector 246 247a, and STDERR 246.

Comment 2.2.103. The case of applications is particularly simple. We first try to match the left child. If successful, we match the right child.

72c 〈redex-match::case of APP 72c〉≡ (70c)

    if (!redex_match(head->leftc(), term->leftc(), theta, boundVars)) return false;
    return redex_match(head->rightc(), term->rightc(), theta, boundVars);

Uses leftc 11d, redex_match 70a 70b 70c, and rightc 11d.

Comment 2.2.104. These can be used to debug redex_match.

72d 〈redex-match::case of APP::debug matching 1 72d〉≡

    if (verbose == 3) {
        ioprint("\n\t\tmatching "); head->leftc()->print();
        ioprint(" and "); term->leftc()->print(); ioprint(" ... ");
    }

Uses ioprint 246 247a, leftc 11d, and verbose 242.

72e 〈redex-match::case of APP::debug matching 2 72e〉≡

    if (verbose == 3) {
        ioprint(" successful\n");
        ioprint("\t\tmatching "); head->rightc()->print(); ioprint(" and ");
        term->rightc()->print(); ioprint(" ... ");
    }

Uses ioprint 246 247a, rightc 11d, and verbose 242.


Comment 2.2.105. We now look at the case of products. We cannot assume that the dimensions of head and term are equal even when the type-checker says they have the same types. Why? Well, sometimes we use a function name to represent data.

73a 〈redex-match::case of PROD 73a〉≡ (70c)

    uint size = head->fields.size();
    if (size != term->fields.size()) return false;

    for (uint i=0; i!=size; i++)
        if (!redex_match(head->fields[i], term->fields[i], theta, boundVars))
            return false;
    return true;

Uses fields 11c and redex_match 70a 70b 70c.

Comment 2.2.106. The last case is that of abstraction. We change the name of lambda variables to avoid having to worry about α-equivalence later on.

73b 〈redex-match::case of ABS 73b〉≡ (70c)

    〈redex-match::case of ABS::change variable name 73c〉
    boundVars.push_back(head);
    return redex_match(head->fields[1], term->fields[1], theta, boundVars);

Uses fields 11c and redex_match 70a 70b 70c.

Comment 2.2.107. If necessary, we need to change the name of the bound variable in head so that it is the same as the bound variable in term. In so doing, we may inadvertently capture a free variable inside head. (This is an extremely rare scenario. I have never seen it happen in any non-simulated computation.) Another variable renaming is necessary in this case.

Thanks to the preprocessing we did (see Comments 2.2.80 and 2.2.86), we need to set only the name of one variable in each case.

73c 〈redex-match::case of ABS::change variable name 73c〉≡ (73b)

    string term_var = term->fields[0]->name;
    if (head->fields[0]->name != term_var) {
        int size = head->preFVars.size();
        for (int i=0; i!=size; i++)
            if (term_var == head->preFVars[i]->name) {
                〈redex-match::write a small warning message 73d〉
                head->preFVars[i]->name = newVar();
            }
        head->fields[0]->name = term_var;
    }

Uses fields 11c and newVar 29c.

73d 〈redex-match::write a small warning message 73d〉≡ (73c)

    int osel = getSelector(); setSelector(STDERR);
    ioprint(" ** Trouble. Variable "); head->preFVars[i]->print();
    ioprint(" captured after lambda variable renaming.\n");
    setSelector(osel);

Uses captured 28d 29a, getSelector 246 247a, ioprint 246 247a, setSelector 246 247a, and STDERR 246.

Comment 2.2.108. We now look at some instructive test cases for the procedure. Evaluating the following program


(f \y.x) = True

: (f \y.(g y y))

will result in

** Matching Failed: Free variable capture in redex-match.

Final Answer: (f \y.((g y) y)).

To force a matching here, we can use the statement (f \y.x_SV) = True instead. Evaluating the same query will then result in True.

Evaluating the program

(f \x.(g x y)) = (g y y)

: (f \y.(g y y))

will result in

** Trouble. Variable y captured after lambda variable renaming.

** Matching Failed: Free variable capture in redex-match.

Final Answer: (f \y.((g y) y)).

The lambda variable x in the head of the statement is successfully renamed at first. Matching fails when we subsequently try to match the free variable y in the head of the statement with the bound variable y in the query. The reader should convince herself that matching should indeed fail in this case.

Evaluating this next program

(f \x.(g y x)) = (g y y)

: (f \y.(g z y))

will produce the answer ((g z) z).

2.2.3.3 Manipulating Substitutions

74a 〈pattern-match::function declarations 70a〉+≡ (75c) ⊳ 70a 75a ⊲

    void printTheta(vector<substitution> & theta);

Defines:
    printTheta, used in chunk 63b.
Uses substitution 31a.

74b 〈pattern-match::functions 70b〉+≡ (75d) ⊳ 70c 75b ⊲

    void printTheta(vector<substitution> & theta) {
        if (getSelector() == SILENT) return;
        ioprint('{');
        int size = theta.size();
        if (size == 0) { ioprint("}\n"); return; }
        for (int i=0; i!=size-1; i++) {
            ioprint('('); ioprint(theta[i].first); ioprint('/');
            theta[i].second->print(); ioprint("), ");
        }
        ioprint('('); ioprint(theta[size-1].first); ioprint('/');
        theta[size-1].second->print(); ioprint(')');
        ioprint("}\n");
    }

Defines:
    printTheta, used in chunk 63b.
Uses getSelector 246 247a, ioprint 246 247a, SILENT 246, and substitution 31a.


75a 〈pattern-match::function declarations 70a〉+≡ (75c) ⊳ 74a

    term_schema * findBinding(string name, vector<substitution> & theta);

Defines:
    findBinding, used in chunk 71a.


75b 〈pattern-match::functions 70b〉+≡ (75d) ⊳ 74b

    term_schema * findBinding(string name, vector<substitution> & theta) {
        int size = theta.size();
        for (int i=0; i!=size; i++)
            if (theta[i].first == name) return theta[i].second;
        return NULL;
    }

Defines:
    findBinding, used in chunk 71a.
Uses substitution 31a and term_schema 9.

2.2.3.4 File Organization

75c 〈pattern-match.h 75c〉≡

    #ifndef _PATTERN_MATCH_H
    #define _PATTERN_MATCH_H

    #include "terms.h"

    〈pattern-match::function declarations 70a〉

    #endif

Defines:
    pattern-match.h, used in chunks 59c, 75d, and 206.
Uses terms.h 9.

75d 〈pattern-match.cc 75d〉≡

    #include <iostream>
    #include <utility>
    #include <vector>
    #include "io.h"
    #include "pattern-match.h"
    #include "tables.h"

    〈pattern-match::functions 70b〉

Uses io.h 246, pattern-match.h 75c, and tables.h 96b.

2.3 Types

Comment 2.3.1. Types are defined inductively in the logic, which lends itself nicely to the use of the composite pattern [GHJV95, p.163] for their implementation. We differentiate between atomic and composite types. Atomic types are obtained from type constructors with arity 0. Examples of these include int, float, nat, char, string, etc. (Note that string is a nullary type constructor in this case. Strings in general can also be constructed from List char.) They are the base types, and occupy the leaf nodes of a composite type structure. Everything else is a composite type. Examples of composite types include types obtained from type constructors of non-zero arity like List α, Btree α, Graph α β, etc; function types like set α (this is equivalent to α → Ω) and multiset α (α → nat); and product types obtained from the tuple-forming operator. The following is an outline of the data types module. We first give the abstract classes, followed by the actual data types.

76a 〈types.h 76a〉≡

    #ifndef _DATATYPE_H_
    #define _DATATYPE_H_

    #include <set>
    #include <vector>
    #include <string>
    #include <assert.h>
    #include <iostream>
    using namespace std;
    #define dcast dynamic_cast

    〈type::function declarations 81a〉
    〈type::type 77a〉
    〈type::composite types 78a〉
    〈type::parameters 79c〉
    〈type::tuples 81d〉
    〈type::algebraic types 84b〉
    〈type::abstractions 82c〉
    〈type::synonyms 81c〉

    #endif

Defines:
    types.h, used in chunks 76b, 85, 130c, 232, and 241.

76b 〈types.cc 76b〉≡

    #include "types.h"
    #define uint unsigned int

    〈type::functions 77c〉
    〈type::composite types::implementation 78b〉
    〈type::parameters::implementation 80a〉
    〈type::tuples::implementation 82a〉
    〈type::algebraic types::implementation 85a〉
    〈type::abstractions::implementation 83a〉

Uses types.h 76a.
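The composite structure described in Comment 2.3.1 can be sketched independently of the module itself. The classes below (Type, Atomic, Parameter, Composite) are hypothetical simplifications that use std::unique_ptr for ownership rather than the reference counting used in Alkemy. They show how atomic types sit at the leaves and how an operation such as collecting type parameters is answered recursively by the composite nodes.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a composite type hierarchy; names are illustrative.
struct Type {
    virtual ~Type() = default;
    virtual std::string name() const = 0;
    virtual std::set<std::string> parameters() const { return {}; }
};

// Leaf: a nullary type constructor such as int or string.
struct Atomic : Type {
    std::string tag;
    explicit Atomic(std::string t) : tag(std::move(t)) {}
    std::string name() const override { return tag; }
};

// Leaf: a type variable.
struct Parameter : Type {
    std::string vname;
    explicit Parameter(std::string v) : vname(std::move(v)) {}
    std::string name() const override { return vname; }
    std::set<std::string> parameters() const override { return {vname}; }
};

// Inner node: a type constructor applied to sub-types.
struct Composite : Type {
    std::string ctor;
    std::vector<std::unique_ptr<Type>> alpha;  // the sub-types
    explicit Composite(std::string c) : ctor(std::move(c)) {}
    void add(std::unique_ptr<Type> t) { alpha.push_back(std::move(t)); }
    std::string name() const override {
        std::string s = "(" + ctor;
        for (const auto& a : alpha) s += " " + a->name();
        return s + ")";
    }
    std::set<std::string> parameters() const override {
        std::set<std::string> ret;  // union of the parameters of all sub-types
        for (const auto& a : alpha) {
            std::set<std::string> sub = a->parameters();
            ret.insert(sub.begin(), sub.end());
        }
        return ret;
    }
};
```

Client code only ever talks to the Type interface, so a leaf and an arbitrarily nested composite can be handled uniformly.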

Comment 2.3.2. The top-level type structure contains as members those variables and functions that are common to all types. Every type obviously has a name. The functions setAlpha and addAlpha are used to configure subtypes; they are defined only for composite types like tuples and lists. (See Comment 2.3.4 for details.)

77a 〈type::type 77a〉≡ (76a) 77b ⊲

    class type {
    public:
        int count;
        type() { count = 0; }
        type(string n) : tag(n) { count = 0; }
        virtual ~type() {}
        virtual void setAlpha(type * x, unsigned int y) {}
        virtual void addAlpha(type * x) {}
        virtual type * getAlpha(unsigned int x) { return NULL; }
        virtual int alphaCount() { return 0; }
        virtual bool isComposite() { return false; }
        virtual bool isTuple() { return false; }
        virtual bool isAbstract() { return false; }
        virtual bool isParameter() { return false; }
        virtual bool isSynonym() { return false; }
        virtual bool isUdefined() { return false; }
        virtual string getName() { return tag; }
        virtual string getTag() { return tag; }
        virtual type * clone() { count++; return this; }
        virtual void deccount() { count--; }
        virtual set<string> getParameters() { set<string> ret; return ret; }
        virtual void renameParameters() {}
        virtual void renameParameter(string name) {}
    protected:
        string tag;
    };

Uses clone 19a 19b, getParameters 79a 80c, isUdefined 84b, renameParameter 79b 80e, renameParameters 79b 80d, and tag 10c.

Comment 2.3.3. We use reference counting for the memory management of the base types. The variable count keeps track of the number of references to a type. Deallocation of a type structure is done using the function delete_type defined as follows.

77b 〈type::type 77a〉+≡ (76a) ⊳ 77a

    void delete_type(type * x);

Defines:
    delete_type, used in chunks 62b, 78b, 81c, 86–88, 92b, 95b, 102d, 107, 110a, 117, 118c, 173a, 214, 227a, 236a, 238b, 241, and 243a.

77c 〈type::functions 77c〉≡ (76b) 81b ⊲

    void delete_type(type * x) {
        if (x->count == 0) delete x; else x->deccount();
    }

Defines:
    delete_type, used in chunks 62b, 78b, 81c, 86–88, 92b, 95b, 102d, 107, 110a, 117, 118c, 173a, 214, 227a, 236a, 238b, 241, and 243a.
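The counting convention of Comment 2.3.3 is worth spelling out: count starts at 0 for the original owner, clone hands out the same object while incrementing count, and delete_type frees the object only when no extra references remain. The following sketch mirrors that convention with hypothetical names (BaseType, release, and a live counter that exists only to make the effect observable).

```cpp
#include <cassert>

// Hypothetical stand-in for an atomic type; only the counting logic is shown.
struct BaseType {
    static int live;  // number of BaseType objects currently allocated
    int count = 0;    // extra references beyond the original owner
    BaseType() { ++live; }
    ~BaseType() { --live; }
    BaseType* clone() { ++count; return this; }  // share instead of copying
};
int BaseType::live = 0;

// Give up one reference; the last owner to let go frees the object.
void release(BaseType* x) {
    if (x->count == 0) delete x;
    else --x->count;
}
```

A consequence of this scheme is that every pointer obtained from the constructor or from clone must be released exactly once, and that a cloned atomic type is never a distinct object.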

Comment 2.3.4. The following is the class declaration for composite types. The member alpha stores the sub-types in the composite structure. It serves different purposes for different kinds of composite types.

78a 〈type::composite types 78a〉≡ (76a)

    class type_composite : public type {
    protected:
        vector<type *> alpha;
    public:
        virtual ~type_composite();
        bool isComposite() { return true; }
        virtual void deccount();
        virtual void setAlpha(type * x, unsigned int y);
        virtual void addAlpha(type * x) { alpha.push_back(x); }
        virtual type * getAlpha(unsigned int x);
        virtual int alphaCount() { return alpha.size(); }
        virtual string getName();
        virtual type * clone() { assert(false); return NULL; }
        virtual set<string> getParameters();
        virtual void renameParameters();
        virtual void renameParameter(string name);
    };

Defines:
    type_composite, used in chunks 78, 79, 81d, 82c, and 84c.
Uses clone 19a 19b, getParameters 79a 80c, renameParameter 79b 80e, and renameParameters 79b 80d.

78b 〈type::composite types::implementation 78b〉≡ (76b) 78c ⊲

    type_composite::~type_composite() {
        for (unsigned int i=0; i!=alpha.size(); i++) delete_type(alpha[i]);
    }

Uses delete_type 77b 77c and type_composite 78a.

78c 〈type::composite types::implementation 78b〉+≡ (76b) ⊳ 78b 78d ⊲

    void type_composite::deccount() {
        count--;
        for (unsigned int i=0; i!=alpha.size(); i++) alpha[i]->deccount();
    }

Uses type_composite 78a.

78d 〈type::composite types::implementation 78b〉+≡ (76b) ⊳ 78c 79a ⊲

    void type_composite::setAlpha(type * x, unsigned int y)
        { assert(y < alpha.size()); alpha[y] = x; }

    type * type_composite::getAlpha(unsigned int x)
        { assert(x < alpha.size()); return alpha[x]; }

    string type_composite::getName() { assert(false); return ""; }

Uses type_composite 78a.

Comment 2.3.5. The following functions are used during unification and type-checking. The first one collects in a set all the parameters in a type. This is used in the unification algorithm. The second and third functions are used to rename parameters during instantiation and type checking.

79a 〈type::composite types::implementation 78b〉+≡ (76b) ⊳ 78d 79b ⊲

    set<string> type_composite::getParameters() {
        set<string> ret;
        for (unsigned int i=0; i!=alpha.size(); i++) {
            set<string> temp = alpha[i]->getParameters();
            ret.insert(temp.begin(), temp.end());
        }
        return ret;
    }

Defines:
    getParameters, used in chunks 77–79, 86e, and 109d.
Uses insert 11d and type_composite 78a.

79b 〈type::composite types::implementation 78b〉+≡ (76b) ⊳ 79a

    void type_composite::renameParameters() {
        set<string> ps = getParameters();
        set<string>::iterator p = ps.begin();
        while (p != ps.end()) { renameParameter(*p); inc_counter(); p++; }
    }

    void type_composite::renameParameter(string name) {
        for (unsigned int i=0; i!=alpha.size(); i++)
            alpha[i]->renameParameter(name);
    }

Defines:
    renameParameter, used in chunks 77–80.
    renameParameters, used in chunks 77–79, 91a, 102a, 107a, 123b, and 219b.
Uses getParameters 79a 80c, inc_counter 81a 81b, and type_composite 78a.

Comment 2.3.6. Parameters are type variables.

79c 〈type::parameters 79c〉≡ (76a)

class type_parameter : public type {
public:
    type_parameter();
    type_parameter(string x) { tag = "Parameter"; vname = x; }
    type * clone() { return new type_parameter(vname); }
    bool isParameter() { return true; }
    string getName() { return tag + "_" + vname; }
    set<string> getParameters();
    void renameParameters();
    void renameParameter(string name);
private:
    string vname;
};

extern string newParameterName();

Defines:

type_parameter, used in chunks 80, 91c, 93a, 94a, 108c, 227a, and 243c.Uses clone 19a 19b, getParameters 79a 80c, newParameterName 80b, renameParameter 79b 80e,

renameParameters 79b 80d, and tag 10c.

Comment 2.3.7. When we create a new type parameter, a distinct name of the form alpha_i, where i is a number, is assigned to the parameter.


80a 〈type::parameters::implementation 80a〉≡ (76b) 80b ⊲

#include "global.h"

static int parameterCount = 0;

type_parameter::type_parameter() { tag = "Parameter"; vname = newParameterName(); }

Uses global.h 232, newParameterName 80b, tag 10c, and type_parameter 79c.

Comment 2.3.8. New parameter names are created by the following function. The variable parameterCount serves as the index for new parameter names.

80b 〈type::parameters::implementation 80a〉+≡ (76b) ⊳ 80a 80c ⊲

string newParameterName() {
    string vname("alpha");
    vname += numtostring(parameterCount++);
    return vname;
}

Defines:newParameterName, used in chunks 79c, 80a, and 94a.

Uses numtostring 241.

80c 〈type::parameters::implementation 80a〉+≡ (76b) ⊳ 80b 80d ⊲

set<string> type_parameter::getParameters() {
    set<string> ret; string temp = tag + "_" + vname;
    ret.insert(temp); return ret;
}

Defines:

getParameters, used in chunks 77–79, 86e, and 109d.Uses insert 11d, tag 10c, and type_parameter 79c.

80d 〈type::parameters::implementation 80a〉+≡ (76b) ⊳ 80c 80e ⊲

void type_parameter::renameParameters() {
    string temp = tag+"_"+vname; renameParameter(temp); inc_counter();
}

Defines:renameParameters, used in chunks 77–79, 91a, 102a, 107a, 123b, and 219b.

Uses inc_counter 81a 81b, renameParameter 79b 80e, tag 10c, and type_parameter 79c.

Comment 2.3.9. If a parameter has already been indexed, we first remove its index and then attach a new one. The function rfind searches from the end of vname and returns npos if no underscore is found.

80e 〈type::parameters::implementation 80a〉+≡ (76b) ⊳ 80d

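void type_parameter::renameParameter(string name) {
    string tname = tag + "_" + vname;
    if (tname != name) return;
    // 16 bytes hold "_" plus any int value; snprintf guards against overflow
    char temp[16]; snprintf(temp, sizeof(temp), "_%d", get_counter_value());

    // rfind returns a string::size_type; a narrower uint could never compare
    // equal to npos on platforms where size_t is wider than unsigned int
    string::size_type i = vname.rfind("_");
    if (i != string::npos) vname.erase(i, vname.size()-i);

    string temp2(temp); vname = vname + temp2;
}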

Defines:renameParameter, used in chunks 77–80.

Uses get_counter_value 81a 81b, tag 10c, and type_parameter 79c.
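The re-indexing step described in Comment 2.3.9 can be sketched in isolation. The helper name reindex below is hypothetical and not part of Alkemy; it takes the counter value as an argument rather than reading the global counter.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not in the Alkemy sources): strip a trailing
// "_<index>" suffix, if present, and attach a new index.
std::string reindex(const std::string& vname, int counter) {
    std::string base = vname;
    // rfind searches from the end and returns npos if no '_' exists.
    std::string::size_type i = base.rfind('_');
    if (i != std::string::npos)
        base.erase(i);  // drop the old index together with its underscore
    return base + "_" + std::to_string(counter);
}
```

Under this sketch, a fresh name such as alpha3 becomes alpha3_7 when the counter is 7, and a later renaming with counter 12 replaces the old suffix to give alpha3_12.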


Comment 2.3.10. Sometimes parameters need to be renamed to avoid name capture. We use a global counter for this purpose.

81a 〈type::function declarations 81a〉≡ (76a)

void inc_counter();
int get_counter_value();

Defines:get_counter_value, used in chunk 80e.inc_counter, used in chunks 79b and 80d.

81b 〈type::functions 77c〉+≡ (76b) ⊳ 77c

static int counter = 0;
void inc_counter() { counter++; }
int get_counter_value() { return counter; }

Defines:get_counter_value, used in chunk 80e.inc_counter, used in chunks 79b and 80d.

Comment 2.3.11. Users can define type synonyms of the form t1 = t2, where t1 is an identifier and t2 is the actual type. These are handled by the following class. The identifier t1 is stored in tname; the actual type t2 is stored in actual.

81c 〈type::synonyms 81c〉≡ (76a)

class type_synonym : public type {
public:
    type_synonym(string name, type * ac) { tag = name; tname = name; actual = ac; }
    ~type_synonym() { delete_type(actual); }
    type * clone() {
        // assert(actual); count++; actual->count++; return this;
        assert(actual);
        return new type_synonym(tname, actual->clone());
    }
    void deccount() { assert(false); }
    bool isSynonym() { return true; }
    type * getActual() { return actual; }
    string getName() { return actual->getName(); }
private:
    type * actual;
    string tname;
};

Defines:

type_synonym, used in chunks 87b, 117a, 173a, and 227a.Uses clone 19a 19b, delete_type 77b 77c, and tag 10c.

Comment 2.3.12. The following is used to create product types.

81d 〈type::tuples 81d〉≡ (76a)

class type_tuple : public type_composite {
public:
    type_tuple() { tag = "Tuple"; }
    type * clone();
    bool isTuple() { return true; }
    string getName();
};

Defines:

type_tuple, used in chunks 82, 84c, 94c, 227, and 228a.Uses clone 19a 19b, tag 10c, and type_composite 78a.


82a 〈type::tuples::implementation 82a〉≡ (76b) 82b ⊲

type * type_tuple::clone() {
    type_tuple * ret = new type_tuple;
    for (int i=0; i!=alphaCount(); i++)
        ret->addAlpha(alpha[i]->clone());
    return ret;
}

Uses clone 19a 19b and type_tuple 81d.

82b 〈type::tuples::implementation 82a〉+≡ (76b) ⊳ 82a

string type_tuple::getName() {
    string ret = "( ";
    for (unsigned int i=0; i!=alpha.size()-1; i++)
        ret = ret + alpha[i]->getName() + " * ";
    ret = ret + alpha[alpha.size()-1]->getName() + ")";
    return ret;
}

Uses type_tuple 81d.

Comment 2.3.13. This class is used for the construction of function types. It is worth mentioning that sets and multisets have function types.

Function types of particular interest here are those for transformations. The variable rank records the rank of a transformation; this value can be calculated using compRank. The functions getSource and getTarget return the source and target of a transformation. The function getArg returns the n-th argument.

82c 〈type::abstractions 82c〉≡ (76a)

class type_abstraction : public type_composite {
public:
    int rank;
    type_abstraction() { tag = "Arrow"; rank = -5; }
    type_abstraction(type * source, type * target) {
        tag = "Arrow"; rank = -5;
        addAlpha(source); addAlpha(target);
    }
    bool isAbstract() { return true; }
    type * clone();
    type * getArg(int n);
    type * getSource();
    type * getTarget();
    string getName();
    int compRank();
};

Defines:getArg, used in chunk 107a.getSource, used in chunks 108, 109, 121b, 122a, 124, 129, and 217c.getTarget, used in chunks 108 and 109.type_abstraction, used in chunks 83, 84a, 93a, 94a, 106–109, 121a, 122a, 216b, 227d, and 243c.

Uses clone 19a 19b, compRank 84a, tag 10c, and type_composite 78a.


83a 〈type::abstractions::implementation 83a〉≡ (76b) 83b ⊲

type * type_abstraction::clone() {
    type_abstraction * ret =
        new type_abstraction(alpha[0]->clone(), alpha[1]->clone());
    ret->rank = rank;
    return ret;
}

Uses clone 19a 19b and type_abstraction 82c.

83b 〈type::abstractions::implementation 83a〉+≡ (76b) ⊳ 83a 83c ⊲

string type_abstraction::getName() {
    string ret;
    if (alpha[0]->isComposite())
        ret = "(" + alpha[0]->getName() + ") -> ";
    else ret = alpha[0]->getName() + " -> ";
    if (alpha[1]->isComposite())
        ret = ret + "(" + alpha[1]->getName() + ")";
    else ret = ret + alpha[1]->getName();
    return ret;
}

Uses type_abstraction 82c.

83c 〈type::abstractions::implementation 83a〉+≡ (76b) ⊳ 83b 83d ⊲

type * type_abstraction::getArg(int n) {
    assert(n < rank);
    type * p = this;
    int temp = 0;
    while (temp != n) { p = p->getAlpha(1); temp++; }
    return p->getAlpha(0);
}

Defines:

getArg, used in chunk 107a.Uses type_abstraction 82c.

83d 〈type::abstractions::implementation 83a〉+≡ (76b) ⊳ 83c 83e ⊲

type * type_abstraction::getSource() {
    assert(rank != -5);
    type * p = this;
    for (int i=0; i!=rank; i++) p = p->getAlpha(1);
    assert(p->getAlpha(0)); return p->getAlpha(0);
}

Defines:

getSource, used in chunks 108, 109, 121b, 122a, 124, 129, and 217c.Uses type_abstraction 82c.

83e 〈type::abstractions::implementation 83a〉+≡ (76b) ⊳ 83d 84a ⊲

type * type_abstraction::getTarget() {
    assert(rank != -5);
    type * p = this;
    for (int i=0; i!=rank; i++) p = p->getAlpha(1);
    return p->getAlpha(1);
}

Defines:

getTarget, used in chunks 108 and 109.Uses type_abstraction 82c.


Comment 2.3.14. This function computes the rank of a transformation. We inspect the spine of the type and count the number of predicate types appearing in it.

84a 〈type::abstractions::implementation 83a〉+≡ (76b) ⊳ 83e

int type_abstraction::compRank() {
    if (alpha[1]->isAbstract() && alpha[0]->isAbstract() &&
        alpha[0]->getAlpha(1)->getTag() == "Bool") {
        type_abstraction * t = dcast<type_abstraction *>(alpha[1]);
        return 1 + t->compRank();
    }
    return 0;
}

Defines:

compRank, used in chunks 82c and 216b.Uses type_abstraction 82c.
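To illustrate how the rank falls out of the spine of a type, here is a minimal sketch on a toy type representation. The names Ty, base, arrow, rankOf and sourceOf are assumptions made for this example only; Alkemy itself uses the type_abstraction class above with manual reference counting rather than shared pointers.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy representation (illustration only): a node is either a base type
// carrying a tag, or an arrow "source -> target" tagged "Arrow".
struct Ty {
    std::string tag;
    std::shared_ptr<Ty> source, target;
    bool isArrow() const { return tag == "Arrow"; }
};
using TyP = std::shared_ptr<Ty>;

TyP base(const std::string& tag) {
    return std::make_shared<Ty>(Ty{tag, nullptr, nullptr});
}
TyP arrow(TyP s, TyP t) {
    return std::make_shared<Ty>(Ty{"Arrow", s, t});
}

// A predicate type is an arrow whose target is Bool.
bool isPredicate(const TyP& t) {
    return t->isArrow() && t->target->tag == "Bool";
}

// Count the leading predicate arguments along the spine, mirroring the
// condition tested in compRank.
int rankOf(const TyP& t) {
    if (t->isArrow() && t->target->isArrow() && isPredicate(t->source))
        return 1 + rankOf(t->target);
    return 0;
}

// Skip rank-many arguments, then take the source, as getSource does.
TyP sourceOf(TyP t) {
    for (int i = rankOf(t); i > 0; --i) t = t->target;
    return t->source;
}
```

For a type of the shape (Int -> Bool) -> (Int -> Bool) -> Mu -> Sigma, this sketch yields rank 2 and source Mu, while a plain predicate Int -> Bool has rank 0.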

Comment 2.3.15. Algebraic types are supported by the following classes. The class type_udefined supports nullary type constructors; the class type_alg supports non-nullary type constructors. It may make sense to combine the two into a single class.

84b 〈type::algebraic types 84b〉≡ (76a) 84c ⊲

class type_udefined : public type {
    const vector<string> values;
public:
    type_udefined(string & tname, const vector<string> &vals)
        : type(tname), values(vals) {}
    type_udefined(string & tname) : type(tname) {}
    bool isUdefined() { return true; }
    // type * clone() { count++; return this; }
    const vector<string> & getValues() { return values; }
};

Defines:

isUdefined, used in chunks 77a and 208.type_udefined, used in chunks 208a, 210e, and 227a.

Uses clone 19a 19b.

84c 〈type::algebraic types 84b〉+≡ (76a) ⊳ 84b

class type_alg : public type_composite {
public:
    type_alg(string tid) { tag = tid; }
    type_alg(string tid, vector<type *> x) {
        tag = tid;
        for (unsigned int i=0; i!=x.size(); i++)
            addAlpha(x[i]->clone());
    }
    type_alg(string tid, type_tuple * x) {
        tag = tid;
        for (int i=0; i!=x->alphaCount(); i++)
            addAlpha(x->getAlpha(i)->clone());
    }
    type * clone() { return new type_alg(tag, alpha); }
    string getName();
};

Defines:

type_alg, used in chunks 85a, 227a, and 243c.Uses clone 19a 19b, tag 10c, type_composite 78a, and type_tuple 81d.


85a 〈type::algebraic types::implementation 85a〉≡ (76b)

string type_alg::getName() {
    string ret = "(" + tag;
    for (uint i=0; i!=alpha.size()-1; i++)
        ret = ret + " " + alpha[i]->getName();
    ret = ret + " " + alpha[alpha.size()-1]->getName() + ")";
    return ret;
}

Uses tag 10c and type_alg 84c.

2.3.1 Unification

Comment 2.3.16. We now discuss type unification. The type unification algorithm given here is adapted from the one given in [Pey87, Chap. 5].

85b 〈unification.h 85b〉≡
#ifndef _UNIFICATION_H_
#define _UNIFICATION_H_

#include "terms.h"
#include "types.h"
#include <vector>
#include <utility>

struct term_type { term_schema * first; type * second; };
extern bool unify(vector<pair<string,type *> > &eqns, type *tvn, type *t);
extern type * apply_subst(vector<pair<string, type *> > & eqns, type * x);
extern type * wellTyped(term_schema * t);
extern pair<type *, vector<term_type> > mywellTyped(term_schema * t);
extern type * get_type_from_syn(type * in);

#endif

Defines:term_type, used in chunks 90b, 91b, 96a, and 241–43.unification.h, used in chunks 117b, 131, 206, 232, and 241.

Uses apply_subst 86b, get_type_from_syn 87b, term_schema 9, terms.h 9, types.h 76a, and unify 88.

85c 〈unification.cc 85c〉≡
#include <iostream>
#include <utility>
#include <vector>
#include <string>
#include "types.h"

bool unify_verbose = false; // set this to see the unification process
〈unification body 86a〉
〈type checking 95c〉

Defines:unify_verbose, used in chunks 88, 89, and 93c.

Uses types.h 76a.

Comment 2.3.17. The function getBinding returns the binding for parameter x in a type substitution θ. If θ has no binding for x, then x itself is returned.


86a 〈unification body 86a〉≡ (85c) 86b ⊲

type * getBinding(vector<pair<string, type *> > & eqns, type * x) {
    assert(x->isParameter());
    string vname = x->getName();
    for (unsigned int i=0; i!=eqns.size(); i++)
        if (eqns[i].first == vname) return eqns[i].second;
    return x;
}

Defines:

getBinding, used in chunks 86b and 88.

Comment 2.3.18. Given a type substitution θ and a type t with parameters, apply_subst computes tθ.

86b 〈unification body 86a〉+≡ (85c) ⊳ 86a 86c ⊲

type * apply_subst(vector<pair<string, type *> > & eqns, type * t) {
    if (t->isParameter())
        return getBinding(eqns, t)->clone();
    type * ret = t->clone();
    for (int i=0; i!=ret->alphaCount(); i++) {
        type * temp = apply_subst(eqns, ret->getAlpha(i));
        delete_type(ret->getAlpha(i));
        ret->setAlpha(temp, i);
    }
    return ret;
}

Defines:apply_subst, used in chunks 85b, 87a, 88, 92b, and 107.

Uses clone 19a 19b, delete_type 77b 77c, and getBinding 86a.

Comment 2.3.19. This function extends a substitution θ with an additional equation x = t. If t is x, the extension succeeds trivially. Otherwise, the extension succeeds unless x appears in t.

86c 〈unification body 86a〉+≡ (85c) ⊳ 86b 87b ⊲

bool extend(vector<pair<string, type *> > & eqns, type * x, type * t) {
    assert(x->isParameter());
    〈delete eqns of the form x = x 86d〉
    〈if x appears in t, return false 86e〉
    〈apply (x,t) to each eqn in eqns, extend eqns and return true 87a〉
}

Defines:extend, used in chunk 88.

86d 〈delete eqns of the form x = x 86d〉≡ (86c)

if (t->isParameter())
    if (x->getName() == t->getName()) return true;

86e 〈if x appears in t, return false 86e〉≡ (86c)

// case of t not a parameter
set<string> parameters = t->getParameters();
if (parameters.find(x->getName()) != parameters.end())
    return false;

Uses getParameters 79a 80c.


87a 〈apply (x,t) to each eqn in eqns, extend eqns and return true 87a〉≡ (86c)

for (unsigned int i=0; i!=eqns.size(); i++) {
    type * temp = eqns[i].second;
    eqns[i].second = apply_subst(eqns, temp);
    delete_type(temp);
}
pair<string, type *> eqn(x->getName(), t->clone());
eqns.push_back(eqn);
return true;

Uses apply_subst 86b, clone 19a 19b, and delete_type 77b 77c.

Comment 2.3.20. This function extracts the actual type of a synonym. We may need to go through several redirections to reach the actual type.

87b 〈unification body 86a〉+≡ (85c) ⊳ 86c 88 ⊲

type * get_type_from_syn(type * in) {
    type * ret = in;
    while (ret->isSynonym())
        ret = dcast<type_synonym *>(ret)->getActual();
    return ret;
}

Defines:get_type_from_syn, used in chunks 85b, 88, and 214b.

Uses type_synonym 81c.


Comment 2.3.21. This function returns whether two types tvn and t are unifiable. If one of the two, say tvn, is a parameter, we try to extend eqns with the equation (tvn = t). Otherwise, we compare the tags and, if they match, recursively unify the subtypes.

88 〈unification body 86a〉+≡ (85c) ⊳ 87b

bool unify(vector<pair<string,type ∗> > &eqns, type ∗ tvn, type ∗ t) 〈unify::verbose 1 89b〉if (tvn→isSynonym()) tvn = get_type_from_syn(tvn);if (t→isSynonym()) t = get_type_from_syn(t);〈unify::verbose 2 89c〉

bool ret = false;if (tvn→isParameter())

type ∗ phitvn = getBinding(eqns, tvn)→clone();

type ∗ phit = apply_subst(eqns, t);// if phitvn == tvnif (phitvn→isParameter())

if (tvn→getName() ≡ phitvn→getName()) ret = extend(eqns, tvn, phit);delete_type(phit); delete_type(phitvn);if (unify_verbose) cerr ≪ ret ≪ endl;return ret;

else

ret = unify(eqns, phitvn, phit);delete_type(phit); delete_type(phitvn);if (unify_verbose) cerr ≪ ret ≪ endl;return ret;

// switch placeif (tvn→isParameter() ≡ false ∧ t→isParameter())

return unify(eqns, t, tvn);

〈unify::case of both non-parameters 89a〉return true;

Defines:unify, used in chunks 62b, 85b, 89a, 92b, 93c, 107a, 110a, 118a, and 214b.

Uses apply_subst 86b, clone 19a 19b, delete_type 77b 77c, extend 86c, get_type_from_syn 87b, getBinding 86a,and unify_verbose 85c.


89a 〈unify::case of both non-parameters 89a〉≡ (88)

if (tvn->isParameter() == false && t->isParameter() == false) {
    if (tvn->getTag() != t->getTag()) return false;
    if (tvn->getTag() == "Tuple" && t->getTag() == "Tuple")
        if (tvn->alphaCount() != t->alphaCount()) {
            if (unify_verbose) cerr << false << endl;
            return false;
        }
    // unify each component
    if (tvn->alphaCount() != t->alphaCount()) {
        cerr << "Error in unification. Argument counts don't match.\n";
        cerr << "tvn = " << tvn->getName() << endl;
        cerr << " t = " << t->getName() << endl;
        assert(false);
    }
    for (int i=0; i!=tvn->alphaCount(); i++) {
        bool r = unify(eqns, tvn->getAlpha(i), t->getAlpha(i));
        if (r == false) return false;
    }
}

Uses unify 88 and unify_verbose 85c.
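The interplay of substitution application, the occurs check, and extension described in Comments 2.3.17 to 2.3.21 can be sketched on a toy type representation. This is a simplified illustration and not the Alkemy code: Ty, param, con, applySubst, occurs and Subst are names invented here, a std::map stands in for the vector of equations, shared pointers stand in for manual reference counting, and synonyms are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy types (illustration only): a parameter has isParam set and no
// arguments; any other type has a tag and a list of argument types.
struct Ty;
using TyP = std::shared_ptr<Ty>;
struct Ty {
    std::string tag;
    bool isParam;
    std::vector<TyP> args;
};
using Subst = std::map<std::string, TyP>;

TyP param(const std::string& n) {
    return std::make_shared<Ty>(Ty{n, true, {}});
}
TyP con(const std::string& tag, std::vector<TyP> args = {}) {
    return std::make_shared<Ty>(Ty{tag, false, std::move(args)});
}

// Compute t theta: replace every bound parameter by its binding.
TyP applySubst(const Subst& theta, const TyP& t) {
    if (t->isParam) {
        auto it = theta.find(t->tag);
        return it == theta.end() ? t : it->second;
    }
    std::vector<TyP> args;
    for (const TyP& a : t->args) args.push_back(applySubst(theta, a));
    return con(t->tag, args);
}

// Occurs check: does parameter x appear anywhere inside t?
bool occurs(const std::string& x, const TyP& t) {
    if (t->isParam) return t->tag == x;
    for (const TyP& a : t->args)
        if (occurs(x, a)) return true;
    return false;
}

// Extend theta with x = t, keeping existing bindings fully substituted.
bool extend(Subst& theta, const std::string& x, const TyP& t) {
    if (t->isParam && t->tag == x) return true;  // x = x: nothing to add
    if (occurs(x, t)) return false;              // would need an infinite type
    Subst single{{x, t}};
    for (auto& kv : theta) kv.second = applySubst(single, kv.second);
    theta[x] = t;
    return true;
}

bool unify(Subst& theta, TyP a, TyP b) {
    a = applySubst(theta, a);
    b = applySubst(theta, b);
    if (a->isParam) return extend(theta, a->tag, b);
    if (b->isParam) return extend(theta, b->tag, a);  // switch places
    if (a->tag != b->tag || a->args.size() != b->args.size()) return false;
    for (std::size_t i = 0; i < a->args.size(); ++i)
        if (!unify(theta, a->args[i], b->args[i])) return false;
    return true;
}
```

In this sketch, unifying a parameter a1 with Arrow(a2, Bool) succeeds and binds a1, whereas unifying a with Arrow(a, Bool) is rejected by the occurs check.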

Comment 2.3.22. We print out some information to help debugging.

89b 〈unify::verbose 1 89b〉≡ (88 89c)

if (unify_verbose)
    cerr << "Unifying " << tvn->getName() << " and " << t->getName() << endl;

Uses unify_verbose 85c.

89c 〈unify::verbose 2 89c〉≡ (88)

if (unify_verbose) cerr << "After transformation:\n";
〈unify::verbose 1 89b〉

Uses unify_verbose 85c.


2.3.2 Type Checking

Comment 2.3.23. The type-checking procedure implements the following algorithm. For more details on type checking and type inference, see, for example, [Mit96, Chap. 11].

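\[
\begin{aligned}
WT(C) &= \alpha, \text{ where } \alpha \text{ is the declared signature of } C\\
WT(x) &= \begin{cases}
\alpha & \text{if } WT(x) = \alpha \text{ has been established before;}\\
a & \text{otherwise; here, } a \text{ is a fresh parameter.}
\end{cases}\\
WT((t_1, \ldots, t_n)) &= WT(t_1) \times \cdots \times WT(t_n)\\
WT(\lambda x.t) &= \begin{cases}
\alpha \to \beta & \text{if } WT(t) = \beta \text{ and } x \text{ is free with relative type } \alpha \text{ in } t;\\
a \to \beta & \text{otherwise, where } a \text{ is a parameter.}
\end{cases}\\
WT((s\ t)) &= \beta\theta \quad \text{if } WT(s) = \alpha \to \beta,\ WT(t) = \gamma, \text{ and } \alpha \text{ and } \gamma \text{ are unifiable using } \theta.
\end{aligned}
\]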

The input term is not well-typed if any one of the WT calls on its subterms fails.

90a 〈type checking actual 90a〉≡ (95c)

type * wellTyped2(term_schema * t, vector<var_name> bvars, int scope) {
    type * ret = NULL;
    〈wellTyped2::case of t a constant 91a〉
    〈wellTyped2::case of t a variable 91c〉
    〈wellTyped2::case of t an application 92b〉
    〈wellTyped2::case of t an abstraction 94a〉
    〈wellTyped2::case of t a tuple 94c〉
    return ret;
}

Defines:wellTyped2, used in chunks 92b and 94–96.

Uses term_schema 9 and var_name 90b.

Comment 2.3.24. We first look at some data structures. The vector term_types is used to store the inferred type for each subterm of the input term. The structure var_name is used to handle variables; see Comment 2.3.27 for more details.

90b 〈type checking variables 90b〉≡ (95c)

vector<term_type> term_types;
struct var_name { string vname; string pname; };

Defines:term_types, used in chunks 91–96.var_name, used in chunks 90a and 94–96.

Uses term_type 85b.

Comment 2.3.25. If the input term t is a constant, we find its signature α in the global constants repository (the function get_signature will halt with an error if t is unknown), rename all the parameters in α to obtain α′, and then return α′. We need to rename parameters because some of the parameters in α may already have been introduced (and constrained) earlier in the type-checking process. To illustrate, consider the following type declarations.

top : a → Ω

ind : a → Ω

The term (top ind) is clearly well typed. But the type-checking procedure will fail if we do not first rename, say, the first parameter a, because unification fails when attempting to equate a with a → Ω.
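As a worked instance of the WT rules for this example, suppose renaming yields the parameter names a_1 and a_2. These names are illustrative; the names actually generated depend on the global counter.

```latex
\begin{align*}
WT(\mathit{top}) &= a_1 \to \Omega && \text{(signature of $\mathit{top}$ with $a$ renamed to $a_1$)}\\
WT(\mathit{ind}) &= a_2 \to \Omega && \text{(signature of $\mathit{ind}$ with $a$ renamed to $a_2$)}\\
\theta &= \{a_1 / (a_2 \to \Omega)\} && \text{(unifies $a_1$ with $a_2 \to \Omega$)}\\
WT((\mathit{top}\ \mathit{ind})) &= \Omega\theta = \Omega
\end{align*}
```

Without the renaming, the third step would have to solve a = a → Ω, which the occurs check rejects.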


91a 〈wellTyped2::case of t a constant 91a〉≡ (90a)

if (t->isF() || t->isD()) {
    if (t->isint) ret = new type("Int");
    else if (t->isfloat) ret = new type("Float");
    else if (t->isstring) ret = new type("String");
    else ret = get_signature(t->name);

    if (ret) { ret = ret->clone(); ret->renameParameters(); }
    else return NULL;

    〈wellTyped2::save n return 91b〉
}

Uses clone 19a 19b, get_signature 245b, isD 11a, isF 11a 11b, isfloat 12b, isint 12b, isstring 12b,and renameParameters 79b 80d.

Comment 2.3.26. Each subterm is stored in term_types the moment its type is inferred. These entries may be updated later on when parameters get instantiated further. See Comment 2.3.28.

91b 〈wellTyped2::save n return 91b〉≡ (91 92 94)

term_type res; res.first = t; res.second = ret;
term_types.push_back(res);
// setSelector(STDOUT); ioprint("\t"); t->print(); ioprint(" : ");
// ioprintln(ret->getName()); setSelector(SILENT);
return ret;

Uses ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, SILENT 246, STDOUT 246, term_type 85b,and term_types 90b.

Comment 2.3.27. To determine the type of a variable x, we need to know two things:

1. Is it a bound or a free variable?
2. Has it occurred before?

If x is a bound variable that has occurred previously, we just recycle the previously computed type. If x is a bound variable that has not occurred previously, we use the parameter name that was assigned to it earlier to create a new parameter. (See Comment 2.3.31.) Otherwise, x is free, and we check (in term_types) whether a type for x has been inferred earlier. If so, we return the inferred type. If not, we create a new parameter with a new parameter name.

91c 〈wellTyped2::case of t a variable 91c〉≡ (90a)

if (t->isVar() || t->tag == SV) {
    uint start = 0;
    for (int i=(int)bvars.size()-1; i!=-1; i--)
        if (t->name == bvars[i].vname) {
            start = scope;
            〈variable case::lookup previous occurrence 92a〉
            ret = new type_parameter(bvars[i].pname);
            〈wellTyped2::save n return 91b〉
        }
    〈variable case::lookup previous occurrence 92a〉
    ret = new type_parameter();
    〈wellTyped2::save n return 91b〉
}

Uses isVar 11a, tag 10c, and type_parameter 79c.


92a 〈variable case::lookup previous occurrence 92a〉≡ (91c)

for (uint j=start; j!=term_types.size(); j++)
    if (term_types[j].first->isVar())
        if (t->name == term_types[j].first->name) {
            ret = term_types[j].second->clone();
            〈wellTyped2::save n return 91b〉
        }

Uses clone 19a 19b, isVar 11a, and term_types 90b.

Comment 2.3.28. If the input term is an application of the form (s t), we first infer the types of s and t separately. Assuming the type of s has the form α → β, we then attempt to unify α with γ, the type of t. If there exists a θ that unifies the two, we can then return βθ as the type for (s t). We also update entries in term_types with θ to reflect new knowledge. The variable vlength keeps track of the part of term_types we can safely change.

92b 〈wellTyped2::case of t an application 92b〉≡ (90a)

if (t->isApp()) {
    unsigned int vlength = term_types.size();
    type * t1 = wellTyped2(t->leftc(), bvars, scope);
    〈wellTyped2::application::t1 should have right form 93a〉
    type * t2 = wellTyped2(t->rightc(), bvars, scope);
    〈wellTyped2::application::error reporting 93b〉

    vector<pair<string, type *> > slns;
    bool result = unify(slns, t1->getAlpha(0), t2);
    if (!result) { 〈wellTyped2::application::error reporting2 93c〉 }
    ret = apply_subst(slns, t1->getAlpha(1));

    for (uint i=vlength; i!=term_types.size(); i++) {
        type * temp = term_types[i].second;
        term_types[i].second = apply_subst(slns, temp);
        delete_type(temp);
    }
    for (uint j=0; j!=slns.size(); j++) delete_type(slns[j].second);
    slns.clear();
    〈wellTyped2::save n return 91b〉
}

Uses apply_subst 86b, clear 145b, delete_type 77b 77c, isApp 11a, leftc 11d, rightc 11d, term_types 90b,unify 88, and wellTyped2 90a.


Comment 2.3.29. The type t1 should be a function type. If this is not the case but t1 is a parameter, we can rescue the situation by making t1 a type of the form a → b, where both a and b are parameters. (This is equivalent to saying that s has type c, and that c = a → b.) If t1 is not a parameter and not a function type, we have a typing error.

93a 〈wellTyped2::application::t1 should have right form 93a〉≡ (92b)

if (!t1) {
    int osel = getSelector();
    setSelector(STDERR); t->leftc()->print();
    ioprintln(" is not well typed."); setSelector(osel); return NULL;
}
if (!t1->isAbstract() && t1->isParameter()) {
    type * temp = t1;
    t1 = new type_abstraction(temp, new type_parameter());
    term_types[term_types.size()-1].second = t1;
}
if (!t1->isAbstract()) {
    int osel = getSelector();
    setSelector(STDERR); ioprint("*** Error: ");
    t->leftc()->print(); ioprint(" : "); ioprintln(t1->getName());
    ioprintln(" does not have function type.");
    setSelector(osel);
    return NULL;
}

Uses getSelector 246 247a, ioprint 246 247a, ioprintln 246 247a, leftc 11d, setSelector 246 247a, STDERR 246,term_types 90b, type_abstraction 82c, and type_parameter 79c.

93b 〈wellTyped2::application::error reporting 93b〉≡ (92b)

if (!t2) {
    int osel = getSelector(); setSelector(STDERR);
    t->rightc()->print(); ioprintln(" is not well typed.");
    setSelector(osel); return NULL;
}

Uses getSelector 246 247a, ioprintln 246 247a, rightc 11d, setSelector 246 247a, and STDERR 246.

Comment 2.3.30. Given s : α → β and t : γ, the term (s t) is not well typed if we cannot unify α and γ.

93c 〈wellTyped2::application::error reporting2 93c〉≡ (92b)

int osel = getSelector();
setSelector(STDERR); t->print(); ioprintln(" is not well typed.");
ioprint(t1->getAlpha(0)->getName()); ioprint(" and ");
ioprint(t2->getName()); ioprintln(" are not unifiable\n");
slns.clear();
unify_verbose = true;
unify(slns, t1->getAlpha(0), t2);
setSelector(osel);
return NULL;

Uses clear 145b, getSelector 246 247a, ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a,STDERR 246, unify 88, and unify_verbose 85c.

Comment 2.3.31. Given a lambda term λx.t, the variable x is given a new parameter name (stored in bvars), and every occurrence of x in t will use the same parameter name afterwards.

The type checking procedure is simple. We first check the type of t. Then we find the relative type of x in t (recorded in term_types). If t does not contain x, then we just use the initially assigned parameter name to create a new parameter. If x has type α and t has type β, we return α → β.


94a 〈wellTyped2::case of t an abstraction 94a〉≡ (90a)

if (t->isAbs()) {
    uint vlength = term_types.size();

    var_name tmp; tmp.vname = t->fields[0]->name;
    tmp.pname = newParameterName();
    bvars.push_back(tmp);

    type * t2 = wellTyped2(t->fields[1], bvars, vlength);
    〈wellTyped2::abstraction::error reporting 94b〉

    type * vt = NULL;
    for (uint i=vlength; i!=term_types.size(); i++)
        if (term_types[i].first->isVar(t->fields[0]->name)) {
            vt = term_types[i].second->clone(); break;
        }
    if (vt == NULL) vt = new type_parameter(tmp.pname);

    ret = new type_abstraction(vt, t2->clone());
    〈wellTyped2::save n return 91b〉
}

Uses clone 19a 19b, fields 11c, isAbs 11a, isVar 11a, newParameterName 80b, term_types 90b,type_abstraction 82c, type_parameter 79c, var_name 90b, and wellTyped2 90a.

94b 〈wellTyped2::abstraction::error reporting 94b〉≡ (94a)

if (!t2) {
    int osel = getSelector();
    setSelector(STDERR); ioprint("*** Error: "); t->print();
    ioprintln(" not well typed.\n"); setSelector(osel);
    return NULL;
}

Uses getSelector 246 247a, ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, and STDERR 246.

Comment 2.3.32. The case for tuples is easy. We just infer the type of each component and then put them together.

94c 〈wellTyped2::case of t a tuple 94c〉≡ (90a)

if (t->isProd()) {
    ret = new type_tuple;
    for (unsigned int i=0; i!=t->fields.size(); i++) {
        type * ti = wellTyped2(t->fields[i], bvars, scope);
        〈wellTyped2::tuple::error reporting 94d〉
        ret->addAlpha(ti->clone());
    }
    〈wellTyped2::save n return 91b〉
}

Uses clone 19a 19b, fields 11c, isProd 11a, type_tuple 81d, and wellTyped2 90a.

94d 〈wellTyped2::tuple::error reporting 94d〉≡ (94c)

if (!ti) {
    int osel = getSelector();
    setSelector(STDERR); ioprint("*** Error: "); t->print();
    ioprintln(" not well typed.\n"); setSelector(osel);
    return NULL;
}

Uses getSelector 246 247a, ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, and STDERR 246.


Comment 2.3.33. This is a function written for debugging purposes. It prints out the contents of term_types.

95a 〈type checking subsidiary functions 95a〉≡ (95c) 95b ⊲

void print_term_types() {
    int osel = getSelector(); setSelector(STDOUT);
    ioprintln(" *** ");
    for (uint i=0; i!=term_types.size(); i++) {
        term_types[i].first->print();
        ioprint(" : "); ioprintln(term_types[i].second->getName());
    }
    setSelector(osel);
}

Uses getSelector 246 247a, ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, STDOUT 246,

and term_types 90b.

Comment 2.3.34. We need to free up the memory occupied by the intermediate types inferred for the subterms.

95b 〈type checking subsidiary functions 95a〉+≡ (95c) ⊳ 95a

void cleanup_term_types() {
    // print_term_types();
    for (uint i=0; i!=term_types.size(); i++)
        delete_type(term_types[i].second);
    term_types.clear();
}

Defines:cleanup_term_types, used in chunk 95c.

Uses clear 145b, delete_type 77b 77c, and term_types 90b.

Comment 2.3.35. The function wellTyped is a wrapper around the actual type-checking procedure wellTyped2.

95c 〈type checking 95c〉≡ (85c) 96a ⊲

#include <string>
#include <vector>
#include "global.h"
#include "terms.h"

〈type checking variables 90b〉
〈type checking subsidiary functions 95a〉
〈type checking actual 90a〉

type * wellTyped(term_schema * t) {
    vector<var_name> bvars;
    type * ret = wellTyped2(t, bvars, 0);
    if (!ret) {
        int osel = getSelector(); setSelector(STDERR);
        t->print(); ioprint(" is not well typed.\n");
        setSelector(osel); return NULL;
    }
    ret = ret->clone();
    cleanup_term_types();
    return ret;
}

Uses cleanup_term_types 95b, clone 19a 19b, getSelector 246 247a, global.h 232, ioprint 246 247a,

setSelector 246 247a, STDERR 246, term_schema 9, terms.h 9, var_name 90b, and wellTyped2 90a.

2.4. FUNCTION SYMBOL TABLE 96

Comment 2.3.36. The following is a version of wellTyped that returns both the type of the term being checked and the type of each subterm computed. The latter is needed for checking typeof side conditions on statements.

96a 〈type checking 95c〉+≡ (85c) ⊳ 95c

pair<type *, vector<term_type> > mywellTyped(term_schema * t) {
    pair<type *, vector<term_type> > res;
    vector<var_name> bvars;
    type * ret = wellTyped2(t, bvars, 0);
    if (!ret) {
        int osel = getSelector(); setSelector(STDERR);
        t->print(); ioprint(" is not well typed.\n");
        setSelector(osel); res.first = NULL; return res;
    }
    ret = ret->clone();
    res.first = ret; res.second = term_types;
    term_types.clear();
    return res;
}

Uses clear 145b, clone 19a 19b, getSelector 246 247a, ioprint 246 247a, setSelector 246 247a, STDERR 246,term_schema 9, term_type 85b, term_types 90b, var_name 90b, and wellTyped2 90a.

2.4 Function Symbol Table

Comment 2.4.1. Information about function symbols (collected during parsing) is stored in a hash table for quick and easy access. We now describe this module.

96b 〈tables.h 96b〉≡
#ifndef _TABLES_
#define _TABLES_

#include <string>
#include <vector>
#include <cassert>
#include <iostream>
using namespace std;

void initFuncTable();
void insert_ftable(string func, int earity);
int getFuncEArity(string func);
void print_ftable();
#endif

Defines:tables.h, used in chunks 9, 75d, 97a, and 206.

Uses getFuncEArity 98c, initFuncTable 97c, insert_ftable 98a, and print_ftable 99.


Comment 2.4.2. The function symbol table is represented as a (fixed-size) array, where each element in the array is a bucket of function entries that were hashed to the same index. Each bucket is represented as a vector of fEntry structures.

97a 〈tables.cc 97a〉≡ 97b ⊲

#include "tables.h"

struct fEntry {
    string name;
    int arity;
    int effectArity; // effective arity
};

#define TABLESIZE 501

static vector<fEntry> func_info[TABLESIZE];

Defines:fEntry, used in chunk 98a.func_info, used in chunks 98 and 99.TABLESIZE, used in chunks 97b and 99.

Uses tables.h 96b.

Comment 2.4.3. Clearly, we want a hash function that can be computed efficiently. Looking at the first and last characters of the function name seemed a reasonable idea. (Looking at every character seemed expensive, but there is probably not much in it.) We add size so that functions that begin and end with the same characters are hashed to different indices with high probability.

97b 〈tables.cc 97a〉+≡ ⊳ 97a 97c ⊲

static int hash(string name) {
    int size = name.size();
    int ret = name[0] * name[size-1] - (name[0] + name[size-1]) + size;
    ret = ret % TABLESIZE;
    return ret;
}

Defines:hash, used in chunk 98.

Uses TABLESIZE 97a.
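To make the behaviour of this hash concrete, the sketch below restates it as a standalone function. The names hashName and kTableSize are hypothetical, and the guards for empty names and negative remainders are additions of this example that the original code does not have.

```cpp
#include <cassert>
#include <string>

const int kTableSize = 501;  // same value as TABLESIZE in the text

// Sketch of the first/last-character hash. Unlike the original, it maps
// the empty name to 0 and never returns a negative index (assumptions
// added for this example).
int hashName(const std::string& name) {
    if (name.empty()) return 0;
    int size = static_cast<int>(name.size());
    int first = static_cast<unsigned char>(name.front());
    int last = static_cast<unsigned char>(name.back());
    int ret = first * last - (first + last) + size;
    return ((ret % kTableSize) + kTableSize) % kTableSize;
}
```

Assuming an ASCII character set, "add" hashes to 488 and "<" to 475. Names that share their first character, last character, and length, such as "add" and "and", still collide, which is why each table slot is a bucket rather than a single entry.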

Comment 2.4.4. We can probably have a scheme whereby we try out different hash functions at run time and decide on one that induces the best distribution of functions in the table.

Comment 2.4.5. Here we need to initialise information for functions that are implemented inside the code.

97c 〈tables.cc 97a〉+≡ ⊳ 97b 98a ⊲

void initFuncTable() {
    insert_ftable("add", 2); insert_ftable("sub", 2);
    insert_ftable("max", 2); insert_ftable("min", 2);
    insert_ftable("mul", 2); insert_ftable("div", 2);
    insert_ftable("mod", 2);
    insert_ftable("<", 2); insert_ftable("<=", 2);
    insert_ftable(">", 2); insert_ftable(">=", 2);
}

Defines:initFuncTable, used in chunks 96b and 213d.

Uses insert_ftable 98a.


Comment 2.4.6. Basic insertion is okay. We first check whether func is already present before inserting.

98a 〈tables.cc 97a〉+≡ ⊳ 97c 98c ⊲

void insert_ftable(string func, int earity) {
  int index = hash(func);
  int size = func_info[index].size();
  for (int i=0; i≠size; i++) {
    if (func_info[index][i].name ≡ func) {
      int effarity = func_info[index][i].effectArity;
      if (effarity ≡ earity) return;
      else { 〈insert ftable::error handling 98b〉 }
    }
  }
  fEntry f; f.name = func; f.effectArity = earity;
  func_info[index].push_back(f);
  // print_ftable();
}

Defines:insert_ftable, used in chunks 96–98 and 216d.

Uses fEntry 97a, func_info 97a, hash 97b, and print_ftable 99.

98b 〈insert ftable::error handling 98b〉≡ (98a)

cerr ≪ "Error in insert_ftable. Function " ≪ func ≪
  " is already initialized with effective arity " ≪
  func_info[index][i].effectArity ≪
  " New value is " ≪ earity ≪ endl;
exit(1);

Uses func_info 97a and insert_ftable 98a.

98c 〈tables.cc 97a〉+≡ ⊳ 98a 99 ⊲

int getFuncEArity(string func) {
  int index = hash(func);
  int size = func_info[index].size();
  for (int i=0; i≠size; i++) {
    if (func ≡ func_info[index][i].name)
      return func_info[index][i].effectArity;
  }
  cerr ≪ "Error: Function " ≪ func ≪ " unknown. "
    "Effective arity could not be determined.\n";
  // assert(false);
  return -1;
}

Defines:getFuncEArity, used in chunks 57b, 58c, and 96b.

Uses func_info 97a and hash 97b.
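The bucket layout of Comment 2.4.2, together with the insert and lookup behaviour of chunks 98a and 98c, can be condensed into a small self-contained class. This is a sketch under our own assumptions, not the tables.cc implementation: the class name FuncTable and its member names are hypothetical, std::hash replaces the hand-written hash function, and a conflicting insertion is reported through the return value instead of terminating the program.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch of a fixed-size table of buckets, one vector per index.
class FuncTable {
public:
    // Records `name` with effective arity `arity`. Returns false when the
    // name is already present with a different arity (tables.cc exits here).
    bool insert(const std::string& name, int arity) {
        assert(arity >= 0);
        std::vector<Entry>& bucket = buckets_[index(name)];
        for (const Entry& e : bucket)
            if (e.name == name) return e.effectArity == arity;
        bucket.push_back(Entry{name, arity});
        return true;
    }

    // Returns the effective arity, or -1 for an unknown name.
    int effectiveArity(const std::string& name) const {
        for (const Entry& e : buckets_[index(name)])
            if (e.name == name) return e.effectArity;
        return -1;
    }

private:
    struct Entry {
        std::string name;
        int effectArity;
    };
    static constexpr std::size_t kSize = 501;  // mirrors TABLESIZE

    // std::hash stands in for the first/last-character hash of chunk 97b.
    static std::size_t index(const std::string& name) {
        return std::hash<std::string>{}(name) % kSize;
    }

    std::array<std::vector<Entry>, kSize> buckets_;
};
```

Re-inserting a name with the same arity is a no-op, as in insert_ftable; only a change of arity is treated as an error.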


99 〈tables.cc 97a〉+≡ ⊳ 98c

void print_ftable() {
  for (int j=0; j≠TABLESIZE; j++) {
    cout ≪ j ≪ ": ";
    int size = func_info[j].size();
    for (int i=0; i≠size; i++) {
      cout ≪ "(" ≪ func_info[j][i].name ≪ " "
           ≪ func_info[j][i].effectArity ≪ ")\t";
    }
    cout ≪ endl;
  }
}

Defines:print_ftable, used in chunks 96b and 98a.

Uses func_info 97a and TABLESIZE 97a.

Chapter 3

Predicate Construction

If you want to make an apple pie from scratch,you must first create the universe.

Carl Sagan

3.1 Introduction

Comment 3.1.1. This chapter describes the predicate construction mechanism within Alkemy. We start off with a discussion of transformations and standard predicates, and the way they are implemented in the system. This is followed by a fairly technical treatment of predicate rewrite systems. The system does not support the full generality of the theory. The subset implemented is identified and described in the last section.

For a rigorous development of the ideas, the reader is referred to [Llo03]. Some earlier ideas on predicate construction can be found in [Llo00, Llo02].

3.2 Transformations and Standard Predicates

Comment 3.2.1. Predicates are built up by composing transformations. Composition is handled by the (reverse) composition function

$\circ : (a \to b) \to (b \to c) \to (a \to c)$

defined by $((f \circ g)\ x) = (g\ (f\ x))$.
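To make the direction of reverse composition concrete, here is a small sketch in which the first function is applied first. The helper names compose, twice and exceedsFive are ours and do not occur in Alkemy; the block assumes a C++14 compiler for the generic lambda.

```cpp
#include <cassert>

// Reverse composition: compose(f, g)(x) evaluates g(f(x)).
template <typename F, typename G>
auto compose(F f, G g) {
    return [f, g](auto x) { return g(f(x)); };
}

// Two illustrative functions: a transformation and a final predicate.
inline int twice(int x) { return 2 * x; }
inline bool exceedsFive(int x) { return x > 5; }

// The composed predicate tests whether 2*x > 5.
inline void composeExample() {
    auto p = compose(twice, exceedsFive);
    assert(p(3));   // 2*3 = 6 exceeds 5
    assert(!p(2));  // 2*2 = 4 does not
}
```

Reading a composed predicate from left to right therefore follows the order in which the transformations act on the individual.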

Definition 3.2.2. A transformation $f$ is a function having a signature of the form

$f : (\varrho_1 \to \Omega) \to \cdots \to (\varrho_k \to \Omega) \to \mu \to \sigma$,

where any parameters in $\varrho_1, \ldots, \varrho_k$ and $\sigma$ appear in $\mu$, and $k \ge 0$. The type $\mu$ is distinguished and is called the source of the transformation, while the type $\sigma$ is called the target of the transformation. The number $k$ is called the rank of the transformation.

Comment 3.2.3. It is understood that some collection of functions having appropriate signatures are declared to be transformations for each particular application. Examples of transformations can be found in [Llo03].

A standard predicate is defined by induction on the number of (occurrences of) transformations it contains as follows.

Definition 3.2.4. A standard predicate is a term of the form

$(f_1\ b_{1,1} \ldots b_{1,k_1}) \circ \cdots \circ (f_n\ b_{n,1} \ldots b_{n,k_n})$,

where $f_i$ is a transformation of rank $k_i$ ($i = 1, \ldots, n$), the target of $f_n$ is $\Omega$, $b_{i,j_i}$ is a standard predicate ($i = 1, \ldots, n$, $j_i = 1, \ldots, k_i$), $k_i \ge 0$ ($i = 1, \ldots, n$) and $n \ge 1$.


Comment 3.2.5. We now look at the data structures for transformations and standard predicates. Their definitions are mutually dependent, hence the need for the forward declaration of the standard predicate structure. There is a lot of interplay between their member functions. The correctness of a function in one structure often depends on the correctness of the corresponding function in the other structure.

The member tnum is the unique identifier number for each transformation. It is used to define an order on the transformations. (This is used in Sect. 3.3.) The type of a transformation is recorded in ttype. The parent links in the two structures are used to identify the enclosing transformations when they appear as arguments in transformations.

Transformations and standard predicates are nothing more than terms. We have separate data structures for them to make manipulation easier. These data structures can be turned into Escher terms easily. See Comments 3.2.11 and 3.2.19.

101a 〈predicate::representations 101a〉≡ (130c) 101b ⊲

struct std_predicate;

struct transformation_t {
  int tnum;
  int rank;
  type ∗ ttype;
  transformation_t ∗ parent;
  vector<std_predicate ∗> args;
  transformation_t(int tnum, int rank, transformation_t ∗ parent);
  transformation_t();
  〈transformation function declarations 103c〉
};

Defines:
transformation_t, used in chunks 101–107, 113b, 119b, 121–23, 128, 129, 150b, 162d, 173a, 218c, and 219b.

Uses std_predicate 101b.

101b 〈predicate::representations 101a〉+≡ (130c) ⊳ 101a

struct std_predicate {
  transformation_t ∗ parent;
  vector<transformation_t ∗> transformations;
  std_predicate();
  〈std predicate function declarations 105c〉
};

Defines:
std_predicate, used in chunks 101–105, 108–110, 113b, 115–17, 119–23, 128b, 149, 150b, 155b, 162d, 171b, 173, 181a, 184–86, 189–91, 218, and 220b.

Uses transformation_t 101a.


Comment 3.2.6. We now look at the implementation of some basic member functions of transformation. We have two constructors. The first creates a transformation according to some initial values. The second produces a top. We need to clone the type of top and rename the variables in it. More about type issues later on.

102a 〈transformation member functions 102a〉≡ (131a) 102b ⊲

transformation_t::transformation_t(int tn, int rk, transformation_t ∗ p) {
  tnum = tn; rank = rk; parent = p;
  ttype = NULL;
}

transformation_t::transformation_t() {
  tnum = getTopID();
  ttype = find_trans_info(tnum)→ttype→clone();
  ttype→renameParameters();
  rank = 0; parent = NULL;
}

Uses clone 19a 19b, find_trans_info 237a, getTopID 238a, renameParameters 79b 80d, and transformation_t 101a.

Comment 3.2.7. This function implements a (deep) cloning of the entire structure.

102b 〈transformation member functions 102a〉+≡ (131a) ⊳ 102a 102c ⊲

transformation_t ∗ transformation_t::clone() {
  transformation_t ∗ ret = new transformation_t(tnum, rank, parent);
  if (¬ttype) { cerr ≪ "Error. " ≪ ret→tnum ≪ endl; exit(1); }
  ret→ttype = ttype→clone();
  for (uint i=0; i≠args.size(); i++) {
    std_predicate ∗ temp = args[i]→clone();
    temp→adopt_parent(ret);
    ret→args.push_back(temp);
  }
  return ret;
}

Uses adopt_parent 104b, clone 19a 19b, std_predicate 101b, and transformation_t 101a.

Comment 3.2.8. Printing is a straightforward operation.

102c 〈transformation member functions 102a〉+≡ (131a) ⊳ 102b 102d ⊲

void transformation_t::print() {
  ioprint(find_trans_info(tnum)→name);
  for (uint i=0; i≠args.size(); i++) {
    ioprint(" ("); args[i]→print(); ioprint(")");
  }
}

Uses find_trans_info 237a, ioprint 246 247a, and transformation_t 101a.

Comment 3.2.9. This function deallocates the memory occupied by the transformation.

102d 〈transformation member functions 102a〉+≡ (131a) ⊳ 102c 103a ⊲

void transformation_t::freememory() {
  for (uint i=0; i≠args.size(); i++) args[i]→freememory();
  delete_type(ttype);
  delete this;
}

Uses delete_type 77b 77c, freememory 19a 19c, and transformation_t 101a.


Comment 3.2.10. This function checks whether two transformations are identical to each other.

103a 〈transformation member functions 102a〉+≡ (131a) ⊳ 102d 103b ⊲

bool transformation_t::tEqual(transformation_t ∗ t) {
  if (tnum ≠ t→tnum) return false;
  if (rank ≠ t→rank) return false;
  if (args.size() ≠ t→args.size()) return false;
  for (uint i=0; i≠args.size(); i++)
    if (args[i]→spEqual(t→args[i]) ≡ false)
      return false;
  return true;
}

Defines:tEqual, used in chunks 103c and 105a.

Uses spEqual 105a and transformation_t 101a.

Comment 3.2.11. This function transforms the calling transformation into an Escher term. Note that application is left associative.

103b 〈transformation member functions 102a〉+≡ (131a) ⊳ 103a 106 ⊲

term_schema ∗ transformation_t::makeTerm() {
  term_schema ∗ ret = new_term(F, find_trans_info(tnum)→name);
  for (int i=0; i≠rank; i++) {
    term_schema ∗ temp = new_term(APP);
    temp→insert(ret);
    temp→insert(args[i]→makeTerm());
    ret = temp;
  }
  return ret;
}

Defines:makeTerm, used in chunks 103c, 105c, 153b, 154a, 174a, and 190b.

Uses find_trans_info 237a, insert 11d, new_term 17b, term_schema 9, and transformation_t 101a.

Comment 3.2.12. Here is a quick summary of what we have defined for transformations.

103c 〈transformation function declarations 103c〉≡ (101a) 108a ⊲

transformation_t ∗ clone();
void print();
void freememory();
bool tEqual(transformation_t ∗ t);
term_schema ∗ makeTerm();

Uses clone 19a 19b, freememory 19a 19c, makeTerm 103b 105b, tEqual 103a, term_schema 9, and transformation_t 101a.

Comment 3.2.13. We next look at the implementation of some basic member functions of the standard predicate data structure, starting with the constructor. We initialise the different fields with default values in the constructor.

103d 〈standard predicate member functions 103d〉≡ (131a) 104a ⊲

std_predicate::std_predicate() {
  marker = 0; parent = NULL; polymorphic = -5;
  transformations.clear(); redexes.clear(); encoding.clear();
}

Uses clear 145b, encoding 122d, marker 125a, polymorphic 108b, redexes 119a, and std_predicate 101b.


Comment 3.2.14. This function provides a deep cloning of the standard predicate.

104a 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 103d 104b ⊲

std_predicate ∗ std_predicate::clone() {
  std_predicate ∗ ret = new std_predicate;
  ret→parent = parent;
  ret→polymorphic = polymorphic;
  for (uint i=0; i≠transformations.size(); i++) {
    transformation_t ∗ temp = transformations[i]→clone();
    temp→parent = ret→parent;
    ret→transformations.push_back(temp);
  }
  ret→marker = marker;
  ret→encoding = encoding;
  ret→calRedexes();
  return ret;
}

Uses calRedexes 119a 119b, clone 19a 19b, encoding 122d, marker 125a, polymorphic 108b, std_predicate 101b, and transformation_t 101a.

Comment 3.2.15. This is a supporting procedure of transformation_t::clone. During cloning, new structures are allocated and pointers to parents must be reassigned from the old parent to the newly created structures.

104b 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 104a 104c ⊲

void std_predicate::adopt_parent(transformation_t ∗ p) {
  parent = p;
  for (uint i=0; i≠transformations.size(); i++)
    transformations[i]→parent = p;
}

Defines:adopt_parent, used in chunks 102b, 105c, and 219a.

Uses std_predicate 101b and transformation_t 101a.
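The interplay between the two clone functions and adopt_parent can be seen in isolation in the following sketch. It is an illustration only: the struct names Trans and Pred, the member chain and the helper checkLinks are ours, types and redex information are left out, and ownership is expressed with std::unique_ptr (C++14) instead of the explicit freememory calls used in Alkemy.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Pred;  // forward declaration, as in chunk 101a

struct Trans {
    int id;
    Trans* parent;                            // non-owning link to the encloser
    std::vector<std::unique_ptr<Pred>> args;  // owned predicate arguments
    explicit Trans(int i);
    ~Trans();
    std::unique_ptr<Trans> clone() const;
};

struct Pred {
    Trans* parent = nullptr;                    // non-owning link to the encloser
    std::vector<std::unique_ptr<Trans>> chain;  // f_1 . f_2 . ... . f_n
    std::unique_ptr<Pred> clone() const;
    void adoptParent(Trans* p);
};

Trans::Trans(int i) : id(i), parent(nullptr) {}
Trans::~Trans() = default;  // defined here, where Pred is a complete type

std::unique_ptr<Trans> Trans::clone() const {
    auto ret = std::make_unique<Trans>(id);
    ret->parent = parent;
    for (const auto& a : args) {
        auto copy = a->clone();        // copy still points at the old parent
        copy->adoptParent(ret.get());  // re-point it at the new owner
        ret->args.push_back(std::move(copy));
    }
    return ret;
}

std::unique_ptr<Pred> Pred::clone() const {
    auto ret = std::make_unique<Pred>();
    ret->parent = parent;
    for (const auto& t : chain) {
        auto copy = t->clone();
        copy->parent = parent;
        ret->chain.push_back(std::move(copy));
    }
    return ret;
}

void Pred::adoptParent(Trans* p) {
    parent = p;
    for (const auto& t : chain) t->parent = p;
}

// Invariant: everything directly inside an argument of t points back at t.
void checkLinks(const Trans& t) {
    for (const auto& a : t.args) {
        assert(a->parent == &t);
        for (const auto& u : a->chain) assert(u->parent == &t);
    }
}
```

Without the adoptParent call, the cloned arguments would keep pointing at the transformation they were copied from, which is exactly the situation Comment 3.2.15 guards against.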

Comment 3.2.16. The procedure print outputs the calling standard predicate.

104c 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 104b 104d ⊲

void std_predicate::print() {
  for (uint i=0; i≠transformations.size(); i++) {
    transformations[i]→print();
    if (i ≠ transformations.size()-1) ioprint(".");
  }
}

Uses ioprint 246 247a and std_predicate 101b.

Comment 3.2.17. This function frees the memory occupied by the calling predicate.

104d 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 104c 105a ⊲

void std_predicate::freememory() {
  for (uint i=0; i≠transformations.size(); i++)
    transformations[i]→freememory();
  delete this;
}

Uses freememory 19a 19c and std_predicate 101b.


Comment 3.2.18. This function checks whether two standard predicates are identical.

105a 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 104d 105b ⊲

bool std_predicate::spEqual(std_predicate ∗ in) {
  if (transformations.size() ≠ in→transformations.size())
    return false;
  for (uint i=0; i≠transformations.size(); i++)
    if (transformations[i]→tEqual(in→transformations[i]) ≡ false)
      return false;
  return true;
}

Defines:spEqual, used in chunks 103a, 105c, and 128a.

Uses std_predicate 101b and tEqual 103a.

Comment 3.2.19. This function transforms the calling predicate into an Escher term.


105b 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 105a 108c ⊲

term_schema ∗ std_predicate::makeTerm() {
  term_schema ∗ ret = transformations[0]→makeTerm();
  for (uint i=1; i≠transformations.size(); i++) {
    term_schema ∗ temp = new_term(APP);
    temp→insert(new_term(APP));
    temp→leftc()→insert(new_term(F, "comp"));
    temp→leftc()→insert(ret);
    temp→insert(transformations[i]→makeTerm());
    ret = temp;
  }
  return ret;
}

Defines:makeTerm, used in chunks 103c, 105c, 153b, 154a, 174a, and 190b.

Uses insert 11d, leftc 11d, new_term 17b, std_predicate 101b, and term_schema 9.
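The shape of the terms produced by the two makeTerm functions (Comments 3.2.11 and 3.2.19) is easiest to see in textual form. The sketch below mimics the two loops on strings instead of term_schema nodes; the function names app, applyAll and composeChain are ours, and the fully parenthesised rendering is an assumption made for readability, not the output format of Alkemy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Renders the application of `fn` to `arg` in fully parenthesised form.
std::string app(const std::string& fn, const std::string& arg) {
    return "(" + fn + " " + arg + ")";
}

// Mirrors transformation_t::makeTerm: arguments are applied one at a time,
// so the result is nested to the left.
std::string applyAll(std::string head, const std::vector<std::string>& args) {
    for (const std::string& a : args) head = app(head, a);
    return head;
}

// Mirrors std_predicate::makeTerm: each step builds ((comp ret) t_i).
std::string composeChain(const std::vector<std::string>& ts) {
    assert(!ts.empty());
    std::string ret = ts[0];
    for (std::size_t i = 1; i != ts.size(); ++i)
        ret = app(app("comp", ret), ts[i]);
    return ret;
}
```

For instance, applyAll("f", {"p", "q"}) yields ((f p) q), and composeChain({"f", "g", "h"}) yields ((comp ((comp f) g)) h), so composition is also nested to the left.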

Comment 3.2.20. Here is a quick summary of what we have defined for standard predicates.

105c 〈std predicate function declarations 105c〉≡ (101b) 108b ⊲

std_predicate ∗ clone();
void adopt_parent(transformation_t ∗ p);
void print();
void freememory();
bool spEqual(std_predicate ∗ p);
term_schema ∗ makeTerm();

Uses adopt_parent 104b, clone 19a 19b, freememory 19a 19c, makeTerm 103b 105b, spEqual 105a, std_predicate 101b, term_schema 9, and transformation_t 101a.


Comment 3.2.21. Transformations and standard predicates have types, and they play an important role in the identification of redexes and the proper application of rewrites to new predicates. We describe the functions available for manipulating the types here, starting with some simple ones. The function getSource computes the source type of the transformation; the function getTarget computes the target type.

106 〈transformation member functions 102a〉+≡ (131a) ⊳ 103b 107a ⊲

type ∗ transformation_t::getSource() {
  type_abstraction ∗ mytype = dcast<type_abstraction ∗>(ttype);
  assert(mytype); return mytype→getSource();
}

type ∗ transformation_t::getTarget() {
  type_abstraction ∗ mytype = dcast<type_abstraction ∗>(ttype);
  assert(mytype); return mytype→getTarget();
}

void transformation_t::printType() {
  cout ≪ tnum ≪ " " ≪ ttype→getName() ≪ endl;
  for (int i=0; i≠rank; i++) args[i]→printType();
}

Defines:
getSource, used in chunks 108, 109, 121b, 122a, 124, 129, and 217c.
getTarget, used in chunks 108 and 109.
printType, used in chunks 108a and 109c.

Uses transformation_t 101a and type_abstraction 82c.


Comment 3.2.22. When predicates are rewritten, the effect of local type changes must be propagated globally. This is achieved by the function recalculateType. The vector slns is used to record the result of unification of types.

107a 〈transformation member functions 102a〉+≡ (131a) ⊳ 106 107b ⊲

static vector<pair<string, type ∗> > slns;

void transformation_t::recalculateType() {
  type ∗ temp = find_trans_info(tnum)→ttype→clone();
  type_abstraction ∗ type_scheme = dcast<type_abstraction ∗>(temp);
  type_scheme→renameParameters();

  for (int i=0; i≠rank; i++) {
    args[i]→recalculateType();
    type ∗ argtype = args[i]→getSourceTarget();
    type ∗ argscheme = type_scheme→getArg(i);
    assert(slns.size() ≡ 0);
    bool res = unify(slns, argscheme, argtype);
    if (res ≡ false) { print(); cout ≪ endl; assert(false); }
    type ∗ temp2 = type_scheme;
    type ∗ temp3 = apply_subst(slns, type_scheme);
    type_scheme = dcast<type_abstraction ∗>(temp3);
    delete_type(temp2);
    delete_type(argtype);
    instantiateParameters(slns);
    for (uint j=0; j≠slns.size(); j++)
      delete_type(slns[j].second);
    slns.clear();
  }
  delete_type(ttype);
  ttype = type_scheme;
}

Defines:recalculateType, used in chunks 108a, 122a, 123b, and 217c.

Uses apply_subst 86b, clear 145b, clone 19a 19b, delete_type 77b 77c, find_trans_info 237a, getArg 82c 83c,getSourceTarget 109b 110d, instantiateParameters 107b 110c 110d, renameParameters 79b 80d,transformation_t 101a, type_abstraction 82c, and unify 88.

Comment 3.2.23. The function instantiateParameters instantiates the parameters inside ttype using a type substitution s.

107b 〈transformation member functions 102a〉+≡ (131a) ⊳ 107a

void transformation_t::instantiateParameters(vector<pair<string, type ∗> > &s) {
  type ∗ temp = ttype;
  ttype = apply_subst(s, ttype);
  delete_type(temp);
  for (int i=0; i≠rank; i++) args[i]→instantiateParameters(s);
}

Defines:
instantiateParameters, used in chunks 107a, 108a, and 110a.

Uses apply_subst 86b, delete_type 77b 77c, and transformation_t 101a.


Comment 3.2.24. Here are the member functions we have defined so far.

108a 〈transformation function declarations 103c〉+≡ (101a) ⊳ 103c

type ∗ getSource();
type ∗ getTarget();
void instantiateParameters(vector<pair<string, type ∗> > & s);
void recalculateType();
void printType();

Uses getSource 82c 83d 106 110d, getTarget 82c 83e 106 110d, instantiateParameters 107b 110c 110d, printType 106 110d, and recalculateType 107a 110a 110d.

Comment 3.2.25. A standard predicate is polymorphic if its type contains parameters. This is recorded in polymorphic. When a standard predicate is first formed from composition of transformations, we need to work out its type. This is done using initialiseType.

108b 〈std predicate function declarations 105c〉+≡ (101b) ⊳ 105c 110d ⊲

int polymorphic; // -5 undefined, 0 no, 1 yes.

Defines:polymorphic, used in chunks 103d, 104a, 109d, and 123b.

108c 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 105b 109a ⊲

type ∗ std_predicate::initialiseType() {
  int size = transformations.size();
  type ∗ source = new type_parameter(), ∗ target = new type_parameter();
  type_abstraction ∗ ret = new type_abstraction(source, target);
  if (size ≡ 1) return ret;

  type_abstraction ∗ argtype, ∗ temp2;
  type ∗ last_s = new type_parameter();
  argtype = new type_abstraction(last_s, target→clone());
  〈std predicate::initialiseType::repeat code 108d〉

  for (int i=size-2; i≠0; i−−) {
    type ∗ temp = new type_parameter();
    argtype = new type_abstraction(temp, last_s→clone());
    last_s = temp;
    〈std predicate::initialiseType::repeat code 108d〉
  }
  argtype = new type_abstraction(source→clone(), last_s→clone());
  〈std predicate::initialiseType::repeat code 108d〉

  return ret;
}

Defines:initialiseType, used in chunk 110a.

Uses clone 19a 19b, std_predicate 101b, type_abstraction 82c, and type_parameter 79c.

108d 〈std predicate::initialiseType::repeat code 108d〉≡ (108c)

temp2 = ret;
ret = new type_abstraction(argtype, temp2);

Uses type_abstraction 82c.
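Reading the code of initialiseType from the last argument backwards, the type scheme it builds for a chain of $n \ge 2$ transformations appears to be the following, where $a$ and $b$ stand for the fresh source and target parameters and $x_1, \ldots, x_{n-1}$ for the fresh intermediate parameters (these names are ours, chosen for illustration):

\[
(a \to x_1) \to (x_1 \to x_2) \to \cdots \to (x_{n-1} \to b) \to (a \to b).
\]

For $n = 2$ this is $(a \to x_1) \to (x_1 \to b) \to (a \to b)$, which has the same shape as the type of the composition function in Comment 3.2.1. For $n = 1$ the function returns just $a \to b$.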

Comment 3.2.26. The function getType computes the type of the standard predicate from scratch.


109a 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 108c 109b ⊲

type ∗ std_predicate::getType() {
  int size = transformations.size();
  if (size ≡ 1) return transformations[0]→ttype→clone();

  type_abstraction ∗ ret = new type_abstraction;
  ret→addAlpha(transformations[0]→getSource()→clone());
  ret→addAlpha(transformations[size-1]→getTarget()→clone());
  for (int i=size-1; i≠-1; i−−) {
    type_abstraction ∗ argtype = new type_abstraction;
    argtype→addAlpha(transformations[i]→getSource()→clone());
    argtype→addAlpha(transformations[i]→getTarget()→clone());
    type_abstraction ∗ temp = ret;
    ret = new type_abstraction(argtype, temp);
  }
  return ret;
}

Defines:
getType, used in chunk 110a.

Uses clone 19a 19b, getSource 82c 83d 106 110d, getTarget 82c 83e 106 110d, std_predicate 101b, and type_abstraction 82c.

Comment 3.2.27. The function getSourceTarget has the obvious functionality.


109b 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 109a 109c ⊲

type ∗ std_predicate::getSourceTarget() {
  return new type_abstraction(getSource()→clone(),
                              getTarget()→clone());
}

Defines:
getSourceTarget, used in chunk 107a.

Uses clone 19a 19b, getSource 82c 83d 106 110d, getTarget 82c 83e 106 110d, std_predicate 101b, and type_abstraction 82c.

109c 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 109b 109d ⊲

void std_predicate::printType() {
  for (uint i=0; i≠transformations.size(); i++)
    transformations[i]→printType();
}

Uses printType 106 110d and std_predicate 101b.

109d 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 109c 110a ⊲

bool std_predicate::isPolymorphic() {
  if (polymorphic ≡ -5) {
    polymorphic = 0;
    for (uint i=0; i≠transformations.size(); i++)
      if (transformations[i]→ttype→getParameters().size()) {
        polymorphic = 1; return true;
      }
  }
  if (polymorphic ≡ 0) return false; else return true;
}

Defines:isPolymorphic, used in chunk 123b.

Uses getParameters 79a 80c, polymorphic 108b, and std_predicate 101b.

Comment 3.2.28. The last two functions, recalculateType and instantiateParameters, serve the same purpose as described earlier, but for standard predicates.


110a 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 109d 110c ⊲

void std_predicate::recalculateType() {
  for (uint i=0; i≠transformations.size(); i++)
    transformations[i]→recalculateType();
  type ∗ type_scheme = initialiseType();
  type ∗ inst_type = getType();
  bool res = unify(slns, type_scheme, inst_type);
  〈std predicate::recalculateType::error handling 110b〉
  instantiateParameters(slns);
  for (uint i=0; i≠slns.size(); i++) delete_type(slns[i].second);
  slns.clear();
  delete_type(type_scheme); delete_type(inst_type);
}

Defines:
recalculateType, used in chunks 108a, 122a, 123b, and 217c.

Uses clear 145b, delete_type 77b 77c, getType 109a 110d, initialiseType 108c 110d, instantiateParameters 107b 110c 110d, std_predicate 101b, and unify 88.

110b 〈std predicate::recalculateType::error handling 110b〉≡ (110a)

if (res ≡ false) {
  setSelector(STDERR);
  print(); ioprintln();
  ioprintln(type_scheme→getName()); ioprintln(inst_type→getName());
  assert(false);
}

Uses ioprintln 246 247a, setSelector 246 247a, and STDERR 246.

110c 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 110a 112 ⊲

void std_predicate::instantiateParameters(vector<pair<string, type ∗> > & s) {
  for (uint i=0; i≠transformations.size(); i++)
    transformations[i]→instantiateParameters(s);
}

Defines:instantiateParameters, used in chunks 107a, 108a, and 110a.

Uses std_predicate 101b.

110d 〈std predicate function declarations 105c〉+≡ (101b) ⊳ 108b 114b ⊲

type ∗ initialiseType();
type ∗ getType();
type ∗ getSourceTarget();
type ∗ getSource() { return transformations[0]→getSource(); }
type ∗ getTarget() { return transformations.back()→getTarget(); }
void printType();
bool isPolymorphic();
void recalculateType();
void instantiateParameters(vector<pair<string, type ∗> > & s);

Defines:
getSource, used in chunks 108, 109, 121b, 122a, 124, 129, and 217c.
getSourceTarget, used in chunk 107a.
getTarget, used in chunks 108 and 109.
getType, used in chunk 110a.
initialiseType, used in chunk 110a.
instantiateParameters, used in chunks 107a, 108a, and 110a.
isPolymorphic, used in chunk 123b.
printType, used in chunks 108a and 109c.
recalculateType, used in chunks 108a, 122a, 123b, and 217c.


3.3 Regular Predicates

Comment 3.3.1. The class of standard predicates contains some redundancy in that there are syntactically distinct predicates that are equivalent. For example, the standard predicates $(\wedge_2\ p\ q)$ and $(\wedge_2\ q\ p)$ are equivalent, where $p$ and $q$ are standard predicates. Similarly, $(setExists_2\ p\ q)$ and $(setExists_2\ q\ p)$ are equivalent. Less obviously, $(domCard\ p) \circ (> 0)$ and $(setExists_1\ p)$ are equivalent. It is important to remove as much of this redundancy in standard predicates as possible. Ideally, one would like to be able to determine just one representative from each class of equivalent predicates. However, determining the equivalent predicates is undecidable, so one usually settles for some easily checked syntactic conditions that reveal equivalence of predicates. Thus these syntactic conditions are sufficient, but not necessary, for equivalence. These considerations motivate the next definition.

Definition 3.3.2. A transformation $f$ is symmetric if it has a signature of the form

$f : (\varrho \to \Omega) \to \cdots \to (\varrho \to \Omega) \to \mu \to \sigma$,

and, whenever $f\ p_1 \ldots p_k$ is a standard predicate, it follows that $f\ p_1 \ldots p_k$ and $f\ p_{i_1} \ldots p_{i_k}$ are equivalent, for all permutations $i$ of $\{1, \ldots, k\}$, where $k$ is the rank of $f$.

Comment 3.3.3. Since any permutation of the predicate arguments of a symmetric transformation produces an equivalent function, it is advisable to choose one particular order of arguments and ignore the others. For this purpose, a total order on standard predicates is defined and then arguments for symmetric transformations are chosen in increasing order according to this total order. To define the total order on standard predicates, one must start with a total order on transformations. Therefore, it is supposed that the transformations are ordered according to some (arbitrary) strict total order $<$.

The following definition of the relation $p \prec q$ uses induction on the maximum of the number of (occurrences of) transformations in $p$ and the number in $q$. To emphasise: the definition of $\prec$ depends upon $<$. One can show that the relation $\prec$ is a strict total order on $S$. The relation $\preceq$ is defined by $p \preceq q$ if either $p = q$ or $p \prec q$. Clearly, $\preceq$ is a total order on $S$.

Definition 3.3.4. The binary relation $\prec$ on $S$ is defined inductively. Let $p, q \in S$, where $p$ is $(f_1\ b_{1,1} \ldots b_{1,k_1}) \circ \cdots \circ (f_n\ b_{n,1} \ldots b_{n,k_n})$ and $q$ is $(g_1\ c_{1,1} \ldots c_{1,s_1}) \circ \cdots \circ (g_r\ c_{r,1} \ldots c_{r,s_r})$. Then $p \prec q$ if one of the following holds.

1. $p$ is a proper prefix of $q$.

2. There exists $i$ such that $f_i < g_i$, and $p$ and $q$ agree to the left of $f_i$ and $g_i$.

3. There exist $i$ and $j_i$ such that $b_{i,j_i} \prec c_{i,j_i}$, and $p$ and $q$ agree to the left of $b_{i,j_i}$ and $c_{i,j_i}$.

Definition 3.3.5. A standard predicate $(f_1\ b_{1,1} \ldots b_{1,k_1}) \circ \cdots \circ (f_n\ b_{n,1} \ldots b_{n,k_n})$ is regular if $b_{i,j_i}$ is a regular predicate, for $i = 1, \ldots, n$ and $j_i = 1, \ldots, k_i$, and $f_i$ is symmetric implies that $b_{i,1} \preceq \cdots \preceq b_{i,k_i}$, for $i = 1, \ldots, n$.

Example 3.3.6. The predicate

$vertices \circ (domCard\ (vertex \circ (\wedge_2\ b\ c))) \circ (> 7)$

is a regular predicate iff $b \preceq c$, and $b$ and $c$ are regular predicates.

Comment 3.3.7. One can show that for every $p \in S$, there exists a regular predicate $q$ such that $p$ and $q$ are equivalent. This observation allows us to confine attention to the generally much smaller class of regular predicates. Figure 3.1 gives an algorithm for constructing a regular predicate equivalent to some given standard predicate.


function Regularise(p) returns a regular predicate q such that p and q are equivalent;

input: p, a standard predicate $(f_1\ b_{1,1} \ldots b_{1,k_1}) \circ \cdots \circ (f_n\ b_{n,1} \ldots b_{n,k_n})$;

for $i = 1, \ldots, n$ do
    for $j_i = 1, \ldots, k_i$ do
        $b_{i,j_i}$ := Regularise($b_{i,j_i}$);
    if $f_i$ is symmetric then
        $[c_{i,1}, \ldots, c_{i,k_i}]$ := Sort($[b_{i,1}, \ldots, b_{i,k_i}]$);
        $p_i$ := $(f_i\ c_{i,1} \ldots c_{i,k_i})$;
    else
        $p_i$ := $(f_i\ b_{i,1} \ldots b_{i,k_i})$;
return $p_1 \circ \cdots \circ p_n$;

Figure 3.1: Algorithm for regularising a standard predicate
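The algorithm of Figure 3.1 can be written down directly on a small tree representation. The following is a sketch under our own assumptions and is not the representation used by Alkemy (which works on integer vectors, see below): the names Trans, Pred, compare and regularise are ours, the integer id is assumed to encode the strict total order $<$ on transformations, and compare implements the lexicographic reading of Definition 3.3.4.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Trans;
using Pred = std::vector<Trans>;  // f_1 . f_2 . ... . f_n

struct Trans {
    int id;                  // position in the assumed total order <
    bool symmetric;
    std::vector<Pred> args;  // predicate arguments b_{i,1}, ..., b_{i,k_i}
};

// Three-way comparison following Definition 3.3.4: negative if p precedes q,
// zero if they are identical, positive otherwise.
int compare(const Pred& p, const Pred& q) {
    const std::size_t n = std::min(p.size(), q.size());
    for (std::size_t i = 0; i != n; ++i) {
        if (p[i].id != q[i].id) return p[i].id < q[i].id ? -1 : 1;
        // Same transformation, hence the same rank.
        assert(p[i].args.size() == q[i].args.size());
        for (std::size_t j = 0; j != p[i].args.size(); ++j) {
            const int c = compare(p[i].args[j], q[i].args[j]);
            if (c != 0) return c;
        }
    }
    if (p.size() == q.size()) return 0;
    return p.size() < q.size() ? -1 : 1;  // a proper prefix comes first
}

// Figure 3.1: regularise the arguments, then sort them if f_i is symmetric.
Pred regularise(Pred p) {
    for (Trans& t : p) {
        for (Pred& b : t.args) b = regularise(std::move(b));
        if (t.symmetric)
            std::sort(t.args.begin(), t.args.end(),
                      [](const Pred& x, const Pred& y) { return compare(x, y) < 0; });
    }
    return p;
}
```

Applied to a symmetric transformation whose two arguments are given in decreasing order, regularise swaps them; applying it a second time changes nothing.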

Comment 3.3.8. We perform regularisation on the external representation only. The signature of the function regularise is different from that of the algorithm given above. Instead of returning a regular predicate, we return a vector of integers (captured in the data structure transvect below), which is a unique representation of the computed regular predicate. Checking the equivalence of two integer vectors is easier than the corresponding operation on two structures.

The transformations are numbered in the same order as they are declared in the input specification file. The vector representation captures the string representation of the standard predicates. Each argument of a transformation is enclosed in brackets, in the same way as in the grammar for standard predicates used by the spec file parser. The composition symbol and the left and right parentheses are defined to be n, n + 1 and n + 2 respectively, where n is the total number of transformations declared. (We start counting from zero, as is usual.) This representation is unique for each standard predicate. A formal proof of this unique readability property would be nice here.

112 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 110c 113a ⊲

#define CDOT trans_table_size()

#define LPAR trans_table_size()+1

#define RPAR trans_table_size()+2

struct transvect {
  vector<int> vec;
  transvect(vector<int> x) { vec = x; }
  bool operator≡(transvect const & other) const
    { return (vec ≡ other.vec); }
  bool operator<(transvect const & other) const
    { return (vec < other.vec); }
  bool operator>(transvect const & other) const
    { return (vec > other.vec); }
  void print() {
    for (uint i=0; i≠vec.size(); i++) cout ≪ vec[i] ≪ " ";
    cout ≪ endl;
  }
};

Defines:
CDOT, used in chunk 113b.
LPAR, used in chunk 114a.
RPAR, used in chunk 114a.
transvect, used in chunks 113 and 114a.

Uses trans_table_size 238a.
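The integer encoding described in Comment 3.3.8 can be illustrated on a small tree representation. The sketch below is ours and is not the Alkemy code: Trans, Pred and encode are hypothetical names, and the number n of declared transformations is passed in explicitly instead of being obtained from trans_table_size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Trans;
using Pred = std::vector<Trans>;  // f_1 . f_2 . ... . f_n

struct Trans {
    int id;                  // declaration number, counted from zero
    std::vector<Pred> args;  // predicate arguments
};

// Encodes p as a vector of integers, where n is the number of declared
// transformations: n is the composition symbol, n + 1 the left parenthesis
// and n + 2 the right parenthesis.
std::vector<int> encode(const Pred& p, int n) {
    std::vector<int> out;
    for (std::size_t i = 0; i != p.size(); ++i) {
        assert(p[i].id >= 0 && p[i].id < n);
        if (i != 0) out.push_back(n);  // composition symbol
        out.push_back(p[i].id);
        for (const Pred& arg : p[i].args) {
            out.push_back(n + 1);      // left parenthesis
            const std::vector<int> inner = encode(arg, n);
            out.insert(out.end(), inner.begin(), inner.end());
            out.push_back(n + 2);      // right parenthesis
        }
    }
    return out;
}
```

With three declared transformations, the predicate f0 (f1) . f2 is encoded as 0 4 1 5 3 2. Two predicates can then be compared by comparing their vectors, which is what transvect relies on.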

Comment 3.3.9. Given a list of vectors representing the arguments to a transformation, sorted computes whether they are sorted. The iterators p and q point to successive vectors in the list.

113a 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 112 113b ⊲

#include <list>
static bool sorted(list<transvect> &in) {
  list<transvect>::iterator p = in.begin(), q = in.begin();
  while (p ≠ in.end()) {
    q++; if (q ≡ in.end()) break;
    if (∗p > ∗q) return false;
    p++;
  }
  return true;
}

Defines:sorted, used in chunk 113c.

Uses transvect 112.

113b 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 113a 119b ⊲

pair<vector<int>, bool> std_predicate::regularise() {
  pair<vector<int>, bool> ret; ret.second = true;
  list<transvect> veclist;
  for (uint i=0; i≠transformations.size(); i++) {
    transformation_t ∗ p = transformations[i];
    for (uint j=0; j≠p→args.size(); j++) {
      pair<vector<int>,bool> tp = p→args[j]→regularise();
      veclist.push_back(transvect(tp.first));
      if (¬tp.second) ret.second = false;
    }
    〈regularise::check for symmetricity 113c〉
    〈regularise::insert regularised transformation into ret 114a〉
    if (i ≠ transformations.size()-1) ret.first.push_back(CDOT);
    veclist.clear();
  }
  return ret;
}

Defines:regularise, used in chunks 117a, 123c, and 125b.

Uses CDOT 112, clear 145b, std_predicate 101b, transformation_t 101a, and transvect 112.

113c 〈regularise::check for symmetricity 113c〉≡ (113b)

if ((find_trans_info(p→tnum))→symmetricity) {
  if (sorted(veclist) ≡ false) ret.second = false;
  veclist.sort();
}

Uses find_trans_info 237a and sorted 113a.


114a 〈regularise::insert regularised transformation into ret 114a〉≡ (113b)

ret.first.push_back(p→tnum);
list<transvect>::iterator lp = veclist.begin();
while (lp ≠ veclist.end()) {
  ret.first.push_back(LPAR);
  for (uint k=0; k≠lp→vec.size(); k++)
    ret.first.push_back(lp→vec[k]);
  ret.first.push_back(RPAR);
  lp++;
}

Uses LPAR 112, RPAR 112, and transvect 112.

114b 〈std predicate function declarations 105c〉+≡ (101b) ⊳ 110d 119a ⊲

pair<vector<int>, bool> regularise();

Defines:regularise, used in chunks 117a, 123c, and 125b.

3.4 Predicate Rewrite Systems

Comment 3.4.1. This section addresses the central issue of the construction of standard predicates using predicate rewrite systems.

Definition 3.4.2. A predicate rewrite system is a finite relation $\rightarrowtail$ on $S$ satisfying the following two properties.

1. For each $p \rightarrowtail q$, the type of $p$ is more general than the type of $q$.

2. For each $p \rightarrowtail q$, there does not exist $s \rightarrowtail t$ such that $q$ is a proper subterm of $s$.

If $p \rightarrowtail q$, then $p \rightarrowtail q$ is called a predicate rewrite, $p$ the head, and $q$ the body of the predicate rewrite.

Comment 3.4.3. The second condition of Definition 3.4.2 states that no body of a rewrite is a proper subterm of the head of any rewrite. In practice, the heads of rewrites are usually standard predicates consisting of just one transformation, such as top. In this case, the second condition is automatically satisfied, since the heads have no proper subterms at all.

Definition 3.4.4. Let $(f_1\ b_{1,1} \ldots b_{1,k_1}) \circ \cdots \circ (f_n\ b_{n,1} \ldots b_{n,k_n})$ be a standard predicate. A suffix of the standard predicate is a term of the form

$(f_i\ b_{i,1} \ldots b_{i,k_i}) \circ \cdots \circ (f_n\ b_{n,1} \ldots b_{n,k_n})$,

for some $i \in \{1, \ldots, n\}$. The suffix is proper if $i > 1$.

Definition 3.4.5. A subterm of a standard predicate $(f_1\ b_{1,1} \ldots b_{1,k_1}) \circ \cdots \circ (f_n\ b_{n,1} \ldots b_{n,k_n})$ is eligible if it is a suffix of the standard predicate or it is an eligible subterm of $b_{i,j_i}$, for some $i \in \{1, \ldots, n\}$ and $j_i \in \{1, \ldots, k_i\}$.
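The recursive definition of eligible subterms translates into a short traversal. The sketch below is an illustration on a hypothetical tree representation (Trans, Pred, Suffix, collectEligible and exampleCount are our names, not Alkemy's): every suffix of a predicate is eligible, and so is every eligible subterm of each argument.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Trans;
using Pred = std::vector<Trans>;  // f_1 . f_2 . ... . f_n

struct Trans {
    int id;
    std::vector<Pred> args;  // predicate arguments
};

// An eligible subterm, identified by the predicate it is a suffix of and
// the index at which the suffix starts.
struct Suffix {
    const Pred* owner;
    std::size_t start;
};

// Appends all eligible subterms of p to `out`: first the suffixes of p
// itself, then, recursively, those of every argument.
void collectEligible(const Pred& p, std::vector<Suffix>& out) {
    for (std::size_t i = 0; i != p.size(); ++i) out.push_back(Suffix{&p, i});
    for (const Trans& t : p)
        for (const Pred& b : t.args) collectEligible(b, out);
}

// Example: f0 (f1 . f3) . f2 has two suffixes of its own and two inside
// the argument of f0, four eligible subterms in total.
std::size_t exampleCount() {
    Trans f0{0, {}};
    f0.args.push_back(Pred{Trans{1, {}}, Trans{3, {}}});
    const Pred p{f0, Trans{2, {}}};
    std::vector<Suffix> out;
    collectEligible(p, out);
    assert(out.front().owner == &p && out.front().start == 0);
    return out.size();
}
```

Note that a prefix such as f0 (f1 . f3) on its own is not collected: only suffixes are eligible.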

Definition 3.4.6. Let be a predicate rewrite system and p a standard predicate. An eligiblesubterm r of p is a redex with respect to if there exists a predicate rewrite r b such that thetype of b and the positional type of r in p are unifiable. In this case, r is said to be a redex viar b.

Definition 3.4.7. Let be a predicate rewrite system, and p and q standard predicates. Thenq is obtained by a predicate derivation step from p using if there is a redex r via r b in pand q = p[r/b]. The redex r is called the selected redex.


Definition 3.4.8. A predicate derivation with respect to a predicate rewrite system ↣ is a finite sequence $\langle p_0, p_1, \ldots, p_n \rangle$ of standard predicates such that $p_i$ is obtained by a derivation step from $p_{i-1}$ using ↣, for $i = 1, \ldots, n$. The length of the predicate derivation is n. The standard predicate $p_0$ is called the initial predicate and the standard predicate $p_n$ is called the final predicate.

Comment 3.4.9. A predicate rewrite system is used to generate a search space of predicates. Starting from some predicate p0, one generates all the predicates that can be obtained by a predicate derivation step from p0, then all the predicates that can be obtained from those by a predicate derivation step, and so on. A path in the search space is a sequence of predicates each of which (except for the first) can be obtained from its predecessor by a predicate derivation step. Thus each path in the search space is a predicate derivation.

Comment 3.4.10. We represent each predicate rewrite system as a vector of rewrites. The vector is indexed by the heads of the rewrites, each of which is recorded here as a 2-tuple consisting of its integer identifier and the type of the head. Note that the system only supports predicate rewrite systems that have zero-ranked transformations as heads of rewrites. The set of all heads is recorded in the structure rewpreds, and this set is computed during the parsing of the input specification file.

115a 〈rewrite table data structure and functions 115a〉≡ (131c) 116b ⊲

    static bool freeze = false;
    static vector<pair<pair<int, type *>, type_rewrites_t *> > rewrite_table;
    static vector<int> rewpreds;

Defines:
    freeze, used in chunks 117a and 118b.
    rewpreds, used in chunk 116b.
    rewrite_table, used in chunks 117, 118, 121b, 124, 129, and 130a.

Uses type_rewrites_t 115b.

Comment 3.4.11. The bodies of rewrites with the same head are collected in the following structure. rewrites is the list of bodies. The function insert_pred allows one to insert standard predicates into rewrites.

115b 〈rewrite::struct type-rewrites-t 115b〉≡ (131b)

    struct type_rewrites_t {
        vector<std_predicate *> rewrites;
        ~type_rewrites_t() {
            for (uint i=0; i!=rewrites.size(); i++)
                rewrites[i]->freememory();
        }
        void insert_pred(std_predicate * x) { rewrites.push_back(x); }
        void print() {
            for (uint i=0; i!=rewrites.size(); i++) rewrites[i]->print();
        }
    };

Defines:
    insert_pred, used in chunk 117a.
    type_rewrites_t, used in chunks 115a, 117a, 121b, 124, 129, and 130a.


Comment 3.4.12. The rewrite table can be manipulated by the following public functions.

• insert_rewrite inserts a rewrite into the table.

• isRewpred returns true if tnum is a head; false otherwise.

• freeze_rewrite_table ensures that no further changes are made to the rewrite table.

• cleanup_rewrite frees up the memory occupied by the rewrite table.

• find_unifiers finds the applicable rewrites for a given redex.


116a 〈rewrite::public functions 116a〉≡ (131b) 120 ⊲

    extern bool insert_rewrite(std_predicate * pred, type * rtype, int tnum);
    extern bool isRewpred(int tnum);
    extern void freeze_rewrite_table();
    extern void cleanup_rewrite();
    extern vector<int> find_unifiers(int tnum, type * ttype);

Defines:
    cleanup_rewrite, used in chunk 8.
    find_unifiers, used in chunks 121b, 124, 129, and 130a.
    freeze_rewrite_table, used in chunk 217c.
    insert_rewrite, used in chunk 217c.
    isRewpred, used in chunks 119b and 129.


Comment 3.4.13. The function insert_rewpred() is used to insert the heads of rewrites into rewpreds.

We need to make the operation of checking whether an eligible subterm is a redex as fast as possible, because it is done very frequently during training. The following scheme ensures it runs in O(1) time.

The heads are maintained as a vector of integers, where the integer identifier of a transformation serves as an index into the vector. The value stored for a particular transformation is one if it is a rewritable predicate; zero otherwise.

The entries in the vector are dynamically allocated as they are needed. Transformations not explicitly stated as rewritable predicates are assigned the value zero, including those that are never inserted into the vector.

116b 〈rewrite table data structure and functions 115a〉+≡ (131c) ⊳ 115a 117a ⊲

    static void insert_rewpred(int tnum) {
        int rsize = rewpreds.size();
        if (tnum >= rsize)
            for (int i=rsize; i!=tnum+1; i++) rewpreds.push_back(0);
        rewpreds[tnum] = 1;
    }

    bool isRewpred(int tnum) {
        assert(tnum >= 0);
        if (tnum >= (int)rewpreds.size()) return false;
        if (rewpreds[tnum]) return true;
        return false;
    }

Defines:
    insert_rewpred, used in chunk 117a.
    isRewpred, used in chunks 119b and 129.

Uses rewpreds 115a.
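The flag-vector scheme can be tried in isolation with the following minimal sketch. The class name `HeadFlags` and its member names are assumptions made for this example; only the idea (an integer identifier used directly as an index, with missing entries read as zero) comes from the chunk above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone version of the rewpreds idea: membership of an
// integer identifier is stored as a flag at that index, so a lookup is a
// bounds check followed by one array access.
class HeadFlags {
    std::vector<char> flags_;
public:
    // Mark id as a head, growing the vector with zero entries if needed.
    void insert(int id) {
        assert(id >= 0);
        if (static_cast<std::size_t>(id) >= flags_.size())
            flags_.resize(static_cast<std::size_t>(id) + 1, 0);
        flags_[static_cast<std::size_t>(id)] = 1;
    }
    // Identifiers beyond the end of the vector were never inserted.
    bool contains(int id) const {
        return id >= 0 &&
               static_cast<std::size_t>(id) < flags_.size() &&
               flags_[static_cast<std::size_t>(id)] != 0;
    }
    std::size_t size() const { return flags_.size(); }
};
```

The price of the constant-time test is memory proportional to the largest identifier inserted, which is acceptable when identifiers are small, dense integers.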


Comment 3.4.14. This function inserts a predicate rewrite into the table. The predicate pred must be regular: under the LR strategy a non-regular pred is rejected and the function returns false.

117a 〈rewrite table data structure and functions 115a〉+≡ (131c) ⊳ 116b 117b ⊲

    bool insert_rewrite(std_predicate * pred, type * rtype, int tnum) {
        if (pred->regularise().second == false && options.strategy == LR)
            return false;
        assert(freeze == false);
        type * temp_rtype = rtype;
        while (rtype->isSynonym())
            rtype = dcast<type_synonym *>(rtype)->getActual();
        for (uint i=0; i!=rewrite_table.size(); i++) {
            int f = rewrite_table[i].first.first;
            type * s = rewrite_table[i].first.second;
            if (f == tnum && s->getName() == rtype->getName()) {
                rewrite_table[i].second->insert_pred(pred);
                delete_type(temp_rtype);
                return true;
            }
        }
        pair<int, type *> key(tnum, rtype);
        type_rewrites_t * rews = new type_rewrites_t;
        pair<pair<int, type *>, type_rewrites_t *> entry(key, rews);
        entry.second->insert_pred(pred);
        rewrite_table.push_back(entry);
        insert_rewpred(tnum);
        return true;
    }

Defines:
    insert_rewrite, used in chunk 217c.
Uses delete_type 77b 77c, freeze 115a, insert_pred 115b, insert_rewpred 116b, LR 234, options 234 235, regularise 113b 114b, rewrite_table 115a, std_predicate 101b, type_rewrites_t 115b, and type_synonym 81c.
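The table logic (append the body to an existing entry with the same head and type name, otherwise create a new entry) can be sketched without the Alkemy types. In the sketch below `Entry` and `RewriteTable` are hypothetical names, types are reduced to their names, and bodies are plain strings. Unlike the chunk above, which asserts that the table is not frozen, this version simply refuses the insertion; that difference is a choice made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplified entry: head identifier, type name, list of bodies.
struct Entry {
    int head;
    std::string typeName;
    std::vector<std::string> bodies;
};

class RewriteTable {
    std::vector<Entry> entries_;
    bool frozen_ = false;
public:
    // Add a rewrite head -> body. Rewrites with the same head and the same
    // type name share one entry. Returns false once the table is frozen.
    bool insert(int head, const std::string& typeName, const std::string& body) {
        if (frozen_) return false;
        for (Entry& e : entries_) {
            if (e.head == head && e.typeName == typeName) {
                e.bodies.push_back(body);
                return true;
            }
        }
        entries_.push_back(Entry{head, typeName, {body}});
        return true;
    }
    void freeze() { frozen_ = true; }
    std::size_t size() const { return entries_.size(); }
    std::size_t bodyCount(std::size_t i) const { return entries_.at(i).bodies.size(); }
};
```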

117b 〈rewrite table data structure and functions 115a〉+≡ (131c) ⊳ 117a 118a ⊲

#include "unification.h"

    static vector<pair<string, type *> > slns;
    static void cleanup_slns() {
        for (uint i=0; i!=slns.size(); i++) delete_type(slns[i].second);
        slns.clear();
    }

Defines:
    cleanup_slns, used in chunk 118a.
Uses clear 145b, delete_type 77b 77c, and unification.h 85b.


Comment 3.4.15. Given a transformation t (identified by tnum) and its type ttype, the function find_unifiers computes the indices of every entry in the rewrite table whose head is t and whose recorded type unifies with ttype.

118a 〈rewrite table data structure and functions 115a〉+≡ (131c) ⊳ 117b 118b ⊲

    vector<int> find_unifiers(int tnum, type * ttype) {
        vector<int> ret;
        for (uint i=0; i!=rewrite_table.size(); i++) {
            if (tnum != rewrite_table[i].first.first) continue;
            assert(slns.size() == 0);
            bool res = unify(slns, ttype, rewrite_table[i].first.second);
            if (res) ret.push_back(i);
            cleanup_slns();
        }
        return ret;
    }

Defines:
    find_unifiers, used in chunks 121b, 124, 129, and 130a.
Uses cleanup_slns 117b, rewrite_table 115a, and unify 88.

Comment 3.4.16. We freeze the rewrite table after processing the input file. This is a defensive programming step; it helps to rule out this part of the code when tracking bugs.

118b 〈rewrite table data structure and functions 115a〉+≡ (131c) ⊳ 118a 118c ⊲

    void freeze_rewrite_table() {
        freeze = true;
        cerr << "Rewrite table size = " << rewrite_table.size() << endl;
    }

Defines:
    freeze_rewrite_table, used in chunk 217c.
Uses freeze 115a and rewrite_table 115a.

Comment 3.4.17. Here, we free up the memory occupied by the rewrite table. This function should only be called after the completion of the learning process.

118c 〈rewrite table data structure and functions 115a〉+≡ (131c) ⊳ 118b

    void cleanup_rewrite() {
        cerr << "Cleaning up the rewrite system......";
        for (uint i=0; i!=rewrite_table.size(); i++) {
            delete_type(rewrite_table[i].first.second);
            delete rewrite_table[i].second;
        }
        cerr << "Done.\n";
    }

Defines:
    cleanup_rewrite, used in chunk 8.
Uses delete_type 77b 77c and rewrite_table 115a.

Comment 3.4.18. We calculate (potential) redexes with respect to the predicate rewrite system on the external representations only. The function is described here.

For each transformation, the calRedexes function goes through the list of eligible subterms of unit length and records those that are in rewpreds. (The test is done using the function isRewpred.) The redexes in the arguments of the transformations are computed recursively.

Note that it is the standard predicate structure to which the redex belongs that is returned, not the redex itself. For example, in ∧2 (proj3 · top proj1 · (= A)), it is the standard predicate proj3 · top, and not top alone, that is returned. This is one weakness in the representation: the fact that a transformation is also a standard predicate cannot be captured properly.


The function in its current state only caters for the case when the heads of rewrites are all zero-ranked transformations (that are also predicates). This is true of most commonly encountered predicate rewrite systems. However, this limitation is undesirable and should be fixed in the future. Provisions for doing this have been designed into the data structures involved. Note that redexes is a vector of standard predicates, not a vector of transformations as one would expect in this case. This built-in generality provides scope for extension in the future. The function isRewpred will also need to be changed to take more complex structures into account.

119a 〈std predicate function declarations 105c〉+≡ (101b) ⊳ 114b 122d ⊲

    vector<std_predicate *> redexes;
    void calRedexes();
    bool rewritable() { if (redexes.size()) return true; else return false; }

Defines:
    calRedexes, used in chunks 104a, 123b, and 173a.
    redexes, used in chunks 103d, 119b, 121–24, and 183d.
    rewritable, used in chunks 123b and 182c.


119b 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 113b 123a ⊲


    void std_predicate::calRedexes() {
        redexes.clear();
        for (uint i=0; i!=transformations.size(); i++) {
            transformation_t * p = transformations[i];
            if (i == transformations.size()-1 && isRewpred(p->tnum))
                redexes.push_back(this);
            uint ks1, ks2;
            for (uint j=0; j!=p->args.size(); j++) {
                std_predicate * p2 = p->args[j];
                p2->calRedexes();
                ks1 = redexes.size(); ks2 = p2->redexes.size();
                redexes.resize(ks1+ks2);
                for (uint k=0; k!=ks2; k++)
                    redexes[ks1+k] = p2->redexes[k];
                p2->redexes.clear();
            }
        }
    }

Defines:
    calRedexes, used in chunks 104a, 123b, and 173a.
Uses clear 145b, isRewpred 116a 116b, redexes 119a, rewrite.h 131b, std_predicate 101b, and transformation_t 101a.

3.5 Predicate Enumeration

Comment 3.5.1. To efficiently construct predicates in many practical applications, some care is necessary. Note that the search space defined by a predicate rewrite system is (usually) a graph, not a tree. In other words, there may be many paths from some initial predicate p to some predicate q.

In the following, we give two algorithms for systematically enumerating the search space defined by a predicate rewrite system. Each has advantages and disadvantages, depending on the situation.

Comment 3.5.2. The first solution uses a version of the classical “anagram algorithm”. A set of (regularised forms of) previously seen predicates is maintained during search. Each time a new predicate is generated, the algorithm regularises the predicate and checks whether it is in the seenSet, adding it if it is not. This tactic essentially turns the search space back into a tree. Figure 3.2 shows the algorithm.

function Enumerate(↣, p0) returns the set of standard predicates in the search space rooted at p0 using ↣;

input: ↣, a predicate rewrite system;
       p0, a standard predicate;

predicates := {};
openList := [p0];
seenSet := {p0};
while openList ≠ [] do
    p := head(openList);
    openList := tail(openList);
    predicates := predicates ∪ {p};
    for each redex r via r ↣ b, for some b, in p do
        q := Regularise(p[r/b]);
        if q ∉ seenSet then
            seenSet := seenSet ∪ {q};
            openList := openList ++ [q];
return predicates;

Figure 3.2: Algorithm I
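Algorithm I can be exercised on a toy rewrite system. The sketch below is an illustration under strong assumptions, not the Alkemy implementation: a predicate is a list of conjunct names, every occurrence of "top" is a redex, each rewrite replaces "top" by one of the given bodies, and regularisation is simply sorting the conjuncts. The names `Pred`, `regularise` and `enumerateSeen` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <string>
#include <vector>

// Toy predicate: a list of conjunct names (an assumption for illustration).
using Pred = std::vector<std::string>;

// Toy regularisation: the order of conjuncts does not matter, so sort them.
Pred regularise(Pred p) {
    std::sort(p.begin(), p.end());
    return p;
}

// Breadth-first enumeration with a seen set, following Figure 3.2.
std::vector<Pred> enumerateSeen(const Pred& p0,
                                const std::vector<std::string>& bodies) {
    std::vector<Pred> predicates;
    std::deque<Pred> openList;
    std::set<Pred> seenSet;
    openList.push_back(p0);
    seenSet.insert(p0);
    while (!openList.empty()) {
        Pred p = openList.front();
        openList.pop_front();
        predicates.push_back(p);
        for (std::size_t i = 0; i < p.size(); ++i) {
            if (p[i] != "top") continue;          // only "top" is a redex here
            for (const std::string& b : bodies) {
                Pred q = p;
                q[i] = b;                         // q := p[r/b]
                q = regularise(q);
                if (seenSet.insert(q).second)     // true only for unseen q
                    openList.push_back(q);
            }
        }
    }
    return predicates;
}
```

Starting from the two-conjunct predicate (top, top) with bodies a and b, several derivations lead to the same regularised predicate, and the seen set keeps only the first one reached.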

Comment 3.5.3. We have a function rewriteP that implements Algorithm I (and Algorithm II, to be described later) in on-line mode. In other words, we do not generate the whole search space in one go. Rather, the function takes as input a node in the search tree and generates only its children. Algorithm I is called the EXPECTED enumeration strategy here.

120 〈rewrite::public functions 116a〉+≡ (131b) ⊳ 116a 126b ⊲

    extern vector<std_predicate *> rewriteP(std_predicate & in);

Defines:
    rewriteP, used in chunk 173c.



Comment 3.5.4. The variable child is initialised with 1 because 0 will be used by the parent in LR mode.

121a 〈rewrite::functions 121a〉≡ (131c) 126a ⊲

    // #define DEBUG_REWRITE
    vector<std_predicate *> rewriteP(std_predicate &in) {
        vector<std_predicate *> ret;
        uint child = 1;
        uint i, j, k;
        int tnum; transformation_t * trans;
        type_abstraction * postype;
        std_predicate * temp = NULL;
        if (options.strategy == LR) { 〈rewrite::LR 124〉 }
    #ifndef NO_GMP
        else if (options.strategy == EXPECTED) { 〈rewrite::EXPECTED 121b〉 }
    #endif
        return ret;
    }

Defines:
    rewriteP, used in chunk 173c.
Uses EXPECTED 234, LR 234, options 234 235, std_predicate 101b, transformation_t 101a, and type_abstraction 82c.

Comment 3.5.5. For each redex, we compute its positional type and use it to obtain the set of applicable rewrites using find_unifiers. We then apply each rewrite to generate new predicates.

121b 〈rewrite::EXPECTED 121b〉≡ (121a)

    for (i=0; i<in.redexes.size(); i++) {
        trans = in.redexes[i]->transformations.back();
        tnum = trans->tnum;
        〈rewrite::compute positional type 122a〉
        vector<int> index = find_unifiers(tnum, postype->getSource());
        for (k=0; k!=index.size(); k++) {
            type_rewrites_t * p = rewrite_table[index[k]].second;
            for (j=0; j!=p->rewrites.size(); j++) {
                〈rewrite::prevent recursive rewrites 122b〉
                〈rewrite::clone the input predicate 122c〉
                〈rewrite::apply the rewrite 123b〉
                〈rewrite::insert into outlist if not seen 123c〉
            }
        }
        if (temp) { temp->freememory(); temp = NULL; }
        if (options.one_redex) break;
    }

Uses find_unifiers 116a 118a, freememory 19a 19c, getSource 82c 83d 106 110d, options 234 235, redexes 119a, rewrite_table 115a, and type_rewrites_t 115b.


Comment 3.5.6. If the redex is not a top, we replace it by top, which is equivalent to putting a variable there. We then proceed to calculate the positional type using recalculateType.

122a 〈rewrite::compute positional type 122a〉≡ (121b 124)

    if (tnum != getTopID()) {
        temp = in.clone();
        temp->redexes[i]->transformations.back()->freememory();
        temp->redexes[i]->transformations.back() = new transformation_t;
        temp->recalculateType();
        postype = dcast<type_abstraction *>
            (temp->redexes[i]->transformations.back()->ttype);
    } else
        postype = dcast<type_abstraction *>(trans->ttype);
    assert(postype); assert(postype->getSource());

Uses clone 19a 19b, freememory 19a 19c, getSource 82c 83d 106 110d, getTopID 238a, recalculateType 107a 110a 110d, redexes 119a, transformation_t 101a, and type_abstraction 82c.

Comment 3.5.7. The aim here is to get rid of all forms of recursive rewrites to make sure the search space is finite. Here, we make sure that a transformation p is not used to rewrite arguments of p. This prevents recursive rewrites of the form ∧2(top ∧2 (top top)) but not those of the form ∧2(top ∧3 (top top top)), for example. Also, this is only a one-level deep check. When we have a predicate rewrite of the form

top ↣ (subgraphs3 ) setExists1 top

for example, (undesirable) predicates like

(subgraphs3 ) setExists1 ((subgraphs3 ) setExists1 top)

will get generated.

How do we handle this problem in general? It is true that with care, predicate rewrite systems can be designed such that these undesirable predicates do not get generated, but maybe there is a better answer...

122b 〈rewrite::prevent recursive rewrites 122b〉≡ (121b 124)

    if (!options.recursive && trans->parent)
        if (trans->parent->tnum == p->rewrites[j]->transformations[0]->tnum)
            continue;

Uses options 234 235.

Comment 3.5.8. Here, we obtain a clone of the input predicate for use with its child and compute the new encoding for the child.

122c 〈rewrite::clone the input predicate 122c〉≡ (121b 124)

    std_predicate * newrw = in.clone();
    newrw->encoding.push_back(child); child++;

Uses clone 19a 19b, encoding 122d, and std_predicate 101b.

Comment 3.5.9. The encoding operation is related to the predicate evaluation algorithm. See §5.7 for details. Every predicate is encoded using a vector of integers. The root node of the search tree is encoded with the 1-dimensional vector [1]. The y-th child q of a predicate p is encoded with the (n+1)-dimensional vector [x, y], where x is the n-dimensional vector encoding p. The computation makes use of the following members of standard predicates.

122d 〈std predicate function declarations 105c〉+≡ (101b) ⊳ 119a 125a ⊲

    vector<int> encoding;
    void printCode();

Defines:
    encoding, used in chunks 103d, 104a, 122c, 123a, 173a, 183d, and 190–93.
    printCode, used in chunk 181a.


123a 〈standard predicate member functions 103d〉+≡ (131a) ⊳ 119b

    void std_predicate::printCode() {
        cout << "[";
        for (uint i=0; i!=encoding.size(); i++) {
            cout << encoding[i];
            if (i != encoding.size()-1) cout << ",";
        }
        cout << "]";
    }

Defines:
    printCode, used in chunk 181a.

Uses encoding 122d and std_predicate 101b.
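The path encoding of Comment 3.5.9 amounts to appending the child index to the parent's code. A minimal sketch follows, with hypothetical helper names `childCode` and `formatCode` (the latter mirrors the output format of printCode but returns a string instead of writing to cout).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Code of the y-th child: the parent's code followed by y.
std::vector<int> childCode(const std::vector<int>& parent, int y) {
    std::vector<int> code = parent;
    code.push_back(y);
    return code;
}

// Format a code as "[c1,c2,...]".
std::string formatCode(const std::vector<int>& code) {
    std::string out = "[";
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (i > 0) out += ",";
        out += std::to_string(code[i]);
    }
    return out + "]";
}
```

Because each code extends the code of its parent, the code of a predicate records the whole path from the root of the search tree.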

Comment 3.5.10. To apply the rewrite r → s, we first free up the r term in question, which is always the last transformation in the transformations vector. We then put a cloned copy of s in its place, adjusting the parent pointers accordingly. Having done that, we compute the new (potential) redexes on the resulting predicate. Finally, we recalculate the type of the predicate if necessary.

123b 〈rewrite::apply the rewrite 123b〉≡ (121b 124)

    std_predicate * newpred = p->rewrites[j];
    int nsize = newrw->redexes[i]->transformations.size();
    newrw->redexes[i]->transformations[nsize-1]->freememory();
    newrw->redexes[i]->transformations[nsize-1]
        = newpred->transformations[0]->clone();
    newrw->redexes[i]->transformations[nsize-1]->ttype->renameParameters();
    newrw->redexes[i]->transformations[nsize-1]->parent =
        newrw->redexes[i]->parent;
    for (uint l=1; l<newpred->transformations.size(); l++) {
        transformation_t * temp = newpred->transformations[l]->clone();
        temp->parent = newrw->redexes[i]->parent;
        newrw->redexes[i]->transformations.push_back(temp);
    }
    newrw->calRedexes();
    if (newpred->isPolymorphic()) newrw->polymorphic = 1;
    if (newrw->rewritable() && newrw->isPolymorphic() && newpred->isPolymorphic())
        newrw->recalculateType();

Uses calRedexes 119a 119b, clone 19a 19b, freememory 19a 19c, isPolymorphic 109d 110d, polymorphic 108b, recalculateType 107a 110a 110d, redexes 119a, renameParameters 79b 80d, rewritable 119a, std_predicate 101b, and transformation_t 101a.

Comment 3.5.11. We insert the newly-generated predicate if we have not seen it before. We use the encode function (described in §3.6.1) to encode predicates.

123c 〈rewrite::insert into outlist if not seen 123c〉≡ (121b)

    pair<vector<int>, bool> iencoded = newrw->regularise();
    bigint encoded;
    encode(iencoded.first, encoded);
    pair<set<bigint>::iterator, bool> result = seenset.insert(encoded);
    if (result.second) ret.push_back(newrw);
    else { newrw->freememory(); encoded.freememory(); } // seen before

Uses bigint 239, encode 126a, freememory 19a 19c, insert 11d, and regularise 113b 114b.

Comment 3.5.12. The algorithm is simple and works well in conjunction with heuristic searches. The main problem is that the memory consumed by the seenset does not scale. The implication is that for sufficiently large search spaces, an exhaustive enumeration is computationally impossible. This motivates the development of the next algorithm.

Comment 3.5.13. We now look at another enumeration algorithm. The central idea of this algorithm is that one can introduce a restricted form of redex selection scheme and discard non-regular predicates during enumeration to avoid the construction of equivalent predicates.

We now address the restricted form of redex selection. Note that occurrences (of subterms) can be totally ordered by the lexicographic ordering, denoted by ≤. The strict lexicographic ordering is denoted by <.

Definition 3.5.14. Let $\langle p_0, p_1, \ldots, p_n \rangle$ be a predicate derivation, where $o_i$ is the occurrence of the redex selected in $p_i$, for $i = 0, \ldots, n-1$. Then the predicate derivation is said to be LR if $o_{i-1} \le o_i$, for $i = 1, \ldots, n-1$.

Comment 3.5.15. LR stands for ‘left-to-right’. The intuitive idea behind the definition is that a redex in the predicate derivation is always ‘at or to the right of’ the previously selected redex. Hence the selection of redexes proceeds left-to-right.

Comment 3.5.16. The algorithm is given in Figure 3.3. In the figure, the phrase LR redex means the redex is selected according to the LR selection rule, that is, the redex must be at or to the right of the redex selected in the parent predicate of p.

function Enumerate2(↣, p0) returns the set of regular predicates in a search space rooted at p0 using ↣;

input: ↣, a predicate rewrite system;
       p0, a predicate;

predicates := {};
openList := [p0];
while openList ≠ [] do
    p := head(openList);
    openList := tail(openList);
    predicates := predicates ∪ {p};
    for each LR redex r via r ↣ b, for some b, in p do
        q := p[r/b];
        if q is regular then openList := openList ++ [q];
return predicates;

Figure 3.3: Algorithm II
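Algorithm II can be tried on a toy system in the same spirit. The assumptions are again strong and are not taken from Alkemy: a predicate is a list of conjunct names, only "top" is a redex, a predicate counts as regular when its conjuncts are in sorted order, and the marker stored with each node is the position of the redex that produced it. The names `Node`, `isRegular` and `enumerateLR` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

using Pred = std::vector<std::string>;

// A node of the search: the predicate and the position of the redex that
// was rewritten to obtain it (0 for the initial predicate).
struct Node {
    Pred pred;
    std::size_t marker;
};

// Toy notion of regularity: conjuncts appear in sorted order.
bool isRegular(const Pred& p) { return std::is_sorted(p.begin(), p.end()); }

// Enumeration with LR redex selection and no seen set, following Figure 3.3.
std::vector<Pred> enumerateLR(const Pred& p0,
                              const std::vector<std::string>& bodies) {
    std::vector<Pred> predicates;
    std::deque<Node> openList;
    openList.push_back(Node{p0, 0});
    while (!openList.empty()) {
        Node n = openList.front();
        openList.pop_front();
        predicates.push_back(n.pred);
        // LR rule: only redexes at or to the right of the marker.
        for (std::size_t i = n.marker; i < n.pred.size(); ++i) {
            if (n.pred[i] != "top") continue;
            for (const std::string& b : bodies) {
                Pred q = n.pred;
                q[i] = b;
                if (isRegular(q)) openList.push_back(Node{q, i});
            }
        }
    }
    return predicates;
}
```

On the initial predicate (top, top) with bodies a and b, this toy version produces each regular predicate once without storing previously seen predicates. Whether that holds for a real predicate rewrite system depends on the conditions discussed in Comment 3.5.19.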

124 〈rewrite::LR 124〉≡ (121a)

    for (i=in.marker; i<in.redexes.size(); i++) {
        trans = in.redexes[i]->transformations.back();
        tnum = trans->tnum;
        〈rewrite::compute positional type 122a〉
        vector<int> index;
        index = find_unifiers(tnum, postype->getSource());
        for (k=0; k!=index.size(); k++) {
            type_rewrites_t * p = rewrite_table[index[k]].second;
            for (j=0; j!=p->rewrites.size(); j++) {
                〈rewrite::prevent recursive rewrites 122b〉
                〈rewrite::clone the input predicate 122c〉
                newrw->marker = i;
                〈rewrite::apply the rewrite 123b〉
                〈rewrite::insert into outlist if regular 125b〉
            }
        }
        if (temp) { temp->freememory(); temp = NULL; }
        if (options.one_redex) break;
    }

Uses find_unifiers 116a 118a, freememory 19a 19c, getSource 82c 83d 106 110d, marker 125a, options 234 235, redexes 119a, rewrite_table 115a, and type_rewrites_t 115b.

Comment 3.5.17. We use a marker to keep track of the last rewritten redex. This is a member of standard predicates.

125a 〈std predicate function declarations 105c〉+≡ (101b) ⊳ 122d

int marker;

Defines:
    marker, used in chunks 22–24, 103d, 104a, 124, and 183d.

Comment 3.5.18. A predicate is kept iff it is regular.

125b 〈rewrite::insert into outlist if regular 125b〉≡ (124)

    pair<vector<int>, bool> iencoded = newrw->regularise();
    if (iencoded.second) {
        ret.push_back(newrw);
        if (options.enumSpace && options.verbosity == 2) {
            cout << "top >-> "; newrw->print(); cout << "; \n";
        }
    } else newrw->freememory();

Uses freememory 19a 19c, options 234 235, and regularise 113b 114b.

Comment 3.5.19. This algorithm, while conceptually very simple, raises some difficult theoretical questions. Define the expected predicates of a predicate rewrite system ↣ to be the final predicates of all predicate derivations with respect to ↣ that are obtainable with no restrictions on the selection of redexes. The questions are:

1. Is the search directed by the algorithm complete, that is, are (the regularisations of) all the expected predicates actually generated by regular, LR derivations?

2. Is each (regularisation of an) expected predicate generated exactly once?

It is shown in [Llo03] that under mild conditions on the predicate rewrite system, these two questions can be answered in the affirmative.

Comment 3.5.20. Algorithm II is very effective for most kinds of searches. However, it can work poorly when used in conjunction with certain types of heuristic searches. A non-regular predicate generated during search is thrown away even though it might be the best hypothesis encountered so far. In such scenarios, Algorithm I is more appropriate.

3.6 Miscellaneous Functions

Comment 3.6.1. We now look at the efficient encoding of predicates. A potentially large number of predicates needs to be stored during learning, e.g., in the seenset, in the FAP table and in the open-list, so memory usage can be a serious problem. The function encode can be used to encode vector representations of standard predicates into arbitrary-precision integers. (See Comment 7.2.4 for details.) It implements the encoding part of a bijection between vectors and integers.

To see how it works, consider an m-dimensional space of integer tuples, where dimension i takes on values from 0 to $n_i - 1$. The cardinality of the tuple space is $N = \prod_{i=1}^{m} n_i$. We want to establish a bijection between the tuple space and [0..N − 1].

The basic idea is rather simple: we first partition the interval [0..N − 1] into $n_1$ equal subintervals. Then we partition each subinterval into $n_2$ subsubintervals, and so on recursively. Given a tuple v, its encoding is simply given by

$$v_1 \times (n_2 \times n_3 \times \cdots \times n_m) + v_2 \times (n_3 \times \cdots \times n_m) + \cdots + v_{m-1} \times n_m + v_m.$$

We use multiple-precision integers from the GNU multiple-precision library here. The type bigint is declared in the global area.

This function can also be used to reduce the memory requirements of the open-list, but that would require a decode function as well.

The desired value is made an argument of the function, rather than its return value, because we need to release the memory occupied by the bigint variables.

126a 〈rewrite::functions 121a〉+≡ (131c) ⊳ 121a 127a ⊲

    #ifndef NO_GMP
    #include <stdexcept>
    #include <global.h>
    static void encode(vector<int> inter, bigint & ret) {
        int size = inter.size(); assert(size > 0);
        for (int i=0; i!=size; i++) assert(inter[i] >= 0);

        bigint range, multiplier, temp;
        range = trans_table_size() + 1;
        multiplier = 1;
        ret = inter[size-1];
        for (int i=size-2; i!=-1; i--) {
            multiplier = multiplier * range;
            temp = multiplier * inter[i];
            ret = ret + temp;
        }
        range.freememory(); multiplier.freememory(); temp.freememory();
    }
    #endif

Defines:
    encode, used in chunks 123c and 131c.
Uses bigint 239, freememory 19a 19c, global.h 232, and trans_table_size 238a.
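The positional encoding used by encode can be tried with ordinary fixed-width integers. The sketch below is a simplified stand-in, not the GMP-based code above: it assumes a single radix `range` for every dimension (as encode does with trans_table_size() + 1), assumes every component is smaller than `range`, and assumes the result fits in 64 bits. The `decode` function is hypothetical; the text only notes that one would be needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode v as a number in base `range`, most significant component first.
// Horner's scheme gives v1*range^(m-1) + ... + v(m-1)*range + vm.
std::uint64_t encode(const std::vector<int>& v, std::uint64_t range) {
    assert(!v.empty());
    std::uint64_t ret = 0;
    for (int d : v) {
        assert(d >= 0 && static_cast<std::uint64_t>(d) < range);
        ret = ret * range + static_cast<std::uint64_t>(d);
    }
    return ret;
}

// Inverse of encode for vectors of a known length `len`.
std::vector<int> decode(std::uint64_t code, std::uint64_t range, std::size_t len) {
    std::vector<int> v(len);
    for (std::size_t i = len; i-- > 0;) {
        v[i] = static_cast<int>(code % range);
        code /= range;
    }
    return v;
}
```

With range 10 the vector (1, 2, 3) is encoded as 123. Note that decoding needs the vector length: vectors of different lengths that differ only by leading zeros receive the same code, so the mapping is a bijection only for a fixed length.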

Comment 3.6.2. This procedure resets the seenset for later use. It is essential that we free up the memory occupied by the GMP integers first. It turns out that one cannot call any non-const member functions of objects in a set using iterators, since modifying the state of a member of the set might require it to be moved in the underlying red-black tree. The only provision for this in the C++ Standard is the mutable attribute that can be declared on class members. But that is not exactly what we want. That is a new language feature and is not well-supported in existing compilers in any case.

Here, we use a temporary holder to hold the bigints before deallocating. This is obviously not efficient, but it is correct. This procedure does not get called very often anyway.

126b 〈rewrite::public functions 116a〉+≡ (131b) ⊳ 120 127b ⊲

extern void rewrite_refresh();

Defines:
    rewrite_refresh, used in chunks 171b and 186.


127a 〈rewrite::functions 121a〉+≡ (131c) ⊳ 126a 127c ⊲

    void rewrite_refresh() {
    #ifndef NO_GMP
        vector<bigint> temp; int i=0; set<bigint>::iterator p;
        if (options.strategy == EXPECTED) {
            cerr << "| seen set | = " << seenset.size() << endl;
            p = seenset.begin();
            while (p != seenset.end()) {
                temp.push_back(*p); temp[i++].freememory(); p++;
            }
            seenset.clear(); temp.clear(); i=0;
        }
    #endif
    }

Defines:
    rewrite_refresh, used in chunks 171b and 186.

Uses bigint 239, clear 145b, EXPECTED 234, freememory 19a 19c, and options 234 235.

Comment 3.6.3. This function calculates the total number of regular predicates in the search space. It can be invoked by toggling the -e flag on the command line. The algorithm is described in some detail in [Ng05].

127b 〈rewrite::public functions 116a〉+≡ (131b) ⊳ 126b

    #ifndef NO_GMP
    extern void spsize2(bigint & ret);
    #endif

Defines:
    spsize, never used.
    spsize2, used in chunks 130b and 185a.

Uses bigint 239.

127c 〈rewrite::functions 121a〉+≡ (131c) ⊳ 127a

    #ifndef NO_GMP
    〈rewrite::spsize2 functions 127d〉
    #endif

127d 〈rewrite::spsize2 functions 127d〉≡ (127c) 128a ⊲

    // Computes the binomial coefficient n!/((n-k)! k!).
    // The second loop terminates only for k >= 1.
    void binom(bigint & n, bigint & k, bigint & ret) {
        assert(n >= k);
        bigint x, y, i, nk;
        x = 1; y = 1;
        nk = n-k;
        for (i=n; i!=nk; i=i-1) x = x * i;
        for (i=k; i!=1; i=i-1) y = y * i;
        ret = x / y;
        x.freememory(); y.freememory(); i.freememory(); nk.freememory();
    }

Defines:
    binom, used in chunk 129.
Uses bigint 239 and freememory 19a 19c.


128a 〈rewrite::spsize2 functions 127d〉+≡ (127c) ⊳ 127d 128b ⊲

    int flatElements(transformation_t * t) {
        int ret = 1;
        for (uint i=t->args.size()-1; i!=0; i--)
            if (t->args[i]->spEqual(t->args[i-1])) ret++;
        return ret;
    }

Defines:
    flatElements, used in chunk 129.

Uses spEqual 105a and transformation_t 101a.

128b 〈rewrite::spsize2 functions 127d〉+≡ (127c) ⊳ 128a 129 ⊲

    void countT(transformation_t * t, bigint & ret);

    void countP(std_predicate * p, bigint & ret) {
        bigint temp;
        ret = 1;
        for (uint i=0; i!=p->transformations.size(); i++) {
            countT(p->transformations[i], temp);
            ret = ret * temp;
        }
        temp.freememory();
    }

Defines:
    countP, used in chunks 129 and 130a.
    countT, never used.

Uses bigint 239, freememory 19a 19c, std_predicate 101b, and transformation_t 101a.


129 〈rewrite::spsize2 functions 127d〉+≡ (127c) ⊳ 128b 130a ⊲

    void countT(transformation_t * t, bigint & ret) {
        bigint temp, m, arg1, arg2;
        ret = 1;
        if (isRewpred(t->tnum)) {
            vector<int> index = find_unifiers(t->tnum, t->getSource());
            if (index.size() == 0) goto countTend;

            type_rewrites_t * hook;
            for (uint j=0; j!=index.size(); j++) {
                hook = rewrite_table[index[j]].second;
                for (uint i=0; i!=hook->rewrites.size(); i++) {
                    // disallow recursive rewrites
                    if (t->parent &&
                        hook->rewrites[i]->transformations[0]->tnum
                        == t->parent->tnum)
                        continue;
                    countP(hook->rewrites[i], temp);
                    ret = ret + temp;
                }
            }
            goto countTend;
        }
        if ((find_trans_info(t->tnum))->symmetricity && t->args.size()) {
            int k = flatElements(t); assert(k > 0);
            arg2 = k;
            if (k == t->rank) {
                countP(t->args[0], m);
                arg1 = m + arg2 - 1;
                binom(arg1, arg2, temp);
                ret = ret * temp;
                goto countTend;
            }
        }
        // else compute an upper bound
        for (uint i=0; i!=t->args.size(); i++) {
            countP(t->args[i], temp);
            ret = ret * temp;
        }
    countTend:
        temp.freememory(); m.freememory();
        arg1.freememory(); arg2.freememory();
    }

Defines:
    countT, never used.
Uses bigint 239, binom 127d, countP 128b, find_trans_info 237a, find_unifiers 116a 118a, flatElements 128a, freememory 19a 19c, getSource 82c 83d 106 110d, isRewpred 116a 116b, rewrite_table 115a, transformation_t 101a, and type_rewrites_t 115b.
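In the symmetric case, countT calls binom with arguments m + k − 1 and k. One reading of this, offered here as an interpretation rather than a statement from the source, is that the k identical arguments of a symmetric transformation can each be rewritten to any of m predicates and that argument order is irrelevant, so the count is the number of multisets of size k drawn from m items, namely C(m + k − 1, k). The sketch below checks that identity by brute force using 64-bit integers; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Binomial coefficient C(n, k), computed incrementally. After step i the
// running value equals C(n - k + i, i), so every division is exact.
std::uint64_t binom(std::uint64_t n, std::uint64_t k) {
    assert(n >= k);
    std::uint64_t r = 1;
    for (std::uint64_t i = 1; i <= k; ++i) r = r * (n - k + i) / i;
    return r;
}

// Number of multisets of size k over m distinct items (m >= 1).
std::uint64_t multisetCount(std::uint64_t m, std::uint64_t k) {
    assert(m >= 1);
    return binom(m + k - 1, k);
}

// Brute-force check: count non-decreasing sequences of length k with
// entries in [minValue, m), one sequence per multiset.
std::uint64_t bruteForce(unsigned m, unsigned k, unsigned minValue = 0) {
    if (k == 0) return 1;
    std::uint64_t total = 0;
    for (unsigned v = minValue; v < m; ++v) total += bruteForce(m, k - 1, v);
    return total;
}
```

For m = 3 and k = 2 both functions give 6, compared with 3 × 3 = 9 for the plain product that serves as the upper bound in the non-symmetric branch.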


130a 〈rewrite::spsize2 functions 127d〉+≡ (127c) ⊳ 129

    void spsize2(bigint & ret) {
        bigint temp;
        ret = 1;
        vector<int> index;
        index = find_unifiers(getTopID(), get_type(topleveltype).second);
        〈spsize2::error message 130b〉
        type_rewrites_t * hook;
        for (uint j=0; j!=index.size(); j++) {
            hook = rewrite_table[index[j]].second;
            for (uint i=0; i!=hook->rewrites.size(); i++) {
                countP(hook->rewrites[i], temp);
                ret = ret + temp;
            }
        }
        temp.freememory();
    }

Defines:
    spsize2, used in chunks 130b and 185a.
Uses bigint 239, countP 128b, find_unifiers 116a 118a, freememory 19a 19c, get_type 245c, getTopID 238a, rewrite_table 115a, topleveltype 234 235, and type_rewrites_t 115b.

130b 〈spsize2::error message 130b〉≡ (130a)

    if (index.size() != 1) {
        cout << "Error in spsize2(): You have two entries in the rewrite "
                "system with the same redex.\n";
        cout << "index.size() = " << index.size() << "\t";
        for (uint i=0; i!=index.size(); i++)
            cout << index[i] << " ";
        cout << endl;
        exit(1);
    }

Uses redex 15b and spsize2 127b 130a.

3.7 Module Structure

130c 〈predicate.h 130c〉≡

    #ifndef _PREDICATE_H_
    #define _PREDICATE_H_

    #include <string>
    #include <vector>
    #include "terms.h"
    #include "types.h"

    using namespace std;
    〈predicate::representations 101a〉
    #endif

Defines:
    predicate.h, used in chunks 131 and 229.

Uses terms.h 9 and types.h 76a.


131a 〈predicate.cc 131a〉≡

    #include "predicate.h"
    #include "global.h"
    #include <string.h>
    #include <utility>
    #include "unification.h"

    〈transformation member functions 102a〉
    〈standard predicate member functions 103d〉

Uses global.h 232, predicate.h 130c, and unification.h 85b.

131b 〈rewrite.h 131b〉≡

    #ifndef _REWRITE_H_
    #define _REWRITE_H_

    #include <utility>
    #include <vector>
    #include <list>
    #include <set>
    #include <map>
    #include <stdio.h>
    #include "predicate.h"
    #include "global.h"

    〈rewrite::struct type-rewrites-t 115b〉
    〈rewrite::public functions 116a〉

    #endif

Defines:
    rewrite.h, used in chunks 2, 119b, 131c, 149, 158a, 189b, and 206.
Uses global.h 232, predicate.h 130c, and unification.h 85b.

131c 〈rewrite.cc 131c〉≡
    #include "rewrite.h"

    #ifndef NO_GMP
    static set<bigint> seenset;
    static void encode(vector<int> inter, bigint & ret);
    #endif

    〈rewrite table data structure and functions 115a〉
    〈rewrite::functions 121a〉

Uses bigint 239, encode 126a, and rewrite.h 131b.

Chapter Notes

If thou art able, O stranger, to find out all these things
and gather them together in your mind, giving all the relations,
thou shalt depart crowned with glory
and knowing that thou hast been adjudged perfect in this species of wisdom.
Anonymous (in Ivor Thomas "Greek Mathematics")

Chapter 4

Training and Test Individuals

Comment 4.0.1. This is the class definition for Individuals.

132 〈training set::data structures 132〉≡ (133)

    #include <string>
    #include "terms.h"

    struct label_types { string clabel; float rg; };

    enum settype { TRAIN, TEST, VALID, UNDEFINED, OBSOLETE };

    struct Individual {
        string id;
        label_types label;
        int group;
        settype membership;
        int ptr_tr, ptr_ts, ptr_vl;
        float weight;
        unsigned int classified; // used by boosting
        term_schema * individual;
        Individual() { group = -1; membership = UNDEFINED; weight = 1.0; }
        void setweight(float w) { weight = w; }
        void freememory() { individual->freememory(); }
        void print() const {
            individual->print();
            if (options.learning_mode == CLASSIFICATION)
                cout << "\t" << label.clabel << endl;
            else cout << "\t" << label.rg << endl;
        }
    };

Defines:
    Individual, used in chunks 134–36, 147b, 148, 153, 154, 184a, 192, 194a, 201–5, 212a, and 214a.
    label_types, never used.
    OBSOLETE, used in chunks 136a, 139a, and 162a.
    TEST, used in chunks 137–39, 141, 162a, 202b, and 205.
    TRAIN, used in chunks 137–39, 141, 142a, 162a, 202b, 203, and 205.
    UNDEFINED, used in chunks 139a and 162a.
    VALID, used in chunks 138a, 139a, 141, and 162a.

Uses CLASSIFICATION 234, freememory 19a 19c, label 21, options 234 235, term_schema 9, and terms.h 9.

Comment 4.0.2. A few words about some attributes are in order here. In some experiments, the data set needs to be partitioned into three disjoint subsets: a training set for inducing the decision tree, a validation set for tree post-pruning, and a test set for statistical estimation of the true error rate. The attribute membership records this information. The three ptr attributes serve as pointers to the next item in the designated subset.

Comment 4.0.3. This module implements the data structures for storing and manipulating the training examples. Two sets of individuals are maintained: a training set and a test set.

133 〈trainset.h 133〉≡
    #ifndef _TRAINSET_H_
    #define _TRAINSET_H_

    #include "global.h"

    using namespace std;
    〈training set::data structures 132〉
    〈training set::public functions 134〉

    #endif

Defines:
    trainset.h, used in chunks 2, 135a, 144, 149, and 206.

Uses global.h 232.

Comment 4.0.4. The following lists the public functions of the module. Note that this is a C-style module. To access the training individuals through these interface functions, a module must include the header file trainset.h.

134 〈training set::public functions 134〉≡ (133)

    // administration
    extern void cleanup_trainset();
    // add and get individuals
    extern int addTrIndividual(Individual x);
    extern int addTsIndividual(Individual x);
    extern Individual & getTrIndividual(unsigned int x);
    extern Individual & getTrIndividualForce(unsigned int x);
    extern Individual & getTsIndividual(unsigned int x);
    // classes
    extern unsigned int addClass(string classname, float weight);
    extern int getClassIndex(string classname);
    extern string getClassString(unsigned int x);
    extern unsigned int getClassCount();
    extern float getClassWeight(unsigned int x);
    extern float getClassWeight(string name);
    // get sizes
    extern unsigned int getTrSize();
    extern unsigned int getTsSize();
    extern unsigned int getTrTrainSize();
    extern unsigned int getTrTestSize();
    extern unsigned int getTrValidSize();
    extern unsigned int getTrClassSize(unsigned int x);
    extern unsigned int getTrClassSize(string name);
    extern float getTrClassWeightSize(unsigned int x);
    // partitioning
    extern void assignAllTrain();
    extern void assignGrpTest(int group);
    extern void chooseTestSet(unsigned int seed, int percent);
    extern void chooseValidSet(unsigned int seed, int percent);
    extern void linkup_sets();
    extern int getTrStartIndex();
    extern int getTsStartIndex();
    extern int getVlStartIndex();
    extern void doPartition(unsigned int n, unsigned int seed = 0);
    // print
    extern void printTrainset();
    extern void printTestset();

Defines:
    addClass, used in chunk 210.
    addTrIndividual, used in chunk 212a.
    addTsIndividual, used in chunk 212a.
    assignAllTrain, used in chunks 159 and 161a.
    assignGrpTest, used in chunk 159b.
    chooseTestSet, used in chunk 161a.
    chooseValidSet, used in chunks 159 and 161a.
    cleanup_trainset, used in chunk 8.
    doPartition, used in chunks 159a and 160d.
    getClassCount, used in chunks 145a, 162c, and 203.
    getClassIndex, used in chunks 137a, 142a, 147b, 166a, 196b, 202b, 204, 205, and 212d.
    getClassString, used in chunks 142a, 152a, and 166b.
    getClassWeight, used in chunks 145c, 146a, 150b, 152a, 153b, 163a, 172b, 174b, 175a, and 200.
    getTrClassSize, never used.
    getTrClassWeightSize, used in chunk 162c.
    getTrIndividual, used in chunks 153b, 162, 166a, 184a, 190b, 192, 194a, 196b, 197a, 202–5, and 214a.
    getTrSize, used in chunks 3a, 138, 140a, 159a, 160d, 162a, 173a, 192b, 198b, 202b, 203, 205, and 214a.
    getTrStartIndex, used in chunks 162b, 204, and 205.
    getTrTestSize, used in chunks 166a and 203.
    getTrTrainSize, used in chunks 138a and 203.
    getTrValidSize, used in chunks 194d, 196a, and 203.
    getTsIndividual, used in chunks 166b and 214a.
    getTsSize, used in chunks 166b and 214a.
    getTsStartIndex, used in chunk 166a.
    getVlStartIndex, used in chunks 196b and 197a.
    linkup_sets, used in chunks 159 and 161a.
    printTestset, used in chunk 7b.
    printTrainset, used in chunk 7b.

Uses classes 135a and Individual 132.

Comment 4.0.5. We have the implementation details in the following.

135a 〈trainset.cc 135a〉≡
    #include <vector>
    #include <utility>
    #include <string>
    #include "trainset.h"

    static vector<Individual> trainset;
    static vector<Individual> testset;
    static vector<pair<string, float> > classes;

    〈trainset::body 135b〉

Defines:
    classes, used in chunks 134, 136, 137a, 140b, 210e, and 211a.
    testset, used in chunks 135b, 136a, 140b, 142b, 143, 162a, and 203.
    trainset, used in chunks 2, 7a, 135–44, 149, 162a, 203, and 206.

Uses Individual 132 and trainset.h 133.

Comment 4.0.6. Here are some functions for adding individuals to the training and test sets.

135b 〈trainset::body 135b〉≡ (135a) 136a ⊲

    int addTrIndividual(Individual x)
    {
        if (trainset.size() == trainset.max_size()) return -5;
        trainset.push_back(x); return trainset.size()-1;
    }

    int addTsIndividual(Individual x)
    {
        if (testset.size() == testset.max_size()) return -5;
        testset.push_back(x); return testset.size()-1;
    }

Defines:
    addTrIndividual, used in chunk 212a.
    addTsIndividual, used in chunk 212a.

Uses Individual 132, testset 135a, and trainset 135a.

Comment 4.0.7. There are also functions for retrieving individuals from the training and test sets.

136a 〈trainset::body 135b〉+≡ (135a) ⊳ 135b 136b ⊲

    Individual & getTrIndividual(unsigned int x)
    {
        assert(x < trainset.size());
        return trainset[x];
    }

    Individual & getTsIndividual(unsigned int x)
    {
        assert(x < testset.size());
        if (testset[x].membership == OBSOLETE) {
            cerr << "Test Individual #" << x << " is obsolete.\n";
            assert(false);
        }
        return testset[x];
    }

Defines:
    getTrIndividual, used in chunks 153b, 162, 166a, 184a, 190b, 192, 194a, 196b, 197a, 202–5, and 214a.
    getTsIndividual, used in chunks 166b and 214a.

Uses Individual 132, OBSOLETE 132, testset 135a, and trainset 135a.

Comment 4.0.8. We can add new classes (and their weights) using the function addClass. The inputs are the name of the class (of type string) and the weight of the class (of type float). The return value is the index of the class.

136b 〈trainset::body 135b〉+≡ (135a) ⊳ 136a 136c ⊲

    unsigned int addClass(string classname, float weight)
    {
        pair<string, float> temp(classname, weight);
        classes.push_back(temp);
        return classes.size()-1;
    }

Defines:
    addClass, used in chunk 210.

Uses classes 135a.

Comment 4.0.9. The index of a class can be found out using getClassIndex.

136c 〈trainset::body 135b〉+≡ (135a) ⊳ 136b 137a ⊲

    int getClassIndex(string classname)
    {
        uint size = classes.size();
        for (uint i=0; i!=size; i++)
            if (classes[i].first == classname) return i;
        setSelector(STDERR);
        ioprint("Error: "); ioprint(classname); ioprintln(" unknown.");
        return ALKERROR;
    }

Defines:
    getClassIndex, used in chunks 137a, 142a, 147b, 166a, 196b, 202b, 204, 205, and 212d.

Uses ALKERROR 232, classes 135a, ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, and STDERR 246.
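The class registry described in Comments 4.0.8 and 4.0.9 can be sketched in isolation as follows. This is a minimal stand-in, not the Alkemy code: the struct ClassTable and its member names are hypothetical, and the error value -1 is an assumption standing in for ALKERROR, whose actual value is defined elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, self-contained stand-in for the static `classes` vector.
struct ClassTable {
    std::vector<std::pair<std::string, float> > classes;

    // Append a class and return its index, as addClass does.
    unsigned int add(const std::string& name, float weight) {
        classes.push_back(std::make_pair(name, weight));
        return classes.size() - 1;
    }

    // Linear search by name; -1 is an assumed error value (stand-in for ALKERROR).
    int index_of(const std::string& name) const {
        for (std::size_t i = 0; i != classes.size(); i++)
            if (classes[i].first == name) return (int)i;
        return -1;
    }

    // Look up a weight by name; the assert is an extra guard added in this sketch.
    float weight_of(const std::string& name) const {
        int i = index_of(name);
        assert(i >= 0);
        return classes[i].second;
    }
};
```

Because the lookup is a linear scan, the index returned for a name is simply the order in which that class was added.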

Comment 4.0.10. The names and weights of classes can be retrieved using these functions.

137a 〈trainset::body 135b〉+≡ (135a) ⊳ 136c 137b ⊲

    string getClassString(unsigned int x) { return classes[x].first; }

    float getClassWeight(string name)
        { return classes[getClassIndex(name)].second; }

    float getClassWeight(unsigned int x) { return classes[x].second; }

Defines:
    getClassString, used in chunks 142a, 152a, and 166b.
    getClassWeight, used in chunks 145c, 146a, 150b, 152a, 153b, 163a, 172b, 174b, 175a, and 200.

Uses classes 135a and getClassIndex 134 136c.

Comment 4.0.11. We now look at different ways of partitioning the examples. The function assignAllTrain designates that every example in the training set should be used for training. The function assignGrpTest sets aside examples in a certain group for testing.

137b 〈trainset::body 135b〉+≡ (135a) ⊳ 137a 138a ⊲

    void assignAllTrain()
    {
        for (unsigned int i=0; i!=trainset.size(); i++)
            trainset[i].membership = TRAIN;
    }

    void assignGrpTest(int group)
    {
        for (unsigned int i=0; i!=trainset.size(); i++)
            if (trainset[i].group == group)
                trainset[i].membership = TEST;
    }

Defines:
    assignAllTrain, used in chunks 159 and 161a.
    assignGrpTest, used in chunk 159b.

Uses TEST 132, TRAIN 132, and trainset 135a.

Comment 4.0.12. The function chooseValidSet randomly picks a given percentage of examples and puts them into the validation set.


    // choose a subset from the training set
    // if the count is 0.x, round that up to 1.
    // otherwise if count = x.y, round down to x.
    void chooseValidSet(unsigned int seed, int percent)
    {
        assert(percent > 0);
        srand(seed);
        unsigned int totaltrain = getTrTrainSize();
        if (totaltrain == 0) return;
        unsigned int count = (percent * totaltrain) / 100;
        if (count == 0) count = 1; /* special case */
        assert(count <= totaltrain);
        unsigned int random;
        float total = getTrSize();
        while (count != 0) {
            random = (unsigned int) (total*rand()/(RAND_MAX+1.0));
            if (trainset[random].membership == TRAIN) {
                trainset[random].membership = VALID;
                count--;
            }
        }
    }

Defines:
    chooseValidSet, used in chunks 159 and 161a.

Uses getTrSize 134 140b, getTrTrainSize 134 141, TRAIN 132, trainset 135a, and VALID 132.
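The rounding rule stated in the comment at the top of chooseValidSet can be checked in isolation. The helper below is a hypothetical sketch (the name subset_count is not part of Alkemy); it only reproduces the arithmetic that decides how many examples are moved, not the random selection itself.

```cpp
#include <cassert>

// Number of examples to move out of a pool of `total` examples:
// integer division rounds x.y down to x, except that 0.x is bumped up to 1.
unsigned int subset_count(int percent, unsigned int total) {
    assert(percent > 0);
    if (total == 0) return 0;                 // nothing to choose from
    unsigned int count = (percent * total) / 100;
    if (count == 0) count = 1;                // special case: 0.x rounds up to 1
    return count;
}
```

For instance, 10 percent of 5 examples gives 0.5, which is rounded up to 1, while 25 percent of 10 examples gives 2.5, which is rounded down to 2.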

Comment 4.0.13. Using the same general procedure, the function chooseTestSet randomly picks a given percentage of examples and puts them into the test set.


    void chooseTestSet(unsigned int seed, int percent)
    {
        assert(percent > 0);
        srand(seed);
        unsigned int random;
        int count = (percent * getTrSize()) / 100;
        float total = getTrSize();
        if (count == 0) count = 1;
        while (count != 0) {
            random = (unsigned int) (total*rand()/(RAND_MAX+1.0));
            if (trainset[random].membership == TRAIN) {
                trainset[random].membership = TEST;
                count--;
            }
        }
    }

Defines:
    chooseTestSet, used in chunk 161a.

Uses getTrSize 134 140b, TEST 132, TRAIN 132, and trainset 135a.


Comment 4.0.14. This function initialises the pointers to the different subsets in trainset.


    static int tr_start = -5, ts_start = -5, vl_start = -5;

    void linkup_sets()
    {
        tr_start = -5; ts_start = -5; vl_start = -5;
        int tr_last = -5, ts_last = -5, vl_last = -5;
        for (unsigned int i=0; i!=trainset.size(); i++) {
            switch (trainset[i].membership) {
            case TRAIN:
                if (tr_start == -5) { tr_start = i; tr_last = i; }
                else { trainset[tr_last].ptr_tr = i; tr_last = i; }
                break;
            case TEST:
                if (ts_start == -5) { ts_start = i; ts_last = i; }
                else { trainset[ts_last].ptr_ts = i; ts_last = i; }
                break;
            case VALID:
                if (vl_start == -5) { vl_start = i; vl_last = i; }
                else { trainset[vl_last].ptr_vl = i; vl_last = i; }
                break;
            case UNDEFINED: assert(false); exit(1);
            case OBSOLETE: break;
            }
        }
        if (tr_last != -5) trainset[tr_last].ptr_tr = -5;
        if (ts_last != -5) trainset[ts_last].ptr_ts = -5;
        if (vl_last != -5) trainset[vl_last].ptr_vl = -5;
    }

Defines:
    linkup_sets, used in chunks 159 and 161a.
    tr_start, used in chunk 139b.
    ts_start, used in chunk 139b.
    vl_start, used in chunk 139b.

Uses OBSOLETE 132, TEST 132, TRAIN 132, trainset 135a, UNDEFINED 132, and VALID 132.


    int getTrStartIndex() { return tr_start; }
    int getTsStartIndex() { return ts_start; }
    int getVlStartIndex() { return vl_start; }

Defines:
    getTrStartIndex, used in chunks 162b, 204, and 205.
    getTsStartIndex, used in chunk 166a.
    getVlStartIndex, used in chunks 196b and 197a.

Uses tr_start 139a, ts_start 139a, and vl_start 139a.
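The way the start indices and the ptr attributes combine into a linked list threaded through the trainset vector can be illustrated with a small self-contained sketch. The names Item, Membership, link_train and collect_train are hypothetical and only the training chain is shown; the sentinel value -5 and the chaining scheme follow linkup_sets.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for Individual: only the fields needed for chaining.
enum Membership { TRAIN, TEST, VALID };
struct Item { Membership membership; int ptr_tr; };

const int NONE = -5;  // sentinel used by linkup_sets for "no next item"

// Chain all TRAIN items together; returns the index of the first one (or NONE).
int link_train(std::vector<Item>& items) {
    int start = NONE, last = NONE;
    for (int i = 0; i != (int)items.size(); i++) {
        if (items[i].membership != TRAIN) continue;
        if (start == NONE) start = i; else items[last].ptr_tr = i;
        last = i;
    }
    if (last != NONE) items[last].ptr_tr = NONE;  // terminate the chain
    return start;
}

// Follow the chain from `start` and collect the indices visited.
std::vector<int> collect_train(const std::vector<Item>& items, int start) {
    std::vector<int> visited;
    for (int i = start; i != NONE; i = items[i].ptr_tr) {
        assert(i >= 0 && i < (int)items.size());
        visited.push_back(i);
    }
    return visited;
}
```

A caller walks a subset by starting at the index returned for that subset and following the matching ptr attribute until the sentinel is reached, so items belonging to other subsets are skipped without being inspected.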

Comment 4.0.15. This function partitions the data set into n disjoint and approximately equal-sized groups, which can then be used for cross-validation experiments. The second parameter seed is used to initialise the random number generator. Its default value is 0.


    #include <ctype.h>
    static vector<unsigned int> filled;

    void doPartition(unsigned int n, unsigned int seed)
    {
        filled.clear();
        filled.reserve(n);
        unsigned int xyz = getTrSize();
        if (xyz < n) {
            cout << "\nWARNING: |Trainset| < " << n << ". Do you want to "
                    "proceed with the partitioning? (Y|N) ";
            char in; cin >> in;
            if (toupper(in) == 'N') exit(1);
        }
        unsigned int i, j;
        for (j=0; j!=n; j++) {
            filled.push_back(xyz / (n-j));
            xyz = xyz - filled[j];
        }
        unsigned int random; bool done; float fn = n;
        srand(seed);
        for (i=0; i!=getTrSize(); i++) {
            done = false;
            while (!done) {
                random = (unsigned int)(fn*rand()/(RAND_MAX+1.0));
                if (filled[random] == 0) continue;
                else filled[random]--;
                trainset[i].group = random;
                done = true;
            }
        }
    }

Defines:
    doPartition, used in chunks 159a and 160d.
    filled, never used.

Uses clear 145b, getTrSize 134 140b, and trainset 135a.
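The first loop of doPartition fixes the capacity of each group before any individual is assigned. A minimal sketch of just that step is given below; the function name group_sizes is hypothetical, but the arithmetic (remaining items divided by remaining groups) follows the loop that fills the filled vector.

```cpp
#include <cassert>
#include <vector>

// Sizes of n groups for `total` items: each group takes the items still
// unallocated divided by the number of groups still to be sized.
std::vector<unsigned int> group_sizes(unsigned int total, unsigned int n) {
    assert(n > 0);
    std::vector<unsigned int> sizes;
    sizes.reserve(n);
    for (unsigned int j = 0; j != n; j++) {
        sizes.push_back(total / (n - j));
        total -= sizes[j];
    }
    return sizes;
}
```

With 10 items and 3 groups this yields sizes 3, 3 and 4, so the sizes always add up to the total and differ by at most one. When there are fewer items than groups, the leading groups get size 0, which is the situation the warning in doPartition asks the user about.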

Comment 4.0.16. The function getTrSize() returns the size of the training set; the function getTsSize() returns the size of the test set; getClassCount() returns the number of classes.

140b 〈trainset::body 135b〉+≡ (135a) ⊳ 140a 141 ⊲

    unsigned int getTrSize() { return trainset.size(); }
    unsigned int getTsSize() { return testset.size(); }
    unsigned int getClassCount() { return classes.size(); }

Defines:
    getClassCount, used in chunks 145a, 162c, and 203.
    getTrSize, used in chunks 3a, 138, 140a, 159a, 160d, 162a, 173a, 192b, 198b, 202b, 203, 205, and 214a.
    getTsSize, used in chunks 166b and 214a.

Uses classes 135a, testset 135a, and trainset 135a.


141 〈trainset::body 135b〉+≡ (135a) ⊳ 140b 142a ⊲

    unsigned int getTrTrainSize()
    {
        unsigned int ret = 0;
        for (unsigned int i=0; i!=trainset.size(); i++)
            if (trainset[i].membership == TRAIN) ret++;
        return ret;
    }

    unsigned int getTrTestSize()
    {
        unsigned int ret = 0;
        for (unsigned int i=0; i!=trainset.size(); i++)
            if (trainset[i].membership == TEST) ret++;
        return ret;
    }

    unsigned int getTrValidSize()
    {
        unsigned int ret = 0;
        for (unsigned int i=0; i!=trainset.size(); i++)
            if (trainset[i].membership == VALID) ret++;
        return ret;
    }

Defines:
    getTrTestSize, used in chunks 166a and 203.
    getTrTrainSize, used in chunks 138a and 203.
    getTrValidSize, used in chunks 194d, 196a, and 203.

Uses TEST 132, TRAIN 132, trainset 135a, and VALID 132.

Comment 4.0.17. The function getTrClassSize() returns the number of individuals of a particular class in the training set.

142a 〈trainset::body 135b〉+≡ (135a) ⊳ 141 142b ⊲

    unsigned int getTrClassSize(unsigned int x)
    {
        assert(options.learning_mode == CLASSIFICATION);
        unsigned int i, j = trainset.size(), ret = 0;
        string cname = getClassString(x);
        for (i=0; i!=j; i++)
            if (trainset[i].label.clabel == cname &&
                trainset[i].membership == TRAIN)
                ret++;
        return ret;
    }

    unsigned int getTrClassSize(string name)
    {
        return getTrClassSize(getClassIndex(name));
    }

    float getTrClassWeightSize(unsigned int x)
    {
        assert(options.learning_mode == CLASSIFICATION);
        unsigned int i, j = trainset.size(); float ret = 0;
        string cname = getClassString(x);
        for (i=0; i!=j; i++)
            if (trainset[i].label.clabel == cname &&
                trainset[i].membership == TRAIN)
                ret += trainset[i].weight;
        return ret;
    }

Defines:
    getTrClassSize, never used.
    getTrClassWeightSize, used in chunk 162c.

Uses CLASSIFICATION 234, getClassIndex 134 136c, getClassString 134 137a, label 21, options 234 235, TRAIN 132, and trainset 135a.

142b 〈trainset::body 135b〉+≡ (135a) ⊳ 142a 143 ⊲

    void printTrainset()
    {
        for (unsigned int i=0; i!=trainset.size(); i++) trainset[i].print();
    }

    void printTestset()
    {
        for (unsigned int i=0; i!=testset.size(); i++) testset[i].print();
    }

Defines:
    printTestset, used in chunk 7b.
    printTrainset, used in chunk 7b.

Uses testset 135a and trainset 135a.

Comment 4.0.18. The function cleanup_trainset() frees the memory occupied by the individuals.

143 〈trainset::body 135b〉+≡ (135a) ⊳ 142b

    void cleanup_trainset()
    {
        cerr << "Cleaning up the training examples......";
        int icount = trainset.size();
        int ticount = testset.size();
        for (int i=0; i!=icount; i++) trainset[i].freememory();
        for (int i=0; i!=ticount; i++) testset[i].freememory();
        cerr << "Done.\n";
    }

Defines:
    cleanup_trainset, used in chunk 8.

Uses freememory 19a 19c, testset 135a, and trainset 135a.

Chapter 5

A Decision-Tree Learner

We need to make a decision, no matter what it is.
Suzanne Botts

5.1 Introduction

Comment 5.1.1. Having discussed our approach to knowledge representation, we now present a decision-tree learning system based on these ideas. Alkemic trees are very much like standard binary decision trees, with the exception that they accept as input basic terms and use standard predicates as splitting functions. Two closely related ILP systems, S-CART and Tilde, are described in [Kra96], [KW01], [BD98] and [Blo98].

We present two algorithms for learning decision trees. The first is a variant of the standard top-down induction of decision trees (TDIDT) algorithm. The second algorithm implemented in the system is a variant of the covering algorithm of [Riv87] for learning decision lists.

5.2 Data Structures

5.2.1 struct distribution

Comment 5.2.1. The structure distribution is used to record the individuals allocated to a particular node of the decision tree; it is part of a tree node structure (see below). It has the following members:

• n : The number of individuals in each class falling in this node; used in classification problems.

• labels : The regression values falling in this node; used in regression problems.

• labelsum : The sum of all the values in labels.

144 〈tree-dstructs::distribution 144〉≡ (155a)

    #include <set>
    #define uint unsigned int

    struct distribution {
        vector<double> n;
        multiset<double> labels; double labelsum;
        distribution();
        void operator=(const distribution &other);
        〈distribution functions 145b〉
        〈distribution function declarations 148〉
    };

Defines:
    distribution, used in chunks 145–47, 149, 150b, 154c, 172b, 175a, 181a, 189a, and 191b.

Uses trainset 135a and trainset.h 133.

145a 〈tree-dstructs.cc 145a〉≡ 145c ⊲

    #include "tree-dstructs.h"

    distribution::distribution()
    {
        for (uint i=0; i!=getClassCount(); i++) n.push_back(0);
        labelsum = 0;
    }

    void distribution::operator=(const distribution & other)
    {
        n.clear();
        for (uint i=0; i!=other.n.size(); i++) n.push_back(other.n[i]);
        labels = other.labels;
        labelsum = other.labelsum;
    }

Uses clear 145b, distribution 144, getClassCount 134 140b, and tree-dstructs.h 155a.

Comment 5.2.2. This clears the members.

145b 〈distribution functions 145b〉≡ (144)

    void clear()
    {
        for (unsigned int i=0; i!=n.size(); i++) n[i] = 0;
        labels.clear(); labelsum = 0;
    }

Defines:
    clear, used in chunks 20b, 25d, 60, 62b, 67e, 92b, 93c, 95b, 96a, 103d, 107a, 110a, 113b, 117b, 119b, 127a, 140a, 145a, 150, 184a, 191b, 205, 208a, 214b, and 238b.

Comment 5.2.3. Classes can have different weights. The function sumInt computes the (unweighted) sum of the n array, whereas the function sum computes the weighted sum of the n array.

145c 〈tree-dstructs.cc 145a〉+≡ ⊳ 145a 146a ⊲

    double distribution::sumInt()
    {
        double result = 0;
        for (unsigned int i=0; i!=n.size(); i++) result += n[i];
        return result;
    }

    double distribution::sum()
    {
        double result = 0.0;
        for (unsigned int i=0; i!=n.size(); i++)
            result = result + getClassWeight(i) * n[i];
        return result;
    }

Defines:
    sum, used in chunks 152a, 171b, 172a, 178a, 182c, and 186.
    sumInt, used in chunks 147c, 167b, 172a, 174b, 177–79, 196b, and 198.

Uses distribution 144 and getClassWeight 134 137a.
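The difference between the two sums is easiest to see on a small example. The sketch below is self-contained and uses hypothetical free functions (unweighted_sum, weighted_sum) that take the class weights as an explicit vector instead of calling getClassWeight.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain sum of the per-class counts, as in sumInt.
double unweighted_sum(const std::vector<double>& n) {
    double result = 0;
    for (std::size_t i = 0; i != n.size(); i++) result += n[i];
    return result;
}

// Sum of the per-class counts scaled by the class weights, as in sum.
double weighted_sum(const std::vector<double>& n,
                    const std::vector<double>& weight) {
    assert(n.size() == weight.size());
    double result = 0;
    for (std::size_t i = 0; i != n.size(); i++) result += weight[i] * n[i];
    return result;
}
```

With counts (3, 1) and class weights (1, 2), the unweighted sum is 3 + 1 = 4 and the weighted sum is 1·3 + 2·1 = 5.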

Comment 5.2.4. The function majorclass computes the majority class, i.e., the class with the highest weighted individual count in a node. The function minorclass, in turn, computes the minority class. The weights of the classes are taken into account in these calculations.


146a 〈tree-dstructs.cc 145a〉+≡ ⊳ 145c 146b ⊲

    int distribution::majorclass()
    {
        int result = 0;
        double max = getClassWeight(0) * n[0];
        for (unsigned int i=1; i!=n.size(); i++)
            if (getClassWeight(i) * n[i] > max)
                { result = i; max = getClassWeight(i) * n[i]; }
        return result;
    }

    int distribution::minorclass()
    {
        int result = 0;
        double min = getClassWeight(0) * n[0];
        for (unsigned int i=1; i!=n.size(); i++)
            if (getClassWeight(i) * n[i] < min)
                { result = i; min = getClassWeight(i) * n[i]; }
        return result;
    }

Defines:
    majorclass, used in chunks 147c, 150b, 162c, 174b, 182b, and 184a.
    minorclass, used in chunk 182c.
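To see how class weights can change the majority class, consider the following self-contained sketch. The function weighted_major is a hypothetical stand-in for majorclass that receives the weights as a vector rather than through getClassWeight.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the class with the largest weight[i] * n[i].
// The strict comparison means the earliest class wins ties.
int weighted_major(const std::vector<double>& n,
                   const std::vector<double>& weight) {
    assert(!n.empty() && n.size() == weight.size());
    int result = 0;
    double max = weight[0] * n[0];
    for (std::size_t i = 1; i != n.size(); i++)
        if (weight[i] * n[i] > max) {
            result = (int)i;
            max = weight[i] * n[i];
        }
    return result;
}
```

With counts (3, 2), unit weights select class 0, whereas weights (1, 2) give scores 3 and 4 and select class 1.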


146b 〈tree-dstructs.cc 145a〉+≡ ⊳ 146a 146c ⊲

    void distribution::print()
    {
        if (CLASSIFICATION_MODE) {
            cout << "("; int ccount = n.size();
            for (int i=0; i!=ccount; i++) {
                cout << n[i];
                if (i == ccount-1) cout << ") "; else cout << ",";
            }
        }
        else {
            cout << "(" << labels.size() << ") ";
            cout << "("; multiset<double>::iterator p = labels.begin();
            while (p != labels.end()) { cout << *p << ","; p++; }
            cout << ") ";
        }
    }

Uses CLASSIFICATION_MODE 234 and distribution 144.

Comment 5.2.5. The following two functions are related to regression learning. The function computeAverage computes the average of the regression values in the node.

146c 〈tree-dstructs.cc 145a〉+≡ ⊳ 146b 147a ⊲

    double distribution::computeAverage()
    {
        assert(REGRESSION_MODE);
        if (labels.size() == 0) return 0.0;
        double ret = 0; double rsize = labels.size();
        multiset<double>::iterator p = labels.begin();
        while (p != labels.end()) { ret += *p; p++; }
        return (ret / rsize);
    }

Defines:
    computeAverage, used in chunks 150b, 152a, 162c, 163a, 174b, 198b, and 200.

Uses distribution 144 and REGRESSION_MODE 234.


Comment 5.2.6. The function computeSqError computes the squared error of a set of regression values given the empirical mean of the set.


    double distribution::computeSqError(double average)
    {
        assert(REGRESSION_MODE);
        double ret = 0;
        if (labels.size() <= 1) return ret;
        multiset<double>::iterator p = labels.begin();
        while (p != labels.end())
            { ret += (*p-average) * (*p-average); p++; }
        return ret;
    }

Defines:
    computeSqError, used in chunks 150b, 152a, 153b, 163a, 174b, 198b, and 200.

Uses distribution 144 and REGRESSION_MODE 234.
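The two regression statistics can be worked through on a tiny example. The sketch below is self-contained; the free functions mean and squared_error are hypothetical counterparts of computeAverage and computeSqError without the mode assertions, and the NaN guard is an addition of this sketch.

```cpp
#include <cassert>
#include <set>

// Arithmetic mean of the labels; 0.0 for an empty set, as in computeAverage.
double mean(const std::multiset<double>& labels) {
    if (labels.empty()) return 0.0;
    double total = 0;
    for (std::multiset<double>::const_iterator p = labels.begin();
         p != labels.end(); ++p)
        total += *p;
    return total / labels.size();
}

// Sum of squared deviations from `average`; 0 when there is at most one label.
double squared_error(const std::multiset<double>& labels, double average) {
    assert(average == average);  // guard added here: reject a NaN mean
    double ret = 0;
    if (labels.size() <= 1) return ret;
    for (std::multiset<double>::const_iterator p = labels.begin();
         p != labels.end(); ++p)
        ret += (*p - average) * (*p - average);
    return ret;
}
```

For the labels 1, 2 and 3, the mean is 2 and the squared error is (1-2)² + (2-2)² + (3-2)² = 2. Note that this is a sum, not a mean, of squared deviations.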

Comment 5.2.7. The next function can be used to update a node with a new individual.


    void distribution::update_with(Individual & ind)
    {
        if (CLASSIFICATION_MODE)
            n[getClassIndex(ind.label.clabel)] += ind.weight;
        else { labels.insert(ind.label.rg); labelsum += ind.label.rg; }
    }

Defines:
    update_with, used in chunks 153b, 162c, 184a, 192, 194a, and 197a.

Uses CLASSIFICATION_MODE 234, distribution 144, getClassIndex 134 136c, Individual 132, insert 11d, and label 21.

Comment 5.2.8. The next function allows us to check the purity of a node. Complete purity can be checked by setting the input argument x to 0.


    bool distribution::isXPure(int x)
    {
        assert(CLASSIFICATION_MODE && DLIST_MODE);
        if (sumInt() - n[majorclass()] <= x) return true;
        return false;
    }

Defines:
    isXPure, used in chunk 182b.

Uses CLASSIFICATION_MODE 234, distribution 144, DLIST_MODE 234, majorclass 146a 148, and sumInt 145c 148.
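The purity test amounts to asking whether at most x individuals fall outside the majority class. A minimal self-contained sketch follows; the function is_x_pure is hypothetical and assumes unit class weights, so that the majority class is simply the class with the largest count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True when the individuals outside the largest class number at most x.
// Assumption of this sketch: all class weights are 1.
bool is_x_pure(const std::vector<double>& n, int x) {
    assert(!n.empty());
    double total = 0, largest = n[0];
    for (std::size_t i = 0; i != n.size(); i++) {
        total += n[i];
        if (n[i] > largest) largest = n[i];
    }
    return total - largest <= x;
}
```

A node with counts (5, 1, 1) has two individuals outside the majority class, so it is 2-pure but not 1-pure; a node with counts (4, 0) is completely pure.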


Comment 5.2.9. Here is a summary of the functions we have described.

148 〈distribution function declarations 148〉≡ (144)

    double sumInt();
    double sum();
    int majorclass();
    int minorclass();
    void print();
    double computeAverage();
    double computeSqError(double average);
    void update_with(Individual & ind);
    bool isXPure(int x);

Defines:
    computeAverage, used in chunks 150b, 152a, 162c, 163a, 174b, 198b, and 200.
    computeSqError, used in chunks 150b, 152a, 153b, 163a, 174b, 198b, and 200.
    isXPure, used in chunk 182b.
    majorclass, used in chunks 147c, 150b, 162c, 174b, 182b, and 184a.
    minorclass, used in chunk 182c.
    sum, used in chunks 152a, 171b, 172a, 178a, 182c, and 186.
    sumInt, used in chunks 147c, 167b, 172a, 174b, 177–79, 196b, and 198.
    update_with, used in chunks 153b, 162c, 184a, 192, 194a, and 197a.

Uses Individual 132.


5.2.2 struct treenode_t

Comment 5.2.10. struct treenode_t stores the information at a particular node in the decision tree, with links to its left and right subtrees. All nodes are accessible from the root node. struct treenode_t has the following members:

• bestpred : The predicate that splits this node.

• indcount : The number of individuals in this node.

• individuals : The index numbers of the individuals falling in this node. Useful for extraction from the training set.

• part : The distribution of the example labels in this node.

• majorityclass : The majority class of the individuals allocated to this node.

• ltree : The pointer to the left subtree, if one exists.

• rtree : The pointer to the right subtree, if one exists.

• Ap : The accuracy, i.e., the (weighted) number of examples in this node that are classified correctly.

• entropy : The entropy of examples falling in this node.

• largestPureSet : This is related to decision-list learning.

• average : The average of the regression values falling in the node.

• sq_error : The squared error of the regression values falling in the node.

• pruned : Is the subtree rooted at the current node pruned?

• alpha : The value-cost tradeoff of the subtree rooted at this node.

• rigid : Can this node be changed?

149 〈tree-dstructs::treenode 149〉≡ (155a)

    #include <list>
    #include "trainset.h"

    union indlabel { int mclass; double reg; };

    struct treenode_t {
        std_predicate * bestpred;
        int indcount;
        vector<int> individuals;
        treenode_t * ltree, * rtree;
        distribution part;
        int majorityclass;
        double Ap;
        double entropy;
        double largestPureSet;
        double average;
        double sq_error;
        bool pruned;
        double alpha;
        bool rigid;
        treenode_t();
        〈struct treenode function declarations 154c〉
        〈struct treenode functions 151c〉
    };

Defines:
    bestpred, used in chunks 150–54, 162d, 177–80, and 184a.
    indcount, used in chunks 150, 151a, 153b, 162, 166a, 174b, 177, 178a, 184a, 192, 194a, 196b, and 197a.
    treenode_t, used in chunks 150–54, 157, 161c, 171, 184–86, 189a, 191b, 195, 198–201, and 203.

Uses distribution 144, rewrite.h 131b, std_predicate 101b, trainset 135a, and trainset.h 133.



    treenode_t::treenode_t()
    {
        alpha = 1000.0; bestpred = NULL; individuals.clear();
        Ap = 0.0; entropy = 1.0;
        largestPureSet = 0;
        indcount = 0; ltree = NULL; rtree = NULL;
        average = 0.0; sq_error = 0.0;
        pruned = false;
        rigid = false;
    }

Uses bestpred 149, clear 145b, indcount 149, and treenode_t 149.

Comment 5.2.11. This function is used in the Extend Current Tree code chunk. It takes in examples falling in a leaf node (recorded in a distribution structure) and initialises the appropriate fields.


    extern double Entropy(vector<double> x);

    void treenode_t::initialise_fields(distribution E)
    {
        indcount = individuals.size();
        bestpred = new std_predicate;
        bestpred->transformations.push_back(new transformation_t);
        part = E;
        if (CLASSIFICATION_MODE) {
            majorityclass = E.majorclass();
            Ap = getClassWeight(E.majorclass()) * E.n[E.majorclass()];
            entropy = Entropy(E.n);
        }
        else if (REGRESSION_MODE) {
            average = E.computeAverage();
            sq_error = E.computeSqError(average);
        }
    }

Defines:
    initialise_fields, used in chunk 184a.

Uses bestpred 149, CLASSIFICATION_MODE 234, computeAverage 146c 148, computeSqError 147a 148, distribution 144, Entropy 179b, getClassWeight 134 137a, indcount 149, majorclass 146a 148, REGRESSION_MODE 234, std_predicate 101b, transformation_t 101a, and treenode_t 149.

Comment 5.2.12. This function empties the various fields in the tree, retaining only the best predicates and the labels at the leaf nodes. It is used to clear a tree before it is used to evaluate a fresh dataset.


    void treenode_t::clear_fields()
    {
        indcount = 0; individuals.clear();
        part.clear();
        Ap = 0; entropy = 0; largestPureSet = 0;
        sq_error = 0; average = 0;
        alpha = 1000;
        if (ltree) ltree->clear_fields();
        if (rtree) rtree->clear_fields();
    }

Defines:
    clear_fields, used in chunks 166a, 196b, and 197a.

Uses clear 145b, indcount 149, and treenode_t 149.


Comment 5.2.13. This function returns a deep clone of the tree rooted at the current node. This function needs to be changed if additional attributes are added to the treenode_t structure.


    treenode_t * treenode_t::clone()
    {
        treenode_t * ret = new treenode_t;
        if (bestpred) ret->bestpred = bestpred->clone();
        else ret->bestpred = NULL;
        ret->indcount = indcount; ret->individuals = individuals;
        ret->part = part;
        ret->majorityclass = majorityclass;
        ret->Ap = Ap; ret->entropy = entropy;
        ret->largestPureSet = largestPureSet;
        ret->average = average; ret->sq_error = sq_error;
        ret->pruned = pruned; ret->alpha = alpha;
        ret->rigid = rigid;
        if (ltree) ret->ltree = ltree->clone(); else ret->ltree = NULL;
        if (rtree) ret->rtree = rtree->clone(); else ret->rtree = NULL;
        return ret;
    }

Uses bestpred 149, clone 19a 19b, indcount 149, and treenode_t 149.

Comment 5.2.14. This function frees the memory allocated for the decision tree. It should only be called from the root node.

151b 〈tree-dstructs.cc 145a〉+≡ ⊳ 151a 152a ⊲

    void treenode_t::freememory()
    {
        if (bestpred) bestpred->freememory();
        if (ltree) ltree->freememory();
        if (rtree) rtree->freememory();
        delete this;
    }

Uses bestpred 149, freememory 19a 19c, and treenode_t 149.

Comment 5.2.15. This function checks whether a node is a terminal node.

151c 〈struct treenode functions 151c〉≡ (149) 154b ⊲

    bool isterminal()
        { return (pruned || (ltree == NULL && rtree == NULL)); }

Defines:
    isterminal, used in chunks 152–54, 194d, and 198–200.


Comment 5.2.16. Printing of the decision tree is a fairly straightforward exercise.

152a 〈tree-dstructs.cc 145a〉+≡ ⊳ 151b 152b ⊲

    void treenode_t::print(int layer)
    {
        int label;
        if (isterminal()) {
            if (DLIST_MODE) layer = 1;
            for (int i=0; i!=layer; i++) cout << "\t";
            if (CLASSIFICATION_MODE) {
                label = majorityclass;
                cout << getClassString(label) << "\t";
                cout << getClassWeight(label)*part.n[label];
                cout << "/" << part.sum();
                cout << "\t"; part.print(); cout << endl;
            }
            else { // regression
                average = part.computeAverage();
                sq_error = part.computeSqError(average);
                cout << average << "\t(err=" << sq_error << ")\t";
                part.print(); cout << endl;
            }
            return;
        }
        if (DLIST_MODE) layer = 0;
        setSelector(STDOUT);
        for (int i=0; i!=layer; i++) cout << "\t";
        cout << "IF "; bestpred->print(); cout << " x THEN" << endl;
        ltree->print(layer+1);

        for (int i=0; i!=layer; i++) cout << "\t";
        if (DLIST_MODE) cout << "ELSE "; else cout << "ELSE\n";
        rtree->print(layer+1);
    }

Uses bestpred 149, CLASSIFICATION_MODE 234, computeAverage 146c 148, computeSqError 147a 148, DLIST_MODE 234, getClassString 134 137a, getClassWeight 134 137a, isterminal 151c, label 21, setSelector 246 247a, STDOUT 246, sum 145c 148, and treenode_t 149.

Comment 5.2.17. The function calcAP calculates the number of examples classified correctly by the calling tree.


    double treenode_t::calcAP()
    {
        assert(CLASSIFICATION_MODE);
        if (isterminal()) return Ap;
        else return ltree->calcAP() + rtree->calcAP();
    }

Defines:
    calcAP, used in chunks 167b, 171b, 186, and 196b.

Uses CLASSIFICATION_MODE 234, isterminal 151c, and treenode_t 149.
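The recursion sums the Ap values stored at the terminal nodes, and because isterminal also returns true for pruned nodes, a pruned subtree contributes only the value stored at its root. The following self-contained sketch illustrates this with a hypothetical, stripped-down node type (ApNode) and function (calc_ap); it is not the treenode_t structure itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reduced node: just the fields the recursion needs.
struct ApNode {
    bool pruned;
    ApNode* ltree;
    ApNode* rtree;
    double Ap;
    bool isterminal() const {
        return pruned || (ltree == NULL && rtree == NULL);
    }
};

// Sum Ap over the terminal nodes of the tree rooted at `node`.
double calc_ap(const ApNode* node) {
    assert(node != NULL);
    if (node->isterminal()) return node->Ap;
    return calc_ap(node->ltree) + calc_ap(node->rtree);
}
```

For a root with two leaves holding Ap values 3 and 2, the result is 5; once the root is marked as pruned, the result is whatever Ap value the root itself stores.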


Comment 5.2.18. The function calcError calculates the squared error of the calling tree on its set of examples.


    double treenode_t::calcError()
    {
        assert(REGRESSION_MODE);
        if (isterminal()) return sq_error;
        else return ltree->calcError() + rtree->calcError();
    }

Defines:
    calcError, used in chunks 165b, 197a, and 198b.

Uses isterminal 151c, REGRESSION_MODE 234, and treenode_t 149.

Comment 5.2.19. We have functions for evaluating both a set of individuals (evaluate) and one single individual (evaluate1). To use the former, we need to initialise the individuals vector in the root node of the tree. A call to evaluate will then distribute the examples down the leaf nodes of the tree.


#include "pred-evaluation.h"

void treenode_t::evaluate() {
    if (isterminal()) {
        if (CLASSIFICATION_MODE)
            Ap = getClassWeight(majorityclass)*part.n[majorityclass];
        else sq_error = part.computeSqError(average);
        return;
    }
    term_schema * pred = bestpred->makeTerm();
    for (int i=0; i!=indcount; i++) {
        Individual & ind = getTrIndividual(individuals[i]);
        bool result = eval(pred, ind.individual);
        if (result) {
            ltree->part.update_with(ind);
            ltree->individuals.push_back(individuals[i]);
        } else {
            rtree->part.update_with(ind);
            rtree->individuals.push_back(individuals[i]);
        }
    }
    pred->freememory();
    ltree->indcount = ltree->individuals.size(); ltree->evaluate();
    rtree->indcount = rtree->individuals.size(); rtree->evaluate();
}

Defines:evaluate, used in chunks 166a, 196b, and 197a.

Uses bestpred 149, CLASSIFICATION_MODE 234, computeSqError 147a 148, eval 189a 190a, freememory 19a 19c,getClassWeight 134 137a, getTrIndividual 134 136a, indcount 149, Individual 132, isterminal 151c,makeTerm 103b 105b, pred-evaluation.h 189a, term_schema 9, treenode_t 149, and update_with 147b 148.


Comment 5.2.20. The function evaluate1 pushes the example down the tree and returns the label of the leaf node it ends up in.


indlabel treenode_t::evaluate1(Individual & ind) {
    if (isterminal()) {
        indlabel ret;
        if (CLASSIFICATION_MODE) ret.mclass = majorityclass;
        else ret.reg = average;
        return ret;
    }
    term_schema * pred = bestpred->makeTerm();
    bool result = eval(pred, ind.individual); pred->freememory();
    if (result) return ltree->evaluate1(ind);
    else return rtree->evaluate1(ind);
}

Defines:evaluate1, used in chunks 166b, 201c, and 204a.

Uses bestpred 149, CLASSIFICATION_MODE 234, eval 189a 190a, freememory 19a 19c, Individual 132,isterminal 151c, makeTerm 103b 105b, term_schema 9, and treenode_t 149.
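The descent performed by evaluate1 can be illustrated in isolation. The following is a minimal sketch and not Alkemy's code: the names Node, classify and make_stump are hypothetical, and a std::function over integers stands in for the evaluated predicate bestpred.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for treenode_t: a leaf carries a label, an
// internal node carries a test and two subtrees.
struct Node {
    std::function<bool(int)> test;   // plays the role of bestpred
    int label = -1;                  // plays the role of majorityclass
    std::unique_ptr<Node> left, right;
    bool isterminal() const { return !left && !right; }
};

// Push one example down the tree and return the label of the leaf reached.
int classify(const Node& node, int x) {
    if (node.isterminal()) return node.label;
    return node.test(x) ? classify(*node.left, x) : classify(*node.right, x);
}

// Build a one-split tree: "IF x < 10 THEN 0 ELSE 1".
std::unique_ptr<Node> make_stump() {
    auto root = std::make_unique<Node>();
    root->test = [](int x) { return x < 10; };
    root->left = std::make_unique<Node>();
    root->left->label = 0;
    root->right = std::make_unique<Node>();
    root->right->label = 1;
    return root;
}
```

As in evaluate1, the true branch leads to the left subtree and the false branch to the right subtree.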

Comment 5.2.21. This function resets the unpruned fields in the calling tree.

154b 〈struct treenode functions 151c〉+≡ (149) ⊳ 151c 198a ⊲

void unpruned() {
    pruned = false;
    if (isterminal()) return;
    ltree->unpruned(); rtree->unpruned();
}

Defines:unpruned, used in chunk 194c.

Uses isterminal 151c.

Comment 5.2.22. Here is a quick summary of the functions we have introduced for treenode_t.

154c 〈struct treenode function declarations 154c〉≡ (149)

void initialise_fields(distribution E);
void clear_fields();
treenode_t * clone();
void freememory();
void print(int layer);
double calcAP();
double calcError();
void evaluate();
indlabel evaluate1(Individual & ind);

Defines:calcAP, used in chunks 167b, 171b, 186, and 196b.calcError, used in chunks 165b, 197a, and 198b.clear_fields, used in chunks 166a, 196b, and 197a.evaluate, used in chunks 166a, 196b, and 197a.evaluate1, used in chunks 166b, 201c, and 204a.initialise_fields, used in chunk 184a.

Uses clone 19a 19b, distribution 144, freememory 19a 19c, Individual 132, and treenode_t 149.


155a 〈tree-dstructs.h 155a〉≡

#ifndef _TREE_DSTRUCTS_H_
#define _TREE_DSTRUCTS_H_

〈tree-dstructs::distribution 144〉
〈tree-dstructs::treenode 149〉

#endif

Defines: tree-dstructs.h, used in chunks 145a, 157, and 189b.

5.2.3 struct olnode

Comment 5.2.23. The structure olnode stores the information of the individual nodes on the open list. It has the following members:

• predicate : The predicate waiting to be rewritten.

• Ap : The accuracy of the predicate on the training set.

• Bp : The refinement bound of the predicate on the training set.

• max_cover : The highest pure coverage that can be achieved; used in decision list learning.

• error : The squared error of the predicate on the training set.

• min_error : The minimum error that can be achieved by any descendant of predicate.

• age : The age of this node; used to enforce a FIFO behaviour on nodes with equal Bp values.

• freememory() : Frees the memory allocated to this node.

155b 〈alkemy::data structures 155b〉≡ (157) 160a ⊲

struct olnode {
    std_predicate * predicate;
    double Ap, Bp;
    double max_cover;
    double error, min_error;
    unsigned int age;
    olnode() { predicate = NULL; Ap = 0; Bp = 0; max_cover = 0;
               error = 0; min_error = 0; }
    bool operator==(olnode const &other) const;
    bool operator<(olnode const &other) const;
    void freememory() { if (predicate) predicate->freememory(); }
};

Defines:olnode, used in chunks 156, 157, 173, and 181–83.



Comment 5.2.24. We need to define a total order on olnode because this is needed by the standard C++ containers to do operations like sorting.

156a 〈alkemy::data structures::functions 156a〉≡ (158a) 156b ⊲

bool olnode::operator==(olnode const &other) const {
    if (CLASSIFICATION_MODE && DLIST_MODE) {
        return (age == other.age && feq(max_cover,other.max_cover));
    } else if (CLASSIFICATION_MODE) {
        return (feq(Ap,other.Ap) && feq(Bp,other.Bp) && age == other.age);
    } else {
        return (feq(error,other.error) && feq(min_error,other.min_error) &&
                age == other.age);
    }
}

Uses CLASSIFICATION_MODE 234, DLIST_MODE 234, feq 232, and olnode 155b.

156b 〈alkemy::data structures::functions 156a〉+≡ (158a) ⊳ 156a

bool olnode::operator<(olnode const &other) const {
    if (CLASSIFICATION_MODE && DLIST_MODE) {
        if (max_cover > other.max_cover) return false;
        if (max_cover < other.max_cover) return true;
        return (age > other.age);
    } else if (CLASSIFICATION_MODE) {
        if (Bp>other.Bp) return false; if (Bp<other.Bp) return true;
        if (Ap>other.Ap) return false; if (Ap<other.Ap) return true;
        return (age > other.age);
    } else {
        if (min_error < other.min_error) return false;
        if (min_error > other.min_error) return true;
        if (error < other.error) return false;
        if (error > other.error) return true;
        return (age > other.age);
    }
}

Uses CLASSIFICATION_MODE 234, DLIST_MODE 234, and olnode 155b.
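To see how this ordering interacts with std::priority_queue, which serves its largest element first, here is a minimal sketch. It assumes a simplified node type, Item (hypothetical), that keeps only the classification fields, and it compares doubles with != where the original uses a pair of < and > tests.

```cpp
#include <cassert>
#include <queue>

// Hypothetical, simplified open-list node: only the classification fields.
struct Item {
    double Ap, Bp;
    unsigned int age;
    // Same comparison order as the classification branch of olnode:
    // Bp first, then Ap, then older nodes (smaller age) rank higher.
    bool operator<(Item const& other) const {
        if (Bp != other.Bp) return Bp < other.Bp;
        if (Ap != other.Ap) return Ap < other.Ap;
        return age > other.age;
    }
};

// Returns the age of the node that std::priority_queue serves first.
unsigned int first_served() {
    std::priority_queue<Item> open;
    open.push({0.5, 0.9, 0});   // best bound, inserted first
    open.push({0.7, 0.8, 1});   // better accuracy but lower bound
    open.push({0.5, 0.9, 2});   // ties with node 0 on Bp and Ap
    return open.top().age;
}
```

The node with the highest Bp is served first, and among nodes that tie on both Bp and Ap the one with the smaller age wins, which gives the FIFO behaviour described for the age member.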


5.3 Alkemy Module Structure

Comment 5.3.1. This is the main class, in which the whole learning process is conducted. Learning is performed by calling one of the public functions of alkemy. The data structure dctree stores the tree under construction. The data structure openlist is used to store potentially interesting predicates.

157 〈alkemy.h 157〉≡

#ifndef _ALKEMY_H_
#define _ALKEMY_H_

#include "tree-dstructs.h"
#include <queue>

〈alkemy::data structures 155b〉

class alkemy {
    priority_queue<olnode> openlist;
    treenode_t * dctree;

public:
    alkemy();
    void cross_validate(unsigned int n);
    void cross_validate_1f(unsigned int n, unsigned int i);
    void m_leave_n_out(unsigned int m);
    eval_measures boost(unsigned int m);
    void printdctree() { cout << functionname << " x = \n"; dctree->print(1); }

private:
    〈alkemy::private function declarations 161b〉
};

#endif

Defines:alkemy.h, used in chunks 2, 158a, 206, and 229.boost, used in chunks 159 and 161a.cross_validate, used in chunk 207a.cross_validate_1f, used in chunk 207a.dctree, used in chunks 161–67, 171b, 174b, 186, 189a, 191b, 193–96, 203, and 204a.m_leave_n_out, used in chunk 207a.openlist, used in chunks 7a, 171b, 173, 180–84, and 186.printdctree, used in chunks 165b and 166a.

Uses eval_measures 160a, functionname 234 235, olnode 155b, tree-dstructs.h 155a, and treenode_t 149.


158a 〈alkemy.cc 158a〉≡

#include "alkemy.h"
#include "rewrite.h"
#include "pred-evaluation.h"

#include <unistd.h>
#include <utility>
#include <string>
#include <vector>

static bool made_progress = false;
static unsigned int tested = 0;

〈alkemy definitions 173b〉
〈alkemy::static functions 158b〉
〈alkemy::data structures::functions 156a〉
〈alkemy::private functions 161c〉
〈alkemy::public functions 158c〉

Defines:made_progress, used in chunks 177–80 and 184a.tested, used in chunks 164b, 165a, 174b, and 181a.

Uses alkemy.h 157, pred-evaluation.h 189a, and rewrite.h 131b.

Comment 5.3.2. The constructor installs a signal handler for SIGINT, which can be used to halt learning.

158b 〈alkemy::static functions 158b〉≡ (158a) 175a ⊲

static bool interrupted;
static void handle_signal(int sig) {
    cout << "Learning interrupted........\n"; interrupted = true;
}

158c 〈alkemy::public functions 158c〉≡ (158a) 159a ⊲

#include <signal.h>

alkemy::alkemy() { signal(SIGINT, handle_signal); }
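The pattern of setting a flag in the handler and polling it in the search loop can be sketched on its own. This is an illustrative variant, not the code above: the names stop_requested, on_sigint and run_until_interrupted are hypothetical, std::raise stands in for the user pressing ctrl-c, and the flag is declared as volatile std::sig_atomic_t, which is the conservative choice because only a restricted set of operations is safe inside a signal handler.

```cpp
#include <cassert>
#include <csignal>

// Flag written by the handler; sig_atomic_t is the type the C++ standard
// guarantees can be written safely from a signal handler.
static volatile std::sig_atomic_t stop_requested = 0;

extern "C" void on_sigint(int) { stop_requested = 1; }

// Simulates a search loop that polls the flag between units of work.
// Returns the number of iterations completed before the interrupt was seen.
int run_until_interrupted(int max_iters, int interrupt_at) {
    stop_requested = 0;
    std::signal(SIGINT, on_sigint);
    int done = 0;
    while (done < max_iters && !stop_requested) {
        if (done == interrupt_at) std::raise(SIGINT);  // stand-in for ctrl-c
        ++done;
    }
    std::signal(SIGINT, SIG_DFL);
    return done;
}
```

The loop finishes the unit of work during which the signal arrived and only then stops, which mirrors the intended behaviour of finishing the current open list node before halting.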


5.4 Learning Wrappers

Comment 5.4.1. This procedure performs an n-fold cross-validation on the data set. See [Koh95] for an analysis of this statistical estimation method. For small data sets, this method is infamously unstable.

We first divide the data set into n partitions, 2 ≤ n ≤ S, where S is the size of the data set. In fold i, all examples in partition i are used as the test set. The rest are used for training. Further, if tree post-pruning is required, a subset of the training examples is chosen as the validation set.

A final hypothesis is generated at the end using the whole data set.
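The partitioning step can be sketched as follows. This is only an illustration under stated assumptions: make_folds and test_size are hypothetical names, and the seeded shuffle followed by round-robin assignment is one simple way to obtain n near-equal partitions; it is not necessarily what doPartition does.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Assign each of `size` examples to one of `n` folds, as evenly as possible.
// fold[j] == i means example j is in the test set of fold i.
std::vector<unsigned> make_folds(unsigned size, unsigned n, unsigned seed) {
    assert(n >= 2 && n <= size);
    std::vector<unsigned> order(size);
    std::iota(order.begin(), order.end(), 0u);
    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);
    std::vector<unsigned> fold(size);
    for (unsigned pos = 0; pos != size; ++pos)
        fold[order[pos]] = pos % n;
    return fold;
}

// Number of examples held out for testing in fold i.
unsigned test_size(const std::vector<unsigned>& fold, unsigned i) {
    return static_cast<unsigned>(std::count(fold.begin(), fold.end(), i));
}
```

Every example lands in exactly one fold, so over the n folds each example is used for testing exactly once and for training n - 1 times.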

159a 〈alkemy::public functions 158c〉+≡ (158a) ⊳ 158c 160d ⊲

void alkemy::cross_validate(unsigned int n) {
    assert(n >= 2 && n <= getTrSize());
    eval_measures avg_acc;
    doPartition(n, options.seed);
    for (unsigned int i=0; i!=n; i++) {
        〈cross validate::perform fold i 159b〉
        〈cross validate::update avg-acc with temp 160b〉
    }
    cout << "\nHypothesis:\n";
    assignAllTrain();
    if (options.postprune) chooseValidSet(options.seed, options.valid);
    linkup_sets();
    if (options.boostN > 0) boost(options.boostN);
    else { initialise_learner(); learn(); cleanup_learner(); }
    〈cross validate::print summary 160c〉
}

Defines:cross_validate, used in chunk 207a.

Uses assignAllTrain 134 137b, boost 157 203, chooseValidSet 134 138a, doPartition 134 140a,eval_measures 160a, getTrSize 134 140b, initialise_learner 161c, learn 163c 164a, linkup_sets 134 139a,and options 234 235.

Comment 5.4.2. To do a particular validation fold, we first assign all examples for training. We then pick out the i-th partition (as computed by doPartition) for use as the test set. If tree post-pruning is required, we select a subset of the remaining training examples for use as the validation set. The desired learning procedure is then invoked.

159b 〈cross validate::perform fold i 159b〉≡ (159a 160d)

cout << "\nFold " << i << ":\n";
assignAllTrain();
assignGrpTest(i);
if (options.postprune) chooseValidSet(options.seed, options.valid);
linkup_sets();
eval_measures temp;
if (options.boostN > 0) temp = boost(options.boostN);
else { initialise_learner(); temp = learn(); cleanup_learner(); }

Uses assignAllTrain 134 137b, assignGrpTest 134 137b, boost 157 203, chooseValidSet 134 138a,eval_measures 160a, initialise_learner 161c, learn 163c 164a, linkup_sets 134 139a, and options 234 235.


Comment 5.4.3. The results of learning, summarised in terms of the vital statistics, are stored in this data structure.

160a 〈alkemy::data structures 155b〉+≡ (157) ⊳ 155b

struct eval_measures {
    double train_acc, test_acc;
    eval_measures() { train_acc = 0; test_acc = 0; }
};

Defines:eval_measures, used in chunks 157, 159, 163, 164, and 203.

Comment 5.4.4. We need to average the vital statistics over the n cross-validation folds. We just add up the numbers and then divide at the end.

160b 〈cross validate::update avg-acc with temp 160b〉≡ (159a)

avg_acc.train_acc += temp.train_acc; avg_acc.test_acc += temp.test_acc;

160c 〈cross validate::print summary 160c〉≡ (159a)

cout << "\n\nSummary of a " << n << "-fold cross-validation:\n";
if (!options.postprune) {
    cout << "\tAverage train set accuracy = " << avg_acc.train_acc/n << "\n";
    cout << "\tAverage test set accuracy = " << avg_acc.test_acc/n << "\n";
}


Comment 5.4.5. This function performs a single fold, namely fold i of an n-fold cross-validation. We recycle some code from the full cross-validation procedure here.

160d 〈alkemy::public functions 158c〉+≡ (158a) ⊳ 159a 161a ⊲

void alkemy::cross_validate_1f(unsigned int n, unsigned int i) {
    assert(n >= 2 && n <= getTrSize());
    doPartition(n, options.seed);
    〈cross validate::perform fold i 159b〉
}

Defines:cross_validate_1f, used in chunk 207a.

Uses doPartition 134 140a, getTrSize 134 140b, and options 234 235.


Comment 5.4.6. We can also do m different leave-n-out experiments. In any one such experiment, we select a certain percentage of examples for use as a test set. We then learn on the training set and test the accuracy of our hypothesis on the test set.

161a 〈alkemy::public functions 158c〉+≡ (158a) ⊳ 160d 203 ⊲

void alkemy::m_leave_n_out(unsigned int m) {
    unsigned int seed = options.seed;
    for (unsigned i=0; i!=m; i++) {
        assignAllTrain();
        if (options.test_percentage) {
            chooseTestSet(seed, options.test_percentage);
            seed = seed + 100;
        }
        if (options.postprune)
            chooseValidSet(options.seed, options.valid);
        linkup_sets();
        if (options.boostN > 0) boost(options.boostN);
        else { initialise_learner(); learn(); cleanup_learner(); }
    }
}

Defines:m_leave_n_out, used in chunk 207a.

Uses assignAllTrain 134 137b, boost 157 203, chooseTestSet 134 138b, chooseValidSet 134 138a,initialise_learner 161c, learn 163c 164a, linkup_sets 134 139a, and options 234 235.

5.5 Tree Induction

5.5.1 Initialise Learner

Comment 5.5.1. This procedure initialises the root node of the decision tree by filling out its essential fields. The initial partition is filled with information from the training set, including the total number of individuals in this node, together with the integer identifiers of the individuals in the training set.

The default best predicate is simply top, with a default Ap value equal to the weighted number of examples in the majority class of the initial partition. Furthermore, the two subtrees are initially empty.

161b 〈alkemy::private function declarations 161b〉≡ (157) 163b ⊲

void initialise_learner();

Uses initialise_learner 161c.

161c 〈alkemy::private functions 161c〉≡ (158a) 164a ⊲

void alkemy::initialise_learner() {
    dctree = new treenode_t;
    〈initialise-learner::compute training set size 162a〉
    〈initialise-learner::push training instances onto local structure 162b〉
    〈initialise-learner::initialise distribution count 162c〉
    〈initialise-learner::initialise best predicate 162d〉
    〈initialise-learner::initialise default accuracy 163a〉
}

Defines: initialise_learner, used in chunks 159, 161, and 203.

Uses dctree 157 and treenode_t 149.

Comment 5.5.2. We first compute the sizes of the training, test and validation sets. This computation is really quite redundant; the function linkup_sets can be modified to return the numbers being computed here.


162a 〈initialise-learner::compute training set size 162a〉≡ (161c)

int total = getTrSize();
int train = 0, test = 0, valid = 0;
for (int i=0; i!=total; i++) {
    switch (getTrIndividual(i).membership) {
    case TRAIN: train++; break;
    case TEST: test++; break;
    case VALID: valid++; break;
    case UNDEFINED: assert(false);
    case OBSOLETE: break;
    }
}
// cerr << "|dataset| = " << total << "\t|trainset| = " << train;
// cerr << "\t|testset| = " << test << "\t|validset| = " << valid << endl;
assert(total == train + test + valid);
dctree->indcount = total - test - valid;
dctree->individuals.reserve(dctree->indcount);

Uses dctree 157, getTrIndividual 134 136a, getTrSize 134 140b, indcount 149, OBSOLETE 132, TEST 132,testset 135a, TRAIN 132, trainset 135a, UNDEFINED 132, and VALID 132.

Comment 5.5.3. We next insert the indices of all training examples into the local individuals structure of dctree.

162b 〈initialise-learner::push training instances onto local structure 162b〉≡ (161c)

int k = getTrStartIndex();
while (k != -5) {
    dctree->individuals.push_back(k);
    k = getTrIndividual(k).ptr_tr;
}
assert((int)dctree->individuals.size() == dctree->indcount);
// cerr << "dctree->individuals.size = " << dctree->individuals.size() << endl;

Uses dctree 157, getTrIndividual 134 136a, getTrStartIndex 134 139b, and indcount 149.

Comment 5.5.4. We next initialise the distribution of examples in the root node of the decision tree.

162c 〈initialise-learner::initialise distribution count 162c〉≡ (161c)

if (CLASSIFICATION_MODE) {
    for (unsigned int i=0; i!=getClassCount(); i++)
        dctree->part.n[i] = getTrClassWeightSize(i);
    dctree->majorityclass = dctree->part.majorclass();
} else { // regression
    for (int i=0; i!=dctree->indcount; i++)
        dctree->part.update_with(getTrIndividual(dctree->individuals[i]));
    dctree->average = dctree->part.computeAverage();
}

Uses CLASSIFICATION_MODE 234, computeAverage 146c 148, dctree 157, getClassCount 134 140b,getTrClassWeightSize 134 142a, getTrIndividual 134 136a, indcount 149, majorclass 146a 148,and update_with 147b 148.

Comment 5.5.5. The initial best predicate is just top, of course.

162d 〈initialise-learner::initialise best predicate 162d〉≡ (161c)

dctree->bestpred = new std_predicate;
dctree->bestpred->transformations.push_back(new transformation_t);

Uses bestpred 149, dctree 157, std_predicate 101b, and transformation_t 101a.


Comment 5.5.6. The last step in the initialisation process is to calculate the baseline accuracy and squared error of the whole set of training examples.

163a 〈initialise-learner::initialise default accuracy 163a〉≡ (161c)

if (CLASSIFICATION_MODE) {
    dctree->Ap = getClassWeight(dctree->majorityclass) *
                 dctree->part.n[dctree->majorityclass];
    cout << "Default accuracy = " << dctree->Ap << endl;
    dctree->entropy = Entropy(dctree->part.n);
    cout << "Default entropy = " << dctree->entropy << endl;
} else { // regression
    dctree->average = dctree->part.computeAverage();
    dctree->sq_error = dctree->part.computeSqError(dctree->average);
    cout << "Default sq_error = " << dctree->sq_error << endl;
}

Uses CLASSIFICATION_MODE 234, computeAverage 146c 148, computeSqError 147a 148, dctree 157, Entropy 179b,and getClassWeight 134 137a.
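The two baselines can be illustrated with a small stand-alone sketch. The helper names majority_count and baseline_sq_error are hypothetical. The sketch also makes two assumptions that are not confirmed by the code above: class weights are taken to be 1, and the squared error is taken to be the sum (not the mean) of squared deviations from the average.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Number of examples in the most frequent class (unit class weights assumed).
double majority_count(const std::vector<double>& class_counts) {
    assert(!class_counts.empty());
    return *std::max_element(class_counts.begin(), class_counts.end());
}

// Sum of squared deviations of the regression values from their mean
// (assumed definition of the baseline squared error).
double baseline_sq_error(const std::vector<double>& values) {
    assert(!values.empty());
    double mean = 0;
    for (double v : values) mean += v;
    mean /= static_cast<double>(values.size());
    double err = 0;
    for (double v : values) err += (v - mean) * (v - mean);
    return err;
}
```

For class counts (3, 5, 2) the baseline is 5 correctly classified examples; for regression values 1, 2, 3 the average is 2 and the summed squared deviation is 2.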

5.5.2 Clean Up Learner

163b 〈alkemy::private function declarations 161b〉+≡ (157) ⊳ 161b 163c ⊲

void cleanup_learner() { if (dctree) dctree->freememory(); }

Uses dctree 157 and freememory 19a 19c.

5.5.3 Learn

Comment 5.5.7. This is the decision-tree learning function. The algorithm employs the traditional approach to tree construction. The decision tree is grown top-down until it is as large as possible. To combat potential overfitting, a tree post-pruning algorithm is then used to find an approximately optimal subtree of the generated tree. This second step is optional. The system currently uses the error-complexity pruning algorithm introduced by [BFOS84] to do tree post-pruning. Other possibilities exist; [Min89] lists and compares some of them.

The exact behaviour of the learn function is controlled by the global options data structure. Of relevance to us at this stage are the following options:

• stump - Learn a decision stump.

• post-prune - This determines whether tree post-pruning should be performed. It works only with the default algorithm.

We shall discuss the other options associated with predicate search as they arise. In addition to the above, the system has a signal handling routine to handle SIGINT (ctrl-c) signals. On receiving this signal, the system will try to finish testing the open list node it is currently rewriting, report the constructed decision tree, and die gracefully.

163c 〈alkemy::private function declarations 161b〉+≡ (157) ⊳ 163b 167a ⊲

eval_measures learn();

Defines:learn, used in chunks 159, 161a, and 203.

Uses eval_measures 160a.


164a 〈alkemy::private functions 161c〉+≡ (158a) ⊳ 161c 167b ⊲

#include <sys/time.h>

eval_measures alkemy::learn() {
    〈alkemy::learn::initialise variables 164b〉

    gettimeofday(tp1, NULL);
    if (options.decision_list) {
        cout << "Learning a decision list.\n"; buildDL(dctree, 0);
    } else {
        bool tree_pruned = false;
        buildtree(dctree, 0);
        if (options.postprune)
            tree_pruned = error_complexity_pruning();
    }
    gettimeofday(tp2, NULL);
    〈alkemy::learn::Display learning options 165a〉
    〈alkemy::learn::Display decision tree on training set 165b〉
    〈alkemy::learn::Compute and display decision tree on test set 166a〉
    〈alkemy::learn::Compute result on the real test set 166b〉
    〈alkemy::learn::Calculate elapsed time 164c〉
    return ret;
}

Defines:learn, used in chunks 159, 161a, and 203.

Uses buildDL 185b 186, buildtree 171a 171b, dctree 157, error_complexity_pruning 194b 194c,eval_measures 160a, and options 234 235.

Comment 5.5.8. The variables tp1 and tp2 are used for timing purposes.

164b 〈alkemy::learn::initialise variables 164b〉≡ (164a)

eval_measures ret; tested = 0;
struct timeval * tp1 = (struct timeval *)malloc(sizeof(struct timeval));
struct timeval * tp2 = (struct timeval *)malloc(sizeof(struct timeval));

Uses eval_measures 160a and tested 158a.

164c 〈alkemy::learn::Calculate elapsed time 164c〉≡ (164a)

double elptm = (double) (tp2->tv_sec - tp1->tv_sec) +
               (double) (tp2->tv_usec - tp1->tv_usec)*0.000001;

cout << "Total elapsed time (sec) " << elptm << endl;
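The elapsed-time formula adds the difference in whole seconds to the difference in microseconds, and the latter may be negative. A minimal sketch, using a hypothetical Stamp struct in place of struct timeval so that it does not depend on the system clock:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical mirror of the two timeval fields used above.
struct Stamp { long sec; long usec; };

// Elapsed seconds between two stamps; the microsecond difference may be
// negative, which the sum handles correctly.
double elapsed_seconds(Stamp start, Stamp end) {
    return static_cast<double>(end.sec - start.sec) +
           static_cast<double>(end.usec - start.usec) * 0.000001;
}
```

For example, from (1 s, 900000 us) to (3 s, 100000 us) the result is 2 - 0.8 = 1.2 seconds.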


Comment 5.5.9. Here, we just systematically go through the options structure and print out the relevant information.

165a 〈alkemy::learn::Display learning options 165a〉≡ (164a)

cout << "Learning Parameters" << endl;
cout << "\tCommand Line : " << commandline << endl;

cout << "\n\tSearch Strategy :\t";
if (options.cutout == -1 && !options.i_prune) cout << "N/A";
if (options.cutout > 0) cout << "cutout (c=" << options.cutout << ") ";
if (options.i_prune)
    cout << "prune (p=" << options.i_prune << " " << options.prune << ") ";

cout << "\n\tEnumeration Strategy :\t";
(options.strategy == LR) ? cout << "LR" : cout << "Expected";
if (options.one_redex) cout << " (1)";

cout << "\n\tStatistical Test :\t";
if (options.test_percentage == 0 && options.crossvalidate == -5) cout << "N/A";
else {
    if (options.crossvalidate)
        cout << "cross validation (C=" << options.crossvalidate << ") ";
    else if (options.exp_count > 0)
        cout << "m_leave_n%_out (m=" << options.exp_count
             << ", t=" << options.test_percentage << ") ";
    if (options.seed != 0) cout << "seed=" << options.seed;
}

cout << "\n\tTree Structure :\t";
if (options.stump) cout << "stump";
else if (DLIST_MODE) cout << "list";
else cout << "tree";
if (options.postprune) cout << ", tree post-pruning (V=" << options.valid << ")";
cout << "\n\nTotal predicates tested : " << tested << endl;
cout << "Total predicates in FAPtable : " << FAPtable_size() << endl << endl;

Uses commandline 234 235, DLIST_MODE 234, FAPtable_size 189a 189d, LR 234, options 234 235, and tested 158a.

165b 〈alkemy::learn::Display decision tree on training set 165b〉≡ (164a)

cout << "Decision tree on training set.." << endl; printdctree();

if (CLASSIFICATION_MODE) ret.train_acc = computeAccuracy("Train Acc = ");
else cout << "Error = " << dctree->calcError() << endl;

cout << endl << endl;

Uses calcError 153a 154c, CLASSIFICATION_MODE 234, computeAccuracy 167a 167b, dctree 157,and printdctree 157.


Comment 5.5.10. Here, we compute the score of the induced tree on the test set.

166a 〈alkemy::learn::Compute and display decision tree on test set 166a〉≡ (164a)

if (getTrTestSize()) {
    assert(CLASSIFICATION_MODE);
    cout << "\nDecision tree on test set.." << endl;
    dctree->clear_fields();

    int k = getTsStartIndex();
    while (k != -5) {
        dctree->individuals.push_back(k);
        int index = getClassIndex(getTrIndividual(k).label.clabel);
        dctree->part.n[index]++;
        k = getTrIndividual(k).ptr_ts;
    }
    dctree->indcount = dctree->individuals.size();
    dctree->evaluate(); printdctree();

    ret.test_acc = computeAccuracy("Test Acc = ");
    cout << endl;
}

Uses CLASSIFICATION_MODE 234, clear_fields 150c 154c, computeAccuracy 167a 167b, dctree 157,evaluate 153b 154c, getClassIndex 134 136c, getTrIndividual 134 136a, getTrTestSize 134 141,getTsStartIndex 134 139b, indcount 149, label 21, and printdctree 157.

Comment 5.5.11. This code chunk computes the predictions for the test data using the hypothesis.

166b 〈alkemy::learn::Compute result on the real test set 166b〉≡ (164a)

if (getTsSize()) {
    int osel = getSelector(); setSelector(STDOUT);
    cout << "Predictions on the test set." << endl;
    for (unsigned int i=0; i!=getTsSize(); i++) {
        indlabel result;
        result = dctree->evaluate1(getTsIndividual(i));
        cout << functionname << " ";
        setSelector(STDOUT); getTsIndividual(i).individual->print();
        if (CLASSIFICATION_MODE) {
            cout << " = ";
            if (result.mclass == -5) cout << "Not sure\n;";
            else cout << getClassString(result.mclass);
        } else cout << " = " << result.reg << endl;
        cout << endl;
    }
    cout << "End predictions" << endl;
    setSelector(osel);
}

Uses CLASSIFICATION_MODE 234, dctree 157, evaluate1 154a 154c, functionname 234 235, getClassString 134 137a,getSelector 246 247a, getTsIndividual 134 136a, getTsSize 134 140b, setSelector 246 247a, and STDOUT 246.


function Learn(E, ↣, P, M) returns a decision tree;
inputs: E, a set of examples;
        ↣, a predicate rewrite system;
        P, prune parameter;
        M, learning mode, classification or regression;

tree := BuildTree(E, ↣, P, M);
if M = classification then
    label each leaf node of tree by its majority class;
if M = regression then
    label each leaf node of tree by the average of the regression values;
tree := Postprune(tree, M);
return tree;

Figure 5.1: Decision-tree learning algorithm

Comment 5.5.12. The following are two supporting functions for calculating vital statistics.

167a 〈alkemy::private function declarations 161b〉+≡ (157) ⊳ 163c 171a ⊲

double computeAccuracy(string text);

Defines:computeAccuracy, used in chunks 165b and 166a.

167b 〈alkemy::private functions 161c〉+≡ (158a) ⊳ 164a 171b ⊲

double alkemy::computeAccuracy(string text) {
    double ret = 0;
    double numerator = dctree->calcAP();
    double denominator = dctree->part.sumInt();
    cout << text << numerator << "/" << denominator;
    if (denominator) ret = numerator / denominator;
    cout << " (" << 100.0*ret << "%)";
    return ret;
}

Defines:computeAccuracy, used in chunks 165b and 166a.

Uses calcAP 152b 154c, dctree 157, and sumInt 145c 148.

5.6 Tree-Induction Algorithms

Comment 5.6.1. We can now give the top-down tree induction algorithm. (See Figures 5.1, 5.2, and 5.3.) The basic algorithm is standard. In our implementation, the cost/error complexity pruning algorithm of [BFOS84] is used as the tree post-pruning algorithm. Any other post-pruning technique can be employed. The most interesting part of the algorithm is the way it searches the space of predicates to find a good predicate to split a node. This enumeration process was described earlier in Section 3.5. The function Predicate given in Figure 5.3 is a variation of Algorithm II (Figure 3.3) that instead returns a single predicate. One can adapt Algorithm I (Figure 3.2) for use here in a similar manner.


function BuildTree(E, ↣, P, M) returns a decision tree;
inputs: E, a set of examples;
        ↣, a predicate rewrite system;
        P, prune parameter;
        M, learning mode, classification or regression;

tree := single node (with examples E);
p := Predicate(E, ↣, P, M);
P := partition of E induced by p;
if M = classification ∧ A_P ≤ A_E then return tree;
if M = regression ∧ Q_P = E_E then return tree;
tree.predicate := p;
E+ := {(x, y) | (p x) ∧ (x, y) ∈ E};
E− := {(x, y) | ¬(p x) ∧ (x, y) ∈ E};
tree.left := BuildTree(E+, ↣, P, M);
tree.right := BuildTree(E−, ↣, P, M);
return tree;

Figure 5.2: Tree building algorithm

There are four inputs to the algorithm: the training examples, a predicate rewrite system, a prune parameter, and a learning mode. The learning mode specifies whether a classification or a regression tree should be produced. We assume here that the input predicate rewrite system is monotone and satisfies the conditions for ensuring uniqueness and completeness of predicate derivations.

The parameter prune is a percentage; it causes predicates whose induced partitions do not have a refinement bound larger than prune to be pruned and thus removes predicates that do not have the potential for achieving splits of high accuracy. In the default mode, the parameter is initially set at 0%. As better accuracies are obtained during search, the value of the parameter is updated. This kind of pruning is safe. However, for large search spaces, it is common to set a high initial value for prune.

The open list is decreasingly ordered by the refinement bounds of the partitions induced by the predicates. (Predicates that have the same refinement bound are decreasingly ordered by accuracy.) Thus the predicate with the highest such value is at the head of the list. The refinement bound plays a crucial part in directing the search towards promising predicates, that is, those which have the potential for being strengthened to produce splits with high accuracy. In this regard, in Figure 5.3, the function Insert takes a predicate and the open list as arguments and returns a new open list with the predicate inserted in the appropriate place according to the ordering imposed by refinement bound.

For the sake of clarity, we have left out the code for breaking ties between equally good predicates with the entropy function. Also not shown in the figure is a parameter called stump. This parameter is a boolean. If stump is true, then the induced decision tree has only a single split. The default value of the parameter is false.

Comment 5.6.2. In addition to the standard top-down induction algorithm, Alkemy also implements the decision-list learning algorithm introduced by [Riv87]. This algorithm only works for classification problems. A decision list is a special kind of decision tree in which splits only ever occur at the false branches of decision nodes. More precisely, a decision list is a list L of pairs (f_1, v_1), . . . , (f_r, v_r) where each f_j is a predicate, and each v_j is in some finite discrete set. A decision list L defines a function as follows: for any individual x ∈ B_α for some type α, L(x) is defined to be v_j, where j is the least index such that f_j x = ⊤.
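The function defined by a decision list can be sketched directly from this definition. The following is illustrative only: Rule, apply_list and example_list are hypothetical names, individuals are plain integers, values are strings, and the last rule is assumed to be the always-true predicate so that some rule always fires.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A decision list over integers: pairs (f_j, v_j) tried in order.
using Rule = std::pair<std::function<bool(int)>, std::string>;

// Returns v_j for the least j with f_j(x) true. The final rule is assumed
// to be the always-true predicate (top), so a value is always found.
std::string apply_list(const std::vector<Rule>& rules, int x) {
    for (const Rule& r : rules)
        if (r.first(x)) return r.second;
    assert(false && "decision list must end with an always-true rule");
    return std::string();
}

// Example list: negative numbers first, then even numbers, then the rest.
std::vector<Rule> example_list() {
    return {
        {[](int x) { return x < 0; }, "negative"},
        {[](int x) { return x % 2 == 0; }, "even"},
        {[](int) { return true; }, "odd"},
    };
}
```

Note that -4 is labelled "negative" rather than "even": the order of the pairs matters, since the first predicate that holds determines the value.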


function Predicate(E, ↣, P, M) returns a predicate;
input:  E, a set of examples;
        ↣, a predicate rewrite system;
        P, prune parameter;
        M, learning mode, classification or regression;

openList := [top];
predicate := top;
if M = classification then accuracy := A_E;
if M = regression then error := Q(E, ∅);




for each LR redex r via r ↣ b, for some b, in p do
    q := p[r/b];
    if q is regular then
        P := partition of E induced by q;
        if M = classification then
            if A_P > accuracy then
                predicate := q;
                accuracy := A_P;
                if A_P > P then P := A_P;
            if B_P ≥ P then
                openList := Insert(q, openList);
        if M = regression then
            if Q_P < error then
                predicate := q;
                error := Q_P;
                if Q_P < P then P := Q_P;
            if C_P ≤ P ∧ Q_P > C_P then
                openList := Insert(q, openList);
return predicate;

Figure 5.3: Algorithm for finding a predicate to split a node


function LearnDL(E, ↣) returns a decision list;
inputs: E, a set of examples;
        ↣, a predicate rewrite system;

tree := single node (with examples E);
S := the set of predicates defined by ↣;
T := tree;
while E not empty do
    foreach p ∈ S do
        E+ := {x | (p x) ∧ x ∈ E};
        E− := E \ E+;
        if E+ is pure then
            T.left := single node (with examples E+);
            T.right := single node (with examples E−);
            T := T.right;
            E := E−;
            break;
    if no split can be found then declare error;
label each leaf node of tree by its majority class;
return tree;

Figure 5.4: Decision-list learning algorithm

Figure 5.4 shows the algorithm. A set of examples is said to be pure if all the examples in the set have the same class label. Besides purity, it is also common to demand that the set of examples covered by a predicate must have a certain minimum size. To counter potential problems introduced by noise, it is possible to relax the constraint and require only that examples are almost pure, by defining a suitable maximum impurity tolerance. [SMJST03] gives such an algorithm.

One can be more careful about the way the intermediate predicate choices are made, with the aim of minimising the size of the decision list. Although the problem of computing the smallest consistent decision list given a set of examples, assuming one exists, is NP-hard in general, it is expected that standard heuristics like picking at each stage the predicate that covers the largest set of examples would help. A simple pruning mechanism can be designed to work with this heuristic. Let (E_1, E_2) be the partition induced by a predicate p. If E_1 is pure, then define the score of p as |E_1|. Define the potential of p as the number of examples in the majority class in E_1. The pruning mechanism works as follows: If E_1 is pure, then prune off the subtree rooted at p. Else if the potential of p is less than the current best score achieved by some other predicate, then again prune. It is easy to show that the pruning is safe.
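The score, potential and pruning test just described can be sketched over class counts. All names here (Counts, total, potential, is_pure, can_prune) are hypothetical, and treating an empty E_1 as not pure is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Class counts of E_1, the examples covered by a predicate p.
using Counts = std::vector<int>;

// |E_1|: the score of p when E_1 is pure.
int total(const Counts& c) {
    int s = 0;
    for (int n : c) s += n;
    return s;
}

// Potential of p: number of examples in the majority class of E_1.
int potential(const Counts& c) {
    return c.empty() ? 0 : *std::max_element(c.begin(), c.end());
}

// E_1 is pure when every covered example has the same class label.
bool is_pure(const Counts& c) {
    return total(c) > 0 && potential(c) == total(c);
}

// True when the search below p can be cut: p is already pure, or its
// potential is less than the best score achieved so far.
bool can_prune(const Counts& c, int best_score) {
    return is_pure(c) || potential(c) < best_score;
}
```

With counts (2, 5) the potential is 5, so the subtree below p is cut once some other predicate has achieved a score of 6, but is kept while the best score is 5.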

It is interesting to note that in [MC95] and [CM98], decision lists are constructed in reverse order, that is, the more general cases at the end of the lists are learned first, and then exceptions to those rules are attached to the front of the lists. They reported good performance for their learning task using this modification.

Comment 5.6.3. Heuristic search algorithms are useful when the search space is so large that an exhaustive search is impossible. Alkemy implements a heuristic that can be used with either form of predicate enumeration. Experience shows that, in general, the seenSet-based algorithm works better in conjunction with such heuristics.

The algorithm is a form of resource-bounded greedy depth-first search. A non-negative parameter cutout can be set such that if the algorithm investigates cutout predicates in succession without finding one which is strictly better than the current best predicate, then the algorithm terminates, returning the best predicate found so far. Every time a new best predicate is found, the cutout parameter is reset to its initial value.
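The cutout rule can be illustrated in isolation. The sketch below is not the Alkemy implementation: it replaces the open list by a plain sequence of candidate scores, and the names CutoutResult and cutoutSearch are hypothetical. It assumes a strictly positive cutout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct CutoutResult {
    int bestIndex;         // index of the best candidate seen, or -1 if none beat startBest
    std::size_t examined;  // number of candidates evaluated before stopping
};

// Scan candidates in order; stop after `cutout` consecutive candidates
// that fail to be strictly better than the best one seen so far.
inline CutoutResult cutoutSearch(const std::vector<double>& scores,
                                 double startBest, int cutout) {
    assert(cutout > 0);  // assumption of this sketch
    CutoutResult r{-1, 0};
    double best = startBest;
    int remaining = cutout;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        ++r.examined;
        --remaining;                      // one more candidate investigated
        if (scores[i] > best) {           // strictly better: record it and reset the counter
            best = scores[i];
            r.bestIndex = static_cast<int>(i);
            remaining = cutout;
        }
        if (remaining == 0) break;        // budget exhausted without improvement
    }
    return r;
}
```

With a cutout of 3, the sequence 1, 2, 0, 0, 0, 5 stops after the fifth candidate and never reaches the 5, which is the price of the resource bound; a larger cutout would find it.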

Besides the above, it is possible and easy to incorporate other search strategies. For example, if accuracy is an issue, and there is no harsh requirement on search time, then one can implement a strategy whereby a minimum accuracy is specified by the user, and the system does not terminate a search until a predicate with a higher accuracy is found. For another example, consider real-time applications where there is a hard limit on response time. In such cases, the learner can implement a strategy to do the best it can within the specified time.

5.6.1 BuildTree

171a 〈alkemy::private function declarations 161b〉+≡ (157) ⊳ 167a 184c ⊲

bool buildtree(treenode_t ∗ tnode, double min_best);

Defines:buildtree, used in chunk 164a.

Uses treenode_t 149.

171b 〈alkemy::private functions 161c〉+≡ (158a) ⊳ 167b 185a ⊲

bool alkemy::buildtree(treenode_t ∗ tnode, double min_best) {
    cerr ≪ "Splitting "; cerr ≪ "......\n\n";
    〈buildtree::return if nothing can be done 172a〉
    〈buildtree::initialise the book-keeping data structures 172b〉
    〈buildtree::initialise the openlist 173a〉

    interrupted = false;
    while (openlist.size() ∧ ¬interrupted) {
        〈buildtree::get the rewrites for the current olnode 173c〉
        for (unsigned int kss=0; kss ≠ rwlist.size(); kss++) {
            std_predicate ∗ rwp = rwlist[kss];
            〈buildtree::construct predicate and test it 174a〉
            〈buildtree::impose cutout mechanism 181b〉
            〈buildtree::insert into openlist if interesting 182a〉
        }
        〈buildtree::left most consideration 183d〉
    }
    〈buildtree::clean up openlist if interrupted 183e〉
    〈buildtree::extend current tree if a better accuracy is obtained 184a〉
    if (extended ≡ false) return false;

    if (CLASSIFICATION_MODE ∧ feq(dctree→calcAP(),dctree→part.sum())) {
        cerr ≪ "We got a perfect classifier.\n"; return true;
    }
    rewrite_refresh();
    if (options.stump ≡ false ∧ ¬interrupted) {
        buildtree(tnode→ltree, 0); rewrite_refresh();
        buildtree(tnode→rtree, 0); rewrite_refresh();
    }
    return true;
}

Defines:buildtree, used in chunk 164a.

Uses calcAP 152b 154c, CLASSIFICATION_MODE 234, dctree 157, feq 232, openlist 157, options 234 235,rewrite_refresh 126b 127a, std_predicate 101b, sum 145c 148, and treenode_t 149.


Comment 5.6.4. If the current node is already pure, then return straight away. Also, if the current node cannot be split because there are no two classes with high enough conditional distributions (estimated by frequency, as usual), then we give up straight away.

172a 〈buildtree::return if nothing can be done 172a〉≡ (171b 186)

if ((REGRESSION_MODE ∧ feq(tnode→sq_error,0)) ∨
    (CLASSIFICATION_MODE ∧ options.pos_only≡false
     ∧ feq(tnode→Ap,tnode→part.sum()))) {
    cerr ≪ " ** Pure node, return.\n\n"; return false;
}
double balance = 0;
if (CLASSIFICATION_MODE ∧ options.boostN ≡ 0) {
    double obalance = (options.balance ∗ tnode→part.sumInt()) ÷ 100 ;
    balance = (2 ≥ obalance) ? 2 : obalance ;
    // balance = (options.balance * tnode->part.sumInt()) /100;
    unsigned int toosmall = 0;
    for (unsigned int i=0; i≠tnode→part.n.size(); i++)
        if (tnode→part.n[i] < balance) toosmall++;
    if (toosmall ≥ tnode→part.n.size()-1) return false;
}

Uses CLASSIFICATION_MODE 234, feq 232, options 234 235, REGRESSION_MODE 234, sum 145c 148,and sumInt 145c 148.

Comment 5.6.5. Here we initialise the variables used for book keeping. The default values are all obvious.

172b 〈buildtree::initialise the book-keeping data structures 172b〉≡ (171b 186)

term_schema ∗ pred = NULL;
distribution E1, E2;
unsigned int age = 0;
int cutout = options.cutout; options.prune = options.i_prune;

double default_accuracy = 0, best = 0, best2 = 0;
double besterr = 0, besterr2 = 0;

if (CLASSIFICATION_MODE) {
    default_accuracy = getClassWeight(tnode→majorityclass)
                       ∗ tnode→part.n[tnode→majorityclass];
    best = default_accuracy; best2 = 0.0; // second best accuracy
    // #define ENTROPY
#ifdef ENTROPY
    best = Entropy(tnode→part.n);
#endif
} else { besterr = tnode→sq_error; besterr2 = besterr; }

Uses CLASSIFICATION_MODE 234, distribution 144, Entropy 179b, getClassWeight 134 137a, options 234 235,and term_schema 9.


Comment 5.6.6. Here, we create a top and insert that into the open list. We also need to insert the entry for top into the frequently-accessed predicates (FAP) table, since that is the ancestor of every other predicate in the search space.

173a 〈buildtree::initialise the openlist 173a〉≡ (171b 186)

assert(openlist.size() ≡ 0);
std_predicate ∗ initial = new std_predicate;
initial→encoding.push_back(0);
transformation_t ∗ toptrans = new transformation_t;
assert(get_type(topleveltype).second);
type ∗ ind = get_type(topleveltype).second→clone();
delete_type(toptrans→ttype→getAlpha(0));
toptrans→ttype→setAlpha(new type_synonym(topleveltype, ind), 0);
initial→transformations.push_back(toptrans);
initial→calRedexes();

olnode init_node;
init_node.predicate = initial;
openlist.push(init_node);

vector<char> topresults;
topresults.reserve(getTrSize());
for (unsigned int ks=0; ks ≠ getTrSize(); ks++) topresults.push_back(1);
initialise_FAPtable(initial→encoding, topresults);

Uses calRedexes 119a 119b, clone 19a 19b, delete_type 77b 77c, encoding 122d, get_type 245c,getTrSize 134 140b, initialise_FAPtable 189a 189d, olnode 155b, openlist 157, std_predicate 101b,topleveltype 234 235, transformation_t 101a, and type_synonym 81c.

Comment 5.6.7. Here, we extract the first node on the open-list and examine whether it is fit for rewriting. This is needed because nodes inserted earlier in the learning process (when the prune parameter is still relatively low) may be rendered uninteresting by discoveries made later on in the search. An alternative way to handle this is to scan through the open-list and remove uninteresting nodes every time the prune parameter is raised.

Potentially interesting predicates are generated using the rewrite function call implemented in predicate-construction.nw.

173b 〈alkemy definitions 173b〉≡ (158a) 178b ⊲

#define NO_PRUNING (options.i_prune < 0)


173c 〈buildtree::get the rewrites for the current olnode 173c〉≡ (171b 186)

olnode node = openlist.top(); openlist.pop();
if (CLASSIFICATION_MODE) {
    if (¬(NO_PRUNING) ∧ node.Bp ∧ node.Bp < best) {
        node.freememory(); continue;
    }
} else { // regression
    if (¬(NO_PRUNING) ∧ node.min_error ≥ besterr) {
        node.freememory(); continue;
    }
}
if (options.enumSpace) { enumerate(∗node.predicate); exit(0); }
vector<std_predicate ∗> rwlist = rewriteP(∗node.predicate);

Uses CLASSIFICATION_MODE 234, enumerate 184c 185a, freememory 19a 19c, olnode 155b, openlist 157,options 234 235, rewriteP 120 121a, and std_predicate 101b.


Comment 5.6.8. Having computed the children of the predicate at the front of the openlist, we proceed to evaluate each child on the training examples.

174a 〈buildtree::construct predicate and test it 174a〉≡ (171b 186)

pred = rwp→makeTerm();
〈buildtree::predicate evaluation via table lookup 174b〉
〈buildtree::predicate evaluation::calculate refinement bound 174c〉
〈buildtree::predicate evaluation::record if better than best so far 176b〉
if (options.verbosity ≥ 2) { 〈buildtree::print progress 180b〉 }
pred→freememory();

Uses freememory 19a 19c, makeTerm 103b 105b, and options 234 235.

Comment 5.6.9. The description of the actual predicate evaluation via table lookup algorithm can be found in §5.7.

174b 〈buildtree::predicate evaluation via table lookup 174b〉≡ (174a)

FAP_evaluate(pred, rwp, E1, E2, tnode, dctree);
tested++; cutout−−;

double accuracy = 0, entropy = 0, covered = 0, sq_error = 0, err2 = 0;

if (CLASSIFICATION_MODE ∧ DLIST_MODE) {
    if (E1.n[E1.majorclass()] ≡ 0) covered = 0;
    else covered = E1.n[E1.majorclass()] ÷ E1.sumInt(); // changed
} else if (CLASSIFICATION_MODE) {
    int leftmajori = E1.majorclass(); int rightmajori = E2.majorclass();
    accuracy = getClassWeight(leftmajori) ∗ E1.n[leftmajori] +
               getClassWeight(rightmajori) ∗ E2.n[rightmajori];
#ifdef ENTROPY
    entropy = (E1.sumInt()÷tnode→indcount)∗Entropy(E1.n) +
              (E2.sumInt()÷tnode→indcount)∗Entropy(E2.n);
#endif
} else { // regression
    double av1 = E1.computeAverage(); double av2 = E2.computeAverage();
    err2 = E2.computeSqError(av2);
    sq_error = E1.computeSqError(av1) + err2;
}

Uses CLASSIFICATION_MODE 234, computeAverage 146c 148, computeSqError 147a 148, dctree 157, DLIST_MODE 234,

Entropy 179b, FAP_evaluate 189a 190b 191b, getClassWeight 134 137a, indcount 149, majorclass 146a 148,sumInt 145c 148, and tested 158a.

Comment 5.6.10. Here we calculate the refinement bound of the current predicate. The actual computations are done using static functions to avoid clutter in the buildtree function. This is also the place to do preprocessing of the predicate rewrite system.

174c 〈buildtree::predicate evaluation::calculate refinement bound 174c〉≡ (174a)

double potential = 0; double min_error = 0;
if (CLASSIFICATION_MODE) potential = calcBp(tnode→part, E1);
else {
    if (E1.labels.size() ≡ 0) min_error = err2;
    else min_error = computeMinimum(E1.labels, E2.labels, tnode→sq_error,
                                    tnode→part.labelsum);
}

Uses calcBp 175a, CLASSIFICATION_MODE 234, and computeMinimum 176a.


175a 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 158b 175b ⊲

static double calcBp(distribution & all, distribution & E1) {
    double bp = 0, maxBp = 0;
    int classcount = all.n.size();
    for (int i=0; i≠classcount; i++)
        for (int j=0; j≠classcount; j++) {
            if (i ≡ j) continue;
            bp = getClassWeight(i) ∗ all.n[i] + getClassWeight(j) ∗ E1.n[j];
            if (bp > maxBp) maxBp = bp;
        }
    return maxBp;
}

Defines:calcBp, used in chunk 174c.


Comment 5.6.11. The function computeSi computes a running sum of the elements of A (a multiset, hence visited in ascending order) and stores the result in S. See [Ng05] for why this works.

175b 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 175a 175c ⊲

#include <set>
#include <vector>

static void computeSi(multiset<double> A, vector<double> & S) {
    if (A.size() ≡ 0) return;
    multiset<double>::iterator p = A.begin(); int i=0;
    S[i] = ∗p; p++; i++;
    while (p ≠ A.end()) { S[i] = S[i-1] + ∗p; i++; p++; }
}

Defines:computeSi, used in chunk 176a.

175c 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 175b 176a ⊲

static double computePterm(int size1, double sum1, int size2, double sum2) {
    double term1 = (size1 ≡ 0) ? 0 : (sum1∗sum1)÷size1;
    double term2 = (size2 ≡ 0) ? 0 : (sum2∗sum2)÷size2;
    return term1 + term2;
}

Defines:computePterm, used in chunk 176a.


176a 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 175c 179b ⊲

static double computeMinimum(multiset<double> myset, multiset<double> myset2,
                             double totalvar, double totalsum) {
    int ret = -1; double maxp = 0;
    vector<double> s1(myset.size());
    computeSi(myset, s1);
    double S2 = 0;
    multiset<double>::iterator p = myset2.begin();
    while (p ≠ myset2.end()) { S2 += ∗p; p++; }

    int n = myset.size(); int m = myset2.size();
    double K = totalvar + (totalsum∗totalsum)÷(n+m);
    for (int i=0; i≠n; i++) {
        double p = 0; double e11 = s1[i]; double e12 = s1[n-1] - s1[i];
        double p1 = computePterm(i+1,e11,n-i-1+m,e12+S2);
        double p2 = computePterm(n-i-1,e12,i+1+m,e11+S2);
        if (p1 ≥ p2) p = p1; else p = p2;
        if (p > maxp) { maxp = p; ret = i; }
    }
    return K - maxp;
}

Defines:computeMinimum, used in chunk 174c.

Uses computePterm 175c and computeSi 175b.
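The quantity returned by computeMinimum can be read as follows. This is a sketch under the assumption that totalvar is the sum of squared deviations of all N = n + m labels about their mean and that totalsum is their sum, which is how chunk 174c appears to call the function; [Ng05] remains the reference for the full argument. The symbols A, B, S_A and S_B are introduced here for illustration.

```latex
\begin{align*}
K &= \sum_{i=1}^{N} (y_i - \bar{y})^2 + \frac{T^2}{N}
   = \sum_{i=1}^{N} y_i^2,
   \qquad T = \sum_{i=1}^{N} y_i,\quad \bar{y} = \frac{T}{N}, \\[4pt]
\mathrm{SqErr}(A, B)
  &= \sum_{y \in A} \Bigl(y - \frac{S_A}{|A|}\Bigr)^2
   + \sum_{y \in B} \Bigl(y - \frac{S_B}{|B|}\Bigr)^2
   = K - \Bigl(\frac{S_A^2}{|A|} + \frac{S_B^2}{|B|}\Bigr),
\end{align*}
```

Here (A, B) is a partition of the labels into two non-empty parts with label sums S_A and S_B. The bracketed term is what computePterm evaluates (with an empty part contributing 0), so minimising the squared error over candidate partitions amounts to maximising that term, and the function returns K - maxp. The candidates tried in the loop keep either the i+1 smallest or the n-i-1 largest labels of the first set and move the remaining ones over to join the second set.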

Comment 5.6.12. Here, we need to overwrite the previous best predicate. We must first free up the memory occupied by it.

We report both the new best and new equal best predicates. One extra condition for reporting equal best is that the current best must be higher than the default accuracy. This is introduced to prevent useless information from being printed early in the learning process.

We want to maintain the two best values. Given the value of a new predicate, there are only four scenarios to consider because of the relationship between the best and second best values. To simplify discussion, let us concentrate on regression. The new error must be larger than, equal to, or smaller than the best error. In the first case, if the error is larger than or equal to the second best error, we do nothing. Otherwise, we update the second best error. In the second case, we update the second best error. In the third case, we update both the best and second best error.
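The four scenarios for regression can be written down directly. The following is a minimal sketch that follows the prose above rather than chunk 180a (which adds reporting and an extra condition in the equal case); the names BestTwo and recordError are hypothetical.

```cpp
#include <cassert>

// Best and second best error seen so far; invariant: best <= second.
struct BestTwo {
    double best;
    double second;
};

inline void recordError(BestTwo& b, double err) {
    assert(b.best <= b.second);
    if (err > b.best) {
        // Case 1: worse than the best. Only the second best can change.
        if (err < b.second) b.second = err;
    } else if (err == b.best) {
        // Case 2: ties the best, so the second best now equals the best.
        b.second = b.best;
    } else {
        // Case 3: strictly better. The old best becomes the second best.
        b.second = b.best;
        b.best = err;
    }
}
```

Starting from best = second = 10, recording the errors 12, 8, 9, 8, 5 in turn leaves the pair at (10, 10), (8, 10), (8, 9), (8, 8) and finally (5, 8).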

176b 〈buildtree::predicate evaluation::record if better than best so far 176b〉≡ (174a)

if (CLASSIFICATION_MODE ∧ DLIST_MODE) { 〈record if useful::case of list learning 177〉 }
else if (CLASSIFICATION_MODE) { 〈record if useful::case of tree learning 178a〉 }
else { 〈record if useful::case of regression 180a〉 }

Uses CLASSIFICATION_MODE 234 and DLIST_MODE 234.


Comment 5.6.13. This is a straightforward operation. If the coverage of the current predicate is (fairly) pure, we keep it and update largestPureSet accordingly.

177 〈record if useful::case of list learning 177〉≡ (176b)

if (E1.sumInt() < tnode→indcount ∧ E1.sumInt() > 2)
    if (covered > tnode→largestPureSet) {
        tnode→largestPureSet = covered;
        if (options.verbosity ≡ 1) { 〈buildtree::print progress 180b〉 }
        if (options.verbosity) cout ≪ " BEST" ≪ endl;
        if (tnode→bestpred) tnode→bestpred→freememory();
        tnode→bestpred = rwp→clone();
        made_progress = true;
    }

Uses bestpred 149, clone 19a 19b, freememory 19a 19c, indcount 149, made_progress 158a, options 234 235,

and sumInt 145c 148.


Comment 5.6.14. There are two scenarios we need to worry about. If the current best accuracy is bettered, then we just update the necessary fields. But if the current best accuracy is matched only, then we need to break ties based on entropy.

178a 〈record if useful::case of tree learning 178a〉≡ (176b)

if (accuracy > best2 ∧ accuracy < best) best2 = accuracy;
#ifdef ENTROPY
〈record if useful::case of entropy 179a〉
#else
if (accuracy≡best ∧ BALANCED(E1.sumInt(),E2.sumInt()) ∧ options.boostN≡0) {
    best2 = best;

    if (options.verbosity≡2) cout ≪ "EQUAL BEST ";
    if (options.verbosity≡2) { 〈buildtree::print progress 180b〉 cout≪endl; }

    double total = E1.sumInt() + E2.sumInt();
    assert(total ≡ (double)(tnode→indcount));
    double impurity = (E1.sumInt()÷tnode→indcount)∗Entropy(E1.n) +
                      (E2.sumInt()÷tnode→indcount)∗Entropy(E2.n);
    if (tnode→entropy - impurity > 0.000001) {
        cerr ≪ "** " ≪ impurity ≪" "≪tnode→entropy≪endl;
        if (options.verbosity) cout ≪ "BEST ALTERNATIVE ";
        if (options.verbosity ≡ 1) {
            〈buildtree::print progress 180b〉 cout ≪ endl; }
        if (tnode→bestpred) tnode→bestpred→freememory();
        tnode→bestpred = rwp→clone();
        tnode→entropy = impurity;
        cutout = options.cutout;
        made_progress = true;
    }
}
if (accuracy > best ∧ BALANCED(E1.sumInt(), E2.sumInt()) ∧ PURE) {
    best2 = best; best = accuracy;
    double newprune = (best ∗ 100) ÷ tnode→part.sum();
    if (newprune ≥ options.prune) options.prune = newprune;
    tnode→bestpred→freememory();
    tnode→bestpred = rwp→clone();
    tnode→entropy = (E1.sumInt()÷tnode→indcount)∗Entropy(E1.n) +
                    (E2.sumInt()÷tnode→indcount)∗Entropy(E2.n);

    if (options.verbosity) cout ≪ "BEST ";
    if (options.verbosity≡1) { 〈buildtree::print progress 180b〉 cout≪endl; }
    // if we have got maximum accuracy, discontinue search
    if (best ≡ tnode→indcount) {
        cout ≪ " OPTIMAL SPLIT FOUND "; cutout = 0; }
    else cutout = options.cutout;
    made_progress = true;
}
#endif

Uses bestpred 149, clone 19a 19b, Entropy 179b, EQUAL 221, freememory 19a 19c, indcount 149,made_progress 158a, options 234 235, PURE 182c, sum 145c 148, and sumInt 145c 148.

178b 〈alkemy definitions 173b〉+≡ (158a) ⊳ 173b 182c ⊲

#define BALANCED(x,y) (x >= balance && y >= balance)


Comment 5.6.15. Sometimes, we want to use entropy instead of accuracy as the main measure of success. To get this behaviour, define ENTROPY.

179a 〈record if useful::case of entropy 179a〉≡ (178a)

if (entropy < best ∧ BALANCED(E1.sumInt(), E2.sumInt())) {
    best = entropy;
    tnode→bestpred→freememory();
    tnode→bestpred = rwp→clone();
    tnode→entropy = entropy;

    if (options.verbosity) cout ≪ "BEST ";
    if (options.verbosity≡1) { 〈buildtree::print progress 180b〉 cout≪endl; }
    made_progress = true;
}

Uses bestpred 149, clone 19a 19b, freememory 19a 19c, made_progress 158a, options 234 235, and sumInt 145c 148.

179b 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 176a 181a ⊲

#include <math.h>
double Entropy(vector<double> x) {
    double ret = 0;
    double total = 0;
    for (unsigned int i=0; i≠x.size(); i++) total += x[i];
    if (feq(total,0)) return 0;
    for (unsigned int i=0; i≠x.size(); i++) {
        double pi = x[i] ÷ total;
        if (pi ≠ 0) ret += -pi ∗ log(pi);
    }
    return ret;
}

Defines:Entropy, used in chunks 150b, 163a, 172b, 174b, and 178a.

Uses feq 232.
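The weighted impurity of a split, as used for tie-breaking in chunk 178a, can be sketched in a self-contained form. The functions entropy and splitImpurity below are hypothetical stand-ins written for this illustration; they take plain vectors of class counts rather than Alkemy's distribution objects, and use the natural logarithm as Entropy does.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Entropy (natural log) of a vector of non-negative class counts.
inline double entropy(const std::vector<double>& counts) {
    const double total = std::accumulate(counts.begin(), counts.end(), 0.0);
    if (total <= 0.0) return 0.0;
    double h = 0.0;
    for (double c : counts) {
        if (c > 0.0) {
            const double p = c / total;
            h -= p * std::log(p);
        }
    }
    return h;
}

// Size-weighted average of the entropies of the two sides of a split.
inline double splitImpurity(const std::vector<double>& left,
                            const std::vector<double>& right) {
    const double nl = std::accumulate(left.begin(), left.end(), 0.0);
    const double nr = std::accumulate(right.begin(), right.end(), 0.0);
    const double n = nl + nr;
    if (n <= 0.0) return 0.0;
    return (nl / n) * entropy(left) + (nr / n) * entropy(right);
}
```

A split that separates two classes perfectly has impurity 0, while one that leaves both sides evenly mixed has impurity log 2, so among predicates of equal accuracy the one with the lower value is preferred.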


Comment 5.6.16. This is the same as above, except that we use the squared error as the measure of success.

180a 〈record if useful::case of regression 180a〉≡ (176b)

if (sq_error < besterr2 ∧ sq_error > besterr) besterr2 = sq_error;
if (sq_error ≡ besterr ∧ besterr < tnode→sq_error) {
    besterr2 = besterr;
    if (options.verbosity) cout ≪ "\nEQUAL BEST ";
    if (options.verbosity≡1) { 〈buildtree::print progress 180b〉 cout≪endl; }
}
if (sq_error < besterr) {
    besterr2 = besterr; besterr = sq_error;
    tnode→bestpred→freememory();
    tnode→bestpred = rwp→clone();

    if (options.verbosity) cout ≪ "\nBEST ";
    if (options.verbosity≡1) { 〈buildtree::print progress 180b〉 cout≪endl; }
    // if we have got minimum error, discontinue search
    if (besterr ≡ 0) {
        cout ≪ " OPTIMAL SPLIT FOUND "; cutout = 0; }
    else cutout = options.cutout;
    made_progress = true;
}

Uses bestpred 149, clone 19a 19b, freememory 19a 19c, made_progress 158a, and options 234 235.

Comment 5.6.17. Progress printing is used all over the place. I have made this a static function to avoid clutter in buildtree.

180b 〈buildtree::print progress 180b〉≡ (174a 177–80)

printProgress(E1, E2, rwp, covered, accuracy, potential, entropy, sq_error,
              min_error, openlist.size(), cutout, tnode→largestPureSet, best, besterr);

Uses openlist 157 and printProgress 181a.


181a 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 179b 201a ⊲

static void printProgress(distribution & E1, distribution & E2,
                          std_predicate ∗ rwp,
                          double covered, double accuracy, double potential,
                          double entropy, double sq_error, double min_error,
                          uint olsize, int cutout,
                          double largestPureSet, double best, double besterr) {
    setSelector(STDOUT);
    rwp→print(); ioprint(" "); rwp→printCode(); ioprintln();
    cout ≪ "l: "; E1.print(); cout ≪ "r: "; E2.print();
    if (CLASSIFICATION_MODE ∧ DLIST_MODE) {
        cout ≪" Cover = "≪ covered ≪" PoCover = 1";
    } else if (CLASSIFICATION_MODE) {
        cout ≪ " Ap = " ≪ accuracy ≪ " Bp = " ≪ potential;
#ifdef ENTROPY
        cout ≪ " ent = " ≪ entropy;
#endif
    }
    else cout ≪ "SqError = " ≪ sq_error ≪ " MinErr = " ≪ min_error;
    cout ≪ " OL = " ≪ olsize;
    if (cutout ≥ 0) cout ≪ " Cutout = " ≪ cutout;
    cout ≪ " Tested = " ≪ tested;
    if (CLASSIFICATION_MODE) {
        cout ≪ " Best = ";
        (DLIST_MODE) ? cout ≪ largestPureSet : cout ≪ best;
    }
    else cout ≪ " Best Error = " ≪ besterr;
    setSelector(SILENT);
}

Defines:printProgress, used in chunk 180b.

Uses CLASSIFICATION_MODE 234, distribution 144, DLIST_MODE 234, ioprint 246 247a, ioprintln 246 247a,printCode 122d 123a, setSelector 246 247a, SILENT 246, std_predicate 101b, STDOUT 246, and tested 158a.

Comment 5.6.18. If learning is to be killed by the cutout mechanism, we have to first free the memory occupied by the rwp list and the remaining nodes on the open list before bailing out.

181b 〈buildtree::impose cutout mechanism 181b〉≡ (171b 186)

if (cutout ≡ 0) {
    for (uint g=kss; g≠rwlist.size(); g++) rwlist[g]→freememory();
    while (openlist.size()) {
        olnode temp=openlist.top(); temp.freememory();
        openlist.pop();
    }
    break;
}

Uses freememory 19a 19c, olnode 155b, and openlist 157.


Comment 5.6.19. A predicate is interesting and ought to be kept for further examination if it satisfies all of the following conditions:

1. it has a Bp value higher than the current best score;

2. it has a Bp value higher than the current prune value; and

3. it is rewritable, i.e., it can be further strengthened.

182a 〈buildtree::insert into openlist if interesting 182a〉≡ (171b 186)

if (CLASSIFICATION_MODE ∧ DLIST_MODE) { 〈keep if interesting::case of list learning 182b〉 }
else if (CLASSIFICATION_MODE) { 〈keep if interesting::case of tree learning 182d〉 }
else { 〈keep if interesting::case of regression 183a〉 }

Uses CLASSIFICATION_MODE 234 and DLIST_MODE 234.

182b 〈keep if interesting::case of list learning 182b〉≡ (182a)

if (E1.isXPure(0) ∨ E1.n[E1.majorclass()] ≤ tnode→largestPureSet
    ∨ E1.n[E1.majorclass()] < 2 ∨ ¬REWRITABLE) { 〈prune predicate 183b〉 }
else {
    if (options.verbosity ≥ 2) cout ≪ " KEPT.\n\n";
    int olsize = openlist.size();
    if (¬(options.ollength > 0 ∧ olsize > options.ollength)) {
        olnode temp;
        temp.predicate = rwp; temp.max_cover = covered; temp.age=++age;
        openlist.push(temp);
    }
    else { 〈junk predicate 183c〉 }
}

Uses isXPure 147c 148, majorclass 146a 148, olnode 155b, openlist 157, options 234 235, and REWRITABLE 182c.

182c 〈alkemy definitions 173b〉+≡ (158a) ⊳ 178b

#define HIGHER_THAN_PRUNE (potential >= (tnode->part.sum()*options.prune)/100)

#define HIGHER_EQ_THAN_BEST (potential >= best) && (potential >= min_best)

#define REWRITABLE (rwp->rewritable())

#define PURE (E2.n[E2.minorclass()] <= options.purity)

Defines:
HIGHER_EQ_THAN_BEST, used in chunk 182d.
HIGHER_THAN_PRUNE, used in chunk 182d.
PURE, used in chunks 178a and 182d.
REWRITABLE, used in chunks 182 and 183a.

Uses minorclass 146a 148, options 234 235, rewritable 119a, and sum 145c 148.

182d 〈keep if interesting::case of tree learning 182d〉≡ (182a)

if ((NO_PRUNING ∨ (HIGHER_THAN_PRUNE ∧ HIGHER_EQ_THAN_BEST ∧ PURE))
    ∧ REWRITABLE) {
    if (options.verbosity ≥ 2) cout ≪ " KEPT.\n\n";

    int olsize = openlist.size();
    if (¬(options.ollength > 0 ∧ olsize > options.ollength)) {
        olnode temp;
        temp.predicate = rwp;
        temp.Ap = accuracy; temp.Bp = potential; temp.age = ++age;
        openlist.push(temp);
    }
    else { 〈junk predicate 183c〉 }
}
else { 〈prune predicate 183b〉 }

Uses HIGHER_EQ_THAN_BEST 182c, HIGHER_THAN_PRUNE 182c, olnode 155b, openlist 157, options 234 235,PURE 182c, and REWRITABLE 182c.


183a 〈keep if interesting::case of regression 183a〉≡ (182a)

if (REWRITABLE ∧ min_error < besterr) {
    if (options.verbosity ≥ 2) cout ≪ " KEPT.\n\n";
    int olsize = openlist.size();
    if (¬(options.ollength > 0 ∧ olsize > options.ollength)) {
        olnode temp;
        temp.predicate = rwp; temp.error = sq_error;
        temp.min_error = min_error; temp.age = ++age;
        openlist.push(temp);
    }
    else { 〈junk predicate 183c〉 }
}
else { 〈prune predicate 183b〉 }

Uses olnode 155b, openlist 157, options 234 235, and REWRITABLE 182c.

183b 〈prune predicate 183b〉≡ (182 183a)

if (options.verbosity ≥ 2) cout ≪ " PRUNED.\n\n";
rwp→freememory();

Uses freememory 19a 19c and options 234 235.

183c 〈junk predicate 183c〉≡ (182 183a)

if (options.verbosity ≥ 2) cout ≪ " JUNKED.\n\n";
rwp→freememory();

Uses freememory 19a 19c and options 234 235.

Comment 5.6.20. If the rewrite strategy is left most, we move the marker one step further and check whether the current node can still be expanded after that. If yes, we push it back into the openlist. Otherwise, we free it. We use two different encodings for the same predicate with different marker positions to make sure the evaluation part works as usual. One optimisation I have not done is to duplicate the evaluation result for the old predicate for the new predicate in the FAP table.

183d 〈buildtree::left most consideration 183d〉≡ (171b 186)

if (options.strategy ≡ LR ∧ options.one_redex) {
    node.predicate→marker++;
    node.predicate→encoding.push_back(0);
    if (node.predicate→marker<(int)node.predicate→redexes.size())
        openlist.push(node);
    else node.freememory();
}
else node.freememory();

Uses encoding 122d, freememory 19a 19c, LR 234, marker 125a, openlist 157, options 234 235, and redexes 119a.

183e 〈buildtree::clean up openlist if interrupted 183e〉≡ (171b 186)

while (openlist.size()) {
    olnode temp = openlist.top(); temp.freememory(); openlist.pop();
}

Uses freememory 19a 19c, olnode 155b, and openlist 157.


Comment 5.6.21. We have to reassign individuals to the two leaves of the current node because this record is not kept during computation of the best predicate.

184a 〈buildtree::extend current tree if a better accuracy is obtained 184a〉≡ (171b 186)

bool extended = false;

if (CLASSIFICATION_MODE & ¬DLIST_MODE) {
    tnode→Ap = tnode→part.n[tnode→part.majorclass()];
    if (made_progress ∧ best ≤ tnode→Ap + options.margin)
        made_progress = false;
}
if (¬made_progress) cerr≪" ** No predicate better than top can be found.\n\n";
else {
    extended = true;
    cerr ≪ "\n ** Found a split.\n\n";
    if (CLASSIFICATION_MODE) tnode→Ap=best; else tnode→sq_error=besterr;

    〈buildtree::initialise the subtrees 184b〉
    E1.clear(); E2.clear();
    for (int i=0; i≠tnode→indcount; i++) {
        Individual & ind = getTrIndividual(tnode→individuals[i]);
        bool res = FAP_evaluate(tnode→bestpred,tnode→individuals[i]);
        if (res) {
            ltree→individuals.push_back(tnode→individuals[i]);
            E1.update_with(ind);
        } else {
            rtree→individuals.push_back(tnode→individuals[i]);
            E2.update_with(ind);
        }
    }
    ltree→initialise_fields(E1); rtree→initialise_fields(E2);
    tnode→ltree = ltree; tnode→rtree = rtree;
    extended = true;
    while (openlist.size()) openlist.pop();
    made_progress = false;
}

Uses bestpred 149, CLASSIFICATION_MODE 234, clear 145b, DLIST_MODE 234, FAP_evaluate 189a 190b 191b,

getTrIndividual 134 136a, indcount 149, Individual 132, initialise_fields 150b 154c, made_progress 158a,majorclass 146a 148, openlist 157, options 234 235, and update_with 147b 148.

184b 〈buildtree::initialise the subtrees 184b〉≡ (184a)

treenode_t ∗ ltree = new treenode_t;
treenode_t ∗ rtree = new treenode_t;
ltree→individuals.reserve(tnode→individuals.size());
rtree→individuals.reserve(tnode→individuals.size());


Comment 5.6.22. This function counts the size of the search space. A quick algorithm is used to calculate the size first. An exhaustive enumeration of the search space is then used to confirm the result if computationally possible.

184c 〈alkemy::private function declarations 161b〉+≡ (157) ⊳ 171a 185b ⊲

void enumerate(std_predicate & top) ;

Defines:enumerate, used in chunk 173c.



185a 〈alkemy::private functions 161c〉+≡ (158a) ⊳ 171b 186 ⊲

void alkemy::enumerate(std_predicate & top) {
#ifndef NO_GMP
    bigint size; spsize2(size);
    cout ≪ "|Search Space| <= "; size.print(); cout ≪ endl;
#endif
}

Defines:enumerate, used in chunk 173c.

Uses bigint 239, spsize2 127b 130a, and std_predicate 101b.

5.6.2 Build Decision Lists

Comment 5.6.23. This is also similar to buildtree, but we only grow the right subtrees.

185b 〈alkemy::private function declarations 161b〉+≡ (157) ⊳ 184c 194b ⊲

bool buildDL(treenode_t ∗ tnode, double min_best);

Defines:buildDL, used in chunk 164a.



186 〈alkemy::private functions 161c〉+≡ (158a) ⊳ 185a 194c ⊲

bool alkemy::buildDL(treenode_t ∗ tnode, double min_best) {
    cout ≪ "Splitting...\n";
    assert(CLASSIFICATION_MODE);

    if (tnode→ltree ∧ tnode→rtree ∧ tnode→rigid) {
        assert(tnode→rtree);
        buildDL(tnode→rtree, min_best);
        return true;
    }
    〈buildtree::return if nothing can be done 172a〉
    〈buildtree::initialise the book-keeping data structures 172b〉
    〈buildtree::initialise the openlist 173a〉

    interrupted = false;
    while (openlist.size() ∧ ¬interrupted) {
        〈buildtree::get the rewrites for the current olnode 173c〉
        for (unsigned int kss=0; kss ≠ rwlist.size(); kss++) {
            std_predicate ∗ rwp = rwlist[kss];
            〈buildtree::construct predicate and test it 174a〉
            〈buildtree::impose cutout mechanism 181b〉
            〈buildtree::insert into openlist if interesting 182a〉
        }
        〈buildtree::left most consideration 183d〉
    }
    〈buildtree::clean up openlist if interrupted 183e〉
    〈buildtree::extend current tree if a better accuracy is obtained 184a〉
    if (extended ≡ false) return false;

    rewrite_refresh();
    if (options.pos_only ≡ false ∧
        feq(dctree→calcAP(),dctree→part.sum())) {
        cerr ≪ "We got a perfect classifier.\n"; return true;
    }
    if (¬interrupted ∧ options.stump ≡ false) {
        buildDL(tnode→rtree, 0); rewrite_refresh();
    }
    return true;
}

Defines:buildDL, used in chunk 164a.

Uses calcAP 152b 154c, CLASSIFICATION_MODE 234, dctree 157, feq 232, openlist 157, options 234 235,rewrite_refresh 126b 127a, std_predicate 101b, sum 145c 148, and treenode_t 149.

5.7 Predicate Evaluation via Table Lookup

Comment 5.7.1. To compute the decision stump induced by a predicate q on a set of individuals E, we need to know the values of (q x) for each x ∈ E. The following pseudo-code shows the most straightforward way to perform this operation.

for individual 0 to N
    evaluate q on current individual
    assign individual to the correct part of the tree.

This simple algorithm does more work than strictly necessary. A minor modification can make it go faster. The trick is to realise that it is sometimes not necessary to perform the expensive operation of evaluating predicate q on individual x to find out the value of (q x). What we need is a little memory about what we have done previously.

Consider a predicate p and one of its children q. To compute the decision stump induced by q, we do not have to compute (q x) for all individuals, only those that have been evaluated to true by p. All the individuals that have been evaluated to false by p can be assigned to the right subtree straight away without evaluation. We can do that because we know that q is a child of p, hence it must be stronger than p. That is, (p x = false) =⇒ (q x = false).
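The saving can be made concrete with a small sketch. It is an illustration under simplifying assumptions, not the FAP_evaluate code: individuals are held in a plain vector, the parent's results are given as a vector of booleans, and evaluateChild is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Evaluate a child predicate q, given the cached results of its parent p.
// parentTrue[i] holds (p x_i). Since q is stronger than p, (p x_i) = false
// implies (q x_i) = false, so q is only called where the parent was true.
// `evaluations` reports how many calls to q were actually made.
template <class T, class Pred>
std::vector<bool> evaluateChild(const std::vector<T>& xs,
                                const std::vector<bool>& parentTrue,
                                Pred q, std::size_t& evaluations) {
    assert(xs.size() == parentTrue.size());
    std::vector<bool> result(xs.size(), false);
    evaluations = 0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        if (!parentTrue[i]) continue;  // goes to the right subtree without evaluation
        result[i] = q(xs[i]);
        ++evaluations;
    }
    return result;
}
```

If p holds for half of the individuals, only that half is passed to q; the other half is assigned to the right subtree at the cost of a table read.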

A few comments on the space complexity are in order here. If the training set is large, we may run into memory utilisation problems (again). A method that can alleviate this problem is to store only the smaller subtree of a predicate on the open-list, and to compute the larger subtree at run-time. This can be done because we always know the complete set of training examples at a particular node in the decision tree. This is another typical instance of a space-time complexity trade-off.

How much speed are we likely to gain from this new algorithm? The short answer is, unfortunately, not very much, but sufficient to make an impact in large applications. The predicate search algorithm considers only those predicates with high Bp values, and by definition, predicates with high Bp values are those with large left subtrees.

We can generalise the idea above for decision trees. If pruning does not take place, and the search space is the same across decision nodes, every predicate in the search space needs to be evaluated on the training examples exactly once to generate a decision tree. All we need to do is to maintain a |E| × P table of boolean values, where |E| is the size of the training set, and P is the total number of regular predicates in the search space. Entry (i, j) in the table is true iff predicate P[j] evaluates to true for individual i, false otherwise. This table can be constructed during the computation of the root node of the decision tree. This is where the one evaluation on the training examples needs to be done for the predicates. Subsequent predicate evaluations in other parts of the decision tree involve only table lookup.

The memory consumption of this algorithm is potentially huge. The size of the table can be reduced to approximately |E|/2 × P if we store only the smaller subtree induced by each predicate. This is little comfort, of course, since P is usually the (much) larger number. In fact, for infinite search spaces, the table is correspondingly infinite in size.

Obviously, some kind of trade-off is necessary if we are to make use of this simple optimisation technique in our algorithm. We do not need to opt for complete memory, of course. Fixing the size of the table to some (manageable) finite dimension, this method becomes a caching solution; and if some form of locality can be observed, a fair number of computing cycles can still be saved.

An observation about the locality of evaluation is that predicates higher up in the search space have a higher probability of being evaluated across different nodes than those lower down in the tree, because of pruning. This is good news because a typical run of the learning system covers widely higher up in the search space and only narrowly lower down in the search space. By incorporating this table optimisation technique directly into the system, we would have a reasonably-sized table with a good percentage of frequently evaluated predicates after the computation of the root node. The not-so-good news is that locality of evaluation across decision nodes seldom if ever applies to the predicates low down in the search space we store in the table during the computation of the root node; and predicates low down in the search space are exactly those predicates most expensive to compute.

Fortunately, there is something we can do to make things go slightly faster even when a predicate is not in the table, again using the information that a descendant always implies its parent. To make this work, we need a fast way of determining whether two predicates p and q lie on the same branch, and which comes before the other. The following encoding scheme springs to mind immediately: every predicate is encoded using a vector of integers. The root node of the search tree is encoded with the 1-dimensional vector [0]. The y-th child q of a predicate p is encoded with the (n + 1)-dimensional vector [x, y], where [x] is the n-dimensional vector encoding for p. Using this encoding scheme, checking whether q is a descendant of p is a simple matter of checking whether the encoding of p is a prefix of the encoding of q.
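The prefix test described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the names Encoding, is_descendant and child are hypothetical and do not appear in the Alkemy source, and a predicate is not counted as its own descendant.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical alias: the path of child indices from the root [0].
using Encoding = std::vector<int>;

// True iff q lies strictly below p on the same branch, i.e. the
// encoding of p is a proper prefix of the encoding of q.
bool is_descendant(const Encoding& q, const Encoding& p) {
    return p.size() < q.size() &&
           std::equal(p.begin(), p.end(), q.begin());
}

// Encoding of the y-th child of the predicate encoded by p.
Encoding child(Encoding p, int y) {
    p.push_back(y);
    return p;
}
```

With root = [0], child(root, 2) gives [0, 2], and is_descendant(child(child(root, 2), 1), root) holds, whereas two siblings are never descendants of one another.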


Armed with the above, we can now give a new algorithm for computing partitions induced by predicates. Given a training set E and a predicate p, if p is in the frequently-accessed predicate (FAP) table, then look up the value of (p x) for each x ∈ E. If (p x) is undefined, compute (p x) and update the corresponding entry in the table. Assign x to the corresponding subtree. Otherwise, if p is not in the table, find the closest ancestor q of p in the FAP table. For every x ∈ E, if x is in the right subtree of q, assign x to the right subtree of p. Otherwise, perform the evaluation of (p x) and assign x to the correct subtree accordingly. Finally, if the FAP table is not full and the dimension of the vector encoding for p is less than some constant, store p in the table.

It is sometimes useful to restrict the storing of predicates to those evaluated in the root node only, since entries in the FAP table can only benefit the children of the node where the predicate is evaluated; and predicates evaluated in the root node are the only universally useful predicates.

Figure 5.5 shows the algorithm.

function Evaluate(p, ε) returns the value of (p ε);
    inputs: p, a predicate;
            ε, an example;

    if ptable[p][ε] ≠ undefined then return ptable[p][ε];
    q = ancestor(p, ptable);
    if ptable[q][ε] = false
        then ptable[p][ε] = false;
        else ptable[p][ε] = (p ε);
    return ptable[p][ε];

Figure 5.5: Predicate evaluation algorithm

To make the algorithm work, we need to find a way to encode the predicates so that they can be easily indexed in the ptable and the ancestor function can be implemented efficiently. The following is a simple scheme: every predicate is encoded using a vector of integers. The root node of the search tree is encoded with the 1-dimensional vector [0]. The y-th child q of a predicate p is encoded with the (n + 1)-dimensional vector [x, y], where [x] is the n-dimensional vector encoding for p. Using this encoding scheme, checking whether q is a descendant of p is a simple matter of checking whether the encoding of p is a prefix of the encoding of q. Predicate indexing in ptable can be done in O(1) time by defining a bijection f from Z^n to Z, where n is the maximum dimension of the encodings of the predicates. (Predicates with smaller encoding vectors are padded with zeros at the end.)

Assuming the data structure ptable can fit in memory, we have the following result.

Proposition 5.7.2. Every predicate evaluation (p e) is computed at most once by the function Evaluate during learning.

Proof. Consider the first time the function Evaluate is called with a predicate p and an example e, say at time t1. If p is not top, then the entry ptable[p][e] is necessarily undefined. (Otherwise, (p e) is true and we return straightaway.) Let q be the closest ancestor of p which has at least one defined entry. (We know q must exist because every node except the root of the tree has a parent, and the ancestor function is never called on top because its every entry has been initialised to true.) There are two cases to consider. If (q e) is false, then we know the value of (p e) without having to do any computation. Otherwise, the value (p e) is computed and recorded in ptable.

Consider now the n-th time, n > 1, the function is called with the same pair p and e. The entry ptable[p][e] is necessarily defined after t1, and the function returns the value of (p e) without having to do any computation.
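The argument of Proposition 5.7.2 can be illustrated with a small stand-alone sketch of Figure 5.5. It is not the Alkemy implementation: PTable, Oracle, and the constants kFalse, kTrue and kUnknown are hypothetical names, the table is a std::map rather than a hash map, and rows are created on first access. The counter evaluations records how often the expensive evaluation is actually invoked.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

using Encoding = std::vector<int>;
// Stand-in for an expensive evaluation of predicate p on example e.
using Oracle = std::function<bool(const Encoding&, std::size_t)>;

constexpr char kFalse = 0, kTrue = 1, kUnknown = 3;

struct PTable {
    std::size_t n_examples;
    std::map<Encoding, std::vector<char>> rows;
    int evaluations = 0;  // number of calls to the oracle

    explicit PTable(std::size_t n) : n_examples(n) {
        // Every entry of top (encoded [0]) is initialised to true.
        rows[Encoding{0}] = std::vector<char>(n, kTrue);
    }

    // Row of the closest proper ancestor of p present in the table.
    const std::vector<char>& ancestor(Encoding p) const {
        while (p.size() > 1) {
            p.pop_back();
            auto it = rows.find(p);
            if (it != rows.end()) return it->second;
        }
        return rows.at(Encoding{0});
    }

    bool evaluate(const Encoding& p, std::size_t e, const Oracle& expensive) {
        auto it = rows.find(p);
        if (it == rows.end())
            it = rows.emplace(p, std::vector<char>(n_examples, kUnknown)).first;
        char& cell = it->second[e];
        if (cell != kUnknown) return cell == kTrue;  // already recorded
        if (ancestor(p)[e] == kFalse) {
            cell = kFalse;  // a descendant implies its ancestor
        } else {
            ++evaluations;
            cell = expensive(p, e) ? kTrue : kFalse;
        }
        return cell == kTrue;
    }
};
```

Calling evaluate twice with the same pair (p, e) increments evaluations at most once, and a false entry for an ancestor settles the result for its descendants without any call to the oracle.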


189a 〈pred-evaluation.h 189a〉≡
extern uint FAPtable_size();
extern void initialise_FAPtable(vector<int> x, vector<char> y);
extern bool FAP_evaluate(std_predicate * rwp, int ind);
extern bool eval(term_schema * pred, term_schema * ind);
extern void FAP_evaluate(term_schema * pred, std_predicate * rwp,
                         distribution & E1, distribution & E2,
                         treenode_t * tnode, treenode_t * dctree);

Defines:
eval, used in chunks 153b, 154a, 190b, 192, and 194a.
FAP_evaluate, used in chunks 174b and 184a.
FAPtable_size, used in chunk 165a.
initialise_FAPtable, used in chunk 173a.
pred-evaluation.h, used in chunks 153b and 158a.

Uses dctree 157, distribution 144, std_predicate 101b, term_schema 9, and treenode_t 149.

189b 〈pred-evaluation.cc 189b〉≡ 189d ⊲

#include "terms.h"



#include <sys/time.h>

〈pred-evaluation::hash function 189c〉

#ifdef gcc31

#include <ext/hash_map>
static __gnu_cxx::hash_map<vector<int>, vector<char>, HashVector> FAPtable;
#else
#include <hash_map>
static hash_map<vector<int>, vector<char>, HashVector> FAPtable;
#endif

#define UNKNOWN 3

Defines:
UNKNOWN, used in chunk 192.

Uses HashVector 189c, rewrite.h 131b, terms.h 9, and tree-dstructs.h 155a.

Comment 5.7.3. This is the hash function used to map predicates to their indices in the map data structure.

189c 〈pred-evaluation::hash function 189c〉≡ (189b)

class HashVector {
public:
    size_t operator()(vector<int> const in) const {
        size_t ret = 0;
        for (uint i=0; i!=in.size(); i++) ret += (i+1)*(in[i]+1);
        return ret;
    }
};

Defines:
HashVector, used in chunks 189b and 191a.
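The hash above is a position-weighted sum, so distinct encodings can share a hash value; this is harmless for a hash map, which resolves collisions by comparing keys, but it does affect how evenly predicates spread over buckets. The sketch below restates the same formula in a stand-alone form (the name HashVectorSketch is hypothetical, and the argument is taken by const reference rather than by value) so the behaviour can be checked on small inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same formula as HashVector: sum over i of (i+1) * (in[i]+1).
struct HashVectorSketch {
    std::size_t operator()(const std::vector<int>& in) const {
        std::size_t ret = 0;
        for (std::size_t i = 0; i != in.size(); ++i)
            ret += (i + 1) * (in[i] + 1);
        return ret;
    }
};
```

For instance, the encodings [0, 0, 1] and [0, 3] both hash to 9 (1 + 2 + 6 and 1 + 8 respectively), while the root [0] hashes to 1.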

189d 〈pred-evaluation.cc 189b〉+≡ ⊳ 189b 190a ⊲

uint FAPtable_size() { return FAPtable.size(); }
void initialise_FAPtable(vector<int> x, vector<char> y) { FAPtable[x] = y; }

Defines:
FAPtable_size, used in chunk 165a.
initialise_FAPtable, used in chunk 173a.


Comment 5.7.4. This is where an actual predicate evaluation is done. Given a query, we repeatedly simplify it using reduce until nothing more can be done. If the end result is the data constructor True or False, we return the corresponding boolean; any other irreducible result is reported as an error. The variable tried is the total number of redexes tried throughout the computation.

190a 〈pred-evaluation.cc 189b〉+≡ ⊳ 189d 190b ⊲

bool eval(term_schema * pred, term_schema * ind) {
    bool ret = true;
    term_schema * query = new_term(APP);
    query->insert(pred->clone()); query->insert(ind->clone());

    setSelector(SILENT);
    int tried = 0; int changed = 0; bool reduced = true;
    while (reduced) {
        query->labelVariables(changed);
        reduced = query->reduce(NULL, 0, query, tried, changed);
        if (query->tag == D) break;
    }
    if (query->isD("True")) ret = true;
    else if (query->isD("False")) ret = false;
    else {
        setSelector(STDERR);
        ioprint("Irreducible predicate : ");
        query->print(); ioprintln("\n");
        assert(false);
    }
    query->freememory();
    return ret;
}

Defines:
eval, used in chunks 153b, 154a, 190b, 192, and 194a.

Uses clone 19a 19b, freememory 19a 19c, insert 11d, ioprint 246 247a, ioprintln 246 247a, isD 11a, labelVariables 22d 22e 23a, new_term 17b, reduce 58e 59a, setSelector 246 247a, SILENT 246, STDERR 246, tag 10c, and term_schema 9.

Comment 5.7.5. This is the procedure used to compute the evaluation of a predicate p on one particular example x. We first look through the FAP table to see if (p x) has been computed before. If so, we return the result straightaway; otherwise, we call Escher to compute the result.

190b 〈pred-evaluation.cc 189b〉+≡ ⊳ 190a 191b ⊲

bool FAP_evaluate(std_predicate * rwp, int index) {
    〈hash map iterator 191a〉
    mp = FAPtable.find(rwp->encoding);
    if (mp != FAPtable.end()) {
        if (FAPtable[rwp->encoding][index] == 1) return true;
        else if (FAPtable[rwp->encoding][index] == 0) return false;
    }
    term_schema * pred = rwp->makeTerm();
    bool ret = eval(pred, getTrIndividual(index).individual);
    pred->freememory();
    return ret;
}

Defines:
FAP_evaluate, used in chunks 174b and 184a.

Uses encoding 122d, eval 189a 190a, freememory 19a 19c, getTrIndividual 134 136a, makeTerm 103b 105b, std_predicate 101b, and term_schema 9.


191a 〈hash map iterator 191a〉≡ (190b 191b)

#ifdef gcc31
__gnu_cxx::hash_map<vector<int>, vector<char>, HashVector>::iterator mp;
#else
hash_map<vector<int>, vector<char>, HashVector>::iterator mp;
#endif

Uses HashVector 189c.

Comment 5.7.6. This is the same function that works on a set of examples. The overall structure is very simple; the code is just cluttered up with timing instructions.

191b 〈pred-evaluation.cc 189b〉+≡ ⊳ 190b

static double max_plain = 0;
static double max_lookup = 0;
bool make_statement = false;
void FAP_evaluate(term_schema * pred, std_predicate * rwp, distribution & E1,
                  distribution & E2, treenode_t * tnode, treenode_t * dctree) {
    E1.clear(); E2.clear();
    〈FAP evaluate::switch to plain algorithm if desirable 193b〉

    〈hash map iterator 191a〉
    struct timeval * lookup1_t1, * lookup1_t2, * lookup2_t1, * lookup2_t2;
    struct timeval * plain_t1, * plain_t2;

    lookup1_t1 = (struct timeval *)malloc(sizeof(struct timeval));
    lookup1_t2 = (struct timeval *)malloc(sizeof(struct timeval));
    gettimeofday(lookup1_t1, NULL);
    mp = FAPtable.find(rwp->encoding);
    gettimeofday(lookup1_t2, NULL);

    if (mp != FAPtable.end()) {
        〈in table::time calculation 1 193c〉
        〈case of predicate is in FAPtable 192a〉
        〈in table::time calculation 2 193d〉
    } else {
        〈not in table::time calculation 1 193e〉
        〈case of predicate not in FAPtable 192b〉
        〈not in table::time calculation 2 193f〉
    }
}

Defines:
FAP_evaluate, used in chunks 174b and 184a.
make_statement, used in chunk 193b.
max_lookup, used in chunk 193.
max_plain, used in chunk 193.

Uses clear 145b, dctree 157, distribution 144, encoding 122d, std_predicate 101b, term_schema 9, and treenode_t 149.


Comment 5.7.7. There are two cases to consider when a given predicate has an entry in the FAP table. Individual entries for the predicate can have one of three values: true, false or undefined. If the value is undefined, we have to perform the evaluation and record the result. Otherwise, we just use the boolean information recorded.

192a 〈case of predicate is in FAPtable 192a〉≡ (191b)

for (int i=0; i!=tnode->indcount; i++) {
    int index = tnode->individuals[i];
    Individual & ind = getTrIndividual(index);
    int rsize = FAPtable[rwp->encoding].size();
    if (rsize <= index)
        for (int j=0; j!=index - rsize + 101; j++)
            FAPtable[rwp->encoding].push_back(UNKNOWN);
    assert(index < (int)FAPtable[rwp->encoding].size());
    if (FAPtable[rwp->encoding][index] == UNKNOWN) {
        bool result = eval(pred, ind.individual);
        if (result) FAPtable[rwp->encoding][index] = 1;
        else FAPtable[rwp->encoding][index] = 0;
    }
    if (FAPtable[rwp->encoding][index] == 1) E1.update_with(ind);
    else E2.update_with(ind);
}

Uses encoding 122d, eval 189a 190a, getTrIndividual 134 136a, indcount 149, Individual 132, UNKNOWN 189b, and update_with 147b 148.

192b 〈case of predicate not in FAPtable 192b〉≡ (191b)

vector<int> ancestor = rwp->encoding;
ancestor.pop_back();
while (ancestor.size()) {
    if (FAPtable.find(ancestor) != FAPtable.end()) break;
    ancestor.pop_back();
}
vector<char> results = FAPtable[ancestor];
vector<char> newresults;
newresults.reserve(getTrSize());
for (unsigned int i=0; i!=getTrSize(); i++) newresults.push_back(UNKNOWN);

for (int i=0; i!=tnode->indcount; i++) {
    int index = tnode->individuals[i];
    Individual & ind = getTrIndividual(index);
    if (results[index] == 0) { // these are on the right
        E2.update_with(ind); newresults[index] = 0;
    } else { // these are on the left subtree
        bool result = eval(pred, ind.individual);
        if (result) { E1.update_with(ind); newresults[index] = 1; }
        else { E2.update_with(ind); newresults[index] = 0; }
    }
}
〈FAP table::store if there is more space 193a〉

Uses encoding 122d, eval 189a 190a, getTrIndividual 134 136a, getTrSize 134 140b, indcount 149, Individual 132, UNKNOWN 189b, and update_with 147b 148.


193a 〈FAP table::store if there is more space 193a〉≡ (192b)

int FAPtable_length = options.FAPtable_length;
int FAPtable_elength = options.FAPtable_entry_length;
if ((tnode == dctree) && // if we are in the root node of the decision tree
    (FAPtable_length < 0 || (int)FAPtable.size() <= FAPtable_length) &&
    (FAPtable_elength < 0 || (int)rwp->encoding.size() <= FAPtable_elength))
    FAPtable[rwp->encoding] = newresults;

Uses dctree 157, encoding 122d, and options 234 235.

Comment 5.7.8. If the FAP table becomes too big and the time it takes to access it is longer than the actual Escher computation, then we just switch back to using the plain vanilla algorithm.

193b 〈FAP evaluate::switch to plain algorithm if desirable 193b〉≡ (191b)

if (max_lookup > max_plain) {
    if (!make_statement) {
        cout << "\n *** Switching to plain evaluation algorithm.";
        cout << "\t" << max_lookup << " " << max_plain << "\n\n";
        make_statement = true;
    }
    〈old boring evaluation algorithm 194a〉
    return;
}

Uses make_statement 191b, max_lookup 191b, and max_plain 191b.

193c 〈in table::time calculation 1 193c〉≡ (191b)

lookup2_t1 = (struct timeval *)malloc(sizeof(struct timeval));
lookup2_t2 = (struct timeval *)malloc(sizeof(struct timeval));
gettimeofday(lookup2_t1, NULL);

193d 〈in table::time calculation 2 193d〉≡ (191b)

gettimeofday(lookup2_t2, NULL);
double time2 = (double)(lookup1_t2->tv_sec + lookup2_t2->tv_sec -
                        lookup1_t1->tv_sec - lookup2_t1->tv_sec) +
               (double) 0.000001 * (lookup1_t2->tv_usec + lookup2_t2->tv_usec
                                    - lookup1_t1->tv_usec - lookup2_t1->tv_usec);
if (time2 > max_lookup) max_lookup = time2;
free(lookup1_t1); free(lookup1_t2); free(lookup2_t1); free(lookup2_t2);

Uses free 26a and max_lookup 191b.

193e 〈not in table::time calculation 1 193e〉≡ (191b)

plain_t1 = (struct timeval *)malloc(sizeof(struct timeval));
plain_t2 = (struct timeval *)malloc(sizeof(struct timeval));
gettimeofday(plain_t1, NULL);

193f 〈not in table::time calculation 2 193f〉≡ (191b)

gettimeofday(plain_t2, NULL);
double time3 = (double) (plain_t2->tv_sec - plain_t1->tv_sec)
             + (double) (plain_t2->tv_usec - plain_t1->tv_usec) * 0.000001;
if (time3 > max_plain) max_plain = time3;
free(plain_t1); free(plain_t2);

Uses free 26a and max_plain 191b.


Comment 5.7.9. This is the old evaluation algorithm.

194a 〈old boring evaluation algorithm 194a〉≡ (193b)

for (int i=0; i!=tnode->indcount; i++) {
    Individual & ind = getTrIndividual(tnode->individuals[i]);
    eval(pred, ind.individual) ? E1.update_with(ind) : E2.update_with(ind);
}

Uses eval 189a 190a, getTrIndividual 134 136a, indcount 149, Individual 132, and update_with 147b 148.

5.8 Error Complexity Pruning

Comment 5.8.1. This procedure implements the error complexity pruning mechanism proposed in [BFOS84, chap. 3]. The algorithm is made up of two main steps. In the first step, a sequence of pruned trees is computed by iteratively computing the error-complexity value of each subtree and pruning the weakest link in the current tree. In the second step, we select the most promising tree from the sequence and return that as the final hypothesis.

194b 〈alkemy::private function declarations 161b〉+≡ (157) ⊳ 185b

bool error_complexity_pruning();

Defines:
error_complexity_pruning, used in chunk 164a.

194c 〈alkemy::private functions 161c〉+≡ (158a) ⊳ 186

bool alkemy::error_complexity_pruning() {
    〈error comp pruning::special cases 194d〉
    dctree->unpruned();
    〈error comp pruning::compute pruned tree sequence 195〉
    〈error comp pruning::select best-pruned tree 196a〉
    return true;
}

Defines:
error_complexity_pruning, used in chunk 164a.

Uses dctree 157 and unpruned 154b.

194d 〈error comp pruning::special cases 194d〉≡ (194c)

if (getTrValidSize() <= 0) {
    cout << "\n*** Tree post-pruning wasn't performed; validation"
            " set is empty.\n\n";
    return false;
}
if ((dctree->ltree == NULL && dctree->rtree == NULL) ||
    (dctree->ltree->isterminal() && dctree->rtree->isterminal())) {
    cout << "\n*** Tree post-pruning wasn't performed; the induced"
            " tree is too simple.\n\n";
    return false;
}

Uses dctree 157, getTrValidSize 134 141, and isterminal 151c.

Comment 5.8.2. On each iteration of the first step, the value-cost tradeoff, i.e., the alpha value of each (proper) subtree, is calculated. This is achieved using the recursive procedure call calcAlpha on the two children of the root node of the decision tree. Given the original tree Torg, the value of a subtree T is simply the reduction in classification error induced by the subtree as compared to Torg − T. The cost of a subtree is given by the number of terminal nodes it has. Obviously, good subtrees are low-complexity trees that give significant reduction in the misclassification rate. The exact formulas for calculating these are given below.

Having computed the “goodness” of each subtree, we go on to construct a sequence of pruned trees, where the first tree is the original induced tree. Tree i in the sequence is the tree constructed from tree i − 1 by removing from it the weakest link, defined as the subtree with the lowest reduction in error per leaf, the alpha. The last tree in the sequence is a single-node tree.

To construct the sequence of candidate trees, we use a clone of the previous tree in the sequence to construct the new one. Tree pruning is done using the function call cut-weakest-link on the root node. The function returns false if the tree has only one non-terminal node and cannot be pruned. Otherwise, it gets rid of the weakest link and returns true. A cleverer scheme could probably make do with only one tree, by using markers of some kind on the nodes to do virtual tree pruning. This adds unnecessary complication to the code and is not explored further here.

In the following, we maintain two sequences of trees. The first records tree information with respect to the training set; the second records tree information with respect to the validation set.

195 〈error comp pruning::compute pruned tree sequence 195〉≡ (194c)

vector<treenode_t *> ctrees, ctrees_valid;
dctree->calcAlpha();
ctrees.push_back(dctree->clone());
ctrees_valid.push_back(dctree->clone());
bool success;
treenode_t * ctree = dctree->clone();
while (true) {
    if (ctree->ltree) ctree->ltree->calcAlpha();
    if (ctree->rtree) ctree->rtree->calcAlpha();
    success = ctree->cut_weakest_link();
    if (success == false) { ctree->freememory(); break; }
    ctrees.push_back(ctree);
    ctrees_valid.push_back(ctree->clone());
    treenode_t * temp = ctree->clone();
    ctree = temp;
}
assert(ctrees.size() > 1 && (ctrees.size() == ctrees_valid.size()));

Uses calcAlpha 198a 198b, clone 19a 19b, cut_weakest_link 199a 199b, dctree 157, freememory 19a 19c, and treenode_t 149.


Comment 5.8.3. The second and final step in error complexity pruning is to compute the accuracy of the candidate trees in the sequence on an independent validation set (the default set is the partition immediately after the test partition), and to choose the most promising one. Note that the most promising one is not necessarily the one that achieves the highest accuracy on the validation set. Breiman et al. proposed choosing the smallest tree with a misclassification rate within one standard error of the minimum, the so-called 1 SE rule. The justification for this can be found in [BFOS84, sec 3.4.3]. Here, we simply choose the one with the highest accuracy.

The following code chunk is adapted from Compute and display decision tree on test set.

196a 〈error comp pruning::select best-pruned tree 196a〉≡ (194c)

double best = 0; int besttree = -5;
if (getTrValidSize() > 0) {
    if (CLASSIFICATION_MODE) {
        〈error comp pruning::compute best tree::classification 196b〉
    } else {
        〈error comp pruning::compute best tree::regression 197a〉
    }
}
assert(besttree != -5);
dctree->freememory(); dctree = ctrees[besttree]->clone();
cout << "*** The best tree is candidate " << besttree << ".\n\n";
〈error comp pruning::clean up the candidate trees 197b〉

Uses CLASSIFICATION_MODE 234, clone 19a 19b, dctree 157, freememory 19a 19c, and getTrValidSize 134 141.

196b 〈error comp pruning::compute best tree::classification 196b〉≡ (196a)

for (unsigned int i=0; i!=ctrees_valid.size(); i++) {
    cout << "candidate " << i << endl;
    ctrees_valid[i]->print(0);
    cout << "Acc (train) = " << ctrees_valid[i]->calcAP() << "/";
    cout << ctrees_valid[i]->part.sumInt() << endl << endl;
    ctrees_valid[i]->clear_fields();

    int k = getVlStartIndex();
    while (k != -5) {
        ctrees_valid[i]->individuals.push_back(k);
        int classindex = getClassIndex(getTrIndividual(k).label.clabel);
        ctrees_valid[i]->part.n[classindex]++;
        k = getTrIndividual(k).ptr_vl;
    }
    ctrees_valid[i]->indcount = ctrees_valid[i]->individuals.size();
    ctrees_valid[i]->evaluate();
    ctrees_valid[i]->print(0);
    double acc = ctrees_valid[i]->calcAP();
    cout << "Acc (valid) = " << acc << "/";
    cout << ctrees_valid[i]->part.sumInt() << endl << endl;
    if (acc >= best) { best = acc; besttree = i; }
}

Uses calcAP 152b 154c, clear_fields 150c 154c, evaluate 153b 154c, getClassIndex 134 136c, getTrIndividual 134 136a, getVlStartIndex 134 139b, indcount 149, label 21, and sumInt 145c 148.
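The selection rule used for classification can be isolated from the bookkeeping, which also makes it easy to compare with the 1 SE rule mentioned in Comment 5.8.3. The sketch below is illustrative only: the function names are hypothetical, candidates are assumed to be ordered from the largest tree to the smallest (as in the pruned sequence), and the standard error is assumed to take the usual binomial form sqrt(e(1 − e)/n) for a validation set of size n, which the text above does not spell out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Highest validation accuracy wins; because the comparison is >=,
// ties go to the later (more heavily pruned) candidate.
int select_most_accurate(const std::vector<double>& acc) {
    double best = 0;
    int besttree = -1;
    for (std::size_t i = 0; i != acc.size(); ++i)
        if (acc[i] >= best) { best = acc[i]; besttree = static_cast<int>(i); }
    return besttree;
}

// 1 SE rule (assumed form): the smallest tree whose validation error
// is within one standard error of the minimum error.
int select_one_se(const std::vector<double>& err, std::size_t n) {
    assert(!err.empty() && n > 0);
    double emin = err[0];
    for (double e : err) if (e < emin) emin = e;
    double threshold = emin + std::sqrt(emin * (1 - emin) / n);
    for (std::size_t i = err.size(); i-- > 0;)   // smallest tree first
        if (err[i] <= threshold) return static_cast<int>(i);
    return -1;  // not reached: the minimiser always meets the bound
}
```

On accuracies 0.8, 0.9, 0.9, 0.7 the first rule picks candidate 2 rather than 1 because of the tie-break. On errors 0.20, 0.10, 0.12, 0.30 with n = 100 the 1 SE threshold is about 0.13, so the second rule also picks candidate 2 even though candidate 1 has the lower error.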


197a 〈error comp pruning::compute best tree::regression 197a〉≡ (196a)

best = 1000;
for (unsigned int i=0; i!=ctrees_valid.size(); i++) {
    cout << "candidate " << i << endl;
    ctrees_valid[i]->print(0);
    cout << "SqError (train) = " << ctrees_valid[i]->calcError() << "\n\n";
    ctrees_valid[i]->clear_fields();

    int k = getVlStartIndex();
    while (k != -5) {
        ctrees_valid[i]->individuals.push_back(k);
        ctrees_valid[i]->part.update_with(getTrIndividual(k));
        k = getTrIndividual(k).ptr_vl;
    }
    ctrees_valid[i]->indcount = ctrees_valid[i]->individuals.size();
    ctrees_valid[i]->evaluate();
    ctrees_valid[i]->print(0);
    double err = ctrees_valid[i]->calcError();
    cout << "SqError (valid) = " << err << "\n\n";
    if (err <= best) { best = err; besttree = i; }
}

Uses calcError 153a 154c, clear_fields 150c 154c, evaluate 153b 154c, getTrIndividual 134 136a, getVlStartIndex 134 139b, indcount 149, and update_with 147b 148.

197b 〈error comp pruning::clean up the candidate trees 197b〉≡ (196a)

assert(ctrees.size() == ctrees_valid.size());
for (unsigned int i=0; i!=ctrees.size(); i++) {
    ctrees_valid[i]->freememory(); ctrees[i]->freememory();
}

Uses freememory 19a 19c.

Comment 5.8.4. This procedure recursively calculates the value-complexity tradeoff of each subtree rooted at the current node. The values computed are recorded at the root of each subtree, so every non-terminal node should have an alpha value by the end of this call. An estimate of the error rate r(t) of a terminal node t is the proportion of instances misclassified at t, given by

\[ r(t) = 1 - \max_i (n_i / n_t), \]

where $n_i$ is the number of instances in node t that are in class i, and $n_t$ is the total number of instances in node t. The error cost of node t, R(t), is simply the product of r(t) and the proportion of instances at node t.

Given a subtree T rooted at t, the cost of the tree is given by

\[ v(T) = R(T) + \alpha N_T, \]

where $R(T) = \sum_{t'} R(t')$ with t' ranging over the terminal nodes of T, $N_T$ is the total number of terminal nodes in T, and α is the complexity penalty of a node. If the subtree is pruned, the root of the tree would become a terminal node and the cost would become

\[ v(t) = R(t) + \alpha. \]

Now, v(T) and v(t) are equal when

\[ \alpha = \frac{R(t) - R(T)}{N_T - 1}. \]

In a way, α gives a measure of the value of a subtree by computing the reduction in error per leaf. Obviously, the lower α is, the less value a subtree has.
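As a quick numerical check of the break-even formula, the sketch below computes α for given R(t), R(T) and N_T and evaluates the two costs v(T) and v(t). The function names are hypothetical and chosen for this illustration only.

```cpp
#include <cassert>

// Break-even complexity penalty: alpha = (R(t) - R(T)) / (N_T - 1).
double break_even_alpha(double Rt, double RT, int NT) {
    assert(NT > 1);  // a non-terminal node has at least two leaves below it
    return (Rt - RT) / (NT - 1);
}

// Cost of keeping the subtree: v(T) = R(T) + alpha * N_T.
double cost_subtree(double RT, double alpha, int NT) { return RT + alpha * NT; }

// Cost after pruning to a single leaf: v(t) = R(t) + alpha.
double cost_pruned(double Rt, double alpha) { return Rt + alpha; }
```

For example, with R(t) = 0.5, R(T) = 0.25 and N_T = 3, the break-even value is α = 0.125, and both costs equal 0.625: the subtree buys a reduction in error of 0.125 per additional leaf.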


198a 〈struct treenode functions 151c〉+≡ (149) ⊳ 154b 199a ⊲

void calcAlpha();
double getMisclassified();
int getTNodeCount();

Defines:
calcAlpha, used in chunk 195.
getMisclassified, used in chunk 198b.
getTNodeCount, used in chunk 198b.


void treenode_t::calcAlpha() {
    if (isterminal()) return;

    int trsize = getTrSize(); double RT = 0, Rt = 0;

    if (CLASSIFICATION_MODE) {
        RT = getMisclassified() / (double)trsize;
        Rt = (part.sumInt() - part.n[majorityclass]) / (double)trsize;
    } else {
        RT = calcError();
        Rt = part.computeSqError(part.computeAverage());
    }
    alpha = (Rt - RT) / (double)(getTNodeCount() - 1);
    if (ltree) ltree->calcAlpha();
    if (rtree) rtree->calcAlpha();
}

Defines:
calcAlpha, used in chunk 195.

Uses calcError 153a 154c, CLASSIFICATION_MODE 234, computeAverage 146c 148, computeSqError 147a 148, getMisclassified 198a 198c, getTNodeCount 198a 198c, getTrSize 134 140b, isterminal 151c, sumInt 145c 148, and treenode_t 149.

Comment 5.8.5. These are the supporting functions for calcAlpha. The first function getMisclassified traverses all the terminal nodes in the subtree and calculates the total number of misclassified instances. The second function getTNodeCount returns the total number of terminal nodes in the tree rooted at the current node.

198c 〈tree-dstructs.cc 145a〉+≡ ⊳ 198b 199b ⊲

double treenode_t::getMisclassified() {
    if (isterminal())
        return (double)(part.sumInt() - part.n[majorityclass]);
    double ret = 0;
    if (ltree) ret += ltree->getMisclassified();
    if (rtree) ret += rtree->getMisclassified();
    return ret;
}

int treenode_t::getTNodeCount() {
    if (isterminal()) return 1;
    int ret = 0;
    if (ltree) ret += ltree->getTNodeCount();
    if (rtree) ret += rtree->getTNodeCount();
    return ret;
}

Defines:
getMisclassified, used in chunk 198b.
getTNodeCount, used in chunk 198b.

Uses isterminal 151c, sumInt 145c 148, and treenode_t 149.


Comment 5.8.6. This function finds the weakest link in the tree, defined as the non-terminal node with the lowest alpha value, and prunes it away from the tree. The function returns true if this is successful. The only way this function will return false is when the tree has only one non-terminal node.

199a 〈struct treenode functions 151c〉+≡ (149) ⊳ 198a

bool cut_weakest_link();
treenode_t * find_weakest_link();
bool cut_link(treenode_t * weakest);

Defines:
cut_link, used in chunk 199b.
cut_weakest_link, used in chunk 195.
find_weakest_link, used in chunk 199b.


199b 〈tree-dstructs.cc 145a〉+≡ ⊳ 198c 199c ⊲

bool treenode_t::cut_weakest_link() {
    if (ltree->isterminal() && rtree->isterminal()) return false;
    treenode_t * weakest = find_weakest_link();
    assert(weakest);
    // cut_link is called outside assert so that the cut still
    // happens when assertions are compiled out with NDEBUG
    bool cut = cut_link(weakest);
    assert(cut);
    return true;
}

Defines:
cut_weakest_link, used in chunk 195.

Uses cut_link 199a 200, find_weakest_link 199a 199c, isterminal 151c, and treenode_t 149.

Comment 5.8.7. The first function find-weakest-link finds and returns the non-terminal node with the lowest alpha value. This function should never be called on a terminal node.

The second function cut-link takes the pointer to the weakest link and removes it from the tree.
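The search for the weakest link is a breadth-first traversal that ignores terminal nodes. A minimal stand-alone sketch follows, under several assumptions: Node and find_weakest are hypothetical stand-ins for treenode_t and find_weakest_link, a node marked pruned is treated as terminal, and the running minimum starts at infinity instead of a fixed bound.

```cpp
#include <cassert>
#include <deque>
#include <limits>

// Hypothetical stand-in for treenode_t with only the fields needed here.
struct Node {
    double alpha = 0;
    Node* left = nullptr;
    Node* right = nullptr;
    bool pruned = false;
    bool is_terminal() const { return pruned || (!left && !right); }
};

// Breadth-first search below root for the non-terminal node with the
// lowest alpha; returns nullptr if no such node exists.
Node* find_weakest(Node* root) {
    if (root->is_terminal()) return nullptr;
    Node* best = nullptr;
    double lowest = std::numeric_limits<double>::infinity();
    std::deque<Node*> queue{root->left, root->right};
    while (!queue.empty()) {
        Node* cur = queue.front();
        queue.pop_front();
        if (!cur || cur->is_terminal()) continue;
        if (cur->alpha < lowest) { lowest = cur->alpha; best = cur; }
        queue.push_back(cur->left);
        queue.push_back(cur->right);
    }
    return best;
}
```

Pruning then amounts to marking the returned node as pruned and repeating the search, which visits the non-terminal nodes in order of increasing alpha as long as the alpha values are left unchanged; the code in this section recomputes them on every iteration.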

199c 〈tree-dstructs.cc 145a〉+≡ ⊳ 199b 200 ⊲

treenode_t * treenode_t::find_weakest_link() {
    if (this->isterminal()) return NULL;
    treenode_t * ret = NULL, * temp = NULL;
    list<treenode_t *> tlist;
    double lowest = 1000.0;
    tlist.push_back(ltree); tlist.push_back(rtree);
    while (tlist.size()) {
        temp = tlist.front();
        tlist.pop_front();
        if (temp->isterminal()) continue;
        if (temp->alpha < lowest) { lowest = temp->alpha; ret = temp; }
        assert(temp->ltree); tlist.push_back(temp->ltree);
        assert(temp->rtree); tlist.push_back(temp->rtree);
    }
    assert(ret); return ret;
}

Defines:
find_weakest_link, used in chunk 199b.

Uses isterminal 151c and treenode_t 149.


200 〈tree-dstructs.cc 145a〉+≡ ⊳ 199c

bool treenode_t::cut_link(treenode_t * weakest) {
    if (this->isterminal()) return false;

    bool ret = false;
    if (ltree == weakest) {
        ltree->pruned = true;
        if (CLASSIFICATION_MODE)
            ltree->Ap = getClassWeight(ltree->majorityclass) *
                        ltree->part.n[ltree->majorityclass];
        else {
            ltree->average = ltree->part.computeAverage();
            ltree->sq_error =
                ltree->part.computeSqError(ltree->average);
        }
        return true;
    }
    if (rtree == weakest) {
        rtree->pruned = true;
        if (CLASSIFICATION_MODE)
            rtree->Ap = getClassWeight(rtree->majorityclass) *
                        rtree->part.n[rtree->majorityclass];
        else {
            rtree->average = rtree->part.computeAverage();
            rtree->sq_error =
                rtree->part.computeSqError(rtree->average);
        }
        return true;
    }
    ret = ltree->cut_link(weakest);
    if (ret == false) ret = rtree->cut_link(weakest);
    return ret;
}

Defines:
cut_link, used in chunk 199b.

Uses CLASSIFICATION_MODE 234, computeAverage 146c 148, computeSqError 147a 148, getClassWeight 134 137a, isterminal 151c, and treenode_t 149.


5.9 Boosting

Comment 5.9.1. This section implements a version of the AdaBoost.M1 algorithm as presented in [HTF01]. This version, tentatively named AdaBoost.M1.HTF here, differs in a few important ways from that described in [FS97]. Firstly, the latter is an extension of AdaBoost to multiclass problems; the former is, I believe, a specialisation of AdaBoost.M1 to binary class problems. Secondly, weights are increased for misclassified instances in AdaBoost.M1.HTF, whereas weights are decreased for correctly classified instances in the original algorithm.

Further, the implementation here differs from that given in [HTF01] in several ways. Here, the weights of the training examples are normalised so that they sum to 1. So are the coefficients of the base hypotheses.
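One round of the weight update described above can be sketched independently of the learner. This is a simplified illustration, not the code of this section: boost_round is a hypothetical name, and the vector miss (true where the current base hypothesis misclassifies an example) stands in for the comparison of predicted and true class indices.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Performs one boosting round on the weights w and returns alpha.
// Misclassified examples have their weight multiplied by exp(alpha);
// all weights are then normalised to sum to 1.
double boost_round(std::vector<double>& w, const std::vector<bool>& miss) {
    assert(w.size() == miss.size());
    double x = 0, y = 0;
    for (std::size_t i = 0; i != w.size(); ++i) {
        if (miss[i]) x += w[i];
        y += w[i];
    }
    double err = x / y;               // weighted training error
    assert(err > 0 && err < 0.5);
    double alpha = std::log((1 - err) / err);
    double total = 0;
    for (std::size_t i = 0; i != w.size(); ++i) {
        if (miss[i]) w[i] *= std::exp(alpha);
        total += w[i];
    }
    for (double& wi : w) wi /= total;  // normalise
    return alpha;
}
```

With four examples of weight 0.25 and one of them misclassified, err = 0.25 and alpha = log 3; after the update the misclassified example carries weight 0.5 and each of the others 1/6, so the misclassified examples hold half of the total weight in the next round.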

201a 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 181a 201b ⊲

#include <cmath>
static unsigned short I(unsigned int y, unsigned int Gx) {
    if (y != Gx) return 1;
    return 0;
}

201b 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 201a 201c ⊲

static vector<pair<double, treenode_t *> > tree_vector;

static double compute_totalweights() {
    double ret = 0.0;
    for (unsigned int i=0; i!=tree_vector.size(); i++)
        ret += tree_vector[i].first;
    return ret;
}

Defines:
compute_totalweights, used in chunk 202a.
tree_vector, used in chunks 202a, 203, and 205.


201c 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 201b 202a ⊲

static double boost_eval(pair<double, treenode_t *> & tree, Individual & ind,
                         double totalweight) {
    indlabel ret1 = tree.second->evaluate1(ind);
    int label = ret1.mclass;
    if (label == 0) label = -1;
    double ret = (tree.first / totalweight) * label;
    return ret;
}

Defines:
boost_eval, used in chunk 202a.

Uses evaluate1 154a 154c, Individual 132, label 21, and treenode_t 149.


202a 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 201c 202b ⊲

static double m_boost_eval(Individual & ind) {
    double accu = 0;
    double totalcoeff = compute_totalweights();

    for (unsigned int i=0; i!=tree_vector.size(); i++)
        accu += boost_eval(tree_vector[i], ind, totalcoeff);

    if (accu < -1.1 || accu > 1.1) cout << "accu = " << accu << endl;
    assert(accu > -1.1 && accu < 1.1);
    return accu;
}

Defines:
m_boost_eval, used in chunks 202b and 205.

Uses boost_eval 201c, compute_totalweights 201b, Individual 132, and tree_vector 201b.

202b 〈alkemy::static functions 158b〉+≡ (158a) ⊳ 202a

static unsigned int classify(double accu) { if (accu < 0) return 0; return 1; }

static void output_result(int iter) {
    assert(CLASSIFICATION_MODE);
    unsigned int i; double minmargin = 1; double margin; double label;

    unsigned int trtruepred = 0, trfalsepred = 0; // train set
    unsigned int truepred = 0, falsepred = 0; // test set
    for (i=0; i!=getTrSize(); i++) {
        Individual & tp = getTrIndividual(i);
        if (tp.membership == TEST) {
            double accu = m_boost_eval(tp);
            int prediction = classify(accu);
            if (prediction == getClassIndex(tp.label.clabel))
                truepred++;
            else falsepred++;
        } else if (tp.membership == TRAIN) {
            double accu = m_boost_eval(tp);
            int prediction = classify(accu);
            if (prediction == getClassIndex(tp.label.clabel)) trtruepred++;
            else trfalsepred++;
            if (getClassIndex(tp.label.clabel) == 0) label = -1;
            else label = 1;
            margin = label * accu;
            if (margin < minmargin) minmargin = margin;
        }
    }
    cout << endl;
    cout << "Iteration " << iter << " : ";
    cout << "Train Acc = " << trtruepred << "/" << trfalsepred+trtruepred;
    cout << " ( margin = " << minmargin << " )";
    cout << " Test Acc = " << truepred << "/" << falsepred+truepred << endl;
}

Defines:
classify, used in chunk 205.
output_result, used in chunk 203.

Uses CLASSIFICATION_MODE 234, getClassIndex 134 136c, getTrIndividual 134 136a, getTrSize 134 140b, Individual 132, label 21, m_boost_eval 202a, TEST 132, and TRAIN 132.


203 〈alkemy::public functions 158c〉+≡ (158a) ⊳ 161a

// pre-condition - |trainset| > 0, |testset| >= 0, |validset| = 0
eval_measures alkemy::boost(unsigned int m) {
    assert(getClassCount() == 2);

    eval_measures ret;
    unsigned int i; int k;
    unsigned int trsize = getTrTrainSize();
    unsigned int tssize = getTrTestSize();
    unsigned int vlsize = getTrValidSize();
    assert(trsize > 0 && tssize >= 0 && vlsize == 0);

    for (i=0; i!=getTrSize(); i++) {
        Individual & ind = getTrIndividual(i);
        if (ind.membership == TRAIN) ind.weight = 1.0/trsize;
        else ind.weight = 1.0;
    }
    for (i=0; i!=m; i++) {
        initialise_learner(); learn();
        〈boost::compute err-m 204a〉
        // compute alpha_m
        assert(err > 0);
        double alpha = log((1-err)/err);
        〈boost::update weight 204b〉
        // store classifier
        pair<double, treenode_t *> tree;
        tree.first = alpha;
        tree.second = dctree->clone();
        tree_vector.push_back(tree);
        cleanup_learner();
        output_result(i);
    }
    〈boost::output classifier and result on training and test sets 205〉
    return ret;
}

Defines:
boost, used in chunks 159 and 161a.

Uses clone 19a 19b, condition 16c, dctree 157, eval_measures 160a, getClassCount 134 140b, getTrIndividual 134 136a, getTrSize 134 140b, getTrTestSize 134 141, getTrTrainSize 134 141, getTrValidSize 134 141, Individual 132, initialise_learner 161c, learn 163c 164a, output_result 202b, testset 135a, TRAIN 132, trainset 135a, tree_vector 201b, and treenode_t 149.


204a 〈boost::compute err-m 204a〉≡ (203)

double x = 0, y = 0, err = 0;
k = getTrStartIndex();
while (k != -5) {
    Individual & tp = getTrIndividual(k);
    indlabel templabel = dctree->evaluate1(tp);
    tp.classified = templabel.mclass;
    assert(CLASSIFICATION_MODE);
    x += tp.weight * I(getClassIndex(tp.label.clabel), tp.classified);
    y += tp.weight;
    k = tp.ptr_tr;
}
err = x / y; // assert(err < 0.5);
// cout << "x = " << x << " y = "<<y<< " err = " <<err<<endl;
if (err >= 0.5) {
    cout << "Note: err > 0.5. Exiting...\n";
    cleanup_learner(); break;
}

Uses CLASSIFICATION_MODE 234, dctree 157, evaluate1 154a 154c, getClassIndex 134 136c, getTrIndividual 134 136a, getTrStartIndex 134 139b, Individual 132, and label 21.

204b 〈boost::update weight 204b〉≡ (203)

double totalweight = 0;
k = getTrStartIndex();
while (k != -5) {
    Individual & tp = getTrIndividual(k);
    assert(CLASSIFICATION_MODE);
    double neww = tp.weight *
        exp(alpha * I(getClassIndex(tp.label.clabel), tp.classified));
    tp.weight = neww;
    totalweight += neww;
    k = tp.ptr_tr;
}
// normalise weights; can this be done in the previous loop?
k = getTrStartIndex();
while (k != -5) {
    Individual & tp = getTrIndividual(k);
    tp.weight = tp.weight / totalweight;
    k = tp.ptr_tr;
}

Uses CLASSIFICATION_MODE 234, getClassIndex 134 136c, getTrIndividual 134 136a, getTrStartIndex 134 139b, Individual 132, and label 21.
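The two chunks above compute a weighted error, derive alpha from it, and then rescale and normalise the weights. The following is a minimal standalone sketch of one such round, not the Alkemy code itself. The `Example` struct and both function names are hypothetical, and the sketch assumes that `I(a, b)` in the chunks is the indicator that returns 1 when the predicted class differs from the true class.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a training individual: its current weight and
// whether the tree learned in this round misclassified it.
struct Example {
    double weight;
    bool misclassified;
};

// Weighted error: weight of the misclassified examples over the total weight.
double weighted_error(const std::vector<Example>& ex) {
    double x = 0, y = 0;
    for (const Example& e : ex) {
        if (e.misclassified) x += e.weight;
        y += e.weight;
    }
    return x / y;
}

// One boosting round: computes alpha = log((1-err)/err), multiplies the
// weight of every misclassified example by exp(alpha), and rescales all
// weights so that they sum to 1. Returns alpha.
double update_weights(std::vector<Example>& ex) {
    const double err = weighted_error(ex);
    assert(err > 0 && err < 0.5);
    const double alpha = std::log((1 - err) / err);
    double total = 0;
    for (Example& e : ex) {
        if (e.misclassified) e.weight *= std::exp(alpha);
        total += e.weight;
    }
    for (Example& e : ex) e.weight /= total;  // normalisation pass
    return alpha;
}
```

With four examples of weight 0.25 and one of them misclassified, the error is 0.25, alpha is log 3, and after the update the misclassified example carries half of the total weight.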


205 〈boost::output classifier and result on training and test sets 205〉≡ (203)

unsigned int trtruepred = 0, trfalsepred = 0; // train set
unsigned int truepred = 0, falsepred = 0; // test set
for (i=0; i!=getTrSize(); i++) {
    Individual & tp = getTrIndividual(i);
    if (tp.membership == TEST) {
        double accu = m_boost_eval(tp);
        int prediction = classify(accu);
        assert(CLASSIFICATION_MODE);
        if (prediction == getClassIndex(tp.label.clabel)) truepred++;
        else falsepred++;
    }
    else if (tp.membership == TRAIN) {
        double accu = m_boost_eval(tp);
        int prediction = classify(accu);
        assert(CLASSIFICATION_MODE);
        if (prediction == getClassIndex(tp.label.clabel)) trtruepred++;
        else trfalsepred++;
    }
}
unsigned int n = tree_vector.size(); //assert(tree_vector.size() == m);
double totalcoeff = 0;
for (i=0; i!=n; i++) totalcoeff += tree_vector[i].first;
cout << "H(x) = ";
for (i=0; i!=n-1; i++)
    cout << tree_vector[i].first / totalcoeff << "*t" << i << " + ";
cout << tree_vector[n-1].first / totalcoeff << "*t" << n-1 << endl << endl;
cout << "where\n";
for (i=0; i!=tree_vector.size(); i++) {
    cout << " t" << i << " = ";
    tree_vector[i].second->print(1); cout << endl;
}
cout << "Training set weights : " << endl;
k = getTrStartIndex();
while (k != -5) {
    cout << getTrIndividual(k).weight << " ";
    k = getTrIndividual(k).ptr_tr;
}
cout << endl;
cout << "Train Acc = " << trtruepred << "/" << trfalsepred+trtruepred << endl;
cout << "Boost Acc = " << truepred << "/" << falsepred+truepred << endl;
ret.train_acc = (double)trtruepred / (double)(trfalsepred+trtruepred) ;
ret.test_acc = (double)truepred / (double)(falsepred+truepred) ;
for (i=0; i!=tree_vector.size(); i++)
    tree_vector[i].second->freememory();
tree_vector.clear();

Uses CLASSIFICATION_MODE 234, classify 202b, clear 145b, freememory 19a 19c, getClassIndex 134 136c, getTrIndividual 134 136a, getTrSize 134 140b, getTrStartIndex 134 139b, Individual 132, label 21, m_boost_eval 202a, TEST 132, TRAIN 132, and tree_vector 201b.

Seek, and ye shall find;
Bible, New Testament, Matthew 7:7

You can only find truth with logic if you have already found truth without it.
Chesterton, G. K. (1874 - 1936)

Chapter 6

User Interface

6.1 The System Parser

Comment 6.1.1. The parser is rather complicated. I will describe the grammar and associated actions for each of the four parts separately. The following is a skeleton of the module.

206 〈inputproc.y 206〉≡

%{

#include <fstream>

#include <stack>

#include <string>

#include <string.h>

#include <stdlib.h>

#include <stdio.h>

#include "terms.h"

#include "tables.h"



#include "global.h"



#include "alkemy.h"

extern int yylex(); extern char * yytext;

extern char linebuf[2000]; extern int tokenpos;

extern int mylineno; extern int seccount;

extern int switchBuffer(FILE * in);

void yyerror(const char * s);

〈inputproc.y::variables 209c〉

%}

%union { 〈inputproc.y::union members 218c〉 }

%token SECTION

〈inputproc.y::more preambles 209b〉

%%

input : declaration examples transinfo rewrites SECTION
        { ioprintln("finished parsing spec file"); 〈start learning 207a〉 }
      ;

〈inputproc.y::declaration 207b〉〈inputproc.y::examples 211b〉〈inputproc.y::transinfo 213d〉〈inputproc.y::rewrites 217c〉〈inputproc.y::term schema 222〉〈inputproc.y::types 227a〉

%%

void yyerror(const char * s)
{
    setSelector(STDERR);
    ioprint("\nAn error has occurred at line ");
    ioprint(mylineno); ioprintln(".");
    ioprint("Offending token: "); ioprintln(yytext);
    ioprint("System message: "); ioprintln(s);
    ioprintln(linebuf);
    for (int i=0; i!=tokenpos-1; i++) ioprint(" ");
    ioprintln("^");
}

Defines:
    yyerror, used in chunks 211a, 212d, 219c, and 228b.

Uses alkemy.h 157, global.h 232, ioprint 246 247a, ioprintln 246 247a, linebuf 229, mylineno 229, pattern-match.h 75c, rewrite.h 131b, seccount 229, setSelector 246 247a, STDERR 246, switchBuffer 231, tables.h 96b, terms.h 9, tokenpos 229, trainset 135a, trainset.h 133, and unification.h 85b.
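The last three statements of yyerror echo the offending input line and put a caret under the token that triggered the error. A small sketch of that formatting step is given here, assuming that tokenpos is a 1-based column, which the loop bound tokenpos-1 suggests but the text does not state. The function name `caret_report` is hypothetical.

```cpp
#include <cassert>
#include <string>

// Returns the offending line followed by a second line holding a caret in
// column `tokenpos` (assumed 1-based).
std::string caret_report(const std::string& linebuf, int tokenpos) {
    assert(tokenpos >= 1);  // the original loop would not terminate otherwise
    std::string out = linebuf + "\n";
    out += std::string(tokenpos - 1, ' ');
    out += "^\n";
    return out;
}
```

For the line `x = = 3 ;` and a token position of 5, the caret lands under the second `=`.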

Comment 6.1.2. I have moved the learning part from bigpicture.nw to here so that Alkemy can continue processing examples after the tree is grown.

207a 〈start learning 207a〉≡ (206)

alkemy learner;
if (options.crossvalidate >= 0) {
    if (options.foldnumber == -5)
        learner.cross_validate((unsigned int)options.crossvalidate);
    else {
        assert(options.foldnumber < options.crossvalidate);
        learner.cross_validate_1f((unsigned int)options.crossvalidate,
                                  (unsigned int)options.foldnumber);
    }
}
else learner.m_leave_n_out(options.exp_count);

Uses cross_validate 157 159a, cross_validate_1f 157 160d, m_leave_n_out 157 161a, and options 234 235.
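The chunk above picks one of three evaluation procedures from two option values. The sketch below isolates that decision so it can be read and tested on its own. The enum and function names are hypothetical, and the reading of -5 as "no fold selected" is an assumption drawn from the comparison in the chunk.

```cpp
#include <cassert>

// Hypothetical labels for the three branches of the start-learning chunk.
enum class EvalMode { CrossValidateAllFolds, CrossValidateOneFold, LeaveNOut };

// crossvalidate < 0 means no cross-validation was requested; a foldnumber of
// -5 is assumed to be the "unset" sentinel.
EvalMode select_mode(int crossvalidate, int foldnumber) {
    if (crossvalidate >= 0) {
        if (foldnumber == -5) return EvalMode::CrossValidateAllFolds;
        assert(foldnumber < crossvalidate);
        return EvalMode::CrossValidateOneFold;
    }
    return EvalMode::LeaveNOut;
}
```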

6.1.1 Data Declaration

Comment 6.1.3. The data declaration section contains type declarations for the individuals and the classes they belong to.

207b 〈inputproc.y::declaration 207b〉≡ (206) 207c ⊲

declaration : SECTION typedecls funcdecl { systrans.close(); } ;

Uses systrans 209e.

Comment 6.1.4. We provide facilities for declaring nullary data constructors and type synonyms here. To achieve consistency and clarity, we adopt the syntax employed in [BGCL01].

207c 〈inputproc.y::declaration 207b〉+≡ (206) ⊳ 207b 208a ⊲

typedecls : typedecl | typedecls typedecl ;

typedecl : constructordecl | syndecl ;


Comment 6.1.5. We do not allow the same name to be used to declare multiple algebraic types. For example, the following is not allowed and will generate a run-time error.

Hi, Bye : Hello ;

...

THIS, IS, FORBIDDEN : Hello ;

We could allow such (erratic) declarations and bind the name to either the first or the last declaration. There are two objections to this solution.

• Multiple declarations with the same name are probably an unintentional bug introduced by the user. She should be alerted to this and forced to change it. It is bad programming practice even if it is not wrong.

• If not handled properly, this solution can result in memory leaks.
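The policy described above, rejecting a second declaration under an existing name instead of silently rebinding it, can be sketched with a small table. This is only an illustration: `TypeRegistry` and its members are hypothetical and say nothing about how insert_type is implemented.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical table from algebraic type names to their data constructors.
class TypeRegistry {
public:
    // Returns false, leaving the first declaration untouched, when the name
    // has already been declared; the caller can then report the error.
    bool declare(const std::string& name,
                 const std::vector<std::string>& constructors) {
        return table_.emplace(name, constructors).second;
    }

    const std::vector<std::string>& constructors(const std::string& name) const {
        return table_.at(name);
    }

private:
    std::map<std::string, std::vector<std::string>> table_;
};
```

Declaring `Hello` with `Hi, Bye` succeeds; a later attempt to declare `Hello` with `THIS, IS, FORBIDDEN` is refused and the first binding is kept.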

Comment 6.1.6. The following is the grammar for data constructor declaration statements. A constructor declaration takes the form of a list of data constructors followed by their (common) signature. (For the grammar of type, see Comment 6.1.43.)

The list of data constructors is stored in a variable called vec_constants during parsing. The list is used to create a type object, which is then registered with the global module using insert_type.

208a 〈inputproc.y::declaration 207b〉+≡ (206) ⊳ 207c 209a ⊲

constructordecl : dataconstructors ':' type ';'
      { string tname($3->getName());
        type * t = new type_udefined(tname, vec_constants);
        if ($3->isUdefined())
            insert_type(tname, UDEFINED, t);
        insert_constant(vec_constants[0], $3);
        for (uint i=1; i!=vec_constants.size(); i++)
            insert_constant(vec_constants[i], $3->clone());
        〈generate equality transformations 208b〉
        vec_constants.clear(); /*$*/
      }
      ;

Uses clear 145b, clone 19a 19b, insert_constant 245a, insert_type 245c, isUdefined 84b, type_udefined 84b, UDEFINED 241, and vec_constants 209c.

Comment 6.1.7. We automatically generate equality transformations for each data constructor. These are stored in systrans.es.

208b 〈generate equality transformations 208b〉≡ (208a)

if ($3->isUdefined())
    for (uint i=0; i!=vec_constants.size(); i++) {
        systrans << "eq" << $3->getName() << vec_constants[i] << " : "
                 << $3->getName() << " -> Bool ;\n";
        systrans << "(eq" << $3->getName() << vec_constants[i]
                 << " x) = (== x " << vec_constants[i] << ") ;\n";
    }

Uses isUdefined 84b, systrans 209e, and vec_constants 209c.
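To make the generated text concrete, here is a standalone sketch that builds the same two lines per constructor as the chunk above, but returns them as a string instead of writing to systrans. The function name `equality_transformations` is hypothetical; the output format is copied from the chunk.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// For a type T with constructors C1, C2, ... emits, for each Ci,
//   eqTCi : T -> Bool ;
//   (eqTCi x) = (== x Ci) ;
std::string equality_transformations(const std::string& tname,
                                     const std::vector<std::string>& constants) {
    std::ostringstream out;
    for (const std::string& c : constants) {
        out << "eq" << tname << c << " : " << tname << " -> Bool ;\n";
        out << "(eq" << tname << c << " x) = (== x " << c << ") ;\n";
    }
    return out.str();
}
```

For the declaration `Hi, Bye : Hello ;` this yields the transformations eqHelloHi and eqHelloBye, each of type Hello -> Bool.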

Comment 6.1.8. As is the usual convention, data constructors are alphanumerics that start with a capital letter.


209a 〈inputproc.y::declaration 207b〉+≡ (206) ⊳ 208a 209d ⊲

dataconstructors : dataconstructor { vec_constants.push_back($1); }
                 | dataconstructors ',' dataconstructor { vec_constants.push_back($3); }
                 ;

dataconstructor : IDENTIFIER2 { $$ = $1; }
                | DATA_CONSTRUCTOR { $$ = $1; }
                ;

Uses IDENTIFIER2 229 and vec_constants 209c.

209b 〈inputproc.y::more preambles 209b〉≡ (206) 210c ⊲

%type <name> dataconstructor

209c 〈inputproc.y::variables 209c〉≡ (206) 209e ⊲

vector<string> vec_constants;

Defines:vec_constants, used in chunks 208 and 209a.

Comment 6.1.9. The following code chunk gives the grammar for type synonyms. An arbitrary type can be given an arbitrary name that starts with a capital letter. This is the place to add system-defined projection transformations if we so wish.

209d 〈inputproc.y::declaration 207b〉+≡ (206) ⊳ 209a 210a ⊲

syndecl : TYPE IDENTIFIER2 '=' type ';'
      { string t($2); insert_type(t,SYNONYM,$4);
        〈generate projection transformations 209f〉
      }
      ;

Uses IDENTIFIER2 229, insert_type 245c, SYNONYM 241, and TYPE 210d.

209e 〈inputproc.y::variables 209c〉+≡ (206) ⊳ 209c 213a ⊲

ofstream systrans("systrans.es");

Defines:systrans, used in chunks 207–9.

209f 〈generate projection transformations 209f〉≡ (209d)

if ($4->getTag() == "Tuple") {
    char id[100];
    int size = $4->alphaCount();
    string args = "(";
    for (int i=0; i!=size-1; i++)
        args = args + "t" + numtostring(i) + ",";
    args = args + "t" + numtostring(size-1) ;
    for (int i=0; i!=size; i++) {
        sprintf(id, "proj%s_%d", $2, i);
        systrans << id << " : " << $2 << " -> "
                 << $4->getAlpha(i)->getName() << " ;\n";
        systrans << "(" << id << " " << args << ")) = t" << i << " ;\n";
    }
}

Uses numtostring 241 and systrans 209e.
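The projection chunk is easier to follow with its output in view. The sketch below reproduces the emitted text for a tuple synonym, using std::string in place of the fixed-size id buffer. The function name and the example type names are hypothetical; the line format follows the chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// For a synonym S = (A0, ..., An-1) emits, for each component i,
//   projS_i : S -> Ai ;
//   (projS_i (t0,...,tn-1)) = ti ;
std::string projection_transformations(const std::string& syn,
                                       const std::vector<std::string>& components) {
    assert(!components.empty());
    const std::size_t n = components.size();
    std::string args = "(";
    for (std::size_t i = 0; i + 1 < n; i++)
        args += "t" + std::to_string(i) + ",";
    args += "t" + std::to_string(n - 1) + ")";

    std::ostringstream out;
    for (std::size_t i = 0; i < n; i++) {
        const std::string id = "proj" + syn + "_" + std::to_string(i);
        out << id << " : " << syn << " -> " << components[i] << " ;\n";
        out << "(" << id << " " << args << ") = t" << i << " ;\n";
    }
    return out.str();
}
```

Building the identifier as a std::string sidesteps the 100-character limit of the sprintf buffer, which matters only for very long synonym names.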


Comment 6.1.10. Here, the user specifies the function to be learned. The variables functionname and topleveltype are defined in the global module. To make this more flexible, we can use type for the signature of the function.

210a 〈inputproc.y::declaration 207b〉+≡ (206) ⊳ 209d 210b ⊲

funcdecl : LEARN_KEY IDENTIFIER1 ':' IDENTIFIER2 ARROW codomain ';'
      { functionname = $2; topleveltype = $4; }
      ;

Uses ARROW 210d, functionname 234 235, IDENTIFIER1 229, IDENTIFIER2 229, LEARN_KEY 210d, and topleveltype 234 235.

Comment 6.1.11. The codomain of the function can be boolean, a (nullary) type constructor declared earlier, or float.

210b 〈inputproc.y::declaration 207b〉+≡ (206) ⊳ 210a

codomain : IDENTIFIER2 〈inputproc.y::Setup Classes 210e〉
         | BOOL { addClass("True", 1); addClass("False", 1); }
         | FLOAT { options.learning_mode = REGRESSION; }
         ;

Uses addClass 134 136b, BOOL 226b 226c, FLOAT 226b 226c, IDENTIFIER2 229, options 234 235, and REGRESSION 234.

210c 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 209b 212c ⊲

%token LEARN_KEY ARROW TYPE

Uses ARROW 210d, LEARN_KEY 210d, and TYPE 210d.

210d 〈inputproc.l::specific tokens 210d〉≡ (229) 213b ⊲

type    { 〈lex:tpos 230c〉 return TYPE; }
LEARN   { 〈lex:tpos 230c〉 return LEARN_KEY; }
\-\>    { 〈lex:tpos 230c〉 return ARROW; }

Defines:
    ARROW, used in chunks 210 and 227.
    LEARN_KEY, used in chunk 210.
    TYPE, used in chunks 209d and 210c.

Comment 6.1.12. Here, we use the user-defined type object created for Class to set up the classes of the individuals. Each class is identified by a string stored in the values field of the type object. We inform the training set module of the existence of the different classes by using the function addClass. Each class is assigned a unique id.

210e 〈inputproc.y::Setup Classes 210e〉≡ (210b)

{
    map<string, float> classweights_map;
    string str = $1; /*$*/
    type_udefined * classes = dcast<type_udefined *>(get_type(str).second);
    〈Setup Classes::error checking 211a〉
    float weight;
    map<string, float>::iterator mp;
    const vector<string> & cl = classes->getValues();
    for (unsigned int sp=0; sp!=cl.size(); sp++) {
        mp = classweights_map.find(cl[sp]);
        if (mp == classweights_map.end()) weight = 1.0;
        else weight = mp->second;
        addClass(cl[sp], weight);
    }
}

Uses addClass 134 136b, classes 135a, get_type 245c, and type_udefined 84b.
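The loop in this chunk gives every class the weight found in classweights_map, falling back to 1.0 when no entry exists. A self-contained sketch of that lookup follows; `assign_class_weights` is a hypothetical name, and the returned vector merely stands in for the sequence of addClass calls. Note that in the chunk as shown the map is declared locally and never filled, so every class would receive the default.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using WeightedClass = std::pair<std::string, float>;

// Pairs each class name with its override weight, or 1.0 if none is given.
std::vector<WeightedClass> assign_class_weights(
        const std::vector<std::string>& classes,
        const std::map<std::string, float>& overrides) {
    std::vector<WeightedClass> result;
    for (const std::string& c : classes) {
        auto it = overrides.find(c);
        const float weight = (it == overrides.end()) ? 1.0f : it->second;
        result.emplace_back(c, weight);
    }
    return result;
}
```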


211a 〈Setup Classes::error checking 211a〉≡ (210e)

if (!classes) {
    string errmsg = "Error parsing codomain of function. " +str+" undefined.";
    yyerror(errmsg.c_str());
}
assert(classes);

Uses classes 135a and yyerror 206.

6.1.2 Training Examples

Comment 6.1.13. We keep count of the total number of training examples available. This is then used to initialise the data structures in the training set module via the init_trainset and init_testset function calls.

211b 〈inputproc.y::examples 211b〉≡ (206) 212a ⊲

examples : SECTION individuals

| SECTION

;


Comment 6.1.14. Individuals come in two forms, labelled examples and unlabelled examples. Labelled examples are used for training. A prediction for each unlabelled example will be given after learning.

Admissible labels for classification problems are given in the grammar for label below. Individuals for regression problems have floating-point numbers as labels.

We provide a facility to input individuals from different files through import statements. For further details on how this works, see Comment 6.2.3.

212a 〈inputproc.y::examples 211b〉+≡ (206) ⊳ 211b 212b ⊲

individuals : individual | individuals individual ;

individual : IDENTIFIER1 term_schema '=' label ';'
      { Individual newInd;
        newInd.individual = $2; newInd.label.clabel = $4;
        addTrIndividual(newInd);
      }
    | IDENTIFIER1 term_schema '=' DATA_CONSTRUCTOR_FLOAT ';'
      { Individual newInd;
        newInd.individual = $2; newInd.label.rg = $4;
        addTrIndividual(newInd);
      }
    | '?' term_schema ';'
      { Individual newInd; newInd.individual = $2;
        addTsIndividual(newInd); /*$*/
      }
    | IMPORT FILENAME ';'
      { if (imported.find($2) == imported.end()) {
            FILE * in = fopen($2, "r");
            if (!in) {
                cerr << "Error reading from " << $2 << endl;
                assert(false);
            }
            cerr << "Reading from " << $2 << " ... ";
            switchBuffer(in);
            imported.insert($2);
        }
      }
    ;

Uses addTrIndividual 134 135b, addTsIndividual 134 135b, DATA_CONSTRUCTOR_FLOAT 229, FILENAME 229, IDENTIFIER1 229, IMPORT 213b, imported 213a, Individual 132, insert 11d, label 21, switchBuffer 231, and term_schema 9.
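The import alternative guards against reading the same file twice by consulting the imported set. The sketch below captures just that guard. `ImportTracker` is a hypothetical name, and unlike the grammar action it records the name on first sight rather than after a successful fopen.

```cpp
#include <cassert>
#include <set>
#include <string>

// Remembers which file names have been imported so far.
class ImportTracker {
public:
    // Returns true the first time a file name is presented and false on every
    // later request, so the caller can skip files it has already read.
    bool should_import(const std::string& filename) {
        return imported_.insert(filename).second;
    }

private:
    std::set<std::string> imported_;
};
```

Because std::set::insert reports whether the element was new, the find-then-insert pair of the original collapses into a single call.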

212b 〈inputproc.y::examples 211b〉+≡ (206) ⊳ 212a

label : IDENTIFIER2 { $$ = $1; 〈label::error checking 212d〉 }
      | DATA_CONSTRUCTOR { $$ = $1; 〈label::error checking 212d〉 }
      ;

Uses IDENTIFIER2 229 and label 21.

212c 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 210c 213c ⊲

%type <name> label

Uses label 21.

212d 〈label::error checking 212d〉≡ (212b)

if (getClassIndex($1) == ALKERROR) {
    string l = $1;
    string msg = "Error parsing individual: Unknown label " + l;
    yyerror(msg.c_str());
    assert(false);
}

Uses ALKERROR 232, getClassIndex 134 136c, label 21, and yyerror 206.


213a 〈inputproc.y::variables 209c〉+≡ (206) ⊳ 209e 216a ⊲

#include <set>

set<string> imported;

Defines:imported, used in chunks 212a and 215a.

Comment 6.1.15. Here are the parts related to the scanner.

213b 〈inputproc.l::specific tokens 210d〉+≡ (229) ⊳ 210d 217b ⊲

import 〈lex:tpos 230c〉 return IMPORT;

Defines:IMPORT, used in chunks 212a, 213c, and 215a.

213c 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 212c 215c ⊲

%token <name> FILENAME

%token IMPORT

Uses FILENAME 229 and IMPORT 213b.

6.1.3 Transformations

Comment 6.1.16. The transformation section gives the definitions for all the transformations (and the subsidiary functions used to define them) that are used in the predicate rewrite system. Each definition is an Escher statement. We can import standard Escher libraries for use here as well.

213d 〈inputproc.y::transinfo 213d〉≡ (206) 215a ⊲

transinfo : SECTION statements
      { initFuncTable();
        〈type checking individuals and statements 214a〉 } ;

statements : /* empty */ | statements alkemy_statement ;

alkemy_statement : import | statement_schema | typedecl ;

Uses initFuncTable 97c and statements 242.


Comment 6.1.17. We type check the individuals in the training set and all the Escher statements after the successful parsing of both. Type checking is delayed until now because at this point we have the signature of all the constants that appear in the spec file.

214a 〈type checking individuals and statements 214a〉≡ (213d)

if (options.typeCheck) {
    for (uint i=0; i!=getTrSize(); i++) {
        Individual & ind = getTrIndividual(i);
        〈type checking individual 214b〉
    }
    for (uint i=0; i!=getTsSize(); i++) {
        Individual & ind = getTsIndividual(i);
        〈type checking individual 214b〉
    }
    for (uint i=0; i!=statements.size(); i++) {
        type * sttype = wellTyped(statements[i].stmt);
        if (!sttype) {
            setSelector(STDERR); ioprint("***\n");
            statements[i].stmt->print(); ioprintln();
            ioprintln("Not well typed."); exit(1);
        }
        delete_type(sttype);
    }
}

Uses delete_type 77b 77c, getTrIndividual 134 136a, getTrSize 134 140b, getTsIndividual 134 136a, getTsSize 134 140b, Individual 132, ioprint 246 247a, ioprintln 246 247a, options 234 235, setSelector 246 247a, statements 242, and STDERR 246.

214b 〈type checking individual 214b〉≡ (214a)

type * itype = wellTyped(ind.individual);
if (!itype) {
    setSelector(STDERR); ioprint("***\nIndividual "); ioprint((int)i);
    ioprint(" "); ind.individual->print();
    ioprintln("\nNot well typed."); exit(1);
}
type * itype2 = get_type_from_syn(itype);
type * toptype = get_type_from_syn(get_type(topleveltype).second);
vector<pair<string, type *> > slns;
bool result = unify(slns, itype2, toptype);
if (!result) {
    setSelector(STDERR); ioprint("***\nIndividual "); ioprint((int)i);
    ioprint(" "); ind.individual->print();
    ioprintln("\nincorrectly typed. ");
    ioprint("Inferred type : "); ioprintln(itype2->getName());
    ioprint("Correct type : "); ioprintln(toptype->getName());
    exit(1);
}
for (uint j=0; j!=slns.size(); j++) delete_type(slns[j].second);
slns.clear();
delete_type(itype);

Uses clear 145b, delete_type 77b 77c, get_type 245c, get_type_from_syn 87b, ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, STDERR 246, topleveltype 234 235, and unify 88.

Comment 6.1.18. This construct is used to import Escher standard libraries. The library in question must of course exist. The import statement switches parsing from the current spec file to the Escher file and then back. Nested imports are allowed. For further details on how this works, see Comment 6.2.3.


215a 〈inputproc.y::transinfo 213d〉+≡ (206) ⊳ 213d 215b ⊲

import : IMPORT FILENAME ';'
      { if (imported.find($2) == imported.end()) {
            FILE * in = fopen($2, "r");
            if (!in) {
                cerr << "Error reading from " << $2 << endl;
                assert(false);
            }
            cerr << "Reading from " << $2 << " ... ";
            switchBuffer(in);
            imported.insert($2);
        }
      }
      ;

Uses FILENAME 229, IMPORT 213b, imported 213a, insert 11d, and switchBuffer 231.

Comment 6.1.19. We now look at the definition of transformations. Standard Escher statements that serve as subsidiary functions are accepted here. This is handled in the last case. The action associated with these is simple: we insert each of them into the right place in the globals module.

A transformation definition is not unlike an Escher statement, except that we have to provide two pieces of extra information, namely the signature of the transformation and its symmetricity. Transformations are assumed symmetric unless otherwise indicated.

215b 〈inputproc.y::transinfo 213d〉+≡ (206) ⊳ 215a

statement_schema : iden ':' type ';' term_schema '=' term_schema ';'
      { 〈inputproc.y::prepare trans info 216b〉
        info.symmetricity = true;
        insert_trans_info(info);
        insert_constant(info.name, $3->clone());
        term_schema * head = $5; term_schema * body = $7;
        〈inputproc.y::statement schema 216c〉
      }

| iden ’:’ type ’;’

term_schema ’=’ term_schema ’:’ NONSYMMETRIC ’;’

〈inputproc.y::prepare trans info 216b〉info.symmetricity = false;

insert_trans_info(info);

insert_constant(info.name, $3->clone());



| term_schema ’=’ term_schema ’;’



    | iden ':' type ';' SYSTEMDEFINED ';'
      { string name($1); insert_constant(name, $3); }
    ;

iden : IDENTIFIER1 { $$ = $1; } | FUNCTION { $$ = $1; } ;

Uses clone 19a 19b, IDENTIFIER1 229, insert_constant 245a, insert_trans_info 236d, NONSYMMETRIC 217a 217b, SYSTEMDEFINED 217a 217b, and term_schema 9.

215c 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 213c 217a ⊲

%type <name> iden


Comment 6.1.20. Every transformation is given a unique id.

216a 〈inputproc.y::variables 209c〉+≡ (206) ⊳ 213a 220b ⊲

int trans_num = 0;

Comment 6.1.21. For each transformation, we record information about its name, id, rank and type.

216b 〈inputproc.y::prepare trans info 216b〉≡ (215b)

trans_info_t info;

info.name = $1;

info.tnum = trans_num++;

type_abstraction * temp = dcast<type_abstraction *>($3);

temp->rank = temp->compRank();

info.argcount = temp->rank;

info.ttype = temp;

Uses compRank 84a, trans_info_t 236a, and type_abstraction 82c.

Comment 6.1.22. Definitions of transformations and subsidiary functions used to define them are all recorded in the global module. We first check that the definition has the right form. We then construct a term from the head and body that were separately parsed. After some preprocessing needed for Escher computations to run fast, we insert that into the statements vector.

216c 〈inputproc.y::statement schema 216c〉≡ (215b)

statementType st;

〈parser::make sure statement head has the right form 216d〉

st.stmt = new_term(APP);

term_schema * t1 = new_term(APP); term_schema * t11 = new_term(F, "==");

t1->insert(t11); t1->insert(head);

st.stmt->insert(t1); st.stmt->insert(body);

〈parser::preprocess statements 216e〉statements.push_back(st);

Uses insert 11d, new_term 17b, statements 242, statementType 241, and term_schema 9.

Comment 6.1.23. A valid Escher statement must satisfy the following definition.

Definition 6.1.24. A statement is a term of the form $h = b$, where $h$ has the form $f\ t_1 \ldots t_n$, $n \geq 0$, for some function $f$, each free variable in $h$ occurs exactly once in $h$, and $b$ is type-weaker than $h$.

216d 〈parser::make sure statement head has the right form 216d〉≡ (216c)

term_schema * leftmost = head->spineTip(st.numargs);

assert(leftmost->isF());

st.anchor = leftmost->name;

insert_ftable(leftmost->name, st.numargs);

Uses insert_ftable 98a, isF 11a 11b, spineTip 12f 13a, and term_schema 9.

Comment 6.1.25. Here we perform the different kinds of preprocessing on statements discussed in §2.2.3.1 and other places.

216e 〈parser::preprocess statements 216e〉≡ (216c)

head->labelStaticBoundVars(); body->labelStaticBoundVars();

st.stmt->collectSharedVars();

head->collectFreeVars(st.stmt, 1);

head->precomputeFreeVars();

Uses collectFreeVars 66b 66c, collectSharedVars 64b 64c, labelStaticBoundVars 25e 27a,and precomputeFreeVars 67c 68.


Comment 6.1.26. Here are the parts related to the scanner.

217a 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 215c 218a ⊲

%token NONSYMMETRIC SYSTEMDEFINED

Defines:NONSYMMETRIC, used in chunk 215b.SYSTEMDEFINED, used in chunk 215b.

217b 〈inputproc.l::specific tokens 210d〉+≡ (229) ⊳ 213b 218b ⊲

Nonsymmetric 〈lex:tpos 230c〉 return NONSYMMETRIC;

SystemDefined 〈lex:tpos 230c〉 return SYSTEMDEFINED;

Defines:NONSYMMETRIC, used in chunk 215b.SYSTEMDEFINED, used in chunk 215b.

6.1.4 Predicate Rewrites

Comment 6.1.27. A predicate rewrite is denoted p ↣ q, where both p and q are standard predicates. The predicate p is called the head of the rewrite, and q the body. The following code chunk gives the grammar for a predicate rewrite. The head is usually a simple transformation of rank zero. It is identified by its string name, which must be defined previously. The body usually has richer structure. The parsing of the body will be described shortly.

Upon successful parsing of a predicate rewrite, we insert it into the predicate rewrites table. Predicate rewrites are organised using their heads; the types of the heads play an important role here as well.

217c 〈inputproc.y::rewrites 217c〉≡ (206) 218e ⊲

rewrites : SECTION rewrite_list { freeze_rewrite_table(); } ;

rewrite_list : rewrite | rewrite_list rewrite ;

rewrite : IDENTIFIER1 REWRITES stdPredicate ';'
      { string id = $1;
        struct trans_info_t * info = find_trans_info(id); assert(info);
        int tnum = info->tnum;
        $3->recalculateType();
        bool succ = insert_rewrite($3, $3->getSource()->clone(), tnum);
        if (succ == false) 〈rewrite::error checking 217d〉
      }
      ;

Uses clone 19a 19b, find_trans_info 237a, freeze_rewrite_table 116a 118b, getSource 82c 83d 106 110d, IDENTIFIER1 229, insert_rewrite 116a 117a, recalculateType 107a 110a 110d, REWRITES 218a 218b, and trans_info_t 236a.

Comment 6.1.28. There is only one way insert_rewrite can fail: the body of the predicate rewrite is not in regular form. See Comment 3.4.14. When that happens, we just output an error message and exit.

217d 〈rewrite::error checking 217d〉≡ (217c)

{
    setSelector(STDERR);
    ioprint("Error in the definition of \n\t");
    ioprint($1); ioprint(" >-> "); $3->print(); ioprintln();
    ioprint("The predicate on the RHS is not regular.");
    ioprintln("Please check the order of the transformations.\n");
    exit(1);
}

Uses ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, and STDERR 246.


218a 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 217a 218d ⊲

%token REWRITES

Defines:REWRITES, used in chunk 217c.

218b 〈inputproc.l::specific tokens 210d〉+≡ (229) ⊳ 217b 221 ⊲

\>\-\> 〈lex:tpos 230c〉 return REWRITES;

Defines:REWRITES, used in chunk 217c.

Comment 6.1.29. Let us now look at the grammar for the body of a predicate rewrite. The body of a predicate rewrite is a standard predicate. A standard predicate is either a single transformation with bool as its target type or a predicate constructed from the composition of several transformations, the last of which is a predicate. (Note that composition is defined left to right here, i.e., f.g x = g(f(x)).)
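The left-to-right convention is easy to get backwards, so a tiny illustration may help. The sketch below applies a list of functions in list order, which is the order in which a standard predicate stores its transformations. It is deliberately simplified: every function here maps int to int, whereas real transformations change type along the chain and the last one returns a boolean. The names `Fn` and `apply_left_to_right` are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <vector>

using Fn = std::function<int(int)>;

// Applies the functions in list order: for the list {f, g} the result is
// g(f(x)), matching the convention f.g x = g(f(x)).
int apply_left_to_right(const std::vector<Fn>& fs, int x) {
    for (const Fn& f : fs) x = f(x);
    return x;
}
```

With f adding one and g doubling, the composition f.g maps 3 to 8, while g.f maps 3 to 7.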

The code associated with the first case is simple: we create a new standard predicate structure and initialise it with the transformation structure constructed for the single transformation. (How that is done is explained in the next code chunk.) The code associated with the second case is only slightly more complicated. We simply add the newly constructed transformation into the standard predicate structure we created in the base case. We also need to initialise the source type of the latest transformation parsed because transformations like top are polymorphically typed and the type variable must be appropriately instantiated for rewrites to work properly.

Compositions are left-associative.

Here first are the types of stdPredicate and transformation, the grammars of which follow.

218c 〈inputproc.y::union members 218c〉≡ (206) 220d ⊲

transformation_t * trans;

std_predicate * stdpred;

Uses std_predicate 101b and transformation_t 101a.

218d 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 218a 220e ⊲

%left ’.’

%type <stdpred> stdPredicate

%type <trans> transformation

218e 〈inputproc.y::rewrites 217c〉+≡ (206) ⊳ 217c 219a ⊲

stdPredicate : transformation
      { $$ = new std_predicate; $$->transformations.push_back($1); }
    | stdPredicate '.' transformation
      { $1->transformations.push_back($3); $$ = $1; }
    | stdPredicate '.' '(' transformation ')'
      { $1->transformations.push_back($4); $$ = $1; }
    | '(' stdPredicate ')' { $$ = $2; }
    ;


Comment 6.1.30. We next look at transformations. We differentiate between two classes of transformations, those with zero rank and those with non-zero rank. The first case is trivial. We simply create a new transformation_t structure and initialise its fields properly. The second case needs more work. We need to update the parent pointers of all its k predicates. In addition, we need to update the source type of the k predicates using the transformation's type information.


219a 〈inputproc.y::rewrites 217c〉+≡ (206) ⊳ 218e 220a ⊲

transformation :
      IDENTIFIER1
      { string transname = $1;
        trans_info_t * info = find_trans_info(transname);
        〈transformation::create and initialise a new transformation 219b〉
        $$ = newt;
      }
    | IDENTIFIER1 arguments
      { string transname = $1;
        trans_info_t * info = find_trans_info(transname);
        〈transformation::create and initialise a new transformation 219b〉
        for (int j=0; j!=newt->rank; j++) {
            newt->args.push_back(v_arguments.top()); v_arguments.pop();
        }
        for (uint i=0; i!=newt->args.size(); i++)
            newt->args[i]->adopt_parent(newt);
        $$ = newt;
      }
    ;

Uses adopt_parent 104b, find_trans_info 237a, IDENTIFIER1 229, trans_info_t 236a, and v_arguments 220b.

Comment 6.1.31. A new transformation record is created and filled using a record created for the transformation when it was first defined. Why do we need to rename the parameters?

219b 〈transformation::create and initialise a new transformation 219b〉≡ (219a)

〈transformation::error checking 219c〉
transformation_t * newt = new transformation_t(info->tnum, info->argcount, NULL);
newt->ttype = info->ttype->clone();
newt->ttype->renameParameters();

Uses clone 19a 19b, renameParameters 79b 80d, and transformation_t 101a.

219c 〈transformation::error checking 219c〉≡ (219b)

if (!info) {
    string errmsg = "Error parsing transformation. " + transname +
                    " undefined.\n";
    yyerror(errmsg.c_str());
}
assert(info);

Uses yyerror 206.

Comment 6.1.32. Each of the arguments of a transformation of rank k is itself a standard predicate. Here, we simply insert each argument into a temporary vector that will be used and cleared by code higher up in the parse tree.

The required brackets around each argument are admittedly cumbersome, but they do serve a useful purpose: to disambiguate the grammar. Consider the following predicate

setExists2 ∧2 top top proj1 · top.

Knowing the rank of each transformation, the predicate can be easily ‘parsed’ by a human being. But it is impossible, unless the author is ill-informed in this case, to write an unambiguous context-free grammar that can properly dissect the string into its different components. To see the problem more clearly, consider the bracketless alternative of the arguments production below. Yacc will complain that the grammar is ambiguous, with a few shift-reduce conflicts. Adopting the default disambiguating rule of always choosing to shift, the predicate above will be parsed as a standard predicate with two transformations, the first of which is setExists2 ∧2 top top proj1, and the second, top.


220a 〈inputproc.y::rewrites 217c〉+≡ (206) ⊳ 219a

arguments : '(' stdPredicate ')' { v_arguments.push($2); }
          | '(' stdPredicate ')' arguments { v_arguments.push($2); }
          ;

Uses v_arguments 220b.

Comment 6.1.33. The arguments are stored in a stack for access when constructing transformations in the previous code chunk.

220b 〈inputproc.y::variables 209c〉+≡ (206) ⊳ 216a 223d ⊲

stack<std_predicate *> v_arguments;

Defines:v_arguments, used in chunks 219a and 220a.
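Because the arguments rule is right-recursive and its action runs only after the trailing arguments have been reduced, the last argument is pushed first and the first argument ends up on top of the stack. Popping therefore returns the arguments in their written order. The sketch below simulates this with plain recursion; the function names are hypothetical, and it drains the whole stack, whereas the grammar action pops exactly rank entries.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>
#include <string>
#include <vector>

// Mimics the right-recursive rule: the push for argument `pos` happens only
// after the arguments to its right have been handled.
void reduce_arguments(const std::vector<std::string>& args, std::size_t pos,
                      std::stack<std::string>& st) {
    if (pos + 1 < args.size()) reduce_arguments(args, pos + 1, st);
    st.push(args[pos]);
}

// Pushes all arguments as the parser would, then pops them off again.
std::vector<std::string> collect(const std::vector<std::string>& args) {
    std::stack<std::string> st;
    if (!args.empty()) reduce_arguments(args, 0, st);
    std::vector<std::string> out;
    while (!st.empty()) {
        out.push_back(st.top());
        st.pop();
    }
    return out;
}
```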


6.1.5 Learning Options

Comment 6.1.34. This section is meant to ease the specification of learning options. Among other things, we need to provide an option for changing the weights of classes. The default weight for each class is 1. We need to copy the weights to a file called classweights in the temporary directory. See Comment 6.1.12.

220c 〈inputproc.y::options 220c〉≡
options : SECTION ;


6.1.6 Term Schemas

Comment 6.1.35. We next look at term schemas.

220d 〈inputproc.y::union members 218c〉+≡ (206) ⊳ 218c 223c ⊲

char * name;

int numint;

float num;

term_schema * term;

Uses term_schema 9.

220e 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 218d 223b ⊲

%token EQUAL NOTEQUAL CONST VAR EXISTS FORALL

%token <name> VARIABLE

%token <name> FUNCTION

%token <name> DATA_CONSTRUCTOR

%token <numint> DATA_CONSTRUCTOR_INT

%token <num> DATA_CONSTRUCTOR_FLOAT

%token <name> DATA_CONSTRUCTOR_STRING

%token <name> SYNTACTIC_VARIABLE

%token <name> IDENTIFIER1

%token <name> IDENTIFIER2

%type <term> term_schema

Uses CONST 221, DATA_CONSTRUCTOR_FLOAT 229, DATA_CONSTRUCTOR_STRING 229, EQUAL 221, EXISTS 221, FORALL 221,IDENTIFIER1 229, IDENTIFIER2 229, NOTEQUAL 221, SYNTACTIC_VARIABLE 229, term_schema 9, VAR 221,and VARIABLE 229.


221 〈inputproc.l::specific tokens 210d〉+≡ (229) ⊳ 218b 226c ⊲

VAR 〈lex:tpos 230c〉 return VAR;

CONST 〈lex:tpos 230c〉 return CONST;

EQUAL 〈lex:tpos 230c〉 return EQUAL;

NOTEQUAL 〈lex:tpos 230c〉 return NOTEQUAL;

exists 〈lex:tpos 230c〉 return EXISTS;

forall 〈lex:tpos 230c〉 return FORALL;

\=\= 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return FUNCTION;

\/\= 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return FUNCTION;

\<\= 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return FUNCTION;

\< 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return FUNCTION;

\>\= 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return FUNCTION;

\> 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return FUNCTION;

\&\& 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return FUNCTION;

\|\| 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return FUNCTION;

True 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return DATA_CONSTRUCTOR;

False 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return DATA_CONSTRUCTOR;

\# 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return DATA_CONSTRUCTOR;

\[\] 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return DATA_CONSTRUCTOR;

Defines:CONST, used in chunks 220e and 223a.EQUAL, used in chunks 71b, 178a, 220e, and 223a.EXISTS, used in chunks 220e and 222.FORALL, used in chunks 220e and 222.NOTEQUAL, used in chunks 71b, 220e, and 223a.VAR, used in chunks 220e and 223a.


222 〈inputproc.y::term schema 222〉≡ (206)

term_schema : SYNTACTIC_VARIABLE
                { $$ = new_term(SV, $1); }
            | SYNTACTIC_VARIABLE sv_condition
                { $$ = new_term(SV, $1); $$->cond = $2; }
            | VARIABLE
                { $$ = new_term(V, $1); }
            | FUNCTION
                { $$ = new_term(F, $1); }
            | DATA_CONSTRUCTOR
                { $$ = new_term(D, $1); }
            | DATA_CONSTRUCTOR_INT
                { $$ = new_term_int($1); }
            | DATA_CONSTRUCTOR_FLOAT
                { $$ = new_term_float($1); }
            | DATA_CONSTRUCTOR_STRING
                { string x($1); $$ = new_term_string(x); }
            | IDENTIFIER1
                { $$ = new_term(F, $1); }
            | IDENTIFIER2
                { $$ = new_term(D, $1); }
            | '\\' VARIABLE '.' term_schema
                { $$ = new_term(ABS);
                  $$->insert(new_term(V, $2)); $$->insert($4); }
            | '\\' EXISTS VARIABLE '.' term_schema
                { $$ = new_term(APP);
                  $$->insert(new_term(F, "sigma"));
                  term_schema * abs = new_term(ABS);
                  abs->insert(new_term(V, $3)); abs->insert($5);
                  $$->insert(abs); }
            | '\\' FORALL VARIABLE '.' term_schema
                { $$ = new_term(APP);
                  $$->insert(new_term(F, "pi"));
                  term_schema * abs = new_term(ABS);
                  abs->insert(new_term(V, $3)); abs->insert($5);
                  $$->insert(abs); }
            | '(' term_schema term_schema ')'
                { $$ = new_term(APP); $$->insert($2); $$->insert($3); }
            | 〈term schema::syntactic sugar 223e〉
            | '(' ')'
                { $$ = new_term(PROD); }
            | 〈term schema::products 224b〉
            | 〈term schema::sets 225a〉
            | 〈term schema::lists 225b〉
            ;

〈inputproc.y::term schemas 224a〉
〈inputproc.y::term schema products 224c〉
〈inputproc.y::sv condition 223a〉

Uses cond 16d, DATA_CONSTRUCTOR_FLOAT 229, DATA_CONSTRUCTOR_STRING 229, EXISTS 221, FORALL 221,IDENTIFIER1 229, IDENTIFIER2 229, insert 11d, new_term 17b, new_term_float 18a, new_term_string 18a,SYNTACTIC_VARIABLE 229, term_schema 9, and VARIABLE 229.


Comment 6.1.36. There is a small language for imposing side conditions on syntactic variables. See Comment 2.1.21.

223a 〈inputproc.y::sv condition 223a〉≡ (222)

sv_condition : '/' VAR '/' { $$ = new condition; $$->tag = CVAR; }
             | '/' CONST '/' { $$ = new condition; $$->tag = CCONST; }
             | '/' EQUAL ',' SYNTACTIC_VARIABLE '/'
                 { $$ = new condition; $$->tag = CEQUAL; $$->name = $4; }
             | '/' NOTEQUAL ',' SYNTACTIC_VARIABLE '/'
                 { $$ = new condition; $$->tag = CNOTEQUAL; $$->name = $4; }
             ;

Uses CCONST 16b, CEQUAL 16b, CNOTEQUAL 16b, condition 16c, CONST 221, CVAR 16b, EQUAL 221, NOTEQUAL 221,SYNTACTIC_VARIABLE 229, tag 10c, and VAR 221.

223b 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 220e 226b ⊲

%type <cond> sv_condition;

Uses cond 16d.

223c 〈inputproc.y::union members 218c〉+≡ (206) ⊳ 220d 226a ⊲

condition * cond;

Uses cond 16d and condition 16c.

Comment 6.1.37. A function applied to multiple arguments is painful to write. Here we introduce syntactic sugar that allows users to write terms of the form (f t1 . . . tn) to mean (· · · (f t1) · · · tn). The following variable is needed to remember terms.

223d 〈inputproc.y::variables 209c〉+≡ (206) ⊳ 220b 226d ⊲

vector<term_schema *> temp_fields;

Defines:temp_fields, used in chunks 223–25.

Uses term_schema 9.

223e 〈term schema::syntactic sugar 223e〉≡ (222)

'(' term_schema term_schema term_schemas ')'
    { $$ = new_term(APP); $$->insert($2); $$->insert($3);
      int size = temp_fields.size(); int psize = 0;
      while (temp_fields[size-1-psize] != NULL) psize++;
      term_schema * temp;
      for (int i=size-psize; i!=size; i++) {
          temp = new_term(APP);
          temp->insert($$);
          temp->insert(temp_fields[i]);
          $$ = temp;
      }
      while (psize+1) { temp_fields.pop_back(); psize--; }
    }

Uses insert 11d, new_term 17b, temp_fields 223d, and term_schema 9.


224a 〈inputproc.y::term schemas 224a〉≡ (222)

term_schemas : term_schema
                 { temp_fields.push_back(NULL); // start a new mult app
                   temp_fields.push_back($1); }
             | term_schemas term_schema
                 { temp_fields.push_back($2); }
             ;

Uses temp_fields 223d and term_schema 9.
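The sentinel-delimited buffer and the left-nested fold described in Comment 6.1.37 can be sketched in isolation. This is a minimal illustration, not the parser code: terms are modelled as strings, an empty string plays the role of the NULL sentinel, and apply and fold_application are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build the application node (f x), rendered as text.
std::string apply(const std::string& f, const std::string& x) {
    return "(" + f + " " + x + ")";
}

// Fold the arguments stored after the last sentinel onto `head`, left to
// right, then remove those arguments and the sentinel from the buffer.
// Assumes the buffer contains at least one sentinel (an empty string).
std::string fold_application(std::string head, std::vector<std::string>& buf) {
    std::size_t psize = 0;
    while (!buf[buf.size() - 1 - psize].empty()) ++psize;  // count pending args
    for (std::size_t i = buf.size() - psize; i != buf.size(); ++i)
        head = apply(head, buf[i]);
    buf.resize(buf.size() - psize - 1);                    // drop args and sentinel
    return head;
}
```

Because each multi-argument application starts with its own sentinel, entries belonging to an enclosing application stay untouched in the buffer when an inner one is folded.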

Comment 6.1.38. Products are handled in about the same way, except that we do not have to construct application nodes.

224b 〈term schema::products 224b〉≡ (222)

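'(' term_schemas_product ')'
    { $$ = new_term(PROD);
      int size = temp_fields.size(); int psize = 0;
      while (temp_fields[size-1-psize] != NULL) psize++;
      for (int i=size-psize; i!=size; i++) $$->insert(temp_fields[i]);
      while (psize+1) { temp_fields.pop_back(); psize--; }
    }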


Uses insert 11d, new_term 17b, and temp_fields 223d.

224c 〈inputproc.y::term schema products 224c〉≡ (222)

term_schemas_product : term_schema
                         { temp_fields.push_back(NULL); // start a new product
                           temp_fields.push_back($1); }
                     | term_schemas_product ',' term_schema
                         { temp_fields.push_back($3); }
                     ;

Uses temp_fields 223d and term_schema 9.


Comment 6.1.39. We also provide syntactic sugar for extensional sets. We should cater for empty sets as well.

225a 〈term schema::sets 225a〉≡ (222)

'{' '}'
    { $$ = new_term(ABS); $$->insert(new_term(V, "pv"));
      $$->insert(new_term(D, "False")); }
| '{' term_schemas_product '}'
    { $$ = new_term(ABS); $$->insert(new_term(V, "pv"));
      term_schema * arg2 = new_term(D, "False");
      int i = temp_fields.size()-1;
      while (temp_fields[i] != NULL) {
          term_schema * ite = newT2Args(F, "ite");
          term_schema * eq = newT2Args(F, "==");
          eq->initT2Args(new_term(V, "pv"), temp_fields[i]);
          ite->initT2Args(eq, new_term(D, "True"));
          term_schema * temp = new_term(APP);
          temp->insert(ite); temp->insert(arg2);
          arg2 = temp;
          i--;
      }
      $$->insert(arg2); }




Uses initT2Args 14b, insert 11d, new_term 17b, newT2Args 14a, temp_fields 223d, and term_schema 9.
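As a rough picture of what the set sugar expands to, the following sketch renders the resulting abstraction as text. It is an illustration under simplifying assumptions: terms are plain strings, the nested applications of ite are printed in a flat three-argument form, and desugar_set is a hypothetical name.

```cpp
#include <string>
#include <vector>

// Render the characteristic function that an extensional set {t1, ..., tn}
// expands to. The last element is wrapped first, so the first element ends
// up as the outermost test, as in the parser action.
std::string desugar_set(const std::vector<std::string>& elems) {
    std::string body = "False";                       // the empty set
    for (auto it = elems.rbegin(); it != elems.rend(); ++it)
        body = "(ite (== pv " + *it + ") True " + body + ")";
    return "\\pv." + body;
}
```

For the set with elements a and b this gives \pv.(ite (== pv a) True (ite (== pv b) True False)), and for the empty set simply \pv.False.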

Comment 6.1.40. In the good tradition of functional programming we provide syntactic sugar for lists as well.

225b 〈term schema::lists 225b〉≡ (222)

'[' ']'
| '[' term_schemas_product ']'
    { int tsize = temp_fields.size();
      term_schema * tail = newT2Args(D, "#");
      tail->initT2Args(temp_fields[tsize-1], new_term(D, "[]"));
      int i = tsize - 2;
      while (temp_fields[i] != NULL) {
          term_schema * current = newT2Args(D, "#");
          current->initT2Args(temp_fields[i], tail);
          tail = current;
          i--;
      }
      $$ = tail; }




Uses initT2Args 14b, new_term 17b, newT2Args 14a, temp_fields 223d, and term_schema 9.
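The list sugar builds a right-nested chain of # cells ending in []. A minimal sketch of that shape, with terms modelled as strings and desugar_list as a hypothetical name:

```cpp
#include <string>
#include <vector>

// Render [t1, ..., tn] as (# t1 (# t2 ... (# tn []))). The chain is built
// from the last element backwards, mirroring the loop in the parser action.
std::string desugar_list(const std::vector<std::string>& elems) {
    std::string tail = "[]";
    for (auto it = elems.rbegin(); it != elems.rend(); ++it)
        tail = "(# " + *it + " " + tail + ")";
    return tail;
}
```

Building from the back means each new cell only needs the element and the chain constructed so far, so no second pass over the elements is required.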


6.1.7 Types

Comment 6.1.41. We now look at the parsing of types.

226a 〈inputproc.y::union members 218c〉+≡ (206) ⊳ 223c

type * c_type;

226b 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 223b 227c ⊲

%token BOOL INT FLOAT STRING

%type <c_type> type

Defines:BOOL, used in chunks 210b and 227a.FLOAT, used in chunks 210b and 227a.INT, used in chunk 227a.STRING, used in chunk 227a.

226c 〈inputproc.l::specific tokens 210d〉+≡ (229) ⊳ 221

Bool 〈lex:tpos 230c〉 return BOOL;

Int 〈lex:tpos 230c〉 return INT;

Float 〈lex:tpos 230c〉 return FLOAT;

String 〈lex:tpos 230c〉 return STRING;

Defines:BOOL, used in chunks 210b and 227a.FLOAT, used in chunks 210b and 227a.INT, used in chunk 227a.STRING, used in chunk 227a.

Comment 6.1.42. A type-tuple object is created in productType’s base case. For the recursive case, data types are added to the tuple object. There is one complication. Note that a type may contain within it a few subtypes that are themselves product types. The classical way to handle this is of course to maintain a stack of type-tuple objects as they are created, where the current tuple object is always on the top of the stack. On the successful parsing of each product type, the corresponding tuple object is popped off the stack. The following variable serves this purpose.

226d 〈inputproc.y::variables 209c〉+≡ (206) ⊳ 223d

stack<type *> tempTuples;

Defines:tempTuples, used in chunks 227 and 228a.
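The stack discipline can be tried out without a parser generator. The sketch below is a simplification: it opens a new tuple on every opening bracket, whereas the yacc actions create the tuple when the base case of a product is reduced. Tuple, render and parse_product are hypothetical names, and types are modelled as strings.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>
#include <string>
#include <vector>

// A tuple under construction: just the rendered component types.
typedef std::vector<std::string> Tuple;

std::string render(const Tuple& t) {
    std::string s = "(";
    for (std::size_t i = 0; i < t.size(); ++i) {
        if (i) s += " * ";
        s += t[i];
    }
    return s + ")";
}

// Process a token stream such as ( Int * ( Bool * Float ) * String ).
// "(" opens a new tuple on the stack, a type name is added to the tuple on
// top, and ")" pops the finished tuple and adds it to the enclosing one.
std::string parse_product(const std::vector<std::string>& toks) {
    std::stack<Tuple> tuples;
    std::string result;
    for (const std::string& tok : toks) {
        if (tok == "(") {
            tuples.push(Tuple());
        } else if (tok == ")") {
            assert(!tuples.empty());
            std::string done = render(tuples.top());
            tuples.pop();
            if (tuples.empty()) result = done;
            else tuples.top().push_back(done);
        } else if (tok != "*") {
            assert(!tuples.empty());
            tuples.top().push_back(tok);
        }
    }
    return result;
}
```

The point of the stack is visible in the nested case: while the inner product is being read, the partially built outer tuple waits underneath it and is resumed once the inner one is popped.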

Comment 6.1.43. A type is one of the following:

• a type constructor T applied to n data types, where n is the arity of T ;

• a function type constructed from two types using →;

• a product type constructed from two types using ×.

Here, we call the first two standalone types. Note that only two kinds of function types are supported in the system, sets and multisets; the syntax for them actually looks very similar to that of the type constructor. Of course, standalone types may contain data types within them, including product types.

A type object is created for each type statement. These are then inserted into the global data area using the function insert-type. The type objects are identified by their names, and are retrievable using the get-type function defined globally. The type objects serve several purposes. For one, they are responsible for parsing the training examples. (See §6.1.2 for details on this.) They may also be needed for type-checking purposes. For example, a transformation object can query the type object to make sure the transformation can be safely executed on the training individuals. The latter function is not implemented in the current version.


It is worth spending a minute to look at the actions needed for the different cases. For the standalone types, we simply return the type pointer. There is however one complication. Notice that the grammar is structured in such a way that a standalone type is parsed as a tuple with a single element. We have to detect this scenario and do the appropriate thing.

Again, we do not allow more than one type to be given the same name. See the comments on similar constraints in the data declaration section.

227a 〈inputproc.y::types 227a〉≡ (206) 227b ⊲

type : IDENTIFIER1 { string tname($1); $$ = new type_parameter(tname); }
     | BOOL { $$ = new type("Bool"); }
     | INT { $$ = new type("Int"); }
     | FLOAT { $$ = new type("Float"); }
     | STRING { $$ = new type("String"); }
     | IDENTIFIER2
         { string tname($1);
           pair<int,type *> p = get_type(tname);
           if (p.second == NULL) $$ = new type_udefined(tname);
           else {
               if (p.first == UDEFINED) $$ = p.second->clone();
               else $$ = new type_synonym(tname, p.second->clone());
           } }
     | '(' IDENTIFIER2 types ')'
         { string tname($2);
           type_tuple * rem = dcast<type_tuple *>(tempTuples.top());
           tempTuples.pop();
           $$ = new type_alg(tname, rem);
           delete_type(rem); }
     | '(' products ')' { $$ = tempTuples.top(); tempTuples.pop(); }
     | arrow { $$ = $1; }
     | '(' type ')' { $$ = $2; }
     ;

Uses BOOL 226b 226c, clone 19a 19b, delete_type 77b 77c, FLOAT 226b 226c, get_type 245c, IDENTIFIER1 229,IDENTIFIER2 229, INT 226b 226c, STRING 226b 226c, tempTuples 226d, type_alg 84c, type_parameter 79c,type_synonym 81c, type_tuple 81d, type_udefined 84b, and UDEFINED 241.

227b 〈inputproc.y::types 227a〉+≡ (206) ⊳ 227a 227d ⊲

products : products '*' type { tempTuples.top()->addAlpha($3); }
         | type '*' type
             { tempTuples.push(new type_tuple);
               tempTuples.top()->addAlpha($1);
               tempTuples.top()->addAlpha($3); }
         ;

Uses tempTuples 226d and type_tuple 81d.

227c 〈inputproc.y::more preambles 209b〉+≡ (206) ⊳ 226b

%right ARROW

%type <c_type> arrow

Uses ARROW 210d.

227d 〈inputproc.y::types 227a〉+≡ (206) ⊳ 227b 228a ⊲

arrow : type ARROW type $$ = new type_abstraction($1, $3);

;

Uses ARROW 210d and type_abstraction 82c.


228a 〈inputproc.y::types 227a〉+≡ (206) ⊳ 227d

types : type
          { tempTuples.push(new type_tuple); tempTuples.top()->addAlpha($1); }
      | types type
          { tempTuples.top()->addAlpha($2); }
      ;

Uses tempTuples 226d and type_tuple 81d.

228b 〈primary::error checking 228b〉≡if (p.second == NULL)

string errmsg = "Error parsing type. " + tname + " undefined.\n";


assert(p.second);

Uses yyerror 206.


6.2 The Scanner

Comment 6.2.1. The following is the scanner. This file will be processed by flex and provides the yylex function that will be used by the parser for reading tokens. (For details on lex and yacc, see, for example, [LMB92, Joh79].)

229 〈inputproc.l 229〉≡
%{
#include <iostream>
#include <stack>
#include <string.h>
#include "predicate.h"
#include "alkemy.h"
#include "y.tab.h"

int mylineno = 1;
int seccount = 0;
char linebuf[2000];
int tokenpos = 0;
%}

%s DATADECLARATION

%s TRAINSET

%s TRANSFORMATIONS

%s REWRITE

%%

[\t ]+ 〈lex:tpos 230c〉

\-\-.* 〈lex:tpos 230c〉

\n.* 〈lex error reporting hackery 230b〉

\%\% { 〈lex:tpos 230c〉 seccount++;
       switch (seccount) {
       case 1: BEGIN DATADECLARATION; break;
       case 2: BEGIN TRAINSET; break;
       case 3: BEGIN TRANSFORMATIONS; break;
       case 4: BEGIN REWRITE; break;
       }
       return SECTION; }

〈inputproc.l::specific tokens 210d〉
[a-zA-Z\/0-9\_\.]+\.es 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return FILENAME;

[m-z][0-9]* 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return VARIABLE;

pv(e|t)[0-9]* 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return VARIABLE;

[a-zA-Z][0-9]*\_SV 〈lex:tpos 230c〉〈lex:copy yytext 230a〉;return SYNTACTIC_VARIABLE;

-?[0-9]+ 〈lex:tpos 230c〉yylval.numint = atoi(yytext);

return DATA_CONSTRUCTOR_INT;

-?[0-9]+\.[0-9]+ 〈lex:tpos 230c〉yylval.num = atof(yytext);

return DATA_CONSTRUCTOR_FLOAT;

\"[^"]*\" 〈lex:tpos 230c〉〈lex:copy yytext 230a〉


return DATA_CONSTRUCTOR_STRING;

[a-z][a-zA-Z0-9\_\’]* 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return IDENTIFIER1;

[A-Z][a-zA-Z0-9\_\’]* 〈lex:tpos 230c〉〈lex:copy yytext 230a〉; return IDENTIFIER2;

. 〈lex:tpos 230c〉 return yytext[0];

%%

#define YY_NO_UNPUT 1

〈facilities for handling multiple input files 231〉Defines:

DATA_CONSTRUCOR_INT, never used.DATA_CONSTRUCTOR_FLOAT, used in chunks 212a, 220e, and 222.DATA_CONSTRUCTOR_STRING, used in chunks 220e and 222.FILENAME, used in chunks 212a, 213c, and 215a.IDENTIFIER1, used in chunks 210a, 212a, 215b, 217c, 219a, 220e, 222, and 227a.IDENTIFIER2, used in chunks 209, 210, 212b, 220e, 222, and 227a.linebuf, used in chunks 206 and 230b.mylineno, used in chunks 206, 230b, and 231.seccount, used in chunk 206.SYNTACTIC_VARIABLE, used in chunks 220e, 222, and 223a.tokenpos, used in chunks 206 and 230.VARIABLE, used in chunks 220e and 222.

Uses alkemy.h 157 and predicate.h 130c.

230a 〈lex:copy yytext 230a〉≡ (221 229)

yylval.name = strdup(yytext);

Comment 6.2.2. I learned this trick for achieving better error recovery from [LMB92, p. 246]. The regular expression \n.* matches a newline and the next line, which is saved in linebuf before being returned to the scanner by yyless. The variable tokenpos remembers the current position on the current line.

230b 〈lex error reporting hackery 230b〉≡ (229)

{ if (strlen(yytext+1) < sizeof(linebuf))  // leave room for the terminating NUL
      strcpy(linebuf, yytext+1);
  yyless(1);
  tokenpos = 0; mylineno++; }

Uses linebuf 229, mylineno 229, and tokenpos 229.

230c 〈lex:tpos 230c〉≡ (210d 213b 217b 218b 221 226c 229)

tokenpos += yyleng;

Uses tokenpos 229.
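The saved line and the running position are what an error reporter needs to point at the offending token. The actual yyerror lives in chunk 206 and is not reproduced here; the following is only a sketch of one way such data could be used, with caret_message as a hypothetical helper and the column convention (tokenpos counts characters consumed so far on the line) taken as an assumption.

```cpp
#include <cstddef>
#include <string>

// Build a two-line diagnostic: the source line followed by a caret under the
// last character consumed, given the number of characters read so far.
std::string caret_message(const std::string& line, int tokenpos) {
    std::size_t col = tokenpos > 0 ? static_cast<std::size_t>(tokenpos - 1) : 0;
    return line + "\n" + std::string(col, ' ') + "^";
}
```

Called with the contents of linebuf and the current tokenpos, this yields the familiar compiler-style message with a marker beneath the input line.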


Comment 6.2.3. Escher allows nested import statements in program files. Unfortunately, we cannot simply switch input files every time we see an import statement to read from the correct file because flex scanners do a lot of buffering. That is to say, the next token comes from the buffer, not the file yyin.

The solution provided by flex is a mechanism to create and switch between input buffers, and this is what we use here. A stack of input buffers is used to handle multiply nested import statements. Every time we see an import statement, we call switchBuffer to push the current buffer onto the stack, and then create a new buffer and switch to it. When we are done with the current buffer, the scanner will call yywrap to delete the existing buffer and then revert to the previous buffer stored on top of the stack.

For more details on flex, see [Pax95].

231 〈facilities for handling multiple input files 231〉≡ (229)

stack<YY_BUFFER_STATE> import_stack;
stack<int> lineno_stack;

void switchBuffer(FILE * in) {
    YY_BUFFER_STATE current = YY_CURRENT_BUFFER;
    import_stack.push(current);
    lineno_stack.push(mylineno);
    // cout << "Switching to new file.\n";
    YY_BUFFER_STATE newf = yy_create_buffer(in, YY_BUF_SIZE);
    yy_switch_to_buffer(newf);
    mylineno = 1;
}

int yywrap() {
    cerr << "done\n";
    YY_BUFFER_STATE current = YY_CURRENT_BUFFER;
    yy_delete_buffer(current);
    if (import_stack.size()) {
        yy_switch_to_buffer(import_stack.top()); import_stack.pop();
        mylineno = lineno_stack.top(); lineno_stack.pop();
        return 0;
    }
    return 1;
}

Defines:import_stack, never used.lineno_stack, never used.switchBuffer, used in chunks 206, 212a, and 215a.yywrap, never used.

Uses mylineno 229.
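The suspend-and-resume behaviour of the buffer stack can be simulated without flex. In the sketch below, files are modelled as in-memory vectors of lines and an import is any line starting with the word import followed by a space; Files and scan_with_imports are hypothetical names, and the import syntax is an assumption made only for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <stack>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, std::vector<std::string> > Files;

// Return the non-import lines in the order a scanner would see them. Each
// stack entry remembers a suspended file and the index of its next line,
// mirroring the saved buffer and the saved line number.
std::vector<std::string> scan_with_imports(const Files& files, const std::string& start) {
    std::vector<std::string> seen;
    std::stack<std::pair<std::string, std::size_t> > suspended;
    std::string current = start;
    std::size_t lineno = 0;
    for (;;) {
        const std::vector<std::string>& lines = files.at(current);
        if (lineno == lines.size()) {              // end of file: the yywrap step
            if (suspended.empty()) break;
            current = suspended.top().first;
            lineno = suspended.top().second;
            suspended.pop();
            continue;
        }
        const std::string& line = lines[lineno++];
        if (line.compare(0, 7, "import ") == 0) {  // the switchBuffer step
            suspended.push(std::make_pair(current, lineno));
            current = line.substr(7);
            lineno = 0;
        } else {
            seen.push_back(line);
        }
    }
    return seen;
}
```

Like the scanner, this sketch does not guard against cyclic imports; a file that imports itself would keep the loop running forever.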

Chapter 7

Administration Overhead

7.1 File Structure

Comment 7.1.1. We import the global module of Escher here as well. The actual content of the Escher global module is given below.

232 〈global.h 232〉≡
#ifndef _GLOBAL_H_
#define _GLOBAL_H_

#include <string>
#include <fstream>
#include <vector>
#include "terms.h"
#include "types.h"

#include "io.h"
#include <math.h>
// approximate floating-point equality; arguments are parenthesised so that
// expressions such as feq(a, b - c) expand correctly
#define feq(x,y) (fabs((x) - (y)) < 0.000001)

〈global.h::declarations 234〉

#define ALKERROR -555

#include "escher-global.h"

extern void cleanup_typeobjs();

#endif

Defines:ALKERROR, used in chunks 136c and 212d.feq, used in chunks 156a, 171b, 172a, 179b, and 186.global.h, used in chunks 2, 29c, 59c, 80a, 95c, 126a, 131, 133, 206, 233, and 242.

Uses cleanup_typeobjs 238b, io.h 246, terms.h 9, types.h 76a, and unification.h 85b.
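The feq macro compares floating-point values up to a fixed tolerance. As a minimal sketch, the same test can be written as an inline function, which evaluates each argument exactly once and is immune to operator-precedence surprises inside the arguments. The names feq_fn and eps are hypothetical and not part of Alkemy.

```cpp
#include <cmath>

// Function form of the tolerance test used by feq. The default tolerance
// matches the constant in the macro.
inline bool feq_fn(double x, double y, double eps = 0.000001) {
    return std::fabs(x - y) < eps;
}
```

A typical use is comparing accumulated values such as feq_fn(0.1 + 0.2, 0.3), where exact equality would fail because of rounding.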


233 〈global.cc 233〉≡
#include "global.h"
#include <cassert>
using namespace std;

#include "escher-global.cc"

〈global.cc::variables 235〉
〈transformation info table and functions 236d〉
〈Type Name to Type Object Store and Retrieval Facility 238b〉

Uses global.h 232.


7.2 Learning Options

Comment 7.2.1. These are the parameters that can affect the behaviour of the learner.

• verbosity - This sets the level of detail with which the learner reports its progress.

• strategy - This sets the search strategy.

• i_prune, prune - This sets the prune parameter. Because the prune parameter can get updated during learning, we want a record of the initial user-selected prune value.

• stump - This specifies that we learn a decision stump.

• cutout - This sets the cutout parameter.

• ollength - This sets the maximum size of the open list.

• balance - This parameter can be used to produce (better) balanced trees.

• crossvalidate - Do an n-fold cross-validation.

• test-percentage - This is used to specify the percentage of the data set to be set aside for use as the test set.

• exp-count - Repeat the experiment n times.

• seed - This is used to seed the random number generator for use in the partitioning function.

• recursive - This parameter determines whether recursive rewrites can happen. In normal cases, we do not want recursive rewrites of the form ∧2(∧2(top top) top) because allowing that will produce an infinite search space.

• FAPtable-length, FAPtable-entry-length - These are parameters associated with the frequently-accessed predicates table. The first parameter specifies the maximum size of the table. The second parameter specifies the maximum size of each predicate entry in the table.

• postprune - Do tree post-pruning.

• valid - This sets the percentage of training set to use as validation set.

• enumSpace - Compute the size of the total search space.

• boostN - Learn using boosting.

• filter - This can be used to filter out unpromising rewrites.

• purity - This specifies the maximum impurity allowable on the right subtree.

234 〈global.h::declarations 234〉≡ (232) 236a ⊲

#define LR 0

#define EXPECTED 1

#define CLASSIFICATION 0

#define REGRESSION 1

struct options_t {
    int verbosity;
    int strategy;
    float prune, i_prune;
    bool stump;
    int cutout;
    int ollength;
    int balance;
    int crossvalidate;
    int foldnumber;
    int test_percentage;
    unsigned int exp_count;
    unsigned int seed;
    bool recursive;
    int FAPtable_length;
    int FAPtable_entry_length;

    bool postprune;
    int valid;
    bool enumSpace;
    int boostN;
    int filter;
    int purity;
    bool one_redex;
    int learning_mode;
    bool decision_list;
    int margin;
    bool pos_only;
    bool typeCheck;
};
extern options_t options;

#define REGRESSION_MODE options.learning_mode==REGRESSION

#define CLASSIFICATION_MODE options.learning_mode==CLASSIFICATION

#define DLIST_MODE options.decision_list

#include <string>
extern string commandline;
extern string functionname;
extern string topleveltype;

Defines:CLASSIFICATION, used in chunks 4a, 132, and 142a.CLASSIFICATION_MODE, used in chunks 146, 147, 150b, 152–54, 156, 162c, 163a, 165, 166, 171–74, 176b, 181a,

182a, 184a, 186, 196a, 198b, 200, 202b, 204, and 205.commandline, used in chunks 2 and 165a.DLIST_MODE, used in chunks 147c, 152a, 156, 165a, 174b, 176b, 181a, 182a, and 184a.EXPECTED, used in chunks 121a and 127a.functionname, used in chunks 157, 166b, and 210a.LR, used in chunks 4a, 7a, 117a, 121a, 165a, and 183d.options, used in chunks 4–7, 117a, 121, 122b, 124, 125b, 127a, 132, 142a, 159–61, 164a, 165a, 171–74, 177–80,

182–84, 186, 193a, 207a, 210b, 214a, and 220c.REGRESSION, used in chunk 210b.REGRESSION_MODE, used in chunks 146c, 147a, 150b, 153a, and 172a.topleveltype, used in chunks 130a, 173a, 210a, and 214b.

235 〈global.cc::variables 235〉≡ (233) 236b ⊲

options_t options;
string commandline;
string functionname;
string topleveltype;

Defines:commandline, used in chunks 2 and 165a.functionname, used in chunks 157, 166b, and 210a.options, used in chunks 4–7, 117a, 121, 122b, 124, 125b, 127a, 132, 142a, 159–61, 164a, 165a, 171–74, 177–80,

182–84, 186, 193a, 207a, 210b, 214a, and 220c.topleveltype, used in chunks 130a, 173a, 210a, and 214b.


Comment 7.2.2. This structure records all the vital information about a transformation. The purpose of each member field is as follows: name is the human-readable name of the transformation; tnum is the integer identifier of the transformation; argcount is the rank of the transformation; symmetricity records whether the transformation is symmetric; ttype points to the type object of the transformation.

236a 〈global.h::declarations 234〉+≡ (232) ⊳ 234 236c ⊲

#include "terms.h"

struct trans_info_t {
    string name;
    int tnum;
    int argcount;
    bool symmetricity;
    type * ttype;
    void freememory() { delete_type(ttype); }
};

Defines:trans_info_t, used in chunks 216b, 217c, 219a, 236, and 237a.

Uses delete_type 77b 77c, freememory 19a 19c, and terms.h 9.

236b 〈global.cc::variables 235〉+≡ (233) ⊳ 235

vector<trans_info_t> trans_table;

Defines:trans_table, used in chunks 236–38.

Uses trans_info_t 236a.

236c 〈global.h::declarations 234〉+≡ (232) ⊳ 236a 239 ⊲

extern void insert_trans_info(trans_info_t tinfo);
extern trans_info_t * find_trans_info(string trans_name);
extern trans_info_t * find_trans_info(int tnum);
extern void cleanup_trans_table();
extern void print_trans_table();
extern unsigned int trans_table_size();
extern int getTopID();

Uses cleanup_trans_table 237b, find_trans_info 237a, getTopID 238a, insert_trans_info 236d,print_trans_table 238a, trans_info_t 236a, and trans_table_size 238a.

236d 〈transformation info table and functions 236d〉≡ (233) 237a ⊲

void insert_trans_info(trans_info_t tinfo) { trans_table.push_back(tinfo); }

Defines:insert_trans_info, used in chunks 215b and 236c.

Uses trans_info_t 236a and trans_table 236b.


Comment 7.2.3. This function is used extensively during run-time. It may be worthwhile spending some time to optimise it. A simple first step is to use a set (typically implemented as a red-black tree) instead of a vector.

237a 〈transformation info table and functions 236d〉+≡ (233) ⊳ 236d 237b ⊲

trans_info_t * find_trans_info(string trans_name) {
    int tablesize = trans_table.size();
    for (int i=0; i!=tablesize; i++)
        if (trans_name == trans_table[i].name)
            return &(trans_table[i]);
    setSelector(STDERR);
    ioprint("Error: Transformation "); ioprint(trans_name);
    ioprintln(" previously undefined.\n");
    assert(false);
    return NULL;
}

trans_info_t * find_trans_info(int tnum) {
    if (tnum >= (int)trans_table.size()) {
        cerr << "Error. tnum = " << tnum << " >= trans_table.size()\n";
        exit(1);
    }
    return &(trans_table[tnum]);
}

Defines:find_trans_info, used in chunks 102, 103b, 107a, 113c, 129, 217c, 219a, and 236c.

Uses ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, STDERR 246, trans_info_t 236a,and trans_table 236b.
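One possible shape for the optimisation suggested in Comment 7.2.3 is to keep the vector for lookups by tnum and add an ordered map from names to positions for lookups by name. The sketch below uses a trimmed-down, hypothetical record type (trans_rec) and class (trans_index); it is not the Alkemy implementation.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, trimmed-down record; the real trans_info_t has more fields.
struct trans_rec { std::string name; int tnum; };

class trans_index {
    std::vector<trans_rec> table_;               // lookup by position stays O(1)
    std::map<std::string, std::size_t> by_name_; // name -> position in table_
public:
    void insert(const trans_rec& r) {
        by_name_[r.name] = table_.size();
        table_.push_back(r);
    }
    // Logarithmic-time lookup; returns nullptr when the name is unknown.
    const trans_rec* find(const std::string& name) const {
        std::map<std::string, std::size_t>::const_iterator it = by_name_.find(name);
        return it == by_name_.end() ? nullptr : &table_[it->second];
    }
};
```

The map stores positions rather than pointers so that it stays valid when the vector reallocates. One behavioural difference from a linear search is worth noting: if a name is inserted twice, the map keeps the latest entry, whereas scanning from the front returns the first.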

237b 〈transformation info table and functions 236d〉+≡ (233) ⊳ 237a 238a ⊲

void cleanup_trans_table() {
    cerr << "Cleaning up the transformation info table......";
    for (unsigned int i=0; i!=trans_table.size(); i++)
        trans_table[i].freememory();
    cerr << "Done.\n";
}

Defines:cleanup_trans_table, used in chunks 8 and 236c.

Uses freememory 19a 19c and trans_table 236b.


238a 〈transformation info table and functions 236d〉+≡ (233) ⊳ 237b

unsigned int trans_table_size() { return trans_table.size(); }

int getTopID() {
    if (trans_table.back().name != "top") {
        setSelector(STDERR);
        ioprint("*** Error. The transformation top must be declared ");
        ioprintln(" last in the spec file.");
        exit(1);
    }
    return trans_table.back().tnum;
}

void print_trans_table() {
    for (unsigned int i=0; i!=trans_table.size(); i++) {
        cout << trans_table[i].name << " " << trans_table[i].tnum << " ";
        assert(trans_table[i].ttype);
        cout << trans_table[i].ttype->getName() << " "
             << trans_table[i].argcount << endl;
    }
}

Defines:getTopID, used in chunks 102a, 122a, 130a, and 236c.print_trans_table, used in chunk 236c.trans_table_size, used in chunks 112, 126a, and 236c.

Uses ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, STDERR 246, and trans_table 236b.

238b 〈Type Name to Type Object Store and Retrieval Facility 238b〉≡ (233)

void cleanup_typeobjs() {
    cerr << "Cleaning up the type objects......";
    map<string, pair<int,type *> >::iterator p = type_fac.begin();
    while (p != type_fac.end()) { delete_type(p->second.second); p++; }
    for (uint i=0; i!=constants.size(); i++)
        delete_type(constants[i].signature);
    constants.clear();
    cerr << "Done.\n";
}

Defines:

cleanup_typeobjs, used in chunks 8 and 232.Uses clear 145b, delete_type 77b 77c, and type_fac 245c.


Comment 7.2.4. We now look at the implementation of arbitrary precision integers. We use the GNU multiple precision arithmetic library [GMP] to provide support for massive integers. A possible alternative is NTL [NTL], which is apparently much more user-friendly. The problem is that it does not come with most standard Linux distributions.

239 〈global.h::declarations 234〉+≡ (232) ⊳ 236c

#ifndef NO_GMP
#include <gmp.h>
struct bigint {
    mpz_t num;
    bigint() { mpz_init(num); }
    bigint(int x) { mpz_init(num); long int y = x; mpz_set_si(num, y); }
    bigint(unsigned int x)
        { mpz_init(num); unsigned long int y = x; mpz_set_ui(num, y); }
    bigint & operator=(const int & op)
        { long int x = op; mpz_set_si(num, x); return *this; }
    bigint & operator=(const unsigned int & op)
        { unsigned long int x = op; mpz_set_ui(num, x); return *this; }
    bigint & operator=(const unsigned long int & op)
        { mpz_set_ui(num, op); return *this; }
    bigint & operator=(const signed long int & op)
        { mpz_set_si(num, op); return *this; }
    bigint & operator=(const bigint &op)
        { mpz_set(num, op.num); return *this; }
    bigint operator+(bigint &op)
        { bigint ret; mpz_add(ret.num, num, op.num); return ret; }
    bigint operator-(bigint &op)
        { bigint ret; mpz_sub(ret.num, num, op.num); return ret; }
    bigint operator-(unsigned int op) {
        bigint ret; unsigned long int temp = op;
        mpz_sub_ui(ret.num, num, temp);
        return ret;
    }
    bigint operator*(bigint &op)
        { bigint ret; mpz_mul(ret.num, num, op.num); return ret; }
    bigint operator*(int op) {
        bigint ret; long int temp = op;
        mpz_mul_si(ret.num, num, temp);
        return ret;
    }
    bigint operator*(unsigned int op) {
        bigint ret; unsigned long int temp = op;
        mpz_mul_ui(ret.num, num, temp);
        return ret;
    }
    bigint operator/(bigint &op)
        { bigint ret; mpz_div(ret.num, num, op.num); return ret; }
    bool operator==(bigint const &other) const
        { return (mpz_cmp(num, other.num) == 0); }
    bool operator==(long int const &other) const
        { return (mpz_cmp_si(num, other) == 0); }
    bool operator!=(bigint const &other) const
        { return (mpz_cmp(num, other.num) != 0); }
    bool operator!=(int const &other) const {
        long int temp = other;
        return (mpz_cmp_si(num, temp) != 0);
    }
    bool operator>=(bigint const &other) const
        { return (mpz_cmp(num, other.num) >= 0); }
    bool operator<(bigint const &other) const
        { return (mpz_cmp(num, other.num) < 0); }
    void print() { gmp_printf("%Zd", num); }
    void freememory() { mpz_clear(num); }
};
#endif

Defines:bigint, used in chunks 123c, 126–31, and 185a.

Uses freememory 19a 19c.


Comment 7.2.5. The Escher global module starts here. The file global.h is renamed escher-global.h and included in global.h above. In a similar way, the file global.cc is renamed escher-global.cc and included in global.cc above.

241 〈global.h 241〉≡
#ifndef _ESCHER_GLOBAL_H_
#define _ESCHER_GLOBAL_H_

#include <vector>
#include <string>
#include "terms.h"
#include "types.h"

// this is used to record side conditions on types for statements
struct type_condition {
    int sterm;
    type * dtype;
    type_condition(int t, type * d) { sterm = t; dtype = d; }
    void freememory() { delete_type(dtype); }
};

// these are the escher statements
struct statementType {
    term_schema * stmt;
    int numargs;
    string anchor;
    bool typechecked;
    bool eager;
    type_condition * tycond;
    statementType() { typechecked = false; eager = false; tycond = NULL; }
};

extern vector<statementType> statements;
extern vector<vector<term_type> > stat_term_types;

extern void initialise_constants();
extern void insert_constant(string name, type * sig);
extern type * get_signature(string name);

extern int ltime;
extern int verbose;

extern bool typeCheckStatements();
extern void cleanup_statements();

extern void insert_type(string & tname, int x, type * tp);
extern pair<int, type *> get_type(string tname);

#define UDEFINED 0
#define SYNONYM 1

#include <sstream>
inline std::string numtostring(const int i) {
    std::stringstream s; s << i; return s.str();
}

bool inVector(string x, vector<string> & v);

#endif

Defines:inVector, used in chunk 28c.numtostring, used in chunks 29c, 80b, and 209f.statementType, used in chunks 216c and 242.SYNONYM, used in chunk 209d.type_condition, never used.UDEFINED, used in chunks 208a and 227a.

Uses cleanup_statements 243b, delete_type 77b 77c, freememory 19a 19c, get_signature 245b, get_type 245c,initialise_constants 243c, insert_constant 245a, insert_type 245c, ltime 242, stat_term_types 242,statements 242, term_schema 9, term_type 85b, terms.h 9, typeCheckStatements 243a, types.h 76a,unification.h 85b, and verbose 242.

Comment 7.2.6. The variable ltime records the total number of computation steps taken to simplify the query. Statements in the input Escher program are stored in a vector. Each statement is stored in a structure called statementType. The fields numargs and anchor are used to pick out unsuitable statements during pattern matching. (See Comment 2.2.73 for more details.)

242 〈global.cc 242〉≡
#include "global.h"

vector<statementType> statements;
vector<vector<term_type> > stat_term_types;
int ltime = 0;
int verbose = 0;

#include <stdlib.h>
#include <cassert>
#include <set>
#include <vector>
#include <string>
using namespace std;

〈statements and type checking 243a〉
〈constants and their signatures 243c〉
〈type name to type objects mapping 245c〉

bool inVector(string x, vector<string> & v) {
    int size = v.size();
    if (size == 0) return false;
    for (int i=0; i!=size; i++)
        if (v[i] == x) return true;
    return false;
}

Defines:inVector, used in chunk 28c.ltime, used in chunks 61b, 63b, and 241.stat_term_types, used in chunks 62b, 241, and 243a.statements, used in chunks 60, 62, 213d, 214a, 216c, 241, and 243.verbose, used in chunks 7a, 61b, 63, 64a, 72, and 241.

Uses global.h 232, statementType 241, and term_type 85b.
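The linear scan in inVector can also be expressed with the standard library. The following is a minimal sketch, not part of the Alkemy sources: the name inVectorStd is hypothetical, and it takes its arguments by const reference, which the original signature does not.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical alternative to inVector: true iff x occurs in v.
// std::find on an empty range returns v.end(), so no separate size check is needed.
bool inVectorStd(const std::string& x, const std::vector<std::string>& v) {
    return std::find(v.begin(), v.end(), x) != v.end();
}
```

Under these assumptions, inVectorStd("b", v) is true exactly when some element of v compares equal to "b".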


243a 〈statements and type checking 243a〉≡ (242) 243b ⊲

bool typeCheckStatements() {
    int size = statements.size();
    for (int i=0; i!=size; i++) {
        if (statements[i].typechecked) continue;
        pair<type *, vector<term_type> > res = mywellTyped(statements[i].stmt);
        type * t = res.first;
        if (t) {
            delete_type(t);
            statements[i].typechecked = true;
            stat_term_types.push_back(res.second);

            int osel = getSelector();
            int k = stat_term_types.size() - 1;
            if (i != k) cout << "(i,k) = " << i << "," << k << endl;
            assert(i == k);
            osel = getSelector(); setSelector(SILENT);
            for (uint j=0; j!=stat_term_types[k].size(); j++) {
                ioprint((int)j);
                stat_term_types[k][j].first->print();
                ioprint(" : ");
                ioprint(stat_term_types[k][j].second->getName());
                ioprintln();
            }
            setSelector(osel);
        } else return false;
    }
    return true;
}

Defines:
typeCheckStatements, used in chunk 241.

Uses delete_type 77b 77c, getSelector 246 247a, ioprint 246 247a, ioprintln 246 247a, setSelector 246 247a, SILENT 246, stat_term_types 242, statements 242, and term_type 85b.

Comment 7.2.7. Here we release the memory occupied by the statements and the data structures supporting side conditions on them. We do not have to free the term part of stat_term_types because they point to subterms of terms residing in the statements vector.

243b 〈statements and type checking 243a〉+≡ (242) ⊳ 243a

void cleanup_statements() {
    cerr << "Cleaning up statements...";
    for (uint i=0; i!=statements.size(); i++) {
        statements[i].stmt->freememory();
        if (statements[i].tycond) statements[i].tycond->freememory();
    }
    cerr << "Done.\n";
}

Defines:
cleanup_statements, used in chunks 8 and 241.

Uses freememory 19a 19c and statements 242.
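The ownership rule in Comment 7.2.7 (free each statement tree once, and never free through pointers that merely refer to its subterms) can be illustrated in isolation. This is a toy sketch under stated assumptions: Node, owners, views, add_statement and cleanup are hypothetical stand-ins, not the term_schema class or the actual globals.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a term node; `live` counts nodes that have not been freed.
struct Node {
    static int live;
    Node* left;
    Node* right;
    Node(Node* l = NULL, Node* r = NULL) : left(l), right(r) { ++live; }
    ~Node() { --live; }
    // Recursively release this node and everything below it.
    void freememory() {
        if (left) left->freememory();
        if (right) right->freememory();
        delete this;
    }
};
int Node::live = 0;

std::vector<Node*> owners;               // owning: one root per statement
std::vector<std::vector<Node*> > views;  // non-owning: pointers to subterms of the roots

// Build one three-node statement and record pointers to its two subterms.
void add_statement() {
    Node* l = new Node();
    Node* r = new Node();
    owners.push_back(new Node(l, r));
    std::vector<Node*> subterms;
    subterms.push_back(l);
    subterms.push_back(r);
    views.push_back(subterms);
}

// Free through the owners only; the views are just cleared, never deleted through.
void cleanup() {
    for (std::size_t i = 0; i != owners.size(); i++) owners[i]->freememory();
    owners.clear();
    views.clear();
}
```

Freeing through views as well would delete each subterm twice, which is why the sketch only clears that vector.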

Comment 7.2.8. We now describe a facility that supports the storage and retrieval of the declared signatures of constants.


243c 〈constants and their signatures 243c〉≡ (242) 245a ⊲

struct constant_sig { string name; type * signature; };
vector<constant_sig> constants;

void initialise_constants() {
    constant_sig temp;
    temp.name = "True"; temp.signature = new type("Bool");
    constants.push_back(temp);
    temp.name = "False"; temp.signature = new type("Bool");
    constants.push_back(temp);
    type * a = new type_parameter("a");
    type * lista = new type_alg("List"); lista->addAlpha(a);
    temp.name = "[]"; temp.signature = lista;
    constants.push_back(temp);
    temp.name = "#";
    temp.signature =
        new type_abstraction(a->clone(),
            new type_abstraction(lista->clone(), lista->clone()));
    constants.push_back(temp);

    type * number = new type_parameter("number");
    type * algtype =
        new type_abstraction(number,
            new type_abstraction(number->clone(), number->clone()));
    temp.name = "add"; temp.signature = algtype;
    constants.push_back(temp);
    temp.name = "sub"; temp.signature = algtype->clone();
    constants.push_back(temp);
    temp.name = "max"; temp.signature = algtype->clone();
    constants.push_back(temp);
    temp.name = "min"; temp.signature = algtype->clone();
    constants.push_back(temp);
    temp.name = "mul"; temp.signature = algtype->clone();
    constants.push_back(temp);
    temp.name = "div"; temp.signature = algtype->clone();
    constants.push_back(temp);
    temp.name = "mod"; temp.signature = algtype->clone();
    constants.push_back(temp);

    type * reltype =
        new type_abstraction(number->clone(),
            new type_abstraction(number->clone(), new type("Bool")));
    temp.name = ">"; temp.signature = reltype->clone();
    constants.push_back(temp);
    temp.name = ">="; temp.signature = reltype->clone();
    constants.push_back(temp);
    temp.name = "<"; temp.signature = reltype->clone();
    constants.push_back(temp);
    temp.name = "<="; temp.signature = reltype->clone();
    constants.push_back(temp);
}

Defines:
initialise_constants, used in chunks 3a and 241.

Uses clone 19a 19b, type_abstraction 82c, type_alg 84c, and type_parameter 79c.


245a 〈constants and their signatures 243c〉+≡ (242) ⊳ 243c 245b ⊲

void insert_constant(string name, type * sig) {
    for (uint i=0; i!=constants.size(); i++)
        if (constants[i].name == name) {
            cerr << "The constant " << name << " has been "
                 << "defined before.\n";
            assert(false);
        }
    constant_sig temp; temp.name = name; temp.signature = sig;
    constants.push_back(temp);
}

Defines:
insert_constant, used in chunks 208a, 215b, and 241.

245b 〈constants and their signatures 243c〉+≡ (242) ⊳ 245a

type * get_signature(string name) {
    assert(name.size());
    for (uint i=0; i!=constants.size(); i++)
        if (constants[i].name == name)
            return constants[i].signature;
    cerr << "Unknown constant: " << name << endl;
    // assert(false);
    return NULL;
}

Defines:
get_signature, used in chunks 91a and 241.
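The store-and-retrieve facility above scans a vector on every insertion and lookup. As a point of comparison, here is a minimal sketch of the same interface on top of std::map. It is an illustration under assumptions, not the Alkemy implementation: the class SigTable and its members are hypothetical, signatures are plain strings rather than type objects, and a duplicate declaration is reported through the return value instead of an assertion.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical registry of constant signatures, keyed by constant name.
// Signatures are plain strings here; the source stores pointers to type objects.
class SigTable {
    std::map<std::string, std::string> sigs;
public:
    // Returns false and leaves the table unchanged if name is already declared.
    bool insert(const std::string& name, const std::string& sig) {
        return sigs.insert(std::make_pair(name, sig)).second;
    }
    // Returns a pointer to the stored signature, or NULL for an unknown constant.
    const std::string* lookup(const std::string& name) const {
        std::map<std::string, std::string>::const_iterator p = sigs.find(name);
        if (p == sigs.end()) return NULL;
        return &p->second;
    }
};
```

As with get_signature, the caller has to check the result of lookup for NULL before using it. The string notation used for signatures in any example (such as "a -> List a -> List a" for the list constructor "#") is illustrative only.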

Comment 7.2.9. This facility is used to provide mappings from type names to type objects. The initial assignment was performed in the parser.

245c 〈type name to type objects mapping 245c〉≡ (242)

#include <map>
static map<string, pair<int, type *> > type_fac;

void insert_type(string & tname, int x, type * tp) {
    assert(type_fac.find(tname) == type_fac.end());
    pair<int, type *> temp(x, tp);
    type_fac[tname] = temp;
}

pair<int, type *> get_type(string tname) {
    map<string, pair<int,type *> >::iterator p = type_fac.find(tname);
    if (p == type_fac.end()) { pair<int,type *> ret(-5,NULL); return ret; }
    return p->second;
}

Defines:
get_type, used in chunks 130a, 173a, 210e, 214b, 227a, and 241.
insert_type, used in chunks 208a, 209d, and 241.
type_fac, used in chunk 238b.
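Because get_type signals an unknown name with the pair (-5, NULL) rather than an assertion, every caller has to test for that sentinel. The sketch below shows one way such a check might look. It is self-contained and rests on assumptions: Type, registry, register_type, lookup_type, resolve and the constant name NOT_FOUND are all hypothetical, and the integer tag is treated as opaque because its meaning is not stated in this section.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

struct Type { std::string name; };   // toy stand-in for the type class

const int NOT_FOUND = -5;            // the tag get_type pairs with NULL for unknown names
static std::map<std::string, std::pair<int, Type*> > registry;

// Counterpart of insert_type for the toy registry.
void register_type(const std::string& tname, int tag, Type* tp) {
    registry[tname] = std::pair<int, Type*>(tag, tp);
}

// Same lookup convention as get_type: a sentinel pair instead of an error.
std::pair<int, Type*> lookup_type(const std::string& tname) {
    std::map<std::string, std::pair<int, Type*> >::iterator p = registry.find(tname);
    if (p == registry.end())
        return std::pair<int, Type*>(NOT_FOUND, static_cast<Type*>(NULL));
    return p->second;
}

// Hypothetical caller: resolve a name, or report failure without dereferencing NULL.
bool resolve(const std::string& tname, std::string& out) {
    std::pair<int, Type*> r = lookup_type(tname);
    if (r.first == NOT_FOUND || r.second == NULL) return false;
    out = r.second->name;
    return true;
}
```

Checking both components of the pair is a defensive choice in this sketch; the source only guarantees that the two sentinel values appear together.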

7.3 IO Facilities

Comment 7.3.1. Silent printing is a useful trick I learned from [Knu86].


246 〈io.h 246〉≡
#ifndef _IO_H_
#define _IO_H_

#include <string>
#include <iostream>
#include <fstream>
#include <stdio.h>
using namespace std;

#define STDOUT 1
#define STDERR 2
#define SILENT 3

void setSelector(int x);
int getSelector();

void ioprint(string x);
void ioprint(int x);
void ioprint(long int x);
void ioprint(double x);
void ioprint(char x);
void ioprintln(string x);
void ioprintln(int x);
void ioprintln(long int x);
void ioprintln(double x);
void ioprintln(char x);
void ioprintln();

#endif

Defines:
getSelector, used in chunks 15e, 62b, 73d, 74b, 93–96, 166b, and 243a.
io.h, used in chunks 9, 75d, 232, and 247a.
ioprint, used in chunks 15e, 37c, 57a, 61–64, 71–74, 91b, 93–96, 102c, 104c, 136c, 181a, 190a, 206, 214, 217d, 237a, 238a, and 243a.
ioprintln, used in chunks 19d, 37c, 57a, 59a, 61b, 63, 72b, 91b, 93–95, 110b, 136c, 181a, 190a, 206, 214, 217d, 237a, 238a, and 243a.
setSelector, used in chunks 19d, 37c, 57a, 59a, 62b, 63e, 71–73, 91b, 93–96, 110b, 136c, 152a, 166b, 181a, 190a, 206, 214, 217d, 237a, 238a, and 243a.
SILENT, used in chunks 15e, 74b, 91b, 181a, 190a, 243a, and 247.
STDERR, used in chunks 19d, 37c, 57a, 59a, 71–73, 93–96, 110b, 136c, 190a, 206, 214, 217d, 237a, 238a, and 247.
STDOUT, used in chunks 62b, 63e, 91b, 95a, 152a, 166b, 181a, and 247.


247a 〈io.cc 247a〉≡
#include "io.h"

static int selector;

void setSelector(int x) { selector = x; }
int getSelector() { return selector; }

void ioprint(string x) { 〈io::common print command 247b〉 }
void ioprint(int x) { 〈io::common print command 247b〉 }
void ioprint(long int x) { 〈io::common print command 247b〉 }
void ioprint(double x) { 〈io::common print command 247b〉 }
void ioprint(char x) { 〈io::common print command 247b〉 }
void ioprintln(string x) { 〈io::common print command ln 247c〉 }
void ioprintln(int x) { 〈io::common print command ln 247c〉 }
void ioprintln(long int x) { 〈io::common print command ln 247c〉 }
void ioprintln(double x) { 〈io::common print command ln 247c〉 }
void ioprintln(char x) { 〈io::common print command ln 247c〉 }
void ioprintln() {
    if (selector == SILENT) return;
    if (selector == STDOUT) cout << endl;
    else cerr << endl;
}

Defines:
getSelector, used in chunks 15e, 62b, 73d, 74b, 93–96, 166b, and 243a.
ioprint, used in chunks 15e, 37c, 57a, 61–64, 71–74, 91b, 93–96, 102c, 104c, 136c, 181a, 190a, 206, 214, 217d, 237a, 238a, and 243a.
ioprintln, used in chunks 19d, 37c, 57a, 59a, 61b, 63, 72b, 91b, 93–95, 110b, 136c, 181a, 190a, 206, 214, 217d, 237a, 238a, and 243a.
setSelector, used in chunks 19d, 37c, 57a, 59a, 62b, 63e, 71–73, 91b, 93–96, 110b, 136c, 152a, 166b, 181a, 190a, 206, 214, 217d, 237a, 238a, and 243a.

Uses io.h 246, SILENT 246, and STDOUT 246.

247b 〈io::common print command 247b〉≡ (247a)

if (selector == SILENT) return;
if (selector == STDOUT) cout << x;
else if (selector == STDERR) cerr << x;

Uses SILENT 246, STDERR 246, and STDOUT 246.

247c 〈io::common print command ln 247c〉≡ (247a)

if (selector == SILENT) return;
if (selector == STDOUT) cout << x << endl;
else if (selector == STDERR) cerr << x << endl;

Uses SILENT 246, STDERR 246, and STDOUT 246.
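Callers of this facility typically save the current selector, switch to SILENT, and restore the saved value afterwards, as chunk 243a does with osel. The following self-contained sketch shows that pattern wrapped in a scope guard. It is an illustration only: the enumerator names, the template form of ioprint, its extra stream parameters and the class SilentScope are all assumptions that do not appear in the source.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Mirrors the values of STDOUT, STDERR and SILENT under different names.
enum Selector { TO_STDOUT = 1, TO_STDERR = 2, TO_SILENT = 3 };

static int selector = TO_STDOUT;
void setSelector(int x) { selector = x; }
int getSelector() { return selector; }

// One template in place of the per-type overloads. `out` and `err` default to
// the real streams but can be redirected, which the source does not support.
template <typename T>
void ioprint(const T& x, std::ostream& out = std::cout, std::ostream& err = std::cerr) {
    if (selector == TO_SILENT) return;
    if (selector == TO_STDOUT) out << x;
    else if (selector == TO_STDERR) err << x;
}

// Hypothetical scope guard: silence output on entry, restore the old selector on exit.
class SilentScope {
    int saved;
public:
    SilentScope() : saved(getSelector()) { setSelector(TO_SILENT); }
    ~SilentScope() { setSelector(saved); }
};
```

With the guard, the restore step cannot be forgotten on an early return, which is the main risk of the manual save-and-restore sequence.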

Chapter 8

A Listing of the Code Chunks

〈alkemy definitions 173b〉
〈alkemy.cc 158a〉
〈alkemy::data structures 155b〉
〈alkemy::data structures::functions 156a〉
〈alkemy.h 157〉
〈alkemy::learn::Calculate elapsed time 164c〉
〈alkemy::learn::Compute and display decision tree on test set 166a〉
〈alkemy::learn::Compute result on the real test set 166b〉
〈alkemy::learn::Display decision tree on training set 165b〉
〈alkemy::learn::Display learning options 165a〉
〈alkemy::learn::initialise variables 164b〉
〈alkemy::private function declarations 161b〉
〈alkemy::private functions 161c〉
〈alkemy::public functions 158c〉
〈alkemy::static functions 158b〉
〈apply (x,t) to each eqn in eqns, extend eqns and return true 87a〉
〈atomic term 10f〉
〈boost::compute err-m 204a〉
〈boost::output classifier and result on training and test sets 205〉
〈boost::update weight 204b〉
〈buildtree::clean up openlist if interrupted 183e〉
〈buildtree::construct predicate and test it 174a〉
〈buildtree::extend current tree if a better accuracy is obtained 184a〉
〈buildtree::get the rewrites for the current olnode 173c〉
〈buildtree::impose cutout mechanism 181b〉
〈buildtree::initialise the book-keeping data structures 172b〉
〈buildtree::initialise the openlist 173a〉
〈buildtree::initialise the subtrees 184b〉
〈buildtree::insert into openlist if interesting 182a〉
〈buildtree::left most consideration 183d〉
〈buildtree::predicate evaluation via table lookup 174b〉
〈buildtree::predicate evaluation::calculate refinement bound 174c〉
〈buildtree::predicate evaluation::record if better than best so far 176b〉
〈buildtree::print progress 180b〉
〈buildtree::return if nothing can be done 172a〉
〈cannot possibly be a redex 56a〉
〈case of predicate is in FAPtable 192a〉
〈case of predicate not in FAPtable 192b〉


〈command line help menu 7a〉
〈constants and their signatures 243c〉
〈cross validate::perform fold i 159b〉
〈cross validate::print summary 160c〉
〈cross validate::update avg-acc with temp 160b〉
〈debug matching 1 63e〉
〈debug matching 2 63f〉
〈debug matching 3 63g〉
〈debug matching 4 64a〉
〈delete eqns of the form x = x 86d〉
〈distribution function declarations 148〉
〈distribution functions 145b〉
〈error comp pruning::clean up the candidate trees 197b〉
〈error comp pruning::compute best tree::classification 196b〉
〈error comp pruning::compute best tree::regression 197a〉
〈error comp pruning::compute pruned tree sequence 195〉
〈error comp pruning::select best-pruned tree 196a〉
〈error comp pruning::special cases 194d〉
〈error handling::get previously bound 71b〉
〈facilities for handling multiple input files 231〉
〈FAP evaluate::switch to plain algorithm if desirable 193b〉
〈FAP table::store if there is more space 193a〉
〈freememory error checking 19d〉
〈generate equality transformations 208b〉
〈generate projection transformations 209f〉
〈getopt argument 4b〉
〈global.cc 233〉
〈global.cc 242〉
〈global.cc::variables 235〉
〈global.h 232〉
〈global.h 241〉
〈global.h::declarations 234〉
〈GMP sensitive options 6a〉
〈hash map iterator 191a〉
〈if x appears in t, return false 86e〉
〈in table::time calculation 1 193c〉
〈in table::time calculation 2 193d〉
〈initialise-learner::compute training set size 162a〉
〈initialise-learner::initialise best predicate 162d〉
〈initialise-learner::initialise default accuracy 163a〉
〈initialise-learner::initialise distribution count 162c〉
〈initialise-learner::push training instances onto local structure 162b〉
〈inputproc.l 229〉
〈inputproc.l::specific tokens 210d〉
〈inputproc.y 206〉
〈inputproc.y::declaration 207b〉
〈inputproc.y::examples 211b〉
〈inputproc.y::more preambles 209b〉
〈inputproc.y::options 220c〉
〈inputproc.y::prepare trans info 216b〉
〈inputproc.y::rewrites 217c〉
〈inputproc.y::Setup Classes 210e〉
〈inputproc.y::statement schema 216c〉
〈inputproc.y::sv condition 223a〉


〈inputproc.y::term schema 222〉
〈inputproc.y::term schema products 224c〉
〈inputproc.y::term schemas 224a〉
〈inputproc.y::transinfo 213d〉
〈inputproc.y::types 227a〉
〈inputproc.y::union members 218c〉
〈inputproc.y::variables 209c〉
〈insert ftable::error handling 98b〉
〈io.cc 247a〉
〈io::common print command 247b〉
〈io::common print command ln 247c〉
〈io.h 246〉
〈isEq::common code 48a〉
〈isEq::switch t1 and t2 48b〉
〈isFuncNotRightArgs::error handling 58d〉
〈junk predicate 183c〉
〈keep if interesting::case of list learning 182b〉
〈keep if interesting::case of regression 183a〉
〈keep if interesting::case of tree learning 182d〉
〈label::error checking 212d〉
〈labelVariables initialization values 22a〉
〈labelVariables::APP 23b〉
〈labelVariables::PROD 24a〉
〈lex error reporting hackery 230b〉
〈lex:copy yytext 230a〉
〈lex:tpos 230c〉
〈main.cc 2〉
〈main::change to specified options 5〉
〈main::Clean Up 8〉
〈main::incompatible options 6b〉
〈main::initialise with default options 4a〉
〈main::Process Learning Options 3b〉
〈main::Process Specification File 7b〉
〈main::System Initialisation and Startup 3a〉
〈memory debugging code 18b〉
〈not in table::time calculation 1 193e〉
〈not in table::time calculation 2 193f〉
〈old boring evaluation algorithm 194a〉
〈parser::make sure statement head has the right form 216d〉
〈parser::preprocess statements 216e〉
〈pattern-match.cc 75d〉
〈pattern-match::function declarations 70a〉
〈pattern-match::functions 70b〉
〈pattern-match.h 75c〉
〈pred-evaluation.cc 189b〉
〈pred-evaluation.h 189a〉
〈pred-evaluation::hash function 189c〉
〈predicate.cc 131a〉
〈predicate.h 130c〉
〈predicate::representations 101a〉
〈primary::error checking 228b〉
〈print error handling 16a〉
〈prune predicate 183b〉
〈record if useful::case of entropy 179a〉


〈record if useful::case of list learning 177〉
〈record if useful::case of regression 180a〉
〈record if useful::case of tree learning 178a〉
〈redex-match::case of ABS 73b〉
〈redex-match::case of ABS::change variable name 73c〉
〈redex-match::case of APP 72c〉
〈redex-match::case of APP::debug matching 1 72d〉
〈redex-match::case of APP::debug matching 2 72e〉
〈redex-match::case of PROD 73a〉
〈redex-match::case of SV 70d〉
〈redex-match::case of SV::check constraints 71a〉
〈redex-match::case of V 72a〉
〈redex-match::case of V::check free variable capture condition 72b〉
〈redex-match::write a small warning message 73d〉
〈reduce::small APP optimization 59b〉
〈regularise::check for symmetricity 113c〉
〈regularise::insert regularised transformation into ret 114a〉
〈rewrite table data structure and functions 115a〉
〈rewrite::apply the rewrite 123b〉
〈rewrite.cc 131c〉
〈rewrite::clone the input predicate 122c〉
〈rewrite::compute positional type 122a〉
〈rewrite::error checking 217d〉
〈rewrite::EXPECTED 121b〉
〈rewrite::functions 121a〉
〈rewrite.h 131b〉
〈rewrite::insert into outlist if not seen 123c〉
〈rewrite::insert into outlist if regular 125b〉
〈rewrite::LR 124〉
〈rewrite::prevent recursive rewrites 122b〉
〈rewrite::public functions 116a〉
〈rewrite::spsize2 functions 127d〉
〈rewrite::struct type-rewrites-t 115b〉
〈Setup Classes::error checking 211a〉
〈simpl output 61b〉
〈simplify update pointers 36b〉
〈simplifyArithmetic::add 40b〉
〈simplifyArithmetic::div 41c〉
〈simplifyArithmetic::max 41d〉
〈simplifyArithmetic::min 41e〉
〈simplifyArithmetic::mul 41b〉
〈simplifyArithmetic::sub 41a〉
〈simplifyConjunction2::create body 49b〉
〈simplifyEquality::case of applications 38a〉
〈simplifyEquality::case of products 37a〉
〈simplifyEquality::case of products::empty tuples 37b〉
〈simplifyEquality::case of products::error handling 37c〉
〈simplifyEquality::check whether we have data constructors 38b〉
〈simplifyEquality::identical variables 36c〉
〈simplifyEquality::irrelevant cases 36d〉
〈simplifyEquality::local variables 37d〉
〈simplifyExistential::case one and two 51c〉
〈simplifyExistential::move to the body 51b〉
〈simplifyExistential::tricky case 52a〉


〈simplifyExistential::tricky case::general case 52c〉
〈simplifyExistential::tricky case::special case 52b〉
〈simplifyUniversal::change end game 55a〉
〈simplifyUniversal::check the form of body 54c〉
〈simplifyUniversal::general case 55b〉
〈simplifyUniversal::special case 54e〉
〈simplifyUniversal::true statement 54d〉
〈spsize2::error message 130b〉
〈standard predicate member functions 103d〉
〈start learning 207a〉
〈statements and type checking 243a〉
〈std predicate function declarations 105c〉
〈std predicate::initialiseType::repeat code 108d〉
〈std predicate::recalculateType::error handling 110b〉
〈struct treenode function declarations 154c〉
〈struct treenode functions 151c〉
〈subst2::case of SV 33a〉
〈subst2::case of V 33c〉
〈subst2::free variable captured 33d〉
〈subst2::replace by ti 33b〉
〈tables.cc 97a〉
〈tables.h 96b〉
〈term has name 10e〉
〈term schema::equal::numbers 15a〉
〈term schema::lists 225b〉
〈term schema::products 224b〉
〈term schema::sets 225a〉
〈term schema::syntactic sugar 223e〉
〈terms.cc 10a〉
〈terms.cc::local functions 14a〉
〈term-schema clone parts 12d〉
〈term-schema initializations 12c〉
〈term-schema parts 10c〉
〈term-schema replace parts 12e〉
〈term-schema::constructors 12a〉
〈term-schema::definitions 16b〉
〈term-schema::external functions 13d〉
〈term-schema::function declarations 11a〉
〈term-schema::function definitions 11b〉
〈term-schema::memory management 17a〉
〈term-schema::supporting types 16c〉
〈term-schema::type defs 10b〉
〈terms.h 9〉
〈training set::data structures 132〉
〈training set::public functions 134〉
〈trainset::body 135b〉
〈trainset.cc 135a〉
〈trainset.h 133〉
〈transformation function declarations 103c〉
〈transformation info table and functions 236d〉
〈transformation member functions 102a〉
〈transformation::create and initialise a new transformation 219b〉
〈transformation::error checking 219c〉
〈tree-dstructs.cc 145a〉


〈tree-dstructs::distribution 144〉
〈tree-dstructs.h 155a〉
〈tree-dstructs::treenode 149〉
〈try match 60〉
〈try match::debugging code 1 63d〉
〈try match::different simplifications 61a〉
〈try match::eager statements 62c〉
〈try match::find special cases where no matching is required 62a〉
〈try match::output answer 63c〉
〈try match::output pattern matching information 63b〉
〈try match::side conditions on types 62b〉
〈try match::unimportant things 63a〉
〈type checking 95c〉
〈type checking actual 90a〉
〈type checking individual 214b〉
〈type checking individuals and statements 214a〉
〈type checking subsidiary functions 95a〉
〈type checking variables 90b〉
〈Type Name to Type Object Store and Retrieval Facility 238b〉
〈type name to type objects mapping 245c〉
〈type::abstractions 82c〉
〈type::abstractions::implementation 83a〉
〈type::algebraic types 84b〉
〈type::algebraic types::implementation 85a〉
〈type::composite types 78a〉
〈type::composite types::implementation 78b〉
〈type::function declarations 81a〉
〈type::functions 77c〉
〈type::parameters 79c〉
〈type::parameters::implementation 80a〉
〈types.cc 76b〉
〈types.h 76a〉
〈type::synonyms 81c〉
〈type::tuples 81d〉
〈type::tuples::implementation 82a〉
〈type::type 77a〉
〈unification body 86a〉
〈unification.cc 85c〉
〈unification.h 85b〉
〈unify::case of both non-parameters 89a〉
〈unify::verbose 1 89b〉
〈unify::verbose 2 89c〉
〈variable case::lookup previous occurrence 92a〉
〈wellTyped2::abstraction::error reporting 94b〉
〈wellTyped2::application::error reporting 93b〉
〈wellTyped2::application::error reporting2 93c〉
〈wellTyped2::application::t1 should have right form 93a〉
〈wellTyped2::case of t a constant 91a〉
〈wellTyped2::case of t a tuple 94c〉
〈wellTyped2::case of t a variable 91c〉
〈wellTyped2::case of t an abstraction 94a〉
〈wellTyped2::case of t an application 92b〉
〈wellTyped2::save n return 91b〉
〈wellTyped2::tuple::error reporting 94d〉

Bibliography

[Bar87] Henk P. Barendregt. The Lambda Calculus: Its Syntax and Semantics. North-Holland, 1987.

[BD98] Hendrik Blockeel and Luc De Raedt. Top-down induction of first-order logical decision trees. Artificial Intelligence, 101(1-2):285–297, 1998.

[BFOS84] Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone. Classification and Regression Trees. Chapman & Hall, New York, 1984.

[BGCL01] Antony F. Bowers, Christophe Giraud-Carrier, and John W. Lloyd. A knowledge representation framework for inductive learning. http://rsise.anu.edu.au/~jwl/, 2001.

[Blo98] Hendrik Blockeel. Top-Down Induction of First Order Logical Decision Trees. PhD thesis, Departement Computerwetenschappen, Katholieke Universiteit Leuven, 1998.

[Chu40] Alonzo Church. A formulation of the simple theory of types. Journal of Symbolic Logic, 5:56–68, 1940.

[CM98] Mary Elaine Califf and Raymond J. Mooney. Advantages of decision lists and implicit negatives in inductive logic programming. New Generation Computing, 16(3):263–281, 1998.

[FS97] Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119–139, 1997.

[GHJV95] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.

[GMP] The GNU Multiple Precision arithmetic library. http://www.swox.com/gmp.

[HTF01] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning. Springer, 2001.

[Joh79] Steven C. Johnson. Yacc: Yet another compiler compiler. In UNIX Programmer's Manual, volume 2, pages 353–387. Holt, Rinehart, and Winston, New York, NY, USA, 1979.

[Knu86] Donald E. Knuth. TeX: The Program. Addison-Wesley, 1986.

[Koh95] Ron Kohavi. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 1137–1145, 1995.

[Kra96] Stefan Kramer. Structural regression trees. In Proceedings of the 13th National Conference on Artificial Intelligence, pages 812–819. AAAI Press, 1996.


[KW01] Stefan Kramer and Gerhard Widmer. Inducing classification and regression trees in first order logic. In Sašo Džeroski and Nada Lavrač, editors, Relational Data Mining, chapter 6. Springer, 2001.

[Llo99] John W. Lloyd. Programming in an integrated functional and logic language. Journal of Functional and Logic Programming, 3, 1999.

[Llo00] John W. Lloyd. Predicate construction in higher-order logic. Electronic Transactions on Artificial Intelligence, 4(B):21–51, 2000. http://www.ep.liu.se/ej/etai/2000/009.

[Llo02] John W. Lloyd. Knowledge representation, computation, and learning in higher-order logic. Available at http://rsise.anu.edu.au/~jwl/, 2002.

[Llo03] John W. Lloyd. Logic for Learning: Learning Comprehensible Theories from Structured Data. Cognitive Technologies. Springer, 2003.

[LMB92] John R. Levine, Tony Mason, and Doug Brown. lex & yacc. O'Reilly, 1992.

[MC95] Raymond J. Mooney and Mary Elaine Califf. Induction of first-order decision lists: Results on learning the past tense of English verbs. Journal of Artificial Intelligence Research, 3:1–24, 1995.

[Min89] John Mingers. An empirical comparison of pruning methods for decision tree induction. Machine Learning, 4, 1989.

[Mit96] John C. Mitchell. Foundations for Programming Languages. MIT Press, Cambridge, MA, 1996.

[Ng05] Kee Siong Ng. Learning Comprehensible Theories from Structured Data. PhD thesis, Computer Sciences Laboratory, The Australian National University, 2005.

[NTL] Number Theory Library. http://www.shoup.net/ntl.

[Pax95] Vern Paxson. Flex: A fast scanner generator, 2.5 edition, March 1995.

[Pey87] Simon L. Peyton Jones. The Implementation of Functional Programming Languages. Prentice-Hall, 1987.

[Riv87] Ronald L. Rivest. Learning decision lists. Machine Learning, 2(3):229–246, 1987.

[SMJST03] Marina Sokolova, Mario Marchand, Nathalie Japkowicz, and John Shawe-Taylor. The decision list machine. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 921–928. MIT Press, 2003.
