Finding Out About - Chapter 2
Extracting Lexical Features

© R. K. Belew 1996-2001

2.1 Building useful tools

2.2 Inter-document parsing

General issues

Email

From: [email protected]
...: [email protected]
...: [email protected]
Subject: new positions for students at Max Planck Institute, Berlin
Date: Tue, 13 Jan 98 19:14:08 +0100
X-Mts: smtp
Status:

Dear colleagues,

We are pleased to announce that we have postdoctoral and predoctoral positions available beginning next year here at our Center for Adaptive Behavior and ...

Lex/Yacc doesn’t always work!

“login (Full Name)” vs “Full Name <login>”
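The two address layouts above can be told apart with a few string searches. The following is a minimal sketch, not the FOA code: parse_from and copy_trimmed are hypothetical names, and it assumes a single address per header value with no nested brackets or quoting.

```c
#include <stddef.h>
#include <string.h>

/* Copy src[0..len) into dst (capacity cap > 0), trimming surrounding spaces. */
static void copy_trimmed(char *dst, size_t cap, const char *src, size_t len)
{
    while (len > 0 && *src == ' ') { src++; len--; }
    while (len > 0 && src[len - 1] == ' ') len--;
    if (len >= cap) len = cap - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Split a From: value into login and full name.
   Returns 1 for "Full Name <login>", 2 for "login (Full Name)",
   0 when neither layout is recognized (login receives the whole value). */
int parse_from(const char *value, char *login, size_t lcap,
               char *name, size_t ncap)
{
    const char *lt = strchr(value, '<');
    const char *gt = lt ? strchr(lt, '>') : NULL;
    const char *lp = strchr(value, '(');
    const char *rp = lp ? strrchr(lp, ')') : NULL;

    if (lt && gt) {                        /* Full Name <login> */
        copy_trimmed(name, ncap, value, (size_t)(lt - value));
        copy_trimmed(login, lcap, lt + 1, (size_t)(gt - lt - 1));
        return 1;
    }
    if (lp && rp) {                        /* login (Full Name) */
        copy_trimmed(login, lcap, value, (size_t)(lp - value));
        copy_trimmed(name, ncap, lp + 1, (size_t)(rp - lp - 1));
        return 2;
    }
    copy_trimmed(login, lcap, value, strlen(value));
    name[0] = '\0';
    return 0;
}
```

Calling parse_from on "John Batali <batali@cogsci>" and on "batali@cogsci (John Batali)" yields the same login and name in both cases, which is the normalization a hand-written parser has to provide when a fixed grammar does not fit.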

Lex/Yacc doesn’t always work! (cont.)

AI Theses (AIT)

THESIS# 00001
AUTHOR: SHINN, HONG SHIK
YEAR: 1989
TITLE: A UNIFIED APPROACH TO ANALOGICAL REASONING
CLASSIF: MACHINE LEARNING
UNIVERSITY: GEORGIA INSTITUTE OF TECHNOLOGY (0078)
ADVISOR: JANET L. KOLODNER
ABSTRACT:

Experiential reasoning is the most basic form of intelligent activity, consequently, in artificial intelligence, numerous computational models of

...problem solver but also a very general problem solver.

EOABSTRACT

Common processing flow

2.3 Intra-document

Major issues

LA generators

Finite state automata

[Figure: state-transition diagram of a lexical-analysis automaton. States: 0, 1, 2, 8, 9. Edge labels: {a-zA-Z0-9}, {b \n \t}, {a-zA-Z}, &, \0, else.]
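A finite state automaton of this kind can be coded as a loop over a small state variable. The sketch below is one reading of the idea and not the automaton on the slide: it assumes a token starts with a letter and continues with letters or digits, everything else separates tokens, and tokens are folded to lower case. The names next_token, S_SKIP, S_WORD and S_DONE are made up for the illustration.

```c
#include <ctype.h>
#include <stddef.h>

enum state { S_SKIP, S_WORD, S_DONE };

/* Scan one token starting at *cursor and advance *cursor past it.
   Writes the lower-cased token into tok (capacity cap > 0).
   Returns the stored token length, or 0 at end of input. */
size_t next_token(const char **cursor, char *tok, size_t cap)
{
    const char *p = *cursor;
    size_t len = 0;
    enum state s = S_SKIP;

    while (s != S_DONE) {
        unsigned char ch = (unsigned char)*p;
        switch (s) {
        case S_SKIP:                         /* between tokens */
            if (ch == '\0')       s = S_DONE;
            else if (isalpha(ch)) s = S_WORD; /* a letter opens a token */
            else                  p++;        /* skip separators */
            break;
        case S_WORD:                         /* inside a token */
            if (isalnum(ch)) {
                if (len + 1 < cap) tok[len++] = (char)tolower(ch);
                p++;
            } else {
                s = S_DONE;                  /* separator or \0 ends it */
            }
            break;
        case S_DONE:
            break;
        }
    }
    tok[len] = '\0';
    *cursor = p;
    return len;
}
```

Repeated calls on "Hello, World42!" return "hello", then "world42", then length 0. Running the same routine over queries keeps query tokens comparable with document tokens.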

Same lexical analysis for both documents and queries!

2.3.1 Stemming and Other Morphological Processing

Conflation

Stemming

Rewrite rules

“IES” except (“EIES” or “AIES”) --> “y”
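The rule above translates almost word for word into code. This is a minimal sketch under two assumptions that are not on the slide: the word has already been folded to lower case, and it sits in a writable buffer. The names rewrite_ies and ends_with are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Does the n-character word w end with the suffix suf? */
static int ends_with(const char *w, size_t n, const char *suf)
{
    size_t k = strlen(suf);
    return n >= k && memcmp(w + n - k, suf, k) == 0;
}

/* Rewrite a trailing "ies" as "y" in place, except after "e" or "a".
   Returns 1 if the word was rewritten, 0 if it was left alone. */
int rewrite_ies(char *word)
{
    size_t n = strlen(word);

    if (!ends_with(word, n, "ies")) return 0;
    if (ends_with(word, n, "eies") || ends_with(word, n, "aies")) return 0;
    word[n - 3] = 'y';      /* "ies" becomes "y" ... */
    word[n - 2] = '\0';     /* ... and the word is two characters shorter */
    return 1;
}
```

For example, "queries" becomes "query", while a word without the suffix, such as "cats", is returned unchanged.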

Porter stemmer

Rules

/* Columns: id, old_end, new_end, old_offset, new_offset, min_root_size, condition */

static RuleList step1a_rules[] =
{
    101,  "sses",  "ss",    3,  1, -1, NULL,
    102,  "ies",   "i",     2,  0, -1, NULL,
    103,  "ss",    "ss",    1,  1, -1, NULL,
    1100, "\'s",   LAMBDA,  1, -1, -1, NULL,
    104,  "s",     LAMBDA,  0, -1, -1, NULL,
    000,  NULL,    NULL,    0,  0,  0, NULL,
};

static RuleList step1b_rules[] =
{
    105,  "eed",   "ee",    2,  1,  1, NULL,
    106,  "ed",    LAMBDA,  1, -1, -1, ContainsVowel,
    107,  "ing",   LAMBDA,  2, -1, -1, ContainsVowel,
    1101, "ingly", LAMBDA,  4, -1, -1, ContainsVowel,
    000,  NULL,    NULL,    0,  0,  0, NULL,
};

Rule matching

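/* Apply the first rule in the list whose old ending matches the end of word.
   `end` is assumed to be declared elsewhere as a char * to the word's last character. */
int ReplaceEnd(char *word, RuleList *rule)
{
    char *ending;   /* start of the candidate suffix within word */
    char tmp_ch;    /* character saved while the root is tested */

    while ( 0 != rule->id ) {
        ending = end - rule->old_offset;
        if ( word < ending )
            if ( 0 == strcmp(ending, rule->old_end) ) {
                tmp_ch = *ending;
                *ending = EOS;              /* temporarily cut off the suffix */
                if ( rule->min_root_size < WordSize(word) )
                    if ( !rule->condition || (*rule->condition)(word) ) {
                        (void)strcat( word, rule->new_end );
                        end = ending + rule->new_offset;
                        break;
                    }
                *ending = tmp_ch;           /* rule rejected: restore the word */
            }
        rule++;
    }
    return( rule->id );
} /* ReplaceEnd */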

Other approaches

Phrases

Asian languages

2.3.2 Noise words

Combining stopword removal with other lexical decisions

Character classes

Character lex-loop (Part 1)

Character lex-loop (Part 2)

2.4 Example corpora

AI Theses (AIT)

AIT year distribution

Basic algorithm

for every doc in corpus
    while (token = getNonNoiseToken)
        if (StemP)
            token = stem(token)
        Save Posting(token,doc) in Tree

for every token in Tree
    Accumulate ndoc(token), totfreq(token)
    Sort p \in Postings(token) in descending docfreq(p) order
    write token, ndoc, totfreq, Postings
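The two passes of the basic algorithm can be sketched in C as follows. This is an illustration under loud assumptions, not the FOA implementation: a fixed-size linear table stands in for the Tree, tokens are assumed to arrive already noise-filtered and stemmed and to be shorter than 32 characters, and every name and limit here (save_posting, write_index, MAX_TOKENS, MAX_POSTINGS) is made up for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TOKENS   256   /* illustrative limits, not from the slides */
#define MAX_POSTINGS 64

typedef struct { int doc; int docfreq; } Posting;

typedef struct {
    char    token[32];
    int     ndoc;                 /* documents containing the token */
    int     totfreq;              /* occurrences over the whole corpus */
    Posting post[MAX_POSTINGS];
} Entry;

static Entry table[MAX_TOKENS];   /* linear table standing in for the Tree */
static int   ntokens = 0;

/* First pass: record one occurrence of token in document doc.
   Returns 0 if one of the illustrative limits is hit, else 1. */
int save_posting(const char *token, int doc)
{
    int t, p;

    for (t = 0; t < ntokens; t++)
        if (strcmp(table[t].token, token) == 0) break;
    if (t == ntokens) {                       /* first time this token is seen */
        if (ntokens == MAX_TOKENS) return 0;
        memset(&table[t], 0, sizeof table[t]);
        strncpy(table[t].token, token, sizeof table[t].token - 1);
        ntokens++;
    }
    for (p = 0; p < table[t].ndoc; p++)
        if (table[t].post[p].doc == doc) break;
    if (p == table[t].ndoc) {                 /* first occurrence in this doc */
        if (p == MAX_POSTINGS) return 0;
        table[t].post[p].doc = doc;
        table[t].post[p].docfreq = 0;
        table[t].ndoc++;
    }
    table[t].post[p].docfreq++;
    table[t].totfreq++;
    return 1;
}

/* qsort comparator: larger docfreq first. */
static int by_docfreq_desc(const void *a, const void *b)
{
    const Posting *x = a, *y = b;
    return (y->docfreq > x->docfreq) - (y->docfreq < x->docfreq);
}

/* Second pass: sort each posting list and write token, ndoc, totfreq, postings. */
void write_index(FILE *out)
{
    int t, p;

    for (t = 0; t < ntokens; t++) {
        qsort(table[t].post, (size_t)table[t].ndoc, sizeof(Posting),
              by_docfreq_desc);
        fprintf(out, "%s %d %d", table[t].token, table[t].ndoc,
                table[t].totfreq);
        for (p = 0; p < table[t].ndoc; p++)
            fprintf(out, " (%d,%d)", table[t].post[p].doc,
                    table[t].post[p].docfreq);
        fputc('\n', out);
    }
}
```

A real indexer would replace the linear lookup with a tree or hash table, since the lookup runs once per token occurrence.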

Basic Posting Data Structures

Refined Posting Data Structures

Minimizing OS dependencies

Assume:

Two central files

FileNo Path Date Size NBlock NMsg Out?

FileNo MsgNo BPos TxtPos EPos Proxy NLines Date From (To) (Cc)
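A record in the first layout (FileNo Path Date Size NBlock NMsg Out?) can be read with a single sscanf call. This is a sketch only: the struct FileRec, the function parse_file_rec and the buffer sizes are hypothetical, and it assumes the path and date are the only quoted fields and contain no embedded double quotes.

```c
#include <stdio.h>

typedef struct {
    int  file_no;
    char path[128];   /* quoted on disk, may contain spaces */
    char date[32];    /* quoted on disk, e.g. 1991/08/01 14:15:53 */
    long size;
    int  nblock;
    int  nmsg;
    int  out;         /* the "Out?" flag */
} FileRec;

/* Parse one record line; returns 1 on success, 0 on a malformed line. */
int parse_file_rec(const char *line, FileRec *r)
{
    int n = sscanf(line, "%d \"%127[^\"]\" \"%31[^\"]\" %ld %d %d %d",
                   &r->file_no, r->path, r->date,
                   &r->size, &r->nblock, &r->nmsg, &r->out);
    return n == 7;
}
```

Keeping both central files as plain, line-oriented text is what makes them this easy to read back on any operating system.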

Pointer structure

file.d

4 "Host:Email archive:950818:apple" "1991/08/01 14:15:53" 17058 1 7 0
5 "Host:Email archive:950818:bib-cites" "1991/05/31 16:18:27" 933 1 1 0
6 "Host:Email archive:950818:comp-bio" "1991/07/25 18:09:51" 22730 1 6 0
7 "Host:Email archive:950818:conf" "1991/08/01 13:28:26" 122422 4 11 0
8 "Host:Email archive:950818:contacts" "1991/08/19 16:31:07" 215475 7 67 0
9 "Host:Email archive:950818:cse" "1991/08/10 17:03:33" 66052 2 26 0

doc.d

6 5 7071 8042 14159 236 "Exisiting RTGs" "1995/07/17 08:21:11" 19 (21 ) ()
6 6 14160 15178 22729 158 "Summary of Computational Biology Town Meeting, Jul" "1995/07/18 06:36:52"
7 1 0 515 3441 111 "[[email protected]: (DBWORLD) ESSIR - Europea" "1995/05/17 13:21:40"
7 2 3442 4094 9760 158 "From Animals to Animats" "1995/05/18 09:16:26" 24 (25 ) ()
7 3 9761 11341 18242 186 "CFP: Pacific Symposium on Biocomputing" "1995/05/18 11:32:47" 30
7 4 18243 19057 26378 194 "Call For Papers" "1995/05/18 16:10:29" 58 (57 ) ()

(Original) document file

to find out more options send help

From ???@??? Thu May 18 15:17:49 1995
Received: from cogsci.ucsd.edu by odin.ucsd.edu; id AA10669 sendmail 5.67/UCSDPSEUDO.4-CS via SMTP Thu, 18 May 95 09:16:41 -0700 for rik
Received: from yakima.UCSD.EDU by cogsci.UCSD.EDU (4.1/UCSDPSEUDO.4) id AA17558 for rik@cs; Thu, 18 May 95 09:16:34 PDT
Received: by yakima.UCSD.EDU (4.1/UCSDPSEUDO.3) id AA00611 for [email protected]; Thu, 18 May 95 09:16:26 PDT
Date: Thu, 18 May 95 09:16:26 PDT
Message-Id: <[email protected]>
From: John Batali <[email protected]>
Sender: batali@cogsci
To: alife-lab@cogsci
Subject: From Animals to Animats
Status: O

Date: Thu, 18 May 1995 11:08:14 -0400
From: Maja Mataric <[email protected]>
Subject: Conference Announcement and Call For Papers

==============================================================================

Conference Announcement and Call For Papers

FROM ANIMALS TO ANIMATS

Fourth International Conference on Simulation of Adaptive Behavior (SAB96)

Cape Cod, Massachusetts, USA, September 9-13, 1996

The objective of the conference is to bring together researchers in ethology, psychology, ecology, artificial intelligence, artificial life, robotics, and related fields so as to further our understanding

...

SEP 9-13: Conference dates

General queries to: [email protected]
Page: http://www.cs.brandeis.edu/conferences/sab96

==============================================================================

From ???@??? Thu May 18 15:17:59 1995
Received: from hydra.sdsc.edu by odin.ucsd.edu; id AA19421 sendmail 5.67/UCSDPSEUDO.4-CS via SMTP Thu, 18 May 95 11:36:01 -0700 for christos

2.5.2 Fine Points

STAIRS Posting Information

Quoted Lines in an Email Message

2.5.3 Software Libraries

FOA-CD:///FindingOutAbout/index.htm#FOA source code

C

Java

Other topics

String matching

Motivation

Formal model

∑∑

$$|\mathrm{Matches}| = \frac{n - m + 1}{c^m}$$
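A quick worked instance of the expected match count, with values chosen only for illustration: a text of n = 1000 characters, a pattern of m = 3 characters, and c = 26 taken as the alphabet size, with characters assumed uniformly random and independent.

$$|\mathrm{Matches}| = \frac{1000 - 3 + 1}{26^3} = \frac{998}{17576} \approx 0.057$$

Under these assumptions a random three-letter pattern is expected to occur far less than once in such a text, and each extra pattern character divides the count by a further factor of c.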

Naive algorithm

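/* Search for pat[1...m] in text[1...n]   (R. Baeza-Yates, Fig. 10.1) */
void naive_search(const char text[], int n, const char pat[], int m)
{
    int i, j, k, lim;

    lim = n - m + 1;                 /* last position where pat can still fit */
    for (i = 1; i <= lim; i++) {
        k = i;
        for (j = 1; j <= m && text[k] == pat[j]; j++) k++;
        if (j > m) rpt_match(i);     /* all m characters matched, starting at text[i] */
    }
}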

Naive alg. - Complexity

$$E(C_n) = \frac{c}{c - 1}\left(1 - \frac{1}{c^m}\right)(n - m + 1) + O(1)$$

Boyer-Moore [1977]

Smart shift after unsuccessful match

Horspool [1980]

Address table

Boyer-Moore/Horspool Alg

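/* Search pat[1...m] in text[1...n].
   pat[m+1] must be writable, text[n+1] must be readable, and
   STOPPER_CHAR must not occur in the text. */
void bmhsearch(const char text[], int n, char pat[], int m)
{
    int d[ALPH_SIZE], i, j, k, lim;
    char sav;

    /* Initialize d[]: the shift for each possible character */
    for (k = 0; k < ALPH_SIZE; k++) d[k] = m + 1;
    for (k = 1; k <= m; k++) d[(unsigned char)pat[k]] = m + 1 - k;

    sav = pat[m+1];
    pat[m+1] = STOPPER_CHAR;   /* To avoid test for m=n-k+1 */

    lim = n - m + 1;
    for (k = 1; k <= lim; k += d[(unsigned char)text[k+m]]) {
        i = k;
        for (j = 1; text[i] == pat[j]; j++) i++;
        if (j == m + 1) rpt_match(k);
    }
    pat[m+1] = sav;            /* restore the pattern buffer */
}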

Shift-OR

Shift left one bit/char of pattern
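The shift-per-character idea can be sketched as below. This is the textbook Shift-OR formulation written from memory rather than the code behind the slide: it uses 0-based indexing (unlike the 1-based listings elsewhere in the deck), assumes the pattern fits in one unsigned long, and the name shift_or is hypothetical.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Return the 0-based index of the first occurrence of pat in text, or -1.
   The pattern length must be between 1 and the bit width of unsigned long. */
long shift_or(const char *text, const char *pat)
{
    unsigned long mask[UCHAR_MAX + 1];
    unsigned long state = ~0UL;          /* bit j == 0: pat[0..j] matches here */
    size_t m = strlen(pat), i;

    if (m == 0 || m > sizeof(unsigned long) * CHAR_BIT) return -1;

    /* mask[c] has a 0 bit at every pattern position that holds character c */
    for (i = 0; i <= UCHAR_MAX; i++) mask[i] = ~0UL;
    for (i = 0; i < m; i++) mask[(unsigned char)pat[i]] &= ~(1UL << i);

    for (i = 0; text[i] != '\0'; i++) {
        /* one shift and one OR per text character */
        state = (state << 1) | mask[(unsigned char)text[i]];
        if ((state & (1UL << (m - 1))) == 0)
            return (long)(i + 1 - m);    /* whole pattern ends at text[i] */
    }
    return -1;
}
```

The inner loop does the same work for every text character whatever the pattern contents, so the running time does not depend on where mismatches occur.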

$$O(C_n) = E(C_n) = O(kn)$$

Comparison

[Figure: search time versus pattern length (m), m = 2 to 20, for the Naive, Shift-OR, and Boyer-Moore-Horspool algorithms.]

PAT trees