L4-Phat Hien Luat Ket Hop[1]


Data Mining

Nguyễn Nhật [email protected]
School of Information and Communication Technology, Hanoi University of Science and Technology
Academic year 2010-2011

Course contents:
- Introduction to data mining
- Introduction to the WEKA tool
- Data preprocessing
- Association rule mining
- Classification and prediction techniques
- Clustering techniques


Association rule mining: Introduction

The association rule mining problem: given a set of transactions, find rules that predict the occurrence of some items in a transaction based on the occurrence of other items.

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Examples of association rules:
- {Diaper} → {Beer}
- {Milk, Bread} → {Eggs, Coke}
- {Beer, Bread} → {Milk}

Basic definitions (1)

All examples use the five transactions (TID 1 to 5) listed on the introduction slide.

- Itemset: a set of one or more items. Example: {Milk, Bread, Diaper}
- k-itemset: an itemset that contains k items
- Support count (σ): the number of transactions in which an itemset occurs. Example: σ({Milk, Bread, Diaper}) = 2
- Support (s): the fraction of transactions that contain an itemset. Example: s({Milk, Bread, Diaper}) = 2/5
- Frequent (large) itemset: an itemset whose support is greater than or equal to a threshold minsup
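The definitions above can be checked with a short Python sketch. The names `TRANSACTIONS`, `support_count`, `support`, and `is_frequent` are illustrative assumptions, not part of the slides; the data are the five example transactions.

```python
# Five example transactions from the slides (TID 1..5).
TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return support_count(itemset, transactions) / len(transactions)

def is_frequent(itemset, transactions, minsup):
    """True if the support of `itemset` reaches the threshold `minsup`."""
    return support(itemset, transactions) >= minsup

print(support_count({"Milk", "Bread", "Diaper"}, TRANSACTIONS))  # 2
print(support({"Milk", "Bread", "Diaper"}, TRANSACTIONS))        # 0.4
```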

Basic definitions (2)

Association rule: an implication expression of the form X → Y, where X and Y are itemsets. Example: {Milk, Diaper} → {Beer}

Rule evaluation metrics:
- Support (s): the fraction of all transactions that contain both X and Y
- Confidence (c): the fraction of the transactions containing X that also contain Y

Example, for the rule {Milk, Diaper} → {Beer} on the five example transactions:

$s = \frac{\sigma(\text{Milk, Diaper, Beer})}{|T|} = \frac{2}{5} = 0.4$

$c = \frac{\sigma(\text{Milk, Diaper, Beer})}{\sigma(\text{Milk, Diaper})} = \frac{2}{3} \approx 0.67$
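As a small illustration of the two metrics, the following Python sketch recomputes s and c for {Milk, Diaper} → {Beer}. The helper names `sigma` and `rule_metrics` are assumptions made for this example; the sketch also assumes the antecedent X occurs in at least one transaction, so the division is defined.

```python
TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def sigma(itemset):
    """Support count: number of transactions containing `itemset`."""
    return sum(1 for t in TRANSACTIONS if set(itemset) <= t)

def rule_metrics(x, y):
    """Return (support, confidence) of the rule X -> Y."""
    both = sigma(set(x) | set(y))
    return both / len(TRANSACTIONS), both / sigma(x)

s, c = rule_metrics({"Milk", "Diaper"}, {"Beer"})
print(f"s = {s:.2f}, c = {c:.2f}")  # s = 0.40, c = 0.67
```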

Association rule mining

Given a set of transactions T, the goal of association rule mining is to find all rules that have:
- support ≥ the threshold minsup, and
- confidence ≥ the threshold minconf

Brute-force approach:
- List all possible association rules
- Compute the support and confidence of each rule
- Discard the rules whose support is below minsup or whose confidence is below minconf

This brute-force approach is computationally prohibitive and cannot be applied in practice!

Mining association rules

Association rules over the five example transactions:
- {Milk, Diaper} → {Beer} (s=0.4, c=0.67)
- {Milk, Beer} → {Diaper} (s=0.4, c=1.0)
- {Diaper, Beer} → {Milk} (s=0.4, c=0.67)
- {Beer} → {Milk, Diaper} (s=0.4, c=0.67)
- {Diaper} → {Milk, Beer} (s=0.4, c=0.5)
- {Milk} → {Diaper, Beer} (s=0.4, c=0.5)

Observations:
- All the rules above are binary partitions (into 2 subsets) of the same itemset: {Milk, Diaper, Beer}
- Rules generated from the same itemset have the same support but may differ in confidence
- Therefore, when mining association rules, the support requirement and the confidence requirement can be handled separately

Mining association rules: two steps

The mining process consists of 2 main steps:
1. Frequent itemset generation: generate all itemsets whose support is ≥ minsup
2. Rule generation: from each frequent itemset obtained in step 1, generate all high-confidence rules (confidence ≥ minconf). Each rule is a binary partition of a frequent itemset.

Frequent itemset generation (step 1) is still computationally very expensive!

Lattice of candidate itemsets

Level 0: null
Level 1: A, B, C, D, E
Level 2: AB, AC, AD, AE, BC, BD, BE, CD, CE, DE
Level 3: ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE
Level 4: ABCD, ABCE, ABDE, ACDE, BCDE
Level 5: ABCDE

With d items, there are $2^d$ possible itemsets to consider!

Frequent itemset generation

Brute-force approach (on the five example transactions):
- Every itemset in the lattice is a candidate
- Compute the support of each candidate by scanning all transactions
- For each transaction, compare it against every candidate itemset
- Complexity is about O(N·M·w), where N is the number of transactions, M the number of candidate itemsets, and w the maximum transaction width
- With $M = 2^d$, this complexity is far too high!
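The brute-force procedure can be sketched in a few lines of Python. This is a minimal illustration, not an efficient implementation; the function name `brute_force_frequent` and the use of an absolute count threshold `min_count` are assumptions made here. The empty itemset is skipped, so the number of candidates is $2^d - 1$.

```python
from itertools import combinations

TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def brute_force_frequent(transactions, min_count):
    """Enumerate every non-empty itemset and count it against every transaction."""
    items = sorted(set().union(*transactions))
    frequent = {}
    candidates = 0
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):          # M candidates in total
            candidates += 1
            count = sum(1 for t in transactions if set(cand) <= t)  # N scans
            if count >= min_count:
                frequent[cand] = count
    return frequent, candidates

frequent, m = brute_force_frequent(TRANSACTIONS, min_count=3)
print(m)              # 63 candidates for d = 6 items
print(len(frequent))  # 8 frequent itemsets
```

Even for six items every one of the 63 candidates is compared with all five transactions, which is the N·M cost that the later slides try to reduce.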

Strategies for frequent itemset generation

- Reduce the number of candidate itemsets (M)
  - Complete search: $M = 2^d$
  - Use pruning techniques to reduce M
- Reduce the number of transactions (N)
  - Reduce N as the size (number of items) of the itemsets increases
- Reduce the number of comparisons (matchings) between itemsets and transactions (N·M)
  - Use efficient data structures to store the candidates or the transactions
  - There is no need to compare every itemset against every transaction

Reducing the number of candidate itemsets

Apriori principle: support-based pruning
- If an itemset is frequent, then all of its subsets are also frequent
- If an itemset is not frequent, then all of its supersets are also not frequent

The Apriori principle follows from the anti-monotone property of support:

$\forall X, Y : (X \subseteq Y) \Rightarrow s(X) \geq s(Y)$

The support of an itemset never exceeds the support of any of its subsets.

Apriori: support-based pruning

[Figure: itemset lattice. AB is marked as an infrequent itemset, and all supersets of AB are pruned.]

Apriori: support-based pruning (example)

minsup = 3 (as a support count)

1-itemsets:
Item     Count
Bread    4
Coke     2
Milk     4
Beer     3
Diaper   4
Eggs     1

2-itemsets (itemsets containing Coke or Eggs need not be considered):
Itemset            Count
{Bread, Milk}      3
{Bread, Beer}      2
{Bread, Diaper}    3
{Milk, Beer}       2
{Milk, Diaper}     3
{Beer, Diaper}     3

3-itemsets:
Itemset                  Count
{Bread, Milk, Diaper}    2

Only transactions 4 and 5 contain all three items, which agrees with σ({Milk, Bread, Diaper}) = 2 from the basic definitions, so this single candidate falls below minsup.

If all possible itemsets are considered: $\binom{6}{1} + \binom{6}{2} + \binom{6}{3} = 6 + 15 + 20 = 41$ candidates.
With support-based pruning: 6 + 6 + 1 = 13 candidates.

The Apriori algorithm

- Generate all frequent 1-itemsets (frequent itemsets that contain a single item)
- Set k = 1
- Repeat until no new frequent itemsets are found:
  - From the frequent k-itemsets, generate the candidate (k+1)-itemsets
  - Prune the candidate (k+1)-itemsets that contain an infrequent subset of size k
  - Compute the support of each candidate (k+1)-itemset by scanning all transactions
  - Eliminate the infrequent candidates, leaving the frequent (k+1)-itemsets
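The loop above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `apriori`, the absolute threshold `min_count`, and the simple join (union of any two frequent k-itemsets that yields k+1 items) are choices made for this example, and support is counted by a plain scan rather than with a hash tree.

```python
from itertools import combinations

TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def apriori(transactions, min_count):
    """Return {frozenset: support count} for all frequent itemsets."""
    def count(cands):
        # One pass over the transactions per level.
        return {c: sum(1 for t in transactions if c <= t) for c in cands}

    items = set().union(*transactions)
    level = {c: n for c, n in count({frozenset([i]) for i in items}).items()
             if n >= min_count}
    frequent = dict(level)
    k = 1
    while level:
        prev = set(level)
        # Join: merge pairs of frequent k-itemsets into candidate (k+1)-itemsets.
        cands = {a | b for a in prev for b in prev if len(a | b) == k + 1}
        # Prune: drop candidates that have an infrequent k-subset.
        cands = {c for c in cands
                 if all(frozenset(s) in prev for s in combinations(c, k))}
        # Count and keep only the frequent candidates.
        level = {c: n for c, n in count(cands).items() if n >= min_count}
        frequent.update(level)
        k += 1
    return frequent

result = apriori(TRANSACTIONS, min_count=3)
print(len(result))  # 8
```

On the five example transactions with a minimum support count of 3, the sketch finds four frequent 1-itemsets and four frequent 2-itemsets; the only 3-itemset that survives pruning, {Bread, Milk, Diaper}, occurs in two transactions and is therefore rejected in the counting step.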

Reducing the number of comparisons

Comparisons (matchings) between the candidate itemsets and the transactions:
- All transactions must be scanned to compute the support of each candidate itemset

To reduce the number of comparisons, store the candidate itemsets in a hash structure:
- Instead of comparing each transaction against every candidate, compare it only against the candidates held in the hashed buckets it reaches

Generating the hash tree

Suppose there are 15 candidate 3-itemsets:
{1 4 5}, {1 2 4}, {4 5 7}, {1 2 5}, {4 5 8}, {1 5 9}, {1 3 6}, {2 3 4}, {5 6 7}, {3 4 5}, {3 5 6}, {3 5 7}, {6 8 9}, {3 6 7}, {3 6 8}

To build the hash tree we need:
- A hash function. Example: h(p) = p mod 3
- A maximum leaf size: the maximum number of itemsets stored in one leaf node (if the number of itemsets exceeds this value, the node is split further). Example: max leaf size = 3

[Figure: the hash function sends items 1, 4, 7 to the left branch, items 2, 5, 8 to the middle branch, and items 3, 6, 9 to the right branch. The leaves of the resulting tree hold: {1 4 5}; {1 2 4}, {4 5 7}; {1 2 5}, {4 5 8}; {1 5 9}; {1 3 6}; {2 3 4}, {5 6 7}; {3 4 5}; {3 5 6}, {3 5 7}, {6 8 9}; {3 6 7}, {3 6 8}.]

Association rule mining with a hash tree (1)

[Figure: the hash tree storing the candidate itemsets. Hashing on 1, 4, or 7 follows the left branch of the root.]

Association rule mining with a hash tree (2)

[Figure: the same hash tree. Hashing on 2, 5, or 8 follows the middle branch of the root.]

Association rule mining with a hash tree (3)

[Figure: the same hash tree. Hashing on 3, 6, or 9 follows the right branch of the root.]

The k-itemsets in a transaction

Given a transaction t, which 3-itemsets does it contain?
Assume that the items within each itemset are listed in lexicographic order.

Identifying itemsets with the hash tree (1)

Transaction t = {1 2 3 5 6}

At the root, t is split by its first item:
- 1 + {2 3 5 6}
- 2 + {3 5 6}
- 3 + {5 6}

[Figure: each prefix is hashed at the root of the hash tree (branches 1,4,7 / 2,5,8 / 3,6,9).]

Identifying itemsets with the hash tree (2)

At the second level, the prefix starting with 1 is split again:
- 1 2 + {3 5 6}
- 1 3 + {5 6}
- 1 5 + {6}

Transaction t needs to be compared with only 11 of the 15 candidate itemsets!
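The hash tree walk described on these slides can be sketched as follows. This is a simplified illustration, not the authors' implementation: the class `Node` and the functions `insert` and `match` are hypothetical names, and leaves are identified with `id()` only to avoid counting a leaf twice. The hash function h(p) = p mod 3, the maximum leaf size of 3, the 15 candidates, and the transaction {1 2 3 5 6} are taken from the slides.

```python
CANDIDATES = [(1, 4, 5), (1, 2, 4), (4, 5, 7), (1, 2, 5), (4, 5, 8),
              (1, 5, 9), (1, 3, 6), (2, 3, 4), (5, 6, 7), (3, 4, 5),
              (3, 5, 6), (3, 5, 7), (6, 8, 9), (3, 6, 7), (3, 6, 8)]
K, MAX_LEAF = 3, 3

def h(item):
    return item % 3

class Node:
    def __init__(self, depth):
        self.depth = depth
        self.itemsets = []    # candidates stored here while the node is a leaf
        self.children = None  # bucket -> Node once the node has been split

def insert(node, itemset):
    if node.children is None:
        node.itemsets.append(itemset)
        if len(node.itemsets) <= MAX_LEAF or node.depth == K:
            return
        # Leaf overflow: turn the leaf into an internal node and re-insert.
        pending, node.itemsets, node.children = node.itemsets, [], {}
        for s in pending:
            insert(node, s)
        return
    bucket = h(itemset[node.depth])
    child = node.children.setdefault(bucket, Node(node.depth + 1))
    insert(child, itemset)

def match(node, t, start, found, visited):
    """Collect candidates contained in the sorted transaction t."""
    if node.children is None:
        if id(node) not in visited:
            visited[id(node)] = len(node.itemsets)  # candidates compared here
            found.update(s for s in node.itemsets if set(s) <= set(t))
        return
    for i in range(start, len(t)):
        child = node.children.get(h(t[i]))
        if child is not None:
            match(child, t, i + 1, found, visited)

root = Node(0)
for c in CANDIDATES:
    insert(root, c)

found, visited = set(), {}
match(root, (1, 2, 3, 5, 6), 0, found, visited)
print(sorted(found))          # [(1, 2, 5), (1, 3, 6), (3, 5, 6)]
print(sum(visited.values()))  # 11
```

The second number is the count of candidates stored in the leaves that the transaction reaches, which reproduces the "11 of 15" figure: the leaves holding {1 4 5}, {1 2 4}, {4 5 7}, and {3 4 5} are never visited.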

Apriori: factors affecting complexity

- Choice of the minsup threshold
  - A minsup value that is too low produces many frequent itemsets
  - This can increase both the number of candidates and the maximum length (size) of the frequent itemsets
- Number of items in the database (of transactions)
  - More memory is needed to store the support of each item
  - If the number of frequent items (frequent 1-itemsets) grows, both the computation cost and the I/O cost (scanning the transactions) grow as well
- Size of the database (of transactions)
  - Apriori scans the database multiple times, so its computation cost grows with the number of transactions
- Average transaction width
  - As the average width (number of items) of the transactions grows, the maximum length of the frequent itemsets tends to grow, and so does the cost of traversing the hash tree

Representing the frequent itemsets

- In practice, the number of frequent itemsets generated from a transaction database can be very large
- A compact representation is needed: a (small) set of representative frequent itemsets from which all the other frequent itemsets can be derived
- There are 2 such representations:
  - Maximal frequent itemsets
  - Closed frequent itemsets

Maximal frequent itemsets

A frequent itemset is maximal if every one of its supersets is infrequent.

[Figure: itemset lattice with a border separating the frequent itemsets from the infrequent itemsets; the maximal frequent itemsets lie just inside the border.]

Closed frequent itemsets

A frequent itemset is closed if none of its supersets has the same support as it does.

TID  Items
1    {A,B}
2    {B,C,D}
3    {A,B,C,D}
4    {A,B,D}
5    {A,B,C,D}

Itemset      Support
{A}          4
{B}          5
{C}          3
{D}          4
{A,B}        4
{A,C}        2
{A,D}        3
{B,C}        3
{B,D}        4
{C,D}        3
{A,B,C}      2
{A,B,D}      3
{A,C,D}      2
{B,C,D}      3
{A,B,C,D}    2

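Frequent itemsets: maximal vs. closed (1)

TID  Items
1    ABC
2    ABCD
3    BCE
4    ACDE
5    DE

Itemset lattice, each itemset annotated with the TIDs of the transactions that contain it:
Level 1: A {1,2,4}; B {1,2,3}; C {1,2,3,4}; D {2,4,5}; E {3,4,5}
Level 2: AB {1,2}; AC {1,2,4}; AD {2,4}; AE {4}; BC {1,2,3}; BD {2}; BE {3}; CD {2,4}; CE {3,4}; DE {4,5}
Level 3: ABC {1,2}; ABD {2}; ABE {}; ACD {2,4}; ACE {4}; ADE {4}; BCD {2}; BCE {3}; BDE {}; CDE {4}
Level 4: ABCD {2}; ABCE {}; ABDE {}; ACDE {4}; BCDE {}
Level 5: ABCDE {}

Itemsets with an empty TID list are not supported by any transaction.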

Frequent itemsets: maximal vs. closed (2)

Minsup = 2 (as a support count), on the same five transactions (1: ABC, 2: ABCD, 3: BCE, 4: ACDE, 5: DE).

[Figure: the same TID-annotated lattice. Some itemsets are marked "closed but not maximal", others "closed and maximal".]

# Closed = 9
# Maximal = 4
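The two counts for this example (9 closed and 4 maximal frequent itemsets at a minimum support count of 2) can be reproduced with a brute-force Python sketch. The variable names are illustrative assumptions; the enumeration is exhaustive and only suitable for a toy example of this size.

```python
from itertools import combinations

# Transactions of the maximal vs. closed example (TID 1..5).
TRANSACTIONS = [set("ABC"), set("ABCD"), set("BCE"), set("ACDE"), set("DE")]
MIN_COUNT = 2

# Support count of every frequent itemset, by exhaustive enumeration.
items = sorted(set().union(*TRANSACTIONS))
support = {}
for k in range(1, len(items) + 1):
    for cand in combinations(items, k):
        n = sum(1 for t in TRANSACTIONS if set(cand) <= t)
        if n >= MIN_COUNT:
            support[frozenset(cand)] = n

def frequent_supersets(x):
    """Frequent itemsets that strictly contain x."""
    return [y for y in support if x < y]

# Maximal: no frequent proper superset.
maximal = {x for x in support if not frequent_supersets(x)}
# Closed: every proper superset has strictly smaller support.
# (Infrequent supersets are below MIN_COUNT, hence below support[x].)
closed = {x for x in support
          if all(support[y] < support[x] for y in frequent_supersets(x))}

print(len(closed), len(maximal))  # 9 4
print(sorted("".join(sorted(x)) for x in maximal))  # ['ABC', 'ACD', 'CE', 'DE']
```

Every maximal itemset found here also appears in `closed`, while itemsets such as C (support 4) are closed without being maximal.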

Frequent itemsets: maximal vs. closed (3)

- Every maximal frequent itemset is also a closed frequent itemset
- The representation based on maximal frequent itemsets does not retain the support information of the subsets of each maximal frequent itemset

The FP-Growth algorithm

- An alternative method for finding the frequent itemsets
  - Recall: Apriori uses a generate-and-test approach (it generates candidate itemsets and tests whether each one is frequent)
- FP-Growth represents the transaction data with a data structure called the FP-tree
- FP-Growth uses the FP-tree to find the frequent itemsets directly

FP-tree representation

- For each transaction, the FP-tree builds one path in the tree
- Two transactions that share some items have paths with a common part (segment)
  - The more the paths overlap, the more compact (compressed) the FP-tree representation becomes
- If the FP-tree is small enough to fit in main memory, FP-Growth can find the frequent itemsets directly from the in-memory FP-tree
  - There is no need to scan the data stored on disk repeatedly

Building the FP-tree (1)

- Initially, the FP-tree contains only the root node (denoted null)
- The transaction database is scanned a first time to determine the support of each item
- Infrequent items are discarded
- Frequent items are sorted in decreasing order of support
  - In the example (on the following slides), the decreasing support order is: A, B, C, D, E
- The transaction database is scanned a second time to build the FP-tree

Building the FP-tree (2)

TID  Items
1    {A,B}
2    {B,C,D}
3    {A,C,D,E}
4    {A,D,E}
5    {A,B,C}
6    {A,B,C,D}
7    {A}
8    {A,B,C}
9    {A,B,D}
10   {B,C,E}

After reading transaction 1:
null → A:1 → B:1

After reading transaction 2:
null → A:1 → B:1
null → B:1 → C:1 → D:1

After reading transaction 3:
null → A:2 → B:1
null → A:2 → C:1 → D:1 → E:1
null → B:1 → C:1 → D:1

Building the FP-tree (3)

After reading transaction 10 (the whole transaction database of 10 transactions):

null
  A:8
    B:5
      C:3
        D:1
      D:1
    C:1
      D:1
        E:1
    D:1
      E:1
  B:2
    C:2
      D:1
      E:1

Pointer table: one entry per item (A, B, C, D, E), each pointing to the nodes of that item in the tree.
The pointers are used by FP-Growth while generating the frequent itemsets.
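The two-pass construction can be sketched in Python as below. The names `FPNode` and `build_fp_tree` are assumptions made for this example, and the pointer table is simplified to a dictionary that maps each item to a list of its nodes instead of a linked list. Ties in support are broken alphabetically, which is also an assumption.

```python
from collections import Counter

TRANSACTIONS = [
    {"A", "B"}, {"B", "C", "D"}, {"A", "C", "D", "E"}, {"A", "D", "E"},
    {"A", "B", "C"}, {"A", "B", "C", "D"}, {"A"}, {"A", "B", "C"},
    {"A", "B", "D"}, {"B", "C", "E"},
]

class FPNode:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_count):
    # Pass 1: support of each item; infrequent items are dropped.
    counts = Counter(i for t in transactions for i in t)
    order = {i: (-c, i) for i, c in counts.items() if c >= min_count}
    root = FPNode(None, None)
    header = {i: [] for i in order}  # item -> its nodes (the "pointers")
    # Pass 2: insert each transaction as a path, items in decreasing support.
    for t in transactions:
        node = root
        for item in sorted((i for i in t if i in order), key=order.get):
            if item not in node.children:
                node.children[item] = FPNode(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += 1
    return root, header

root, header = build_fp_tree(TRANSACTIONS, min_count=2)
print(root.children["A"].count, root.children["B"].count)  # 8 2
print(sum(n.count for n in header["E"]))                    # 3
```

The printed counts match the tree obtained after transaction 10 (A:8 and B:2 under the root), and summing the counts of the E nodes gives the support of E.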

FP-Growth: generating the frequent itemsets

- FP-Growth generates the frequent itemsets directly from the FP-tree, from the leaves up to the root (bottom-up)
  - In the example above, FP-Growth first finds the frequent itemsets ending in E, then those ending in D, then C, then B, and finally A
- Since each transaction is represented by a path in the FP-tree, the frequent itemsets ending in a given item (e.g., E) can be found by examining the paths that contain that item (E)
  - These paths are easy to locate through the pointers associated with the item's nodes (e.g., E)

Paths ending in a given item

[Figure: five sub-trees of the FP-tree, showing the paths ending in e, the paths ending in d, the paths ending in c, the paths ending in b, and the paths ending in a.]

Finding the frequent itemsets

FP-Growth finds all the frequent itemsets ending in a given item with a divide-and-conquer strategy.

Example: find all the frequent itemsets ending in e.
- First, check whether the 1-itemset {e} is frequent
- If it is frequent, consider the subproblems: find all the frequent itemsets ending in de, in ce, in be, and in ae
- Each of these subproblems is in turn decomposed into smaller subproblems
- Combining the solutions of the subproblems yields the frequent itemsets ending in e

Example: frequent itemsets ending in e (1)

- Identify all the paths in the FP-tree that end in e: these are the prefix paths for e
- From the prefix paths for e, determine the support of e by adding up the support counts associated with the e nodes
- Assuming minsup = 2, the itemset {e} is frequent (its support is 3 > minsup)

[Figure: the prefix paths for e.]

Example: frequent itemsets ending in e (2)

- Since {e} is frequent, FP-Growth must solve the subproblems: find the frequent itemsets ending in de, in ce, in be, and in ae
- First, the prefix paths for e are converted into a conditional FP-tree
  - It has the same structure as an FP-tree
  - It is used to find the frequent itemsets ending in a particular item

Building the conditional FP-tree

- Update the support counts along the prefix paths
  - Some counts include transactions that do not contain item e
  - Example: the path null → b:2 → c:2 → e:1 also counts a transaction {b,c} that does not contain e. The counts along this path must therefore be set to 1, the number of transactions that contain {b,c,e}
- Remove the e nodes from the prefix paths
- After the counts are updated, some items may no longer be frequent and are removed
  - Example: node b now has support 1

[Figure: the conditional FP-tree for e.]
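The count updates described above can be illustrated without building a tree. The following Python sketch works on a flat list of (path, count) pairs, which is a simplification swapped in for the conditional FP-tree; the function name `conditional_paths` and the string encoding of transactions (items already written in the order A, B, C, D, E) are assumptions made for this example.

```python
from collections import Counter

# The 10 example transactions, items in decreasing-support order A..E,
# each with an initial count of 1.
PATHS = [(t, 1) for t in
         ["AB", "BCD", "ACDE", "ADE", "ABC", "ABCD", "A", "ABC", "ABD", "BCE"]]

def conditional_paths(item, paths, min_count):
    """Return (support of item, counts of prefix items, pruned prefix paths)."""
    # Prefix paths: the part of each path that precedes `item`.
    prefixes = [(p[:p.index(item)], n) for p, n in paths if item in p]
    support = sum(n for _, n in prefixes)
    # Updated counts: only transactions that contain `item` are counted.
    counts = Counter()
    for prefix, n in prefixes:
        for i in prefix:
            counts[i] += n
    # Items that are no longer frequent are removed from the paths.
    kept = {i for i, c in counts.items() if c >= min_count}
    pruned = [("".join(i for i in p if i in kept), n) for p, n in prefixes]
    return support, dict(counts), pruned

sup_e, counts_e, cond_e = conditional_paths("E", PATHS, min_count=2)
print(sup_e)     # 3
print(counts_e)  # {'A': 2, 'C': 2, 'D': 2, 'B': 1}

sup_de, _, _ = conditional_paths("D", cond_e, min_count=2)
print(sup_de)    # 2, the support of {d, e}
```

With a minimum count of 2, item b drops out of the conditional data for e (its updated count is 1), and applying the same step to d inside that conditional data gives the support of {d,e}.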

Example: frequent itemsets ending in e (3)

- FP-Growth uses the conditional FP-tree for e to solve the subproblems: find the frequent itemsets ending in de, in ce, in be, and in ae
- Example: to find the frequent itemsets ending in de, the prefix paths for d are built from the conditional FP-tree for e
- Adding up the support counts associated with the d nodes gives the support of the itemset {d,e}
- The support of {d,e} is 2, so it is a frequent itemset

[Figure: the prefix paths for de.]

Rule generation (1)

For each frequent itemset L, find all non-empty proper subsets $f \subset L$ such that the rule $f \rightarrow L - f$ satisfies the minimum confidence requirement.

Example: for the frequent itemset {A,B,C,D}, the candidate rules are:
ABC → D, ABD → C, ACD → B, BCD → A,
A → BCD, B → ACD, C → ABD, D → ABC,
AB → CD, AC → BD, AD → BC, BC → AD, BD → AC, CD → AB

If |L| = k, there are $2^k - 2$ candidate association rules (ignoring the 2 rules $L \rightarrow \emptyset$ and $\emptyset \rightarrow L$).
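The count of $2^k - 2$ candidate rules can be verified by enumerating every binary partition of an itemset. The function name `candidate_rules` is an assumption made for this sketch.

```python
from itertools import combinations

def candidate_rules(itemset):
    """All rules f -> L - f with f a non-empty proper subset of L."""
    items = sorted(itemset)
    rules = []
    for r in range(1, len(items)):            # size of the antecedent f
        for lhs in combinations(items, r):
            rhs = tuple(i for i in items if i not in lhs)
            rules.append((lhs, rhs))
    return rules

rules = candidate_rules("ABCD")
print(len(rules))  # 14 = 2**4 - 2
for lhs, rhs in rules[:4]:
    print("".join(lhs), "->", "".join(rhs))
```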

Rule generation (2)

How can rules be generated efficiently from the frequent itemsets?

- In general, confidence does not have the anti-monotone property
  - c(ABC → D) can be larger or smaller than c(AB → D)
- However, the confidence of rules generated from the same frequent itemset does have the anti-monotone property
  - Example: for L = {A,B,C,D}: c(ABC → D) ≥ c(AB → CD) ≥ c(A → BCD)
  - Confidence is anti-monotone with respect to the number of items on the right-hand side of the rule

Apriori: rule generation (1)

[Figure: lattice of rules. One rule is marked as a low-confidence rule, and the rules below it in the lattice are pruned.]

Apriori: rule generation (2)

- A candidate rule is generated by merging 2 rules that share the same prefix in the rule consequent
  - Example: merging the 2 rules CD → AB and BD → AC produces the candidate rule D → ABC
- Prune the rule D → ABC if any of its subset rules (AD → BC, BCD → A, …) does not have high confidence (< minconf)

[Figure: CD → AB and BD → AC are merged into D → ABC.]
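The level-wise rule generation can be sketched as follows. This is a simplified illustration under stated assumptions: the function names `sigma` and `gen_rules` are hypothetical, and consequents are merged by taking the union of any two surviving consequents rather than by matching prefixes, which yields the same candidates for this small example. The data are the five example transactions and the frequent itemset {Milk, Diaper, Beer}.

```python
from itertools import combinations

TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def sigma(itemset):
    return sum(1 for t in TRANSACTIONS if itemset <= t)

def gen_rules(freq_itemset, minconf):
    """Return [(antecedent, consequent, confidence)] with confidence >= minconf."""
    L = frozenset(freq_itemset)
    rules = []
    consequents = [frozenset([i]) for i in L]  # start with 1-item consequents
    m = 1
    while consequents and m < len(L):
        kept = []
        for y in consequents:
            conf = sigma(L) / sigma(L - y)
            if conf >= minconf:
                rules.append((L - y, y, conf))
                kept.append(y)
        # A larger consequent is tried only if all of its m-item
        # sub-consequents produced high-confidence rules.
        kept_set = set(kept)
        consequents = list({a | b for a in kept for b in kept
                            if len(a | b) == m + 1
                            and all(frozenset(s) in kept_set
                                    for s in combinations(a | b, m))})
        m += 1
    return rules

for lhs, rhs, conf in gen_rules({"Milk", "Diaper", "Beer"}, minconf=0.6):
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

With minconf = 0.6 the sketch keeps four of the six possible rules; raising minconf to 0.7 leaves only {Milk, Beer} → {Diaper}, and no two-item consequent is ever evaluated because two of the one-item consequents were already rejected.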

References

P.-N. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining (chapter 6). Addison-Wesley, 2005.