Elnaz Delpisheh York University Department of Computer Science and Engineering April 13, 2015...

Elnaz Delpisheh York University Department of Computer Science and Engineering July 4, 2022 Identifying Interesting Association Rules with Genetic Algorithms


Page 1: Elnaz Delpisheh York University Department of Computer Science and Engineering April 13, 2015 Identifying Interesting Association Rules with Genetic Algorithms.

Elnaz Delpisheh

York University

Department of Computer Science and Engineering

April 21, 2023

Identifying Interesting Association Rules with Genetic Algorithms

Page 2:

Data mining

[Figure: diagram with the labels "Too much data", "Data", "Data Mining", "Association rules"]

• I = {i1, i2, ..., in} is a set of items.
• D = {t1, t2, ..., tm} is a transactional database.
• Each transaction ti is a nonempty subset of I.
• An association rule has the form A → B, where A and B are itemsets, A ⊂ I, B ⊂ I, and A ∩ B = ∅.
• The Apriori algorithm is the most commonly used algorithm for association rule mining.
• Example: {milk, eggs} → {bread}.

Page 3:

Apriori Algorithm

TID    List of item IDs
T100   I1, I2, I3
T200   I2, I4
T300   I2, I3
T400   I1, I2, I4
T500   I1, I3
T600   I2, I3
T700   I1, I3
T800   I1, I2, I3, I5
T900   I1, I2, I3
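As a rough illustration of how Apriori would process the transaction table above, the following Python sketch counts supports level by level. The minimum support count of 2 and the function name `apriori` are assumptions made for this example; the slides do not state a threshold.

```python
from itertools import combinations

# Transactions copied from the table above (T100 to T900).
TRANSACTIONS = [
    {"I1", "I2", "I3"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
    {"I2", "I3"},
    {"I1", "I3"},
    {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]


def apriori(transactions, min_count):
    """Return {frozenset: support count} for every frequent itemset."""

    def count(candidates):
        # Count each candidate and keep only those meeting the threshold.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_count}

    items = sorted({i for t in transactions for i in t})
    level = count({frozenset([i]) for i in items})
    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = count(candidates)
        frequent.update(level)
        k += 1
    return frequent


frequent_itemsets = apriori(TRANSACTIONS, min_count=2)
for itemset, n in sorted(frequent_itemsets.items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With this assumed threshold, I5 is dropped at the first level because it appears in only one transaction, and {I1, I2, I3} is the only frequent 3-itemset.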

Page 4:

Apriori Algorithm (Cont.)

Page 5:

Association rule mining

[Figure: diagram with the labels "Too much data", "Data", "Data Mining", "Association rules", "Too many association rules"]

Page 6:

Interestingness criteria

• Comprehensibility
• Conciseness
• Diversity
• Generality
• Novelty
• Utility
• ...

Page 7:

Interestingness measures

Subjective measures
• Data and the user's prior knowledge are considered.
• Comprehensibility, novelty, surprisingness, utility.

Objective measures
• The structure of an association rule is considered.
• Conciseness, diversity, generality, peculiarity.
• Example: Support. It represents the generality of a rule A → B by counting the number of transactions that contain both A and B.
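To make the support measure concrete, here is a minimal Python sketch that evaluates it on the transactions from the Apriori slide. The function names are illustrative assumptions. Confidence is included only because later slides refer to Conf, and it is defined here in the usual way as the support count of A ∪ B divided by the support count of A.

```python
# Transactions from the Apriori slide (T100 to T900).
TRANSACTIONS = [
    {"I1", "I2", "I3"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
    {"I2", "I3"},
    {"I1", "I3"},
    {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


def support(itemset, transactions):
    """Support as a fraction of all transactions."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Fraction of transactions containing the antecedent that also
    contain the consequent. Raises ZeroDivisionError if the antecedent
    never occurs."""
    both = support_count(set(antecedent) | set(consequent), transactions)
    return both / support_count(antecedent, transactions)


# Support count of the rule {I1} -> {I2}: transactions with both items.
print(support_count({"I1", "I2"}, TRANSACTIONS))  # 4
print(confidence({"I1"}, {"I2"}, TRANSACTIONS))
```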

Page 8:

Drawbacks of objective measures

• Database dependence
  o Lack of knowledge about the database
• Threshold dependence
  o Solution: multiple database reanalysis
  o Problem: large number of disk I/O operations

Database independence

Page 9:

Genetic algorithm-based learning (ARMGA)

1. Initialize the population.
2. Evaluate the individuals in the population.
3. Repeat until a stopping criterion is met:
   A. Select individuals from the current population.
   B. Recombine them to obtain more individuals.
   C. Evaluate the new individuals.
   D. Replace some or all of the individuals of the current population with offspring.
4. Return the best individual seen so far.
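The four steps above can be sketched as a generic loop in Python. This is not the ARMGA implementation: the toy fitness (counting 1 bits), binary tournament selection, full generational replacement, the fixed generation budget, and every function name here are assumptions chosen only to make the skeleton runnable.

```python
import random


def genetic_search(fitness, random_individual, crossover, mutate,
                   pop_size=20, generations=50, seed=0):
    rng = random.Random(seed)
    # 1. Initialize the population.
    population = [random_individual(rng) for _ in range(pop_size)]
    # 2. Evaluate the individuals and remember the best one.
    best = max(population, key=fitness)
    # 3. Repeat until the stopping criterion (here: a generation budget).
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            # A. Select two parents by binary tournament.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            # B. Recombine them, then mutate the child.
            offspring.append(mutate(crossover(p1, p2, rng), rng))
        # C. Evaluate the new individuals; track the best seen so far.
        best = max(offspring + [best], key=fitness)
        # D. Replace the whole population with the offspring.
        population = offspring
    # 4. Return the best individual seen so far.
    return best


# Toy problem: maximise the number of 1 bits in a 16-bit list.
N_BITS = 16


def random_bits(rng):
    return [rng.randint(0, 1) for _ in range(N_BITS)]


def one_point_crossover(a, b, rng):
    cut = rng.randrange(1, N_BITS)
    return a[:cut] + b[cut:]


def flip_mutation(bits, rng, rate=0.05):
    return [1 - b if rng.random() < rate else b for b in bits]


best = genetic_search(sum, random_bits, one_point_crossover, flip_mutation)
print(best, sum(best))
```

Because the best individual is tracked separately from the population, step 4 still returns it even if step D discards it.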

Page 10:

ARMGA Modeling

Given an association rule X → Y:

Requirement: Conf(X → Y) > Supp(Y)

Aim is to maximise: [formula not preserved in the transcript]
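One way to read the requirement Conf(X → Y) > Supp(Y) is as a positive-correlation condition. Assuming Supp denotes relative support (a fraction of all transactions), that Conf(X → Y) = Supp(X ∪ Y) / Supp(X), and that Supp(X) and Supp(Y) are both nonzero, the requirement can be rearranged as follows:

```latex
\begin{align*}
\mathrm{Conf}(X \rightarrow Y) > \mathrm{Supp}(Y)
  &\iff \frac{\mathrm{Supp}(X \cup Y)}{\mathrm{Supp}(X)} > \mathrm{Supp}(Y) \\
  &\iff \frac{\mathrm{Supp}(X \cup Y)}{\mathrm{Supp}(X)\,\mathrm{Supp}(Y)} > 1
\end{align*}
```

The last ratio is commonly called lift, so the requirement keeps only rules whose antecedent makes the consequent more likely than it is overall. On the transactions from the Apriori slide, Conf(I1 → I3) = 5/6 exceeds Supp(I3) = 7/9, so that rule qualifies, whereas Conf(I1 → I2) = 4/6 is below Supp(I2) = 7/9, so that rule does not.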

Page 11:

ARMGA Encoding

Michigan strategy

Given an association k-rule X → Y, where X, Y ⊂ I, I = {i1, i2, ..., in} is a set of items, and X ∩ Y = ∅.

For example: {A1, ..., Aj} → {Aj+1, ..., Ak}

Page 12:

ARMGA Encoding (Cont.)

The aforementioned encoding depends heavily on the length of the chromosome.

We use another type of encoding. Given a set of items {A, B, C, D, E, F}, the association rule ACF → B is encoded as follows:

00  11  00  01  10  00
A   B   C   D   E   F

00: item is in the antecedent
11: item is in the consequent
01/10: item is absent
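The two-bits-per-item encoding described above can be sketched in Python as follows. The function names `encode` and `decode` are hypothetical, and the choice to always emit "01" for an absent item is an assumption of this sketch ("10" is equally valid under the legend).

```python
ANTECEDENT = "00"
CONSEQUENT = "11"
ABSENT_CODES = ("01", "10")


def encode(items, antecedent, consequent):
    """Encode a rule as two bits per item, in the order given by items."""
    genes = []
    for item in items:
        if item in antecedent:
            genes.append(ANTECEDENT)
        elif item in consequent:
            genes.append(CONSEQUENT)
        else:
            genes.append(ABSENT_CODES[0])  # "10" would also mean absent
    return "".join(genes)


def decode(items, chromosome):
    """Recover (antecedent, consequent) item lists from a bit string."""
    antecedent, consequent = [], []
    for index, item in enumerate(items):
        gene = chromosome[2 * index:2 * index + 2]
        if gene == ANTECEDENT:
            antecedent.append(item)
        elif gene == CONSEQUENT:
            consequent.append(item)
        # Any other gene ("01" or "10") means the item is absent.
    return antecedent, consequent


ITEMS = ["A", "B", "C", "D", "E", "F"]
chromosome = encode(ITEMS, antecedent={"A", "C", "F"}, consequent={"B"})
print(chromosome)                     # 001100010100
print(decode(ITEMS, "001100011000"))  # (['A', 'C', 'F'], ['B'])
```

Every chromosome has exactly two bits per item, so its length is fixed by the item set rather than by the number of items in the rule.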

Page 13:

ARMGA Operators

• Select
• Crossover
• Mutation

Page 14:

ARMGA Operators - Select

Select(c, ps): acts as a filter of the chromosome.
• c: chromosome
• ps: pre-specified probability
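The slide gives only the signature and the description "filter", so the following Python sketch shows one plausible reading: the chromosome passes the filter with probability ps. That interpretation, the fact that the chromosome's content is not inspected, and the `rng` parameter are all assumptions of this sketch.

```python
import random


def select(chromosome, ps, rng=random):
    """Return True if the chromosome passes the filter.

    Assumed behaviour: the chromosome is kept with probability ps,
    regardless of its content.
    """
    if not 0.0 <= ps <= 1.0:
        raise ValueError("ps must be a probability in [0, 1]")
    return rng.random() < ps


# Usage: filter a small population of encoded rules.
population = ["001100010100", "110001000110", "000011010101"]
rng = random.Random(42)
survivors = [c for c in population if select(c, 0.5, rng)]
print(survivors)
```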

Page 15:

ARMGA Operators - Crossover

This operation uses a two-point strategy.
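A generic two-point crossover can be sketched in Python as below. The slide does not say how the cut points are chosen, so drawing them uniformly at random and representing a chromosome as a list of genes (so that a cut never splits a two-bit gene) are assumptions of this sketch.

```python
import random


def two_point_crossover(parent_a, parent_b, rng=random):
    """Swap the segment between two random cut points of equal-length parents."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if len(parent_a) < 1:
        raise ValueError("parents must not be empty")
    # Two distinct cut points i < j in the range 0..len(parent).
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b


# Usage with gene lists for two rules over the items A..F.
rule_1 = ["00", "11", "00", "01", "10", "00"]
rule_2 = ["11", "00", "01", "00", "01", "10"]
child_1, child_2 = two_point_crossover(rule_1, rule_2, random.Random(1))
print(child_1)
print(child_2)
```

Note that a child produced this way may lack an antecedent or a consequent, so a real implementation would need to check or repair offspring; the slides do not say how this is handled.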

Page 16:

ARMGA Operators - Mutate

Page 17:

ARMGA Initialization

Page 18:

ARMGA Algorithm

Page 19:

Empirical studies and evaluation

• Implement the entire procedure using Visual C++.
• Use WEKA to produce interesting association rules.
• Compare the results.
