Fuzzy Versus Quantitative Association Rules: A Fair Data-Driven Comparison

Shih-Ming Bai and Shyi-Ming Chen

Department of Computer Science and Information Engineering,

National Taiwan University of Science and Technology,

Taipei, Taiwan, R.O.C.

Outline

1. Introduction

2. Association Rule Mining

3. Experimental Approach

4. Conclusion

1. Introduction

The discovery of knowledge in databases, also called data mining, is a promising and important research area. In data mining, association rules are often used to identify and represent dependencies between attributes in a database.

Classical association rule mining works on binary attributes, but in most real-life applications databases contain many values other than 0 and 1. Quantitative attributes such as age or income are especially common.

2. Association Rule Mining

Table I and Table II show what can happen when the quantitative attributes in a small database are replaced by either binary or fuzzy attributes.
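
To make the contrast concrete, the following minimal Python sketch turns one quantitative attribute (age) into either a binary or a fuzzy attribute. The attribute name, the interval bounds, and the trapezoidal shape are illustrative assumptions, not values taken from Table I or Table II.

```python
def crisp_middle_aged(age, low=35.0, high=55.0):
    """Binary attribute: 1 inside the (assumed) interval [low, high], else 0."""
    return 1.0 if low <= age <= high else 0.0


def fuzzy_middle_aged(age, a=30.0, b=35.0, c=55.0, d=60.0):
    """Fuzzy attribute: trapezoidal membership with assumed corners a < b <= c < d."""
    if age <= a or age >= d:
        return 0.0
    if age < b:
        return (age - a) / (b - a)   # rising edge
    if age <= c:
        return 1.0                   # plateau
    return (d - age) / (d - c)       # falling edge


# Hypothetical ages chosen to sit on and around the interval boundaries.
ages = [29, 34, 35, 45, 56, 61]
crisp = [crisp_middle_aged(x) for x in ages]
fuzzy = [fuzzy_middle_aged(x) for x in ages]
for x, cr, fz in zip(ages, crisp, fuzzy):
    print(f"age={x:2d}  binary={cr:.0f}  fuzzy={fz:.1f}")
```

Ages 34 and 56 lie just outside the crisp interval and are mapped to 0, while the fuzzy attribute still assigns them a partial membership. This is the boundary effect that motivates fuzzy attributes.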

3. Experimental Approach

A. Data Set: FAM95

FAM95.DAT contains data for the 63,756 families that were interviewed in the March 1995 Current Population Survey (CPS).

B. Data-Driven Partition: Fuzzy c-means algorithm

Formula:

m = 1:

m = 2:

m = 3:
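
As a rough illustration of how the fuzzifier m shapes a data-driven partition, here is a minimal one-dimensional sketch of the standard fuzzy c-means iteration in plain Python. It assumes the textbook update rules (memberships proportional to distance raised to the power -2/(m - 1), centers as membership-weighted means); the function names, the toy income values, and the initial centers are hypothetical. Because the membership formula is undefined at m = 1, the sketch treats m = 1 as the crisp limit (each point fully belongs to its nearest center), which matches the slides calling m = 1 the discrete case.

```python
def memberships(x, centers, m):
    """Membership of point x in each cluster for fuzzifier m (m >= 1)."""
    dists = [abs(x - c) for c in centers]
    if m == 1 or min(dists) == 0.0:
        # Crisp limit (or point exactly on a center): nearest cluster gets 1.
        nearest = dists.index(min(dists))
        return [1.0 if j == nearest else 0.0 for j in range(len(centers))]
    power = 2.0 / (m - 1.0)
    inv = [d ** -power for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def fuzzy_c_means(data, centers, m, iterations=100, tol=1e-9):
    """Alternate membership and center updates until the centers stop moving."""
    centers = list(centers)
    for _ in range(iterations):
        u = [memberships(x, centers, m) for x in data]
        new_centers = []
        for j in range(len(centers)):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(centers[j])  # keep the center of an empty cluster
            else:
                new_centers.append(sum(w * x for w, x in zip(weights, data)) / total)
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, [memberships(x, centers, m) for x in data]


# Hypothetical incomes (in thousands) and initial centers.
incomes = [10, 12, 14, 30, 50, 52, 54]
results = {m: fuzzy_c_means(incomes, [10.0, 54.0], m) for m in (1, 2, 3)}
for m, (centers, u) in results.items():
    middle = u[incomes.index(30)]
    print(f"m={m}  centers={centers}  memberships of 30: {middle}")
```

With m = 1 the value 30 belongs entirely to one cluster; with m = 2 and m = 3 it is shared between both clusters, and the overlap grows with m.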

C. Comparing Association Rules

The rule rankings produced by the quantitative and the fuzzy algorithm are compared using the Spearman rank correlation coefficient.
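
The Spearman coefficient is the Pearson correlation of the two rank vectors, so it equals 1 when both algorithms order the rules identically and -1 when the orders are reversed. The sketch below is a minimal plain-Python version that averages the ranks of tied values; the confidence values are made up for illustration and are not taken from the experiments.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Hypothetical confidence values of the same five rules under each algorithm.
crisp_conf = [0.91, 0.85, 0.80, 0.72, 0.60]
fuzzy_conf = [0.89, 0.86, 0.75, 0.78, 0.58]
rho = spearman(crisp_conf, fuzzy_conf)
print(f"Spearman rank correlation: {rho:.2f}")
```

Here only the third and fourth rules swap places between the two rankings, so the coefficient stays close to 1.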

D. Quantitative Versus Fuzzy Association Rules

Table III lists the 20 strongest rules obtained from the discrete (m = 1) and the fuzzy algorithm (m = 3) along with their confidence and support values.
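
For readers unfamiliar with the two measures, the sketch below shows one common way to compute support and confidence so that the same code covers both the crisp and the fuzzy case. It assumes that each record stores a membership degree in [0, 1] per attribute and that degrees are combined with the minimum t-norm; the product is another frequent choice, and the slides do not state which one was used. The records and attribute names are invented for illustration.

```python
def support(records, items, tnorm=min):
    """Average degree to which the records satisfy all items at once."""
    total = 0.0
    for rec in records:
        degree = 1.0
        for item in items:
            degree = tnorm(degree, rec[item])
        total += degree
    return total / len(records)


def confidence(records, antecedent, consequent, tnorm=min):
    """Confidence of antecedent => consequent: supp(A and B) / supp(A)."""
    both = support(records, antecedent + consequent, tnorm)
    return both / support(records, antecedent, tnorm)


# Hypothetical crisp records: every membership is 0 or 1.
crisp_records = [
    {"young": 1, "low_income": 1},
    {"young": 1, "low_income": 0},
    {"young": 0, "low_income": 1},
    {"young": 0, "low_income": 0},
]
# Hypothetical fuzzy records: memberships may take any value in [0, 1].
fuzzy_records = [
    {"young": 0.9, "low_income": 0.8},
    {"young": 0.6, "low_income": 0.3},
    {"young": 0.2, "low_income": 0.7},
    {"young": 0.0, "low_income": 0.1},
]

crisp_supp = support(crisp_records, ["young", "low_income"])
crisp_conf_value = confidence(crisp_records, ["young"], ["low_income"])
fuzzy_supp = support(fuzzy_records, ["young", "low_income"])
fuzzy_conf_value = confidence(fuzzy_records, ["young"], ["low_income"])
print(crisp_supp, crisp_conf_value, fuzzy_supp, fuzzy_conf_value)
```

On 0/1 data the minimum reduces to a logical AND, so the function returns the classical support; the fuzzy case only differs in the membership values fed in.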

4. Conclusion

The typical motivation for involving fuzzy set theory in association rule mining is as follows:

1) it allows the rules to be formulated using vague linguistic expressions, which are easier for humans to grasp;

2) it suppresses the unwanted effect that boundary cases might cause.

But quantitative association rule mining also gives (the same strong) rules, formulated in the same way in natural language.

The sharp boundary problem is already inherently suppressed and can be further minimized by using sensible partitioning methods, as is already being done in quantitative association rule mining.

Hence, we may expect rules obtained using a data-driven approach to be significantly different from the rules obtained using an expert-driven approach. The comparison of fuzzy and quantitative association rules using an expert-driven approach (for large databases) is certainly an interesting topic for future research.

In this case, however, experts should also define the crisp intervals that correspond best to human intuition! The common practice of comparing data-driven crisp data mining with expert-driven fuzzy data mining does not provide convincing arguments for the introduction of fuzzy association rules.

Thank You!