SAK 5609 DATA MINING

SAK 5609 DATA MINING Prof. Madya Dr. Md. Nasir bin Sulaiman [email protected] 03-89466514


Page 1: SAK 5609 DATA MINING

SAK 5609 DATA MINING

Prof. Madya Dr. Md. Nasir bin Sulaiman

[email protected]

03-89466514

Page 2: SAK 5609 DATA MINING

Synopsis
Credit: 3(3+0)
Contact hours: 3 x 1 hour per week
Semester: I

The course emphasises the concepts of data mining. It covers principles of data mining, data mining functions, data mining processes, data mining techniques such as K-nearest neighbour and clustering algorithms, rule induction, decision tree algorithms, association rule mining, neural networks and genetic algorithms, and data mining examples. Industrial and scientific applications will be given.

Page 3: SAK 5609 DATA MINING

Assessment & References

Assessment:
– Exercises (10%)
– Project I (15%) + presentation I (5%), Week 7
– Project II (15%) + presentation II (5%), Week 14
– Mid-exam (20%, 1 hour), Week 6
– Final exam (30%, 1.5 hours), Weeks 15 - 17

References:
– Jiawei Han & Micheline Kamber (2006), “Data Mining: Concepts and Techniques”, 2nd ed., Morgan Kaufmann.
– Michael J. A. Berry & Gordon S. Linoff (2004), “Data Mining Techniques”, 2nd ed., Wiley.
– Other related articles

Page 4: SAK 5609 DATA MINING

Course Contents

Chapter 1 Introduction
– Motivation
– Origin of data mining
– What it is / isn’t
– The KDD process
– Types of data

Page 5: SAK 5609 DATA MINING

Chapter 2 Data mining tasks
– Classification
– Association rule mining
– Sequential pattern mining
– Clustering
– Anomaly detection

Page 6: SAK 5609 DATA MINING

Chapter 3 Data issues
– What is a data set?
– Types of attributes
– Transformation for different types
– Types of data
• Structured data, record data, data matrix, document data, transaction data, graph data, ordered data
– Data quality
• Noise and outliers, missing values, inconsistent/duplicate data

Page 7: SAK 5609 DATA MINING

Chapter 4 Data preprocessing
– Why Data Preprocessing?
– Why Is Data Preprocessing Important?
– Major Tasks in Data Preprocessing
• Data cleaning
• Data integration
• Data transformation
• Data reduction
• Data discretization

Page 8: SAK 5609 DATA MINING

Chapter 5 Association rule mining
– Introduction
– The Model
– Goal and Key Features
– Mining Algorithms
– Problems with the Association Rule Model
– Issues of association rules
– Other Main Works on Association Rules

Page 9: SAK 5609 DATA MINING

Chapter 6 Sequential Pattern Mining
– Sequence databases and pattern analysis
– Mining algorithms
– Challenges on sequential mining
– Studies on sequential mining

Page 10: SAK 5609 DATA MINING

Chapter 7 Classification and Prediction
– Classification Model
– General Approach
– Classification: A Two-Step Process
– Classification Techniques
– Evaluating classification methods
– Decision tree based classification, rule based classifiers, nearest neighbor classifiers, etc.

Page 11: SAK 5609 DATA MINING

Chapter 8 Clustering and Anomaly Detection
– What is/is not cluster analysis?
– Examples of clustering applications
– Types of data in clustering analysis
– Types of clustering: hierarchical, partitional
– Major Clustering Techniques
– Approaches to anomaly detection
– Issues dealing with anomalies

Page 12: SAK 5609 DATA MINING

Chapter 9 Data Mining Applications