SAK 5609 DATA MINING
Synopsis
Credit: 3 (3+0)
Contact hours: 3 x 1 hour per week
Semester: I
The course emphasises the concepts of data mining. It covers the principles of data mining, data mining functions, data mining processes, data mining techniques such as K-nearest neighbour and clustering algorithms, rule induction, decision tree algorithms, association rule mining, neural networks and genetic algorithms, and data mining examples. Industrial and scientific applications will be given.
Assessment & References
Assessment:
– Exercises (10%)
– Project I (15%) + presentation I (5%), Week 7
– Project II (15%) + presentation II (5%), Week 14
– Mid-exam (20%, 1 hour), Week 6
– Final exam (30%, 1.5 hours), Weeks 15-17
References:
– Jiawei Han & Micheline Kamber (2006), “Data Mining: Concepts and Techniques”, 2nd ed., Morgan Kaufmann.
– Michael J. A. Berry & Gordon S. Linoff (2004), “Data Mining Techniques”, 2nd ed., Wiley.
– Other related articles
Course Contents
Chapter 1 Introduction
– Motivation
– Origin of data mining
– What it is/isn’t
– The KDD process
– Types of data
Chapter 2 Data mining tasks
– Classification
– Association rule mining
– Sequential pattern mining
– Clustering
– Anomaly detection
Chapter 3 Data issues
– What is a data set?
– Types of attributes
– Transformation for different types
– Types of data
• Structured data, record data, data matrix, document data, transaction data, graph data, ordered data
– Data quality
• Noise and outliers, missing values, inconsistent/duplicate data
Chapter 4 Data preprocessing
– Why Data Preprocessing?
– Why Is Data Preprocessing Important?
– Major Tasks in Data Preprocessing
• Data Cleaning
• Data integration
• Data transformation
• Data reduction
• Data discretization
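Three of the preprocessing tasks listed above can be illustrated with a minimal sketch. The function names (`clean`, `min_max`, `discretize`), the sample `ages` data, and the choice of mean imputation, min-max scaling and equal-width binning are all assumptions made for illustration; the course itself may use other methods.

```python
from statistics import mean


def clean(values):
    """Data cleaning: replace missing values (None) with the mean of the known ones."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def min_max(values):
    """Data transformation: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, bins=3):
    """Data discretization: map values in [0, 1] to equal-width bin indices."""
    # The top edge (1.0) is clamped into the last bin.
    return [min(int(v * bins), bins - 1) for v in values]


# Hypothetical attribute with one missing value.
ages = [20, None, 40, 60]
cleaned = clean(ages)
scaled = min_max(cleaned)
binned = discretize(scaled)
print(cleaned, scaled, binned)
```

Data integration and data reduction are left out of the sketch because they depend on having several sources or many attributes to work with.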
Chapter 5 Association rule mining
– Introduction
– The Model
– Goal and Key Features
– Mining Algorithms
– Problems with the Association Rule Model
– Issues of association rules
– Other Main Works on Association Rules
Chapter 6 Sequential Pattern Mining
– Sequence databases and pattern analysis
– Mining algorithms
– Challenges on sequential mining
– Studies on sequential mining
Chapter 7 Classification and Prediction
– Classification Model
– General Approach
– Classification: A Two-Step Process
– Classification Techniques
– Evaluating classification methods
– Decision Tree Based Classification, rule based classifiers, nearest neighbor classifiers, etc.
Chapter 8 Clustering and Anomaly
– What is/is not cluster analysis?
– Examples of clustering applications
– Types of data in clustering analysis
– Types of clustering: hierarchical, partitional
– Major Clustering Techniques
– Approaches to anomaly detection
– Issues dealing with anomalies
Chapter 9 Data Mining Applications