Data Mining: Data Mining Research

Romi Satria Wahono
romi@romisatriawahono.net
http://romisatriawahono.net
+6281586220090

Romi Satria Wahono

• SD Sompok Semarang (1987)
• SMPN 8 Semarang (1990)
• SMA Taruna Nusantara, Magelang (1993)
• S1, S2, and S3 (on-leave) degrees (bachelor's, master's, doctoral), Department of Computer Sciences, Saitama University, Japan (1994-2004)
• Research interests: Software Engineering and Intelligent Systems
• Founder of IlmuKomputer.Com
• Researcher at LIPI (2004-2007)
• Founder and CEO of PT Brainmatics Cipta Informatika

Course Outline
1. Introduction to Data Mining
2. The Data Mining Process
3. Evaluation and Validation in Data Mining
4. Data Mining Methods and Algorithms
5. Data Mining Research

Data Mining Research

Data Mining Research
1. Standard Research Process in Data Mining
2. Journal Publications on Data Mining
3. Research on Classification
4. Research on Clustering
5. Research on Prediction
6. Research on Association Rule

Standard Research Process in Data Mining

Data Mining Standard Process (CRISP-DM)

• A cross-industry standard was clearly required, one that is industry-neutral, tool-neutral, and application-neutral
• The Cross-Industry Standard Process for Data Mining (CRISP-DM) was developed in 1996 (Chapman, 2000)
• CRISP-DM provides a nonproprietary and freely available standard process for fitting data mining into the general problem-solving strategy of a business or research unit

CRISP-DM

1. Business Understanding Phase
• Enunciate the project objectives and requirements clearly in terms of the business or research unit as a whole
• Translate these goals and restrictions into the formulation of a data mining problem definition
• Prepare a preliminary strategy for achieving these objectives

2. Data Understanding Phase
• Collect the data
• Use exploratory data analysis to familiarize yourself with the data and discover initial insights
• Evaluate the quality of the data
• If desired, select interesting subsets that may contain actionable patterns
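
As a rough illustration of the data understanding steps, the sketch below profiles a small data set using only the Python standard library. The records, the column names (`age`, `income`, `approved`), and the helper functions `profile` and `class_counts` are hypothetical and do not come from the course material. It counts missing values as a simple data quality check and reports basic summary statistics as a first exploratory look.

```python
from statistics import mean, median

# Hypothetical raw records (assumption); None marks a missing value.
records = [
    {"age": 25, "income": 3.0, "approved": "yes"},
    {"age": 41, "income": None, "approved": "no"},
    {"age": 37, "income": 5.5, "approved": "yes"},
    {"age": None, "income": 2.0, "approved": "no"},
    {"age": 52, "income": 8.0, "approved": "yes"},
]


def profile(rows, column):
    """Summarize one numeric column: missing count and basic statistics."""
    present = [r[column] for r in rows if r[column] is not None]
    return {
        "missing": len(rows) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
    }


def class_counts(rows, column):
    """Count how often each value of a categorical column occurs."""
    counts = {}
    for r in rows:
        counts[r[column]] = counts.get(r[column], 0) + 1
    return counts


age_profile = profile(records, "age")
income_profile = profile(records, "income")
label_counts = class_counts(records, "approved")

print(age_profile)
print(income_profile)
print(label_counts)
```

On a real project the same questions (how much is missing, what ranges the variables take, how balanced the target is) would normally be asked of the full collected data before any modeling starts.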

3. Data Preparation Phase
• Prepare from the initial raw data the final data set that is to be used for all subsequent phases. This phase is very labor intensive
• Select the cases and variables you want to analyze and that are appropriate for your analysis
• Perform transformations on certain variables, if needed
• Clean the raw data so that it is ready for the modeling tools
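
The preparation steps can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the sample records, the choice of `age` and `income` as the selected variables, dropping incomplete cases as the cleaning rule, and min-max scaling as the transformation. Other cleaning rules (for example imputation) and other transformations are equally valid.

```python
# Hypothetical raw data (assumption); None marks a missing value.
raw = [
    {"id": 1, "age": 25, "income": 3.0, "approved": "yes"},
    {"id": 2, "age": 41, "income": None, "approved": "no"},
    {"id": 3, "age": 37, "income": 5.0, "approved": "yes"},
    {"id": 4, "age": 52, "income": 8.0, "approved": "no"},
]

FEATURES = ["age", "income"]  # variables selected for analysis
TARGET = "approved"


def select_and_clean(rows):
    """Keep only the selected variables and drop cases with missing values."""
    kept = []
    for r in rows:
        case = {k: r[k] for k in FEATURES + [TARGET]}
        if all(v is not None for v in case.values()):
            kept.append(case)
    return kept


def min_max_scale(rows, column):
    """Rescale one numeric column to the range [0, 1]; returns new rows."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for a constant column
    return [{**r, column: (r[column] - lo) / span} for r in rows]


prepared = select_and_clean(raw)
for col in FEATURES:
    prepared = min_max_scale(prepared, col)

for row in prepared:
    print(row)
```

Scaling is applied after cleaning so that the minimum and maximum are computed only from the cases that will actually be modeled.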

4. Modeling Phase
• Select and apply appropriate modeling techniques
• Calibrate model settings to optimize results
• Remember that often, several different techniques may be used for the same data mining problem
• If necessary, loop back to the data preparation phase to bring the form of the data into line with the specific requirements of a particular data mining technique
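
The idea of trying several techniques and calibrating their settings can be sketched as follows. This is a minimal illustration under stated assumptions, not a method prescribed by the slides: the tiny training and validation sets are invented, the two techniques (a majority-class baseline and k-nearest neighbors) are chosen only because they fit in a few lines, and the candidate values of `k` are arbitrary.

```python
from collections import Counter

# Hypothetical scaled data (assumption): (features, label) pairs.
train = [
    ((0.1, 0.2), "no"), ((0.2, 0.1), "no"), ((0.3, 0.3), "no"),
    ((0.7, 0.8), "yes"), ((0.8, 0.7), "yes"), ((0.9, 0.9), "yes"),
]
valid = [
    ((0.2, 0.3), "no"), ((0.8, 0.9), "yes"),
    ((0.4, 0.4), "no"), ((0.6, 0.7), "yes"),
]


def majority_baseline(train_rows):
    """Technique 1: always predict the most frequent training label."""
    label = Counter(y for _, y in train_rows).most_common(1)[0][0]
    return lambda x: label


def knn(train_rows, k):
    """Technique 2: k-nearest neighbors with squared Euclidean distance."""
    def predict(x):
        nearest = sorted(
            train_rows,
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
        )[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


# Several techniques and settings for the same problem.
candidates = {"baseline": majority_baseline(train)}
for k in (1, 3, 5):  # calibrating the k setting
    candidates[f"knn_k{k}"] = knn(train, k)

scores = {name: accuracy(model, valid) for name, model in candidates.items()}
best = max(scores, key=scores.get)

print(scores)
print("selected:", best)
```

The k-nearest neighbors model depends on the variables being on comparable scales, which is one example of a technique-specific requirement that can force a return to data preparation.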

5. Evaluation Phase
• Evaluate the one or more models delivered in the modeling phase for quality and effectiveness before deploying them for use in the field
• Determine whether the model in fact achieves the objectives set for it in the first phase
• Establish whether some important facet of the business or research problem has not been accounted for sufficiently
• Come to a decision regarding use of the data mining results
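
A small sketch of how the evaluation decision might be made explicit: compute standard metrics from a confusion matrix and compare them with targets fixed during business understanding. The label lists and the `OBJECTIVES` thresholds are hypothetical values chosen for illustration.

```python
# Hypothetical true labels and model predictions on held-out data (assumption).
actual = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
predicted = ["yes", "no", "no", "no", "yes", "yes", "yes", "no"]

# Hypothetical targets set in the business understanding phase (assumption).
OBJECTIVES = {"accuracy": 0.70, "recall": 0.80}


def confusion(actual_labels, predicted_labels, positive="yes"):
    """Count true/false positives and negatives for a binary problem."""
    cm = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual_labels, predicted_labels):
        if p == positive:
            cm["tp" if a == positive else "fp"] += 1
        else:
            cm["fn" if a == positive else "tn"] += 1
    return cm


def metrics(cm):
    """Derive accuracy, precision, and recall from the confusion counts."""
    total = sum(cm.values())
    pred_pos = cm["tp"] + cm["fp"]
    real_pos = cm["tp"] + cm["fn"]
    return {
        "accuracy": (cm["tp"] + cm["tn"]) / total,
        "precision": cm["tp"] / pred_pos if pred_pos else 0.0,
        "recall": cm["tp"] / real_pos if real_pos else 0.0,
    }


cm = confusion(actual, predicted)
m = metrics(cm)

# The model is accepted only if every stated objective is met.
deploy = all(m[name] >= target for name, target in OBJECTIVES.items())

print(cm)
print(m)
print("deploy:", deploy)
```

In this invented case the accuracy target is met but the recall target is not, so the decision would be to revisit earlier phases rather than deploy.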

6. Deployment Phase
• Make use of the models created: model creation does not signify the completion of a project
• Example of a simple deployment: generate a report
• Example of a more complex deployment: implement a parallel data mining process in another department
• For businesses, the customer often carries out the deployment based on your model
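
The six phases and their iterative nature can be summarized in a short sketch. The loop from modeling back to data preparation is described in the phase notes above; the loop from evaluation back to business understanding is an assumption taken from the commonly published CRISP-DM diagram. The names `Phase`, `LOOP_BACK`, `next_phase`, and `walk` are hypothetical helpers, not part of any CRISP-DM tooling.

```python
from enum import Enum


class Phase(Enum):
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


# Where to return when the outcome of a phase is unsatisfactory.
LOOP_BACK = {
    Phase.MODELING: Phase.DATA_PREPARATION,          # from the phase notes
    Phase.EVALUATION: Phase.BUSINESS_UNDERSTANDING,  # assumption: standard diagram
}


def next_phase(current, satisfied=True):
    """Move forward by default; loop back if the phase outcome is unsatisfactory."""
    if not satisfied and current in LOOP_BACK:
        return LOOP_BACK[current]
    members = list(Phase)
    idx = members.index(current)
    return members[min(idx + 1, len(members) - 1)]


def walk(decisions):
    """Trace the phases visited, given one satisfied/unsatisfied flag per step."""
    current = Phase.BUSINESS_UNDERSTANDING
    path = [current]
    for ok in decisions:
        current = next_phase(current, ok)
        path.append(current)
    return path


# One unsatisfactory modeling round forces a return to data preparation.
path = walk([True, True, True, False, True, True, True])
print(" -> ".join(p.name for p in path))
```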

Exercise
• Study and understand Case Studies 1-5 in Larose (2005), Chapter 1
• Study and understand how CRISP-DM is applied in the thesis by Firmansyah (2011) on applying the C4.5 algorithm to determine creditworthiness

Journal Publications on Data Mining

Transactions and Journals

Review Paper (survey and state-of-the-art):
• ACM Computing Surveys (CSUR)

Research Paper (technical):
• ACM Transactions on Knowledge Discovery from Data (TKDD)
• ACM Transactions on Information Systems (TOIS)
• IEEE Transactions on Knowledge and Data Engineering
• Springer Data Mining and Knowledge Discovery
• International Journal of Business Intelligence and Data Mining (IJBIDM)

Cognitive Assignment III
1. Read one scientific paper, published in a journal between 2010 and 2012, that relates to the data mining methods we have studied
2. Summarize it in slide form with the following structure:
   1. Research Background
   2. Problem Statements
   3. Research Questions
   4. Research Objective
   5. Existing Methods
   6. Proposed Method
   7. Results
   8. Conclusion
3. Present it in front of the class at the next lecture

References
1. Ian H. Witten, Eibe Frank, and Mark A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition, Elsevier, 2011
2. Daniel T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining, John Wiley & Sons, 2005
3. Florin Gorunescu, Data Mining: Concepts, Models and Techniques, Springer, 2011
4. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 2nd Edition, Elsevier, 2006
5. Oded Maimon and Lior Rokach, Data Mining and Knowledge Discovery Handbook, 2nd Edition, Springer, 2010
6. Warren Liao and Evangelos Triantaphyllou (eds.), Recent Advances in Data Mining of Enterprise Data: Algorithms and Applications, World Scientific, 2007