Introduction to Data Science and Data...

Post on 11-May-2018

Rapidminer

Juan Camilo Estevez Cárdenas

July 5th to 29th of 2016

Juan Camilo Estevez Cárdenas

Systems Engineer, Universidad Nacional de Colombia, 2013

Master's in Industrial Engineering, Universidad Nacional de Colombia, 2015

Teaching Assistant Scholarship, Computer Programming, Universidad Nacional de Colombia, 2013 – 2014

Universidad de Buenos Aires (UBA): IT Project Management, Intelligent Systems, 2015 – I

Project Management Professional (PMP)

Project Management Institute

Organizational analytical evolution

Advanced analytics

Business Intelligence Architecture

(Rapidminer, 2015)

(Chaudhuri et al., 2011)

Rapidminer

OPEN SOURCE DATA SCIENCE PLATFORM

Prep data, create models, validate, operationalize and embed in business processes.

https://rapidminer.com/

http://www.kdnuggets.com/

A code-free tool for data scientists

Characteristics

Connection to different data sources

- Excel, CSV, databases, text files, Dropbox, Amazon, Twitter, Salesforce.

Preprocessing or data preparation (formatting and cleaning)

- Attribute creation

- Attribute formatting and cleaning

- Table operations and replacements

- Filters

- Type conversions

- Missing value treatment

- Normalization

- Outlier treatment
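Rapidminer performs these preparation steps through operators rather than code, but three of them (missing value treatment, normalization, and outlier treatment) can be sketched in plain Python to show what they do. This is a minimal illustration, not Rapidminer's implementation: the function names are hypothetical, and mean imputation, min-max scaling, and the 1.5 x IQR rule are assumed choices among several common ones.

```python
from statistics import mean, quantiles


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < lower or v > upper]
```

Applied in sequence, a column would first be imputed, then screened for outliers, and finally normalized, so that one extreme value does not compress the rest of the scale.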

Characteristics

Modeling (Data mining)

- Predictive

- Segmentation (Clustering)

- Classification

- Association

- Correlation

Model validation

- Cross validation, split validation...
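The difference between the two validation schemes named above can be sketched in plain Python. Split validation holds out one portion of the rows once; cross validation rotates the held-out portion so every row is tested exactly once. The function names, the 70/30 ratio, and the default of 5 folds are assumptions for illustration and do not reflect Rapidminer's operator defaults.

```python
import random


def split_validation(n_rows, train_ratio=0.7, seed=0):
    """Shuffle row indices once and cut them into one train and one test set."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(n_rows * train_ratio)
    return indices[:cut], indices[cut:]


def cross_validation_folds(n_rows, k=5, seed=0):
    """Yield (train, test) index lists; each row lands in exactly one test fold."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

A model would be trained on each `train` list and scored on the matching `test` list; cross validation then averages the k scores, which gives a steadier estimate than a single split at the cost of k training runs.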

Characteristics

Extensions

- Series

- R

- Python

- Text processing

- Weka

- Reporting

Learning rapidminer

Documentation

- Web page: http://docs.rapidminer.com/

- Stand alone installation:

Examples

- Welcome window

- Click on an operator and review the help menu

- Repository Samples

Rapidminer Academia

https://rapidminer.com/academia/students/

Rapidminer example Beauty Data

- Load data from BeautyData.csv

- Exploratory data analysis.

- Example of a decision tree with Rapidminer
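The three steps above (load, explore, build a tree) can be mirrored in a short Python sketch for readers who want to see the logic outside the Rapidminer interface. Everything specific here is an assumption: the contents of BeautyData.csv are not shown in these slides, so the sketch uses an invented in-memory table with hypothetical columns, and it fits only a one-level tree (a decision stump) chosen by Gini impurity rather than the full tree Rapidminer would build.

```python
import csv
import io
from statistics import mean

# Hypothetical stand-in for BeautyData.csv; columns and values are invented.
SAMPLE_CSV = """experience,looks,high_wage
2,3,no
5,4,no
8,2,no
12,3,yes
15,4,yes
20,5,yes
"""


def load_rows(text):
    """Parse CSV text into dicts with numeric attributes and a label."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "experience": float(r["experience"]),
            "looks": float(r["looks"]),
            "high_wage": r["high_wage"],
        }
        for r in reader
    ]


def summarize(rows, attribute):
    """Exploratory summary of one numeric attribute."""
    values = [r[attribute] for r in rows]
    return {"min": min(values), "mean": mean(values), "max": max(values)}


def gini(labels):
    """Gini impurity of a list of yes/no labels."""
    if not labels:
        return 0.0
    p = sum(1 for label in labels if label == "yes") / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, attribute, target="high_wage"):
    """Return (threshold, weighted Gini) of the best single split."""
    values = sorted({r[attribute] for r in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2  # midpoint between neighboring values
        left = [r[target] for r in rows if r[attribute] <= threshold]
        right = [r[target] for r in rows if r[attribute] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best
```

Calling `best_split(load_rows(SAMPLE_CSV), "experience")` picks the threshold that best separates the two labels in the invented table; a full decision tree repeats this search recursively on each resulting branch.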

Bibliography

Chaudhuri, S., Dayal, U., & Narasayya, V. (2011). An overview of business intelligence technology. Communications of the ACM, 54(8), 88. doi:10.1145/1978542.1978562

Laudon, K. C., & Laudon, J. P. (2012). Management Information Systems (12th ed.). Prentice Hall.

http://businessanalytics.com.mx/2014/08/27/diferencias-entre-business-analytics-y-business-intelligence/

Gartner. Magic Quadrant Survey, 2012.

Rapidminer. (2015). An introduction to advanced analytics.