Introduction - ETH Zürich

Evangelos Pournaras, Izabela Moise, Dirk Helbing

Outline

1. Data Science

2. Course Description

Part 1 - Data Science

What is Data Science?

A collection of orchestrated methods from different scientific fields, e.g. statistics, computer science, etc., that provide understanding of domain data and result in data-based products and services.

Is Data Science about Big Data? I

Is Data Science about Big Data? II

It's more about using the right data and asking the right questions.

What about Techno-socio-economic Systems?

ICT & Techno-socio-economic Systems

• Embedded ICT systems in most societal domains

How

• Internet of Things, pervasive/ubiquitous computing, advanced networking systems, inter-operability

Result

• A new explosion of data sources

Opportunities

• Understanding, improving, managing & sustaining our complex society

Threats

• Privacy, discrimination, misinterpretations, over-fitting, etc.

Threats I

Threats II

Who is a Data Scientist?

• A statistician

• A computer programmer

• Both and more

Tip: Domain knowledge can be more valuable than machine learning, data mining, etc.

Real-world Profile I

Real-world Profile II

More about Data Scientists

Part 2 - Course Description

Course Objectives

Qualify you with knowledge & skills to tackle real-world problems using data:

1. Acquiring domain knowledge and understanding

2. Better understanding and interpretation of data

3. Awareness about the applicability of different data science methods

4. Development of technical skills, e.g. programming, use of different tools, etc.

5. Presenting scientific results, both written and orally

Course Prerequisites

Some programming skills are required, e.g. skills for the material of this course:

1. Java/C++

2. UNIX

Didn't you have an opportunity to practice this earlier?

No problem, this is a golden opportunity.

Tip: Programming skills will make you a more flexible and efficient data scientist.

Assessment

• Seminar thesis

• 100% of the grade, no exams

• Detailed illustration in an upcoming lecture

Tip: Start early. Give your project and your skills the opportunity to develop during the course.

Lectures

• Every Tuesday, 10:00-12:00, at LFW B 1

• Participation is not obligatory but highly recommended

• 60-minute lectures followed by 40 minutes of interactive discussions

• Opportunity to discuss your project

• Lectures at http://www.coss.ethz.ch/education/datascience.html

Subjects I

1. Applications - 2 weeks
   – Smart Grids, geolocation, traffic systems, Planetary Nervous System, etc.
   – Tools: NervousNet

2. Data Science Fundamentals - 2 weeks
   – databases, data types, data collection, data pre-processing, plotting, visualization, etc.
   – Tools: Java, AWK, R, MySQL, Gnuplot, Gephi, etc.

3. Data Mining and Machine Learning - 3 weeks
   – classification, clustering, prediction, neural networks, etc.
   – Tools: Weka

Subjects II

4. Big Data Analytics - 2 weeks
   – MapReduce, parallel computing, etc.
   – Tools: Hadoop, Spark, Mahout, etc.

5. Real-time Data Analytics - 1 week
   – data streaming, social media, etc.
   – Tools: Spark Streaming, Storm

6. Other - 4 weeks
   – Project work
   – Tools: LaTeX, GitHub, etc.

Lectures Outline

Lecture 01 (17.02.15) Introduction
Lecture 02 (24.02.15) Seminar Thesis
Lecture 03 (03.03.15) Applications
Lecture 04 (10.03.15) Applications
Lecture 05 (17.03.15) Data Science Fundamentals
Lecture 06 (24.03.15) Data Science Fundamentals
Lecture 07 (31.03.15) Data Mining and Machine Learning
Lecture 08 (14.04.15) Data Mining and Machine Learning
Lecture 09 (21.04.15) Data Mining and Machine Learning
Lecture 10 (28.04.15) Big Data Analytics
Lecture 11 (05.05.15) Big Data Analytics
Lecture 12 (12.05.15) Real-time Data Analytics
Lecture 13 (19.05.15) Oral Presentations
Lecture 14 (26.05.15) Oral Presentations

How to contact us

Communication

• Discussion session in the course

• E-mail with subject [DATA-SCIENCE-COURSE-2015]<otherinfo> to
   – Evangelos Pournaras, epournaras@ethz.ch, and/or
   – Iza Moise, imoise@ethz.ch

Supervision - strictly for issues not addressed in the course

• Wednesdays, 14:00-16:00

Proposed Literature

B. Ellis. Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data. Wiley Publishing, 1st edition, 2014.

J. Han. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2005.

T. White. Hadoop: The Definitive Guide. O'Reilly Media, Inc., 2012.

I. H. Witten, E. Frank, and M. A. Hall. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 3rd edition, 2011.

What is next?

• Seminar thesis - attendance is strongly recommended

• Examples and applications

Page 2: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Outline

1 Data Science

2 Course Description

Evangelos Pournaras Izabela Moise Dirk Helbing 2

Part 1 - Data Science

Evangelos Pournaras Izabela Moise Dirk Helbing 3

What is Data Science

A collection of orchestrated methods from different scientific fieldseg statistics computer science etc that provide understanding ofdomain data and result in data-based products and services

Evangelos Pournaras Izabela Moise Dirk Helbing 4

Is Data Science about Big Data I

Evangelos Pournaras Izabela Moise Dirk Helbing 5

Is Data Science about Big Data II

Itrsquos more about using the right dataand asking the right questions

Evangelos Pournaras Izabela Moise Dirk Helbing 6

What about Techno-socio-economic Systems

Evangelos Pournaras Izabela Moise Dirk Helbing 7

ICT amp Techno-socio-economic Systems

bull Embedded ICT systems in most societal domains How

bull Internet of Things pervasiveubiquitous computing advancednetworking systems inter-operability Result

bull A new explosion of data sources Opportunities

bull Understanding improving managing amp sustaining our complexsociety Threats

bull Privacy discrimination misinterpretations over-fitting etc

Evangelos Pournaras Izabela Moise Dirk Helbing 8

Threats I

Evangelos Pournaras Izabela Moise Dirk Helbing 9

Threats II

Evangelos Pournaras Izabela Moise Dirk Helbing 10

Who is a Data Scientist

bull A statistician

bull A computer programmer

bull Both and More

TipDomain knowledge can be more valuable than machine learning datamining etc

Evangelos Pournaras Izabela Moise Dirk Helbing 11

Real-world Profile I

Evangelos Pournaras Izabela Moise Dirk Helbing 12

Real-world Profile II

Evangelos Pournaras Izabela Moise Dirk Helbing 13

More about Data Scientists

Evangelos Pournaras Izabela Moise Dirk Helbing 14

Part 2 - Course Description

Evangelos Pournaras Izabela Moise Dirk Helbing 15

Course ObjectivesQualify you with knowledge amp skills to tackle real-world problemsusing data

1 Acquiring domain knowledge and understanding2 Better understanding and interpretation of data

3 Awareness about the applicability of different data sciencemethods

4 Development of technical skills eg programming use ofdifferent tools etc

5 Presenting scientific results both written and orally

Evangelos Pournaras Izabela Moise Dirk Helbing 16

Course Prerequisites

Some programming skills are required eg skills for the material ofthis course

1 JavaC++

2 UNIX

Didnrsquot you have an opportunity to practice this earlier

No problem this is a golden opportunity

TipProgramming skills will make you more flexible and efficient datascientist

Evangelos Pournaras Izabela Moise Dirk Helbing 17

Assessment

bull Seminar thesis

bull 100 of the grade no exams

bull Detailed illustration in a next lecture

TipStart early Give the opportunity for your project and your skills todevelop during the course

Evangelos Pournaras Izabela Moise Dirk Helbing 18

Lectures

bull Every Tuesday 1000-1200 at LFW B 1

bull Participation is not obligatory but highly recommended

bull 60 minutes lectures followed by 40 minutes interactivediscussions

bull Opportunity to discuss your project

bull Lectures at httpwwwcossethzcheducationdatasciencehtml

Evangelos Pournaras Izabela Moise Dirk Helbing 19

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 3: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Part 1 - Data Science

Evangelos Pournaras Izabela Moise Dirk Helbing 3

What is Data Science

A collection of orchestrated methods from different scientific fieldseg statistics computer science etc that provide understanding ofdomain data and result in data-based products and services

Evangelos Pournaras Izabela Moise Dirk Helbing 4

Is Data Science about Big Data I

Evangelos Pournaras Izabela Moise Dirk Helbing 5

Is Data Science about Big Data II

Itrsquos more about using the right dataand asking the right questions

Evangelos Pournaras Izabela Moise Dirk Helbing 6

What about Techno-socio-economic Systems

Evangelos Pournaras Izabela Moise Dirk Helbing 7

ICT amp Techno-socio-economic Systems

bull Embedded ICT systems in most societal domains How

bull Internet of Things pervasiveubiquitous computing advancednetworking systems inter-operability Result

bull A new explosion of data sources Opportunities

bull Understanding improving managing amp sustaining our complexsociety Threats

bull Privacy discrimination misinterpretations over-fitting etc

Evangelos Pournaras Izabela Moise Dirk Helbing 8

Threats I

Evangelos Pournaras Izabela Moise Dirk Helbing 9

Threats II

Evangelos Pournaras Izabela Moise Dirk Helbing 10

Who is a Data Scientist

bull A statistician

bull A computer programmer

bull Both and More

TipDomain knowledge can be more valuable than machine learning datamining etc

Evangelos Pournaras Izabela Moise Dirk Helbing 11

Real-world Profile I

Evangelos Pournaras Izabela Moise Dirk Helbing 12

Real-world Profile II

Evangelos Pournaras Izabela Moise Dirk Helbing 13

More about Data Scientists

Evangelos Pournaras Izabela Moise Dirk Helbing 14

Part 2 - Course Description

Evangelos Pournaras Izabela Moise Dirk Helbing 15

Course ObjectivesQualify you with knowledge amp skills to tackle real-world problemsusing data

1 Acquiring domain knowledge and understanding2 Better understanding and interpretation of data

3 Awareness about the applicability of different data sciencemethods

4 Development of technical skills eg programming use ofdifferent tools etc

5 Presenting scientific results both written and orally

Evangelos Pournaras Izabela Moise Dirk Helbing 16

Course Prerequisites

Some programming skills are required eg skills for the material ofthis course

1 JavaC++

2 UNIX

Didnrsquot you have an opportunity to practice this earlier

No problem this is a golden opportunity

TipProgramming skills will make you more flexible and efficient datascientist

Evangelos Pournaras Izabela Moise Dirk Helbing 17

Assessment

bull Seminar thesis

bull 100 of the grade no exams

bull Detailed illustration in a next lecture

TipStart early Give the opportunity for your project and your skills todevelop during the course

Evangelos Pournaras Izabela Moise Dirk Helbing 18

Lectures

bull Every Tuesday 1000-1200 at LFW B 1

bull Participation is not obligatory but highly recommended

bull 60 minutes lectures followed by 40 minutes interactivediscussions

bull Opportunity to discuss your project

bull Lectures at httpwwwcossethzcheducationdatasciencehtml

Evangelos Pournaras Izabela Moise Dirk Helbing 19

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 4: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

What is Data Science

A collection of orchestrated methods from different scientific fieldseg statistics computer science etc that provide understanding ofdomain data and result in data-based products and services

Evangelos Pournaras Izabela Moise Dirk Helbing 4

Is Data Science about Big Data I

Evangelos Pournaras Izabela Moise Dirk Helbing 5

Is Data Science about Big Data II

Itrsquos more about using the right dataand asking the right questions

Evangelos Pournaras Izabela Moise Dirk Helbing 6

What about Techno-socio-economic Systems

Evangelos Pournaras Izabela Moise Dirk Helbing 7

ICT amp Techno-socio-economic Systems

bull Embedded ICT systems in most societal domains How

bull Internet of Things pervasiveubiquitous computing advancednetworking systems inter-operability Result

bull A new explosion of data sources Opportunities

bull Understanding improving managing amp sustaining our complexsociety Threats

bull Privacy discrimination misinterpretations over-fitting etc

Evangelos Pournaras Izabela Moise Dirk Helbing 8

Threats I

Evangelos Pournaras Izabela Moise Dirk Helbing 9

Threats II

Evangelos Pournaras Izabela Moise Dirk Helbing 10

Who is a Data Scientist

bull A statistician

bull A computer programmer

bull Both and More

TipDomain knowledge can be more valuable than machine learning datamining etc

Evangelos Pournaras Izabela Moise Dirk Helbing 11

Real-world Profile I

Evangelos Pournaras Izabela Moise Dirk Helbing 12

Real-world Profile II

Evangelos Pournaras Izabela Moise Dirk Helbing 13

More about Data Scientists

Evangelos Pournaras Izabela Moise Dirk Helbing 14

Part 2 - Course Description

Evangelos Pournaras Izabela Moise Dirk Helbing 15

Course ObjectivesQualify you with knowledge amp skills to tackle real-world problemsusing data

1 Acquiring domain knowledge and understanding2 Better understanding and interpretation of data

3 Awareness about the applicability of different data sciencemethods

4 Development of technical skills eg programming use ofdifferent tools etc

5 Presenting scientific results both written and orally

Evangelos Pournaras Izabela Moise Dirk Helbing 16

Course Prerequisites

Some programming skills are required eg skills for the material ofthis course

1 JavaC++

2 UNIX

Didnrsquot you have an opportunity to practice this earlier

No problem this is a golden opportunity

TipProgramming skills will make you more flexible and efficient datascientist

Evangelos Pournaras Izabela Moise Dirk Helbing 17

Assessment

bull Seminar thesis

bull 100 of the grade no exams

bull Detailed illustration in a next lecture

TipStart early Give the opportunity for your project and your skills todevelop during the course

Evangelos Pournaras Izabela Moise Dirk Helbing 18

Lectures

bull Every Tuesday 1000-1200 at LFW B 1

bull Participation is not obligatory but highly recommended

bull 60 minutes lectures followed by 40 minutes interactivediscussions

bull Opportunity to discuss your project

bull Lectures at httpwwwcossethzcheducationdatasciencehtml

Evangelos Pournaras Izabela Moise Dirk Helbing 19

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 5: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Is Data Science about Big Data I

Evangelos Pournaras Izabela Moise Dirk Helbing 5

Is Data Science about Big Data II

Itrsquos more about using the right dataand asking the right questions

Evangelos Pournaras Izabela Moise Dirk Helbing 6

What about Techno-socio-economic Systems

Evangelos Pournaras Izabela Moise Dirk Helbing 7

ICT amp Techno-socio-economic Systems

bull Embedded ICT systems in most societal domains How

bull Internet of Things pervasiveubiquitous computing advancednetworking systems inter-operability Result

bull A new explosion of data sources Opportunities

bull Understanding improving managing amp sustaining our complexsociety Threats

bull Privacy discrimination misinterpretations over-fitting etc

Evangelos Pournaras Izabela Moise Dirk Helbing 8

Threats I

Evangelos Pournaras Izabela Moise Dirk Helbing 9

Threats II

Evangelos Pournaras Izabela Moise Dirk Helbing 10

Who is a Data Scientist

bull A statistician

bull A computer programmer

bull Both and More

TipDomain knowledge can be more valuable than machine learning datamining etc

Evangelos Pournaras Izabela Moise Dirk Helbing 11

Real-world Profile I

Evangelos Pournaras Izabela Moise Dirk Helbing 12

Real-world Profile II

Evangelos Pournaras Izabela Moise Dirk Helbing 13

More about Data Scientists

Evangelos Pournaras Izabela Moise Dirk Helbing 14

Part 2 - Course Description

Evangelos Pournaras Izabela Moise Dirk Helbing 15

Course ObjectivesQualify you with knowledge amp skills to tackle real-world problemsusing data

1 Acquiring domain knowledge and understanding2 Better understanding and interpretation of data

3 Awareness about the applicability of different data sciencemethods

4 Development of technical skills eg programming use ofdifferent tools etc

5 Presenting scientific results both written and orally

Evangelos Pournaras Izabela Moise Dirk Helbing 16

Course Prerequisites

Some programming skills are required eg skills for the material ofthis course

1 JavaC++

2 UNIX

Didnrsquot you have an opportunity to practice this earlier

No problem this is a golden opportunity

TipProgramming skills will make you more flexible and efficient datascientist

Evangelos Pournaras Izabela Moise Dirk Helbing 17

Assessment

bull Seminar thesis

bull 100 of the grade no exams

bull Detailed illustration in a next lecture

TipStart early Give the opportunity for your project and your skills todevelop during the course

Evangelos Pournaras Izabela Moise Dirk Helbing 18

Lectures

bull Every Tuesday 1000-1200 at LFW B 1

bull Participation is not obligatory but highly recommended

bull 60 minutes lectures followed by 40 minutes interactivediscussions

bull Opportunity to discuss your project

bull Lectures at httpwwwcossethzcheducationdatasciencehtml

Evangelos Pournaras Izabela Moise Dirk Helbing 19

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 6: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Is Data Science about Big Data II

Itrsquos more about using the right dataand asking the right questions

Evangelos Pournaras Izabela Moise Dirk Helbing 6

What about Techno-socio-economic Systems

Evangelos Pournaras Izabela Moise Dirk Helbing 7

ICT amp Techno-socio-economic Systems

bull Embedded ICT systems in most societal domains How

bull Internet of Things pervasiveubiquitous computing advancednetworking systems inter-operability Result

bull A new explosion of data sources Opportunities

bull Understanding improving managing amp sustaining our complexsociety Threats

bull Privacy discrimination misinterpretations over-fitting etc

Evangelos Pournaras Izabela Moise Dirk Helbing 8

Threats I

Evangelos Pournaras Izabela Moise Dirk Helbing 9

Threats II

Evangelos Pournaras Izabela Moise Dirk Helbing 10

Who is a Data Scientist

bull A statistician

bull A computer programmer

bull Both and More

TipDomain knowledge can be more valuable than machine learning datamining etc

Evangelos Pournaras Izabela Moise Dirk Helbing 11

Real-world Profile I

Evangelos Pournaras Izabela Moise Dirk Helbing 12

Real-world Profile II

Evangelos Pournaras Izabela Moise Dirk Helbing 13

More about Data Scientists

Evangelos Pournaras Izabela Moise Dirk Helbing 14

Part 2 - Course Description

Evangelos Pournaras Izabela Moise Dirk Helbing 15

Course ObjectivesQualify you with knowledge amp skills to tackle real-world problemsusing data

1 Acquiring domain knowledge and understanding2 Better understanding and interpretation of data

3 Awareness about the applicability of different data sciencemethods

4 Development of technical skills eg programming use ofdifferent tools etc

5 Presenting scientific results both written and orally

Evangelos Pournaras Izabela Moise Dirk Helbing 16

Course Prerequisites

Some programming skills are required eg skills for the material ofthis course

1 JavaC++

2 UNIX

Didnrsquot you have an opportunity to practice this earlier

No problem this is a golden opportunity

TipProgramming skills will make you more flexible and efficient datascientist

Evangelos Pournaras Izabela Moise Dirk Helbing 17

Assessment

bull Seminar thesis

bull 100 of the grade no exams

bull Detailed illustration in a next lecture

TipStart early Give the opportunity for your project and your skills todevelop during the course

Evangelos Pournaras Izabela Moise Dirk Helbing 18

Lectures

• Every Tuesday, 10:00-12:00, at LFW B 1
• Participation is not obligatory but highly recommended
• 60-minute lectures followed by 40 minutes of interactive discussion
• Opportunity to discuss your project
• Lectures at http://www.coss.ethz.ch/education/datascience.html

Subjects I

1. Applications - 2 weeks
   – Smart Grids, geolocation, traffic systems, Planetary Nervous System, etc.
   – Tools: NervousNet

2. Data Science Fundamentals - 2 weeks
   – databases, data types, data collection, data pre-processing, plotting, visualization, etc.
   – Tools: Java, AWK, R, MySQL, Gnuplot, Gephi, etc.

3. Data Mining and Machine Learning - 3 weeks
   – classification, clustering, prediction, neural networks, etc.
   – Tools: Weka

4. Big Data Analytics - 2 weeks

Subjects II

   – MapReduce, parallel computing, etc.
   – Tools: Hadoop, Spark, Mahout, etc.

5. Real-time Data Analytics - 1 week
   – data streaming, social media, etc.
   – Tools: Spark Streaming, Storm

6. Other - 4 weeks
   – Project work
   – Tools: LaTeX, GitHub, etc.

Lectures Outline

Lecture 01 (17.02.15) Introduction
Lecture 02 (24.02.15) Seminar Thesis
Lecture 03 (03.03.15) Applications
Lecture 04 (10.03.15) Applications
Lecture 05 (17.03.15) Data Science Fundamentals
Lecture 06 (24.03.15) Data Science Fundamentals
Lecture 07 (31.03.15) Data Mining and Machine Learning
Lecture 08 (14.04.15) Data Mining and Machine Learning
Lecture 09 (21.04.15) Data Mining and Machine Learning
Lecture 10 (28.04.15) Big Data Analytics
Lecture 11 (05.05.15) Big Data Analytics
Lecture 12 (12.05.15) Real-time Data Analytics
Lecture 13 (19.05.15) Oral Presentations
Lecture 14 (26.05.15) Oral Presentations

How to contact us

Communication

• Discussion session in the course
• E-mail with subject "[DATA-SCIENCE-COURSE-2015] <other info>" to
   – Evangelos Pournaras, epournaras@ethz.ch, and/or
   – Iza Moise, imoise@ethz.ch

Supervision - strictly for issues not addressed in the course

• Wednesdays, 14:00-16:00

Proposed Literature

B. Ellis. Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data. Wiley Publishing, 1st edition, 2014.

J. Han. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2005.

T. White. Hadoop: The Definitive Guide. O'Reilly Media, Inc., 2012.

I. H. Witten, E. Frank, and M. A. Hall. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 3rd edition, 2011.

What is next?

• Seminar thesis - attendance is strongly recommended
• Examples and applications

Page 13: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Real-world Profile II

Evangelos Pournaras Izabela Moise Dirk Helbing 13

More about Data Scientists

Evangelos Pournaras Izabela Moise Dirk Helbing 14

Part 2 - Course Description

Course Objectives

Qualify you with knowledge & skills to tackle real-world problems using data:

1. Acquiring domain knowledge and understanding
2. Better understanding and interpretation of data
3. Awareness of the applicability of different data science methods
4. Development of technical skills, e.g. programming, use of different tools, etc.
5. Presenting scientific results, both in writing and orally

Course Prerequisites

Some programming skills are required, e.g. skills for the material of this course:

1. Java/C++
2. UNIX

Didn't you have an opportunity to practice this earlier?

No problem, this is a golden opportunity.

Tip: Programming skills will make you a more flexible and efficient data scientist.

Assessment

• Seminar thesis
• 100% of the grade, no exams
• Detailed illustration in an upcoming lecture

Tip: Start early. Give your project and your skills the opportunity to develop during the course.

Lectures

• Every Tuesday, 10:00-12:00, at LFW B 1
• Participation is not obligatory but highly recommended
• 60-minute lectures followed by 40-minute interactive discussions
• Opportunity to discuss your project
• Lectures at http://www.coss.ethz.ch/education/datascience.html

Subjects

1. Applications - 2 weeks
   - Smart Grids, geolocation, traffic systems, Planetary Nervous System, etc.
   - Tools: NervousNet
2. Data Science Fundamentals - 2 weeks
   - databases, data types, data collection, data pre-processing, plotting, visualization, etc.
   - Tools: Java, AWK, R, MySQL, Gnuplot, Gephi, etc.
3. Data Mining and Machine Learning - 3 weeks
   - classification, clustering, prediction, neural networks, etc.
   - Tools: Weka
4. Big Data Analytics - 2 weeks
   - MapReduce, parallel computing, etc.
   - Tools: Hadoop, Spark, Mahout, etc.
5. Real-time Data Analytics - 1 week
   - data streaming, social media, etc.
   - Tools: Spark Streaming, Storm
6. Other - 4 weeks
   - Project work
   - Tools: LaTeX, GitHub, etc.

Lectures Outline

Lecture 01 (17.02.15) Introduction
Lecture 02 (24.02.15) Seminar Thesis
Lecture 03 (03.03.15) Applications
Lecture 04 (10.03.15) Applications
Lecture 05 (17.03.15) Data Science Fundamentals
Lecture 06 (24.03.15) Data Science Fundamentals
Lecture 07 (31.03.15) Data Mining and Machine Learning
Lecture 08 (14.04.15) Data Mining and Machine Learning
Lecture 09 (21.04.15) Data Mining and Machine Learning
Lecture 10 (28.04.15) Big Data Analytics
Lecture 11 (05.05.15) Big Data Analytics
Lecture 12 (12.05.15) Real-time Data Analytics
Lecture 13 (19.05.15) Oral Presentations
Lecture 14 (26.05.15) Oral Presentations

How to contact us

Communication

• Discussion session in the course
• E-mail with subject "[DATA-SCIENCE-COURSE-2015]<otherinfo>" to
   - Evangelos Pournaras, epournaras@ethz.ch, and/or
   - Iza Moise, imoise@ethz.ch

Supervision - strictly for issues not addressed in the course

• Wednesdays, 14:00-16:00

Proposed Literature

• B. Ellis. Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data. Wiley Publishing, 1st edition, 2014.
• J. Han. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2005.
• T. White. Hadoop: The Definitive Guide. O'Reilly Media, Inc., 2012.
• I. H. Witten, E. Frank, and M. A. Hall. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 3rd edition, 2011.

What is next

• Seminar thesis - attendance is strongly recommended
• Examples and applications

Page 14: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

More about Data Scientists

Evangelos Pournaras Izabela Moise Dirk Helbing 14

Part 2 - Course Description

Evangelos Pournaras Izabela Moise Dirk Helbing 15

Course ObjectivesQualify you with knowledge amp skills to tackle real-world problemsusing data

1 Acquiring domain knowledge and understanding2 Better understanding and interpretation of data

3 Awareness about the applicability of different data sciencemethods

4 Development of technical skills eg programming use ofdifferent tools etc

5 Presenting scientific results both written and orally

Evangelos Pournaras Izabela Moise Dirk Helbing 16

Course Prerequisites

Some programming skills are required eg skills for the material ofthis course

1 JavaC++

2 UNIX

Didnrsquot you have an opportunity to practice this earlier

No problem this is a golden opportunity

TipProgramming skills will make you more flexible and efficient datascientist

Evangelos Pournaras Izabela Moise Dirk Helbing 17

Assessment

bull Seminar thesis

bull 100 of the grade no exams

bull Detailed illustration in a next lecture

TipStart early Give the opportunity for your project and your skills todevelop during the course

Evangelos Pournaras Izabela Moise Dirk Helbing 18

Lectures

bull Every Tuesday 1000-1200 at LFW B 1

bull Participation is not obligatory but highly recommended

bull 60 minutes lectures followed by 40 minutes interactivediscussions

bull Opportunity to discuss your project

bull Lectures at httpwwwcossethzcheducationdatasciencehtml

Evangelos Pournaras Izabela Moise Dirk Helbing 19

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 15: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Part 2 - Course Description

Evangelos Pournaras Izabela Moise Dirk Helbing 15

Course ObjectivesQualify you with knowledge amp skills to tackle real-world problemsusing data

1 Acquiring domain knowledge and understanding2 Better understanding and interpretation of data

3 Awareness about the applicability of different data sciencemethods

4 Development of technical skills eg programming use ofdifferent tools etc

5 Presenting scientific results both written and orally

Evangelos Pournaras Izabela Moise Dirk Helbing 16

Course Prerequisites

Some programming skills are required eg skills for the material ofthis course

1 JavaC++

2 UNIX

Didnrsquot you have an opportunity to practice this earlier

No problem this is a golden opportunity

TipProgramming skills will make you more flexible and efficient datascientist

Evangelos Pournaras Izabela Moise Dirk Helbing 17

Assessment

bull Seminar thesis

bull 100 of the grade no exams

bull Detailed illustration in a next lecture

TipStart early Give the opportunity for your project and your skills todevelop during the course

Evangelos Pournaras Izabela Moise Dirk Helbing 18

Lectures

bull Every Tuesday 1000-1200 at LFW B 1

bull Participation is not obligatory but highly recommended

bull 60 minutes lectures followed by 40 minutes interactivediscussions

bull Opportunity to discuss your project

bull Lectures at httpwwwcossethzcheducationdatasciencehtml

Evangelos Pournaras Izabela Moise Dirk Helbing 19

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 16: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Course ObjectivesQualify you with knowledge amp skills to tackle real-world problemsusing data

1 Acquiring domain knowledge and understanding2 Better understanding and interpretation of data

3 Awareness about the applicability of different data sciencemethods

4 Development of technical skills eg programming use ofdifferent tools etc

5 Presenting scientific results both written and orally

Evangelos Pournaras Izabela Moise Dirk Helbing 16

Course Prerequisites

Some programming skills are required eg skills for the material ofthis course

1 JavaC++

2 UNIX

Didnrsquot you have an opportunity to practice this earlier

No problem this is a golden opportunity

TipProgramming skills will make you more flexible and efficient datascientist

Evangelos Pournaras Izabela Moise Dirk Helbing 17

Assessment

bull Seminar thesis

bull 100 of the grade no exams

bull Detailed illustration in a next lecture

TipStart early Give the opportunity for your project and your skills todevelop during the course

Evangelos Pournaras Izabela Moise Dirk Helbing 18

Lectures

bull Every Tuesday 1000-1200 at LFW B 1

bull Participation is not obligatory but highly recommended

bull 60 minutes lectures followed by 40 minutes interactivediscussions

bull Opportunity to discuss your project

bull Lectures at httpwwwcossethzcheducationdatasciencehtml

Evangelos Pournaras Izabela Moise Dirk Helbing 19

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 17: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Course Prerequisites

Some programming skills are required eg skills for the material ofthis course

1 JavaC++

2 UNIX

Didnrsquot you have an opportunity to practice this earlier

No problem this is a golden opportunity

TipProgramming skills will make you more flexible and efficient datascientist

Evangelos Pournaras Izabela Moise Dirk Helbing 17

Assessment

bull Seminar thesis

bull 100 of the grade no exams

bull Detailed illustration in a next lecture

TipStart early Give the opportunity for your project and your skills todevelop during the course

Evangelos Pournaras Izabela Moise Dirk Helbing 18

Lectures

bull Every Tuesday 1000-1200 at LFW B 1

bull Participation is not obligatory but highly recommended

bull 60 minutes lectures followed by 40 minutes interactivediscussions

bull Opportunity to discuss your project

bull Lectures at httpwwwcossethzcheducationdatasciencehtml

Evangelos Pournaras Izabela Moise Dirk Helbing 19

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 18: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Assessment

bull Seminar thesis

bull 100 of the grade no exams

bull Detailed illustration in a next lecture

TipStart early Give the opportunity for your project and your skills todevelop during the course

Evangelos Pournaras Izabela Moise Dirk Helbing 18

Lectures

bull Every Tuesday 1000-1200 at LFW B 1

bull Participation is not obligatory but highly recommended

bull 60 minutes lectures followed by 40 minutes interactivediscussions

bull Opportunity to discuss your project

bull Lectures at httpwwwcossethzcheducationdatasciencehtml

Evangelos Pournaras Izabela Moise Dirk Helbing 19

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 19: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Lectures

bull Every Tuesday 1000-1200 at LFW B 1

bull Participation is not obligatory but highly recommended

bull 60 minutes lectures followed by 40 minutes interactivediscussions

bull Opportunity to discuss your project

bull Lectures at httpwwwcossethzcheducationdatasciencehtml

Evangelos Pournaras Izabela Moise Dirk Helbing 19

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 20: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Subjects I

1 Applications - 2 weeksndash Smart Grids geolocation traffic systems Planetary Nervous

System etcndash Tools NervousNet

2 Data Science Fundamentals - 2 weeksndash databases data types data collection data pre-processing

plotting visualization etcndash Tools Java AWK R MySQL Gnuplot Gephi etc

3 Data Mining and Machine Learning - 3 weeksndash classification clustering prediction neural networks etcndash Tools Weka

4 Big Data Analytics - 2 weeks

Evangelos Pournaras Izabela Moise Dirk Helbing 20

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 21: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Subjects II

ndash MapReduce parallel computing etcndash Tools Hadoop Spark Mahout etc

5 Real-time Data Analytics - 1 weekndash data streaming social media etcndash Tools Spark Streaming Storm

6 Other - 4 weeksndash Project workndash Tools LATEX Github etc

Evangelos Pournaras Izabela Moise Dirk Helbing 21

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22

How to contact us

Communication

bull Discussion session in the course

bull E-mail with subject[DATA-SCIENCE-COURSE-2015]ltotherinfogtto

ndash Evangelos Pournaras epournarasethzch andorndash Iza Moise imoiseethzch

Supervision - strictly for issues not addressed in the course

bull Wednesdays 1400-1600

Evangelos Pournaras Izabela Moise Dirk Helbing 23

Proposed Literature

B Ellis

Real-Time Analytics Techniques to Analyze and Visualize Streaming Data

Wiley Publishing 1st edition 2014

J Han

Data Mining Concepts and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 2005

T White

Hadoop The Definitive Guide

OrsquoReilly Media Inc 2012

I H Witten E Frank and M A Hall

Data Mining Practical Machine Learning Tools and Techniques

Morgan Kaufmann Publishers Inc San Francisco CA USA 3rd edition2011

Evangelos Pournaras Izabela Moise Dirk Helbing 24

What is next

bull Seminar thesis - attendance is strongly recommended

bull Examples and applications

Evangelos Pournaras Izabela Moise Dirk Helbing 25

Page 22: Introduction - ETH Zürich · PDF fileDetailed illustration in a next lecture. Tip ... geolocation, traffic systems ... Introduction Lecture 02 (24.02.15) Seminar Thesis

Lectures OutlineLecture 01 (170215)IntroductionLecture 02 (240215)Seminar ThesisLecture 03 (030315)ApplicationsLecture 04 (100315)ApplicationsLecture 05 (170315)Data Science FundamentalsLecture 06 (240315)Data Science FundamentalsLecture 07 (310315)Data Mining and Machine Learning

Lecture 08 (140415)Data Mining and Machine LearningLecture 09 (210415)Data Mining and Machine LearningLecture 10 (280415)Big Data AnalyticsLecture 11 (050515)Big Data AnalyticsLecture 12 (120515)Real-time Data AnalyticsLecture 13 (190515)Oral PresentationsLecture 14 (260515)Oral Presentations

Evangelos Pournaras Izabela Moise Dirk Helbing 22


How to contact us

Communication

• Discussion session in the course

• E-mail with subject "[DATA-SCIENCE-COURSE-2015] <other info>" to:

– Evangelos Pournaras, epournaras@ethz.ch, and/or

– Iza Moise, imoise@ethz.ch

Supervision - strictly for issues not addressed in the course

• Wednesdays, 14:00-16:00


Proposed Literature

B. Ellis.
Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data.
Wiley Publishing, 1st edition, 2014.

J. Han.
Data Mining: Concepts and Techniques.
Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2005.

T. White.
Hadoop: The Definitive Guide.
O'Reilly Media, Inc., 2012.

I. H. Witten, E. Frank, and M. A. Hall.
Data Mining: Practical Machine Learning Tools and Techniques.
Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 3rd edition, 2011.


What is next

• Seminar thesis - attendance is strongly recommended

• Examples and applications