Big Data Analytics — Welcome - mse-bda.s3-website-eu...

(C) 2015 Marcel Graf HES-SO | Master of Science in Engineering Big Data Analytics — Welcome Academic year 2015/16




Introduction

■ You
  ■ Who is who?
  ■ Why are you here?
  ■ What is your experience related to this subject?
    ■ Programming in Java
    ■ Data analytics
■ Us
  ■ Who are we?
  ■ What are our domains of competence?
  ■ Why are we here?
  ■ How to contact us
    [email protected], [email protected], [email protected]


Course objectives

■ This course presents techniques to manipulate, store and analyze large volumes of data.
■ The course focuses on two paradigms for the design and implementation of algorithms:
  ■ Using MapReduce and the Apache Hadoop framework.
  ■ Using Resilient Distributed Datasets (RDDs) and the Apache Spark framework.
■ MapReduce is one of the best-known recent programming models for parallel and distributed computing and is widely used in industrial applications dealing with Big Data.
■ The RDD is emerging as an important programming model that takes advantage of functional programming.
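To make the two paradigms more concrete, here is a minimal sketch of a word count in plain Python. It is not Hadoop or Spark code: the function names (map_phase, shuffle, reduce_phase, word_count, word_count_functional) are hypothetical and chosen only for illustration, and everything runs in a single process. The first part mimics the map, shuffle and reduce steps of MapReduce; the second part expresses the same computation as a chain of functional operations, loosely in the spirit of RDD programming.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Mapper: emit a (word, 1) pair for every word of one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key (done by the framework in a real system)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: sum all counts collected for one word."""
    return key, sum(values)


def word_count(documents):
    """MapReduce-style word count: map, then shuffle, then reduce."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


def word_count_functional(documents):
    """Same result as a functional pipeline: flatten to words, then fold into counts."""
    words = (word.lower() for doc in documents for word in doc.split())

    def add(counts, word):
        # Fold step: increment the running count for this word.
        counts[word] = counts.get(word, 0) + 1
        return counts

    return reduce(add, words, {})


if __name__ == "__main__":
    docs = ["big data big", "Data analytics"]
    print(word_count(docs))
    print(word_count_functional(docs))
```

In a real cluster the map and reduce calls would run in parallel on different machines and the shuffle would move data over the network; the sketch only shows the shape of the computation.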


Organization

■ Lectures
  ■ Presentation of theoretical concepts
■ Labs
  ■ Gain practical experience with Hadoop, HDFS, MapReduce, Spark and RDD programming
  ■ Work in a team of two
■ Project
  ■ Work in a team of two on a self-chosen Big Data project


Course structure (subject to change)


■ S01 Introduction to MapReduce (MGF)
■ S02 MapReduce advanced topics (MGF)
■ S03 MapReduce algorithm design (MGF)
■ S04 MapReduce algorithm design 2 (MGF)
■ S05 Inverted index (MGF)
■ S06 Introduction to Scala collections and their operations (NFI)
■ S07 Programming with Spark (NFI)
■ S08 Spark lab 1 (NFI)
■ S09 Programming with Spark (NFI)
■ S10 Spark lab 2 (NFI)
■ S11 Project
■ S12 Project
■ S13 Project
■ S14 Project presentations


Evaluation


Activity              Coefficient
Labs                  25%
Project               25%
Final written exam    50%


Documentation

■ Lecture slides
■ Lab documentation
■ Homework
... are available on the course web site http://mse-bda.s3-website-eu-west-1.amazonaws.com



Recommended books and web resources

■ Books
  ■ Jimmy Lin, Chris Dyer. Data-Intensive Text Processing with MapReduce. Morgan & Claypool. Available at https://github.com/lintool/MapReduceAlgorithms/blob/master/MapReduce-book-final.pdf
  ■ Tom White. Hadoop: The Definitive Guide. O'Reilly Media.
  ■ Anand Rajaraman, Jure Leskovec, Jeffrey D. Ullman. Mining of Massive Datasets. Cambridge University Press. Available at http://infolab.stanford.edu/~ullman/mmds.html
  ■ Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills. Advanced Analytics with Spark: Patterns for Learning from Data at Scale. O'Reilly Media.
  ■ Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia. Learning Spark: Lightning-Fast Big Data Analysis. O'Reilly Media.
  ■ Nick Pentreath. Machine Learning with Spark. Packt Publishing.
■ On the web
  ■ Stack Overflow: http://stackoverflow.com/
    tags: hadoop, mapreduce, apache-spark


Your questions
