Amir H. Payberah amir@sics Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 7 / 49....

download Amir H. Payberah amir@sics Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 7 / 49. Presentation

of 75

  • date post

    26-Apr-2020
  • Category

    Documents

  • view

    0
  • download

    0

Embed Size (px)

Transcript of Amir H. Payberah amir@sics Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 7 / 49....

  • Introduction

    Amir H. Payberah amir@sics.se

    Amirkabir University of Technology (Tehran Polytechnic)

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 1 / 49

  • Course Information

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 2 / 49

  • Course Objective

    I Introduction to main concepts and principles of cloud computing and data intensive computing.

    I How to read, review and present a scientific paper.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 3 / 49

  • Course Objective

    I Introduction to main concepts and principles of cloud computing and data intensive computing.

    I How to read, review and present a scientific paper.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 3 / 49

  • Topics of Study

    I Topics we will cover include: • Cloud platforms • Cloud storage, NoSQL and NewSQL databases • Cloud resource management • Batch processing frameworks • Stream processing frameworks • Graph processing frameworks

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 4 / 49

  • Course Material

    I Mainly based on research papers.

    I You will find all the material on the course web page: http://www.sics.se/∼amir/cloud14.htm

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 5 / 49

  • Course Examination

    I Mid term exam: 20%

    I Final exam: 20%

    I Reading assignments: 27%

    I Final presentation: 23%

    I Final project: 10%

    I Lab assignments: 0% :)

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 6 / 49

  • Reading Assignments

    I Nine reading assignments.

    I You should write a review for each paper (at most two pages).

    I For each paper you should: • Identify and motivate the problem. • Pinpoint the main contributions. • Identify positive/negative aspects of the solution/paper.

    I For each paper, you might be given some questions to answer.

    I Students will work in groups of two.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 7 / 49

  • Reading Assignments

    I Nine reading assignments.

    I You should write a review for each paper (at most two pages).

    I For each paper you should: • Identify and motivate the problem. • Pinpoint the main contributions. • Identify positive/negative aspects of the solution/paper.

    I For each paper, you might be given some questions to answer.

    I Students will work in groups of two.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 7 / 49

  • Reading Assignments

    I Nine reading assignments.

    I You should write a review for each paper (at most two pages).

    I For each paper you should: • Identify and motivate the problem. • Pinpoint the main contributions. • Identify positive/negative aspects of the solution/paper.

    I For each paper, you might be given some questions to answer.

    I Students will work in groups of two.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 7 / 49

  • Reading Assignments

    I Nine reading assignments.

    I You should write a review for each paper (at most two pages).

    I For each paper you should: • Identify and motivate the problem. • Pinpoint the main contributions. • Identify positive/negative aspects of the solution/paper.

    I For each paper, you might be given some questions to answer.

    I Students will work in groups of two.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 7 / 49

  • Presentation

    I Each group give a 30 minutes talk on a scientific paper.

    I The list of papers will be available in the course web page.

    I You are also free to choose any other paper, but it should be con- firmed.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 8 / 49

  • Lab Assignments and the Project

    I Implement simple applications on different frameworks through the course.

    I The solution of each lab assignment will be uploaded on the course page, one week after their start dates.

    I The final project on top of Spark.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 9 / 49

  • Discussion Forum

    I Use the course discussion forum if you have any questions: http://www.sics.se/∼amir/cloud14.htm

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 10 / 49

  • Course Overview

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 11 / 49

  • Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom.

    - Clifford Stoll

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 12 / 49

  • Big Data

    small data big data

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 13 / 49

  • I Big Data refers to datasets and flows large enough that has outpaced our capability to store, process, analyze, and understand.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 14 / 49

  • The Four Dimensions of Big Data

    I Volume: data size

    I Velocity: data generation rate

    I Variety: data heterogeneity

    I This 4th V is for Vacillation: Veracity/Variability/Value

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 15 / 49

  • Where Does Big Data Come From?

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 16 / 49

  • Big Data Market Driving Factors

    The number of web pages indexed by Google, which were around one million in 1998, have exceeded one trillion in 2008, and its expansion is accelerated by appearance of the social networks.∗

    [∗Wei Fan et al., Mining big data: current status, and forecast to the future, 2013]

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 17 / 49

  • Big Data Market Driving Factors

    The amount of mobile data traffic is expected to grow to 10.8 Exabyte per month by 2016.∗

    [∗Dan Vesset et al., Worldwide Big Data Technology and Services 2012-2015 Forecast, 2013]

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 18 / 49

  • Big Data Market Driving Factors

    More than 65 billion devices were connected to the Internet by 2010, and this number will go up to 230 billion by 2020.∗

    [∗John Mahoney et al., The Internet of Things Is Coming, 2013]

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 19 / 49

  • Big Data Market Driving Factors

    Open source communities

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 20 / 49

  • Big Data Market Driving Factors

    Many companies are moving towards using Cloud services to access Big Data analytical tools.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 21 / 49

  • History of Data

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 22 / 49

  • 4000 B.C

    I Manual recording

    I From tablets to papyrus, to parchment, and then to paper

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 23 / 49

  • 1450

    I Gutenberg’s printing press

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 24 / 49

  • 1800’s - 1940’s

    I Punched cards (no fault-tolerance)

    I Binary data

    I 1890: US census

    I 1911: IBM appeared

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 25 / 49

  • 1940’s - 1950’s

    I Magnetic tapes

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 26 / 49

  • 1950’s - 1960’s

    I Large-scale mainframe computers

    I Batch transaction processing

    I File-oriented record processing model (e.g., COBOL)

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 27 / 49

  • 1960’s - 1970’s

    I Hierarchical DBMS (one-to-many)

    I Network DBMS (many-to-many)

    I VM OS by IBM → multiple VMs on a single physical node.

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 28 / 49

  • 1970’s - 1980’s

    I Relational DBMS (tables) and SQL

    I ACID

    I Client-server computing

    I Parallel processing

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 29 / 49

  • 1990’s - 2000’s

    I Virtualized Private Network connections (VPN)

    I The Internet...

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 30 / 49

  • 2000’s - Now

    I Cloud computing

    I NoSQL: BASE instead of ACID

    I Big Data

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 31 / 49

  • Cloud and Big Data

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 32 / 49

  • Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 33 / 49

  • Big Data Analytics Stack

    Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/22 34 / 49

  • Hadoop Big Data Analytics Stack