Opensource Frameworks and BigData Processing

Transcript of Opensource Frameworks and BigData Processing

Page 1: Opensource Frameworks and BigData Processing

Linux and Ubuntu 14.10 Release Conf 1

Big-Data Processing utilizing Open-Source Technology Stack

By

Amir Sedighi

http://www.linkedin.com/in/amirsedighi

@amirsedighi

Page 2: Opensource Frameworks and BigData Processing

References

● http://www.slideshare.net/BernardMarr/140228-big-data-slide-share?qid=017848e2-9e2a-4dc3-963c-52b6a90fba2a&v=default&b=&from_search=1

● http://www.forbes.com/fdc/welcome_mjx.shtml

● ZYMR Spark Your Real-Time Big Data Analytics

● http://dataconomy.com

● https://datakulfi.wordpress.com/2013/03/27/big-data-open-source-technology-landscape/

● http://www.slideshare.net/andrefaria/big-data-abc?qid=1ac97e4a-4acc-460a-b3f8-9122f7210440&v=qf1&b=&from_search=12

● https://wiki.apache.org/hadoop/PoweredBy

Page 3: Opensource Frameworks and BigData Processing

Data Explosion

Page 4: Opensource Frameworks and BigData Processing

Data Explosion

Page 5: Opensource Frameworks and BigData Processing

● Big Data refers to the fact that everything we do increasingly leaves a digital trace, which we (or others) can gather, use, and analyze.

– Data Providers

● Business Companies

● People

Page 6: Opensource Frameworks and BigData Processing

Volume, Velocity, Variety

● “There was 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days, and the pace is increasing.” Eric Schmidt
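To put the quoted figure in perspective, here is a rough back-of-the-envelope calculation of the average rate it implies. It is only a sketch: it takes the quote's "5 exabytes every 2 days" at face value and assumes decimal units (1 exabyte = 10^18 bytes). The function name is illustrative and not part of the talk.

```python
# Average data-creation rate implied by "5 exabytes every 2 days".
# Assumption: decimal units, 1 EB = 10**18 bytes and 1 TB = 10**12 bytes.

EXABYTE = 10**18
TERABYTE = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second(total_bytes: int, days: int) -> float:
    """Return the average rate in bytes per second over the given period."""
    return total_bytes / (days * SECONDS_PER_DAY)


rate = bytes_per_second(5 * EXABYTE, days=2)
print(f"{rate / TERABYTE:.1f} TB per second")  # prints "28.9 TB per second"
```

Under these assumptions the quote corresponds to roughly 29 terabytes of new data every second.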

Page 7: Opensource Frameworks and BigData Processing

Big-Data Processing

Page 8: Opensource Frameworks and BigData Processing

How to provide a Big-Data processing platform using commodity machines?

Page 9: Opensource Frameworks and BigData Processing

Vertical or Horizontal?

Page 10: Opensource Frameworks and BigData Processing

Scale Up vs Scale Out

Page 11: Opensource Frameworks and BigData Processing

Scale Up vs Scale Out
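Scaling out usually means spreading data across several commodity machines. The sketch below shows one simple way this might be done: hash-modulo partitioning of keys across nodes. The scheme, the node names, and the helper names `node_for` and `partition` are assumptions chosen for illustration and do not come from the slides.

```python
import hashlib
from typing import Dict, Iterable, List, Sequence


def node_for(key: str, nodes: Sequence[str]) -> str:
    """Pick a node for a key using a stable hash modulo the node count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def partition(keys: Iterable[str], nodes: Sequence[str]) -> Dict[str, List[str]]:
    """Group keys by the node that would store them."""
    placement: Dict[str, List[str]] = {node: [] for node in nodes}
    for key in keys:
        placement[node_for(key, nodes)].append(key)
    return placement


# Hypothetical cluster of three commodity machines.
nodes = ["node-a", "node-b", "node-c"]
keys = [f"user-{i}" for i in range(12)]
placement = partition(keys, nodes)

for node, assigned in placement.items():
    print(node, len(assigned))
```

A plain modulo scheme remaps most keys whenever the number of nodes changes, which is why distributed stores often use consistent hashing or range partitioning instead.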

Page 12: Opensource Frameworks and BigData Processing

Big-Data Processing Open-Source Technology Stack

Page 13: Opensource Frameworks and BigData Processing

Map-Reduce
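The Map-Reduce model can be illustrated with the classic word-count example. This is a minimal single-process sketch in plain Python, not the Hadoop API: the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample documents are assumptions made for illustration. In a real cluster the map and reduce steps would run in parallel on different machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Sample input, made up for this example.
documents = ["big data big clusters", "data moves to clusters"]
counts = word_count(documents)
print(counts)
```

Because each map call looks at one document only and each reduce call looks at one key only, both phases can be distributed across machines without coordination beyond the shuffle.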

Page 14: Opensource Frameworks and BigData Processing

Hadoop Framework

Page 15: Opensource Frameworks and BigData Processing

Apache Hadoop Main Projects

Page 16: Opensource Frameworks and BigData Processing

Page 17: Opensource Frameworks and BigData Processing

Data Stores

● Data Stores

– Key-Value

– Graph

– Columnar

– Document Store

– In-Memory
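The categories above differ mainly in how they lay out and address data. The sketch below models the same two sample records in four of those shapes using plain Python structures. It is only an illustration of the data models: the records, key format, and edge label are made up, and no real product's storage format or API is implied.

```python
import json

# Sample records, invented for this example.
records = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Linus", "city": "Helsinki"},
]

# Key-value: an opaque value is looked up by a single key.
key_value = {f"user:{r['id']}": json.dumps(r) for r in records}

# Document: the store understands the structure, so fields can be queried.
documents = {r["id"]: r for r in records}
londoners = [d["name"] for d in documents.values() if d["city"] == "London"]

# Columnar: all values of one field are kept together,
# which makes scans over a single column cheap.
columnar = {field: [r[field] for r in records] for field in records[0]}

# Graph: entities are nodes and relationships are explicit edges.
graph_edges = [(r["name"], "LIVES_IN", r["city"]) for r in records]

print(key_value["user:1"])
print(londoners)
print(columnar["city"])
print(graph_edges[0])
```

Everything here lives in process memory, so in a loose sense all four structures are also "in-memory"; actual in-memory stores add concerns such as persistence and eviction that this sketch does not cover.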

Page 18: Opensource Frameworks and BigData Processing

Data Transfer

● Apache Flume

● Apache Sqoop

Page 19: Opensource Frameworks and BigData Processing

Search

● Elasticsearch

● Apache Solr

Page 20: Opensource Frameworks and BigData Processing

Messaging and Queuing

● Apache Kafka

● ZeroMQ

Page 21: Opensource Frameworks and BigData Processing

Log Management

● ELK (Elasticsearch, Logstash, Kibana)

● Logstash

● Fluentd

Page 22: Opensource Frameworks and BigData Processing

Stream Processing

● Apache Storm

● Apache Samza

● Apache Spark

Page 23: Opensource Frameworks and BigData Processing

Machine Learning

● Apache Mahout

● MLlib

● GraphX

Page 24: Opensource Frameworks and BigData Processing

Questions?