汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei...

63
BREAD PPT DESIGN 汇汇汇 汇汇汇 Discretized Streams: Fault- Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems Principles Streaming

description

BREAD PPT DESIGN Introduction 数据库与知识工程 实验室 Much of “big data” is received in real time, and is most valuable at its time of arrival  Social network may wish to detect trending conversation topics in minutes  E-Commerce website may wish to model which users visit a new page  Service operator may wish to monitor program logs to detect failures in seconds

Transcript of 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei...

Page 1: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

汇报人:李旺龙

Discretized Streams: Fault-Tolerant Streaming Computation at Scale

Matei Zaharia UC Berkeley AMPLab

SOSP 2013ACM Symposium on Operating Systems Principles

Streaming

Page 2: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

目录Introduction1

CONTENTS

Background2

Implementation3

Experiment4

Conclusion5数据库与知识工程实验室

www.dbke.sinaapp.com

Streaming

Page 3: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Introduction

数据库与知识工程实验室

www.dbke.sinaapp.com

Much of “big data” is received in real time, and is most valuable at its time of arrival

Social network may wish to detect trending conversation topics in minutes

E-Commerce website may wish to model which users visit a new page

Service operator may wish to monitor program logs to detect failures in seconds

Page 4: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Introduction

数据库与知识工程实验室

www.dbke.sinaapp.com

To enable these low-latency processing applications, there is a need for streaming computation models that scale transparently to large clusters

Most distributed streaming systems, including Storm,TimeStream, MapReduce Online, and streaming databases, are based on a continuous operator model

long-running, stateful operators receive each record, update internal state, and send new records.

Page 5: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Introduction

数据库与知识工程实验室

www.dbke.sinaapp.com

Page 6: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Introduction

数据库与知识工程实验室

www.dbke.sinaapp.com

Major Problems : Faults & Stragglers

Continuous operator model perform recovery through two approaches : Replication, where there are two copies of each

node costs 2× the hardware Upstream Backup, where nodes buffer sent

messages and replay them to a new copy of a failed node takes a long time to recover

Page 7: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Introduction

数据库与知识工程实验室

www.dbke.sinaapp.com

Major Problems : Faults & Stragglers

Neither approach handles stragglers: Replication, synchronization protocols to

coordinate replicas slow down

Upstream Backup, treated as a failure costly recovery

Page 8: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Introduction

数据库与知识工程实验室

www.dbke.sinaapp.com

This paper presents a new stream processing model, discretized streams (D-Streams), that overcomes these challenges

Instead of managing long-lived operators, the idea in D-Streams is to structure a streaming computation as a series of stateless, deterministic batch computations on small time intervals

Page 9: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Introduction

数据库与知识工程实验室

www.dbke.sinaapp.com

Page 10: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Introduction

数据库与知识工程实验室

www.dbke.sinaapp.com

Challenge 1 : latency lowWe use a data structure called Resilient Distributed Datasets (RDDs) , which keeps data in memory and can recover it without replication by tracking the lineage graph of operations that were used to build it

Challenge 2 : quickly recovery from faults and stragglersParallel recovery, When a node fails, each node in the cluster works to recompute part of the lost node’s RDDs, resulting in significantly faster recovery than upstream backup without the cost of replication

Page 11: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Introduction

数据库与知识工程实验室

www.dbke.sinaapp.com

We have implemented D-Streams in a system calledSpark Streaming, based on the Spark engine

The system can process over 60 million records/second on 100 nodes at sub-second latency, and can recover from faults and stragglers in sub-second time.

Page 12: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Introduction

数据库与知识工程实验室

www.dbke.sinaapp.com

Spark Streaming’s per-node throughput is comparable to commercial streaming databases, while offering linear scalability to 100 nodes, and is 2–5× faster than the open source Storm and S4 systems, while offering fault recovery guarantees that they lack.

D-Streams use the same processing model and data structures (RDDs) as batch jobs, a powerful advantage of our model is that streaming queries can seamlessly be combined with batch and interactive computation.

Page 13: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

目录Introduction1

CONTENTS

Background2

Implementation3

Experiment4

Conclusion5数据库与知识工程实验室

www.dbke.sinaapp.com

Streaming

Page 14: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Background

数据库与知识工程实验室

www.dbke.sinaapp.com

Review SparkCreator of Hadoop Doug Cutting says “the use of MapReduce engine for Big Data projects will decline, replaced by Apache Spark”

Page 15: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Background

数据库与知识工程实验室

www.dbke.sinaapp.com

Review Spark

The Spark Stack

Spark SQLRelationalOperators

MLLibMachineLearning

GraphXGraph

Processing

SparkStreamingReal-time

Spark Runtime

YARN, Mesos, AWS HDFS, S3, Cassandra …Cluster Managers Data Sources

A fast and general engine for large-scale data processing

Page 16: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Background

数据库与知识工程实验室

www.dbke.sinaapp.com

Review Spark

Resilient distributed datasets (RDDs) that enables efficient data reuse in a broad range of applications

Fault-tolerant Parallel data structures Explicitly persist in memory Control their partition A rich set of operators

jack

hash

arthur

tom

jack

arthur

tom

hash

Page 17: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Background

数据库与知识工程实验室

www.dbke.sinaapp.com

Review Spark

1-102-11

1-jack2-tom

1-(10,jack)2-(11,tom)

join

Page 18: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Background

过去“人人都是产品经理”这两年“人人都是大数据专家”再过两年“人人都是电影导演”数据库与知识工程实验室

www.dbke.sinaapp.com

Review MapReduce

Page 19: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

data block (key,value)

(key,value)

(key,value_list)

(key,value)

split map reduceshuffle/partition

The school motto analysis by MapReduce自强 弘毅 求是 拓新 (武大)明德 厚学 求是 创新 (华科) 自强 1 弘毅

1求是 1 拓新 1 明德 1 厚学 1 求是 1 创新 1求实 1 创新 1 进取 1 团结 1严紧 1 求实 1 团结 1 创新 1

自强 1求是 1 1…明德

求实 创新 进取 团结(大连理工)严紧 求实 团结 创新(同济)

0 自强 弘毅 求是 拓新 1 明德 厚学 求是 创新 0 求实 创新 进取 团结1 严紧 求实 团结 创新

自强 1求是 2…明德 1

map shuffle reduce

弘毅 1严紧 1…创新 3

弘毅 1严紧 1…创新 1 1 1

数据库与知识工程实验室

www.dbke.sinaapp.com

BackgroundReview MapReduce

Page 20: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

data block (key,value)

(key,value)

(key,value_list)

(key,value)

split map reduceshuffle/partition

The school motto analysis by MapReduce

数据库与知识工程实验室

www.dbke.sinaapp.com

BackgroundReview MapReduce

val file = spark.textFile("src/main/resources/abc") val counts = file.flatMap(line => line.split(" ")) .map(word => (word, 1)) .reduceByKey( (a,b) => a+b ) counts.saveAsTextFile("src/main/resources/out")

Page 21: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Background

数据库与知识工程实验室

www.dbke.sinaapp.com

Our work targets applications that need to run on tens to hundreds of machines, and tolerate a latency of several seconds. Some examples are: Site activity statistics Cluster monitoring Spam detection

For these applications, we believe that the 0.5–2 second latency of D-Streams is adequate, as it is well belowthe timescale of the trends monitored. We purposely donot target applications with latency needs below a fewhundred milliseconds, such as high-frequency trading

Page 22: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

目录Introduction1

CONTENTS

Background2

Implementation3

Experiment4

Conclusion5数据库与知识工程实验室

www.dbke.sinaapp.com

Streaming

Page 23: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

val conf = new SparkConf().setMaster("local[2]") .setAppName("NetworkWordCount")val ssc = new StreamingContext(conf, Seconds(3))val lines = ssc.socketTextStream("203.195.218.212“ ,10000)val words = lines.flatMap(line=> line.split(" "))val pairs = words.map(word => (word,1))val wordCounts = pairs.reduceByKey( (a,b) => a+b )wordCounts.print()ssc.start()ssc.awaitTermination()

Page 24: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Page 25: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

The window operation groups all the records from a sliding window of past time intervals into one RDD

Windowingwords.window("5s") yields a D-Stream of RDDs containing the words in intervals [0 , 5), [1 , 6), [2 , 7)…

Page 26: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Incremental aggregationpairs.reduceByWindow("5s", (a, b) => a + b)pairs.reduceByWindow("5s", (a,b) => a+b, (a,b) => a-b)

Page 27: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

State trackinghow many sessions have a bitrate above X ?

One could count the active sessions from a stream of (ClientID, Event)

sessions = events.track((key, ev) => 1, // initialize function(key, st, ev) => // update functionev == Exit ? null : 1,"30s") // timeoutcounts = sessions.count() // a stream of ints

Page 28: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Unification with Batch & Interactive ProcessingSpark Streaming provides several powerful features to unify streaming and batch processing D-Streams can be combined with static RDDs computed using a standard Spark job Users can run a D-Stream program on previous

historical data using a “batch mode.” Users run ad-hoc queries on D-Streams

interactively by attaching a Scala console to their Spark Streaming program and running arbitrary Spark operations on the RDDs there

counts.slice("21:00", "21:05").topK(10)

Page 29: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Page 30: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Master tracks the D-Stream lineage graph and schedules tasks to compute new RDD partitions.Worker nodes that receive data, store the partitionsof input and computed RDDs, and execute tasks.Client library used to send data into the system

Page 31: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

New data is replicated across two worker nodes before sending an acknowledgement to the client library, because D-Streams require input data to be stored reliably to recompute results. If a worker fails, the client library sends unacknowledged data to another worker.

Page 32: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Spark Streaming relies on Spark’s existing batchscheduler within each timestep, and performs manyof the optimizations in systems

It pipelines operators that can be grouped into a single task, such as a map followed by another map.

It places tasks based on data locality. It controls the partitioning of RDDs to avoid

shuffling data across the network

Page 33: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Optimizations for Stream Processing Network communication :asynchronous I/O Timestep pipelining: submitting tasks from the

next timestep before the current one has finished

Task Scheduling : messages size, more task Storage layer: RDDs are immutable, they can

be checkpointed over the network without blocking computations on them and slowing jobs.

Lineage cutoff : forget lineage after an RDD has been checkpointed

Master recovery : run 24/7

Page 34: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Memory ManagementEach node’s block store manages RDD partitions in an LRU fashion

User can set a maximum history timeout, after which the system will simply forget old blocks without doing disk I/O

The memory required by Spark Streaming is not onerous, because the state within a computation istypically much smaller than the input data

Page 35: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Parallel RecoveryThe system periodically checkpoints some of the state RDDs, by asynchronously replicating them to other worker nodesWhen a node fails, the system detects all missing RDD partitions and launches tasks to recompute them from the last checkpoint. Many tasks can be launched at the same time to compute different RDD partitions, allowing the whole cluster to partake in recovery.

Page 36: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Parallel Recovery

Page 37: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Parallel Recovery

恢复量满载恢复时间 新数据

Page 38: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Straggler MitigationD-Streams also let us mitigate stragglers like batch systems do, by running speculative backup copies of slow tasks.Such speculation would be difficult in a continuous operator system, as it would require launching a new copy of a node, synchronizd populating its state, and overtaking the slow copy. whenever a task runs more than 1 .4×longer than the median task in its job stage, we mark it as slow. More refined algorithms

Page 39: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Implementation

数据库与知识工程实验室

www.dbke.sinaapp.com

Master RecoveryWriting the state of the computation reliably when starting each timestep Having workers connect to a new master and report their RDD partitions to it when the old master fails

Stores D-Stream metadata in HDFSgraph, function objects, checkpoint time,updated rdd

A 100-node cluster resuming work in 12 seconds

Page 40: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

目录Introduction1

CONTENTS

Background2

Implementation3

Experiment4

Conclusion5数据库与知识工程实验室

www.dbke.sinaapp.com

Streaming

Page 41: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Experiment

数据库与知识工程实验室

www.dbke.sinaapp.com

Amazon EC2 m1.xlarge 4 cores and 15 GB RAM1 s latency target -> 500 ms input intervals2 s latency target -> 1 s intervals100-byte input records

Page 42: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Experiment

数据库与知识工程实验室

www.dbke.sinaapp.com

Spark Streaming’s per-node throughput of 640,000 records/s for Grep and 250,000 records/s for TopKCount on 4-core nodesOracle CEP 1 million records/s on 16 coresStreamBase 245,000 records/s on 8 coresEsper 500,000 records/s on 4 cores

While there is no reason to expect D-Streams to be slower or faster per-node, the key advantage is that Spark Streaming scales nearly linearly to 100 nodes

Page 43: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Experiment

数据库与知识工程实验室

www.dbke.sinaapp.com

S4 was limited in the number of records/second it could process per, which made it almost 10× slower than Spark and Storm.

Storm is still adversely affected by smaller record sizes

Page 44: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Experiment

数据库与知识工程实验室

www.dbke.sinaapp.com

1-second batches with input data residing in HDFS20 MB/s/node for WordCount 80 MB/s/node for Grepcheckpoint interval of 10 seconds 20 four-core nodes

Page 45: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Experiment

数据库与知识工程实验室

www.dbke.sinaapp.com

Page 46: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Experiment

数据库与知识工程实验室

www.dbke.sinaapp.com

doubling the nodes reduces the recovery time in half

Page 47: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Experiment

数据库与知识工程实验室

www.dbke.sinaapp.com

We tried slowing down one of the nodes instead of killing it, by launching a 60-thread process that overloaded the CPU

Page 48: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

目录Introduction1

CONTENTS

Background2

Implementation3

Experiment4

Conclusion5数据库与知识工程实验室

www.dbke.sinaapp.com

Streaming

Page 49: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Conclusion

数据库与知识工程实验室

www.dbke.sinaapp.com

We have proposed D-Streams, a new model for distributed streaming computation that • enables fast recovery from both faults and stragglers

without the overhead of replication• forgot conventional streaming wisdom by batching

data into small timesteps• support a wide range of operators and can attain high

per-node throughput, linear scaling to 100 nodes, sub-second latency, and sub-second fault recovery

• compose seamlessly with batch and interactive queries

Page 50: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

工作进展

论文工作

数据库与知识工程实验室

www.dbke.sinaapp.com

实习工作手 Q 质量数据处理

流数据挖掘

Spark 调研

Page 51: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

实习工作对手机 QQ 十多个质量指标,约 50 个事件进行监控收发图片、收发消息、收发文件、登陆、页面切换等群图片、讨论组图片、用户间图片等每天收图片日志 iPhone 13 亿 Android 60 亿 约 8 万条 / 秒, 80M/ 秒, 7T/ 天 Java + Python + Hive + Pig + PostgreSQL

数据库与知识工程实验室

www.dbke.sinaapp.com

Page 52: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

实习工作数据样例

2014080108, 中国 , 浙江省 , 中国电信 ,unknown,unknown,10.157.89.36 2014-08-01 07:59:59.949,INFO,0S200MNJT807V3GE,5.0.0.146,beacon,1.8.0,H30-U10;Android 4.2.2,level 17,122.242.114.5,122.242.114.5,wifi,actGroupPicSmallDownV1,true, 4397,66,A2=000000000000000&A1=1085779492&A4=000000000000000&param_NetworkInfo=2&A3=000000000000000_00:0c:e7:30:13:cf&A6=20:08:ed:07:c6:8d&param_step=1_1_1_0_65;2_-1_0_0_0;3_-1_0_0_0&serverip=61.151.234.34&A7=7638540fc92e0c2e &param_groupPolicy=1&param_uuid={2115FC55-2DA4-4A73-3570-FA89969A3C17}.jpg &param_uinType=1&A67=com.tencent.mobileqq:MSF&QQ=&A28=122.242.114.5&A27=4397&param_FailCode=0&A26=66&A25=true&A23=2017&param_DownMode=1&param_ProductVersion=537039093&param_NetworkOperator= 中国移动&param_SsoServerIp=14.17.42.23:8080&param_runStatus=0&A19=wifi&param_grpUin=213478033&param_GatewayrIp=122.242.114.5&param_Server=61.151.234.34,2014-08-01 07:59:15,2014-08-01 07:59:59,,1085779492,Android,4.2.2,1085779492,0, 000000000000000_00:0c:e7:30:13:cf,0,,20:08:ed:07:c6:8d,7638540fc92e0c2e,,,,,,,,,,,,wifi,,,,2017,,true,66,4397,122.242.114.5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,com.tencent.mobileqq:MSF,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,20140801080

数据库与知识工程实验室

www.dbke.sinaapp.com

Page 53: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

实习工作手 Q 质量数据处理流程

灯塔灯塔库表

各指标小时统计总表小时汇总到天总表

收图小时表

收图天表

TDWHDFS PG入库 出库计算

发图天表

发图小时表

……

……

数据库与知识工程实验室

www.dbke.sinaapp.com

Page 54: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

手 Q 质量数据处理由 SQL 转向 Pig

灯塔灯塔库表

各指标小时统计总表小时汇总到天总表

收图小时表

收图天表

TDWHDFS PG入库 出库计算

发图天表

发图小时表

……

…收图统计

发图统计

…TDW

PigHDFS

入库

PigHDFS转移

数据库与知识工程实验室

www.dbke.sinaapp.com实习工作

Page 55: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

从 SQL 转向 Pig 后,每天手 Q 质量数据处理成本 由 3000+ 降低至 约 1000

数据库与知识工程实验室

www.dbke.sinaapp.com实习工作

Page 56: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

手 Q 质量数据处理由 SQL 转向 Pig 成效——原因分析SQL 重复解析 0_1_0_12;1_1_0_372;2_1_1_2245 => col1--col15

Pig 定义一个 UDF

PS : Hive 也支持 UDF 但 TDW SQL 不支持

数据库与知识工程实验室

www.dbke.sinaapp.com实习工作

Page 57: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Spark 现状

HiveStormMahout Giraph

采用 Scala 编写,支持Python 、 Scala 、 Java

Spark SQL Spark StreamingMLlib GraphX

实习工作

数据库与知识工程实验室

www.dbke.sinaapp.com

Page 58: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

实习工作

数据库与知识工程实验室

www.dbke.sinaapp.com

• 学术界对工业的理论创新 RDD vs MapReduce• 不仅支持 MapReduce ,还支持 Pregel 等多范式• 充分利用内存,支持 DAG ,少序列化、 IO 、网络• 数据加载时, partition 可控• 多级别内存持久化可控,交互式查询• 基于血统的容错机制,类管道支持• 速度优势明显、内存消耗大• 支持 SQL 、流数据、离线数据、图数据、机器学习等• 学习了基本的 Spark 使用,其他框架上手容易不仅仅是快

Spark VS Hadoop

Page 59: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

实习工作

数据库与知识工程实验室

www.dbke.sinaapp.com

Spark 未来 ( San Francisco| June 30 - July 2, 2014Spark Summit 2014 )Spark SQL

• 优化:代码生成、更快的 join 等• 语言扩展:将支持 SQL92• 更好的集成

Page 60: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

实习工作

数据库与知识工程实验室

www.dbke.sinaapp.com

Spark 未来 ( San Francisco| June 30 - July 2, 2014Spark Summit 2014 )MLlib

• 支持的算法将由 15 个翻倍到 30 个左右,涵盖抽样、相关性、估计、检验等描述性统计学算法以及 NMF 、 Sparse SVD 和 LDA 等机器学习算法• SparkR 上线并集成到 MLlibStreaming将支持更多的数据源GraphX优化和 API稳定业界的贡献特性

停止MapReduce转向 Spark

Page 61: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

实习工作

数据库与知识工程实验室

www.dbke.sinaapp.com

Spark meetup in China

2014 年 8月 9 日 @北京 1st Intel 、亚信、 Databrick 2014 年 8月 31 日 @杭州 华为、阿里巴巴2014 年 9月 6 日 @北京 2nd traintracks.io 、微软、京东2014 年 9月 21 日 @深圳 华为、腾讯2014 年 10月 26 日 @北京 3rd Intel 、阿里巴巴、微软、美团、NJU

8月 9 日, Spark-User Beijing Meetup第一次分享活动在亚信科技总部研发中心大厦成功举办。本次活动吸引了包括百度、新浪、京东、 Tibco 、豌豆荚、豆瓣、微博、小米、华为、爱奇艺、美团、 58 、海星、搜狗、 CBSI 、神舟泰岳、大唐电信、 Talking Data 、安达佳、中航信、清华大学、北京邮电大学及银行系统等32 家不同公司、高校、金融系统共 121 人参与。

星火燎原

Page 62: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

进展• 研读了几篇关于流数据挖掘的博士论文,对于流数据挖掘的挑战与常用解决方法有了基本认识• 查阅了 Storm 这个工业界比较成熟的流数据系统的科普知识• 阅读MOA ( Massive Online Analysis )这个流数据挖掘工具的文档,测试了一些例子计划• 深入对流数据挖掘算法的研究• 完成小论文的实验与撰写• 确定毕业论文的具体题目

数据库与知识工程实验室

www.dbke.sinaapp.com论文工作

Page 63: 汇报人:李旺龙 Discretized Streams: Fault-Tolerant Streaming Computation at Scale Matei Zaharia UC Berkeley AMPLab SOSP 2013 ACM Symposium on Operating Systems.

BREAD PPT DESIGN

Thank You !

数据库与知识工程实验室

www.dbke.sinaapp.com