Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
![Page 1: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/1.jpg)
1
Gearpump Real time DAG-Processing with Akka at Scale
Strata Singapore 2015
Sean Zhong, Intel Software
![Page 2: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/2.jpg)
2
What is Gearpump?
• An Akka-based, lightweight, real-time data processing platform.
• Apache License, http://gearpump.io, version 0.7.
• Akka provides communication, concurrency, isolation, and fault tolerance.
• Simple and powerful: message-level streaming, long-running daemons.
![Page 3: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/3.jpg)
3
What is Akka?
• Micro-service (Actor) oriented
• Message-driven
• Lock-free
• Location-transparent
It is like human society: driven by messages, and able to scale to a population of 7 billion!
![Page 4: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/4.jpg)
4
Micro-service oriented: a higher abstraction
• Break your application into micro-services instead of objects.
• Throw away locks.
• Use immutable, asynchronous messages to exchange information between micro-services, instead of shared objects.
![Page 5: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/5.jpg)
5
Gearpump in Big Data Stack
[Figure: the big data stack in four layers: Storage, Engine, Analytics, Visualization & management. Labels in the figure: store; batch; stream; Storm; Gearpump; SQL; Catalyst; StreamSQL; Impala; Machine learning; Graphx; Data explore; visualization; Cluster manager; monitor/alert/notify; Cloudera Manager.]
![Page 6: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/6.jpg)
6
Why another streaming platform?
• Existing platforms do not fully meet the requirements.
• A higher abstraction, such as a micro-service oriented one, can largely simplify the problem.
![Page 7: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/7.jpg)
7
What we want
• Meet "The 8 Requirements of Real-Time Stream Processing" (2006). Circled numbers refer to those requirements.
• Flexible: anywhere, any size, any source, any use case; dynamic DAG; ② StreamSQL
• Volume: high throughput; ⑦ scale linearly
• Speed: ① in-stream; zero latency; ⑥ HA; ⑧ responsive
• Accuracy: exactly-once; ③ message loss/delay/out of order; ④ predictable
• Visual: easy to debug; WYSIWYG
![Page 8: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/8.jpg)
8
![Page 9: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/9.jpg)
9
DAG representation and API
Low level Graph API syntax:
Graph(A~>B~>C~>D, B~>E~>D)
[Figure: a DAG of five Processors, A, B, C, D, and E, with edges A→B→C→D and B→E→D. Edges are labeled with their partitioning: shuffle, field grouping, field grouping.]
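To make the `~>` syntax concrete, here is a minimal, self-contained sketch of how such an operator could build a DAG from chained paths. It is not Gearpump's actual `Graph` implementation: `Path`, `NodeOps`, and this toy `Graph` are hypothetical names, and nodes are plain strings rather than Processors.

```scala
// Hypothetical mini-DSL for illustration only; not Gearpump's Graph class.
object GraphSketch {
  // A path is an ordered chain of node names, e.g. A ~> B ~> C.
  final case class Path(nodes: Vector[String]) {
    def ~>(next: String): Path = Path(nodes :+ next)
    // Each pair of consecutive nodes in the chain becomes a directed edge.
    def edges: Set[(String, String)] = nodes.zip(nodes.drop(1)).toSet
  }

  // Lets a plain String start a chain: "A" ~> "B".
  implicit class NodeOps(val name: String) extends AnyVal {
    def ~>(next: String): Path = Path(Vector(name, next))
  }

  // A graph is the union of the edges of all its paths.
  final class Graph(val edges: Set[(String, String)]) {
    def successors(node: String): Set[String] =
      edges.collect { case (from, to) if from == node => to }
  }

  object Graph {
    def apply(paths: Path*): Graph = new Graph(paths.flatMap(_.edges).toSet)
  }

  // The DAG from the slide: Graph(A~>B~>C~>D, B~>E~>D)
  val dag: Graph = Graph("A" ~> "B" ~> "C" ~> "D", "B" ~> "E" ~> "D")
}
```

Writing the graph as a set of paths keeps the declaration close to how the DAG is drawn; the shared nodes B and D are merged automatically because edges are stored in a set.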
![Page 10: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/10.jpg)
10
Architecture - Actor Hierarchy
• Each application has one isolated AppMaster, and uses an actor supervision tree for error handling.
[Figure: the actor hierarchy. Labels in the figure: Client (hook in and query state); as general service; YARN; Master cluster HA design.]
![Page 11: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/11.jpg)
11
Architecture - Master HA (no SPOF)
• Akka Cluster for a centerless HA system.
• Conflict-free replicated data types (CRDT) for consistency.
• Decentralized: does not rely on a single central meta server.
CRDT data type example:
[Figure: three Workers, a leader Master, and two standby Masters. Labels in the figure: State, Gossip.]
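The CRDT example on the slide did not survive in this transcript. As a stand-in, the sketch below shows a grow-only counter (G-Counter), a common textbook CRDT. It is an illustration of the idea only, not the data type Gearpump's masters replicate; the names `GCounter`, `increment`, and `merge` are assumptions.

```scala
// Illustrative G-Counter; an assumption, not Gearpump's replicated master state.
object CrdtSketch {
  // One monotonically increasing slot per node.
  final case class GCounter(slots: Map[String, Long] = Map.empty) {
    def increment(node: String, by: Long = 1L): GCounter =
      GCounter(slots.updated(node, slots.getOrElse(node, 0L) + by))

    // The counter value is the sum over all nodes' slots.
    def value: Long = slots.values.sum

    // Merge takes the per-node maximum. It is commutative, associative, and
    // idempotent, so replicas converge regardless of the order gossip arrives in.
    def merge(other: GCounter): GCounter =
      GCounter((slots.keySet ++ other.slots.keySet).map { node =>
        node -> math.max(slots.getOrElse(node, 0L), other.slots.getOrElse(node, 0L))
      }.toMap)
  }
}
```

Because merging never needs a coordinator, any master can accept an update and gossip it to the others, which is what makes a design without a central meta server possible.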
![Page 12: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/12.jpg)
12
Feature Highlights
• Akka/Akka-stream/Storm compatible
• Throughput: 14 million messages/s (*)
• Latency: 2 ms (*)
• Exactly-once
• Dynamic DAG
• Out-of-order messages
• Flexible DSL
• DAG visualization
• Internet of Things
[Figure axes: function, usability]
[*] Test environment: Intel® Xeon™ Processor E5-2690, 4 nodes, each node has 32 CPU cores and 64GB memory, 10GbE network. We use the default configuration of the Gearpump 0.7 release. See the backup page for more details. We use the SOL workload, message size 100 bytes (https://github.com/intel-hadoop/storm-benchmark). Tests carried out by the Intel team.
![Page 13: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/13.jpg)
13
![Page 14: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/14.jpg)
14
Three steps to use it
1. Download binary from http://gearpump.io
2. Submit jar by UI
3. Monitor Status
![Page 15: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/15.jpg)
15
Application Submission Flow
[Figure: four numbered panels showing the Master, Workers, AppMaster, and Executors. Captions recovered from the panels: 1. Submit a jar; 2. Create AppMaster; 4. Report Executor to AppMaster.]
Runs on YARN or without YARN.
![Page 16: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/16.jpg)
16
Low level Graph API - WordCount
// Client side: build the DAG and submit it.
val context = ClientContext()
val split = Processor[Split](splitParallism)
val sum = Processor[Sum](sumParallism)
val app = StreamApplication("wordCount", Graph(split ~> sum), UserConfig.empty)
val appId = context.submit(app)
context.close()

// Task that splits each incoming line into words.
class Split(taskContext: TaskContext, conf: UserConfig) extends Task(taskContext, conf) {
  override def onNext(msg: Message): Unit = { /* split the line */ }
}

// Task that keeps a count of words (state elided on the slide).
class Sum(taskContext: TaskContext, conf: UserConfig) extends Task(taskContext, conf) {
  override def onNext(msg: Message): Unit = { /* do aggregation on word */ }
}
Languages: Scala, Java
![Page 17: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/17.jpg)
17
High Level DSL API - WordCount
val context = ClientContext()
val app = new StreamApp("dsl", context)
val data = "This is a good start, bingo!! bingo!!"
app.fromCollection(data.lines)
// word => (word, count = 1)
.flatMap(line => line.split("[\\s]+")).map((_, 1))
// (word, count1), (word, count2) => (word, count1 + count2)
.groupByKey().sum.log
val appId = context.submit(app)
context.close()
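For readers without a cluster at hand, the same pipeline can be traced on plain Scala collections. This is only a local analogue of the DSL steps above (flatMap, map to pairs, group by key, sum), not the Gearpump DSL itself; `wordCount` is a hypothetical helper name.

```scala
// Local analogue of the DSL pipeline using the standard collections library.
object WordCountSketch {
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split("[\\s]+").toSeq) // line => words
      .filter(_.nonEmpty)
      .map(word => (word, 1))                      // word => (word, count = 1)
      .groupBy(_._1)                               // group pairs by word
      .map { case (word, pairs) => word -> pairs.map(_._2).sum }

  val counts: Map[String, Int] = wordCount(Seq("This is a good start, bingo!! bingo!!"))
}
```

The difference in the streaming version is that the grouped sums are running state inside long-lived tasks, updated one message at a time, rather than a single pass over a finite collection.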
![Page 18: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/18.jpg)
18
Akka-stream API - WordCount
implicit val system = ActorSystem("akka-test")
implicit val materializer = new GearpumpMaterializer(system)
val echo = system.actorOf(Props(new Echo()))
val sink = Sink.actorRef(echo, "COMPLETE")
val source = GearSource.from[String](new CollectionDataSource(lines))
source.mapConcat { line => line.split(" ").toList }
  .groupBy2(x => x)
  .map(word => (word, 1))
  .reduce { (a, b) => (a._1, a._2 + b._2) }
  .log("word-count").runWith(sink)
Available at branch https://github.com/gearpump/gearpump/tree/akkastream
![Page 19: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/19.jpg)
19
UI Portal - DAG Visualization (DAG Page)
• Tracks the global min-clock of all messages.
DAG:
• Node size reflects throughput.
• Edge width represents flow rate.
• A red node means something has gone wrong.
![Page 20: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/20.jpg)
20
UI Portal - Processor Detail (Processor Page)
• Data skew distribution
• Task throughput and latency
![Page 21: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/21.jpg)
21
![Page 22: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/22.jpg)
22
Throughput and Latency
• Throughput: 11 million messages/second
• Latency: 17 ms under full load
• SOL shuffle test, 32 tasks -> 32 tasks
[*] Test environment: Intel® Xeon™ Processor E5-2680, 4 nodes, each node has 32 CPU cores and 64GB memory, 10GbE network. We use the default configuration of the Gearpump 0.2 release. We use the SOL workload, message size 100 bytes (https://github.com/intel-hadoop/storm-benchmark). Tests carried out by the Intel team.
![Page 23: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/23.jpg)
23
Scalability
• Test run on 100 nodes (*) and 3000 tasks.
• Gearpump performance scales:
[Figure: performance scaling chart, up to 100 nodes.]
[*] We use 8 machines to simulate 100 worker nodes. Test environment: Intel® Xeon™ Processor E5-2680, each node has 32 CPU cores and 64GB memory, 10GbE network. We use the default configuration of the Gearpump 0.3.5 release. We use the SOL workload, message size 100 bytes (https://github.com/intel-hadoop/storm-benchmark). Tests carried out by the Intel team.
![Page 24: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/24.jpg)
24
Fault-Tolerance: Recovery time
[Figure: recovery time (*) with 91 worker nodes and 1000 tasks.]
[*] Recovery time is the interval between a) the failure happening and b) all tasks in the topology resuming data processing.
Test environment: Intel® Xeon™ Processor E5-2680; 91 worker nodes, 1000 tasks (we use 7 machines to simulate 91 worker nodes). Each node has 32 CPU cores and 64GB memory, 10GbE network. We use the default configuration of the Gearpump 0.3.5 release. We use the SOL workload (https://github.com/intel-hadoop/storm-benchmark). Tests carried out by the Intel team.
![Page 25: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/25.jpg)
25
High performance Messaging Layer
• Akka remote messages carry a big overhead (sender and receiver addresses).
• Gearpump reduces this overhead by 95% (400 bytes to ~20 bytes).
[Figure labels: convert to short address; effective batching; convert from short address; sync with other executors.]
![Page 26: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/26.jpg)
26
Effective batching
• Network idle: flush as fast as we can.
• Network busy: smart batching until the network is open again.
• This feature is ported from Storm-297.
• Network bandwidth doubled for messages of 100 bytes each.
Test environment: same as Storm-297.
![Page 27: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/27.jpg)
27
High performance flow Control
• Pass back-pressure level by level.
• About 1% throughput impact.
• Another option (not used): big-loop-feedback flow control.
1. No central ack nodes.
2. Each level knows the network status, and so can make the best use of the network.
[Figure: tasks arranged in levels. Labels in the figure: back-pressure, sliding window.]
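A sliding window is one simple way to get level-by-level back-pressure: each sender may only have a bounded number of unacknowledged messages in flight to the next level. The sketch below is a simplified model under that assumption; `WindowedSender`, `trySend`, and `onAck` are hypothetical names, and the window size is an illustrative parameter, not a Gearpump setting.

```scala
// Simplified model of per-edge sliding-window flow control (hypothetical API).
object FlowControlSketch {
  final class WindowedSender(windowSize: Int) {
    private var sent = 0L  // messages sent so far on this edge
    private var acked = 0L // messages the receiver has confirmed

    def inFlight: Long = sent - acked
    def canSend: Boolean = inFlight < windowSize

    // Returns false, and sends nothing, when the window is full. The caller then
    // stops consuming its own input, which pushes the pressure back one level.
    def trySend(): Boolean =
      if (canSend) { sent += 1; true } else false

    // Called when the receiver reports how many messages it has received so far.
    def onAck(receivedSoFar: Long): Unit =
      acked = math.max(acked, math.min(receivedSoFar, sent))
  }
}
```

Since every edge throttles itself, a slow task only stalls its direct upstream, and the stall propagates hop by hop toward the source without any central ack node.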
![Page 28: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/28.jpg)
28
![Page 29: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/29.jpg)
29
What is a good use case for Gearpump?
When you want exactly-once message processing, with millisecond latency.
When you want to integrate with Akka and Akka-stream transparently.
When you want dynamic modification of an online DAG.
When you want to connect with IoT edge devices, with location transparency.
When you want a simple scheduler to distribute customized applications, like collecting logs, distributed cron…
When you want to use Monoid state like Count-Min Sketch…
Besides, it can integrate with YARN, Kafka, Storm, etc.
![Page 30: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/30.jpg)
30
IoT Transparent Cloud
Target problem: a large gap between edge devices and the data center.
• Location transparent: a unified programming model across the boundary.
• Case: intelligent traffic system with 3000 traffic lights; travel time, overspeed detection…
[Figure labels: log; data center; DAG on device side.]
![Page 31: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/31.jpg)
31
Exactly-once: Financial use cases
Target problem: both real-time processing and accuracy are important.
• Real-time stock index
• Program trading
[Figure labels: data source (Transaction, Account, Other); Crawlers; Process; Rules; Alerts; Actions; Reports.]
![Page 32: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/32.jpg)
32
Transformers: Dynamic DAG
Target problem: no existing way to manipulate the DAG on the fly.
It can:
• Add, delete, or replace a processor.
• Change parallelism online to scale out without message loss.
• Add/remove source/sink processors dynamically.
• Each processor can have its own independent jar.
![Page 33: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/33.jpg)
33
Eve: Online Machine Learning
Target problem: ML training and prediction online, for real-time decision support.
• Decide immediately, based on the online learning result.
• TAP analytics platform integration: http://trustedanalytics.org/
[Figure: sensor input flows through Learn, Predict, and Decide to the output.]
![Page 34: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/34.jpg)
34
![Page 35: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/35.jpg)
35
General ideas: Normal Flow
• Every message in the system has an application timestamp (the message birth time).
• The MinClock service tracks the min timestamp of all pending messages in the system, now and in the future.
[Figure: a Replayable Source sends Message(timestamp) into the DAG. Other components: State, MinClock service.]
![Page 36: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/36.jpg)
36
General ideas: Recovery Flow
① Detect message loss at Tp.
② Pause the clock at Tp.
③ Replay from Tp.
④ Exactly-once: state can be reconstructed by message replay.
[Figure: components are the Replayable Source, DAG, State, MinClock service, and Checkpoint Store; messages carry Message(timestamp).]
![Page 37: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/37.jpg)
37
1. Detect Message loss
2. Pause the clock at Tp when message is lost
3. Replay from clock Tp
4. Exactly-once
![Page 38: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/38.jpg)
38
Detect Failure in time
• AckRequest and Ack are used to detect message loss.
Easy to troubleshoot
• When an error happens, we know when, where, and why.
[Figure: the actor hierarchy Master, AppMaster, Executor, Task, with a failure marked at three of the levels.]
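One way to read the AckRequest/Ack exchange: the sender periodically tells the receiver how many messages it has sent, and the receiver replies with how many it has actually seen. The sketch below models that comparison. The message shapes and field names are assumptions for illustration, not Gearpump's wire format, and it assumes the channel delivers in order, so an AckRequest arrives after the messages it counts.

```scala
// Hypothetical message shapes; field names are assumptions, not Gearpump's protocol.
object LossDetectionSketch {
  final case class AckRequest(sentCount: Long)
  final case class Ack(sentCount: Long, receivedCount: Long)

  final class Receiver {
    private var received = 0L
    def onMessage(): Unit = received += 1
    // Echo the sender's count next to our own, so the sender can compare them.
    def onAckRequest(req: AckRequest): Ack = Ack(req.sentCount, received)
  }

  // With in-order delivery, a receiver that has seen fewer messages than were
  // sent before the AckRequest must have lost some of them.
  def messageLost(ack: Ack): Boolean = ack.receivedCount < ack.sentCount
}
```

Because the check runs per sender and receiver pair, a detected loss also identifies which edge of the DAG failed, which supports the "when, where, and why" claim above.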
![Page 39: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/39.jpg)
39
Recover the runtime when a machine crashes
1. Quarantine
① Error detected.
② Fence the zombie: a dynamic session ID is used to fence zombies.
[Figure: the AppMaster, two Executors with their Tasks, the Source, the Store, and the Global clock service; arrows are labeled "send message".]
![Page 40: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/40.jpg)
40
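DAG Recovery: Quarantine and Recover
2. Recover the executor JVM, and replay messages
① Error detected.
② Isolate the zombie.
③ Recover.
[Figure: the AppMaster, the failed Executor and its Task, the recovered Executor and its Task, the Source (replay), the Store, and the Global clock service; arrows are labeled "send message".]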
![Page 41: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/41.jpg)
41
1. Detect Message loss
2. Pause the clock at Tp when message is lost
3. Replay from clock Tp
4. Exactly-once
![Page 42: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/42.jpg)
42
Application’s Clock Service
Definition: the task min-clock is the minimum of:
• the min timestamp of pending messages in the current task
• the task min-clock of all upstream tasks
Each task reports its task min-clock. The clock is ever-increasing.
[Figure: tasks A, B, C, D, and E arranged by level, with level clocks 1000, 800, 600, and 400 running from later to earlier.]
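The definition is recursive, so it can be written down directly. The sketch below computes it over the example DAG used earlier in the deck (A~>B~>C~>D, B~>E~>D). The pending timestamps are made-up sample data, and a task with nothing pending and no upstream simply reports `Long.MaxValue`, which is a simplification of how a real source would report its clock.

```scala
// Direct transcription of the task min-clock definition (sample data is made up).
object MinClockSketch {
  // task -> its upstream tasks, for the DAG A~>B~>C~>D, B~>E~>D
  val upstream: Map[String, List[String]] = Map(
    "A" -> Nil,
    "B" -> List("A"),
    "C" -> List("B"),
    "E" -> List("B"),
    "D" -> List("C", "E"))

  // pending: for each task, timestamps of messages it has not finished processing
  def minClock(task: String, pending: Map[String, List[Long]]): Long = {
    val own = pending.getOrElse(task, Nil)
    val fromUpstream = upstream.getOrElse(task, Nil).map(minClock(_, pending))
    (own ++ fromUpstream).foldLeft(Long.MaxValue)(_ min _)
  }

  val samplePending: Map[String, List[Long]] =
    Map("A" -> List(1000L), "B" -> List(800L), "C" -> List(600L), "D" -> List(400L))
}
```

In this sample, task E has no pending messages of its own, so its min-clock is inherited from B (800), while D, the furthest downstream, holds the earliest clock (400).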
![Page 43: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/43.jpg)
43
1. Detect Message loss
2. Pause the clock at Tp when message is lost
3. Replay from clock Tp
4. Exactly-once
![Page 44: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/44.jpg)
44
Source-based Message Replay: Normal Flow
• Replay from the very beginning, the source.
• The source keeps a mapping such as Kafka queue offset -> message timestamp.
![Page 45: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/45.jpg)
45
Source-based Message Replay: Recovery Flow
• Replay from the very beginning, the source.
① Resolve the offset with timestamp Tp.
② Replay from that offset.
[Figure: the Global Clock Service and the Source.]
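The two recovery steps can be sketched over an in-memory log, where a record's position in the vector plays the role of the queue offset. This is a simplified model, not the Kafka or Gearpump API: `Record`, `resolveOffset`, and `replay` are hypothetical names, and timestamps are assumed to be non-decreasing along the log.

```scala
// In-memory model of source replay; the vector index stands in for a queue offset.
object ReplaySketch {
  final case class Record(timestamp: Long, payload: String)

  // Step 1: resolve Tp to the first offset whose timestamp is >= Tp.
  // Assumes timestamps are non-decreasing; returns log.length if none qualifies.
  def resolveOffset(log: Vector[Record], tp: Long): Int = {
    val i = log.indexWhere(_.timestamp >= tp)
    if (i < 0) log.length else i
  }

  // Step 2: replay everything from that offset onward.
  def replay(log: Vector[Record], tp: Long): Vector[Record] =
    log.drop(resolveOffset(log, tp))
}
```

If timestamps can arrive out of order in the log, the resolved offset would have to be chosen conservatively (earlier) so that no message with timestamp >= Tp is skipped.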
![Page 46: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/46.jpg)
46
1. Detect Message loss
2. Clock pause at Tp when message is lost
3. Replay from clock Tp
4. Exactly-once
![Page 47: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/47.jpg)
47
Exactly-once message processing
• Key: ensure State(t) only contains messages with timestamp <= t.
• How? Append-only checkpoints.
[Figure: the DAG runtime writes to the Checkpoint Store.]
![Page 48: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/48.jpg)
48
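Exactly-once message processing: Normal Flow
• Two states are kept: one accepts only messages with t < Tc, the other accepts all messages.
[Figure: messages from the streaming system update both states; the Checkpoint Store holds the checkpoints.]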
![Page 49: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/49.jpg)
49
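Exactly-once message processing: Recovery Flow
• Two states are kept: one accepts only messages with t < Tc, the other accepts all messages.
• The checkpoint is recovered on failure.
[Figure: messages from the streaming system update both states; the Checkpoint Store holds the checkpoints.]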
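The two-state idea can be modeled with a simple counter: one view only counts messages older than the next checkpoint time Tc, while the other counts everything seen so far. The sketch below is an interpretation of the slides under stated assumptions, not Gearpump's state implementation. `CheckpointedCounter`, `onMinClock`, and `restore` are hypothetical names, checkpoints are taken at fixed intervals, and the boundary is `timestamp < Tc`.

```scala
// Interpretation of the two-state scheme with a counting state (hypothetical API).
object CheckpointSketch {
  final class CheckpointedCounter(interval: Long) {
    private var tc = interval                             // next checkpoint time Tc
    private var counted = 0L                              // messages with timestamp < Tc
    private var later = Vector.empty[Long]                // timestamps >= Tc, held aside
    private var checkpoints = Vector.empty[(Long, Long)]  // append-only (Tc, state)

    def onMessage(ts: Long): Unit = {
      if (ts < tc) counted += 1 else later = later :+ ts
    }

    // Once the min-clock reaches Tc, no older message can still arrive,
    // so the "t < Tc" state is final and can be appended to the store.
    def onMinClock(minClock: Long): Unit =
      while (minClock >= tc) {
        checkpoints = checkpoints :+ (tc -> counted)
        tc += interval
        val (ready, rest) = later.partition(_ < tc)
        counted += ready.size
        later = rest
      }

    def current: Long = counted + later.size              // the "accept all" view
    def saved: Vector[(Long, Long)] = checkpoints
  }

  // Recovery: start from the checkpoint at Tc and re-apply replayed messages >= Tc.
  def restore(checkpoint: (Long, Long), replayed: Seq[Long]): Long =
    checkpoint._2 + replayed.count(_ >= checkpoint._1)
}
```

Because the checkpoint at Tc contains exactly the messages before Tc, replaying the source from Tc neither drops nor double-counts a message, which is the exactly-once property the slides describe.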
![Page 50: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/50.jpg)
50
How to do Dynamic DAG? Multiple-Version DAG
Target problem: replace processor B with B’ at time Tc.
• DAG (version = 0), A B C, handles messages with Message.time < Tc.
• DAG (version = 1), A B’ C, handles messages with Message.time >= Tc.
• No message loss during the transition.
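The routing rule during the transition fits in a few lines. In the sketch below, `Message`, `DagVersion`, and `route` are hypothetical names used only to illustrate the timestamp cut-over; how Gearpump actually schedules the two versions is not shown in the slides.

```scala
// Timestamp-based cut-over between two DAG versions (hypothetical types).
object DynamicDagSketch {
  final case class Message(time: Long, payload: String)
  final case class DagVersion(version: Int, processors: List[String])

  val v0: DagVersion = DagVersion(0, List("A", "B", "C"))
  val v1: DagVersion = DagVersion(1, List("A", "B'", "C"))

  // Messages born before Tc finish on the old DAG; the rest use the new one.
  // Every message belongs to exactly one version, so none is lost or duplicated.
  def route(msg: Message, tc: Long): DagVersion =
    if (msg.time < tc) v0 else v1
}
```

Keying the switch on the message timestamp rather than on wall-clock time means the cut-over stays well defined even when messages arrive late or out of order.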
![Page 51: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/51.jpg)
51
![Page 52: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/52.jpg)
52
Live demo
![Page 53: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/53.jpg)
53
References
• 钟翔 (Zhong Xiang), "Software Architecture Paradigm for the Big Data Era: Reactive Architecture and Akka in Practice", Programmer magazine (程序员), 2015, issue 2A
• Gearpump whitepaper: http://typesafe.com/blog/gearpump-real-time-streaming-engine-using-akka
• 吴甘沙 (Wu Gansha), "The Comeback of Low-Latency Stream Processing Systems", Programmer magazine (程序员), 2013, issue 10
• Stonebraker: http://cs.brown.edu/~ugur/8rulesSigRec.pdf
• Gearpump: https://github.com/intel-hadoop/gearpump
• http://highlyscalable.wordpress.com/2013/08/20/in-stream-big-data-processing/
• https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
• Sqlstream: http://www.sqlstream.com/customers/
• http://www.statsblogs.com/2014/05/19/a-general-introduction-to-stream-processing/
• http://www.statalgo.com/2014/05/28/stream-processing-with-messaging-systems/
• Gartner report on IoT: http://www.zdnet.com/article/internet-of-things-devices-will-dwarf-number-of-pcs-tablets-and-smartphones/
![Page 54: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/54.jpg)
54
Legal Disclaimers
1. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.
2. Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.
3. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance.
Intel, the Intel logo, Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
© 2015 Intel Corporation
![Page 55: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/55.jpg)
55
Latest Performance evaluation on Gearpump 0.7
This is the latest performance test on version 0.7. Please see the embedded document on the right to see the configuration details.
![Page 56: Strata Singapore: GearpumpReal time DAG-Processing with Akka at Scale](https://reader031.fdocuments.net/reader031/viewer/2022030203/58a8b3c91a28abbd6b8b5371/html5/thumbnails/56.jpg)