Longda Feng
Alibaba
Agenda
Question and Answer.
Basic Concept & Scenarios
Background
JStorm vs Storm
Why start JStorm?
Longda Feng
Who are we?
JStorm Team was among one of the earliest that uses Storm in China. Storm 0.5.1/0.5.4/0.6.0/0.6.2/0.7.0/0.7.1 JStorm 0.7.1/0.9.0/0.9.1/0.9.2/0.9.3/…
Our Duties Application Development JStorm System Development JStorm System Operation
Alibaba
Longda Feng
What is JStorm?
JStorm is a distributed programming framework Similar to Hadoop MapReduce but designed
for real-time/in-memory scenarios Users can build powerful distributed
applications from very simple APIs
Alibaba
Longda Feng
What is JStorm?
Redesigned Storm in Java. Proved stable running in huge clusters. Much faster Much more powerful
Alibaba
Longda Feng
Advantage 1
Easy learning: Simple Building Blocks: Topology/Spout/Bolt
APIs Out of Box RPC/Fault-tolerance/Real-time
Data Grouping & Combining
Alibaba
Longda Feng
Advantage 2
Excellent Scalability Horizontally Scalable DAG-based Adjustable parallelism of each component
Alibaba
Longda Feng
Stable
Guarantees Fault-Tolerance No Single Point of Failure
• Nimbus HA• Any Supervisor can be shutdown
New worker will be spawned and replace the failed one automatically
Alibaba
Longda Feng
Accuracy
Acking framework guarantees no lost of data
Transaction framework guarantees data accuracy.
Alibaba
Longda Feng
Scenarios
Stateless Computation All data come from Tuple
Use Cases: Log Analysis Pipe-lined System Message converter Statistical Analysis Real-time Recommendation Algorithm
Alibaba
Longda Feng
Alibaba
Why start JStorm
Storm community is not as active as we’ve expected
Tailored for enterprise environment Fixed critical bugs in Storm Provided professional technical support,
improved app development pace. Reduced operational cost.
How Many Versions?
https://github.com/alibaba/JStorm/releases 0.9.6(2014/9/22) 0.9.5.1(2014/9/14) 0.9.5 (2014/8/27) 0.9.4.1 (2014/8/15) 0.9.4(2014/7/18) 0.9.3.1 (2014/5/31) 0.9.3 (2014/5/10) 0.9.2 (2014/4/8) 0.9.1(2014/1/24) 0.9.0(2013/12/30) 0.7.1(2013/4/28)
Longda Feng
Alibaba
Longda Feng
JStorm is a superset of Storm
The program run in Storm can run in JStorm without changing code
Alibaba
Longda Feng
More stable (2) -- RPC
Netty supports 2 RPC modes Async Sync
• Sending speed keeps up with the receiving speed, therefore the data flow is more stable.
Alibaba
Longda Feng
More stable(3) – resource isolation
Malicious Worker won’t mess up with others Supported CPU Isolation with cgroups Supported Memory Isolation
Resources quota can be enforced on each group (before 0.9.5)
Alibaba
Longda Feng
More stable(4) -- Monitor
Monitor every component in your Topology Many more metrics(70+) than storm
Supported user-defined metricsSupported user-defined alerts
Alibaba
Longda Feng
More stable (5) – CPU usage
Better utilizing CPU resource Improved disruptor implementation
• Drop CPU usage from 300% to 10% when processing queue is full
Avoid CPU spin-waiting• Relocating nextTuple/ack/fail work to a different
thread
Alibaba
Longda Feng
More stable(6) -- more catch
Add try-catch in any place. Nimbus/supervisor main thread Spout/bolt initialization/cleanup All IO operation, serialization/deserialization All ZK operation
Alibaba
Longda Feng
More stable(7) -- ZK
Reduced unnecessary ZK usage: Removed useless watcher Increased ZK heartbeat frequency Detect failed worker without a full scan of the
entire ZK directory
Alibaba
Longda Feng
More stable ( 8 ) -- other
Improved GC Tuning.Guaranteed that all workers killed after kill
command is issuedGuaranteed single supervisor/nimbus per
instanceAvoid excessive use of local ports by
Netty client。。。
Alibaba
Longda Feng
CPU assignment
By default assign each worker a single CPU slot
Application can be configured to utilize more slots
Why: Some task creates extra threads to do other
things in Alimama, one CPU slot doesn’t meet requirement
Alibaba
Longda Feng
Memory Usage
Default worker memory is 2GApplication can be configured to utilize
more memory slots
Why: In Alipay Mdrill application, Solr bolt will apply
much more memory
Alibaba
Longda Feng
Smarter Balancing
With JStorm Scheduler: Tasks that exchange data heavily tend to be
assigned to the same worker to avoid networking cost.
Alibaba
Longda Feng
User Defined Scheduler
User define task run one designated worker User can setting how many CPU slot /memory
slot will be used
Why: In Taobao TAE project, some bolts want to run
in user defined-nodes
Alibaba
Longda Feng
Task on Different Node
Task of one component can be scheduled to run on different nodes
Why: In ALIPAY Mdrill, Solr bolt must run different
node
Alibaba
Longda Feng
Task on Single Node
All tasks can be scheduled to run on a single node.
Why: In Taobao TLog, there are many small jobs, in
order to reduce network cost, all task of one job must run on single node.
Alibaba
Longda Feng
Old Assignment
“Last Assignment Policy” By default , a task will run on the machine it
runs previous timeWhy:
In Alibaba CDO, When restart one application, user wanted to reuse old workers
Alibaba
Longda Feng
Pluginable
Be able to run on:Hadoop yarn(more stable than storm)Alibaba Apsara Clould SystemAlibaba Elastic Resource Pool
Alibaba
Longda Feng
More convenient UI
More useful stats collected and displayed.Browse Worker Log in UI
Alibaba
Longda Feng
Support libjar
Don’t need assembly all dependency jars into one jar Submit libjar with libjar parameter Support worker.classpath
Alibaba
Longda Feng
Faster
6 Servers (24core/98G)18 Spout/18 Bolt/18 Acker
Alibaba
0 10 20 30 40 50 600
2000000
4000000
6000000
8000000
10000000
12000000
62436806830500
5595900 5474180
3379800
9280598
10818815
9065965
6819139
5610201
Throughput vs workers
jstorm
storm
workers
pollt
uple
s/10
s
Longda Feng
Why Faster
Reduce memory-copying by zeroMq Dedicated Deserializing ThreadBetter Tuned Sampling LogicBetter Tuned Acking FrameworkBetter Tuned GC
Alibaba
Longda Feng
Other Improvement
More than 100 improvements https://github.com/alibaba/JStorm/blob/master/history.md
Fixed assign topology competition Reset rebalance/reassigned worker timeout as 4 minutes Graceful worker shutdown Improvement on thrift server Avoid mistakenly killing of worker while rebalancing jobs.。。。。
Alibaba
Longda Feng
More document
https://github.com/alibaba/JStorm/wiki
Google-group:[email protected]
Wangwang: JStormQQ: 228374502Laiwang: JStorm
Alibaba
Top Related