An Introduction to JStorm
Longda Feng ([email protected])

Alibaba

Agenda

Background
Basic Concepts & Scenarios
Why start JStorm?
JStorm vs Storm
Questions and Answers

Who are we?

The JStorm team was among the earliest users of Storm in China.
Storm 0.5.1/0.5.4/0.6.0/0.6.2/0.7.0/0.7.1
JStorm 0.7.1/0.9.0/0.9.1/0.9.2/0.9.3/…

Our duties:
• Application development
• JStorm system development
• JStorm system operation

Who Is Using JStorm?

Many small Chinese companies are using JStorm

How Big?

More than 3000 servers
More than 3 trillion messages per day

What is JStorm?

JStorm is a distributed programming framework.
Similar to Hadoop MapReduce, but designed for real-time/in-memory scenarios.
Users can build powerful distributed applications from very simple APIs.

What is JStorm?

Storm, redesigned in Java.
Proven stable when running in huge clusters.
Much faster.
Much more powerful.

Basic Concepts

Pipelined data processing

Advantage 1

Easy to learn:
• Simple building blocks: Topology/Spout/Bolt APIs
• Out of the box: RPC, fault tolerance, real-time processing
• Data grouping and combining
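To make the idea of data grouping concrete, here is a minimal sketch in plain Java. It is not the JStorm API: the class `FieldsGroupingSketch` and the methods `chooseTask` and `route` are hypothetical names used only for illustration. It assumes key-based grouping, where tuples carrying the same key are always routed to the same bolt task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of key-based grouping; not the JStorm API. */
final class FieldsGroupingSketch {

    /** Maps a key to a task index in [0, numTasks); equal keys always map to the same task. */
    static int chooseTask(String key, int numTasks) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), numTasks);
    }

    /** Distributes words into one bucket per simulated bolt task. */
    static List<List<String>> route(List<String> words, int numTasks) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String word : words) {
            buckets.get(chooseTask(word, numTasks)).add(word);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<List<String>> buckets =
                route(Arrays.asList("storm", "jstorm", "storm", "bolt"), 3);
        for (int i = 0; i < buckets.size(); i++) {
            System.out.println("task " + i + " -> " + buckets.get(i));
        }
    }
}
```

Because the task is derived only from the key, a per-key aggregation (for example a word count) can keep its state local to a single task.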

Advantage 2

Excellent scalability:
• Horizontally scalable
• DAG-based
• Adjustable parallelism for each component

Stable

Guarantees fault tolerance
No single point of failure
• Nimbus HA
• Any Supervisor can be shut down

A new worker is spawned automatically to replace a failed one

Accuracy

The acking framework guarantees that no data is lost.

Transaction framework guarantees data accuracy.
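One way such an acking framework can track a whole tuple tree cheaply is the XOR checksum used by Storm's acker. The sketch below assumes JStorm follows the same idea; it is plain Java, and the class `AckTrackerSketch` with its methods is a hypothetical illustration, not JStorm code. Every tuple id is XORed into a per-root value once when the tuple is created and once when it is acked, so the value returns to zero exactly when every tuple in the tree has been acked.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of XOR-based ack tracking; not JStorm code. */
final class AckTrackerSketch {
    // rootId -> XOR of all tuple ids that were created but not yet acked
    private final Map<Long, Long> pending = new HashMap<>();

    /** Registers the root tuple emitted by a spout. */
    void emitRoot(long rootId, long tupleId) {
        pending.put(rootId, tupleId);
    }

    /**
     * Applies an update for the given root. A bolt that acks tuple A after
     * emitting children B and C would send A ^ B ^ C.
     * Returns true when the tree is fully processed.
     */
    boolean update(long rootId, long xorValue) {
        Long current = pending.get(rootId);
        if (current == null) {
            return false; // unknown root, or already completed
        }
        long next = current ^ xorValue;
        if (next == 0L) {
            pending.remove(rootId);
            return true;
        }
        pending.put(rootId, next);
        return false;
    }

    boolean isPending(long rootId) {
        return pending.containsKey(rootId);
    }

    public static void main(String[] args) {
        AckTrackerSketch tracker = new AckTrackerSketch();
        long a = 0x1111L, b = 0x2222L, c = 0x4444L; // fixed ids for the demo
        tracker.emitRoot(1L, a);
        tracker.update(1L, a ^ b ^ c); // ack A, which emitted B and C
        tracker.update(1L, b);         // ack B
        System.out.println("complete: " + tracker.update(1L, c)); // ack C
    }
}
```

A real system would use random 64-bit ids so that an accidental zero is extremely unlikely, and would replay the root tuple from the spout if the value does not reach zero within a timeout.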

Scenarios

Stateless computation: all data comes from tuples

Use cases:
• Log analysis
• Pipelined systems
• Message conversion
• Statistical analysis
• Real-time recommendation algorithms

Why start JStorm?

The Storm community is not as active as we expected.

• Tailored for enterprise environments
• Fixed critical bugs in Storm
• Provided professional technical support and improved the pace of app development
• Reduced operational cost

How Many Versions?

https://github.com/alibaba/JStorm/releases
0.9.6 (2014/9/22)
0.9.5.1 (2014/9/14)
0.9.5 (2014/8/27)
0.9.4.1 (2014/8/15)
0.9.4 (2014/7/18)
0.9.3.1 (2014/5/31)
0.9.3 (2014/5/10)
0.9.2 (2014/4/8)
0.9.1 (2014/1/24)
0.9.0 (2013/12/30)
0.7.1 (2013/4/28)

JStorm is a superset of Storm

Programs that run on Storm can run on JStorm without code changes.

More stable (1) -- Nimbus HA

Dual-Nimbus HA

More stable (2) -- RPC

Netty supports 2 RPC modes:
Async
Sync
• Sending speed matches the receiving speed, so the data flow is more stable.
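The effect described for sync mode resembles backpressure through a bounded channel: the sender blocks whenever the receiver falls behind. The sketch below is an analogy in plain Java, not Netty or JStorm code, and it assumes that this is what the sync mode achieves. The class `SyncSendSketch` and the method `transfer` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical backpressure analogy using a bounded queue; not Netty or JStorm code. */
final class SyncSendSketch {

    /** Sends count messages through a channel of the given capacity and returns what was received. */
    static List<Integer> transfer(int count, int capacity) {
        BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(capacity);
        List<Integer> received = new ArrayList<>();

        Thread receiver = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    received.add(channel.take()); // waits until a message is available
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();

        try {
            for (int i = 0; i < count; i++) {
                channel.put(i); // blocks while the channel is full, throttling the sender
            }
            receiver.join(); // join makes the receiver's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sending", e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println("received " + transfer(1000, 8).size() + " messages");
    }
}
```

With an unbounded queue (closer to a fire-and-forget async mode) a fast sender could build up an arbitrarily large backlog; the bounded queue caps in-flight messages at `capacity`.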

More stable (3) -- resource isolation

A malicious worker won't interfere with others
• CPU isolation with cgroups
• Memory isolation
• Resource quotas can be enforced on each group (before 0.9.5)

More stable (4) -- Monitor

Monitor every component in your topology
• Many more metrics (70+) than Storm
• Supports user-defined metrics
• Supports user-defined alerts

More stable (5) -- CPU usage

Better utilization of CPU resources
Improved disruptor implementation
• CPU usage drops from 300% to 10% when the processing queue is full
Avoids CPU spin-waiting
• Relocated nextTuple/ack/fail work to a different thread

More stable (6) -- more try-catch

Added try-catch everywhere:
• Nimbus/Supervisor main threads
• Spout/bolt initialization/cleanup
• All IO operations, serialization/deserialization
• All ZK operations

More stable (7) -- ZK

Reduced unnecessary ZK usage:
• Removed useless watchers
• Increased ZK heartbeat frequency
• Detects failed workers without a full scan of the entire ZK directory

More stable (8) -- other

• Improved GC tuning
• Guaranteed that all workers are killed after the kill command is issued
• Guaranteed a single supervisor/nimbus per instance
• Avoided excessive use of local ports by the Netty client
• ...

More powerful scheduler

Balances tasks with regard to:
• CPU
• Memory
• Disk
• Network

CPU assignment

By default, each worker is assigned a single CPU slot

Applications can be configured to use more slots

Why: Some tasks create extra threads to do other work; in Alimama, one CPU slot doesn't meet the requirement

Memory Usage

Default worker memory is 2 GB
Applications can be configured to use more memory slots

Why: In the Alipay Mdrill application, the Solr bolt requests much more memory

Smarter Balancing

With the JStorm scheduler: tasks that exchange data heavily tend to be assigned to the same worker to avoid network cost.
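A traffic-aware placement of this kind can be approximated with a simple greedy heuristic. The sketch below is not JStorm's scheduler: the class `ColocationSketch`, the `Link` type, and the `maxTasksPerWorker` limit are assumptions made for illustration. It visits links from heaviest to lightest and merges the two endpoints into one group (one worker) whenever the merged group still fits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical greedy co-location heuristic; not JStorm's actual scheduler. */
final class ColocationSketch {

    /** A data flow between two tasks with an observed traffic volume. */
    static final class Link {
        final int from;
        final int to;
        final long traffic;

        Link(int from, int to, long traffic) {
            this.from = from;
            this.to = to;
            this.traffic = traffic;
        }
    }

    /** Returns group[i], the group id of task i; tasks with the same id would share a worker. */
    static int[] group(int numTasks, List<Link> links, int maxTasksPerWorker) {
        int[] group = new int[numTasks];
        int[] size = new int[numTasks]; // size[g] is valid for ids still in use
        for (int i = 0; i < numTasks; i++) {
            group[i] = i;
            size[i] = 1;
        }
        List<Link> sorted = new ArrayList<>(links);
        sorted.sort((a, b) -> Long.compare(b.traffic, a.traffic)); // heaviest first
        for (Link link : sorted) {
            int g1 = group[link.from];
            int g2 = group[link.to];
            if (g1 == g2 || size[g1] + size[g2] > maxTasksPerWorker) {
                continue; // already together, or the merged group would not fit
            }
            for (int i = 0; i < numTasks; i++) {
                if (group[i] == g2) {
                    group[i] = g1; // move every task of g2 into g1
                }
            }
            size[g1] += size[g2];
        }
        return group;
    }

    public static void main(String[] args) {
        List<Link> links = Arrays.asList(
                new Link(0, 1, 100), new Link(1, 2, 10), new Link(2, 3, 90));
        System.out.println(Arrays.toString(group(4, links, 2)));
    }
}
```

In this example the heavy links 0-1 and 2-3 stay inside a worker, and only the light link 1-2 crosses the network.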

User Defined Scheduler

Users can designate the worker on which a task runs
Users can set how many CPU slots/memory slots will be used

Why: In the Taobao TAE project, some bolts need to run on user-defined nodes

Tasks on Different Nodes

Tasks of one component can be scheduled to run on different nodes

Why: In Alipay Mdrill, the Solr bolts must run on different nodes

Tasks on a Single Node

All tasks can be scheduled to run on a single node.

Why: In Taobao TLog there are many small jobs; to reduce network cost, all tasks of one job must run on a single node.

Old Assignment

“Last Assignment Policy”
By default, a task will run on the machine it ran on previously

Why: In Alibaba CDO, when restarting an application, users wanted to reuse the old workers

Pluggable

Able to run on:
• Hadoop YARN (more stable than Storm)
• Alibaba Apsara Cloud System
• Alibaba Elastic Resource Pool

Classloader

Resolves jar conflicts between applications and JStorm

More convenient UI

More useful stats collected and displayed.
Browse worker logs in the UI.

Support libjar

No need to assemble all dependency jars into one jar
Submit library jars with the libjar parameter
Supports worker.classpath

Faster

6 servers (24 cores/98 GB)
18 spouts/18 bolts/18 ackers

[Figure: Throughput vs workers, comparing JStorm and Storm. X axis: workers; Y axis: poll tuples per 10 s]

JStorm at a sending speed of 41W/s (410,000 per second)

Storm at a sending speed of 41W/s (410,000 per second)

Why Faster

• Reduced ZeroMQ memory copying
• Dedicated deserializing thread
• Better-tuned sampling logic
• Better-tuned acking framework
• Better-tuned GC

Other Improvements

More than 100 improvements: https://github.com/alibaba/JStorm/blob/master/history.md

• Fixed competition during topology assignment
• Set the rebalance/reassigned worker timeout to 4 minutes
• Graceful worker shutdown
• Improved the thrift server
• Avoided mistakenly killing workers while rebalancing jobs
• ...

More Documentation

https://github.com/alibaba/JStorm/wiki

Google group: [email protected]

Wangwang: JStorm
QQ: 228374502
Laiwang: JStorm

Join us

You are welcome to join us: [email protected]

纪君祥 (Longda Feng)