Real-time Stream Processing using Apache Apex

Transcript of Real-time Stream Processing using Apache Apex

Bhupesh Chawda [email protected]

DataTorrent Software

Apache Apex - Stream Processing

● YARN-Native - Uses the Hadoop YARN framework for resource negotiation

● Highly Scalable - Scales statically as well as dynamically

● Highly Performant - Can reach single-digit millisecond end-to-end latency

● Fault Tolerant - Automatically recovers from failures - without manual intervention

● Stateful - Guarantees that no state will be lost

● Easily Operable - Exposes an easy API for developing Operators (part of an application) and Applications

Project History

● Project development started in 2012 at DataTorrent

● Open-sourced in July 2015

● Apache Apex started incubation in August 2015

● 50+ committers from Apple, GE, Capital One, DirecTV, Silver Spring Networks, Barclays, Ampool and DataTorrent

● Mentors from Class Software, MapR and Hortonworks

● Soon to be a top-level Apache project

Apex Platform Overview

An Apex Application is a DAG (Directed Acyclic Graph)

● A DAG is composed of vertices (Operators) and edges (Streams)

● A Stream is a sequence of data tuples which connects operators at end-points called Ports

● An Operator takes one or more input streams, performs computations and emits one or more output streams

● Each operator is either the user's business logic or a built-in operator from the open-source library

● An operator may have multiple instances that run in parallel
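
To make the DAG model concrete, here is a minimal, self-contained Java sketch. This is not the Apex API: the class and method names (MiniDag, addOperator, addStream, isAcyclic) are illustrative only. It models an application as operators connected by streams, with the acyclicity check a DAG requires:

```java
import java.util.*;

// Illustrative model (not the Apex API): an application is a graph of
// operators (vertices) connected by streams (edges); it must be acyclic.
class MiniDag {
    private final Set<String> operators = new HashSet<>();
    private final Map<String, List<String>> edges = new HashMap<>(); // operator -> downstream operators

    void addOperator(String name) {
        operators.add(name);
        edges.putIfAbsent(name, new ArrayList<>());
    }

    // A stream connects an upstream operator's output port to a downstream input port.
    void addStream(String from, String to) {
        if (!operators.contains(from) || !operators.contains(to))
            throw new IllegalArgumentException("unknown operator");
        edges.get(from).add(to);
    }

    // A valid application graph has no cycles; check with a depth-first search.
    boolean isAcyclic() {
        Set<String> done = new HashSet<>(), onStack = new HashSet<>();
        for (String op : operators)
            if (!done.contains(op) && hasCycle(op, done, onStack)) return false;
        return true;
    }

    private boolean hasCycle(String op, Set<String> done, Set<String> onStack) {
        onStack.add(op);
        for (String next : edges.get(op)) {
            if (onStack.contains(next)) return true;                          // back edge = cycle
            if (!done.contains(next) && hasCycle(next, done, onStack)) return true;
        }
        onStack.remove(op);
        done.add(op);
        return false;
    }
}
```

In the real API the two calls correspond to the "Add Operators" and "Add Streams" steps of an application specification.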

Hadoop 1.0 vs 2.0 - YARN

Apex as a YARN Application

● YARN (Hadoop 2.0) replaces MapReduce with a more generic Resource Management Framework

● Apex uses YARN for resource management and HDFS for any persistent storage

Support for Windowing

● Apex splits incoming tuples into finite time slices - Streaming Windows

○ Transparent to the user

○ Apex Default = 500 ms

● Checkpointing and book-keeping done at Streaming window boundary

● Applications may need to perform computations in windows - Application Windows

○ Specified as a multiple of the Streaming Window size

○ Call backs to user operator logic

■ beginWindow(long windowId)

■ endWindow()

○ Example - An application which computes some aggregates and emits them every minute. Here application window size = 60 secs = 120 Streaming Windows (at the 500 ms default)

● Sliding and Tumbling Application windows are supported natively
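
A minimal sketch of how an application window is built from streaming windows. Only beginWindow/endWindow mirror the callbacks named above; the rest (the WindowedSum class, the process method, the framework wiring) is an illustrative assumption, not Apex code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: the framework invokes beginWindow/endWindow once per streaming
// window; the operator emits an aggregate every N streaming windows, i.e.
// once per (tumbling) application window.
class WindowedSum {
    private final int appWindowCount; // application window as a multiple of streaming windows
    private long sum = 0;
    private int windowsSeen = 0;
    private final List<Long> emitted = new ArrayList<>();

    WindowedSum(int appWindowCount) { this.appWindowCount = appWindowCount; }

    void beginWindow(long windowId) { /* per-streaming-window setup would go here */ }

    void process(long tuple) { sum += tuple; } // called for each tuple in the window

    void endWindow() {
        if (++windowsSeen % appWindowCount == 0) { // application window boundary
            emitted.add(sum);                      // emit the aggregate
            sum = 0;                               // tumbling window: reset state
        }
    }

    List<Long> emitted() { return emitted; }
}
```

With appWindowCount = 120 and the 500 ms default, this would emit once per minute, as in the example above.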

Buffer Server

● Staging area for outgoing tuples

● Downstream operators connect to upstream Buffer Server to subscribe for tuples

● Plays a role in recovery by replaying data to the downstream operator from a particular checkpoint

● Spooling to disk is also supported
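
The buffer server's replay role can be sketched as follows. BufferServer, publish, subscribeFrom and purgeUpTo are illustrative names for this model, not the Apex classes, and spooling to disk is elided:

```java
import java.util.*;

// Sketch: the upstream operator publishes tuples tagged with their window
// id; a recovering subscriber re-reads from its checkpoint window onward
// instead of asking the source to resend.
class BufferServer {
    private final NavigableMap<Long, List<String>> byWindow = new TreeMap<>();

    void publish(long windowId, String tuple) {
        byWindow.computeIfAbsent(windowId, w -> new ArrayList<>()).add(tuple);
    }

    // Replay everything from the subscriber's checkpointed window (inclusive).
    List<String> subscribeFrom(long checkpointWindow) {
        List<String> replay = new ArrayList<>();
        for (List<String> tuples : byWindow.tailMap(checkpointWindow, true).values())
            replay.addAll(tuples);
        return replay;
    }

    // Once the whole DAG has committed a window, older buffered data can go.
    void purgeUpTo(long committedWindow) {
        byWindow.headMap(committedWindow, false).clear();
    }
}
```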

Fault Tolerance - Checkpointing

● During checkpointing, all operator state is written to HDFS asynchronously

● This is decentralized and happens independently for each operator

● If all operators in the DAG have checkpointed a particular window, then that window is said to be committed and all previous checkpoints are purged

Operator              O1    O2    O3    O4
Checkpoint #           3     3     3     2
Checkpoint Window #  180   180   180   120

Committed Checkpoint # = 2
Committed Window # = 120
Checkpoint interval = 60 Streaming Windows
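
The commit rule in the example above reduces to taking the minimum checkpointed window across all operators in the DAG. A one-method sketch (CommitTracker is an illustrative name, not an Apex class):

```java
// Sketch of the commit rule: a window is committed once every operator has
// checkpointed it, so the committed window is the minimum checkpointed
// window across the operators of the DAG.
class CommitTracker {
    static long committedWindow(long[] checkpointWindows) {
        long min = Long.MAX_VALUE;
        for (long w : checkpointWindows) min = Math.min(min, w);
        return min;
    }
}
```

For the four operators above, with checkpoint windows 180, 180, 180 and 120, the committed window is 120.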

Recovery

● The Apex Application Master detects the failure of an operator based on missing heartbeats from the operators, or on windows that are not progressing

● All downstream operators from the failed operator are restarted from the last committed checkpoint to recover their states

● Data is replayed from the same checkpoint by the Buffer Server

● Recovery is automatic and does not require manual intervention

Scalability - Partitioning

● Operators can be “replicated” (partitioned) into multiple instances to cope with high-speed input streams

● Can be specified at application launch time

● The user can control the distribution of tuples to downstream partitions

● An automatic Unifier unifies the tuples from the partitions
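
Partitioning plus a unifier can be sketched as a partial-aggregation pipeline. This models the idea only: hashing stands in for the user-controlled tuple distribution, and the sequential loops stand in for partitions that Apex would run in parallel; PartitionedCount and countWords are illustrative names:

```java
import java.util.*;

// Sketch: distribute tuples to partitions, let each partition compute a
// partial result, then let a unifier merge the partials.
class PartitionedCount {
    static Map<String, Integer> countWords(List<String> words, int partitions) {
        // 1. Distribute tuples to partitions (here by key hash).
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < partitions; i++) shards.add(new ArrayList<>());
        for (String w : words)
            shards.get(Math.floorMod(w.hashCode(), partitions)).add(w);

        // 2. Each partition counts its own shard (would run in parallel).
        List<Map<String, Integer>> partials = new ArrayList<>();
        for (List<String> shard : shards) {
            Map<String, Integer> m = new HashMap<>();
            for (String w : shard) m.merge(w, 1, Integer::sum);
            partials.add(m);
        }

        // 3. The unifier merges the partial counts into the final result.
        Map<String, Integer> unified = new HashMap<>();
        for (Map<String, Integer> p : partials)
            p.forEach((k, v) -> unified.merge(k, v, Integer::sum));
        return unified;
    }
}
```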

Scalability - Dynamic Scaling

● Auto-scaling is also supported. The number of partitions may automatically increase or decrease based on the incoming load; this can be customized by the user

● The user has to define the trigger for auto-scaling

○ Example - Increase partitions if latency goes above 100 ms

Apex Processing Semantics

● AT_LEAST_ONCE (default): Windows are processed at least once

● AT_MOST_ONCE: Windows are processed at most once

○ During recovery, all downstream operators are fast-forwarded to the window of the latest checkpoint

● EXACTLY_ONCE: Windows are processed exactly once

○ Checkpoint every window

○ Checkpointing becomes blocking
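
The three guarantees differ in where processing resumes after a failure. A sketch under stated assumptions: RecoveryPosition and resumeWindow are illustrative names, real recovery also restores operator state, and window ids are modeled as plain counters:

```java
// Sketch: after a failure, with state checkpointed at checkpointWindow and
// data seen up to latestWindow, each guarantee resumes at a different window.
class RecoveryPosition {
    enum Mode { AT_LEAST_ONCE, AT_MOST_ONCE, EXACTLY_ONCE }

    static long resumeWindow(Mode mode, long checkpointWindow, long latestWindow) {
        switch (mode) {
            case AT_LEAST_ONCE:
                return checkpointWindow + 1; // replay from the checkpoint: windows may repeat
            case AT_MOST_ONCE:
                return latestWindow + 1;     // fast-forward: unprocessed windows may be skipped
            case EXACTLY_ONCE:
                return checkpointWindow + 1; // checkpoint every window, so nothing repeats
            default:
                throw new IllegalArgumentException();
        }
    }
}
```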

Apex Guarantees

● Apex guarantees no loss of data and computational state - checkpointed periodically

● Automatic recovery ensures that processing resumes from where it left off

● Order of incoming data is guaranteed to be maintained

○ Not applicable in case of partitioning of operators

● Events in a window are always replayed in the same window in case of failures

Application Specification

1. Add Operators

2. Add Streams

Logical and Physical DAGs

Apex Malhar Library

1. Performance requirements

a. A system which can provide very low latency for decision making (40 ms)

b. Ability to handle large volumes of data and ever-changing rules (1,000 events per 20 ms burst)

c. 99.5% uptime, which is about 1.8 days of downtime in a year

➔ Apex achieved:

◆ 2 ms latency against the requirement of 40 ms

◆ Was able to handle 2,000-event bursts against the requirement of 1,000-event bursts, at a net rate of 70,000 events/s

◆ 99.9995% uptime against the requirement of 99.5% uptime

2. Relevant roadmap

3. Enterprise grade

4. A healthy and diverse community of committers, i.e. not controlled by one vendor

Talk Slides: http://www.slideshare.net/ilganeli/nextgen-decision-making-in-under-2ms

DataTorrent Blog: https://www.datatorrent.com/blog/next-gen-decision-making-in-under-2-milliseconds/

Decision Making in < 2ms

Decision Making in < 2ms contd.

● Comparison finally boiled down to

○ Apache Storm

○ Apache Flink

○ Apache Apex

● Some problems in Storm and Flink, among others

○ Nimbus is a single point of failure (Storm)

○ Bolts / Spouts / Operators share a JVM. Hard to debug

○ No dynamic topologies

○ Restarting entire topologies in case of failures

Resources

● Mailing List

○ Developers: [email protected]

○ Users: [email protected]

● Apache Apex: http://apex.apache.org/

● Github

○ Apex Core: http://github.com/apache/incubator-apex-core

○ Apex Malhar: http://github.com/apache/incubator-apex-malhar

● DataTorrent: http://www.datatorrent.com

● Twitter: @ApacheApex - https://twitter.com/apacheapex

● Facebook: https://www.facebook.com/ApacheApex/

● Meetup: http://www.meetup.com/topics/apache-apex

● Startup Program: Free Enterprise License for Startups, Universities, Non-Profits

Thank you!

Please send your questions to [email protected]