Spark meetup stream processing use cases

Stream Processing: Key Driver for Enabling Instant Insights on Big Data

Mohit Jotwani, Product Manager, DataTorrent


Why is Stream Processing Vital?

Traditional Analytics – Data at Rest

[Diagram: source data (MS queues, events, XML files, databases, sensor data, social) arrives as Feed 1, Feed 2, ..., Feed m and goes through Extract, Transform, Load, with an optional staging area, into enterprise repositories (RDBMS, EDW, NoSQL). Feed 1, Feed 2, ..., Feed n then serve the Analyze and Visualize steps: business analytics, business intelligence, visualization tools.]

Next Generation – Data in Motion

• Organizations need to react to changing business conditions in real time
  • Faster decision making across all industries
  • Few companies outside of financial markets, telecom & utilities have experience with streaming

• Newer data sources – like sensors, social media feeds
  • Higher volume and greater velocity
  • More unstructured and semi-structured data

• Democratization of technologies
  • Open source projects
  • Large-scale compute & storage – Hadoop, NoSQL
  • Streaming technologies – Apex, Spark, Storm, etc.
  • Real-time dashboards and alert notification systems

• Beyond niche use cases
  • Broad applicability but needs more adoption

Stream vs. Batch Processing Pipelines

Stream pipeline: Ingest, Transform, Analyze, Visualize/Persist, Action

Batch pipeline: Extract, Transform, Load, Analyze, Action

Stream Processing

• Continuous processing on data as it flows through a system

• Allows users to act on events instantaneously via alerts

• Processing related to time (event time vs. processing time)

• Real-time – the difference between event time and processing time is negligible

Enables your Data In Motion Architecture
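The distinction between event time and processing time can be made concrete with a small sketch. The `Event` class, its field names, and the one-second tolerance below are illustrative assumptions, not part of the talk or of any streaming engine's API.

```python
from dataclasses import dataclass


@dataclass
class Event:
    key: str
    event_time: float       # seconds since epoch: when the event happened at the source
    processing_time: float  # seconds since epoch: when the pipeline handled it


def lag_seconds(event: Event) -> float:
    """Delay between the event occurring and the pipeline processing it."""
    return event.processing_time - event.event_time


def is_real_time(event: Event, tolerance_s: float = 1.0) -> bool:
    """Treat an event as 'real-time' when its lag is within the tolerance.

    The tolerance is an assumed, application-specific value.
    """
    return lag_seconds(event) <= tolerance_s


events = [
    Event("sensor-1", event_time=100.0, processing_time=100.2),
    Event("sensor-2", event_time=100.0, processing_time=145.0),
]
flags = [is_real_time(e) for e in events]
```

What counts as "negligible" depends on the use case: an alerting system may tolerate less than a second, a dashboard perhaps several.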

Big Data Application Types

[Diagram labels: IoT, Fraud, CDR, CDC, Reporting, SQL, Operations, Data Discovery, SQL on Streams, Streaming Discovery]

Sample Streaming Analytics Patterns

Preprocessing
• Filtering events
• Transforming attributes
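As a rough illustration of the preprocessing pattern, the sketch below filters out unusable events and transforms an attribute in one pass. The field names (`sensor`, `temp_f`, `temp_c`) and the sample data are made up for the example.

```python
from typing import Dict, Iterable, Iterator


def preprocess(events: Iterable[Dict]) -> Iterator[Dict]:
    """Filter out events without a reading, then convert Fahrenheit to Celsius."""
    for event in events:
        # Filtering: drop events that carry no usable reading.
        if event.get("temp_f") is None:
            continue
        # Transforming: emit a new attribute in the unit downstream steps expect.
        yield {
            "sensor": event["sensor"],
            "temp_c": round((event["temp_f"] - 32) * 5 / 9, 1),
        }


raw = [
    {"sensor": "a", "temp_f": 212.0},
    {"sensor": "b", "temp_f": None},
    {"sensor": "c", "temp_f": 32.0},
]
cleaned = list(preprocess(raw))
```

Because `preprocess` is a generator, it handles one event at a time and never needs the whole stream in memory.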

Alerts & Thresholds
• Based on complex conditions
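One possible reading of a "complex condition" is a threshold that must hold for several consecutive readings before an alert fires. The sketch below assumes that interpretation; the function name, the threshold, and the streak length are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def sustained_alerts(
    readings: Iterable[Tuple[int, float]],
    threshold: float,
    min_consecutive: int,
) -> Iterator[int]:
    """Yield the timestamp at which a value has exceeded the threshold
    for `min_consecutive` readings in a row (fires once per streak)."""
    streak = 0
    for timestamp, value in readings:
        streak = streak + 1 if value > threshold else 0
        if streak == min_consecutive:
            yield timestamp


readings = [(1, 70.0), (2, 95.0), (3, 96.0), (4, 97.0), (5, 98.0), (6, 60.0), (7, 99.0)]
alerts = list(sustained_alerts(readings, threshold=90.0, min_consecutive=3))
```

Requiring a streak rather than a single reading is a common way to avoid alerting on momentary spikes.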

Computing within Windows
• Aggregations
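A windowed aggregation can be sketched as a sum over fixed-size (tumbling) windows keyed by event time. This is a simplified, in-memory illustration over a finite list; a real streaming engine would emit each window's result incrementally as the stream advances. The function name and the 10-second window are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_sums(
    events: Iterable[Tuple[int, float]], window_s: int
) -> Dict[int, float]:
    """Sum values per tumbling window, keyed by the window's start time."""
    sums: Dict[int, float] = defaultdict(float)
    for event_time, value in events:
        # Every event belongs to exactly one window: [start, start + window_s).
        window_start = (event_time // window_s) * window_s
        sums[window_start] += value
    return dict(sums)


events = [(0, 1.0), (5, 2.0), (12, 4.0), (19, 1.5), (20, 3.0)]
totals = tumbling_window_sums(events, window_s=10)
```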

Combining Event Streams
• Correlation
• Error detection

Enrichment
• Looking up database, reference data
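Enrichment can be pictured as joining each event with reference data looked up by key. In the sketch below the reference table is a plain dictionary standing in for a database; the keys, field names, and the "unknown" fallback are illustrative assumptions.

```python
from typing import Dict, Iterable, Iterator

# Stand-in for a reference table that would normally live in a database.
REFERENCE: Dict[str, Dict[str, str]] = {
    "d-1": {"site": "plant-a"},
    "d-2": {"site": "plant-b"},
}


def enrich(events: Iterable[Dict], reference: Dict[str, Dict[str, str]]) -> Iterator[Dict]:
    """Attach reference attributes to each event, with a fallback for unknown keys."""
    for event in events:
        extra = reference.get(event["device_id"], {"site": "unknown"})
        yield {**event, **extra}


incoming = [{"device_id": "d-1", "value": 1}, {"device_id": "d-9", "value": 2}]
enriched = list(enrich(incoming, REFERENCE))
```

Keeping the lookup in memory, or caching it, matters in practice because a remote query per event would add latency to every record.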

Temporal Events
• Detecting events within time windows

Tracking
• Tracking events over space & time

Trend Detection
• Rise, Fall
• Outliers

Source: https://iwringer.wordpress.com/2015/08/03/patterns-for-streaming-realtime-analytics/
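The trend detection pattern (rise, fall, outliers) can be approximated over a window of recent values. The sketch below uses two deliberately simple heuristics that are assumptions for illustration, not methods from the talk or the cited post: comparing the means of the two halves of the window, and flagging values more than a chosen number of standard deviations from the mean.

```python
from statistics import mean, pstdev
from typing import List


def trend(values: List[float]) -> str:
    """Classify a window as 'rise', 'fall', or 'flat' by comparing half-window means."""
    if len(values) < 2:
        return "flat"
    half = len(values) // 2
    first, second = mean(values[:half]), mean(values[half:])
    if second > first:
        return "rise"
    if second < first:
        return "fall"
    return "flat"


def outliers(values: List[float], z: float = 2.0) -> List[float]:
    """Return values whose distance from the mean exceeds z standard deviations."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > z]


rising = trend([10, 11, 12, 13, 14, 15])
falling = trend([5, 4, 3, 2])
spikes = outliers([10, 10, 10, 10, 10, 10, 10, 10, 10, 50])
```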

Stream Processing Use Cases

Financial Services

• Detect fraudulent activity in real-time

• Risk Analysis

• Deliver personalized products and offerings

• Make decisions in real-time for trading and transactional platforms

Financial services big data fabric

Telecom

• Real-time network monitoring and protection

• Quality of service and customer satisfaction

• Take action based on users’ location

• Automatic resource allocation and load balancing

Online Advertising

• Dynamic bidding

• Real-time targeting & personalization

• Maximize click-through and conversion rates

• Reporting that can be updated continuously

Online advertising dynamic inventory purchases

Internet of Things

• Environment monitoring

• Infrastructure management

• Manufacturing

• Energy management

• Public Building & Home automation

• Transportation

IoT secure ingestion and predictive analysis

High-performance, multi-customer, secure data ingestion. Complex event processing with historical data for predictive maintenance.

[Diagram labels: Sensor 1, Sensor 2, Sensor N; Application 1, Application n; Persistent; Data Governance; Complex Event Process; Predictive maintenance]

Stream Processing: Conclusion

• Lots of untapped potential!
  • Gives your business a competitive edge!

• Open Source and Big Data technologies
  • Built to address the scale and latency demands

• Broad use cases
  • Across industries and verticals

https://www.datatorrent.com/careers/


Thank You