Spark meetup stream processing use cases
Uploaded by: punesparkmeetup
Category: Technology
Stream Processing: Key Driver for Enabling Instant Insights on Big Data
Mohit Jotwani, Product Manager, DataTorrent
Traditional Analytics – Data at Rest
Source data: MS Queues, Events, XML Files, Databases, Sensor data, Social
Feeds: Feed 1, Feed 2, ... Feed m
Stages: Extract, Transform, Load, (Optional) Staging Area, Analyze, Visualize
Enterprise repositories: RDBMS, EDW, NoSQL
Feeds: Feed 1, Feed 2, ... Feed n
Analytics and visualization: Business Analytics, Business Intelligence, Visualization Tools
Next Generation – Data in Motion
• Organizations need to react to changing business conditions in real time
  • Faster decision making across all industries
  • Few companies outside of financial markets, telecom & utilities have experience with streaming
• Newer data sources – like sensors, social media feeds
  • Higher Volume and Greater Velocity
  • More unstructured and semi-structured data
• Democratization of technologies
  • Open Source Projects
  • Large Scale Compute & Storage – Hadoop, NoSQL
  • Streaming Technologies – Apex, Spark, Storm etc.
  • Real-time dashboards and alert notification systems
• Beyond niche use cases
  • Broad applicability but needs more adoption
Stream vs. Batch Processing Pipelines
Stream: Ingest, Transform, Analyze, Action, Visualize/Persist
Batch: Extract, Transform, Load, Analyze, Action
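The difference between the two pipelines can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the event fields (`id`, `amount_cents`), the `transform` step and the `analyze` rule are all hypothetical. The batch version loads everything before analyzing; the stream version acts on each event as it arrives.

```python
from typing import Iterable, Iterator, List


def transform(event: dict) -> dict:
    # Hypothetical transform: convert cents to a currency amount.
    return {**event, "amount": event["amount_cents"] / 100}


def analyze(event: dict) -> bool:
    # Hypothetical rule: flag amounts of 1000 or more.
    return event["amount"] >= 1000


def batch_pipeline(source: Iterable[dict]) -> List[dict]:
    """Extract, Transform, Load, Analyze, Action on data at rest."""
    staged = [transform(e) for e in source]      # extract + transform
    warehouse = list(staged)                     # load the full data set first
    return [e for e in warehouse if analyze(e)]  # analyze only after the load


def stream_pipeline(source: Iterable[dict]) -> Iterator[dict]:
    """Ingest, Transform, Analyze, Action on data in motion."""
    for event in source:            # ingest one event at a time
        enriched = transform(event)
        if analyze(enriched):
            yield enriched          # act now, without waiting for later events
```

Both functions flag the same events for the same input; what changes is when the result becomes available.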
Stream Processing
• Continuous processing of data as it flows through a system
• Allows users to act on events instantaneously via alerts
• Processing is defined relative to time (event time vs. processing time)
• Real-time – the difference between event time and processing time is negligible
Enables your Data in Motion architecture
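To make the event time vs. processing time distinction concrete, here is a minimal Python sketch. The `Event` class and the one-second `tolerance_s` default are assumptions for illustration; what counts as "negligible" depends on the application.

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_time: float       # when the event occurred at the source (epoch seconds)
    processing_time: float  # when the pipeline handled it (epoch seconds)


def lag_seconds(e: Event) -> float:
    """Delay between the event happening and the system processing it."""
    return e.processing_time - e.event_time


def is_real_time(e: Event, tolerance_s: float = 1.0) -> bool:
    # tolerance_s is an assumed threshold for a "negligible" delay.
    return lag_seconds(e) <= tolerance_s
```

An event processed 0.2 seconds after it occurred passes this check; one processed a minute later does not.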
Big Data Application Types
IoT
Fraud
CDR
CDC
Reporting
SQL
Operations
Data Discovery
SQL on Streams
Streaming Discovery
Sample Streaming Analytics Patterns
Preprocessing
• Filtering events
• Transforming attributes
Alerts & Thresholds
• Based on complex conditions
Computing within Windows
• Aggregations
Combining Event Streams
• Correlation
• Error detection
Enrichment
• Looking up database, reference data
Temporal Events
• Detecting events within time windows
Tracking
• Tracking events over space & time
Trend Detection
• Rise, Fall
• Outliers
Source: https://iwringer.wordpress.com/2015/08/03/patterns-for-streaming-realtime-analytics/
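A few of these patterns (filtering, computing within windows, alerts on thresholds, trend detection) can be sketched in plain Python. This is a simplified illustration over a finite list of readings, not the API of Apex, Spark or Storm; a streaming engine would compute the same results incrementally. The `Reading` tuple layout, the "negative means invalid" rule and all function names are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Reading = Tuple[int, str, float]  # (event_time_seconds, sensor_id, value)


def preprocess(readings: Iterable[Reading]) -> List[Reading]:
    """Filtering: drop readings with negative (assumed invalid) values."""
    return [r for r in readings if r[2] >= 0]


def window_averages(readings: Iterable[Reading], size_s: int) -> Dict[int, float]:
    """Computing within windows: average value per tumbling window.

    Keys are window start times (event time rounded down to a multiple of size_s).
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, _sensor, value in readings:
        buckets[ts - ts % size_s].append(value)
    return {start: sum(v) / len(v) for start, v in sorted(buckets.items())}


def alerts(averages: Dict[int, float], threshold: float) -> List[int]:
    """Alerts & thresholds: window starts whose average exceeds the threshold."""
    return [start for start, avg in averages.items() if avg > threshold]


def is_rising(averages: Dict[int, float]) -> bool:
    """Trend detection: True if each window average is above the previous one.

    Vacuously True when there are fewer than two windows.
    """
    values = list(averages.values())
    return all(b > a for a, b in zip(values, values[1:]))
```

A typical call chain would be `window_averages(preprocess(readings), size_s=10)`, followed by `alerts(...)` or `is_rising(...)` on the result.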
Financial Services
• Detect fraudulent activity in real-time
• Risk Analysis
• Deliver personalized products and offerings
• Make decisions in real-time for trading and transactional platforms
Telecom
• Real-time network monitoring and protection
• Quality of service and Customer Satisfaction
• Take action based on users’ location
• Automatic resource allocation and load balancing
Online Advertising
• Dynamic bidding
• Real-time targeting & personalization
• Maximize click-through and conversion rates
• Reporting that can be updated continuously
Internet of Things
• Environment monitoring
• Infrastructure management
• Manufacturing
• Energy management
• Public Building & Home automation
• Transportation
IoT secure ingestion and predictive analysis
High-performance, secure, multi-customer data ingestion. Complex event processing with historical data for predictive maintenance.
Sensors: Sensor 1, Sensor 2, ... Sensor N
Applications: Application 1, ... Application n
Persistent
Data Governance
Complex Event Processing
Predictive maintenance
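One simple way to combine historical data with live events for predictive maintenance is to flag readings that drift far from a historical baseline. The sketch below uses a mean and standard deviation check as a stand-in; it is not the method described in the talk. The function names, the sensor IDs and the `k = 3.0` multiplier are all assumptions.

```python
from statistics import mean, pstdev
from typing import Iterable, Iterator, List, Tuple


def build_baseline(history: List[float]) -> Tuple[float, float]:
    """Summarize historical readings as (mean, population standard deviation)."""
    return mean(history), pstdev(history)


def maintenance_alerts(
    live: Iterable[Tuple[str, float]],
    baseline: Tuple[float, float],
    k: float = 3.0,  # assumed sensitivity: alert beyond k standard deviations
) -> Iterator[Tuple[str, float]]:
    """Yield (sensor_id, value) for live readings far from the baseline."""
    mu, sigma = baseline
    for sensor_id, value in live:
        if abs(value - mu) > k * sigma:
            yield sensor_id, value
```

Because `maintenance_alerts` is a generator, each live reading is checked as it arrives instead of after the whole feed has been collected.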
Stream Processing: Conclusion
• Lots of untapped potential!
  • Gives your business a competitive edge!
• Open Source and Big Data technologies
  • Built to address the scale and latency demands
• Broad use cases
  • Across industries and verticals
Hadoop Ingestion Made Easy
https://www.brighttalk.com/webcast/13685/194937/hadoop-ingestion-made-easy-with-datatorrent-dtingest