SVCO 2013 SQLstream The Clash of Big Data and Consumer Expectation

High-Velocity Big Data: The Coming Clash of Big Data Technologies and Consumer Expectation. Damian Black, CEO, SQLstream. November 2013. damian@sqlstream.com

description

Silicon Valley Comes to Oxford (SVCO) is an invite-only event at the Saïd Business School, University of Oxford, in the UK. The aim of the event is to give the Business School's graduates insight into how to start, scale and run high-growth companies. The speakers include prominent entrepreneurs, innovators and investors from Silicon Valley, including Damian Black of SQLstream, Phil Libin, CEO of Evernote, and Mike Olson, co-founder of Cloudera. The main presentation to the business school by Damian Black, SQLstream CEO, addressed the mismatch between Big Data hype and reality. The talk, entitled "The Clash of Big Data and Consumer Expectation", examined how the Big Data movement, as epitomized by Hadoop and NoSQL platforms, is failing to keep up with consumer expectations. In large part this is down to the solution cost and immaturity of the Hadoop-based technologies. In the face of dramatically increasing data rates from telecoms and the IoT, Hadoop and related Big Data technologies are unable to deliver real-time, low-latency performance at any reasonable cost. This is also a world of sensors and other sources where much of the data is business as usual and does not need to be stored.

Transcript of SVCO 2013 SQLstream The Clash of Big Data and Consumer Expectation

Page 1: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

High-Velocity Big Data:
The Coming Clash of Big Data Technologies and Consumer Expectation

Damian Black, CEO, SQLstream

November 2013

damian@sqlstream.com

Page 2: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Setting the scene

•  We live in the world of Big Data.
   –  The Internet of Everything: sensors, services, systems/devices.
•  Everybody expects real-time Internet information.
   –  We need a new paradigm for Big Data processing.
•  What are the implications for the IT industry and for business?
   –  Let’s imagine a world where we have continuous real-time visibility into streaming Big Data, and explore the real-time business possibilities.

Page 3: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

The emerging world of real-time data

What are the drivers?
•  Broadband everywhere.
•  Sensors everywhere.
•  Wireless everything.
•  Parallel commodity computation.
•  Elastic computation (Cloud).
•  Smartphone (Hi-Res display everywhere).

Page 4: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Everyone and everything connected

Imagine…
•  Everyone carrying a smartphone.
•  Everyone belonging to a searchable social network.
•  Smartphone apps providing access to all information.

Therefore…
•  You can “follow” anybody, anywhere, at any time.
•  You can access any information, anywhere, at any time.
•  All the devices, applications and people are always connected.

Page 5: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Contemplate the data management challenge

We are talking Really Big Data…
–  Exponential, compounded growth

So far, technology has drawn on the old-world processing model…
–  Store, cleanse, then process the data
–  Analytics means traversing history
–  Querying against stale snapshots

Exponential growth in data volumes is causing this model to break
–  Advent of Hadoop
–  Using parallel processing to combat volumes
–  Batch-based means very high latency

Page 6: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Rear-view mirror thinking…

Page 7: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

History and emergence of Big Data

[Figure: evolution of data management technology over time (horizontal axis: 1960, 1980, 2000) against distributed data processing (vertical axis).]

Architectures: Centralized Architecture; Client-Server Architecture; Clustered Architecture; Distributed Architecture.

Technologies and their data models:
•  Indexed Files: SEQUENTIAL MODEL
•  First Databases: HIERARCHICAL MODEL
•  Network Sockets: SEQUENTIAL MODEL
•  Messaging Middleware: HIERARCHICAL MODEL
•  Data Warehouses: RELATIONAL MODEL
•  BIG DATA: SEQUENTIAL MODEL subsumed by RELATIONAL MODEL
•  STREAMING BIG DATA: RELATIONAL MODEL

Page 8: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Recent history of Big Data

Evolution of Hadoop:
–  Google adopts Map-Reduce for indexing the web.
–  Yahoo emulates Google with Java-based Map-Reduce.
–  Hadoop becomes open-source technology.
–  Cloudera, Hortonworks and many others create releases.
–  Cloudera creates Impala and Search.

Limitations of Hadoop:
–  Storage and archiving architecture.
–  Massively parallel execution is traded off against latency.
–  Not designed for interactive applications or real-time response.

Page 9: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Moving from high-latency to streaming

Collect → Cleanse → Enrich → Analyze → Share

LOW LATENCY

•  Traditional approach leads to high latency:
   –  “Holding tank” buffers for the data.
•  Streaming approach:
   –  Replace the intervening “holding tank” buffers with data pipelines.
   –  Stream the data continuously through the pipes.
   –  Results stream out immediately.
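The difference between "holding tanks" and pipelines can be illustrated with a minimal Python sketch. It assumes a hypothetical input of raw `url,status` text lines; the record layout and the function bodies are illustrative assumptions, and only the stage order (collect, cleanse, enrich, analyze, share) is taken from the slide. Each stage is a generator, so a record flows through every stage as soon as it arrives rather than waiting for a complete batch.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical record layout, used only for this illustration.
Record = Dict[str, object]


def collect(lines: Iterable[str]) -> Iterator[Record]:
    """Parse each raw 'url,status' line as soon as it arrives."""
    for line in lines:
        parts = line.strip().split(",")
        url = parts[0]
        status = parts[1] if len(parts) > 1 else ""
        yield {"url": url, "status": status}


def cleanse(records: Iterable[Record]) -> Iterator[Record]:
    """Drop malformed records and convert the status to an integer."""
    for r in records:
        if r["url"] and str(r["status"]).isdigit():
            yield {"url": r["url"], "status": int(str(r["status"]))}


def enrich(records: Iterable[Record]) -> Iterator[Record]:
    """Add a derived field: treat HTTP 5xx statuses as errors."""
    for r in records:
        yield {**r, "is_error": int(r["status"]) >= 500}


def analyze(records: Iterable[Record]) -> Iterator[Record]:
    """Emit a running error count per URL after every record."""
    errors: Dict[str, int] = {}
    for r in records:
        url = str(r["url"])
        if r["is_error"]:
            errors[url] = errors.get(url, 0) + 1
        yield {"url": url, "errors_so_far": errors.get(url, 0)}


def pipeline(lines: Iterable[str]) -> Iterator[Record]:
    """Chain the stages; nothing is buffered between them."""
    return analyze(enrich(cleanse(collect(lines))))
```

The "share" stage is simply whoever iterates over `pipeline(...)`: calling `next()` on it pulls exactly one input line through all four stages, so the first result is available before the second line has even been read.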

Page 10: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Battling conventional wisdom…

We have faster data, so surely we need a faster database…
1.  To ingest data faster,
2.  To get answers faster,
3.  For “Big Data Scale”?

Page 11: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Everything looks like a nail when all you have is a hammer…

I need a bigger, faster hammer!

…why not stream the nails?

Page 12: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

The consumer’s expectation…

Expects everything to be available and updated in real time:
–  Integrated aggregated view of services, transactions, accounts…
–  Able to search and get real-time accurate results
–  Based on powerful real-time analytics, continuously updated

But…
–  Hadoop does not solve this.
–  Neither does database technology.
–  Neither do log file analytics companies.

Surely there’s got to be a better way?

Page 13: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

The Coming Reality Clash: “Real-time operations” meets “Batch analytics”.

Business Intelligence

Operations

Real-time Operational Intelligence
Continuous monitoring and analytics

Faster decision-making
Automated operations

•  Security
•  Cross-selling & Ads
•  Real-time Promotions
•  Quality & Compliance
•  Health and Capacity
•  Fraud & Theft

“As we move toward a real-time business environment, the capability to process data flows swiftly and flexibly will become increasingly important. SQLstream leads the industry in this kind of capability.”
Robin Bloor, Chief Analyst for Bloor Group

Page 14: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Enabling a whole new world of possibilities…

What is happening?

What might happen?

What just happened?

Make it happen!

Page 15: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

The market wakes up to streaming…

•  Twitter Storm – 100K downloads, used by Twitter and others.
•  Amazon Kinesis – streaming as a service.
•  IBM Streams – inventors of SQL now invent SPL2.
•  SQLstream s-Server – SQL as the Lingua Franca of data management.

Page 16: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

The market recognizes SQLstream…

•  SQLstream, Inc. product suite wins Technology Innovation Award for IT Analytics and Performance
•  SQLstream, Inc. enters the DBTA 100: the companies that matter most in data

[Logos: other winners; other Top 100 vendors]

Page 17: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

About SQLstream

Streaming Big Data management platform:
•  does for streaming data what databases do for stored data

facts
o  Launched 2009, now 4th generation technology
o  Deployments spanning many industries
o  World-leading benchmarks

capabilities
o  Manage and monetize dynamic data assets
o  Both unstructured and structured data
o  Both SQL and Java as first-class alternatives

innovations
o  Massively scalable streaming platform
o  Only standard SQL streaming engine
o  Six patents for stream processing

Page 18: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

A large, wide-open market opportunity

[Chart: high-velocity data in records per second (vertical axis: 25K, 1M, 10M, 20M+) against analytics capability (horizontal axis: Simple, Moderate, Advanced).]

Analytics capability labels:
•  Simple time-series with simple joins
•  Partitioning, n-way joins, full time-series plus spatial
•  String matching, regular expressions

Market segments: Security Intelligence, Internet of Things, Telecomm

Regions plotted: SQLstream; “Store Data First” Products

Page 19: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

The  Technology  

Page 20: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Streaming Big Data Platform

[Diagram labels]

Data sources: Logs, Sensors, GPS, Networks, Social media, RFIDs, Servers

Operational Intelligence

Industries: Telecom, Smart grid, Oil & Gas, Manufacturing, Logistics, M2M, Telematics, Retail, Internet, Banking, Data centers, Automotive

•  Historical queries for real-time data enrichment
•  Storing valuable derived streams for future access
•  Continuous queries with parallel incremental evaluation
•  Real-time processing of unstructured and structured data
•  Predictive analytics driving automated actions

Page 21: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Use Case and Real-world Implications

Page 22: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Did you see this? (GOOGLE: “Youtube Mozilla Glow”)

•  Mozilla Firefox 4 – Real-time Download Monitor
•  Continuous processing of download requests
•  Real-time integration with Hadoop and HBase

Page 23: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

A vision of a Streaming Data driven world…

Intelligent real-time reactions to:
•  Every movement of every customer within my supermarket, shopping mall, holiday complex, road trip…
•  Every purchase of every customer in the context of past purchases and other demographic info.
•  Every interaction of every customer showing signs of dissatisfaction.
•  Every interaction of a prospective customer on my website.
•  Every vehicle on my congested road network.
•  Every patient or system exhibiting a state of distress.
•  Every step taken by anyone trying to break into my property or systems.
•  Every phone call, text or application activity on my cloud, hosted service or Telecomm network.

Page 24: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Concluding Remarks

History repeats itself, especially in computing,
…but Stream Processing really is a new paradigm,
…and the time has to be right for a new technology to arise.

Stream Processing looks like it is...

The Right Technology at the Right Time.

Page 25: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

ANY QUESTIONS?

Email: [email protected]

Page 26: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Fortune 500 Customers

InfoArmor: IDENTITY THEFT MONITORING

•  Spun out from JP Morgan Chase protecting 10M credit card holders
•  Identity Theft Protection and Internet Surveillance
•  Growing at triple digit rates

Benefits
•  Scalability with continuous integration
•  Fast integration
•  Lower Total Cost of Solution

Performance

•  “We evaluated building our own and explored other vendors, but chose SQLstream because they met our requirements entirely and they provided the only 100% ISO ANSI/SQL standards-based streaming platform. That enabled us massive scalability, a very fast deployment and a highly competitive TCO.”

Page 27: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Roads & Maritime Services: REAL-TIME TRAFFIC CONTROL

•  Responsible for roads and waterways in NSW, Australia
•  Developed TT5 for advanced real-time traffic information in collaboration with SQLstream

Opportunity
•  Real-time, accurate and open platform for traffic network information from GPS data

•  “The ability to deliver high quality, critical information to the travelling public in real time helps Roads and Maritime to improve the journey experience, reducing frustration and increasing productivity.”

Page 28: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Total  Cost  of  Performance  

Page 29: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

TOTAL COST OF PERFORMANCE FOR BIG DATA

[Chart: total cost (vertical axis) against records per second (horizontal axis).]

Market segments: SOCIAL, E-COMM, SECURITY, TELEMATICS, TELECOM

Workloads shown:
•  Patterns, Trends, Mining, Connections
•  Searches, Inventory, Reports, Statistics, Billing
•  Trading, Advertising, Alerts, Detection, Signal Intelligence

Page 30: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

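TOTAL COST OF PERFORMANCE FOR BIG DATA

[Chart: total cost (vertical axis) against records per second (horizontal axis).]

Market segments: SOCIAL, E-COMM, SECURITY, TELEMATICS, TELECOM

Workloads shown:
•  Patterns, Trends, Mining, Connections
•  Searches, Inventory, Reports, Statistics, Billing
•  Trading, Advertising, Alerts, Detection, Signal Intelligence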

TOTAL  COST  OF  PERFORMANCE  FOR  BIG  DATA  

Page 31: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Think Different – Stream Processing

Faster Ingestion
   The Bigger Hammer (Distributed Database Clusters):
   •  Hold it in memory (RAM)
   •  Use in-RAM indexing
   •  Compress the data
   The Streaming Approach (Relational Stream Processing):
   •  Stream the data into queries
   •  Index incrementally on-the-fly
   •  Recycle memory continuously

Faster Answers
   The Bigger Hammer (Distributed Database Clusters):
   •  Traverse large datasets in RAM
   •  Use lots and lots of servers
   The Streaming Approach (Relational Stream Processing):
   •  Stream out answers in real-time
   •  Just do it… incrementally!

Big Data Scaling Issues
   The Bigger Hammer (Distributed Database Clusters):
   •  Synchronize across clusters
   •  Running each query to completion
   •  Forever polling the database
   The Streaming Approach (Relational Stream Processing):
   •  Umm…what scaling issues?
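As an illustration of what "just do it incrementally" can mean in practice, here is a small Python sketch that maintains a mean and standard deviation one record at a time instead of re-traversing the stored data for every query. It uses Welford's online method, which is a swapped-in textbook technique chosen for this example and not a description of how SQLstream evaluates queries; the class and attribute names are hypothetical.

```python
import math


class RunningStats:
    """Count, mean and population standard deviation, updated per record.

    Memory use is constant: no history is stored, so there is nothing
    to re-scan when a new value arrives.
    """

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new value into the aggregates (Welford's method)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def stddev(self) -> float:
        """Population standard deviation of all values seen so far."""
        return math.sqrt(self._m2 / self.n) if self.n else 0.0
```

Each call to `update` costs the same regardless of how many records came before, which is the property the streaming column relies on: the answer is always current, and no query ever has to be run to completion over the full dataset.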

Page 32: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

SQLSTREAM DATAFLOW TECHNOLOGY PIPELINING AND SUPERSCALAR PARALLEL PROCESSING

Fine-grained parallelism: simple, massively scalable, super fast.

   

   Query Processor =

Page 33: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Our Streaming Data Management Platform

Inputs: Log Files, Databases, Locations, Networks, Social Media, Servers, M2M, Feeds

s-SERVER:
•  CLEANING & FILTERING
•  STREAMING ANALYTICS
•  STREAMING AGGREGATION
•  CONTINUOUS INTEGRATION

Applications:
•  Internet Security
•  Fraud Prevention
•  Network Monitoring
•  CyberAttack Monitoring
•  Compliance Monitoring

Page 34: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Streaming SQL for Cloud Monitoring

Business Need: Detect run-away applications before resource consumption becomes an issue.

    SELECT STREAM ROWTIME, url, numErrorsLastMinute
    FROM (
        SELECT STREAM ROWTIME, url, numErrorsLastMinute,
            AVG(numErrorsLastMinute) OVER lastMinute AS avgErrorsPerMinute,
            STDDEV(numErrorsLastMinute) OVER lastMinute AS stdDevErrorsPerMinute
        FROM ServiceRequestsPerMinute
        WINDOW lastMinute AS (PARTITION BY url RANGE INTERVAL '1' MINUTE PRECEDING)
    ) AS S
    WHERE S.numErrorsLastMinute > S.avgErrorsPerMinute + 2 * S.stdDevErrorsPerMinute;
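For readers less familiar with windowed SQL, the logic of the streaming query on this slide can be approximated in plain Python. This is a rough sketch under stated assumptions, not SQLstream's implementation: rows are assumed to arrive in time order as `(rowtime_seconds, url, num_errors)` tuples, the window is taken to include the current row, and `STDDEV` is assumed to be the population standard deviation. The function name `detect_runaways` is hypothetical.

```python
import statistics
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Iterator, Tuple

WINDOW_SECONDS = 60  # mirrors RANGE INTERVAL '1' MINUTE PRECEDING

Row = Tuple[float, str, float]  # (rowtime in seconds, url, error count)


def detect_runaways(rows: Iterable[Row]) -> Iterator[Row]:
    """Yield rows whose error count exceeds mean + 2 * stddev
    of the trailing one-minute window for the same url."""
    # One sliding window per url, mirroring PARTITION BY url.
    windows: Dict[str, Deque[Tuple[float, float]]] = defaultdict(deque)
    for rowtime, url, num_errors in rows:
        window = windows[url]
        window.append((rowtime, num_errors))
        # Evict rows that have fallen out of the one-minute range.
        while window[0][0] < rowtime - WINDOW_SECONDS:
            window.popleft()
        values = [v for _, v in window]
        avg = statistics.fmean(values)
        std = statistics.pstdev(values)  # assumption: population std dev
        if num_errors > avg + 2 * std:
            yield rowtime, url, num_errors
```

Because the current row is part of its own window, a spike can only exceed the two-sigma threshold once the window holds at least six rows; with fewer rows, no value can sit more than two population standard deviations above the mean.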

Page 35: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

Streaming Visualization

s-Visualizer

Page 36: SVCO 2013   SQLstream  The Clash of Big Data and Consumer Expectation

High-Velocity Big Data:
The Coming Clash of Big Data Technologies and Consumer Expectation

Damian Black, CEO, SQLstream

November 2013

damian@sqlstream.com