OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ ·...

33
the NewSQL database you’ll never outgrow OldSQL vs. NoSQL vs. NewSQL on New OLTP Michael Stonebraker, CTO VoltDB, Inc.

Transcript of OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ ·...

Page 1: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

the  NewSQL  database  you’ll  never  outgrow  

OldSQL  vs.  NoSQL  vs.  NewSQL  on  New  OLTP  

Michael  Stonebraker,  CTO  VoltDB,  Inc.  

Page 2: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 2

Old  OLTP  

 Remember  how  we  used  to  buy  airplane  Hckets  in  the  1980s  

+ By  telephone  + Through  an  intermediary  (professional  terminal  operator)  

 Commerce  at  the  speed  of  the  intermediary  

 In  1985,  1,000  transacHons  per  second  was  considered  an  incredible  stretch  goal!!!!  

+ HPTS  (1985)      

Page 3: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 3

How  has  OLTP  Changed  in  25  Years?  

The  internet  + Client  is  no  longer  a  professional  terminal  operator  

+ Instead  Aunt  Martha  is  using  the  web  herself  

+ Sends  volume  through  the  roof  

Page 4: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 4

How  has  OLTP  Changed  in  25  Years?  

PDAs  and  sensors  + Your  cell  phone  is  a  transacHon  originator  + Everything  is  being  geo-­‐posiHoned  by  sensors  (marathon  runners,  your  car,  ….)  

+ Sends  volume  through  the  roof  

Page 5: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 5

How  has  OLTP  Changed  in  25  Years?  

The  definiHons  + “Online”  no  longer  exclusively  means  a  human  operator  

— The  oncoming  data    tsunami  is  o_en  device  and  system-­‐generated  

+ “TransacHon”  now  transcends  the  tradiHonal  business  transacHon  — High-­‐throughput  ACID  write  operaHons  are  a  new  requirement  

+ “HA”  and  “durability”  are  now  core  database  requirements  

Page 6: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 6

Examples  

Maintain  the  state  of  mulH-­‐player  internet  games  

Real  Hme  ad  placement  

Fraud/intrusion  detecHon  

Risk  management  on  Wall  Street  

Page 7: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 7 7VoltDB 7

New  OLTP  Challenges  

New  OLTP  and  You  You  need  to  ingest  the  firehose  in  real  Hme  

You  need  to    process,  validate,  enrich    and  respond  in  real-­‐Hme  

You  o_en  need  real-­‐Hme  analyHcs  

Page 8: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 8

SoluHon  Choices  

 OldSQL  + Legacy  RDBMS  vendors  

 NoSQL  + Give  up  SQL  and  ACID  for  performance  

 NewSQL  + Preserve  SQL  and  ACID  + Get  performance  from  a  new  architecture  

Page 9: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 9

OldSQL  

TradiHonal  SQL  vendors  (the  “elephants”)  

+ Code  lines  daHng  from  the  1980’s    

+ “bloatware”  

+ Not  very  good  at  anything  

— Can  be  beaten  by  at  least  an  order  of  magnitude  in  every  verHcal  market  I  know  of  

+ Mediocre  performance  on  New  OLTP  

— At  low  velocity  it  doesn’t  maeer  

— Otherwise  you  get  to  tear  your  hair  out  

Page 10: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 10

OLTP Data Warehouse

Other apps

DBMS apps

DBMS  Landscape  

Page 11: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 11

DBMS  Landscape  –  Performance  Needs  

OLTP Data Warehouse

Other apps

low

high

high

high

Page 12: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 12

One  Size  Does  Not  Fit  All  -­‐-­‐  Pictorially  

Open source

Column stores

Hadoop

Low-overhead

Main memory DBs

NoSQL

Array DBMSs

Elephants only get “the crevices”

Page 13: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 13

Reality  Check  

 TPC-­‐C  CPU  cycles   On  the  Shore  DBMS  prototype  

 Elephants  should  be  similar  

Page 14: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 14

The  Elephants    

 Are  slow  because  they  spend  all  of  their  Hme  on  overhead!!!  

+ Not  on  useful  work  

 Would  have  to  re-­‐architect  their  legacy  code  to  do  beeer  

Page 15: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 15

To  Go  a  Lot  Faster  You  Have  to……  

 Focus  on  overhead  + Beeer  B-­‐trees  affects  only  4%  of  the  path  length  

 Get  rid  of  ALL  major  sources  of  overhead  + Main  memory  deployment  –  gets  rid  of  buffer  pool  

— Leaving  other  75%  of  overhead  intact  — i.e.  win  is  25%  

Page 16: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 16

Long  Term  Elephant  Outlook  

 Up  against  “The  Innovators  Dilemma”  + Steam  shovel  example  

+ Disk  drive  example  

+ See  the  book  by  Clayton  Christenson  for  more  details  

 Long  term  dri_  into  the  sunset    + The  most  likely  scenario  

+ Unless  they  can  solve  the  dilemma  

Page 17: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 17

NoSQL  

 Give  up  SQL   Give  up  ACID  

Page 18: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 18

Give  Up  SQL?  

 Compiler  translates  SQL  at  compile  Hme  into  a  sequence  of  low  level  operaHons    

 Similar  to  what  the  NoSQL  products  make  you  program  in  your  applicaHon  

 30  years  of  RDBMS  experience  + Hard  to  beat  the  compiler  

+ High  level  languages  are  good  (data  independence,  less  code,  …)  + Stored  procedures  are  good!  

— One  round  trip  from  app  to  DBMS  rather  than  one  one  round  trip  per  record  

— Move  the  code  to  the  data,  not  the  other  way  around  

Page 19: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 19

Give  Up  ACID  

 If  you  need  data  accuracy,  giving  up  ACID  is  a  decision  to  tear  your  hair  out  by  doing  database  “heavy  li_ing”  in  user  code  

 Can  you  guarantee  you  won’t  need  ACID  tomorrow?  

ACID = goodness, in spite of what these guys say

Page 20: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 20

Who  Needs  ACID?  

 Funds  transfer  + Or  anybody  moving  something  from  X  to  Y  

 Anybody  with  integrity  constraints  + Back  out  if  fails  + Anybody  for  whom  “usually  ships  in  24  hours”  is  not  an  acceptable  outcome  

 Anybody  with  a  mulH-­‐record  state  + E.g.  move  and  shoot  

Page 21: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 21

Who  needs  ACID  in  replicaHon  

 Anybody  with  non-­‐commutaHve  updates  + For  example,  +  and  *  don’t  commute  

 Anybody  with  integrity  constraints  + Can’t  sell  the  last  item  twice….  

 Eventual  consistency  means  “creates  garbage”  

Page 22: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 22

NoSQL  Summary  

 Appropriate  for  non-­‐transacHonal  systems  

 Appropriate  for  single  record  transacHons  that  are  commutaHve  

 Not  a  good  fit  for  New  OLTP   Use  the  right  tool  for  the  job  

Two  recently-­‐proposed  NoSQL  language  standards  –  CQL  and  UnQL  –  are  amazingly  similar  to  (you  guessed  it!)  SQL    

Interes5ng  …  

Page 23: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 23

NewSQL  

 SQL   ACID   Performance  and  scalability  through  modern  innovaHve  so_ware  architecture  

Page 24: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 24

NewSQL  

 Needs  something  other  than  tradiHonal  record  level  locking  (1st  big  source  of  overhead)    

+ Hmestamp  order  

+ MVCC  

+ Your  good  idea  goes  here  

Page 25: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 25

NewSQL  

 Needs  a  soluHon  to  buffer  pool  overhead  (2nd  big  source  of  overhead)  

+ Main  memory  (at  least  for  data  that  is  not  cold)  

+ Some  other  way  to  reduce  buffer  pool  cost  

Page 26: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 26

NewSQL  

 Needs  a  soluHon  to  latching  for  shared  data  structures  (3rd  big  source  of  overhead)  

+ Some  innovaHve  use  of  B-­‐trees  

+ Single-­‐threading  + Your  good  idea  goes  here  

Page 27: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 27

NewSQL  

 Needs  a  soluHon  to  write-­‐ahead  logging  (4th  big  source  of  overhead)  

+ Obvious  answer  is  built-­‐in  replicaHon  and  failover    + New  OLTP  views  this  as  a  requirement  anyway  

 Some  details  + On-­‐line  failover?  + On-­‐line  failback?  + LAN  network  parHHoning?  + WAN  network  parHHoning?  

Page 28: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 28

 A  NewSQL  Example  –  VoltDB    

 Main-­‐memory  storage  

 Single  threaded,  run  Xacts  to  compleHon  

+ No  locking  

+ No  latching  

 Built-­‐in  HA  and  durability  

+ No  log  (in  the  tradiHonal  sense)  

Page 29: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 29

Yabut:    What  About  MulHcore?  

 For  A  K-­‐core  CPU,  divide  memory  into  K  (non  

overlapping)  buckets  

 i.e.  convert  mulH-­‐core  to  K  single  cores  

Page 30: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 30

Where  all  the  Hme  goes…  revisited  

Before VoltDB

Page 31: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 31

 Runs  a  subset  of  SQL  (which  is  getng  larger)  

 On  VoltDB  clusters  (in  memory  on  commodity  gear)  

 No  WAN  support  yet    + Working  on  it  right  now  

 50X  a  popular  OldSQL  DBMS  on  TPC-­‐C  

 5-­‐7X  Cassandra  on  VoltDB  K-­‐V  layer   Scales  to  384  cores  (biggest  iron  we  could  get  our  hands  on)  

 Clearly  note  this  is  an  open  source  system!    

Current  VoltDB  Status  

Page 32: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

VoltDB 32

Summary  

Old  OLTP  

OldSQL  for  New  OLTP    Too  slow   Does  not  scale  

NoSQL  for  New  OLTP    Lacks  consistency  guarantees   Low-­‐level  interface  

NewSQL  for  New  OLTP    Fast,  scalable  and  consistent   Supports  SQL    

New  OLTP  

Page 33: OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ · the$NewSQL$database$you’ll$never$outgrow$ OldSQL’vs.’NoSQL’vs.’NewSQL’ onNew’OLTP’ Michael$Stonebraker,$CTO$

the  NewSQL  database  you’ll  never  outgrow  

Thank  You