Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dublin 2014

Why Hadoop and SQL just want to be friends Simon Elliston Ball @sireb

description

A lightning talk from NoSQL Matters Dublin on why we need to stop doing ETL and focus on ELT, and how the Hadoop approach helps you shortcut the model-parse-query loop when processing data.

Transcript of Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dublin 2014

Page 1: Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dublin 2014

Why Hadoop and SQL just want to be friends

Simon Elliston Ball

@sireb

Page 2

ETL

OLTP

Archive

EDW

ETL

Page 5

ETL

● More data
● Shorter windows
● Wider queries

Page 6

ETL

OLTP

Archive

EDW

ETL

Sqoop

Pig

Hive

Oozie

Falcon

Page 7

ELT

ETL

OLTP

Archive

EDW

ETL

Less structured

Sqoop

Page 8

ELT: saving the T for later

2012-01-06 09:22:27 W3SVC1273337584 RD00155D360166 10.211.146.27 GET /ustensiles - 80 Test0001 94.245.127.11 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) __RequestVerificationToken_Lw__=KLZ1dz1Aa4o2UdwJVwr0JhzSwmmSHmID9i/gutMvQkZWX9Q4QDktFHHiBhF8mSd6Cg5oIEeUpy/KNF7VLRFkrqN28raL8PfNuv0IfuKXxgl5s+uZpcvfGE6Olfsu7uNLg2bWwLZkrqXjv9cpRGaiXelmaM8=;+.ASPXAUTH=D5796612E924B60496C115914CC8F93239E99EEF4B3D6ED74BDD5C8C38D8C115D3021AB7F3B06E563EDE612BFBCBBE756803C85DECFACCA080E890C5DA6B4CA00A51792D812C93101F648505133C9E2C10779FA3E5AC19EE5E2B7E130C72C18F6309AEB736ABD06C87A7D636976A20534833E20160EC04B6B6617B378845AE627979EE54 http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2Fustensiles site.supersimple.fr 200 0 0 7136 849 1249
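As a rough illustration of "saving the T for later", the raw web server log line above can be left as-is in storage and only split into named fields when someone needs them. The sketch below is an assumption-laden example, not the speaker's code: the slide shows no "#Fields" header, so the field names in FIELDS are a guess based on the common IIS W3C extended log layout, and the sample line is a shortened copy of the one on the slide with the user agent trimmed and the cookie replaced by "-".

```python
# Assumed field order (the slide has no "#Fields" header); names follow the
# common IIS W3C extended log layout and are a guess, not taken from the talk.
FIELDS = [
    "date", "time", "s-sitename", "s-computername", "s-ip",
    "cs-method", "cs-uri-stem", "cs-uri-query", "s-port", "cs-username",
    "c-ip", "cs-version", "cs(User-Agent)", "cs(Cookie)", "cs(Referer)",
    "cs-host", "sc-status", "sc-substatus", "sc-win32-status",
    "sc-bytes", "cs-bytes", "time-taken",
]


def parse_line(line):
    """Split one space-delimited log line into a dict keyed by field name.

    Returns None when the number of values does not match FIELDS, so
    malformed lines can be skipped at read time instead of blocking a load.
    """
    values = line.strip().split(" ")
    if len(values) != len(FIELDS):
        return None
    return dict(zip(FIELDS, values))


# Shortened version of the slide's log line (user agent trimmed, cookie
# replaced by "-") so the example stays readable.
sample = (
    "2012-01-06 09:22:27 W3SVC1273337584 RD00155D360166 10.211.146.27 "
    "GET /ustensiles - 80 Test0001 94.245.127.11 HTTP/1.1 "
    "Mozilla/5.0+(compatible;+MSIE+9.0) - "
    "http://site.supersimple.fr/Users/Account/LogOn?ReturnUrl=%2Fustensiles "
    "site.supersimple.fr 200 0 0 7136 849 1249"
)

record = parse_line(sample)
print(record["cs-uri-stem"], record["sc-status"], record["time-taken"])
```

The raw line is never rewritten; if the guessed field list turns out to be wrong, only FIELDS changes and nothing has to be reloaded.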

Page 9

Schema on write:

ELT: saving the T for later

Parse Model Store Query

● Keep going back to the drawing board
● Reprocessing all the data

Page 10

Schema on read:

ELT: saving the T for later

● Only model what you need
● Agile Data Modelling
● Don’t move the data

Query Store Model Parse
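One way to picture schema on read is the minimal sketch below. It assumes a tiny in-memory list standing in for raw files in the store; the lines, paths, and function names are all hypothetical and chosen only for illustration. Each query applies just the slice of structure it needs, and a new question means a new function over the same untouched data.

```python
# Hypothetical raw store: lines are kept exactly as they arrived.
# Simplified layout assumed here: date time method path status
RAW_STORE = [
    "2012-01-06 09:22:27 GET /ustensiles 200",
    "2012-01-06 09:22:31 GET /recettes 404",
    "2012-01-06 09:23:02 POST /ustensiles 200",
]


def hits_by_path_and_status(lines):
    """First question: model only the path and status columns."""
    counts = {}
    for line in lines:
        parts = line.split(" ")
        key = (parts[3], parts[4])
        counts[key] = counts.get(key, 0) + 1
    return counts


def requests_per_method(lines):
    """A later question needs a different column.

    Nothing is reloaded or reprocessed; the same raw lines are read again
    with a different, equally small model.
    """
    counts = {}
    for line in lines:
        method = line.split(" ")[2]
        counts[method] = counts.get(method, 0) + 1
    return counts


print(hits_by_path_and_status(RAW_STORE))
print(requests_per_method(RAW_STORE))
```

With schema on write, the second question would instead mean changing the model and reprocessing everything already loaded.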

Page 11

Cost per TB...

Page 12

Come for the cheap storage...

The Data Lake

https://www.flickr.com/photos/msvg/5891279010

Page 13

...stay for the analytics

● Machine learning libraries
● Recommendation systems
● Batch Big Data

Page 14

Summary

Hadoop can:
● Improve your ETL processing
● Help you with unstructured data
● Save you money

Page 15

Thank you!

Simon Elliston Ball

@sireb