relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of...

77
wow

Transcript of relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of...

Page 1: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

wow

Page 2: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

CPSC350

•“relational schemas”

•“table normalization”

•“practical use of relational algebraic operators”

•“tuple relational calculus”

•“and their expression in a declarative query language”

Page 3: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

relational schemas

Page 4: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

CPSC350

•“relational schemas”

•“table normalization”

•“practical use of relational algebraic operators”

•“tuple relational calculus”

•“and their expression in a declarative query language”

Page 5: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 6: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

practical knowledgethe stuff you need to know to be a

better software developer

Page 7: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

An introduction to the non-SQL part of the course

Page 8: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Cloud Storage in a !

Post SQL World

Ian WIlkes ars technica

Page 9: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 10: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

“Since the rise of the Web, SQL-based

relational databases have been the

dominant structured storage technology

behind online applications.”

Page 11: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

“The past few years have seen the

emergence of the cloud as a compelling

environment for online application

development, bringing true utility

computing into the infrastructure

pantheon”

Page 12: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

“But the cloud and SQL do not mix well,

and multiple efforts are now underway

to offer viable alternatives to the

venerable database.”

Page 13: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

“while relational databases are by no

means doomed, they will soon be joined

in the cloud, and possibly out-shined

by, new non-relational database

technologies”

Page 14: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

<disect>

Page 15: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

more and more computing is in the cloud

•google docs

•dropbox

Page 16: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

more and more computing is in the cloud

•google docs

•dropbox

•chromeOS

•Google Cr-48

Page 17: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

more and more computing is in the cloud

•google docs

•dropbox

•chromeOS

•Google Cr-48

•iPad / Android / mobile computing

Page 18: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

more and more computing is in the cloud

•Even cloud/browser based IDEs.

Page 19: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 20: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Take a moment !

do back of envelope calculation

!what % of your computer time

on cloud?

Page 21: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

results

Page 22: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 23: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

some believe the cloud and SQL do not mix well

Page 24: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

<on one side>

Page 25: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

SQL databases work fine

•setup straightforward

•programming not that hard

•wide range of support

•for small projects XAMPP stack near plug and play.

•SQL on a Amazon ec2 instance easy to set up and work with

Page 26: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

SQL can scale vertically

Page 27: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

big iron

Page 28: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

SQL can grow vertically

Page 29: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

SQL cannot grow horizontally

easily

Page 30: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 31: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Brewer’s Theorem a.k.a. CAP Theorem

A system cannot have high consistency, availability, and

partition tolerance simultaneously

Page 32: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

SQL focuses on consistency

Page 33: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

SQL focuses on consistency

constraints on foreign keys !

Page 34: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

ACID semantics (ch11)

Page 35: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 36: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 37: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 38: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 39: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 40: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 41: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Brewer’s Theorem a.k.a. CAP Theorem

A system cannot have high consistency, availability, and

partition tolerance simultaneously

Page 42: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

availability

online web apps - availability is a must

Page 43: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Brewer’s Theorem a.k.a. CAP Theorem

A system cannot have high consistency, availability, and

partition tolerance simultaneously

Page 44: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

mismatchClustered machines -- SQL semantics

Page 45: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

BASEinstead of ACID semantics

Page 46: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

BASEBasically Available, Soft state,

Eventual consistency

Page 47: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Basically Available:

This constraint states that the system does guarantee the

availability of the data as regards CAP Theorem; there will be a response to any request. But, that response could still be ‘failure’ to obtain

the requested data or the data may be in an inconsistent or changing state,

much like waiting for a check to clear in your bank account.

Page 48: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Soft state:

The state of the system could change over time, so even during times

without input there may be changes going on due to ‘eventual

consistency,’ thus the state of the system is always ‘soft.’

Page 49: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Eventual consistency:

The system will eventually become consistent once it stops receiving input. The data will propagate to

everywhere it should sooner or later, but the system will continue to

receive input and is not checking the consistency of every transaction before it moves onto the next one.

Page 50: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Wait, there’s more

Page 51: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Our model

PythonFlask SQL Server

JOIN

ORDER BYJOIN

Page 52: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Extending our model

Python Flask SQL Server

JOIN

ORDER BYJOIN

ORDER BY

JOIN

JOINJOIN

JOIN

ORDER BY

ORDER BYPython Flask

Python Flask

Python Flask

Python Flask

Python Flask

Python Flask

Page 53: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Our model

Apache S. SQL Server

JOIN

ORDER BYJOIN

Page 54: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Extending our model

Apache S. SQL Server

JOIN

ORDER BYJOIN

Apache S.

Apache S. Apache S.

Apache S.

Apache S.

Apache S.ORDER BY

JOIN

JOINJOIN

JOIN

ORDER BY

ORDER BY

Page 55: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

All data processing on database server

!adds load to CPU

!further inhibits scaling

Page 56: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Some SQL platform providers offer cloud relational db

•Amazon’s mySQL based RDS

•Heroku’s PostgreSQL

•but these more like managed hosting than true cloud (distributed) computing

Page 57: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

push to NoSQL alternatives !!

sacrifice consistency in favor of scalability and

availability

Page 58: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

first implementations proprietary

!

•Google Big Table

•Amazon Dynamo

Page 59: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

NoSQL features

•No schemas -- no fixed table structure, no fixed columns

• little to no protection against invalid data

Page 60: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

NoSQL features

•No or limited joins - no built in methods for connecting /joining entries.

• must be done at application level

Page 61: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

NoSQL features

•Restricted query interface - SQL has rich query interface. noSQL somewhat limited.

• although some NoSQL systems (e.g., CouchDB) offer different views.

Page 62: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

NoSQL features

•Limited transactions or locks - ACID transactions not distributed system friendly.

Page 63: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

NoSQL open source projects

Page 64: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Cassandra

•developed by Facebook and used in-house

•now an Apache project

•used in a large number of production environments

• Digg, Rackspace, Twitter (internally)

• Netflix

• Redit

• CERN

• not Facebook

Page 65: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

CouchDB

•an Apache project

•operates on JSON documents

•the read state of the db is not always consistent with the latest writes.

•reasonably stable

•used by:

• wego, BBC, World Wildlife Fund,

• http://www.couchbase.com/customers/case-studies

Page 66: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 67: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

mongoDB

•operates on JSON documents

•considered faster than CouchDB.

•good stability

•used by

• foursquare, bit.ly, intuit, shutterfly, nyTimes, Etsy, Cern (Large Hadron Collider)justin.tv

Page 68: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Voldemort

•developed by LinkedIn

•now open source

•bare-bones key-value pair db.

•all data management work done in the applications.

Page 69: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

HBase

•A Big Table clone

•uses map-reduce

Page 70: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Amazon’s simpleDB

•not open source

•disappointing performance

•not mature

Page 71: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

The top ones

•Cassandra

•CouchDB

•mongoDB

•Redis

Page 72: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Failure of NoSQL

DIGG

Page 73: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 74: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 75: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 76: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage
Page 77: relational schemas · declarative query language ... Ian WIlkes ars technica “Since the rise of the Web, SQL-based relational databases have been the dominant structured storage

Road Map for remainder of course

•mongoDB

•CouchDB

•Redis