Webinar - Making the Shift from Relational to NoSQL

45
Making the Shift from Relational to NoSQL Tugdual “Tug” Grall Technical Evangelist

Transcript of Webinar - Making the Shift from Relational to NoSQL

Page 1: Webinar - Making the Shift from Relational to NoSQL

Making the Shift fromRelational to NoSQL

Tugdual “Tug” Grall

Technical Evangelist

Page 2: Webinar - Making the Shift from Relational to NoSQL

Agenda

• Why moving to NoSQL?

• Key differences

• How to move?

• Modeling the data

• Accessing the data

• Migrating the data

You can download Couchbase at www.couchbase.com/download

Page 3: Webinar - Making the Shift from Relational to NoSQL

Why transition to NoSQL?

Page 4: Webinar - Making the Shift from Relational to NoSQL

You are here because..

You want to build applications

Page 5: Webinar - Making the Shift from Relational to NoSQL

and currently have a relational database

Page 6: Webinar - Making the Shift from Relational to NoSQL

It’s probably gotten a little bit big..

http://www.seoclerks.com/imagedb/2005/BIG-CAT-FOUND-Spoh

Page 7: Webinar - Making the Shift from Relational to NoSQL

Or maybe huge

http://www.2pep.com/extreme-funny-stuff/

Page 8: Webinar - Making the Shift from Relational to NoSQL

And complicated

Page 9: Webinar - Making the Shift from Relational to NoSQL

You’re not aloneTop two big drivers for NoSQL

adoption

Lack of flexibility/rigid schemas

Inability to scale out data

Performance challenges

Cost All of these Other

49%

35%

29%

16%12% 11%

Source: Couchbase Survey, December 2011, n = 1351.

Page 10: Webinar - Making the Shift from Relational to NoSQL

NoSQL catalog

Key-Value

memcached redis

Data Structure Document Column Graph

mongoDB

couchbase cassandra

Cac

he

(mem

ory

on

ly)

Dat

abas

e(m

emo

ry/d

isk)

Neo4j

Page 11: Webinar - Making the Shift from Relational to NoSQL

Key Differences

Page 12: Webinar - Making the Shift from Relational to NoSQL

Relational vs Document data model

Relational data model Document data modelCollection of complex documents with

arbitrary, nested data formats andvarying “record” format.

Highly-structured table organization with rigidly-defined data formats and

record structure.

JSONJSON

JSON

C1 C2 C3 C4

{

}

Page 13: Webinar - Making the Shift from Relational to NoSQL

Document Databases

• Each record in the database is a self-describing document

• Each document has an independent structure

• Documents can be complex • All databases require a unique key• Documents are stored using JSON or

XML or their derivatives• Content can be indexed and queried • Offer auto-sharding for scaling and

replication for high-availability

{ “UUID”: “21f7f8de-8051-5b89-86“Time”: “2011-04-01T13:01:02.42“Server”: “A2223E”,

“Calling Server”: “A2213W”,“Type”: “E100”,“Initiating User”: “[email protected]”,“Details”:

{

“IP”: “10.1.1.22”,“API”: “InsertDVDQueueItem”,“Trace”: “cleansed”,“Tags”:

[

“SERVER”, “US-West”, “API”]

}

}

Page 14: Webinar - Making the Shift from Relational to NoSQL

Example: User Profile

Address Info

1 DEN 30303CO

2 MV 94040CA

3 CHI 60609IL

User Info

KEY First ZIP_idLast

4 NY 10010NY

1 Dipti 2Borkar

2 Joe 2Smith

3 Ali 2Dodson

4 John 3Doe

ZIP_id CITY ZIPSTATE

1 2

2 MV 94040CA

To get information about specific user, you perform a join across two tables

Page 15: Webinar - Making the Shift from Relational to NoSQL

All data in a single document

Document Example: User Profile

{“ID”: 1,“FIRST”: “Tug”,“LAST”: “Grall”,“ZIP”: “94040”,“CITY”: “MV”,“STATE”: “CA”

}JSON

= +

Page 16: Webinar - Making the Shift from Relational to NoSQL

User ID First Last Zip

1 Dipti Borkar 94040

2 Joe Smith 94040

3 Ali Dodson 94040

4 Sarah Gorin NW1

5 Bob Young 30303

6 Nancy Baker 10010

7 Ray Jones 31311

8 Lee Chen V5V3M

•••

50000 Doug Moore 04252

50001 Mary White SW195

50002 Lisa Clark 12425

Country ID

TEL3

001

Country ID Country name

001 USA

002 UK

003 Argentina

004 Australia

005 Aruba

006 Austria

007 Brazil

008 Canada

009 Chile

•••

130 Portugal

131 Romania

132 Russia

133 Spain

134 Sweden

User ID Photo ID Comment

2 d043 NYC

2 b054 Bday

5 c036 Miami

7 d072 Sunset

5002 e086 Spain

Photo Table

001

007

001

133

133

User ID Status ID Text

1 a42 At conf

4 b26 excited

5 c32 hockey

12 d83 Go A’s

5000 e34 sailing

Status Table

134

007

008

001

005

Country Table

User ID Affl ID Affl Name

2 a42 Cal

4 b96 USC

7 c14 UW

8 e22 Oxford

Affiliations TableCountry

ID

001

001

001

002

Country ID

Country ID

001

001

002

001

001

001

008

001

002

001

User Table

.

.

.

Making a Change Using RDBMS

Page 17: Webinar - Making the Shift from Relational to NoSQL

Making the Same Change with a Document Database

{“ID”: 1,“FIRST”: “Tug”,“LAST”: “Grall”,“ZIP”: “94040”,“CITY”: “MV”,“STATE”: “CA”,“STATUS”: { “TEXT”: “At Conf”

}

}

“GEO_LOC”: “134” },“COUNTRY”: ”USA”

Just add information to a document

JSON

,}

Page 18: Webinar - Making the Shift from Relational to NoSQL

Where is NoSQL a good fit?

Page 19: Webinar - Making the Shift from Relational to NoSQL

Market Adoption

Internet Companies Enterprises

• Social Gaming

• Ad Networks

• Social Networks

• Online Business Services

• E-Commerce

• Online Media

• Content Management

• Cloud Services

• Communications

• Retail

• Financial Services

• Health Care

• Automotive/Airline

• Agriculture

• Consumer Electronics

• Business Systems

Page 20: Webinar - Making the Shift from Relational to NoSQL

Market Adoption – Customers

Internet Companies Enterprises

More than 300 customers -- 5,000 production deployments worldwide

Page 21: Webinar - Making the Shift from Relational to NoSQL

Application Characteristics - Data driven

• 3rd party or user defined structure (Twitter feeds)

• Support for unlimited data growth (Viral apps)

• Data with non-homogenous structure

• Need to quickly and often change data structure

• Variable length documents

• Sparse data records

• Hierarchical data

Couchbase is a good fit

Page 22: Webinar - Making the Shift from Relational to NoSQL

Application Characteristics -Performance driven

• Low latency critical (ex. 1millisecond)

• High throughput (ex. 200000 ops / sec)

• Large number of users

• Unknown demand with sudden growth of users/data

• Predominantly direct document access

• Read / Mixed / Write heavy workloads

Couchbase is a good fit

Page 23: Webinar - Making the Shift from Relational to NoSQL

Common Use CasesSocial Gaming

• Couchbase stores player and game data

• Examples customers include: Zynga

• Tapjoy, Ubisoft, Tencent

Mobile Apps

• Couchbase stores user info and app content

• Examples customers include: Kobo, Playtika

Ad Targeting

• Couchbase stores user information for fast access

• Examples customers include: AOL, Mediamind, Convertro

Session store

• Couchbase Server as a key-value store

• Examples customers include: Concur, Sabre

User Profile Store

• Couchbase Server as a key-value store

• Examples customers include: Tunewiki

High availability cache

• Couchbase Server used as a cache tier replacement

• Examples customers include: Orbitz

Content & Metadata Store

• Couchbase document store with Elastic Search

• Examples customers include: McGraw Hill

3rd party data aggregation

• Couchbase stores social media and data feeds

• Examples customers include: Sambacloud

Page 24: Webinar - Making the Shift from Relational to NoSQL

From SQL to NoSQL

Page 25: Webinar - Making the Shift from Relational to NoSQL

Document Design

Page 26: Webinar - Making the Shift from Relational to NoSQL

Document Design

• One document that contains all related data

­ Data is de-normalized

­ Better performance and scale

­ Eliminate client-side joins

­ But if the document grows continuously or has write contention, it should be split

• Separate documents for different object types with cross references

­ Data duplication is reduced

­ Objects may not be co-located

­ Transactions supported only on a document boundary

­ Most document databases do not support joins

Page 27: Webinar - Making the Shift from Relational to NoSQL

Example: Entities for a Blog

• User profile

The main pointer into the user data

• Blog entries

• Badge settings, like a twitter badge

• Blog posts

Contains the blogs themselves

• Blog comments

• Comments from other users

Page 28: Webinar - Making the Shift from Relational to NoSQL

Single vs. Multiple Documents

• We need to decide the best way to store comments

{

}

Single Multiple

vs.

Document

Comment

Comment

Comment

Page 29: Webinar - Making the Shift from Relational to NoSQL

{ “UUID”: “21f7f8de-8051-5b89-86“Time”: “2011-04-01T13:01:02.42“Server”: “A2223E”,

“Calling Server”: “A2213W”,“Type”: “E100”,“Initiating User”: “[email protected]”,“Details”:

{

“IP”: “10.1.1.22”,“API”: “InsertDVDQueueItem”,“Trace”: “cleansed”,“Tags”:

[

“SERVER”, “US-West”, “API”]

}

}

Blog Document – Option 1 – Single document

{

“_id”: “Couchbase_Hello_World”,

“author”: “dborkar”,

“type”: “post”

“title”: “Hello World”,

“format”: “markdown”,

“body”: “Hello from [Couchbase](http://couchbase.com).”,

“html”: “<p>Hello from <a href=\“http: …

“comments”:[

[“format”: “markdown”, “body”:”Awesome post!”],

[“format”: “markdown”, “body”:”Like it.” ]

]}

Page 30: Webinar - Making the Shift from Relational to NoSQL

Blog Document – Option 2 - Split into multiple docs

{ “UUID”: “21f7f8de-8051-5b89-86“Time”: “2011-04-01T13:01:02.42“Server”: “A2223E”,

“Calling Server”: “A2213W”,“Type”: “E100”,“Initiating User”: “[email protected]”,“Details”:

{

“IP”: “10.1.1.22”,“API”: “InsertDVDQueueItem”,“Trace”: “cleansed”,“Tags”:

[

“SERVER”, “US-West”, “API”]

}

}

{

“_id”: “Coucbase_Hello_World”,

“author”: “tgrall”,

“type”: “post”

“title”: “Hello World”,

“format”: “markdown”,

“body”: “Hello from

[Couchbase](http://couchbase.com).”,

“html”: “<p>Hello from <a href=\“http: …

“comments”:[

“comment1_Couchbase_Hello_world”

]

}{ “UUID”: “21f7f8de-8051-5b89-86“Time”: “2011-04-01T13:01:02.42“Server”: “A2223E”,

“Calling Server”: “A2213W”,“Type”: “E100”,“Initiating User”: “[email protected]”,“Details”:

{

“IP”: “10.1.1.22”,“API”: “InsertDVDQueueItem”,“Trace”: “cleansed”,“Tags”:

[

“SERVER”, “US-West”, “API”]

}

}

{

“_id”: “comment1_Couchbase_Hello_World”,

“format”: “markdown”,

“body”:”Awesome post!”

}

BLOG DOC

COMMENT

Page 31: Webinar - Making the Shift from Relational to NoSQL

Threaded Comments

• You can imagine how to take this to a threaded list

Blog

First comment

Reply to comment

More Comments

List

List

Advantages

• Only fetch the data when you need it• For example, rendering part of a web page

• Spread the data and load across the entire cluster

Page 32: Webinar - Making the Shift from Relational to NoSQL

Which one is the best?

Case by Case:

• How data are created by the application?

• How data are read by the application?

• Focus on the most common use case in the application

Note: changing document “schema” is really easy !

Page 33: Webinar - Making the Shift from Relational to NoSQL

Write your application

Page 34: Webinar - Making the Shift from Relational to NoSQL

Mental Adjustments

• In SQL we tend to want to avoid hitting the database as much as possible

• In Couchbase, gets and sets on documents are in memory. This is fast.

• The key to finding data is the key of the document.

• Many newcomers see views as a replacement for key design, because it seems more “SQL” –like.

Page 35: Webinar - Making the Shift from Relational to NoSQL

Code Change

• Replace your database calls by Couchbase calls:­ Find/Select by ID : couchbase.get(key);

­ Save/Update : couchbase.set(key, document);

­ Delete : couchbase.delete(key);

• If you need locking for concurrent access use CAS­ Check and Swap : Optimistic Locking

• All these operations are atomic

Page 36: Webinar - Making the Shift from Relational to NoSQL

Query the data

• Create Couchbase views to index your documents

• Use views to:­ Index documents on primary index

­ Index documents on secondary index

­ Query these indexes using range, aggregate, …

CRUD Operations MAP()

emit()

(processed)

Page 37: Webinar - Making the Shift from Relational to NoSQL

Couchbase Views

Page 38: Webinar - Making the Shift from Relational to NoSQL

Data Migration

Page 39: Webinar - Making the Shift from Relational to NoSQL

Data Migration

Move your data:• Using ETL or other tools• Own code

Page 40: Webinar - Making the Shift from Relational to NoSQL

Moving Data using an ETL

Page 41: Webinar - Making the Shift from Relational to NoSQL

DEMO

Moving Data Using TalendMoving Data Using Java Importer

Page 42: Webinar - Making the Shift from Relational to NoSQL

Conclusion

• Move to NoSQL to gain scalability and flexibility

• Design your document based on the application logic

• Migrate the data easily with tools

• Resources:

• Java Petstore Migration: http://goo.gl/mzRNqt

• Talend Connector : http://couchbase.com/couchbase-server/connectors/talend

• Couchbase SQL Importer : https://github.com/tgrall/couchbase-sql-importer

Download Couchbase: www.couchbase.com/download

Page 43: Webinar - Making the Shift from Relational to NoSQL

Questions?

Page 44: Webinar - Making the Shift from Relational to NoSQL

Thank you

Couchbase NoSQL Document Database

[email protected]

Page 45: Webinar - Making the Shift from Relational to NoSQL