How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling
![Page 1: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/1.jpg)
Couchbase 103
Todd Greenstein | Engineering, Couchbase
![Page 2: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/2.jpg)
Modeling in NOSQL vs RDBMS
![Page 3: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/3.jpg)
The basics of Couchbase Server
• Key-value store with:
- special support for JSON documents
- counter and string data types
- storage for binaries up to 20MB
• Built-in and transparent memcached-compatible caching layer
• Distributed around a cluster of servers
• Secondary indexes generated using map/reduce queries
©2014 Couchbase, Inc. 3
![Page 4: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/4.jpg)
RDBMS Modeling
• An RDBMS organizes data as tables
- Tables represent data in rows: n columns by m rows
- Table rows have a specific schema; each column has a static type
- Simple datatypes (strings, numbers, datetimes, booleans) can be represented by columns in a single table
- Complex datatypes (dictionaries/hashes, arrays/lists) are difficult to represent in a single table [impedance mismatch]
• All rows have an identical schema; schema changes are painful and resource intensive
• Reading, writing, and transactions require locking
![Page 5: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/5.jpg)
Couchbase – NoSQL Modeling
• Couchbase operates like a key-value document store
- Simple datatypes: strings, numbers, datetimes, and booleans can be stored; binary data can be stored as Base64-encoded strings
- Complex datatypes: dictionaries/hashes and arrays/lists can be stored in JSON format (simple lists can be string-based with a delimiter)
- JSON is a special class of string with a specific format for encoding simple and complex data structures
• Schema is unenforced and implicit; schema changes are programmatic, done online, and can vary from document to document
• Document-defined schema: "schema-less" is misleading and inaccurate
![Page 6: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/6.jpg)
Applying the Technology to the Problem
![Page 7: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/7.jpg)
Relational databases are optimised for questions
![Page 8: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/8.jpg)
Simple ecommerce example
![Page 9: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/9.jpg)
RDBMS Complex Datatypes
public class User {
    private String name;
    private String email;
    private Integer age;
    private Boolean gender_male;
    private DateTime created_at;
    private ArrayList items_viewed;
    private Hashtable preferences;
    private ArrayList<Books> authored;

    public User(...) {
        ...
    }
    ...
}
• Simple types are easy: make them columns
• Complex types are more challenging: they require separate tables and joins, and are slower to store and retrieve
• ORMs reduce complexity but cost speed and scale, and are hard to optimize
![Page 10: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/10.jpg)
Document databases are optimised for answers
![Page 11: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/11.jpg)
That order in a heavily denormalised document database
![Page 12: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/12.jpg)
Answer oriented databases
order::1001
{
  "uid": "ji22jd",
  "customer": "Ann",
  "line_items": [
    { "sku": "0321293533", "quan": 3, "unit_price": 48.0 },
    { "sku": "0321601912", "quan": 1, "unit_price": 39.0 },
    { "sku": "0131495054", "quan": 1, "unit_price": 51.0 }
  ],
  "payment": { "type": "Amex", "expiry": "04/2001", "last5": "12345" }
}
![Page 13: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/13.jpg)
Why optimise for a certain set of questions?
• Storing together the data that we access together is efficient
• SQL queries that assemble an aggregate at read time are slower than reading a document that is already aggregated
• Aggregated documents are easy to distribute
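As a rough illustration of "storing together the data that we access together", the Java sketch below computes an order total from the embedded line items of the order::1001 document shown earlier. The class and method names (`OrderTotalSketch`, `LineItem`, `total`) are hypothetical, and the in-memory list simply stands in for a document that has already been fetched by key; none of this is Couchbase SDK code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory shape of the order::1001 document; not SDK code.
final class OrderTotalSketch {

    // One entry of the embedded line_items array.
    static final class LineItem {
        final String sku;
        final int quantity;
        final double unitPrice;

        LineItem(String sku, int quantity, double unitPrice) {
            this.sku = sku;
            this.quantity = quantity;
            this.unitPrice = unitPrice;
        }
    }

    // Sums quantity * unit price over the embedded items; no further reads or joins.
    static double total(List<LineItem> lineItems) {
        double sum = 0.0;
        for (LineItem item : lineItems) {
            sum += item.quantity * item.unitPrice;
        }
        return sum;
    }

    // The three line items from the order::1001 slide.
    static List<LineItem> sampleOrder() {
        return Arrays.asList(
            new LineItem("0321293533", 3, 48.0),
            new LineItem("0321601912", 1, 39.0),
            new LineItem("0131495054", 1, 51.0));
    }

    public static void main(String[] args) {
        System.out.println(total(sampleOrder()));
    }
}
```

Because everything the calculation needs lives in one document, the whole answer comes from a single key read; a normalised model would typically need a join across order and line-item tables.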
![Page 14: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/14.jpg)
Serialization
public class User {
    private String name;
    private String email;
    private Integer age;
    private Boolean gender_male;
    private DateTime created_at;
    private ArrayList items_viewed;
    private Hashtable preferences;
    private ArrayList<Books> authored;

    public User(...) {
        ...
    }
    ...
}
"User": {
  "name": "jack benny",
  "email": "[email protected]",
  "age": "39",
  "gender": "male",
  "created_at": "October 13, 2014 11:13:00",
  "items_viewed": { … },
  "preferences": { … },
  "books": { … }
}
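To make the class-to-JSON mapping concrete, here is a minimal hand-written serializer in Java for a trimmed-down user. `UserJsonSketch` and its three fields are hypothetical simplifications of the `User` class above, the `quote` helper escapes only backslashes and double quotes, and a real application would normally use a JSON library instead. Unlike the slide, this sketch emits `age` as a number and `items_viewed` as an array.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, trimmed-down user with a hand-written toJson(); a sketch only.
final class UserJsonSketch {
    final String name;
    final int age;
    final List<String> itemsViewed;

    UserJsonSketch(String name, int age, List<String> itemsViewed) {
        this.name = name;
        this.age = age;
        this.itemsViewed = itemsViewed;
    }

    // Wraps a string in double quotes, escaping backslashes and quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Simple fields become JSON scalars; the list becomes a JSON array.
    String toJson() {
        StringBuilder items = new StringBuilder();
        for (String item : itemsViewed) {
            if (items.length() > 0) {
                items.append(",");
            }
            items.append(quote(item));
        }
        return "{\"name\":" + quote(name)
            + ",\"age\":" + age
            + ",\"items_viewed\":[" + items + "]}";
    }

    public static void main(String[] args) {
        UserJsonSketch user =
            new UserJsonSketch("jack benny", 39, Arrays.asList("p::8", "p::33"));
        System.out.println(user.toJson());
    }
}
```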
![Page 15: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/15.jpg)
Denormalization
![Page 16: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/16.jpg)
Denormalisation
You could think that denormalisation is a credo of NoSQL.
In the real world, we denormalise all the time in Couchbase.
We have to decide when to embed data (i.e. denormalise) and when to refer to data.
![Page 17: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/17.jpg)
When to embed
You should embed data when:
• You need speed of access (less of a concern with Couchbase)
• Reads outnumber writes
• You are comfortable with the slim risk of two denormalised occurrences of the same data losing sync, or you understand the programming models around these conditions
![Page 18: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/18.jpg)
When to refer
You should refer to data when:
• Query flexibility is important
• Consistency is a priority
• The data has large growth potential
![Page 19: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/19.jpg)
Schema unenforced
Usually, there’s still a schema when we use Couchbase.
The difference is:
• Couchbase doesn’t enforce the schema
• If schema matters, you can enforce it on the application side
• Schema can vary completely from document to document
• Migrations are cheap and asynchronous
• Impedance mismatch is yesterday’s problem
• It’s still okay to store unstructured data
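One way to enforce a schema on the application side is to check each document before it is written. The Java sketch below assumes the document has already been parsed into a `Map<String, Object>`; the class name `UserSchemaCheck`, the choice of required fields (`name`, `email`), and the optional `age` field are illustrative assumptions, not rules from the webinar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical application-side schema check for a user document.
final class UserSchemaCheck {

    // Returns a list of problems; an empty list means the document passes.
    static List<String> validate(Map<String, Object> doc) {
        List<String> problems = new ArrayList<>();
        requireType(doc, "name", String.class, problems);
        requireType(doc, "email", String.class, problems);
        // Optional field: its type is only checked when the field is present.
        if (doc.containsKey("age")) {
            requireType(doc, "age", Integer.class, problems);
        }
        return problems;
    }

    // Records a problem if the field is absent (or null) or has the wrong type.
    private static void requireType(Map<String, Object> doc, String field,
                                    Class<?> type, List<String> problems) {
        Object value = doc.get(field);
        if (value == null) {
            problems.add("missing field: " + field);
        } else if (!type.isInstance(value)) {
            problems.add("wrong type for field: " + field);
        }
    }
}
```

Because the database does not enforce anything, older documents that fail the check can still be read and migrated lazily, which is what keeps migrations cheap and asynchronous.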
![Page 20: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/20.jpg)
The key is the key
![Page 21: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/21.jpg)
Three ways to build a key
Key design is as important as document design.
There are three broad types of key:
• Human readable/deterministic: e.g. an email address
• Computer generated/random: e.g. a UUID
• Compound: e.g. a UUID with a deterministic portion
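The three key types can be sketched as small Java helpers. The class and method names are hypothetical, trimming and lower-casing the email is an assumption about how an application might normalise it, and the "::" separator follows the key style used on the later slides.

```java
import java.util.Locale;
import java.util.UUID;

// Hypothetical helpers illustrating the three broad key types.
final class KeySketch {

    // Deterministic: derived from data the application already knows.
    static String deterministicKey(String email) {
        return email.trim().toLowerCase(Locale.ROOT);
    }

    // Random: generated by the computer, with no meaning of its own.
    static String randomKey() {
        return UUID.randomUUID().toString();
    }

    // Compound: an existing key plus a predictable, deterministic suffix.
    static String compoundKey(String baseKey, String suffix) {
        return baseKey + "::" + suffix;
    }

    public static void main(String[] args) {
        String base = randomKey();
        System.out.println(compoundKey(base, "productsviewed"));
    }
}
```

A deterministic key can be rebuilt from user input with no look-up, while a random key never changes when the user's details do; a compound key tries to keep both properties.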
![Page 22: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/22.jpg)
Human readable/deterministic
public class user {
    private String name;
    private String email;
    private String streetAddress;
    private String city;
    private String country;
    private String postCode;
    private String telephone;
    private Array orders;
    private Array productsViewed;
}
{
  "name": "Matthew Revell",
  "address": "11-21 Paul Street",
  "city": "London",
  "postCode": "EC2A 4JU",
  "telephone": "44-20-3837-9130",
  "orders": [ 1, 9, 698, 32 ],
  "productsViewed": [ 8, 33, 99, 100 ]
}
Key: [email protected]
![Page 23: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/23.jpg)
Random/computer generated
{
  "name": "Matthew Revell",
  "email": "[email protected]",
  "address": "11-21 Paul Street",
  "city": "London",
  "postCode": "EC2A 4JU",
  "telephone": "44-20-3837-9130",
  "orders": [ 1, 9, 698, 32 ],
  "productsViewed": [ 8, 33, 99, 100 ]
}
Key: 1001
![Page 24: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/24.jpg)
Multiple look-up documents
u::count
1001
u::1001
{ "name": "Matthew Revell",
"facebook_id": 16172910,
"email": "[email protected]",
"password": "ab02d#Jf02K",
"created_at": "5/1/2012 2:30am",
"facebook_access_token": "xox0v2dje20",
"twitter_access_token": "20jffieieaaixixj" }
fb::16172910
1001
nflx::2939202
1001
twtr::2920283830
1001
1001
1001
uname::mrevell
1001
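The look-up pattern above can be sketched in Java with a plain `HashMap` standing in for the key-value store. Everything here is an assumption for illustration: `LookupSketch` is a hypothetical class, the counter starts at 1000 so that the first user becomes `u::1001` as on the slide, and the increment is not atomic the way a real server-side counter would be.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a key-value store; not Couchbase SDK code.
final class LookupSketch {
    private final Map<String, String> store = new HashMap<>();

    // Writes the user document plus one look-up document per alternative identifier.
    String createUser(String userJson, String facebookId, String username) {
        // Bump the u::count counter to obtain the next user id.
        long id = Long.parseLong(store.getOrDefault("u::count", "1000")) + 1;
        store.put("u::count", Long.toString(id));

        String userKey = "u::" + id;
        store.put(userKey, userJson);
        // Look-up documents hold only the user id, as on the slide.
        store.put("fb::" + facebookId, Long.toString(id));
        store.put("uname::" + username, Long.toString(id));
        return userKey;
    }

    // Two key reads: the look-up document first, then the user document.
    String findByFacebookId(String facebookId) {
        String id = store.get("fb::" + facebookId);
        return id == null ? null : store.get("u::" + id);
    }

    public static void main(String[] args) {
        LookupSketch sketch = new LookupSketch();
        sketch.createUser("{\"name\":\"Matthew Revell\"}", "16172910", "mrevell");
        System.out.println(sketch.findByFacebookId("16172910"));
    }
}
```

Each extra identifier costs one small document at write time, and in exchange every read stays a direct key access rather than a query.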
![Page 25: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/25.jpg)
Compound keys
Compound keys are look-up documents with a predictable name.
It’s a continuation of the embedded versus referred data discussion.
![Page 26: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/26.jpg)
Compound keys: example
u::1001
{
"name": "Matthew Revell",
"email": "[email protected]",
"address": "11-21 Paul Street",
"city": "London",
"postCode": "EC2A 4JU",
"telephone": "44-20-3837-9130",
"orders": [ 1, 9, 698, 32 ],
"productsViewed": [8, 33, 99, 100]
}
![Page 27: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/27.jpg)
Compound keys: example
u::1001
{
"name": "Matthew Revell",
"email": "[email protected]",
"address": "11-21 Paul Street",
"city": "London",
"postCode": "EC2A 4JU",
"telephone": "44-20-3837-9130",
"orders": [ 1, 9, 698, 32 ]
}
u::1001::productsviewed
{"productsList": [
8, 33, 99, 100]
}
![Page 28: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/28.jpg)
Compound keys: example
u::1001
{
"name": "Matthew Revell",
"email": "[email protected]",
"address": "11-21 Paul Street",
"city": "London",
"postCode": "EC2A 4JU",
"telephone": "44-20-3837-9130",
"orders": [ 1, 9, 698, 32 ]
}
u::1001::productsviewed
{"productsList": [
8, 33, 99, 100]
}
p::8
{
"id": 1,
"name": "T-shirt",
"description": "Red Couchbase shirt",
"quantityInStock": 99,
"image": "tshirt.jpg"
}
![Page 29: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/29.jpg)
Compound keys: example
u::1001
{
"name": "Matthew Revell",
"email": "[email protected]",
"address": "11-21 Paul Street",
"city": "London",
"postCode": "EC2A 4JU",
"telephone": "44-20-3837-9130",
"orders": [ 1, 9, 698, 32 ]
}
u::1001::productsviewed
{"productsList": [
8, 33, 99, 100]
}
p::8
{
"id": 1,
"name": "T-shirt",
"description": "Red Couchbase shirt",
"quantityInStock": 99
}
p::8::img
"http://someurl.com/tshirt.jpg"
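Because compound keys have predictable names, related documents can be reached by building the key rather than querying. The Java sketch below derives the `u::<id>::productsviewed` and `p::<id>::img` keys from the example and resolves image URLs with key reads only. `CompoundKeySketch` is a hypothetical name, and the two typed maps are stand-ins for documents holding a list and a string respectively; they are not part of any Couchbase API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helpers for the compound keys used in the example slides.
final class CompoundKeySketch {

    static String productsViewedKey(long userId) {
        return "u::" + userId + "::productsviewed";
    }

    static String imageKey(long productId) {
        return "p::" + productId + "::img";
    }

    // Resolves the image URL of every product a user viewed, using key reads only.
    static List<String> viewedImageUrls(long userId,
                                        Map<String, List<Long>> listDocs,
                                        Map<String, String> stringDocs) {
        List<String> urls = new ArrayList<>();
        List<Long> viewed = listDocs.getOrDefault(
            productsViewedKey(userId), Collections.<Long>emptyList());
        for (long productId : viewed) {
            String url = stringDocs.get(imageKey(productId));
            if (url != null) {
                // Products without an image document are skipped.
                urls.add(url);
            }
        }
        return urls;
    }
}
```

Splitting the frequently growing list and the image reference into their own documents means the main `u::1001` and `p::8` documents stay small and are rewritten less often.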
![Page 30: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/30.jpg)
What about automatic indexes?
Couchbase views and N1QL are amazing.
You should use them where:
• You discover new query patterns.
• You have short-lived query types.
• You need ad-hoc querying.
However: user-defined indexes should be your first port of call.
![Page 31: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/31.jpg)
Demo
Couchbase + Node.js + Express + Bootstrap
![Page 32: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/32.jpg)
Demo Presentation
{
"name": "Aliza Kshlerin",
"username": "Felicita_Reichert61",
"email": "[email protected]",
"address": {
"street": "Ericka Route",
"suite": "Apt. 077",
"city": "Effertzfurt",
"zipcode": "83625",
"geo": {
"lat": "15.5566",
"lng": "-109.3184"
}
},
"phone": "082-502-1159",
"website": "trace.com",
"company": {
"name": "Altenwerth, Sawayn and Kiehn",
"catchPhrase": "Face to face upward-trending matrices",
"bs": "vertical aggregate infrastructures"
}
}
Mock User, generated using faker.js
• Wonderful Library for Testing
• Easily used with node
• More info: https://github.com/marak/Faker.js/
![Page 33: How-To NoSQL 3.0 Webinar Series: Couchbase 103 - Data Modeling](https://reader034.fdocuments.net/reader034/viewer/2022042701/559c1d221a28abc7298b4575/html5/thumbnails/33.jpg)
Further Information
Couchbase Node.js Client API Reference: http://docs.couchbase.com/sdk-api/couchbase-node-client-2.0.0/
N1QL Documentation:
• http://docs.couchbase.com/developer/n1ql-dp3/n1ql-intro.html
Next Session:
• Couchbase 104: Views and Indexes, on 11/19/2014. In this installment, explore the power of creating views and indexes in Couchbase. Learn the underlying view architecture for how views and indexes are built in Couchbase. Explore strategies for creating performant and efficient lookups of data stored within the database, including custom reduce operations.