5NoSQL - TU Kaiserslautern€¦ · 5/23/13 1 Distributed&DataManagement SummerSemester2013&...
Transcript of 5NoSQL - TU Kaiserslautern€¦ · 5/23/13 1 Distributed&DataManagement SummerSemester2013&...
5/23/13
1
Distributed Data Management Summer Semester 2013
TU Kaiserslautern
Dr.-‐Ing. Sebas9an Michel
[email protected]‐saarland.de
Distributed Data Management, SoSe 2013, S. Michel 1
Oral Exams: Available Dates
• 7. August 2013 (Wednesday) • 8. August (Thursday) • 9. August (Friday) • 12. August (Monday) • 13. August (Tuesday) • 14. August (Wednesday)
• Star9ng 9am each, 5-‐6 slots, 30min (20+10) each
Distributed Data Management, SoSe 2013, S. Michel 2
Mark the dates in your calendar. Good idea is to be available on 3
of these days.
Assignment of slots will be made later, aWer “official” registra9on.
Things Captured • Roughly everything captured in the lecture. • No need to learn detailed system issues of Hadoop or other systems
• But able to write MapReduce func9ons, e.g. • We assume ac9ve par9cipa9on in exercises for the exam, the en9re sheet is relevant (again, you don’t have to code, but understand)
• Example ques9ons or points are discussed in the lecture.
Distributed Data Management, SoSe 2013, S. Michel 3
NOSQL Lecture 5
Distributed Data Management, SoSe 2013, S. Michel 4
The NoSQL “Movement” • No one-‐size-‐fits-‐all database technology • Not only SQL (not necessarily “no” SQL at all) • Stands for group of non-‐tradi9onal DBMS (not rela9onal, oWen no SQL), for different purposes – key/value stores – graph databases – document stores – (also MapReduce)
Distributed Data Management, SoSe 2013, S. Michel 5
Overview of systems: nosql-‐database.org/
NoSQL-‐Systems Characteris9cs
• Non-‐rela9onal data model • Systems are designed for distributed (scale-‐out) architecture. “Sharding”: one logical DB => mul9ple machines
• No or lijle schema • Simple API (mainly no full SQL support); CRUD • Mainly not full ACID seman9cs. Instead BASE. • OWen such systems are open-‐source projects. Distributed Data Management, SoSe 2013, S. Michel 6
5/23/13
2
Literature: Books • Seven Databases in Seven Weeks: A Guide to Modern
Databases and the NoSQL Movement byEric Redmond and Jim R. Wilson.
• In German: NoSQL: Eins9eg in die Welt nichtrela9onaler Web 2.0 Datenbanken von Stefan Edlich, et al., 2nd edi9on.
• NoSQL Dis9lled: A Brief Guide to the Emerging World of Polyglot Persistence. Pramodkumar J. Sadalage and Mar9n Fowler.
Distributed Data Management, SoSe 2013, S. Michel 7
NoSQL: Key/Value Stores
• Simple storage for key-‐>value pairs • Value can be complex data type • Example systems: Amazon Dynamo, Redis, Voldemort
• Simple “CRUD” opera9ons: Create, Read, Update, Delete
• Some allow more complex queries, e.g., over ranges
Distributed Data Management, SoSe 2013, S. Michel 8
Example Key/Value Store: Redis • hjp://try.redis.io/ <= check this out!
SET name “ddm13” GET name #ddm13 LPUSH list “a” LPUSH list “b” LLENGTH #2 LRANGE list 0 1 # “b”, “a”
Distributed Data Management, SoSe 2013, S. Michel 9
List opera9ons
NoSQL: Document Stores • Store JSON (Javascript Object Nota9on) or XML documents
• Examples: MongoDB, CouchDB
Distributed Data Management, SoSe 2013, S. Michel 10
{ "firstName": "John", "lastName": "Smith", "age": 25, "address": { "streetAddress": "21 2nd Street", "city": "New York", "state": "NY", "postalCode": 10021 }, "phoneNumbers": [ { "type": "home", "number": "212 555-‐1234" }, { "type": "fax", "number": "646 555-‐4567" } ] }
Example: MongoDB • hjp://try.mongodb.org/
var student = {name: 'Jim', scores: [75, 99, 87.2]}; db.lecture.store(student); db.lecture.find(); #returns all entries db.lecture.find({name: "Jim"}); #specific search db.users.update({name: 'Johnny'}, {name: 'Cash', languages: ['english']}); and you can use mapreduce!
Distributed Data Management, SoSe 2013, S. Michel 11
NoSQL: Graph Databases • Data model: Graph (Ver9ces, Edges)
• Queries: Shortest path, connected components, “followers”, …
• Examples: Neo4j, InfiniteGraph, GraphBase
• Specifically address graph queries. Aim: higher performance than “self joins” in RDBMS to traverse graph
Distributed Data Management, SoSe 2013, S. Michel 12
A
D F
C
B
E
5/23/13
3
Example: Neo4j • hjp://www.neo4j.org/learn/try
Graph Setup: start root=node(0) create (Neo {name:'Neo'}), (Morpheus {name: 'Morpheus'}), (Trinity {name: 'Trinity'}), (Cypher {name: 'Cypher'}), (Smith {name: 'Agent Smith'}), (Architect {name:'The Architect'}), root-‐[:ROOT]-‐>Neo, Neo-‐[:KNOWS]-‐>Morpheus, Neo-‐[:LOVES]-‐>Trinity, Morpheus-‐[:KNOWS]-‐>Trinity, Morpheus-‐[:KNOWS]-‐>Cypher, Cypher-‐[:KNOWS]-‐>Smith, Smith-‐[:CODED_BY]-‐>Architect Query: start n=node:node_auto_index(name='Neo') match n-‐[r:KNOWS*]-‐m return n as Neo,r,m
Distributed Data Management, SoSe 2013, S. Michel 13
NoSQL: Column Stores
• Logically “look” like common rela9onal databases (tables)
• Physically organized in a per column fashion • Good for analy9cal tasks over subsets of columns • Dynamic schema, sparse data
• Example systems: BigTable, HBase, Cassandra, Hypertable
Distributed Data Management, SoSe 2013, S. Michel 14
HBase • Scalable data store • Made for sparse but wide tables: >millions of rows and/or columns
• Hadoop (Apache) project, works also together with Hadoop MapReduce.
• Modeled aWer BigTable* (again first introduced by Google engineers)
Online “book”: hjp://hbase.apache.org/book/book.html *Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, Robert Gruber: Bigtable: A Distributed Storage System for Structured Data. OSDI 2006: 205-‐218
Distributed Data Management, SoSe 2013, S. Michel 15
The Hadoop Stack of Tools/Infrastructure
Distributed Data Management, SoSe 2013, S. Michel 16
HDFS (Hadoop Distributed File System)
Pig (Data Flow)
Hive (SQL)
MapReduce (Distributed Programming Framework)
HCatalog (Metadata)
HBase (Columnar Storage)
Languages
Computa9on
Table storage
Storage
Zookeepe
r (Coo
rdinater)
HBase Data Model • “Table” layout of data • Each row is iden9fied by a key Columns are grouped by so called column families
• Inside column family: columns and their values
Distributed Data Management, SoSe 2013, S. Michel 17
row1
row2
column family 1 column family 2
A=>10
D=>”hello” X=> 2.7182
X=> 3.14159
HBase Data Model: Map (Dic9onary) Style
Key_1 => COLUMN_FAMILY_ONE => A => Value B => Value
=> COLUMN_FAMILY_TWO => C => Value Key_2 => COLUMN_FAMILY_ONE => A => Value => COLUMN_FAMILY_TWO => D => Value
E => Value ….
Distributed Data Management, SoSe 2013, S. Michel 18
5/23/13
4
Column Families and Columns
• column family iden9fier has to be “printable” name
• need to be defined at table crea9on
• columns of a certain column family are stored together, physically
• column names can be arbitrary • columns can be added at run9me
Distributed Data Management, SoSe 2013, S. Michel 19
HBase Shell Example Usage create 'cars', ’features’ put 'cars', 'row1', ’features:make', 'bmw’ put 'cars', 'row1', ’features:model', '5 series’ put 'cars', 'row1', ‘features:year', '2012’ put 'cars', 'row2', ’features:make', 'mercedes’ put 'cars', 'row2', ’features:model', 'e class’ put 'cars', 'row2', ’features:year', '2011'
Distributed Data Management, SoSe 2013, S. Michel 20
Read on: hjp://akbarahmed.com/2012/08/13/hbase-‐command-‐line-‐tutorial/
table row id/key
column family ajribute value
HBase (Shell) Queries scan 'cars’ // returns all entries of this table get 'cars', 'row1’ //get row with key row1 //select only specific columns scan 'cars', {COLUMNS => [’features:make']} //restrict results to a certain number scan 'cars', {COLUMNS => [’features:make'], LIMIT => 1}
Distributed Data Management, SoSe 2013, S. Michel 21
No SQL, No Joins
HBase Java Example Configura9on config = HBaseConfigura9on.create(); HTable table = new HTable(config, ”cars"); Put p = new Put(Bytes.toBytes(”rowid")); p.add(Bytes.toBytes(”features"), Bytes.toBytes(”make"), Bytes.toBytes(”bmw")); table.put(p);
Distributed Data Management, SoSe 2013, S. Michel 22
column family
ajribute value
row id/key
HBase: Other Proper9es • Time to Live (TTL) can be set per column family
• Mul9ple versions of can be kept • Bloomfilter (remember?) can be used to speed up looking up if row/column exists (inser9ng row + column family + column family qualifier, or just row)
• Consistency guaranteed per single row, using Mul9version concurrency control (MVCC)
• Use of write-‐ahead logging (WAL) Distributed Data Management, SoSe 2013, S. Michel 23
HBase Architecture • rows are assigned to so called RegionServers
Distributed Data Management, SoSe 2013, S. Michel 24 source: larsgeorge.com
5/23/13
5
NoSQL: Wide Spectrum
• Systems come with different proper9es. • In-‐memory vs. disk based. • ACID vs. BASE • CRUD, SQL (subset?), or MapReduce support
• Different systems for specific requirements. OWen triggered by demands inside companies.
• Like Voldemort at Linkedin, BigTable (also Hadoop) at Google, etc.
Distributed Data Management, SoSe 2013, S. Michel 25
Overview of Forthcoming Topics
• Consistency: CAP Theorem. ACID vs. Base.
• Placement of data/nodes in network: Consistent hashing.
• Ordering of events: Vector Clocks
• Will look at sample systems, with hands-‐on experience through exercises.
Distributed Data Management, SoSe 2013, S. Michel 26
Wanted Proper9es • Data should be always consistent
• Provided service should be always quickly responding to requests
• Data can be (is) distributed across many machines (par99ons)
• Even if some machines fail, the system should be up and running
Distributed Data Management, SoSe 2013, S. Michel 27
Recap ACID paradigm of tradi9onal RDBMS
• Atomicity – transac9on is executed as a whole or not at all
• Consistency – transac9on leaves database in consistent state
• Isola9on – single transac9on sees database as if it is the only transac9on (client)
• Durability – changed of a successfully finished (commijed) transac9on is persistently stored
Distributed Data Management, SoSe 2013, S. Michel 28
CAP Theorem (Brewer's Theorem)
• System cannot provide all 3 proper9es at the same 9me: – Consistency – Availability – Par99on Tolerance
Distributed Data Management, SoSe 2013, S. Michel 29
C A
P C+P A+P
C+A
hjp://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-‐SigAct.pdf
Consistent + Available
• No support of mul9ple par99ons.
• Strong (ACID) consistency enforced.
• Example: Single-‐site database
Distributed Data Management, SoSe 2013, S. Michel 30
C A C+A
5/23/13
6
Par99on Tolerant + Available
• Example: Domain Name Service (DNS)
Distributed Data Management, SoSe 2013, S. Michel 31
A
P A+P
Consistent + Par99on Tolerant
• Example: Distributed Databases with distributed locking/commit, HBase
Distributed Data Management, SoSe 2013, S. Michel 32
C
P C+P
Can’t do without “P” • Large data => Scale out architecture => Par99on Tolerance is a strict requirement
• Leaves: Trading off consistency and availability
Distributed Data Management, SoSe 2013, S. Michel 33
C A
P C+P A+P
Idea
• Trade off consistency and availability • Write changes to subset of machines • Read data from subsets of machines
• Maintain mul9ple versions per data item • Resolve conflicts based on version mismatch once it occurs (be op9mis9c)
Distributed Data Management, SoSe 2013, S. Michel 34
Generic Architecture
Distributed Data Management, SoSe 2013, S. Michel 35
node1 nodeN node2
Client
Coordinator
write request
write requests (or subset and sync between nodes)
response
ACK
Reads and Writes to Subset of Nodes before Returning to Client
Distributed Data Management, SoSe 2013, S. Michel 36
node1 D0
node2 D0
node3 D0
node4 D0
node5 D0 node6
D0
write
write
read
read
5/23/13
7
Writes and Sync
Distributed Data Management, SoSe 2013, S. Michel 37
node1 node2 D0 D0
write
node1 node2 D1 D0
write
node1 node2 D1 D1
9me
sync
Architecture/Problem • Writes are propagated to machines + sync between
• Can arrive (and be acknowledged) at different 9mes.
• Leading to inconsistencies
Distributed Data Management, SoSe 2013, S. Michel 38
Client gets ACK aWer write
All nodes have same values for data
Time range of possibly inconsistent data
Inconsistent Read
Distributed Data Management, SoSe 2013, S. Michel 39
node1 node2 D0 D1
read
read
different data “version” Can be resolved by using the latest version.
Consistent Read
Distributed Data Management, SoSe 2013, S. Michel 40
node1 node2 D0 D0
read
read
CAP Theorem: Proof Idea
• Consider a system with mul9ple par99ons • Failure prevents sync between node1 and node2.
• Now? Prohibit reads un9l synced? (system not available). Or let clients read (system not consistent)
Distributed Data Management, SoSe 2013, S. Michel 41
node1 node2 D1 D0
sync?
Best effort: BASE
• Basically Available • SoW State • Eventual Consistency
Distributed Data Management, SoSe 2013, S. Michel 42
see hjp://www.allthingsdistributed.com/2007/12/eventually_consistent.html W. Vogels. Eventually Consistent. ACM Queue vol. 6, no. 6, December 2008.
5/23/13
8
Eventual Consistency
• AWer a write, data can be at some nodes/machines inconsistent
• But will eventually(!) become consistent again
Distributed Data Management, SoSe 2013, S. Michel 43
Literature (Other than books) • hjp://www.cs.berkeley.edu/~brewer/cs262b-‐2004/PODC-‐keynote.pdf • Seth Gilbert, Nancy A. Lynch: Brewer's conjecture and the feasibility of
consistent, available, par99on-‐tolerant web services. SIGACT News 33(2): 51-‐59 (2002)
• hjp://hadoop-‐hbase.blogspot.de/2012/03/acid-‐in-‐hbase.html
Distributed Data Management, SoSe 2013, S. Michel 44