Scalability: RDBMS vs. Other Data Stores
Uploaded by ramki-gaddipati
![Page 2: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/2.jpg)
Scalability
• Increase Resources → Increase Performance (Linearly)
• Performance?
– Latency, Capacity, Throughput
• Vertical Scalability (Scaling Up)
– Add resources to a single node; the workload can also be divided by functionality
• Horizontal Scalability (Scaling Out)
– Add nodes and divide the data among them
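"Divide the data" can be illustrated with a minimal sketch of hash-based sharding. The function names `shard_for` and `partition`, the key format, and the shard count are assumptions made for illustration; the slides do not prescribe a partitioning scheme.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash.

    hashlib is used instead of the built-in hash(), which is salted per process.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def partition(keys, num_shards):
    """Group keys by the shard that would own them."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


# Hypothetical data set: 1000 user keys spread over 4 shards.
keys = [f"user:{i}" for i in range(1000)]
shards = partition(keys, 4)
```

Plain modulo sharding like this remaps most keys when the shard count changes, which is one reason stores such as Dynamo use consistent hashing instead.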
![Page 3: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/3.jpg)
Relational Database
• Table, Row, Column
• Set, Item, Property
![Page 4: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/4.jpg)
Relational Theory
• Projection: SELECT (column list)
• Selection (filter): WHERE
• Join: JOIN, LEFT JOIN, RIGHT JOIN
• Correlation:
SELECT A.a FROM A WHERE A.b IN (SELECT B.b FROM B WHERE B.a > A.a)
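A correlated subquery of this shape can be tried with Python's built-in `sqlite3` module. This is a small sketch: the table contents are made-up sample rows, and the column references are fully qualified so the correlation with the outer row is explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (a INTEGER, b INTEGER);
    CREATE TABLE B (a INTEGER, b INTEGER);
    INSERT INTO A VALUES (1, 10), (5, 20), (9, 30);
    INSERT INTO B VALUES (3, 10), (4, 20), (2, 30);
""")

# The inner query is re-evaluated per outer row because it refers to A.a.
rows = conn.execute("""
    SELECT A.a FROM A
    WHERE A.b IN (SELECT B.b FROM B WHERE B.a > A.a)
    ORDER BY A.a
""").fetchall()
```

Only the row with `A.a = 1` qualifies here, because no row of `B` has `a` greater than 5 or 9. The per-row dependency on the outer query is what makes such queries expensive to distribute.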
![Page 5: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/5.jpg)
Relational Theory
• Aggregation
– Set Operators
  • Union, Intersection, Minus
– Group By
  • MAX, MIN, SUM, AVG
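The set operators and GROUP BY aggregates can be sketched with `sqlite3` as follows. The tables and rows are hypothetical, and SQLite spells the Minus operator `EXCEPT`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE online_orders (customer TEXT, amount INTEGER);
    CREATE TABLE store_orders  (customer TEXT, amount INTEGER);
    INSERT INTO online_orders VALUES ('ann', 10), ('bob', 20), ('ann', 5);
    INSERT INTO store_orders  VALUES ('bob', 7), ('cid', 3);
""")


def customers(op):
    """Combine the customer columns of both tables with a set operator."""
    sql = (
        "SELECT customer FROM online_orders "
        f"{op} SELECT customer FROM store_orders ORDER BY customer"
    )
    return [row[0] for row in conn.execute(sql)]


union = customers("UNION")          # customers in either table
intersection = customers("INTERSECT")  # customers in both tables
minus = customers("EXCEPT")         # online-only customers

# Aggregates computed per group of rows sharing the same customer.
totals = conn.execute("""
    SELECT customer, SUM(amount), MAX(amount)
    FROM online_orders GROUP BY customer ORDER BY customer
""").fetchall()
```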
![Page 6: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/6.jpg)
Transactions: Atomicity
• Transaction Level
– An entire logical operation is a transaction
– Multiple statements
• Statement Level
– Each statement is either successful or not, no partial success
– Multiple records
• Record Level
– All modifications to a record are successful or not
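Transaction-level atomicity can be sketched with `sqlite3`, where a connection used as a context manager commits on success and rolls back on an exception. The `account` table, the CHECK constraint, and the `transfer` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


overdraft_ok = transfer("a", "b", 150)   # second UPDATE violates the CHECK
after_failed = dict(conn.execute("SELECT id, balance FROM account"))
valid_ok = transfer("a", "b", 30)
after_valid = dict(conn.execute("SELECT id, balance FROM account"))
```

In the failed transfer the credit to `b` has already been applied when the debit fails, so the rollback is what keeps the two statements atomic as a unit.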
![Page 7: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/7.jpg)
Transactions: Consistency
• Integrity Constraints
• Referential Integrity
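A referential integrity check might look like the sketch below, using `sqlite3`. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; the `customer` and `orders` tables and the `try_insert_order` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders (oid INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customer(id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'ann')")


def try_insert_order(oid, customer_id):
    """Return True if the order was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (oid, customer_id))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert_order(100, 1)   # customer 1 exists
rejected = try_insert_order(101, 2)   # customer 2 does not exist
```

Every insert into `orders` requires a lookup in `customer`, which is cheap on one node and costly once the two tables live on different machines.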
![Page 8: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/8.jpg)
Transactions: Isolation Levels
• Serializable
– Concurrent transactions produce the same result as some serial order that takes the database from state A to state B
• Repeatable Read
– Any data read by a transaction stays the same until the transaction completes
• Non-Repeatable Read, aka Read Committed
– Two reads within a transaction may give different results
• Dirty Read, aka Read Uncommitted
– A transaction might read data that is later rolled back
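The difference between these read behaviours can be shown with a toy, single-writer model. This is purely an illustration under simplifying assumptions: `ToyStore` and its level names are hypothetical, and a real RDBMS implements isolation with locks or multi-versioning rather than like this.

```python
class ToyStore:
    """Toy store with one in-flight writer; not a real isolation mechanism."""

    def __init__(self, data):
        self.committed = dict(data)
        self.pending = {}  # uncommitted writes of the single writer

    def write(self, key, value):
        self.pending[key] = value

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()

    def snapshot(self):
        """Copy of committed data, taken when a reader's transaction starts."""
        return dict(self.committed)

    def read(self, key, level, snapshot=None):
        if level == "read_uncommitted":
            return self.pending.get(key, self.committed.get(key))
        if level == "read_committed":
            return self.committed.get(key)
        if level == "repeatable_read":
            return snapshot.get(key)
        raise ValueError(level)


store = ToyStore({"x": 1})
snap = store.snapshot()                               # reader starts here
store.write("x", 2)                                   # writer has not committed
dirty = store.read("x", "read_uncommitted")           # sees the pending write
first = store.read("x", "read_committed")             # sees the old value
store.commit()
second = store.read("x", "read_committed")            # non-repeatable read
stable = store.read("x", "repeatable_read", snap)     # unchanged for the reader
```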
![Page 9: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/9.jpg)
RDBMS Luxuries
• Multiple Indexes
• Auto Increments/Sequences
• Triggers
![Page 10: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/10.jpg)
Scalability in RDBMS
• Replication
– Read Replication (Master-Slave)
– Read Write Replication (Master-Master)
• Cluster
– Distributed Transaction
– Two-phase commits
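The two-phase commit mentioned above can be sketched as a coordinator that first collects votes and then broadcasts the decision. The `Participant` class and `two_phase_commit` function are hypothetical names; a real implementation also needs durable logging, timeouts, and recovery, which are omitted here.

```python
class Participant:
    """A resource manager that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]   # phase 1: voting
    if all(votes):
        for p in participants:                    # phase 2: commit
            p.commit()
        return True
    for p in participants:                        # phase 2: abort
        p.abort()
    return False


healthy = [Participant("db1"), Participant("db2")]
committed = two_phase_commit(healthy)

mixed = [Participant("db1"), Participant("db2", can_commit=False)]
aborted = two_phase_commit(mixed)
```

The coordinator blocks until every participant has answered, so a single slow or unreachable node delays the whole transaction.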
![Page 11: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/11.jpg)
Scalability Impediments
• Performance
– Sub-Queries/Correlation, Joins, Aggregates
– Referential Integrity constraints
• Basic Guarantee
– Consistency
– Availability
![Page 12: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/12.jpg)
CAP?
• Conjecture: Distributed systems cannot ensure all three of the following properties at once
– Consistency: The client perceives that a set of operations has occurred all at once.
– Availability: Every operation must terminate in an intended response.
– Partition tolerance: Operations will complete, even if individual components are unavailable.
![Page 13: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/13.jpg)
ACID to BASE
• Basically Available - system seems to work all the time
• Soft State - it doesn't have to be consistent all the time
• Eventually Consistent - becomes consistent at some later time
![Page 14: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/14.jpg)
BASE: An Example
BEGIN Transaction
  INSERT INTO ORDER (oid, timestamp, customer)
  FOREACH item IN itemList
    INSERT INTO ORDER_ITEM (oid, item.id, item.quantity, item.unitprice)
    //UPDATE INVENTORY SET quantity=quantity-item.quantity WHERE item = item.id
  COMMIT
END Transaction
Assume each statement is queued for execution. You will get COMMIT success.
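One way to read this example is that the order is acknowledged immediately while the inventory update is applied later from a queue. The sketch below assumes that reading; the in-memory dictionaries, the `pending` queue, and the function names are hypothetical stand-ins for real tables and a real message queue.

```python
from collections import deque

orders = {}                     # oid -> customer
order_items = []                # (oid, item_id, quantity)
inventory = {"widget": 10}      # item_id -> quantity on hand
pending = deque()               # deferred inventory updates


def place_order(oid, customer, items):
    """Record the order now and queue the inventory update for later."""
    orders[oid] = customer
    for item_id, quantity in items:
        order_items.append((oid, item_id, quantity))
        pending.append((item_id, quantity))  # not applied yet
    return "COMMIT success"


def drain_queue():
    """Apply the queued updates; after this the state is consistent again."""
    while pending:
        item_id, quantity = pending.popleft()
        inventory[item_id] -= quantity


status = place_order(1, "ann", [("widget", 3)])
stock_before_drain = inventory["widget"]   # still the old value (soft state)
drain_queue()
stock_after_drain = inventory["widget"]    # eventually consistent
```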
![Page 15: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/15.jpg)
Alternate Implementations
• BigTable – Google – CP
• HBase – Apache – CP
• Hypertable – Community – CP
• Dynamo – Amazon – AP
• SimpleDB – Amazon – AP
• Voldemort – LinkedIn – AP
• Cassandra – Facebook – AP
• MemcacheDB – Community – CP/AP
![Page 16: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/16.jpg)
Data Models
• Key/Value Pairs
– Dynamo, MemcacheDB, Voldemort
• Row-Column
– BigTable, Cassandra, SimpleDB, Hypertable, HBase
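The two data models can be contrasted with plain Python dictionaries. This is only a shape comparison under simplifying assumptions: the `put` and `get` helpers and the sample values are made up, and real stores add versioning, persistence, and distribution.

```python
# Key/value model: the store sees one opaque value per key.
kv_store = {}
kv_store["user:42"] = b'{"name": "ann", "city": "pune"}'

# Row-column model: row key -> column name -> value, so single
# columns of a row can be read or written independently.
table = {}


def put(row, column, value):
    table.setdefault(row, {})[column] = value


def get(row, column):
    return table.get(row, {}).get(column)


put("com.cnn.www", "anchor:www.c-span.org", "CNN")
put("com.cnn.www", "contents:", "<html>...</html>")
anchor = get("com.cnn.www", "anchor:www.c-span.org")
missing = get("com.cnn.www", "anchor:www.abc.com")
```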
![Page 17: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/17.jpg)
Programming Models
// Open the table
Table *T = OpenOrDie("/bigtable/web/webtable");
// Write a new anchor and delete an old anchor
RowMutation r1(T, "com.cnn.www");
r1.Set("anchor:www.c-span.org", "CNN");
r1.Delete("anchor:www.abc.com");
Operation op;
Apply(&op, &r1);
![Page 18: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/18.jpg)
BigTable: Consistent yet Infinitely Scalable
• Single Master
• B+ tree based data distribution
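Range-based data distribution can be approximated by a sorted list of tablet boundaries searched with binary search. This is a single-level simplification of the tree-like lookup; the boundary keys and server names are hypothetical.

```python
import bisect

# Each tablet serves row keys up to and including its end key;
# the last tablet is open-ended. Keys are kept in sorted order.
tablet_end_keys = ["g", "n", "t"]
tablet_servers = ["server-1", "server-2", "server-3", "server-4"]


def locate(row_key):
    """Find the server whose row range contains row_key."""
    index = bisect.bisect_left(tablet_end_keys, row_key)
    return tablet_servers[index]


first = locate("com.cnn.www")   # sorts before "g"
boundary = locate("g")          # an end key belongs to its own tablet
middle = locate("org.apache")   # falls between "n" and "t"
last = locate("zebra")          # after the last boundary
```

Because rows are kept sorted by key, each row lives on exactly one tablet server at a time, which is what lets a row mutation stay consistent without cross-node coordination.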
![Page 19: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/19.jpg)
BigTable: Transactions
• Entities and Entity Groups
– Example entity group: Invoice, Invoice Item, Delivery Note
![Page 20: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/20.jpg)
Dynamo: Highly available and Infinitely Scalable
• Consistent Hashing
• Peer to Peer Distributed
• Gossip based member discovery
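Consistent hashing can be sketched as a ring of hashed node positions, where a key belongs to the first node position at or after its own hash. The `HashRing` class, the virtual-node count, and the node names are assumptions for illustration, not Dynamo's actual implementation.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable position on the ring for a string."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes      # virtual positions per physical node
        self._ring = []           # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First position at or after the key's hash, wrapping around the ring.
        index = bisect.bisect_left(self._ring, (_hash(key),))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(200)]
before = {k: ring.node_for(k) for k in keys}
ring.add("node-d")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Adding `node-d` only moves keys onto `node-d`; every other key keeps its owner, which is the property that makes adding and removing peers cheap.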
![Page 21: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/21.jpg)
RDBMS or Other?
• Nature of Business
• Maturity of the Product
• Cost of Adoption
• Maturity of the alternative Datastores
![Page 22: Scalability: Rdbms Vs Other Data Stores](https://reader035.fdocuments.net/reader035/viewer/2022062513/555098fcb4c9058b208b47e4/html5/thumbnails/22.jpg)
Q&A