Dynamo: Amazon's Highly Available Key-Value Store Offense: Jori and Ning.

Outline

• Presentation (Ning)
• Symmetry (Jori)
• WAN considerations (Ning)
• Consistency (Jori)
• Disaster Recovery (Ning)
• Minor Quibbles (Jori, Ning)

Presentation (Ning)

• Dynamo:
  o The basic functions are simple.
  o The system implementation could be very complex.

• This leads to many gaps in the explanation. Things that are mentioned but not explained include:
  o overload handling
  o state transfer
  o concurrency
  o job scheduling
  o request marshalling
  o request routing
  o system monitoring
  o alarming
  o configuration management

• If you don't want to talk about them, don't mention them.

Presentation contd.

• Almost impossible to understand some concepts without reading the cited material.
  o Some concepts are used but not well explained: the gossip protocol, vector clocks.
  o Some concepts are not so important: SLA.
  o Too wordy: at least give a numbered list.
  o No clear graph: please use a flow chart!

• Despite the length and many cited resources, it is still very difficult to use the article as a design document.
  o Many open-source clones (Cassandra, Voldemort, Riak) have tried.
  o Many design concerns aren't touched upon: why is the decentralized structure better?

• Must be well-versed in distributed computing concepts in order to really understand what's going on during the first read-through.
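Since the slide names vector clocks as a concept that is used but not well explained, here is a minimal sketch of how two versions can be compared. It assumes a clock is a plain dict mapping a node name to a counter; the function names `descends` and `compare` and the node names `Sx`, `Sy`, `Sz` are illustrative choices, not Dynamo's actual implementation.

```python
def descends(a, b):
    """Return True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify the relationship between two vector clocks."""
    a_ge_b = descends(a, b)
    b_ge_a = descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a supersedes b"
    if b_ge_a:
        return "b supersedes a"
    # Neither clock dominates: the versions conflict and need reconciliation.
    return "concurrent"


# Hypothetical versions of one object written through different nodes.
v1 = {"Sx": 2}
v2 = {"Sx": 2, "Sy": 1}   # written after v1, coordinated by Sy
v3 = {"Sx": 2, "Sz": 1}   # written after v1, coordinated by Sz

print(compare(v2, v1))    # v2 can safely replace v1
print(compare(v2, v3))    # siblings that the application must reconcile
```

Only the "concurrent" case forces the application to resolve a conflict; in the other cases the older version can be discarded automatically.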

Symmetry (Jori)

• There are direct contradictions in regard to symmetry:
  o In section 2.3: "Symmetry: Every node in Dynamo should have the same set of responsibilities as its peers; there should be no distinguished node or nodes that take special roles or extra set of responsibilities."
  o In section 4.8.2: "To prevent logical partitions, some Dynamo nodes play the role of seeds... Seeds can be obtained either from static configuration or from a configuration service. Typically seeds are fully functional nodes in the Dynamo ring."

Symmetry contd.

• No justification for this design choice except that it "simplifies the process of system provisioning and maintenance."

• Membership and failure detection are presented in a hand-wavy manner.

• In this sort of system, specialization can simplify the overall design. Symmetry is not necessary for high availability.
  o Chubby (Google's Paxos-based distributed lock service) uses a master-coordinator approach, which results in much simpler consistency algorithms. It allows updates to be serialized, which prevents conflicts.
  o A distributed directory service layer for lookup would fix Dynamo's scalability issue, since nodes would no longer have to gossip the entire routing table.

Symmetry contd.

• Network connectivity is not symmetric, e.g. connections between nodes in the same data center differ from those between nodes in separate data centers.
  o The symmetric ring-based system does not reflect this inherent asymmetry.

• Server hardware configurations are inherently asymmetric.

By making a symmetric system, you rule out the advantages of specialization. One can no longer use different hardware for different components of a complex system.

WAN Considerations (Ning)

• The interactions between data centers are not clearly introduced.

• When a Dynamo cluster spans a WAN, the odds of nodes rejoining the cluster and remaining out of date are significantly increased.

• If a node goes down, ‘hinted handoff’ sends updates to the next node in the ring. Since nodes of two data centers alternate, the updates are sent to the remote data center. When the node rejoins the cluster, if the network is partitioned (which happens all the time), the node will not catch up on pending updates for a long time (until the network partition heals).

• Authentication and authorization are ignored in this paper. However, these could cause problems in the ring membership management.
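The hinted-handoff concern above can be made concrete with a small sketch. It assumes a toy ring of four nodes that alternate between two data centers, as the slide describes, and it simplifies the hint target to "the first healthy node clockwise". The ring layout, node names, and both function names are hypothetical, not taken from the paper.

```python
# Hypothetical ring in clockwise order; nodes alternate between data centers.
RING = [
    ("n0", "dc-east"),
    ("n1", "dc-west"),
    ("n2", "dc-east"),
    ("n3", "dc-west"),
]


def hint_holder(ring, failed_index, down):
    """Walk clockwise from the failed node and return the first healthy node."""
    size = len(ring)
    for step in range(1, size):
        name, dc = ring[(failed_index + step) % size]
        if name not in down:
            return name, dc
    return None  # every other node is down as well


def can_replay_hint(holder_dc, owner_dc, partitioned):
    """A hint can be replayed only if the holder can reach the owner."""
    return holder_dc == owner_dc or not partitioned


# n0 fails, so its writes are parked as hints on the next healthy node.
holder, holder_dc = hint_holder(RING, failed_index=0, down={"n0"})
owner_dc = RING[0][1]

print(holder, holder_dc)
print(can_replay_hint(holder_dc, owner_dc, partitioned=True))
```

With alternating placement the hint lands in the other data center, so a WAN partition at the moment the owner rejoins blocks the replay until connectivity returns.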

Consistency (Jori)

• Principle of Symmetry and Decentralization
  o Centralization does not mean low availability, and consistency does not need to be sacrificed for high availability: BigTable+GFS.
  o A decentralized architecture usually causes a lot of complexity.
  o For handling transient failures, hinted handoff is complicated.

• "0.06% of inconsistent values"
  o Amazon handles millions of transactions a day, so this ends up being a lot.
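To put the 0.06% figure in perspective, the arithmetic can be sketched as follows. The daily volumes are hypothetical round numbers chosen for illustration; the slide only says "millions of transactions a day".

```python
def inconsistent_per_day(requests_per_day):
    """Requests affected per day at a 0.06% rate, using exact integer math."""
    # 0.06% == 6 / 10_000
    return requests_per_day * 6 // 10_000


# Hypothetical daily request volumes.
for volume in (1_000_000, 10_000_000, 100_000_000):
    print(f"{volume:>11,} requests/day -> {inconsistent_per_day(volume):,} inconsistent")
```

Even at the lowest assumed volume, hundreds of requests per day would see inconsistent values.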

Consistency contd.

• Stale reads are possible and inconvenient.
  o A node that has been down for a significant amount of time can rejoin a cluster completely out-of-date. There is no resynchronization barrier for reentry and no concept of how far behind it is. Merkle trees lead to slow catch-up.
  o Dynamo provides no bounds on stale reads, to the detriment of developers; e.g. a stale read could indirectly lead to an incorrect write, which is hard to track.

• Practical implications:
  o Committed writes don't show up in subsequent reads.
  o Committed writes may show up in some subsequent reads, but then go missing.
  o There is no SLA for when writes are globally committed, i.e. no nodes are still playing catch-up.

Consistency contd.

• Conflict Resolution
  o Dynamo exposes resolution logic to the developer, making application logic more complex. Since there are no bounds for stale reads or any centralized commit logs, data returned may be woefully out-of-date.
  o As noted before, this data loss can lead to unexpected situations that are hard to predict.
  o If the returned object is a list, deleted objects may reemerge after a conflict (shopping cart example).
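The shopping cart problem in the last bullet can be reproduced in a few lines. This is a sketch that assumes the application reconciles divergent cart versions by taking their union; the item names and the `merge_carts` helper are hypothetical.

```python
def merge_carts(*versions):
    """Reconcile divergent cart versions by taking the union of their items."""
    merged = set()
    for cart in versions:
        merged |= cart
    return merged


base = {"book", "lamp"}

# Two replicas diverge from the same base version.
replica_a = base - {"lamp"}   # the customer deletes the lamp via replica A
replica_b = base | {"pen"}    # a concurrent add via replica B, which never saw the delete

merged = merge_carts(replica_a, replica_b)
print(sorted(merged))
```

A union merge never loses an added item, but it cannot tell "deleted on A" from "never seen by A", so the lamp comes back.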

Disaster Recovery (Ning)

• Disaster:
  o Entire data center fails: there is no way to describe the state of the surviving data centers, so data loss is unbounded. One cannot quantify exactly how much data was lost. The lost data may be corrupted forever.
  o Lost data can result in stale reads: transactional inconsistencies that most applications are ill-equipped to handle.

• Recovery:
  o The paper does not outline how disk corruptions and failures are handled.
  o Standard log-shipping based replication: one can at least keep track of the replication log, and therefore have a general idea of how far behind a surviving cluster is.
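The log-shipping point can be illustrated with a rough sketch. It assumes every write receives a monotonically increasing log sequence number (LSN) and that each standby reports the last LSN it applied; the `ReplicaState` type, the site names, and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    name: str
    last_applied_lsn: int  # highest log sequence number applied on this replica


def replication_lag(primary_lsn, replica):
    """Number of log records the replica had not applied when the primary failed."""
    return max(0, primary_lsn - replica.last_applied_lsn)


# Hypothetical state at the moment the primary data center is lost.
primary_lsn = 1500
standbys = [ReplicaState("dc-west", 1420), ReplicaState("dc-south", 1500)]

for standby in standbys:
    print(standby.name, replication_lag(primary_lsn, standby))
```

The lag gives an upper bound on lost writes per surviving site, which is the kind of figure the slide argues Dynamo cannot provide.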

Minor Quibbles

• Amazon implemented the system in Java, but gave no justification as to why. If the concern is providing high-speed availability, why do it in a slow language like Java?

• There are a few grammar mistakes and spelling mistakes throughout - could have used a couple more read-throughs.

• Wish there were comparisons of various (N,R,W) configuration schemes

• The size constraint on objects limits its applications.

• End of section 4.4: "However, this problem has not surfaced in production and therefore this issue has not been thoroughly investigated."
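The kind of (N,R,W) comparison requested in the quibbles above could start from a sketch like the following. It assumes strict quorums, where R + W > N guarantees that read and write sets overlap. Dynamo's sloppy quorum weakens that guarantee, so the output is a first approximation only. The `describe` helper and all configurations other than (3,2,2) are illustrative.

```python
def describe(n, r, w):
    """Summarize the trade-offs of one (N, R, W) configuration."""
    return {
        "config": (n, r, w),
        # Read and write quorums intersect only when R + W > N.
        "quorums_overlap": r + w > n,
        # Replicas that may be unavailable while the operation still succeeds.
        "read_failures_tolerated": n - r,
        "write_failures_tolerated": n - w,
    }


# (3, 2, 2) is the configuration the paper reports as common; the rest are examples.
for cfg in [(3, 2, 2), (3, 1, 3), (3, 3, 1), (3, 1, 1)]:
    print(describe(*cfg))
```

Lower R or W improves availability and latency for that operation, while dropping below R + W > N gives up even the nominal overlap between reads and writes.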