Databases and Distributed Systems Maintaining Replicas in Unstructured P2P Systems CoNEXT, Madrid,...

Databases and Distributed Systems

Maintaining Replicas in Unstructured P2P Systems

CoNEXT, Madrid, 12/11/2008

Christof Leng, TU Darmstadt

Wesley W. Terpstra, TU Darmstadt

Bettina Kemme, McGill University

Wilhelm Stannat, TU Darmstadt

Alejandro P. Buchmann, TU Darmstadt

http://www.bubblestorm.net

http://www.dvs.tu-darmstadt.de

Replicate, what? | An INformal model | Don't try this at home | Pull on Join | Flushing Junk

Replicate, what?


Replicas far and small

Our research focuses on peer-to-peer search

The data we replicate is usually small (< 10 kB)

Modern unstructured overlays (Sarshar’04, Ferreira’05, BubbleStorm) can have several hundred copies of an object

Why replicate so far?
  The more copies, the easier it is to find one
  The more providers, the harder to overload them
  When a node leaves, its copy is gone with the wind

To be clear: this talk is not about replicating files


Two kinds of replicated data

Maintained

  “I’m online @132.160.222.1”
  “Tell me when a paper is published with ‘P2P’&‘search’”
  “I can provide these files”
  “I am waiting for event X”

  Service Lists
  Subscriptions

Collective

  Wiki articles
  Information about physical objects outside the network
  Distributed file systems
  System backups

  “Persistent” information


Who maintains replicated data?

A Maintainer

Ideal for data that should never outlive its owner

The owner can manage it

If the owner crashes, any remaining replicas are junk

The Collective

Ideal for data that should live until explicitly deleted

No clear managing authority

Replicas of undeleted objects should remain in the system


Our Paper: Maintainer-based

“Let there be replicas!”


Maintaining Replicas

We want to be able to
  Ensure the system has no junk replicas (of objects whose maintainer has left); they cause bad search results!
  Ensure that there are exactly as many replicas as requested

We need to be able to
  Keep junk contained so the system remains useful
  Increase/decrease the density of replicas in the system
  Hold the density of replicas steady against network churn
  Update or destroy all the replicas of an object

...you can’t always get what you want…


An INformal model


The Good (aka assumptions)

Nodes all run our software – so they mostly cooperate

Computing a sum over participants in the network is easy

…and since we do unstructured peer-to-peer:

We don’t care too much about who has which particular replica


The Bad (aka Reality)

Churn is out of our control
  We can’t stop it
  Its rate changes over time
  We cannot influence participant lifetimes

Nodes sometimes crash
  They don’t say good-bye; they just leave
  This happens a lot


The Ugly

Storage providing peers crash silently
  Fixed replication is impossible
  Guarantee a probability distribution?
  Sufficient to prove the correctness of stochastic algorithms

Maintainer crashes are silent
  Replicas should have been deleted
  Zero junk is impossible
  Perhaps guarantee junk stays below a threshold?
  Sufficient to prove the performance of stochastic algorithms


Don’t try this at home


A common maintenance strategy

Maintainer:
  1. Push desired replicas into system
  2. Wait for X minutes
  3. Goto 1

Storage Providing Peer:
  After Y minutes, replica is deleted
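The push-and-expire loop above can be sketched as a minute-by-minute simulation. This is only an illustration under assumptions not in the talk: a single storage peer, a single object, and the hypothetical names `StoragePeer` and `run`. It makes it easy to see how X and Y interact once the maintainer disappears.

```python
from dataclasses import dataclass, field


@dataclass
class StoragePeer:
    """Hypothetical storage peer that deletes a replica Y minutes after storing it."""
    ttl_minutes: int                              # Y
    expiry: dict = field(default_factory=dict)    # object id -> expiry time

    def store(self, obj_id, now):
        # A fresh push restarts the Y-minute timer.
        self.expiry[obj_id] = now + self.ttl_minutes

    def expire(self, now):
        # Drop every replica whose timer has run out.
        self.expiry = {o: t for o, t in self.expiry.items() if t > now}

    def holds(self, obj_id):
        return obj_id in self.expiry


def run(push_interval, ttl, maintainer_leaves_at, horizon):
    """Return, for each minute, whether the peer holds the replica.

    push_interval is X, ttl is Y. The maintainer stops pushing (silently)
    at minute maintainer_leaves_at.
    """
    peer = StoragePeer(ttl)
    held = []
    for now in range(horizon):
        peer.expire(now)
        maintainer_alive = now < maintainer_leaves_at
        if maintainer_alive and now % push_interval == 0:
            peer.store("obj", now)
        held.append(peer.holds("obj"))
    return held


# X = 5, Y = 8: no gaps, but the replica lingers as junk after the maintainer leaves.
lingering = run(push_interval=5, ttl=8, maintainer_leaves_at=12, horizon=30)
junk_minutes = sum(lingering[12:])

# X = 10, Y = 4: the replica expires between pushes, so it is missing part of the time.
gappy = run(push_interval=10, ttl=4, maintainer_leaves_at=100, horizon=30)
missing_minutes = gappy.count(False)
```

In this toy setting, a long Y relative to X leaves junk behind after a silent maintainer crash, while a short Y relative to X leaves gaps. Neither parameter can be tuned without knowing the churn rate.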


Why this is bad

What should the parameters X and Y be?

If X is too slow:
[Figure: replicas over time, with curves labelled “Requirement” and “Replicas”]

If Y is too slow:
[Figure: replicas over time, with curves labelled “Tolerable” and “Junk”]

If X or Y is too fast, more traffic is expended than necessary

Problem:
  Churn rate is out of our control (unboundable)
  Correctness requires setting X for the worst possible situation (costly)


Pull on Join


Observation: Density

When a storage providing peer leaves the system
  Replica count might be reduced
  Expected replica density remains the same
  Holding replica density fixed dodges the problem of crashes

Not just a semantic difference!
  Must compensate by replicating whenever peers join

We can adjust the density as the population changes
  It is still possible to hold replicas at a fixed value (or √n)
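The density observation can be checked with exact arithmetic. The sketch below assumes a simplified model that is not spelled out in the talk: r replicas of one object sit on distinct peers chosen uniformly at random among n peers, and the leaving peer is also uniformly random. The function names are hypothetical.

```python
from fractions import Fraction


def density_after_leave(n, r):
    """Expected replica density after one uniformly random peer (of n) leaves."""
    expected_lost = Fraction(r, n)   # chance the leaver held one of the r replicas
    return (r - expected_lost) / (n - 1)


def density_after_join(n, r, pull):
    """Expected replica density after one peer joins.

    With pull=True the newcomer fetches a replica with probability r/n
    (the current density); with pull=False it stores nothing.
    """
    density = Fraction(r, n)
    expected_gain = density if pull else Fraction(0)
    return (r + expected_gain) / (n + 1)


before = Fraction(50, 1000)
after_leave = density_after_leave(1000, 50)               # unchanged in expectation
after_join_no_pull = density_after_join(1000, 50, False)  # diluted
after_join_pull = density_after_join(1000, 50, True)      # restored
```

Under these assumptions a departure leaves the expected density untouched, while a join dilutes it unless the newcomer pulls replicas, which is why the compensation happens on join.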


Our replication algorithm

Maintainer:
  Push out initial replicas
  Record who has received replicas (superset)
  Push extra replicas to increase density
  Probabilistically delete existing replicas to decrease density

Storage Providing Peer:
  On join, ask randomly selected maintainers for replicas
  Accept replicas pushed out from maintainers
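A minimal sketch of the two roles listed above, not the implementation from the paper. Several things here are assumptions made for illustration: a single density shared by all maintainers, a `peers` list standing in for random sampling of the overlay, and the class and method names themselves.

```python
import random


class StoragePeer:
    """Hypothetical storage providing peer."""

    def __init__(self, name):
        self.name = name
        self.replicas = {}  # maintainer name -> replicated payload

    def accept(self, maintainer_name, payload):
        # Accept a replica pushed out (or served on request) by a maintainer.
        self.replicas[maintainer_name] = payload

    def drop(self, maintainer_name):
        self.replicas.pop(maintainer_name, None)

    def join(self, maintainers, density, rng):
        # Pull on join: ask a random selection of maintainers for replicas.
        # Assumption: each maintainer is selected with probability `density`.
        for maintainer in maintainers:
            if rng.random() < density:
                maintainer.serve_pull(self)


class Maintainer:
    """Hypothetical maintainer (owner) of one replicated object."""

    def __init__(self, name, payload):
        self.name = name
        self.payload = payload
        self.recipients = set()  # superset: may include peers that have left

    def _send(self, peer):
        peer.accept(self.name, self.payload)
        self.recipients.add(peer)

    def push_initial(self, peers, density, rng):
        # Each peer receives the initial replica with probability `density`.
        for peer in peers:
            if rng.random() < density:
                self._send(peer)

    def serve_pull(self, peer):
        self._send(peer)

    def increase_density(self, peers, old, new, rng):
        # Requires old < 1. A peer without a replica gets one with probability
        # (new - old) / (1 - old), so its overall holding probability becomes `new`.
        extra = (new - old) / (1 - old)
        for peer in peers:
            if peer not in self.recipients and rng.random() < extra:
                self._send(peer)

    def decrease_density(self, old, new, rng):
        # Requires old > 0. Each recorded holder keeps its replica with
        # probability new / old and is told to delete it otherwise.
        keep = new / old
        for peer in list(self.recipients):
            if rng.random() >= keep:
                peer.drop(self.name)
                self.recipients.discard(peer)
```

Because every decision is an independent coin flip per peer, the number of replicas of an object stays binomially distributed in this sketch, whichever mix of pushes, pulls and probabilistic deletes produced it.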


Visual Example

“Give me more replicas!”

“Hello!”

“Bye Bye!”

“I don’t need so many.”

…

“Chow!”


Convergence to the Binomial


For the whole network


Flushing Junk


Observation: Pulls have no Junk

Recall: Junk is bad
  Unnecessarily consumes storage
  Results in spurious query results

When a node joins, all the replicas it receives are valid

So, to control junk…
  Blow everything away
  Reload fresh replicas


But the cost?!

Sounds expensive

Not all storage providing peers (c)rash, only c %

We require only g % of replicas to be (g)ood

If c = 10% and g = 80%, the overhead of flushing is
  ½ · c / (1/g − 1) = 20% extra replica transfers
  Only applies to especially long-lived storage peers

It might be possible to optimize this cost away. We don’t.
  Be careful! Most of the obvious optimizations are wrong
  It’s easy to introduce statistical defects that accumulate over time
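The 20% figure quoted above can be reproduced by evaluating the slide's expression directly. The sketch takes the formula exactly as written, ½ · c / (1/g − 1), without re-deriving it; `flush_overhead` is a hypothetical helper name.

```python
from fractions import Fraction


def flush_overhead(c, g):
    """Evaluate 1/2 * c / (1/g - 1) for crash fraction c and good fraction g."""
    return Fraction(1, 2) * c / (1 / g - 1)


# c = 10%, g = 80%, as on the slide.
slide_case = flush_overhead(Fraction(10, 100), Fraction(80, 100))

# A stricter requirement on good replicas (g = 90%) makes flushing costlier.
stricter_case = flush_overhead(Fraction(10, 100), Fraction(90, 100))
```

With c = 10% and g = 80% the expression evaluates to exactly 1/5, matching the 20% on the slide. Raising g to 90% more than doubles it, since 1/g − 1 shrinks as g approaches 1.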


When to flush?

Expected number of replicas stored is easy to compute (a sum)

Flush when: stored replicas > expected replicas + tolerable junk

Only one problem:

Peers that happen to store more than average flush earlier
  Those peers are preferentially destroyed
  There are fewer replicas than there should be

[Figure: replicas over time, with curves labelled “Desired Replicas” and “Average Replicas”]


Independence is needed

P(v stores o | v flushes) = P(v stores o)
  This equality fails if a node flushes because it is full

Solution:
  Use the flow of replicas through one bucket to flush the other
  The buckets’ replica flows are statistically independent: done!
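Why the equality fails for a "flush when full" rule can be seen in a toy model. The model is an assumption made here for illustration, not the one analysed in the paper: a peer stores each of m objects independently with probability p, and flushes when it stores more than `threshold` of them. The function names are hypothetical.

```python
from math import comb


def binom_pmf(m, p, k):
    """P(K = k) for K ~ Binomial(m, p)."""
    return comb(m, k) * p**k * (1 - p) ** (m - k)


def p_stores_given_flush(m, p, threshold):
    """P(v stores o | v flushes) when v flushes iff it stores more than `threshold` objects.

    By symmetry, a peer storing k of the m objects holds a fixed object o
    with probability k / m.
    """
    tail = [(k, binom_pmf(m, p, k)) for k in range(threshold + 1, m + 1)]
    p_flush = sum(weight for _, weight in tail)
    return sum(k / m * weight for k, weight in tail) / p_flush


p = 0.1
# Flushing because the peer is full: the flushing peers are the well-stocked ones.
biased = p_stores_given_flush(m=100, p=p, threshold=12)

# A trigger unrelated to the peer's own contents (here: always flush) leaves
# the probability at p.
unbiased = p_stores_given_flush(m=100, p=p, threshold=-1)
```

In this model the conditional probability under the fullness trigger is at least 13/100, clearly above p = 0.1, so replicas of any given object are over-represented among the peers being flushed. A trigger that is independent of the flushed contents, which is what the two-bucket idea above aims for, avoids that bias.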


Distribution of Junk


Things you’ll find in the paper

How pull on join handles different replication densities
The details of the flush threshold equation
Formal stochastic proof of correctness
How to support peers with different storage capacities
How to support maintainers behind NAT/firewalls
Cost (in operations) of the proposed algorithms
Simulations of what-if scenarios


Conclusion

Providing replication guarantees in peer-to-peer is feasible
  Strong guarantees are impossible
  Probabilistic guarantees are tricky
  Seemingly innocent choices result in statistically bad behaviour

Maintainer-based replication is a relevant sub-problem
  Required for service lists and query subscriptions

Compared to collective replication, maintainer-based
  requires junk control
  has an obvious node (maintainer/owner) to manage replication


Thanks for listening!

Questions?

http://www.bubblestorm.net

http://www.dvs.tu-darmstadt.de
