Outline for Today’s Lecture Administrative: Objective: –Peer-to-peer file systems Mechanisms...


Outline for Today’s Lecture

Administrative:

Objective:
– Peer-to-peer file systems
  • Mechanisms employed
  • Issues
  • Some examples

The Security Environment: Threats

Security goals and threats

Intruders

Common Categories
1. Casual prying by nontechnical users
2. Snooping by insiders
3. Determined attempt to make trouble (or personal gain)
4. Commercial or military espionage

Accidental Data Loss

Common Causes

1. Acts of God: fires, floods, wars
2. Hardware or software errors: CPU malfunction, bad disk, program bugs
3. Human errors: data entry, wrong tape mounted, rm *

Reliability Mechanisms (Redundancy)

• Replication of data, geographically distributed
  – As simple as backups
  – First-class replication (Coda)
  – Voting schemes
• Error detection-correction
  – Erasure codes (encode n blocks into m > n blocks, such that any r of them, with n ≤ r < m, suffice to recover the content of the original n)
  – Parity bits, checksums
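As a minimal sketch of the erasure-code idea, the simplest case is a single XOR parity block: n data blocks are encoded into n + 1 blocks, and any n of them recover the original content. The function names and block contents below are illustrative assumptions; practical systems typically use stronger codes that tolerate more than one lost block.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def encode(data_blocks):
    """Encode n data blocks into n + 1 blocks by appending one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def recover(stored, missing_index):
    """Rebuild the single missing block by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(stored) if i != missing_index]
    return xor_blocks(survivors)

data = [b"peer", b"to--", b"peer"]   # n = 3 data blocks of equal length
stored = encode(data)                # 4 blocks, spread over 4 hosts in practice
```

Calling `recover(stored, 1)` rebuilds the second data block from the other three, which is the "r blocks to recover" property with r = n.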

Basics of Cryptography

Relationship between the plaintext and the ciphertext

Secret-Key Cryptography

• Secret-key crypto is also called symmetric-key crypto
  – If keys are long enough, there are OK algorithms
  – Secret key must be shared by both parties

Public-Key Cryptography

• All users pick a public key/private key pair
  – publish the public key
  – private key not published
• Public key is (usually*) the encryption key
• Private key is (usually*) the decryption key
• RSA
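A toy sketch of RSA may make the public/private roles concrete. The primes, exponent, and function names here are assumptions chosen for readability; keys this small offer no security, and real RSA also needs padding, which is omitted.

```python
# Toy RSA with tiny primes; illustrative only, far too small to be secure.
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public (encryption) exponent, coprime to phi
d = pow(e, -1, phi)          # private (decryption) exponent; needs Python 3.8+

def encrypt(m, public_key):
    """Anyone holding the published key can encrypt."""
    exponent, modulus = public_key
    return pow(m, exponent, modulus)

def decrypt(c, private_key):
    """Only the holder of the unpublished key can decrypt."""
    exponent, modulus = private_key
    return pow(c, exponent, modulus)

public_key, private_key = (e, n), (d, n)
message = 65                 # a message must be an integer smaller than n
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
```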

One-Way Functions

• Function such that, given the formula for f(x),
  – it is easy to evaluate y = f(x)
• But given y,
  – it is computationally infeasible to find x
• Example: hash functions, which produce a fixed-size result
  – MD5
  – SHA
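The fixed-size property is easy to see with Python's standard `hashlib`. This sketch uses SHA-256 as one member of the SHA family; the helper name `digest` is an assumption for illustration.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

short = digest(b"x")                 # 1-byte input
long_ = digest(b"x" * 1_000_000)     # 1 MB input, same digest length
```

Both results are 64 hex characters (256 bits), and nothing in the digest gives a practical way back to the input.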

Digital Signatures

• Computing a signature block
  – Hash is fixed length; apply the private key as the encryption key*
• What the receiver gets
  – Use the public key as the decryption key* on the signature block to get the hash back
  – Compute the hash of the document part
  – Do these match?
• Assumes E(D(x)) = x when we usually want D(E(x)) = x
• Public key must be known by the receiver somehow: certificate
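The hash-then-sign flow can be sketched with a toy RSA key. The key below is built from two well-known Mersenne primes purely for illustration; it is not secure, the names are hypothetical, and real signature schemes add padding that is omitted here.

```python
import hashlib

# Toy RSA key from two known Mersenne primes; illustrative only, not secure.
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
E = 65537                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent; needs Python 3.8+

def doc_hash(document: bytes) -> int:
    """Fixed-length hash of the document, reduced to fit below the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N

def sign(document: bytes) -> int:
    """Signature block: the hash transformed with the private key."""
    return pow(doc_hash(document), D, N)

def verify(document: bytes, signature: int) -> bool:
    """Recover the hash with the public key and compare with a freshly computed hash."""
    return pow(signature, E, N) == doc_hash(document)

doc = b"attack at dawn"
sig = sign(doc)
```

`verify(doc, sig)` succeeds, while verifying any altered document against the same signature fails because the recomputed hash no longer matches.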

Distributing Public Keys

• Certificate authority
  – Trusted 3rd party
  – Their public key known
• Send name and public key, digitally signed by the CA

Byzantine Generals Problem

Reaching consensus among geographically separated (distributed) players if some of them are compromised.
• Generals of army units need to agree on a common plan of attack (consensus)
• Traitorous generals will lie (faulty or malicious)
• Generals communicate by sending messages directly general-to-general through runners between units (they won’t all see the same intel)
• Solutions are for all loyal generals to reach consensus, in spite of liars (up to some % of generals being bad)

Solution with Digital Sigs

• Iteratively execute “rounds” of message exchanges

• As each message passes by, the receiving general digitally signs it and forwards it on.

• Each General maintains the set of orders received

• Inconsistent orders indicate traitor
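The rounds above can be sketched as a small simulation with one commander and two lieutenants. Everything here is an assumption made for illustration: the general names, the keys, the use of HMAC as a stand-in for real public-key signatures, and "retreat" as the default action when orders conflict.

```python
import hashlib
import hmac

# Hypothetical per-general keys. HMAC stands in for digital signatures here;
# a real protocol would use public-key signatures that any general can verify.
KEYS = {name: name.encode() for name in ("commander", "lt1", "lt2")}

def sign(signer: str, payload: bytes) -> bytes:
    return hmac.new(KEYS[signer], payload, hashlib.sha256).digest()

def valid(signer: str, payload: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(sign(signer, payload), tag)

def exchange(orders_sent: dict) -> dict:
    """Round 1: commander sends signed orders. Round 2: lieutenants countersign and relay."""
    received = {lt: set() for lt in orders_sent}
    first_round = {}
    for lt, order in orders_sent.items():
        first_round[lt] = (order, sign("commander", order.encode()))
        received[lt].add(order)
    for sender, (order, tag) in first_round.items():
        relay_tag = sign(sender, order.encode() + tag)      # sender countersigns
        for lt in received:
            if lt == sender:
                continue
            ok = valid("commander", order.encode(), tag) and \
                 valid(sender, order.encode() + tag, relay_tag)
            if ok:
                received[lt].add(order)                     # add to the set of orders
    return received

def decide(orders: set) -> str:
    """Obey a single consistent order; conflicting orders expose a traitor."""
    return next(iter(orders)) if len(orders) == 1 else "retreat"

loyal = exchange({"lt1": "attack", "lt2": "attack"})    # honest commander
split = exchange({"lt1": "attack", "lt2": "retreat"})   # traitorous commander
```

With a traitorous commander, both lieutenants end up holding the same set of two conflicting signed orders, so they detect the inconsistency and still reach the same decision.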

Peer-to-peer File Systems

Problems with Centralized Storage Server Farms

• Weak availability
  – Susceptible to point failures and DoS attacks
• Management overhead
  – Data often manually partitioned to obtain scale
  – Management and maintenance large fraction of cost
• Per-application design (e.g., GoogleOS)
  – High hurdle for new applications
• Don’t leverage the advent of powerful clients
  – Limits scalability and availability

Slides from Shenker and Stoica, UCB

What is a P2P system?

• A distributed system architecture:
  – No centralized control
  – Nodes are symmetric in function
• Large number of (perhaps) server-quality nodes
• Enabled by technology improvements

[Figure: several nodes connected to one another through the Internet]

Slides from Shenker and Stoica, UCB

P2P as Design Style

• Resistant to DoS and failures
  – Safety in numbers, no single point of attack or failure
• Self-organizing
  – Nodes insert themselves into structure
  – Need no manual configuration or oversight
• Flexible: nodes can be
  – Widely distributed or co-located
  – Powerful hosts or low-end PCs
  – Trusted or unknown peers

Slides from Shenker and Stoica, UCB

Issues

• Goal is to have no centralized server and to utilize desktop-level idle resources.
• Trust: privacy, security, data integrity
  – Using untrusted hosts
• Availability
  – Using lower “quality” resources
  – Using machines that may regularly go off-line
• Fairness: freeloaders who just use and don’t contribute any resources
  – Using voluntarily contributed resources

Issues

• Goal is to have no centralized server and to utilize desktop-level idle resources.
• Trust: privacy, security, data integrity
  – Using untrusted hosts: crypto solutions
• Availability
  – Using lower “quality” resources: replication
  – Using machines that may regularly go off-line
• Fairness: freeloaders who just use and don’t contribute any resources
  – Using voluntarily contributed resources: use economic incentives

What Interface?

• Challenge for P2P systems: finding content
  – Many machines, must find one that holds file
• Essential task: Lookup(key)
  – Given key, find host (IP) that has file with that key
• Higher-level interface: Put()/Get()
  – Easy to layer on top of lookup()
  – Allows application to ignore details of storage
• System looks like one hard disk
  – Good for some apps, not for others

Slides from Shenker and Stoica, UCB

Distributed Hash Tables vs Unstructured P2P

• DHTs good at:
  – exact match for “rare” items
• DHTs bad at:
  – keyword search, etc. [can’t construct DHT-based Google]
  – tolerating extreme churn
• Gnutella etc. good at:
  – general search
  – finding common objects
  – very dynamic environments
• Gnutella etc. bad at:
  – finding “rare” items

Slides from Shenker and Stoica, UCB

DHT Layering

[Figure: three layers. The distributed application calls put(key, data) and get(key) → data on the distributed hash table. The distributed hash table calls lookup(key) → node IP address on the lookup service, which runs across node, node, node, ….]

• Application may be distributed over many nodes
• DHT distributes data storage over many nodes

Slides from Shenker and Stoica, UCB
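A minimal sketch of this layering, with put()/get() built on lookup(). The class names, node IDs, addresses, the 16-bit ID space, and the "first node at or after the key" ownership rule are all assumptions for illustration, and per-node storage is simulated in one process.

```python
import hashlib
from bisect import bisect_left

def key_id(key: str) -> int:
    """Hash a key into a small illustrative identifier space (0 .. 2**16 - 1)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:2], "big")

class LookupService:
    """Bottom layer: maps a key to the address of the node that owns it."""
    def __init__(self, nodes: dict):
        self.nodes = nodes                  # node ID -> IP address
        self.ids = sorted(nodes)

    def lookup(self, key: str) -> str:
        # Owner is the first node ID at or after the key's ID, wrapping around.
        i = bisect_left(self.ids, key_id(key)) % len(self.ids)
        return self.nodes[self.ids[i]]

class DHT:
    """Middle layer: put()/get() layered on lookup()."""
    def __init__(self, service: LookupService):
        self.service = service
        # Simulated per-node storage, keyed by node address.
        self.storage = {addr: {} for addr in service.nodes.values()}

    def put(self, key: str, data: bytes) -> None:
        self.storage[self.service.lookup(key)][key] = data

    def get(self, key: str):
        return self.storage[self.service.lookup(key)].get(key)

# Hypothetical node IDs and documentation-range addresses.
service = LookupService({10000: "192.0.2.1", 30000: "192.0.2.2", 55000: "192.0.2.3"})
dht = DHT(service)
dht.put("lecture.pdf", b"slides")
```

The application only ever calls `dht.put` and `dht.get`; which node holds the data is decided entirely by the lookup layer.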

Two Crucial Design Decisions

• Technology for infrastructure: P2P
  – Take advantage of powerful clients
  – Decentralized
  – Nodes can be desktop machines or server quality
• Choice of interface: Lookup and Hash Table
  – Lookup(key) returns IP of host that “owns” key
  – Put()/Get() standard HT interface
  – Some flexibility in interface (no strict layers)

Slides from Shenker and Stoica, UCB

A DHT in Operation: Overlay

[Figure: overlay network of nodes, each holding a table of (K, V) pairs]

Slides from Shenker and Stoica, UCB

A DHT in Operation: put()

[Figure: overlay of nodes with (K, V) tables; put(K1,V1) is issued]

Slides from Shenker and Stoica, UCB

A DHT in Operation: put()

[Figure: overlay of nodes with (K, V) tables; put(K1,V1) in transit through the overlay]

Slides from Shenker and Stoica, UCB

A DHT in Operation: put()

[Figure: overlay of nodes with (K, V) tables; the pair (K1,V1) shown at a node]

Slides from Shenker and Stoica, UCB

A DHT in Operation: get()

[Figure: overlay of nodes with (K, V) tables; get(K1) is issued]

Slides from Shenker and Stoica, UCB

A DHT in Operation: get()

[Figure: overlay of nodes with (K, V) tables; get(K1) in transit through the overlay]

Slides from Shenker and Stoica, UCB

Key Requirement

• All puts and gets for a particular key must end up at the same machine
  – Even in the presence of failures and new nodes (churn)
• This depends on the DHT routing algorithm
  – Must be robust and scalable

Slides from Shenker and Stoica, UCB

DHTs

• Examples
  – CAN
  – Chord
  – Pastry
  – Tapestry
  – Used in BitTorrent and Coral CDN
• Keyspace partitioning: ownership of keys split among participating nodes
  – Node has ID and owns keys “close” to its ID by some distance function
• Hash filename to key
• Routing in the overlay
  – To node with a closer ID or else it’s mine
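The "closer ID or else it's mine" rule can be sketched as greedy routing on a small circular ID space. The ID space size, the node IDs, and the neighbor layout (two ring neighbors plus one shortcut) are assumptions for illustration and do not match any particular DHT's routing table.

```python
SPACE = 2**8    # small circular ID space, for illustration

def distance(a: int, b: int) -> int:
    """Circular distance between two IDs."""
    d = abs(a - b) % SPACE
    return min(d, SPACE - d)

def build_tables(node_ids):
    """Give each node its two ring neighbors plus one farther shortcut (assumed layout)."""
    ids = sorted(node_ids)
    n = len(ids)
    return {
        nid: {ids[(i - 1) % n], ids[(i + 1) % n], ids[(i + n // 2) % n]}
        for i, nid in enumerate(ids)
    }

def route(tables, start: int, key: int):
    """Forward to the neighbor closest to the key; if none is closer, this node owns it."""
    path = [start]
    while True:
        here = path[-1]
        best = min(tables[here], key=lambda nid: distance(nid, key))
        if distance(best, key) >= distance(here, key):
            return here, path            # no closer neighbor: the key is mine
        path.append(best)

nodes = [5, 40, 77, 120, 160, 201, 240]
tables = build_tables(nodes)
owner, path = route(tables, start=5, key=125)
```

Because every hop strictly reduces the distance to the key, the walk terminates, and every starting node arrives at the same owner for a given key.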

PASTRY Overlay Network

[Figure: a message with key k (Route k) is routed through the overlay of nodes]

• Nodes assigned 1-dimensional IDs in hash space at random (e.g., hash on IP address)
• Each node has O(log n) neighbors and maintains a routing table
• Lookup with fileID k is routed to the live node with nodeID closest to k

PAST

• Rice Univ. and MSR Cambridge UK
• Built on an Internet-based overlay (Pastry)
• Not traditional file system semantics
• File is associated with fileID upon insertion into PAST and can have k replicas
  – fileID is secure hash of filename, owner’s public key, random salt #
  – Replicas go to the k nodes whose nodeIDs are “closest” to the msb (most significant bits) of the fileID
• Instead of directory lookup, retrieve by knowing fileID
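A rough sketch of how a fileID might be derived and mapped to k replica nodes. The choice of SHA-1, the 128-bit nodeID width, the plain numeric distance, and all names and inputs are assumptions made for this example rather than details taken from the slide.

```python
import hashlib

FILE_BITS = 160   # SHA-1 output size
NODE_BITS = 128   # assumed nodeID width

def file_id(filename: str, owner_public_key: bytes, salt: bytes) -> int:
    """Secure hash of filename, owner's public key, and a random salt."""
    h = hashlib.sha1(filename.encode() + owner_public_key + salt).digest()
    return int.from_bytes(h, "big")

def node_id(name: str) -> int:
    """Hypothetical nodeID: the top 128 bits of a hash of the node's name."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest()[:16], "big")

def replica_nodes(fid: int, node_ids, k: int):
    """Pick the k nodes whose IDs are numerically closest to the fileID's top bits."""
    target = fid >> (FILE_BITS - NODE_BITS)     # most significant NODE_BITS of the fileID
    return sorted(node_ids, key=lambda nid: abs(nid - target))[:k]

salt = b"\x01\x02\x03\x04"                      # would be random in practice
fid = file_id("notes.txt", b"hypothetical-public-key", salt)
node_ids = [node_id(f"node{i}") for i in range(20)]
replicas = replica_nodes(fid, node_ids, k=3)
```

Anyone who knows the fileID can recompute the same target and find the same k nodes, which is why retrieval needs no directory lookup.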

Data Availability via Replication

• DHash replicates each key/value pair at the nodes after it on the circle
• It’s easy to find replicas
• Put(k,v) to all
• Get(k) from closest

[Figure: ring of nodes N5, N10, N20, N32, N40, N60, N80, N99, N110, with three copies of key K19 placed at nodes following it on the ring]

Slides from Shenker and Stoica, UCB
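Successor-list replication can be sketched on the same ring of node IDs. The class, the replica count of 3, and the reading of "closest" as "first live successor" are assumptions for illustration; storage and failures are simulated in one process.

```python
from bisect import bisect_left

class Ring:
    def __init__(self, node_ids, replicas=3):
        self.ids = sorted(node_ids)
        self.replicas = replicas
        self.store = {nid: {} for nid in self.ids}   # simulated per-node storage
        self.down = set()                            # nodes currently off-line

    def successors(self, key: int):
        """The `replicas` nodes that follow the key on the circle."""
        start = bisect_left(self.ids, key)
        return [self.ids[(start + i) % len(self.ids)] for i in range(self.replicas)]

    def put(self, key: int, value) -> None:
        for nid in self.successors(key):             # Put(k, v) to all replicas
            self.store[nid][key] = value

    def get(self, key: int):
        for nid in self.successors(key):             # Get(k) from the first live replica
            if nid not in self.down:
                return self.store[nid].get(key)
        return None                                  # every replica is off-line

ring = Ring([5, 10, 20, 32, 40, 60, 80, 99, 110])
ring.put(19, "block 19")
ring.down.add(20)                                    # the first successor fails
value = ring.get(19)                                 # still served by a later replica
```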

First Live Successor Manages Replicas

[Figure: ring of nodes N5, N10, N20, N40, N50, N60, N68, N80, N99, N110, showing Block 19 and a copy of 19]

Slides from Shenker and Stoica, UCB

Other P2P FS examples

Farsite

• Microsoft Research: intended to look like NTFS
• Desktops on LAN (not Internet-scale)
• 3 roles: client, member of directory group, file host
• Directory metadata managed by Byzantine replication
• File hosts store encrypted replicated file data
• Directory group stores secure hash of content to validate authenticity of file
• Multiple namespace tree roots with namespace certificate provided by CA
• File performance by local caching under leasing system

LOCKSS

• Lots of Copies Keeps Stuff Safe (HPLabs, Stanford, Harvard, Intel)

• Library application for L-O-N-G term archival of digital library content (deal with bit rot, obsolescence of format, malicious users).

• Continuous audit and repair of replicas based on taking polls of sites with copies of content (comparing digest of content and repairing my copy if it differs from consensus).

• Rate limiting and churn of voter lists deter attackers from compromising enough copies to force a malicious “repair”.

Sampled Poll

• Each peer holds, for every preserved Archival Unit:
  – reference list of peers it has discovered
  – friends list of peers its operator knows externally
  – history of interactions with others (balance of contributions)
• Periodically (faster than rate of storage failures)
  – Poller takes a sample of the peers in its reference list
  – Invites them to vote: send a hash of their replica
• Compares votes with its local copy
  – Overwhelming agreement (> 70%) → Sleep blissfully
  – Overwhelming disagreement (< 30% agreement) → Repair
  – Too close to call → Raise an alarm
• To repair, the peer gets the copy of somebody who disagreed and then reevaluates the same votes
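The vote-comparison step can be sketched as a small function using the 70% and 30% thresholds from the slide. The function names, the use of SHA-256 for the replica hash, and the outcome labels are assumptions for illustration; sampling, invitations, and the repair transfer itself are not modeled.

```python
import hashlib

def digest(content: bytes) -> str:
    """Hash of a replica, as sent by a voter."""
    return hashlib.sha256(content).hexdigest()

def poll_outcome(local_copy: bytes, votes) -> str:
    """Classify a poll by the fraction of sampled votes that match the local copy."""
    mine = digest(local_copy)
    agreement = sum(v == mine for v in votes) / len(votes)
    if agreement > 0.70:
        return "ok"        # overwhelming agreement: sleep blissfully
    if agreement < 0.30:
        return "repair"    # overwhelming disagreement: fetch a copy from a disagreeing voter
    return "alarm"         # too close to call

good, damaged = b"archival unit", b"archival unjt"
votes = [digest(good)] * 9 + [digest(damaged)]
```

A peer holding `good` sees 90% agreement and does nothing, while a peer holding `damaged` sees 10% agreement and repairs its copy.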

Churn of Voter Lists

• Reference List
  – Take out voters, so that the next poll is based on a different group
  – Replenish with some “strangers” and some “friends”
    • Strangers: Accepted nominees proposed by voters who agree with poll outcome
    • Friends: From the friends list
    • The measure of favoring friends is called friend bias
• History
  – Poller owes its voters a vote (for their future polls)
  – Detected misbehavior penalized in victim’s history