Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella:...

112
1 Peer-peer and Application-level Networking Don Towsley UMass-Amherst with help of lots of others (J. Kurose, B. Levine, J. Crowcroft, CMPSCI 791N class) Don Towsley 2002

Transcript of Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella:...

Page 1: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

1

Peer-peer and Application-level Networking

Don TowsleyUMass-Amherst

with help of lots of others (J. Kurose, B. Levine, J. Crowcroft, CMPSCI 791N class)

Don Towsley 2002

Page 2: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

2

0. Introduction

backgroundmotivationoutline of the tutorial

Page 3: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

3

Peer-peer networking

Page 4: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

4

Peer-peer networking Focus at the application level

Page 5: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

5

Peer-peer networkingPeer-peer applications

Napster, Gnutella, CAN: file sharingad hoc networksmulticast overlays (e.g., video distribution)

Page 6: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

6

Peer-peer networkingQ: What are the new technical challenges?Q: What new services/applications enabled?Q: Is it just “networking at the application-level”?

“There is nothing new under the sun” (William Shakespeare)

Page 7: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

7

Tutorial Contents

introductionclient-server v. P2Parchitectures

centralized search• Napster

distributed search -flooding

• Gnutelladistributed search -hashing

• CAN, CHORD, …application-level multicast

research issuessecuritymodelingmore general applications

summary

Page 8: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

8

Client Server v. Peer to Peer(1)

RPC/RMIsynchronousassymmetricemphasis on language integration and binding models (stub IDL/XDR compilers etc)Kerberos style security – access control, crypto

messagesasynchronoussymmetricemphasis on service location, content addressing, application layer routing.anonymity, high availability, integrity.harder to get right☺

Page 9: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

9

Peer to peer systems actually old

IP routers are peer to peer.routers discover topology, and maintain itrouters are neither client nor serverrouters continually talk to each otherrouters inherently fault tolerantrouters are autonomous

Page 10: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

10

Peer to peer systems

nodes have no distinguished roleno single point of bottleneck or failure.need distributed algorithms for

service discovery (name, address, route, metric, etc)neighbour status trackingapplication layer routing (based possibly on content, interest, etc)resilience, handing link and node failures…

Page 11: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

11

Ad hoc networks and peer2peer

wireless ad hoc networks have many similarities to peer to peer systemsno a priori knowledgeno given infrastructurehave to construct it from “thin air”!

Page 12: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

12

Overlays and peer 2 peer systems

P2p technology often used to create overlays offering services that could be offered in the IP leveluseful deployment strategyoften economically a way around other barriers to deploymentIP was an overlay (on telephone core infrastructure)not all overlays are P2P (AKAMAI)

Page 13: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

13

P2P Architecture Classification

centralized service location (CSL)Napster

distributed service location with flooding (DSLF)

Gnutelladistributed service location with hashing (DSLH)

CAN, Pastry, Tapestry, Chord

Page 14: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

14

Centralized Search Architecture

centralized directory service

search directory

Lord of the Rings?

A, C

A

B

C

D

E

Page 15: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

15

NAPSTER

the most (in)famousnot the first (c.f. probably Eternity, from Ross Anderson in Cambridge)but instructive for what it gets right, andalso wrong…also has a political message…and economic and legal…

Page 16: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

16

Napsterprogram for sharing files over the Interneta “disruptive” application/technology?history:

5/99: Shawn Fanning (freshman, Northeasten U.) founds Napster Online music service12/99: first lawsuit3/00: 25% UWisc traffic Napster2/01: US Circuit Court of

Appeals: Napster knew users violating copyright laws

7/01: # simultaneous online users:Napster 160K, Gnutella: 40K,

Morpheus: 300K

Page 17: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

17

judge orders napsterto stop in July ‘01 other filesharing apps take over!

gnutellanapsterfastrack

8M

6M

4M

2M

0.0bi

ts p

er s

ec

Page 18: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

18

Napster: how does it work

Application-level, client-server protocol over point-to-point TCP

Four steps:connect to Napster serverupload your list of files (push) to server.give server keywords to search the full list with.select “best” of correct answers. (pings)

Page 19: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

19

Napster

napster.com

users

File list is uploaded

1.

Page 20: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

20

Napster

napster.com

user

Requestand

results

User requests search at server.

2.

Page 21: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

21

Napster

napster.com

user

pings pings

User pings hosts that apparently have data.

Looks for best transfer rate.

3.

Page 22: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

22

Napster

napster.com

user

Retrievesfile

User retrieves file

4.

Page 23: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

23

Napster: architecture notes

centralized server: single logical point of failurecan load balance among servers using DNS rotationpotential for congestionNapster “in control” (freedom is an illusion)

no security: passwords in plain textno authentication no anonymity

Page 24: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

24

Distributed Search/Flooding

Page 25: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

25

Distributed Search/Flooding

Page 26: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

26

Gnutella

peer-to-peer networking: applications connect to peer applications focus: decentralized method of searching for fileseach application instance serves to:

store selected filesroute queries (file searches) from and to its neighboring peersrespond to queries (serve file) if file stored locally

Page 27: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

27

Gnutella

Gnutella history:3/14/00: release by AOL, almost immediately withdrawntoo late: 23K users on Gnutella at 8 am this AMmany iterations to fix poor initial design (poor design turned many people off)

what we care about:how much traffic does one query generate?

how many hosts can it support at once?what is the latency associated with querying?is there a bottleneck?

Page 28: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

28

Gnutella: how it worksSearching by flooding:

if you don’t have the file you want, query 7 of your partners.if they don’t have it, they contact 7 of their partners, for a maximum hop count of 10.requests are flooded, but there is no tree structure.no looping but packets may be received twice.reverse path forwarding(?)

Note: Play gnutella animation at: http://www.limewire.com/index.jsp/p2p

Page 29: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

29

Flooding in Gnutella: loop prevention

Seen already list: “A”

Page 30: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

30

Gnutella: initial problems and fixes

freeloading: WWW sites offering search/retrieval from Gnutella network without providing file sharing or query routing.

Block file-serving to browser-based non-file-sharing users prematurely terminated downloads:

long download times over modemsmodem users run gnutella peer only briefly (Napster problem also!) or any users becomes overloadedfix: peer can reply “I have it, but I am busy. Try again later” late 2000: only 10% of downloads succeed2001: more than 25% downloads successful (is this success or failure?)

www.limewire.com/index.jsp/net_improvements

Page 31: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

31

Gnutella: initial problems and fixes (more)

2000: avg size of reachable network ony 400-800 hosts. Why so smalll?

modem users: not enough bandwidth to provide search routing capabilities: routing black holes

Fix: create peer hierarchy based on capabilitiespreviously: all peers identical, most modem blackholesconnection preferencing:

• favors routing to well-connected peers• favors reply to clients that themselves serve large number of

files: prevent freeloadingLimewire gateway functions as Napster-like central server on behalf of other peers (for searching purposes)

www.limewire.com/index.jsp/net_improvements

Page 32: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

32

Gnutella Discussion:

architectural lessons learned?anonymity and security?other?good source for technical info/open questions:http://www.limewire.com/index.jsp/tech_papers

Page 33: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

33

Kazaa

hierarchical Gnutellasupernodes and regular nodes

most popular p2p app>120M downloads

not well understoodbinariesencrypted communications

supernodes

Page 34: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

34

hash tablesessential building block in software systems

Internet-scale distributed hash tablesequally valuable to large-scale distributed

systems?• peer-to-peer systems

– CAN, Chord, Pastry, …• large-scale storage management systems

– Publius, OceanStore,, CFS ...• mirroring on the Web

Internet-scale hash tables

Page 35: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

35

hash tablesessential building block in software systems

Internet-scale distributed hash tablesequally valuable to large-scale distributed

systems?• peer-to-peer systems

– CAN, Chord, Pastry, …• large-scale storage management systems

– Publius, OceanStore,, CFS ...• mirroring on the Web

Internet-scale hash tables

Page 36: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

36

Content-Addressable Network[Ratnasamy,etal]

introductiondesignevaluationstrengths & weaknessesongoing work

Page 37: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

37

Content-Addressable Network(CAN)

CAN: Internet-scale hash table

interfaceinsert(key,value)value = retrieve(key)

Page 38: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

38

Content-Addressable Network(CAN)

CAN: Internet-scale hash table

interfaceinsert(key,value)value = retrieve(key)

propertiesscalableoperationally simplegood performance (w/ improvement)

Page 39: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

39

Outline

introductiondesignevaluationstrengths & weaknessesongoing work

Page 40: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

40

K V

CAN: basic idea

K V

K V

K V

K V

K V

K V

K V

K V

K V

K V

Page 41: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

41

CAN: basic idea

insert(K1,V1)

K V

K V

K V

K V

K V

K V

K V

K V

K V

K V

K V

Page 42: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

42

CAN: basic idea

insert(K1,V1)

K V

K V

K V

K V

K V

K V

K V

K V

K V

K V

K V

Page 43: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

43

CAN: basic idea

(K1,V1)

K V

K VK V

K V

K V

K V

K V

K V

K V

K V

K V

Page 44: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

44

CAN: basic idea

retrieve (K1)

K V

K VK V

K V

K V

K V

K V

K V

K V

K V

K V

Page 45: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

45

CAN: solution

virtual Cartesian coordinate space

entire space is partitioned amongst all the nodes

every node “owns” a zone in the overall space

abstractioncan store data at “points” in the space can route from one “point” to another

point = node that owns the enclosing zone

Page 46: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

46

CAN: simple example

1

0 1

1

Page 47: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

47

CAN: simple example

1 2

Page 48: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

48

CAN: simple example

1

2

3

Page 49: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

49

CAN: simple example

1

2

3

4

Page 50: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

50

CAN: simple example

Page 51: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

51

CAN: simple example

I

Page 52: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

52

CAN: simple example

node I::insert(K,V)

I

Page 53: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

53

(1) a = hx(K)

CAN: simple example

x = a

node I::insert(K,V)

I

Page 54: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

54

(1) a = hx(K)b = hy(K)

CAN: simple example

x = a

y = b

node I::insert(K,V)

I

Page 55: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

55

(1) a = hx(K)b = hy(K)

CAN: simple example

(2) route(K,V) -> (a,b)

node I::insert(K,V)

I

Page 56: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

56

CAN: simple example

(2) route(K,V) -> (a,b)

(3) (a,b) stores (K,V)

(K,V)

node I::insert(K,V)

I(1) a = hx(K)b = hy(K)

Page 57: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

57

CAN: simple example

(2) route “retrieve(K)” to (a,b) (K,V)

(1) a = hx(K)b = hy(K)

node J::retrieve(K)

J

Page 58: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

58

Data stored in the CAN is addressed by name (i.e. key), not location (i.e. IP address)

CAN

Page 59: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

59

CAN: routing table

Page 60: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

60

CAN: routing

(a,b)

(x,y)

Page 61: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

61

A node only maintains state for its immediate neighboring nodes

CAN: routing

Page 62: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

62

CAN: node insertion

Bootstrapnode

1) Discover some node “I” already in CANnew node

Page 63: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

63

CAN: node insertion

I

new node1) discover some node “I” already in CAN

Page 64: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

64

CAN: node insertion

2) pick random point in space

I

(p,q)

new node

Page 65: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

65

CAN: node insertion

(p,q)

3) I routes to (p,q), discovers node J

I

J

new node

Page 66: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

66

CAN: node insertion

newJ

4) split J’s zone in half… new owns one half

Page 67: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

67

Inserting a new node affects only a single other node and its immediate neighbors

CAN: node insertion

Page 68: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

68

CAN: node failures

Need to repair the space

recover database (weak point)• soft-state updates• use replication, rebuild database from replicas

repair routing • takeover algorithm

Page 69: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

69

CAN: takeover algorithm

simple failuresknow your neighbor’s neighborswhen a node fails, one of its neighbors takes over its zone

more complex failure modessimultaneous failure of multiple adjacent nodes scoped flooding to discover neighborshopefully, a rare event

Page 70: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

70

Only the failed node’s immediate neighbors are required for recovery

CAN: node failures

Page 71: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

71

Design recap

basic CANcompletely distributedself-organizingnodes only maintain state for their immediate neighbors

additional design featuresmultiple, independent spaces (realities)background load balancing algorithmsimple heuristics to improve performance

Page 72: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

72

Evaluation

scalabilitylow-latencyload balancingrobustness

Page 73: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

73

CAN: scalability

for a uniformly partitioned space with n nodes and ddimensions

per node, number of neighbors is 2daverage routing path is (dn1/d)/4 hopssimulations show that the above results hold in practice

can scale the network without increasing per-node state

optimal choice of d for given n yields~log(n) nbrs with ~log(n) hops

Page 74: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

74

CAN: low-latency

#nodes

Late

ncy

stre

tch

020

406080

100120140

160180

16K 32K 65K 131K

w/o heuristics

w/ heuristics

#dimensions = 2

0

2

4

6

8

10

#nodes16K 32K 65K 131K

#dimensions = 10

Page 75: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

75

CAN: load balancing

dealing with hot-spotspopular (key,value) pairsnodes cache recently requested entriesoverloaded node replicates popular entries at neighbors

uniform coordinate space partitioninguniformly spread (key,value) entriesuniformly spread out routing load

Page 76: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

76

Uniform Partitioning

added check at join time, pick a zonecheck neighboring zonespick the largest zone and split that one

Page 77: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

77

CAN: node insertion

(p,q)

Page 78: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

78

CAN: node insertion

Page 79: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

79

CAN: node insertion

Page 80: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

80

0

20

40

60

80

100

Uniform Partitioning

V 2V 4V 8V

Volume

Percentageof nodes

w/o check

w/ check

V = total volumen

V16

V8

V4

V2

65,000 nodes, 3 dimensions

Page 81: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

81

CAN: Robustness

completely distributed no single point of failure ( not applicable to pieces of database when node failure happens)

not exploring database recovery (in case there are multiple copies of database)

resilience of routingcan route around trouble

Page 82: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

82

Strengths

more resilient than flooding broadcast networksefficient at locating informationfault tolerant routingnode & Data High Availability (w/ improvement)manageable routing table size & network trafficcan build variety of services (application multicast)

Page 83: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

83

Multicast

associate multicast group with index, Ireverse path forwarding tree

I

Page 84: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

84

Multicast

associate multicast group with index, Ireverse path forwarding treesend to (hx(I),hy(I))

I

Page 85: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

85

Weaknesses

impossible to perform a fuzzy searchsusceptible to malicious activitymaintain coherence of all the indexed data (network overhead, efficient distribution)still relatively higher routing latencypoor performance w/o improvementnodes coming and going?

Page 86: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

86

Summary

CANan Internet-scale hash tablepotential building block in Internet applications

scalabilityO(d) per-node state

low-latency routingsimple heuristics help a lot

robustdecentralized, can route around trouble

Page 87: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

87

Related Work

TapestryZhao, Kubiatowicz, Joseph (UCB)

ChordStoica, Morris, Karger, Kaashoek, Balakrishnan (MIT / UCB)

PastryDruschel and Rowstron(Rice / Microsoft Research)

Page 88: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

88

Basic Idea of Chord

m bit identifier space for both keys and nodes

Key identifier = SHA-1(key)

Key=“LetItBe” ID=60SHA-1

IP=“198.10.10.1” ID=123SHA-1node identifier = SHA-1(IP address)

both are uniformly distributed

how to map key IDs to node IDs?

Page 89: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

89

Consistent Hashing [Karger 97]

A key is stored at its successor: node with next higher ID

N32

N90

N123 K20

K5

Circular 7-bitID space

0IP=“198.10.10.1”

K101

K60Key=“LetItBe”

Page 90: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

90

Consistent Hashingevery node knows of every other node

requires global informationrouting tables are large O(N)lookups are fast O(1)

N32

N90

N123

0

Hash(“LetItBe”) = K60

N10

N55

Where is “LetItBe”?

“A has K60”

K60

Page 91: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

91

Chord: Basic Lookup

N32

N90

N123

0

Hash(“LetItBe”) = K60

N10

N55

Where is “LetItBe”?

“A has K60”

K60

every node knows its successor in the ring

requires O(N) time

Page 92: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

92

“Finger Tables”

every node knows m other nodes in the ring

increase distance exponentially

N8080 + 20

80 + 2180 + 22

80 + 23

N9680 + 24

N11280 + 25

N1680 + 26

Page 93: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

93

“Finger Tables”finger i points to successor of n+2i

table contains O(log N) entries

N120

N8080 + 20

N112

N96

N16

80 + 2180 + 22

80 + 23

80 + 24

80 + 25 80 + 26

Page 94: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

94

Lookups are Faster

lookups take O(Log N) hops

N32

N10

N5

N20N110

N99

N80

N60

Lookup(K19)

K19

Page 95: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

95

Issues

joins/leavesload balancing…

Page 96: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

96

Performance

several measurement studies (Napster, Gnutella)

highly variable connection timeslots of freeloading

little analysis

Page 97: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

97

Performance Modeling [Ge, etal]

evaluation of different architecturespopulation of users

cycle through on-off statessystem provides

common services (e.g., search)download services

Page 98: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

98

1

M

think timecommonservices

file downloadservices

off-line

µs(Na)

µs(Na,·)

qf(Na,,·)

Page 99: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

99

Download Services

heavy tailed, Zipf preference distribution, pi ∝ 1/iαservice capacity proportional to popularity

µf(Na,i) = pi Cf

Page 100: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

100

Common Services

CSLµs(Na) = C1; qf(Na ) = 0

DSLFµs(Na) = Cq Na /Tβ

T – max. TTL for flooding β – connectivity parameter; β>1 qf(Na ) > 0

DSLHµs(Na) = Cq Na /log(Na)qf(Na ) = 0

Page 101: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

101

Solution Methods

bounding analysisfixed point solutions

Page 102: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

102

Comparison of Three Architectures

CSL has services bottleneckDSLF suffers from occasional failure to find fileDSLH more scalable

1

10

100

1000

1.E+04 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09

Total Population: N

Syst

em T

roug

hput

: T

CIADIFADIHA

CSLDSLFDSLH

Page 103: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

103

0

500

1000

1500

2000

2500

3000

0 1E+08 2E+08 3E+08 4E+08Total Population: N

Syst

em T

hrou

ghpu

t: T

capacity: supernode/other node = 10/1,#supernode/#node=1/52.250capacity: supernode/other node = 2/1,#supernode/#node=1/11.250capacity: supernode/other node = 1/1,#supernode/#node=1/6.125

Supernodes (Kazaa)

hierarchy helpsplacing well-provisioned nodes at top good idea

Page 104: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

104

Anonymity: Crowds [Reiter98]

decentralized P2P solutionanonymous within the Crowdjondo (John Doe)

ProxyUser

path based

Page 105: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

105

Path-based Initiator Anonymity

R

X

Y

Z

I

Packets passed from initiator, I, to peers which deliver the packet to the responder R.

Page 106: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

106

Crowds Paths

R

X

Y

Z

I

• weighted coin flip• spinner

Page 107: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

107

Performance Issues

routing in overlays incurs a performance penalty

can it be quantified?can it be mitigated?

dynamic nature of user populationrobustness?performance?tradeoff between service location paradigms?

Page 108: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

108

Performance Issues

p2p file sharing vs. CDN/web (Akamai)compare robustness?compare performance?handling flash crowds?

p2p measurementkazaa!!

security of p2phow to measure/evaluate security

Page 109: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

109

Wrapup discussion questions (1):What is a peer-peer network (what is not a peer-to-peer network?). Necessary:

every node is designed to (but may not by user choice) provide some service that helps other nodes in the network get serviceeach node potentially has the same responsibility, functionality (maybe nodes can be polymorhpic)some applications (e.g., Napster) are a mix of peer-peer and centralized (lookup is centralized, file service is peer-peer) [recursive def. of peer-peer](logical connectivity rather than physical connectivity) routing will depend on service and data

Page 110: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

110

Overlays?

What is the relationship between peer-peer and application overlay networks?

peer-peer and application overlays are different things. It is possible for an application level overlay to be built using peer-peer (or vice versa) but not always necessaryoverlay: in a wired net: if two nodes can communicate in the overlay using a path that is not the path the network level routing would define for them. Logical network on top of underlying network

• source routing?wireless ad hoc nets – what commonality is there REALLY?

Page 111: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

111

Wrapup discussion questions (3):is ad hoc networking a peer-peer application?

Yes (30-1)why peer-peer over client-server?

A well-deigned p2p provides better “scalability”why client-server over peer-peer

peer-peer is harder to make reliableavailability different from client-server (p2p is more often only partially “up”)more trust is required

if all music were free in the future (and organized), would we have peer-peer.

Is there another app: ad hoc networking, any copyrighted data, peer-peer sensor data gathering and retrieval, simulation

evolution #101 – what can we learn about systems?

Page 112: Peer-peer and Application-level Networkingledvina/DHT/ucalgary/p2p-tutorial.pdf · 30 Gnutella: initial problems and fixes freeloading: WWW sites offering search/retrieval from Gnutella

112

THANKS!

slides can be found athttp://gaia.cs.umass.edu/towsley/p2p-tutorial.pdf