Georges Da Costa - irit.frGeorges.Da-Costa/Yerevan2017P2P.pdf · Georges Da Costa erevan,Y Armenian...

55

Transcript of Georges Da Costa - irit.frGeorges.Da-Costa/Yerevan2017P2P.pdf · Georges Da Costa erevan,Y Armenian...

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Introduction on Peer to Peer systems

Georges Da Costa

Yerevan, Armenian National Academy of Sciences

[email protected] 1/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Goal of this Lecture

What can P2P do, not only as a buzzword

What it can't do

Shows some examples & algorithms

A Survey and Comparison of Peer-to-Peer Overlay Network Schemes, by Eng Keong Luaand al.

in IEEE Communications survey and tutorial March 2004

Harnessing the Power of Disruptive Technologies

published by O'Reilly, 2001

[email protected] 2/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

1 What is P2P

2 First generation systems

3 Self-organized systems

4 Structured systems

5 Distributed Hash Table

6 Conclusion

[email protected] 3/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Plan

1 What is P2P

2 First generation systems

3 Self-organized systems

4 Structured systems

5 Distributed Hash Table

6 Conclusion

[email protected] 4/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Universal

What have in common

Net Meeting, Skype, Ekiga

Irc, Msn, Icq, Jabber

Kazza, Freenet, Napster, Gnutella

Seti@Home, Folding@Home

Ebay, Flickr, Facebook

[email protected] 5/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

De�nition

Philosophical one

Participants gathering their resources in order to achieve a common goal

[email protected] 6/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Why ?

Available resources

Large Hard Drives

Powerful CPUs

Correct connexion to Internet

Users want

More freedom

No link to commercial companies

No infrastructure cost

[email protected] 7/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

A new (?) solution : Peer to Peer sys-tems

De�nition

Participant gathering their resources in order to achieve a common goal

Computers are running the same code

There is no global view of the system

View is limited to neighboors

Everyone has the same rights and duties

[email protected] 8/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Peer-to-Peer: New name, old concept

An architecture already there

Internet connects most of existing computers

Most computers are not fully used

Idle time > 75% on personal computersStorage systems are mostly empty

Already used between servers

Usenet

DNS

IP Routing

[email protected] 9/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Comparison with Client/Server

In client/Server each node is either a Client or a Server. Usually there are a fewServers and lots of Clients.

Client/Server systems su�er from single point of failure.

Client/Server are mostly static, at least the Servers. Peer to Peer systems aredynamics.

Client/Server systems need human administrators

Client/Server does not scale

[email protected] 10/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Comparison with Client/Server II

Client

Client

Client

Client

Client

Client

Node

Node

Node

NodeNode

Server

[email protected] 11/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Comparison with Client/Server II

When a new participant joins a service, the service increase the resource consumption

Client/Server : increases the server power/connectivity

Peer to Peer : uses the resources given by the participant

[email protected] 12/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Not so easy

Wanted

Scalability (1K,100K,1M nodes)

Dynamicity

Security (user, task)

Transparent

For the user (CPU,memory,disk)For the network

Heterogeneity

Self-organization

Participation (66% of Free riders)

Go through NAT/Firewall

[email protected] 13/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Self-organization

Participants

High volatility & voluntary

No central administration

Resource discovery

Heterogeneity

HardwareUsers (15% of users have 94% of �les)

Distribution of the resources

Trust

[email protected] 14/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

What's not new

Partial solutions

Scalability : Farm of web servers

Dynamism : Cell phones

Fault tolerance : Redundant servers

[email protected] 15/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Current Peer to Peer systems

Available applications

File sharing

Distributed storage

Content delivery

Distributed computing

Telephony/Chat

Games

[email protected] 16/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Current Peer to Peer systems (cont)

Widely used

2004 : According to British Web analysis �rm CacheLogic, BitTorrent accounts for anastounding 35 percent of all the tra�c on the Internet � more than all otherpeer-to-peer programs combined � and dwarfs mainstream tra�c like Web pages

Start-ups

Skype (ok, no more a small start-up)

BitTorrent

UbiStorage

[email protected] 17/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Two worlds

Internet Users

Problem of security

Large scale

No control

Motivation needed

Private Area (Corp., Univ.)

Other mean of security

Medium to large scale

Total control

[email protected] 18/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Plan

1 What is P2P

2 First generation systems

3 Self-organized systems

4 Structured systems

5 Distributed Hash Table

6 Conclusion

[email protected] 19/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Index Method

Server

Client

Client

Client Client

Client

Static connexion

Sending the files list

File transfert

Sending a request for a file

Users send the list of their �les to a server

To �nd a �le, you send a request to the server

It answers with the list of clients owning the �le

You directly contact the owners for the transfer

[email protected] 20/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Index Method II

Systems

Napster, Mojonation, Yaga, Filetopia, Seti@Home

Problems

Scaling

Price

HotSpot

Attack

Single point of failure

[email protected] 21/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Useful when...

Small number of client

Need a total control of transfers (video game industry)

Performance is more important than cost

[email protected] 22/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

BitTorrent

Same approach as Napster, but :

Downloads are done in parallel

One server per �le

Server manages all the details of transfers

Server enforces the rule The more you share, the more you get

Di�erences

Specialized for large �les

Distributed due to the One server per �le rule

[email protected] 23/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Privacy

No privacy

Napster : The server knows all transfers

BitTorrent : For each �le, a server knows all transferts

[email protected] 24/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Flooding

ClientClient

ClientClient Client

Client

Client

You send your request to your neighbors

They forward it to their neighbors, and so on until reaching the Time To Live depth

Users with �les corresponding to the request answer

[email protected] 25/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Flooding II

Systems

Gnutella, Direct Connect

Characteristics

Distributed structure

No single point of failureDenial of service di�cult (but possible)

Not scalable

Resource consumption (network)Not complete answers

[email protected] 26/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Privacy

Average to good privacy

Onion routing (good privacy)

No global view of the system

Usually easy to obtain the shared list of a node

Di�cult to have a global impact

[email protected] 27/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Super Peers

SuperPeerSuperPeer SuperPeer

SuperPeerSuperPeer

SuperPeer

PeerPeer Peer

Super Peers act as local servers

Some reliable nodes act as super peers

Super peers are connected with a gnutella protocol

Each super peer acts as a local server for several peers

[email protected] 28/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Super Peers II

Systems

Gnutella2, Kazaa

Characteristics

Less distributed structure

Some nodes are more loadedSome nodes are more important

Scalable

Less resource consumption due to limits of number of answers

[email protected] 29/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Plan

1 What is P2P

2 First generation systems

3 Self-organized systems

4 Structured systems

5 Distributed Hash Table

6 Conclusion

[email protected] 30/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

A case study : Freenet

Ian Clarke, University of Edinbourgh, (1999)

Keywords

A peer-to-peer �le sharing system

Provide anonymity for authors and readers

A web of Freedom

Principle

Files are referenced by key

The key is obtained by SHA-1 on the �le

The key is routed to localize the �le

[email protected] 31/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Content Driven routing algorithm

Routing table contains a set of key/node pairs

Take the nearest key in the routing table to obtain the next node to consult.

Nearest key = by lexical comparison

Request

Data

3

2 5 6

4

1

78a b

c d

e

abc

node cacdabb

node [email protected] 32/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

On the path of the answer

File is replicated on the path in the cacheCache : variant of Last Recently UsedRouting tables are updated

→ the graph evolves (new links = new entries)

a b

c d

e

Old links New links (entries)node c

acdabb

node babc node d

[email protected] 33/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Anonymity

Reader

Impossible to know if a user is forwarding or initiating the request

Impossible to know if a user is the last to receive a �le

Writer

Once in the system, the writer can disconnect

Impossible to know if someone insert some �le or forward it

[email protected] 34/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Some properties

Self-organization of the graph

Nodes specialize in �les with close keys (learning process)

Good properties (Small World)

File are automatically replicated in function of their popularity

Hot-spots are limited

Tolerant against attacks

[email protected] 35/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Drawbacks

Counterpart

Files might disappear (LRU cache)

The network is heavily loaded

Di�cult to update a value

Impossible to know what is hosted locally

[email protected] 36/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Plan

1 What is P2P

2 First generation systems

3 Self-organized systems

4 Structured systems

5 Distributed Hash Table

6 Conclusion

[email protected] 37/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Pastry

Principle

Each �le has a key

Each node has an identi�er

Node with identi�er Id manages keys whose values are near Id

Queries

Content driven queries

Su�x forwarding

[email protected] 38/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Pastry II

05981598259835984598559865987598

x098x198x298x398x498x598x698x798

xx08xx18xx28xx38xx48xx58xx68xx78

xxx0xxx1xxx2xxx3xxx4xxx5xxx6xxx7

Links to the neighbor

Table of the node 4598

87CA

D598

1598

2118

09 98

8 F4B

0325

4598

3E98

00982BB8

598 8

Neighbors of Id are chosen as to have the su�x of their identi�er in common withId

[email protected] 39/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Pastry III

Pros

ln(n) messages guarantee

Good path redundancy

Cons

Di�cult to keep a synchronized neighbor table

Problem of data redundancy

No adaptation to data dynamicity

[email protected] 40/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Plan

1 What is P2P

2 First generation systems

3 Self-organized systems

4 Structured systems

5 Distributed Hash Table

6 Conclusion

[email protected] 41/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Current state of Peer to Peer systems

A lot of redundant systems

Typically File Sharing

Common basic component

Distributed index (Key, Value)

Key is typically the �lename

Value is typically the �le content or where to obtain it

Each Key is associated with a node

[email protected] 42/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Generic Interface

Node Id : k-bit identi�er (unique)

Key : k-bit identi�er (unique)

Value : bytes (can be a �le, an IP, ...)

Generic DHT (Distributed Hash Table)

put(key, value)

Stores (key, value) on the node responsible of key

value = get(key)

Retrieves the data associated with key

[email protected] 43/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Current implementations

Software

Kadmelia

Chord

CAN

Usage

File sharing

Naming

Chat service

Databases

[email protected] 44/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Still limited

Fundamental Problems

Complex request

Data coherence

Request with several answer

Implementation di�culties

Distribute workload evenly

KeysRequests

Only local information

Dynamic information

[email protected] 45/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Chord structure

Nodes are distributedon a circle

Keys are assigned tothe node with Id justbefore their value

0

128

64192

75

61Key in store of 61

62, 66 74

[email protected] 46/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Neighbors

Log(N) neighbors

Neighbors are nodesId + 1, Id + 2, Id +4, ..., Id + 2i , ..., Id +2k−1 (modulo 2k).

0

128

64192

[email protected] 47/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Routing algorithm

Forward to theneighbor which isprior to the key

Query needs at mostLog(N) messages

0

128

192 64

Query for 40

Node responsible of

Id between 35 and 50

[email protected] 48/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Chord characteristics

E�cient

If a (key, value) exists, the query will �nd it

Fast : Log2(1.000.000) = 23

Small neighbors table Log2(N)

[email protected] 49/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Chord characteristics

Some problems

Security and privacy

Attack

How to test and evaluate such system ?

Real performance (instead of number of messages)

[email protected] 50/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Physical overlay

Logical topology mapped in the physical network :

N2

N1

Query

Answer

[email protected] 51/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Plan

1 What is P2P

2 First generation systems

3 Self-organized systems

4 Structured systems

5 Distributed Hash Table

6 Conclusion

[email protected] 52/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Conclusion

Peer to Peer systems are e�cient for several uses (using border resources)

Recent systems are scalable

Low cost alternative to Client/Server

Field old enough to be used in real cases

Still not perfect

Trust & certi�cationAnonymitySecurityPerformanceLayers fees

[email protected] 53/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

When to use Peer to Peer systems

Limited budget

Large audience

Trusted users

Dynamic system, but not too much

Do not need guarantee

Do not need control

[email protected] 54/55

What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion

Vision of the future

User centered

No more serversAll content provided and served by users

Only cooperation of peers

WikipediaSocial networksYoutubeGood Ol' Time web-pages

[email protected] 55/55