Peer-to-Peer Systems · The eDonkey Network - Principle Distributed server(s) ! Set up and RUN BY...
Transcript of Peer-to-Peer Systems · The eDonkey Network - Principle Distributed server(s) ! Set up and RUN BY...
Peer-to-Peer Systems
Winter semester 2014 Jun.-Prof. Dr.-Ing. Kalman Graffi Heinrich Heine University Düsseldorf
Peer-to-Peer Systems
Unstructured P2P Overlay Networks – Unstructured Heterogeneous Overlays
This slide set is based on the lecture "Communication Networks 2" of Prof. Dr.-Ing. Ralf Steinmetz at TU Darmstadt
3 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Unstructured P2P Structured P2P
Centralized P2P Homogeneous P2P Heterogeneous P2P DHT-Based Heterogeneous P2P
1. All features of Peer-to-Peer included
2. Central entity is necessary to provide the service
3. Central entity is some kind of index/group database
Examples: § Napster
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. ! no central entities
Examples: § Gnutella 0.4 § Freenet
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. ! dynamic central entities
Examples: § Gnutella 0.6 § Fasttrack § eDonkey
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. ! No central entities
4. Connections in the overlay are “fixed”
Examples: § Chord § CAN § Kademlia
1. All features of Peer-to-Peer included
2. Peers are organized in a hierarchical manner
3. Any terminal entity can be removed without loss of functionality
Examples: • AH-Chord • Globase.KOM
from R.Schollmeier and J.Eberspächer, TU München
Unstructured Heterogeneous P2P Overlays
4 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Principles – Hierarchical / Heterogeneous
Approach: to combine best of both worlds § Robustness by distributed
indexing § Fast searches by server
queries
Components § Supernodes
• Mini servers / super peers • Used as servers for queries
– To build a sub-network between supernodes
– Queries distributed at sub-network between supernodes
§ “Normal” peers • Have only overlay
connections to supernodes
++ Advantages § More robust than
centralized solutions § Faster searches than in
pure P2P systems
-- Disadvantages § Need of algorithms to
choose reliable supernodes
Picture from R.Schollmeier and J.Eberspächer, TU München
5 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
History of P2P Filesharing Networks
8 / 25
© F
rau
nh
ofe
r-G
esellschaft
20
12
Peer-to-Peer Filesharing
6 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Decentralized File Sharing with Distributed Servers
For example: eDonkey see e.g.
• http://www.overnet.org/ • http://www.emule-project.net/ • http://savannah.gnu.org/projects/mldonkey/
eDonkey file-sharing protocol
§ Most successful/used file-sharing protocol in • e.g. Germany & France in 2003 [see sandvine.org]
– 52% of generated P2P file sharing traffic – KaZaA only for 44% in Germany
§ Stopped by law • February 2006 largest server „Razorback 2.0“ disconnected be
Belgium police – http://www.heise.de/newsticker/eDonkey-Betreiber-wirft-endgueltig-das-
Handtuch--/meldung/78093
7 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
The eDonkey Network - Principle
Distributed server(s) § Set up and RUN BY POWER-USERS
§ à nearly impossible to shut down all servers § Exchange their server lists with other servers
• using UDP as transport protocol § Manages file indices
Client application
§ Connects to one random server and stays connected
§ Using a TCP connection § Searches are directed to the server § Clients can also extend their search
• by sending UDP search messages to additional servers
8 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Edonkey functionality
eDonkey hash can be used for several queries § eDonkey server
• Search for peers – Servers block requests if too many requests are sent
§ Kad network à additional structure p2p overlay • Search for peers (including peers behind a firewall)
– Very efficient (10 requests per second) Queries to peers – Finds more peers than found using servers
• Ratings and comments for all Kad peers – Not used very widely
§ Directly from the peer (requests to a specific file) • Query for the filename
– About 65 % of all peers answer with filename • Ratings and comments of the peer • Search for further peers
9 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
The eDonkey Network
Server List Exchange
Download
Extended Search
Search
TCP
UDP
Supernode
Node
10 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
The eDonkey Network
Procedure § New servers send
• their port + IP to other servers (UDP)
§ Servers send • server lists (other servers
they know) to the clients § Server lists can also be
downloaded on various websites
Server List Exchange
Download
Extended Search
Search
Supernode
TCP
UDP
Node
11 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
The eDonkey Network
Files are identified by § Unique MD4
• Message-Digest Algorithm4, RFC 1186 file hashes
• 16 byte long § Are not identified by
filenames
§ This helps in • Resuming a download
from a different source • Downloading the same file
from multiple sources at the same time
• Verification that the file has been correctly downloaded
10 / 25
© F
rau
nh
ofe
r-G
esellschaft
20
12
eDonkey
● Filesharing network with most files
● Centralized P2P network with many eDonkey servers
● Additional DHT: Kad
● eDonkey hash is created directly from file content
12 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
The eDonkey Network
à the SEARCH consists of two steps
1. Full text search to • Connected server (TCP) or • Extended search with UDP to other known servers.
§ Search result are the hashes of matching files 2. Query Sources
• Query servers for clients offering a file with a certain hash
Later • Download from these sources
Status: 1,229,568 users, 37,399,014 files (30.08.2012)
13 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
The eDonkey Network
à the alternate SEARCH consists of two steps
0. Participate in the KAD network 1. Know MD4 – hash of file 2. Query Sources in KAD
§ Send lookup to node responsible for file hash § Query responsible node for clients offering the
Later
• Download from these sources
Status: 600k-2M users, 200M-600M files (30.08.2012)
14 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Testing the Content in Edonkey Networks
Forensic Test set Consists out of about 1500
files § Images, music, videos,
documents and miscellaneous
§ Images: Fraunhofer, Windows 7, KDE, 4chan
§ Music: mainly three big music collections
§ Videos: YouTube, Open Source Films, P2P, ...
§ Documents: diverse PDFs, Fraunhofer, BitTorrent
§ Miscellaneaous: Zips, executables, Malware, ...
21 / 25
© F
rau
nh
ofe
r-G
esellschaft
20
12
Hits
● Hitrate: 385 / 1479 (26 %)
15 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
KaZaA: Decentralized File Sharing with Super Nodes
see § www.kazaa.com, gift.sourceforge.net, http://www.my-k-lite.com/
System § Developer: Fasttrack § Clients: KaZaA
Properties: § Most successful P2P network in USA in 2002/3
Architecture: neither completely central nor decentralized § Supernodes to reduce communication overhead
Ca. 525.00010528 Mio.120.000Gnutella600.000650-260013 Mio.230.000eDonkey4 Mio.3550472 Mio.2,6 Mio.Fasttrack
#downloads(from
download.comterabytes#files#usersP2P system
Ca. 525.00010528 Mio.120.000Gnutella600.000650-260013 Mio.230.000eDonkey4 Mio.3550472 Mio.2,6 Mio.Fasttrack
#downloads(from
download.comterabytes#files#usersP2P system
Numbers are from 10‘2002
16 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Decentralized File Sharing with Super Nodes
Examples: KaZaA, Gnutella 0.6 (Morpheus, mldonkey) Peers
§ Connected only to some super nodes § Send IP address and file names only to super peers
Super nodes - super peers: § Peers with high-performance network connections § Take the role of the central server and proxy for simple peers § Answer search messages for all peers (reduction of comm. load) § One or more supernodes can be removed without problems
Additionally, the communication between nodes is encrypted
Download
Service Delivery
Search
Superpeer
Peer
Search
17 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Example for KaZaA
18 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Decentralized File Sharing with Complete Files
At the beginning
Later
Drone 1 has § 50 KB/s upload rate § not utilized
until he has whole file
From www.wtata.com
Drone 1 receives § 25% of the file § at 12,5 KB/s rate
Queen Bee has § 100 MB file § 50 KB/s upload rate
in total
19 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Issues with KaZaA / Gnutella 0.6
Keyword-based search § You do not know what you get § Pollution a problem
• Music companies flooded the network with false files • Chance to get a “good” file ~ 10% • Problem for “small” files
Full file download before uploading
§ User go offline after download finished § Only few uploaders online § Problem for “large” files
20 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Google Trends for KaZaA, Limewire, Torrent, Emule
http://www.google.com/trends?q=kazaa,+limewire,+torrent,+emule
21 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Unstructured Hybrid Resource Sharing: Skype
Offered Services § IP Telephony features § File exchange § Instant Messaging
Features § KaZaA technology § Encrypted high media quality § Support for teleconferences § Multi-platform
Further Information § Very popular, low-cost IP
telephony § SkypeOut extension to call regular
phone numbers (not free) § Great business potential if
combined with free WIFIs Oct.2011 bought by Microsoft
§ Super nodes are now servers From www.skype.com
Very popular
22 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Skype
Network Architecture § formerly KaZaA based
Skype login server
message exchange at login
super node
regular node
23 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Exercise
21
3
5
4
7
6
8
5
43
5
1
6
2
8
1
7
43
52
6
2 1
7
910
21 3
4
4
76
8
5
910
43
536
2 1
78
... ... ...
... ... ...
... ... ...
... ... ...
24 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Problem 2.2 - Super Hyper Hierarchical Networks
Let us imagine a hierarchical overlay with three hierarchy steps: there are normal peers, super peers, and hyper peers.
a) Number of hyper peers needed § Assume that a super peer cares for 100 normal peers, and a
hyper peer is responsible for 100 super peers. How many hyper peers would we need in a network of 999 000 peers in total?
b) Querying in the hyper-super-overlay § Suggest a way how such a network could handle search queries.
Which information should a super peer maintain? What should a hyper peer know? How would a search query be processed?
Peer-to-Peer Systems
Structured Homogenous P2P Overlay Networks – Distributed Indexing and DHTs
This slide set is based on the lecture "Communication Networks 2" of Prof. Dr.-Ing. Ralf Steinmetz at TU Darmstadt
26 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Unstructured P2P Structured P2P
Centralized P2P Pure P2P Hybrid P2P DHT-Based Hybrid P2P
1. All features of Peer-to-Peer included
2. Central entity is necessary to provide the service
3. Central entity is some kind of index/group database
Examples: § Napster
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. ! no central entities
Examples: § Gnutella 0.4 § Freenet
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. ! dynamic central entities
Examples: § Gnutella 0.6 § Fasttrack § eDonkey
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. ! No central entities
4. Connections in the overlay are “fixed”
Examples: § Chord § CAN § Kademlia
1. All features of Peer-to-Peer included
2. Peers are organized in a hierarchical manner
3. Any terminal entity can be removed without loss of functionality
Examples: • AH-Chord • Globase.KOM
from R.Schollmeier and J.Eberspächer, TU München
Structured P2P Overlays: Principles
27 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Structured Overlay Networks: Interconnection Networks
Structured Overlay Networks § Give peers and objects (unique) identifier
• PeerIDs and ObjectIDs shall be from the SAME key set • Each peer is responsible for a specific range of ObjectIDs
§ Indexing (knowledge on location of resources) to be distributed § No search needed anymore (local indexing) § No server knowing all (global indexing) available
New challenge: to find peer(s) with specific ID in overlay
§ Lookup: • “Route” queries across the overlay network to peer with specific ID
§ Once peer is found • Initiate direct communication • Upload / download resources
28 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Functions in a Structured P2P Overlay (all)
IsMyKey(K) à true if node is responsible for Key K Route(K, M, hint) à send message M to node responsible for K
§ Hint: Optional first hop GetNodeHandle (K, hint) à get contact details of responsible node Send(M, q) à Send Message M to node q
29 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Schematic View on Distributed Hash Table
30 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Additional Functions in a Distributed Hash Table
Put (Data D, Key K) à Copies Data to node responsible for K GetData (Key K) à Gets Data stored under the Key K Optional further functions:
§ Replication § Secure Communication § Access Control
H(„my data“)= 3107
2207
7.31.10.25
peer-to-peer.info
12.5.7.31
95.7.6.10
86.8.10.18
planet-lab.orgberkeley.edu
29063485
201116221008709
611
89.11.20.15
?
31 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Look up in Structured P2P Systems
Principle § Location of the objects is found via routing
• � Node A (provider) advertises object at responsible peer B » Advertisement is routed to B.
• � Node C looking for object sends query » Query is routed to responsible node.
• � Node B replies to C by sending contacting information of A
1. Publish link at responsible Peer
3. P2P com-munication. Get link to object.
2. “Routing” to / Lookup of desired Object
?
Node A
Node B
Node C
32 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Strategies for Data Retrieval: Distributed Indexing
Goal is scalable complexity for § Communication effort: O(log(N)) hops § Node state: O(log(N)) routing entries
H(„my data“)= 3107
2207
7.31.10.25
peer-to-peer.info
12.5.7.31
95.7.6.10
86.8.10.18
planet-lab.orgberkeley.edu
29063485
201116221008709
611
89.11.20.15
?
Routing in O(log(N)) steps to the node storing the data
Nodes store O(log(N)) routing information to other nodes
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
33 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Recall Hash Function & Hash Table
Hash function H(x) § maps
• large input domain § onto
• smaller target domain/range (most often subset of integer)
§ such that • we get few collisions • i.e.
– it would be possible to uniquely identify most of these strings using this hash
Hash table
§ data structure that provides fast lookup
§ of a record indexed by a key § where
• the domain of the key is too large for simple indexing; as would occur if an array were used
Like arrays, hash tables can provide O(1) lookup § with respect to the number
of records in the table.
And .. Question
§ IF H(x) ≠ H(y) • THEN (implies) x ≠ y ?
(yes)
§ IF H(x) = H(y) • THEN (implies) x = y ?
(no)
34 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Recall Hash Tables & Hash Functions
Hash tables are a well-known data structure § A fixed-size array § Elements of array also called “hash buckets”
Properties
§ allow insertions § allow deletions § allow to find entry in constant (average) time
Hash functions
§ map keys to elements onto (in) the array Properties of good hash functions:
§ Fast to compute § Good distribution of keys into hash table § Example: SHA-1 algorithm
• SHA = Secure Hash Algorithm
35 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Hash Tables: An Example
Assume a network of N=10 nodes Hash function:
§ hash(x) = x mod 10 (=N)
Example
§ Insert numbers 0, 1, 4, 9, § 16, and 25
Properties
§ Easy to find if a given key is present in the table
Hash Table, an example
Keys Values
2 3 4 5 6 7 8 9
1 0 0
1
4 25 16
9
36 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Hash Tables: An Example
Drawback of the example § Collisions are likely to happen
§ Time to search grows linearly
with amount of peers
§ To insert and remove a peer scales also linearly
§ Hash function must be adapted to the amount of available peers and it is extremely time consuming
Distributed Hash Table DHT § Huge hash table (2^160 entries) § Assigns concatenated input
RANGE to peers • (instead of individual numbers)
Hash Table, an example
2 3 4 5 6 7 8 9
1 0 0
1
4 25 16
9
Keys Values
37 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Design Aspects for Distributed Hash Tables
1. Choice of an identifier space 2. Mapping of resources and peers to the identifier space 3. Management of the identifier space by the peers 4. Graph embedding (structure of the logical network) 5. Routing strategy 6. Maintenance strategy
38 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Group of peers P Group of resources R Overlay maps resources R and peers P on identifier space I Example I: § Chord: [0, 2^160[ § Pastry: [0, 2^128[ § CAN: multidimensional
Overlay Network: Design Decisions
IPFP →:
IRFR →:
39 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
(1) Choice of Identifier Space
Importance of Identifier Space: § Identifier space needed for addressing resources and peers
• Often: ID (object ) = hash (object content) § Identifier space should be large to support large systems § Identifier space independent from physical location of peer à
mobility of peers § Clustering of resources due to closeness metric of identifier space § Message routing uses identifier space
40 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
(1) Choice of Identifier Space
à Main addressing ID space The identifier space must posses closeness metric d: Which MUST satisfy the following conditions: And SHOULD satisfy the following conditions:
R: →× IId
( ) 0,:, ≥∈∀ yxdIyx( ) 0,: =∈∀ xxdIx( ) yxyxdIyx =→=∈∀ 0,:,
( ) ( )xydyxdIyx ,,:, =∈∀( ) ( ) ( )zydyxdzxdIzyx ,,,:,, +≤∈∀
41 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
(2) Mapping to the Identifier Space
à Assigning ID addresses to peers (à Peer ID) Possible design decisions: Completeness: may be complete or partial Identifier space should be injective Dynamicity: may be fixed or change dynamically over time
)()(:, qFpFqpPqp PP ≠⇒≠∈∀
PF
PF
42 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
(3) Management of Identifier Space
Peers are responsible for resource identifiers Identifier space I is managed by peers P: Responsibility function: Which associates
§ the identifiers of a resource § with a set of peers managing the resource
Through M § each peer p is assigned responsibility § for a set of identifier
Locating a resource corresponds to finding a peer p in
PIM 2: →
))(( IrFi R ∈=
)(1 pM −
))(( rFM R
Rr∈
43 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Responsibility Function
Basic properties of M: Completeness: Cardinality:
§ One or more peers are responsible for given identifier M § OFTEN induced by proximity: identifiers are associated with peers
that are numerically closest
Dynamicity:
§ the responsibility function typically changes as peers join and leave Uniformity / non-uniformity of replication
)(:: iMpPpIi ∈∈∃∈∀
)),((min)),(()( iqFdipFdiMp PPqp∈
=⇒∈
44 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
(4) Graph Embedding
à Creating a graph / network with the peers Overlay can be modeled as a graph G=(P,E) A neighborhood function
defines the neighbor set N(p) of the peer p Which means that maintains all connected nodes as
neighbors: Typically: Here the overlays differ in the implementation
§ Which nodes to connect to § How to set up the routing table
PPN 2: →
Pp∈EqppNq ∈∈∃ ),(:)(
45 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
(4) Graph Embedding – Desired properties
Desired properties of the graph are: Graph diameter:
§ a small graph diameter should provide lower bounds for latencies during routing
Connectivity: § the graph should be connected at any time
Local connectivity: § a peer should be connected to a subset of its immediate neighbors
Long-range connectivity: § overlay connectivity should be structurally similar to small world graphs and
should satisfy the condition:
§ with d denoting the dimension of the identifier space § (Many “close” contacts, few “far away” contacts)
[ ] dPP qFpFdpNqP −≈∈ ))(),((
1)(
46 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
(5) Routing strategy
à Routing in the network to the queried identifier Routing is modeled as asynchronous message passing
§ which forwards a message m with identifier i to peer p
route(p,i,m) A routing strategy defined as a non-deterministic function:
Which selects at
§ A given element of I as destination ID § a given peer p § with neighborhood N(p) § the set of next peers
PIPR 2: →×
)(),( pNipR ∈
47 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Routing strategy: Greedy routing
Mostly used in structured overlays For a routing step from peer p to peer q with the follow condition holds: Which means that
§ the distance to the target § after one routing step is § less or equal to the distance before
• (Ideally: distance is halved in order to reach O(log(N)) routing steps)
)),(()),(( iqFdipFd Pp ≤
),( ipRq∈
48 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
(6) Maintenance Strategy
à Keeping the routing information up-to-date Maintenance strategies can be classified into:
§ Proactive correction (e.g. using heartbeat messages) § Reactive mechanisms
• Correction on use • Correction on failure • Correction on change
Goal for maintenance strategy: § Sufficient level of consistency § Minimize effort
49 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Design Concepts for P2P Overlays
Key design concepts : • choice of an identifier space • mapping of resources and peers to the identifier space • management of the identifier space by the peers • graph embedding (structure of the logical network, selection of
contacts) • routing strategy • maintenance strategy
Unstructured vs. structured § Structured: object IDs and peer IDs share same ID space
• Every object ID is assigned to one single peer • Lookup possible := routing to peer being responsible for desired
object ID
50 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Motivation Distributed Indexing
Communication overhead i.e.
§ no. of hops vs. § State(s) of node
• (i.e. amount of routing entries stored in node, e.g. server)
Com
mun
icat
ion
Ove
rhea
d
State(s) of Node
Flooding
Central Server
O(Nk)
O(N) O(1)
O(1)
O(log N)
O(log N)
Bottleneck: • Communication Overhead
• False negatives (viele Fehlmeldungen) Bottlenecks: • Memory, CPU, Network • Availability ?
Scalable solution between both extremes?
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
51 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Motivation Distributed Indexing
Communication overhead vs. node state
Com
mun
icat
ion
Ove
rhea
d
Node State
Flooding
Central Server
O(N)
O(N) O(1)
O(1)
O(log N)
O(log N)
Bottleneck: • Communication Overhead
• False negatives Bottlenecks: • Memory, CPU, Network • Availability Distributed
Hash Table
Scalability: O(log N) No false negatives
§ i.e. Never ( answer YES .. if it is NOT there )
More resistant against changes § Failures, Attacks § Short time users
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
52 HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Fundamentals of Distributed Hash Tables
Challenges for designing DHTs: 1. Desired Characteristics
§ Flexibility § Reliability § Scalability
2. Load balancing
§ Equal content “load” for all nodes § vs. content load proportional to node capacity § vs. content load proportional to content consumption
3. Permanent adaptation to faults, join, leave of nodes
§ Assignment of responsibilities to new nodes § Re-assignment and re-distribution of responsibilities
in case of node failure or departure