-1-
Peer-to-Peer (P2P) Computing
Tutorial
April 29, 2003
한재선 ([email protected])
SSL (System Software Lab), EECS, KAIST
-2-
Roadmap
▣ The Past of P2P Computing
◈ P2P File Sharing System
◈ P2P Distributed Computing
◈ P2P Collaboration
▣ The Present of P2P Computing
▣ The Future of P2P Computing
-3-
P2P File Sharing : Napster
▣ Centralized lookup
◈ Centralized directory service
◈ Single point of failure
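[Figure: Napster. Peers N1 to N5 surround a centralized directory server (napster.com) that keeps a file → node table (e.g., a.mp3 → N5). A peer sends Search(a.mp3) to the server and receives N5's IP address, then sends Request(a.mp3) to N5 and receives File(a.mp3).]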
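Sketch (not from the original slides): the two-step Napster exchange can be modeled in a few lines of Python. The class and method names (`DirectoryServer`, `Peer`, `register`, `search`, `request`) are illustrative assumptions, not Napster's actual protocol messages.

```python
class DirectoryServer:
    """Central index: file name -> set of peers that hold it."""

    def __init__(self):
        self.index = {}

    def register(self, peer_addr, file_names):
        for name in file_names:
            self.index.setdefault(name, set()).add(peer_addr)

    def search(self, name):
        # Returns peer addresses only; the file itself never passes through here.
        return sorted(self.index.get(name, ()))


class Peer:
    def __init__(self, addr, files):
        self.addr = addr
        self.files = dict(files)  # file name -> content

    def request(self, name):
        return self.files[name]


peers = {"N1": Peer("N1", {}), "N5": Peer("N5", {"a.mp3": b"audio-bytes"})}
server = DirectoryServer()
for peer in peers.values():
    server.register(peer.addr, peer.files)

holders = server.search("a.mp3")            # step 1: Search(a.mp3) -> ["N5"]
data = peers[holders[0]].request("a.mp3")   # step 2: Request(a.mp3) directly to N5
```

Only the lookup is centralized; the transfer is peer to peer. If `server` disappears, no file can be found even though every peer is still up, which is the single point of failure noted on the slide.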
-4-
P2P File Sharing : Gnutella
▣ Flooding-based lookup
◈ Completely decentralized lookup
◈ High network overhead
[Figure: Gnutella. Search(a.mp3) is flooded from peer to peer across nodes N1 to N9. Search result: N3, N5, N8. Selected node: N5.]
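Sketch (not from the original slides): flooding can be modeled as a breadth-first broadcast with a time-to-live (TTL). The topology and file placement below are hypothetical; only the result set (N3, N5, N8) follows the slide. The message counter makes the "high network overhead" point concrete.

```python
from collections import deque


def flood_search(graph, files, start, name, ttl):
    """Return (sorted nodes holding `name`, number of query messages sent)."""
    seen = {start}
    queue = deque([(start, ttl)])
    hits, messages = [], 0
    while queue:
        node, remaining = queue.popleft()
        if name in files.get(node, ()):
            hits.append(node)
        if remaining == 0:
            continue  # TTL exhausted: this node does not forward the query
        for neighbor in graph[node]:
            messages += 1              # every forwarded copy costs a message
            if neighbor not in seen:   # duplicate copies are dropped on arrival
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return sorted(hits), messages


# Hypothetical overlay links (undirected, listed in both directions).
graph = {
    "N1": ["N2", "N3"],
    "N2": ["N1", "N4", "N5"],
    "N3": ["N1", "N6"],
    "N4": ["N2"],
    "N5": ["N2", "N8"],
    "N6": ["N3"],
    "N8": ["N5"],
}
files = {"N3": {"a.mp3"}, "N5": {"a.mp3"}, "N8": {"a.mp3"}}

hits, messages = flood_search(graph, files, "N1", "a.mp3", ttl=3)
# hits == ["N3", "N5", "N8"], messages == 11 for a 7-node network
```

A smaller TTL cuts traffic but also shrinks the search horizon: with `ttl=1` the same query finds only N3.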
-5-
P2P File Sharing : Freenet
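[Figure: Freenet. Lookup Key = 325 is forwarded from node to node using each node's routing table, and Value(325) is returned. Routing tables (key → node):
Node N120 : 130 → N128, 142 → N140, 149 → N145
Node N145 : 164 → N160, 185 → N182, 204 → N200
Node N200 : 240 → N228, 288 → N280, 321 → N315
Node N315 : 340 → N338, 378 → N372, 399 → N387]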
▣ Key-based lookup
◈ Key components : key, nodeId, routing table
◈ A file is not stored on its creator's node
◈ The lookup hop count cannot be guaranteed
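Sketch (not from the original slides): the figure's routing tables can be replayed with a "forward to the entry whose key is closest" rule. Placing Value(325) on N315 is an assumption read off the figure, and real Freenet also uses hops-to-live and backtracking, which are omitted here.

```python
# Routing tables from the figure: node -> {known key: node believed to hold it}.
ROUTING = {
    "N120": {130: "N128", 142: "N140", 149: "N145"},
    "N145": {164: "N160", 185: "N182", 204: "N200"},
    "N200": {240: "N228", 288: "N280", 321: "N315"},
    "N315": {340: "N338", 378: "N372", 399: "N387"},
}
# Assumption: the requested value is stored on N315.
STORED = {"N315": {325: "Value(325)"}}


def freenet_lookup(start, key, max_hops=10):
    """Follow the closest-key routing entry at each hop; return (value, path)."""
    node, path = start, [start]
    for _ in range(max_hops):
        if key in STORED.get(node, {}):
            return STORED[node][key], path
        table = ROUTING.get(node)
        if not table:
            break  # dead end; real Freenet would backtrack here
        closest = min(table, key=lambda known: abs(known - key))
        node = table[closest]
        path.append(node)
    return None, path


value, path = freenet_lookup("N120", 325)
# path == ["N120", "N145", "N200", "N315"]
```

Each node only knows its own table, so the path length depends on what the tables happen to contain. That is why the hop count cannot be guaranteed.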
-6-
P2P Distributed Computing
[Figure: a centralized server sends code + parameters to nodes N1 to N5, and the nodes send results back.]
◈ Sharing computing resources (CPU time) → cycle stealing
◈ @home projects (SETI@home, folding@home, genome@home, etc.), distributed.net
-7-
P2P Collaboration
[Figure: peers N1 to N6 connected directly to one another for Collaborative Design and Instant Messaging.]
◈ Instant messaging, P2P Groupware (e.g., Groove) etc
-8-
Discussion on the past of P2P
▣ Summary
◈ P2P File Sharing System → sharing files
◈ P2P Distributed Computing → sharing CPU time
◈ P2P Collaboration
▣ Questions
◈ What features do the three types of P2P applications have in common?
◈ Compared with traditional client/server computing, what advantages and disadvantages does P2P computing have?
◈ Which challenges or issues must P2P computing address?
-9-
Roadmap
▣ The Past of P2P Computing
◈ P2P File Sharing System
◈ P2P Distributed Computing
◈ P2P Collaboration
▣ The Present of P2P Computing
◈ The characteristics of P2P computing
◈ The issues of P2P computing
  - P2P algorithms
  - P2P applications
  - P2P platforms
▣ The Future of P2P Computing
-10-
What is P2P Computing?
▣ What is P2P computing?
◈ New computing paradigm
◈ Every node is both client and server
◈ Sharing computing resources (e.g., storage, CPU power, etc.) among peer nodes
[Figure: Client/Server Computing (clients around a centralized server) versus Peer-to-Peer Computing (peer nodes N1 to N6 that hold a.mp3 and b.mp3 and send Search(a.mp3) and Search(b.mp3) to one another).]
-11-
Characteristics of P2P Computing
▣ Resource sharing : storage, processing, networking, information, etc.
▣ Fully decentralized : no distinction between client and server
▣ Node characteristics : heterogeneous, unreliable, insecure
▣ Scalability
◈ vs. determinism and performance guarantee
◈ C/S → Napster & SETI@home → Gnutella & Freenet → DHT
▣ Fault Resilience
◈ No single point of failure, but disconnections/unreachability, partitions, node failures, and non-availability of resources
▣ Self-organization
◈ Needed for scalability, fault resilience, ad hoc resources, cost of ownership
▣ Ad-Hoc connectivity
◈ How can quality of service be guaranteed? e.g., redundancy
▣ Anonymity
◈ Author, Publisher, Reader, Server, Document, Query, etc.
◈ Freenet (identity spoofing), Free Haven & Publius (covert path)
-12-
Issues of P2P Computing
▣ Issues on P2P algorithms
◈ Lookup mechanism : structured & unstructured
◈ Locality-aware or topology-aware routing
◈ Load balancing & replication & caching
◈ Security
◈ Search
◈ Incentive & reputation
◈ Measurement & theoretical analysis
▣ Issues on P2P applications
◈ Wide-area file system : CFS, PAST, Ivy, OceanStore, Farsite, etc.
◈ Overlay multicast : Scribe, CAN multicast, Bayeux, etc.
◈ Web applications : Squirrel, Backslash, etc.
◈ Many other applications : naming system (P2P DNS, INS), communication (I3), etc.
▣ Issues on P2P platforms
◈ JXTA
-13-
Structured Lookup : DHT-based (1)
▣ Distributed Hash Table (DHT)-based lookup
◈ Lookup key = hash of the file name
◈ NodeId = hash of the node’s IP address
◈ Each node stores a certain range of keys according to its nodeId
◈ lookup(key) returns the value or the IP address of the node storing the object with that key
▣ Examples of DHT-based lookup
◈ Chord (MIT), CAN (UCB), Pastry (Rice & MS), Tapestry (UCB), etc.
[Figure: distributed applications call Insert(key, data) and Lookup(key) → data on the distributed hash table layer, which is spread over many nodes.]
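Sketch (not from the original slides): a toy, single-process DHT showing how hashing places both keys and nodes in one ID space. The 16-bit ID space, the `ToyDHT` class, and the "first node at or after the key owns it" rule (Chord-style) are assumptions chosen for illustration; a real DHT routes between nodes instead of consulting a global sorted list.

```python
import hashlib
from bisect import bisect_left

M = 16  # assumed number of identifier bits (real systems use e.g. 128 or 160)


def ident(text):
    """Map a file name or an IP address into the shared ID space via SHA-1."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


class ToyDHT:
    def __init__(self, node_addrs):
        self.ring = sorted((ident(addr), addr) for addr in node_addrs)
        self.store = {addr: {} for addr in node_addrs}

    def owner(self, key_id):
        # First node whose ID is >= key_id, wrapping around the ring.
        ids = [node_id for node_id, _ in self.ring]
        i = bisect_left(ids, key_id) % len(self.ring)
        return self.ring[i][1]

    def insert(self, name, data):
        self.store[self.owner(ident(name))][name] = data

    def lookup(self, name):
        return self.store[self.owner(ident(name))].get(name)


dht = ToyDHT(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
dht.insert("a.mp3", b"audio-bytes")
found = dht.lookup("a.mp3")  # the same hash leads back to the same node
```

The application sees only `insert` and `lookup`; which node holds the data is decided entirely by the hash values.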
-14-
Structured Lookup : Chord (1)
▣ Keys and nodes are uniformly distributed in the same ID space
▣ Each node is responsible for the keys between its predecessor and itself
▣ Simple lookup algorithm - naïve version
◈ The lookup request is forwarded from successor to successor until node ID < key ≤ successor ID
◈ In the worst case, the number of nodes that must be contacted is O(N)
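[Figure: Lookup using successor. Chord ring with nodes N5, N10, N20, N32, N60, N80, N99, N110; Lookup(K19) travels around the ring, one successor at a time, until it reaches the node responsible for K19.]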
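Sketch (not from the original slides): the naïve successor-only lookup on the ring from the figure. The starting node N32 is an assumption; the slide does not say where Lookup(K19) originates.

```python
NODES = [5, 10, 20, 32, 60, 80, 99, 110]  # node IDs from the figure, sorted


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # the interval wraps past zero


def successor(node):
    i = NODES.index(node)
    return NODES[(i + 1) % len(NODES)]


def naive_lookup(start, key):
    """Walk successor pointers; return (responsible node, hops taken)."""
    node, hops = start, 0
    while not in_half_open(key, node, successor(node)):
        node = successor(node)
        hops += 1
    return successor(node), hops


owner, hops = naive_lookup(32, 19)
# owner == 20, hops == 6: the request walks N32 → N60 → N80 → N99 → N110 → N5 → N10
```

With N nodes the walk can visit almost the whole ring, which is the O(N) worst case on the slide.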
-15-
Structured Lookup : Chord (2)
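▣ Finger table : m entries (2^m ID space), allows O(log N)-time lookup
▣ Lookup using finger table
◈ O(N) → O(log N)
[Figure: Lookup using finger table. The same Chord ring (N5, N10, N20, N32, N60, N80, N99, N110); Lookup(K19) reaches K19 in fewer hops by following finger-table entries.]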
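Sketch (not from the original slides): finger-table lookup on the same ring. It assumes m = 7 (IDs 0 to 127) and that finger i of node n points to the first node at or after (n + 2^i) mod 2^m, as in the Chord design; the tables are computed from a global node list here, whereas real nodes maintain them incrementally.

```python
M = 7  # assumed: 2**7 = 128 identifiers, enough for N5..N110
NODES = [5, 10, 20, 32, 60, 80, 99, 110]


def successor_of(ident):
    """First node whose ID is >= ident, wrapping around the ring."""
    for n in NODES:
        if n >= ident:
            return n
    return NODES[0]


def finger_table(n):
    return [successor_of((n + 2 ** i) % 2 ** M) for i in range(M)]


def between(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] when inclusive_right."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps past zero


def finger_lookup(start, key):
    """Return (responsible node, hops) using closest-preceding-finger jumps."""
    node, hops = start, 0
    while True:
        fingers = finger_table(node)
        succ = fingers[0]
        if between(key, node, succ, inclusive_right=True):
            return succ, hops
        nxt = succ  # fallback: plain successor step
        for f in reversed(fingers):  # farthest finger that still precedes the key
            if between(f, node, key):
                nxt = f
                break
        node = nxt
        hops += 1


owner, hops = finger_lookup(32, 19)
# owner == 20, hops == 3 (N32 → N99 → N5 → N10), versus 6 successor steps
```

Each jump roughly halves the remaining ring distance to the key, which is where the O(log N) bound comes from.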
-16-
Structured Lookup : Chord (3)
[Figure: two panels, Node Join and Node Leave.]
▣ Consistency problem : finger table must be updated consistently
-17-
Structured Lookup : CAN
▣ Virtual d-dimensional Cartesian coordinate space
▣ Each node holds information about a small number of “adjacent” zones : 2d
▣ Average routing path length : (d/4)(N^(1/d))
▣ Node join, departure : soft-state style updates and refreshes
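Worked example (not from the original slides): plugging numbers into the two CAN expressions above shows the trade-off controlled by d. The function name `can_costs` and the network size of 4096 nodes are assumptions for illustration.

```python
def can_costs(n_nodes, d):
    """Return (neighbors per node, average path length in hops) for a d-dimensional CAN."""
    neighbors = 2 * d
    avg_path = (d / 4) * n_nodes ** (1 / d)
    return neighbors, avg_path


for d in (2, 4, 6):
    neighbors, avg_path = can_costs(4096, d)
    print(f"d={d}: {neighbors} neighbors, about {avg_path:.1f} hops")
```

For 4096 nodes, d = 2 gives 4 neighbors and about 32 hops, while d = 4 gives 8 neighbors and about 8 hops: more routing state per node buys shorter paths.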
-18-
Structured Lookup : Pastry
▣ Lookup cost : log_{2^b} N
▣ Leaf set : numerically closest nodeIds
▣ Neighborhood set : nodes that are closest to the local node
[Figure: lookup from node 65a1fc with key d46a1c.]
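Sketch (not from the original slides): the idea behind Pastry's prefix routing is that every hop moves to a node sharing a longer prefix with the key. The intermediate nodeIds below are illustrative assumptions, and the candidate list stands in for per-node routing tables; the final leaf-set step to the numerically closest node is omitted.

```python
def shared_prefix_len(a, b):
    """Number of leading hex digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_route(start, key, known_nodes):
    """Hop to a node that matches the key in more digits until none is left."""
    path, current = [start], start
    while True:
        have = shared_prefix_len(current, key)
        better = [n for n in known_nodes if shared_prefix_len(n, key) > have]
        if not better:
            return path
        # Emulate a routing-table row: take the smallest available improvement.
        current = min(better, key=lambda n: (shared_prefix_len(n, key), n))
        path.append(current)


# Hypothetical nodeIds sharing 1, 2, 3 and 3 leading digits with the key.
known = ["d13da3", "d4213f", "d462ba", "d467c4"]
path = prefix_route("65a1fc", "d46a1c", known)
# path == ["65a1fc", "d13da3", "d4213f", "d462ba"]
```

With digits of b bits each (b = 4 for hex), every hop fixes at least one more digit, so a lookup takes about log_{2^b} N hops.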
-19-
Structured Lookup : DHT-based (2)
▣ Advantages
◈ Highly scalable : small routing table, short path length (O(log N))
◈ Guaranteed lookup performance (?)
◈ Good interface → easy to support a wide range of applications
  - Minimal interface : insert(key, data), lookup(key)
  - Keys have no semantic meaning
◈ Robust in the face of failures and attacks
◈ Self-organizing across administrative domains
-20-
Unstructured Lookup : SuperNode
▣ Relieves flooding overhead
▣ Small-world network vs. scale-free network
▣ Supernode election
◈ Not firewalled, sufficient CPU, sufficient bandwidth, sufficient uptime, etc.
▣ Gnutella’s Ultrapeer, FastTrack’s SuperNode, JXTA’s Rendezvous Peer
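[Figure: clusters of nodes (N11 to N14, N21 to N23, N31, N32, N41 to N43, …), each cluster attached to a supernode; Lookup(a.mp3) is forwarded between supernodes toward the cluster that holds a.mp3.]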
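Sketch (not from the original slides): a two-tier lookup in which leaves register their file lists with a supernode and queries are flooded only among supernodes. Which nodes act as supernodes (N11, N21, N31, N41), their ring-shaped links, and N42 holding a.mp3 are all assumptions.

```python
class SuperNode:
    def __init__(self, name):
        self.name = name
        self.index = {}      # file name -> set of leaf nodes that hold it
        self.neighbors = []  # other supernodes

    def register(self, leaf, file_names):
        for f in file_names:
            self.index.setdefault(f, set()).add(leaf)

    def lookup(self, file_name, seen=None):
        """Answer from the local index, then ask not-yet-visited supernodes."""
        seen = set() if seen is None else seen
        seen.add(self.name)
        hits = set(self.index.get(file_name, ()))
        for sn in self.neighbors:
            if sn.name not in seen:
                hits |= sn.lookup(file_name, seen)
        return hits


names = ["N11", "N21", "N31", "N41"]
sn = {name: SuperNode(name) for name in names}
for i, name in enumerate(names):  # connect the supernodes in a ring
    sn[name].neighbors = [sn[names[(i + 1) % 4]], sn[names[(i - 1) % 4]]]

sn["N41"].register("N42", ["a.mp3"])   # leaf N42 announces its files
result = sn["N11"].lookup("a.mp3")     # a leaf of N11 asks its supernode
```

Query traffic touches four supernodes instead of every peer, which is how the scheme relieves flooding overhead.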
-21-
Discussion : Structured vs. Unstructured
▣ What are the strong and weak points of each approach?
◈ Performance, considering determinism vs. overhead
◈ Scalability
◈ Fault resilience
◈ Security – easily attacked?
◈ Ease of use and deployment
◈ …
▣ What type of application is each approach suitable for?
◈ File sharing system
◈ Distributed computing
◈ Collaboration
◈ Other new types of applications
▣ What is the intermediate approach between structured and unstructured?
-22-
Locality-aware routing
▣ Nodes close on the ring may be far apart in the Internet → an implicit problem of overlay networks
▣ Goal
◈ Put nodes in the routing table that result in few hops and low latency
▣ Landmark scheme (CAN), client clustering scheme, etc.
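[Figure, panel "Inefficient routing": overlay neighbors N20, N40, N41, N80 are spread over distant Internet sites (CA-T1, CCI, Aros, Utah, CMU, vu.nl, Lulea.se, MIT, MA-Cable, Cisco, Cornell, NYU, OR-DSL).]
[Figure, panel "Landmark scheme": nodes A and B measure their distances (49, 105, 87) to landmarks L1, L2, L3 and are labeled with landmark orderings such as “L1L2L3” and “L3L2L1”.]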
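Sketch (not from the original slides): in a landmark scheme, each node orders the landmarks by measured round-trip time, and nodes with the same ordering are treated as probably close to each other. The RTT values and the third node C are hypothetical; only the orderings "L1L2L3" and "L3L2L1" come from the figure.

```python
from collections import defaultdict


def landmark_ordering(rtt_ms):
    """Concatenate landmark names from nearest to farthest."""
    return "".join(sorted(rtt_ms, key=rtt_ms.get))


def bin_nodes(measurements):
    """Group nodes whose landmark orderings are identical."""
    bins = defaultdict(list)
    for node, rtt_ms in measurements.items():
        bins[landmark_ordering(rtt_ms)].append(node)
    return dict(bins)


# Hypothetical RTT measurements in milliseconds.
measurements = {
    "A": {"L1": 49, "L2": 87, "L3": 105},
    "B": {"L1": 120, "L2": 80, "L3": 30},
    "C": {"L1": 110, "L2": 95, "L3": 40},
}
bins = bin_nodes(measurements)
# bins == {"L1L2L3": ["A"], "L3L2L1": ["B", "C"]}
```

An overlay can then prefer routing-table entries from a node's own bin, so that overlay neighbors are more likely to be network neighbors too.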
-23-
Security
▣ What’s new in P2P for security?
◈ Fully decentralized
  – No authorization and authentication
◈ Nodes are much more powerful
  – Assign their own nodeIDs
  – Act as routers (routing table, forwarding messages)
◈ Malicious nodes may mis-route, corrupt, or drop messages and routing information
▣ Attacks and solutions
◈ Secure nodeID assignment – certified nodeIds
◈ Secure routing table maintenance – constrained routing table
◈ Secure message forwarding – failure test, redundant routing
-24-
Incentive for sharing
▣ The problem of free-riding
◈ Users who only download files for themselves without providing files for download by others, or who provide files of low quality
◈ 70% of Gnutella users shared no files; nearly 50% of all responses were returned by the top 1% of sharing hosts. [August, 2000]
◈ 25% of Gnutella users shared no files, 75% shared 100 files or fewer, and just 7% shared more than 1000 files. [May, 2001]
▣ Solutions
◈ Micro-payment → charge for download & reward upload
◈ Point system → rewards for sharing
  - accounting component
  - award component : level of service
  - enrollment and aggregation agent
  - e.g., KaZaa : Participation Level = (upload / download) * 100 (0 ~ 1000); the higher the participation level, the higher the download priority
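Sketch (not from the original slides): the participation-level formula quoted above, clamped to the stated 0 to 1000 range, used to order a download queue. The handling of a zero download count and the sample users are assumptions; KaZaa's actual accounting is not described in the slides.

```python
def participation_level(uploaded_mb, downloaded_mb):
    """(upload / download) * 100, clamped to the range 0..1000."""
    if downloaded_mb == 0:
        # Assumption: pure uploaders get the maximum, inactive users the minimum.
        return 1000 if uploaded_mb > 0 else 0
    return max(0, min(1000, round(uploaded_mb * 100 / downloaded_mb)))


# Hypothetical waiting requests: (user, MB uploaded, MB downloaded).
requests = [("bob", 10, 100), ("carol", 0, 0), ("alice", 300, 100)]
queue = sorted(
    requests,
    key=lambda r: participation_level(r[1], r[2]),
    reverse=True,  # higher participation level is served first
)
order = [user for user, _, _ in queue]
# order == ["alice", "bob", "carol"]
```

A free rider who uploads nothing ends up at the back of every queue, which is the incentive the point system is meant to create.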
-25-
Wide-area file system
▣ Wide-area file system
◈ CFS (Chord), Ivy (Chord), PAST (Pastry), OceanStore (Tapestry), etc.
▣ CFS
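[Figure: CFS block structure for the file system with key=995. The signed root block (995) lists directory blocks (key=901, key=732); a directory block lists “a.txt” with ID=144, its i-node block; the i-node block lists data blocks (key=431, key=795). Block keys are SHA-1 hashes: 901 = SHA-1 of the directory block, 144 = SHA-1 of the i-node block, 431 = SHA-1 of the data block.]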
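Sketch (not from the original slides): content-hash addressing as suggested by the figure, where every block is stored under the SHA-1 of its bytes and parent blocks list the keys of their children. The in-memory `store` stands in for the DHT, the JSON block encoding and 4-byte block size are assumptions, and the signed root block is left out because it needs public-key signatures.

```python
import hashlib
import json

store = {}  # stands in for the DHT: key -> block bytes


def put_block(data):
    key = hashlib.sha1(data).hexdigest()
    store[key] = data
    return key


def get_block(key):
    data = store[key]
    if hashlib.sha1(data).hexdigest() != key:  # self-certifying check
        raise ValueError("block does not match its key")
    return data


def put_file(name, content, block_size=4):
    """Store data blocks, an i-node block, and a directory block; return the directory key."""
    chunks = [content[i:i + block_size] for i in range(0, len(content), block_size)]
    data_keys = [put_block(chunk) for chunk in chunks]
    inode_key = put_block(json.dumps(data_keys).encode("utf-8"))
    return put_block(json.dumps({name: inode_key}).encode("utf-8"))


def read_file(dir_key, name):
    directory = json.loads(get_block(dir_key))
    data_keys = json.loads(get_block(directory[name]))
    return b"".join(get_block(k) for k in data_keys)


dir_key = put_file("a.txt", b"hello world")
content = read_file(dir_key, "a.txt")
```

Because each key is the hash of the block it names, a reader can detect a block that an untrusted node has altered just by rehashing it.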
-26-
Overlay Multicast
▣ Overlay Multicast
◈ Scribe (Pastry), CAN Multicast, Bayeux (Tapestry), etc.
▣ Scribe
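[Figure: Scribe multicast tree for group gID 1100 over nodes 1100 (root), 1111, 1101, 1001, 0111, 0100, with two joining members. Each forwarder node keeps a GroupList (gID1, gID2, …) and, per group, a ChildrenTable (chID1, chID2, …). Example entries: gID:1100 Ch: 0100, 0111; gID:1100 Ch: 1001; gID:1100 Ch: 1101.]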
-27-
Web Applications (1)
▣ Web Applications
◈ Squirrel (client-side web caching), Backslash (backup CDN)
▣ Squirrel
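[Figure: Home-Store Protocol. The client's request travels through other nodes on the LAN to the object's home node, which may forward it over the WAN to the origin server. Replies: (a) object or not-modified from home; (b) object or not-modified from origin.]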
-28-
Web Applications (2)
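[Figure: Directory Protocol. The client's request travels through other nodes to the home node, which keeps a directory (dir) and either delegates the request to another node holding the object or sends the client to the origin server. Labeled cases: (a) no dir, go to origin (also d); (b) not-modified; (c, e) req and object; (e) cGET req to the origin server.]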
-29-
P2P Platform
▣ Project JXTA
◈ Key concepts : peer, peer group, advertisement, service, module
◈ Key protocols : Peer Discovery Protocol, Peer Resolver Protocol, Pipe Binding Protocol, Endpoint Routing Protocol, Peer Membership Protocol, Peer Information Protocol
◈ JXTA objectives : interoperability, platform independence, ubiquity
-30-
Discussion on The Present of P2P
▣ Other P2P algorithm issues?
▣ Other P2P applications?
-31-
Roadmap
▣ The Past of P2P Computing
◈ P2P File Sharing System
◈ P2P Distributed Computing
◈ P2P Collaboration
▣ The Present of P2P Computing
◈ The characteristics of P2P computing
◈ The issues of P2P computing
  - P2P algorithms
  - P2P applications
  - P2P platforms
▣ The Future of P2P Computing
-32-
AP2PC 2003
▣ Second International Workshop on Agents and Peer-to-Peer Computing
◈ Intelligent agent techniques for P2P computing
◈ P2P computing techniques for multi-agent systems
◈ The Semantic Web, Semantic Coordination Mechanisms and P2P systems
◈ Scalability, coordination, robustness and adaptability in P2P systems
◈ Self-organization and emergent behavior in P2P networks
◈ E-commerce and P2P computing
◈ Participation and Contract Incentive Mechanisms in P2P Systems
◈ Computational Models of Trust and Reputation
◈ Community of interest building and regulation, and behavioral norms
◈ Intellectual property rights in P2P systems
◈ P2P architectures
◈ Scalable Data Structures for P2P systems
◈ Services in P2P systems (service definition languages, service discovery, filtering and composition etc.)
◈ Knowledge Discovery and P2P Data Mining Agents
◈ P2P oriented information systems
◈ Information ecosystems and P2P systems
◈ Security issues in P2P networks
◈ Ad-hoc networks and pervasive computing based on P2P architectures and wireless communication devices
-33-
GP2P 2003
▣ Third International Workshop on Grid and Peer-to-Peer Computing
◈ Global Computing and Peer-to-Peer computing platforms
◈ P2P merging / interoperability with Grid systems (OGSA/I)
◈ Software technologies evaluation: Web Services, Grid services
◈ Middleware, programming models, environments and toolkits
◈ Protocols for resource management/discovery/reservation/scheduling
◈ Economic considerations of resource usage (protocols, accounting)
◈ Storage in Global Computing Infrastructures (strategies, protocols)
◈ Performance monitoring, benchmarking, evaluation and modeling of Global Computing and Peer-to-Peer systems and/or components thereof
◈ Security, management and monitoring of resources
◈ Result certification (detection/tolerance of corrupted results)
◈ Parallel computing on large scale distributed systems
◈ Compute & I/O driven applications (scientific, engineering, business)
◈ Global and P2P computing applications (programmed from scratch, ported from sequential or parallel versions, adaptations to fit a global computing environment)
-34-
DBISP2P 2003
▣ International Workshop On Databases, Information Systems and Peer-to-Peer Computing
◈ Data models and query languages for P2P systems
◈ Data placement and query answering in P2P systems
◈ Indexing, caching and replication techniques for P2P systems
◈ Transaction management for P2P Systems
◈ Metadata management in P2P systems
◈ Dynamic schema integration
◈ P2P systems and the Semantic Web
◈ Self-organisation and emergent behaviour in information systems and databases
◈ Complex Adaptive Information Systems
◈ Resource allocation in P2P systems
-35-
What’s next?
▣ P2P is a new type of ??
◈ New type of algorithm (X)
◈ New type of protocol (X)
◈ New type of application (X)
◈ New type of network (X)
P2P is a New Computing Paradigm !!!
▣ See broader → merge with other emerging technologies
◈ Mobile computing : mobile P2P, mobile infrastructure using P2P
◈ Ubiquitous computing : ubiquitous P2P, ubiquitous infrastructure using P2P
◈ Grid computing : desktop grid, global computing system
◈ Internet Distributed Computing (Intel) : P2P + Grid + Web Service