Seminar: Information Management in the Web. Gnutella, Freenet and more: an overview of file sharing...
Date posted: 21-Dec-2015
Seminar: Information Management in the Web
Gnutella, Freenet and more: an overview of file sharing
architectures
Thomas Zahn
Peer-to-Peer - Introduction
• "opposite" of Client/Server
• no central servers; information is highly distributed
• every peer acts as a client AND server
• -> can query, reply to queries and route messages at the same time
• every peer can directly "talk" to any other peer
Popular Peer-to-Peer Networks
• Napster
• Gnutella
• Freenet
• FastTrack (Kazaa)
• CHORD, CAN, PASTRY, TAPESTRY
Napster
• was used primarily for file sharing
• NOT a pure peer-to-peer network
• => hybrid system
• peer turns to central DB for querying (client/server)
• peer downloads directly from other peer(s) (peer-to-peer)
Napster
[Figure: peers connected to a central DB, with numbered message steps]
1. Query
2. Response
3. Download Request
4. File
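The hybrid lookup described for Napster can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CentralIndex`, `Peer` and `fetch` are hypothetical names invented here, the index lives in memory, and nothing below reflects Napster's actual wire protocol.

```python
class CentralIndex:
    """Hypothetical stand-in for the central DB: maps file names to peers."""

    def __init__(self):
        self._index = {}  # file name -> set of peer names

    def register(self, peer_name, file_names):
        for name in file_names:
            self._index.setdefault(name, set()).add(peer_name)

    def query(self, file_name):
        # Steps 1 and 2: a peer asks the central DB and gets peer names back.
        return sorted(self._index.get(file_name, set()))


class Peer:
    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)  # file name -> content

    def serve(self, file_name):
        # Steps 3 and 4: another peer requests the file directly.
        return self.files[file_name]


def fetch(requester, file_name, index, peers):
    """Client/server for the lookup, peer-to-peer for the transfer."""
    holders = [p for p in index.query(file_name) if p != requester.name]
    if not holders:
        return None
    requester.files[file_name] = peers[holders[0]].serve(file_name)
    return holders[0]


index = CentralIndex()
peers = {
    "p1": Peer("p1", {}),
    "p2": Peer("p2", {"song.mp3": b"data"}),
}
for p in peers.values():
    index.register(p.name, p.files)

source = fetch(peers["p1"], "song.mp3", index, peers)
print(source)  # p2
```

Only the lookup touches the central component; the file content itself never passes through it.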
Gnutella - overview
• pure peer-to-peer
• used for file sharing
• very popular => practically proven ?
• very simple protocol
• no routing "intelligence"
• messages are always broadcast
Gnutella - PING/PONG
[Figure: eight peers (1 to 8). Peer 1 sends "Ping 1", which is relayed from peer to peer. Every peer answers with a Pong that is routed back towards peer 1, so the Known Hosts list of peer 1 grows from 2, to 3, 4, 5, to 6, 7, 8.]
Query/Response analogous
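The Ping broadcast can be simulated with a short breadth-first sketch. The topology below is an assumption: the figure residue only shows that peer 1 first learns of 2, then of 3, 4, 5, then of 6, 7, 8, so attaching 6, 7, 8 to peer 3 is a guess made for illustration. `flood_ping` is a hypothetical helper, not part of any Gnutella client.

```python
from collections import deque


def flood_ping(graph, origin, ttl):
    """Broadcast a Ping from `origin`; return ({peer: hops}, pings_sent)."""
    hops = {origin: 0}
    pings_sent = 0
    queue = deque([(origin, None, ttl)])
    while queue:
        peer, came_from, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL used up: do not relay any further
        for neighbour in graph[peer]:
            if neighbour == came_from:
                continue  # never send the Ping back where it came from
            pings_sent += 1
            if neighbour not in hops:  # peers drop Pings they already saw
                hops[neighbour] = hops[peer] + 1
                queue.append((neighbour, peer, remaining - 1))
    del hops[origin]
    return hops, pings_sent


# Assumed topology: 1-2, 2-{3,4,5}, 3-{6,7,8}.
graph = {
    1: [2], 2: [1, 3, 4, 5], 3: [2, 6, 7, 8],
    4: [2], 5: [2], 6: [3], 7: [3], 8: [3],
}
hops, pings_sent = flood_ping(graph, origin=1, ttl=7)
print(sorted(hops))        # [2, 3, 4, 5, 6, 7, 8]
print(pings_sent)          # 7
print(sum(hops.values()))  # 16 link traversals for the Pongs routed back
```

Each Pong travels back over as many links as the Ping took to arrive, which is why the reply traffic grows faster than the number of peers reached.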
Gnutella - Pro & Con
• VERY simple protocol => easy to implement
• very little overhead
• practically proven functionality (?)
• message broadcasts flood network
• => heavy network traffic
• => bad, bad scalability
Gnutella – Reachable Peers
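| | T=1 | T=2 | T=3 | T=4 | T=5 | T=6 | T=7 | T=8 |
|---|---|---|---|---|---|---|---|---|
| N=2 | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 |
| N=3 | 3 | 9 | 21 | 45 | 93 | 189 | 381 | 765 |
| N=4 | 4 | 16 | 52 | 160 | 484 | 1,456 | 4,372 | 13,120 |
| N=5 | 5 | 25 | 105 | 425 | 1,705 | 6,825 | 27,305 | 109,225 |
| N=6 | 6 | 36 | 186 | 936 | 4,686 | 23,436 | 117,186 | 585,936 |
| N=7 | 7 | 49 | 301 | 1,813 | 10,885 | 65,317 | 391,909 | 2,351,461 |
| N=8 | 8 | 64 | 456 | 3,200 | 22,408 | 156,864 | 1,098,056 | 7,686,400 |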
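The table values are consistent with a simple counting argument, sketched below. The slide does not label the axes, so this assumes N is the number of connections each peer holds and T is the time-to-live in hops, and that no two peers share a neighbour (a tree). The originator reaches N peers at the first hop, and every peer relays to its N-1 other connections.

```python
def reachable_peers(n, t):
    """Peers reached within t hops when every peer has n connections."""
    # Hop 1 reaches n peers; each later hop multiplies by (n - 1).
    return sum(n * (n - 1) ** (hop - 1) for hop in range(1, t + 1))


for n in range(2, 9):
    print(f"N={n}", *(reachable_peers(n, t) for t in range(1, 9)))
```

For example, `reachable_peers(4, 3)` gives 4 + 12 + 36 = 52, matching the N=4, T=3 entry.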
Gnutella – Generated Traffic in Bytes (1)
| | T=1 | T=2 | T=3 | T=4 | T=5 | T=6 | T=7 | T=8 |
|---|---|---|---|---|---|---|---|---|
| N=2 | 166 | 332 | 498 | 664 | 830 | 996 | 1,162 | 1,328 |
| N=3 | 249 | 747 | 1,743 | 3,735 | 7,719 | 15,687 | 31,623 | 63,495 |
| N=4 | 332 | 1,328 | 4,316 | 13,280 | 40,172 | 120,848 | 362,876 | 1,088,960 |
| N=5 | 415 | 2,075 | 8,715 | 35,275 | 141,515 | 566,475 | 2,266,315 | 9,065,675 |
| N=6 | 498 | 2,988 | 15,438 | 77,688 | 388,938 | 1,945,188 | 9,726,438 | 48,632,688 |
| N=7 | 581 | 4,067 | 24,983 | 150,479 | 903,455 | 5,421,311 | 32,528,447 | 195,171,263 |
| N=8 | 664 | 5,312 | 37,848 | 265,600 | 1,859,864 | 13,019,712 | 91,138,648 | 637,971,200 |

• query message length: 83 bytes
• simple query relaying (no responses)
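Each entry in this table equals 83 bytes times the number of peers reached, which the following sketch reproduces. It assumes, as the slide does not say so explicitly, that N is the number of connections per peer and T the time-to-live in hops; `reachable_peers` and `query_traffic_bytes` are illustrative names.

```python
QUERY_BYTES = 83  # query message length given on the slide


def reachable_peers(n, t):
    """Peers reached within t hops when every peer has n connections."""
    return sum(n * (n - 1) ** (hop - 1) for hop in range(1, t + 1))


def query_traffic_bytes(n, t):
    """Bytes generated by relaying one query, ignoring all responses."""
    return QUERY_BYTES * reachable_peers(n, t)


print(query_traffic_bytes(4, 4))  # 13280
print(query_traffic_bytes(8, 8))  # 637971200
```

A single 83-byte query with N=8 and T=8 therefore already costs roughly 638 MB of relayed traffic before any response is sent.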
Gnutella – Generated Traffic in Bytes (2)
| | T=1 | T=2 | T=3 | T=4 | T=5 | T=6 | T=7 | T=8 |
|---|---|---|---|---|---|---|---|---|
| N=3 | 283.68 | 1,418.4 | 4,822.56 | 13,900.3 | 36,594.7 | 91,061.3 | 218,150 | 508,638 |
| N=4 | 378.24 | 2,647.68 | 12,860.2 | 53,710.1 | 206,897 | 758,371 | 2,688,530 | 9,306,220 |
| N=5 | 472.8 | 4,255.2 | 26,949.6 | 147,986 | 753,170 | 3,658,050 | 17,214,200 | 79,185,000 |
| N=6 | 567.36 | 6,240.96 | 48,793 | 332,473 | 2,105,470 | 12,743,500 | 74,798,500 | 429,398,000 |
| N=7 | 661.92 | 8,604.96 | 80,092.3 | 651,991 | 4,941,123 | 35,823,800 | 252,002,000 | 1,734,360,000 |
| N=8 | 756.48 | 11,347.2 | 122,550 | 1,160,440 | 10,242,000 | 86,526,900 | 709,521,000 | 5,693,470,000 |

• Mean percentage of users who typically share content: 30%
• Mean percentage of users who typically have responses to search queries: 40%
• Mean number of search responses the typical respondent offers: 10
• Mean length of search responses the typical respondent offers: 60
"Standard client settings yield a whopping 17MB generated in response to […] search query"
Freenet - Concepts
• peer-to-peer file storage & retrieval system
• every document has a globally unique ID
• efficient (?) retrieval algorithm
– documents are retrieved with sublinear effort
• routing based on likelihood of answer capability
• focus on security
Freenet – Query Routing (1)
• every peer maintains routing table
• table contains known peers along with the IDs of the documents they are storing
• a request is routed to the peer most likely to have an answer (closest matching ID)
• responses are sent back upstream
• intermediate peers also store document and augment their routing tables
Freenet – Query Routing (2)
[Figure: peers A, B, C and D, each with a routing table and a document cache]
Initial state:
• A: Routing Table B: 14, 20; X: 47, 60. Doc Cache 5, 89
• B: Routing Table C: 19, 30; D: 45, 51. Doc Cache 14, 20
• C: Routing Table B: 14, 20. Doc Cache 19, 30
• D: Routing Table B: 14, 20; Z: 105, 110. Doc Cache 17, 45, 51, 102, 205
Steps:
1. Query for doc 17
2. Forward to best match
3. C has no match -> backtrack
4. Forward query to 2nd best match
5. Send back doc 17
6. Route back response
State after the response:
• B: Routing Table C: 19, 30; D: 17, 45, 51. Doc Cache 14, 17, 20
• A: Routing Table B: 14, 17, 20; X: 47, 60. Doc Cache 5, 17, 89
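The walk-through for doc 17 can be replayed with a small Python sketch. It is a simplification under stated assumptions: "closest matching ID" is taken to mean the smallest absolute numeric distance, the assignment of the four tables to peers A to D is read off the figure, peers X and Z are not modelled, and `make_network` and `request` are hypothetical helpers rather than Freenet code.

```python
def make_network():
    """Initial state from the figure; X and Z are not modelled as peers."""
    return {
        "A": {"routing": {"B": [14, 20], "X": [47, 60]}, "cache": {5, 89}},
        "B": {"routing": {"C": [19, 30], "D": [45, 51]}, "cache": {14, 20}},
        "C": {"routing": {"B": [14, 20]}, "cache": {19, 30}},
        "D": {"routing": {"B": [14, 20], "Z": [105, 110]},
              "cache": {17, 45, 51, 102, 205}},
    }


def request(net, peer, doc_id, ttl, visited):
    """Depth-first search ordered by closest known ID; returns holder or None."""
    visited.append(peer)
    node = net[peer]
    if doc_id in node["cache"]:
        return peer
    if ttl == 0:
        return None
    routing = node["routing"]
    ranked = sorted(routing,
                    key=lambda p: min(abs(k - doc_id) for k in routing[p]))
    for nxt in ranked:
        if nxt in visited or nxt not in net:
            continue  # already asked, or outside this toy network
        holder = request(net, nxt, doc_id, ttl - 1, visited)
        if holder is not None:
            # The response travels back upstream: cache the document and
            # record that `nxt` can supply it.
            node["cache"].add(doc_id)
            routing[nxt] = sorted(set(routing[nxt]) | {doc_id})
            return holder
    return None  # no match here: the caller backtracks


net = make_network()
visited = []
holder = request(net, "A", 17, ttl=5, visited=visited)
print(visited)                # ['A', 'B', 'C', 'D']
print(holder)                 # D
print(net["B"]["routing"])    # {'C': [19, 30], 'D': [17, 45, 51]}
print(sorted(net["A"]["cache"]))  # [5, 17, 89]
```

The visit order A, B, C, D shows the backtrack at C, and the final tables of A and B match the state after step 6.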
Freenet – Document Insert
• analogous to query routing
• insert is routed to the peer most likely to be interested in new doc (closest matching ID)
• intermediate peers cache document and augment routing tables
• until TTL is reached
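A document insert can be sketched along the same lines. Everything here is an assumption made for illustration: the network reuses the routing tables from the query example, closeness is numeric distance, each peer on the path records that the previous hop holds the new document, and backtracking is left out. `make_network` and `insert` are hypothetical helpers, and any checks a real Freenet node performs before storing are omitted.

```python
def make_network():
    """Toy network with the routing tables from the query example."""
    return {
        "A": {"routing": {"B": [14, 20], "X": [47, 60]}, "cache": {5, 89}},
        "B": {"routing": {"C": [19, 30], "D": [45, 51]}, "cache": {14, 20}},
        "C": {"routing": {"B": [14, 20]}, "cache": {19, 30}},
        "D": {"routing": {"B": [14, 20], "Z": [105, 110]},
              "cache": {17, 45, 51, 102, 205}},
    }


def insert(net, start, doc_id, ttl):
    """Push doc_id towards the closest matching IDs until the TTL is used up."""
    path = [start]
    net[start]["cache"].add(doc_id)
    current = start
    while ttl > 0:
        routing = net[current]["routing"]
        candidates = [p for p in routing if p in net and p not in path]
        if not candidates:
            break  # nowhere left to forward (no backtracking in this sketch)
        nxt = min(candidates,
                  key=lambda p: min(abs(k - doc_id) for k in routing[p]))
        # The next peer caches the document and augments its routing table
        # (assumption: it records the previous hop as a holder).
        net[nxt]["cache"].add(doc_id)
        known = net[nxt]["routing"].get(current, [])
        net[nxt]["routing"][current] = sorted(set(known) | {doc_id})
        path.append(nxt)
        current = nxt
        ttl -= 1
    return path


net = make_network()
path = insert(net, "A", 18, ttl=2)
print(path)  # ['A', 'B', 'C']
```

Doc 18 ends up at C because C's known ID 19 is the closest match, the same rule that steers queries.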
Freenet - Discussion
• efficient routing algorithm (compared to Gnutella)
• adequate security features/heuristics (the more popular a document, the more frequently it gets cached)
• no metasearch
• no updates or deletes possible
• worst case query routing = DFS
FUtella – Concepts
• peer-to-peer platform for general knowledge sharing
• tries to model learning style of humans
• content-based routing
• combines and extends approaches from:
– Gnutella (message format)
– JXTA (peer groups)
– JXTA Search (queryspaces and registrations)
– Freenet (routing of registration discoveries)
FUtella - Knowledge Groups
[Figure: a knowledge group attached to the FUtella Net]
Knowledge Group: Queryspace "Computer Architecture"
Group Head: Peer E
Members M1 - Mi
Inserts Registration (arrow from the group into the FUtella Net)
FUtella - Knowledge Group Discovery 1
[Figure: peers A, B, C and D, each with a routing table and a registration cache]
• A: Routing Table "computer" -> B; "data base" -> X. Registration Cache "computer": B; "data base": X
• B: Routing Table "computer analysis" -> C; "computer systems" -> D; "data base" -> A. Registration Cache "computer analysis": Y; "computer systems": Z; "data base": X
• C: Routing Table "computer" -> B; "computer analysis" -> Y. Registration Cache "computer": B; "computer analysis": Y
• D: Routing Table "computer" -> B; "computer systems" -> Z; "computer architecture" -> E. Registration Cache "computer systems": Z; "computer": B; "computer architecture": E
1. Discovery request "computer architecture"
2. Forward discovery request
3. C has no cached registration for "computer architecture" -> backtrack
4. Forward discovery request to 2nd best match
FUtella - Knowledge Group Discovery 2
[Figure: peers A, B and D after the discovery response]
• A: Routing Table "computer" -> B; "computer architecture" -> D; "data base" -> X. Registration Cache "computer": B; "computer architecture": E; "data base": X
• B: Routing Table "computer analysis" -> C; "computer architecture" -> D; "computer systems" -> D; "data base" -> A. Registration Cache "computer analysis": Y; "computer architecture": E; "computer systems": Z; "data base": X
• D: Routing Table "computer" -> B; "computer systems" -> Z; "computer architecture" -> E. Registration Cache "computer systems": Z; "computer": B; "computer architecture": E
5. Discovery response containing registration "computer architecture": E
6. Forward discovery response
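The discovery steps 1 to 6 can be replayed with a short Python sketch. The slides do not say how the "best match" between queryspace names is computed, so the longest-common-prefix measure used here is purely an assumption that happens to reproduce the order shown (C before D). The assignment of tables to peers is read off the figures, peers X, Y, Z and E are not modelled, and `common_prefix_len`, `make_network` and `discover` are hypothetical names.

```python
def common_prefix_len(a, b):
    """Assumed similarity measure: length of the shared leading substring."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def make_network():
    return {
        "A": {"routing": {"computer": "B", "data base": "X"},
              "registrations": {"computer": "B", "data base": "X"}},
        "B": {"routing": {"computer analysis": "C", "computer systems": "D",
                          "data base": "A"},
              "registrations": {"computer analysis": "Y",
                                "computer systems": "Z", "data base": "X"}},
        "C": {"routing": {"computer": "B", "computer analysis": "Y"},
              "registrations": {"computer": "B", "computer analysis": "Y"}},
        "D": {"routing": {"computer": "B", "computer systems": "Z",
                          "computer architecture": "E"},
              "registrations": {"computer systems": "Z", "computer": "B",
                                "computer architecture": "E"}},
    }


def discover(net, peer, queryspace, visited):
    """Return (group head, answering peer) for the queryspace, or None."""
    visited.append(peer)
    node = net[peer]
    if queryspace in node["registrations"]:
        return node["registrations"][queryspace], peer
    ranked = sorted(node["routing"],
                    key=lambda k: common_prefix_len(k, queryspace),
                    reverse=True)
    for key in ranked:
        nxt = node["routing"][key]
        if nxt in visited or nxt not in net:
            continue  # already asked, or not modelled in this toy network
        found = discover(net, nxt, queryspace, visited)
        if found is not None:
            head, source = found
            # Cache the registration and route this queryspace to the peer
            # that answered, as in the tables after the discovery response.
            node["registrations"][queryspace] = head
            node["routing"][queryspace] = source
            return found
    return None  # no cached registration here: the caller backtracks


net = make_network()
visited = []
result = discover(net, "A", "computer architecture", visited)
print(visited)  # ['A', 'B', 'C', 'D']
print(result)   # ('E', 'D')
```

After the call, A and B both route "computer architecture" to D and cache the registration for group head E.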
FUtella - Query Processing
[Figure: peers A, B, C, D and the knowledge group "computer architecture" with group head E and members M1 ... Mi]
1. Discovery request "computer architecture"
2. Forward discovery request
3. C has no cached registration for "computer architecture" -> backtrack
4. Forward discovery request to 2nd best match
5. Discovery response containing cached registration
6. Forward discovery response
7. Send query
8. Forward query to member (one message per member)
9. Query response (one per member)
FUtella - Test Results (1)
[Figure: bar chart "Total Number of Messages" (# msg, axis from 0 to 250,000) for static, semi-dynamic and dynamic peers; series: threshold 2, no threshold, Gnutella]
FUtella - Test Results (2)
[Figure: bar chart "Average Hit Ratio" (axis from 0% to 100%) for static, semi-dynamic and dynamic peers; series: threshold 2, no threshold, Gnutella]
Conclusion
• first and second generation P2P systems still most widely used
• practically proven
• very flexible in terms of topology
• bad scalability (Gnutella)
• no guaranteed upper bound on query effort (Freenet)
• (scientifically) far better approach: DHTs (see next presentation)
Questions?