The Index Poisoning Attack in P2P File Sharing Systems
Keith W. Ross, Polytechnic University
Joint work with:
• Jian Liang
• Naoum Naoumov
Internet Traffic
CF: CacheLogic
File Distribution Systems: 2005
Attacks on P2P: Decoying
Two types:
• File corruption: pollution
• Index poisoning
Investigated in two networks:
• FastTrack/Kazaa
– Unstructured P2P network
• Overnet
– Structured (DHT) P2P network
– Part of eDonkey
File Pollution
[Diagram labels: pollution company; original content; polluted content]
File Pollution
[Diagram labels: pollution company; four pollution servers; file sharing network]
File Pollution
Unsuspecting users spread pollution!
Yuck
Index Poisoning
index
title      location
bigparty   123.12.7.98
smallfun   23.123.78.6
heyhey     234.8.89.20
[Diagram: file sharing network with hosts 123.12.7.98, 23.123.78.6, 234.8.89.20]
Index Poisoning
index
title      location
bigparty   123.12.7.98
smallfun   23.123.78.6
heyhey     234.8.89.20
bighit     111.22.22.22
[Diagram: hosts 123.12.7.98, 23.123.78.6, 234.8.89.20, and 111.22.22.22]
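The index slides can be mimicked with a toy in-memory index. This is only a minimal sketch: the `publish` and `search` helpers are hypothetical names, and the point it illustrates is that an index accepting records without checking them will return whatever location was advertised.

```python
from collections import defaultdict

# Toy index mapping each title to the set of advertised locations.
index = defaultdict(set)

def publish(title, location):
    """Record that `location` claims to hold a copy of `title` (no verification)."""
    index[title].add(location)

def search(title):
    """Return every advertised location for `title`, sorted for stable output."""
    return sorted(index[title])

# Records taken from the slide.
publish("bigparty", "123.12.7.98")
publish("smallfun", "23.123.78.6")
publish("heyhey", "234.8.89.20")

# Poisoned record: nothing checks that 111.22.22.22 actually serves "bighit".
publish("bighit", "111.22.22.22")

results = search("bighit")
print(results)
```

A user searching for "bighit" is handed the advertised address and only discovers the record is bogus when the download attempt fails.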
Overnet: DHT
• (version_id, location) stored in nodes with ids close to version_id
• (hash_title, version_id) stored in nodes with ids close to hash_title
• First search hash_title, get version_id and metadata
• Then search version_id, get location
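The two-step lookup described above can be sketched with a toy DHT. Everything here is illustrative: the 4-bit id space mirrors the node ids in the diagram, XOR distance is an assumed notion of "close" (the slide does not name the metric), SHA-1 is an arbitrary hash choice, and `publish_copy` and `search` are hypothetical helpers.

```python
import hashlib

ID_BITS = 4  # tiny id space, matching the 4-bit node ids in the diagram
NODE_IDS = [0b0001, 0b0011, 0b0100, 0b0101, 0b1000, 0b1010, 0b1100, 0b1111]

# Each node keeps two tables: hash_title -> version_ids, version_id -> locations.
nodes = {nid: {"by_title": {}, "by_version": {}} for nid in NODE_IDS}

def to_id(text):
    """Hash a string into the 4-bit id space (SHA-1 is an assumption)."""
    return hashlib.sha1(text.encode()).digest()[0] >> (8 - ID_BITS)

def closest_node(key):
    """Node whose id is closest to `key`, using XOR distance (assumed metric)."""
    return min(NODE_IDS, key=lambda nid: nid ^ key)

def store(table, key, value):
    nodes[closest_node(key)][table].setdefault(key, set()).add(value)

def lookup(table, key):
    return nodes[closest_node(key)][table].get(key, set())

def publish_copy(title, version_text, location):
    """Publish both records for one copy of one version of a title."""
    hash_title, version_id = to_id(title), to_id(version_text)
    store("by_title", hash_title, version_id)
    store("by_version", version_id, location)

def search(title):
    """Step 1: hash_title -> version_ids. Step 2: version_id -> locations."""
    version_ids = lookup("by_title", to_id(title))
    return {vid: lookup("by_version", vid) for vid in version_ids}

publish_copy("bigparty", "bigparty version A", "123.12.7.98")
found = search("bigparty")
print(found)
```

In a 4-bit space distinct titles can collide on the same id, which is an artifact of the toy size. Neither `store` call checks that the advertised version or location exists, which is the gap index poisoning exploits.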
Overnet
[Diagram: nodes with ids 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111; steps labeled Publish, Query, Download]
FastTrack Overlay
Each SN maintains a local index
ON = ordinary node
SN = super node
[Diagram: overlay of SNs and ONs]
FastTrack Query
FastTrack Download
HTTP request for hash value
FastTrack Download
P2P file transfer
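The overlay, query, and download slides can be summarized in a small sketch. It is not the FastTrack protocol: the class and method names, the address 10.0.0.1, and the hash "abc123" are hypothetical, and `serve` merely stands in for answering an HTTP request for a hash value.

```python
class SuperNode:
    """SN: keeps a local index of what its ordinary nodes advertise."""

    def __init__(self):
        self.index = []  # entries are (title, content_hash, on_address)

    def register(self, title, content_hash, on_address):
        self.index.append((title, content_hash, on_address))

    def query(self, keyword):
        """Return (hash, address) pairs whose title contains the keyword."""
        return [(h, addr) for (title, h, addr) in self.index if keyword in title]


class OrdinaryNode:
    """ON: holds files and advertises them to its super node."""

    def __init__(self, address):
        self.address = address
        self.files = {}  # content_hash -> file bytes

    def share(self, supernode, title, content_hash, data):
        self.files[content_hash] = data
        supernode.register(title, content_hash, self.address)

    def serve(self, content_hash):
        """Stand-in for answering an HTTP request for a hash value."""
        return self.files.get(content_hash)


sn = SuperNode()
peers = {"10.0.0.1": OrdinaryNode("10.0.0.1")}  # hypothetical address
peers["10.0.0.1"].share(sn, "bigparty", "abc123", b"song bytes")

hits = sn.query("bigparty")              # query goes to the SN index
content_hash, address = hits[0]
data = peers[address].serve(content_hash)  # transfer is peer to peer
```

The query touches only the SN index, while the transfer goes directly between ordinary nodes, so an index entry pointing at a host that does not hold the file is detected only at download time.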
Attacks: How Effective?
• For a given title, what fraction of the “copies” are
– Clean?
– Poisoned?
– Polluted?
• Brute-force approach:
– Attempt to download all versions
– For those versions that download, listen to/watch each one
• How do we determine pollution levels without downloading?
Titles, versions, hashes & copies
• The title is the title of song/movie/software
• A given title can have thousands of versions
• Each version has its own hash
• Each version can have thousands of copies
• A title can also have non-existent versions, each identified by a hash
Definition of Pollution and Poisoning Levels
• (t, t + Δ): investigation interval
• V: set of all versions of title T
• V1, V2, V3: sets of poisoned, polluted, clean versions
• Cv: number of advertised copies of version v
$$L_i = \frac{\sum_{v \in V_i} C_v}{\sum_{v \in V} C_v}, \quad i = 1, 2, 3$$
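Read with the definitions above, the level of class i is the number of advertised copies of versions in Vi divided by the number of advertised copies of all versions in V. A minimal sketch of that arithmetic follows; the hashes `h1` to `h3` and the copy counts are made-up illustration values, not measurements from the study.

```python
def levels(copies, classes):
    """Fraction of advertised copies that falls in each class of versions.

    copies:  dict mapping version hash -> Cv (advertised copies of version v)
    classes: dict mapping class name -> set of version hashes (V1, V2, V3)
    """
    total = sum(copies.values())
    return {
        name: sum(copies[v] for v in versions) / total
        for name, versions in classes.items()
    }

# Hypothetical data for one title.
copies = {"h1": 600, "h2": 300, "h3": 100}
classes = {"poisoned": {"h1"}, "polluted": {"h2"}, "clean": {"h3"}}

lv = levels(copies, classes)
print(lv)
```

Because V1, V2, and V3 partition V, the three levels sum to 1.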
How to Estimate?
• Need Cv, v ∈ V
• Need V1, V2, V3
– Don’t want to download and listen to files!
Solution:
• Harvest Cv, v ∈ V, and copy locations
– Overnet: Insert node, receive publish messages
– FastTrack: Crawl
• Heuristic for V1, V2, V3
Copies at Users
[Plots: FastTrack; Overnet]
Heuristic
• Identify heavy and light publishers
• Hh = set of hashes from heavy publishers
• Hl = set of hashes from light publishers
[Venn diagram of Hh and Hl, with regions labeled poisoned versions, polluted versions, clean versions]
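One way to read the diagram is sketched below, and the mapping is an assumption rather than something the slide text states: hashes advertised only by heavy publishers are treated as poisoned, hashes advertised by both heavy and light publishers as polluted, and hashes advertised only by light publishers as clean. The rule for calling a publisher heavy (more than `heavy_threshold` distinct hashes for the title) and all sample data are likewise hypothetical.

```python
def classify_versions(advertised, heavy_threshold):
    """Split version hashes into poisoned / polluted / clean.

    advertised: dict mapping publisher -> set of hashes it advertises for a title
    heavy_threshold: publishers advertising more hashes than this count as
                     heavy (an assumed rule, not taken from the slides)
    """
    heavy_hashes, light_hashes = set(), set()  # Hh and Hl
    for hashes in advertised.values():
        if len(hashes) > heavy_threshold:
            heavy_hashes |= hashes
        else:
            light_hashes |= hashes
    return {
        "poisoned": heavy_hashes - light_hashes,  # assumed: Hh only
        "polluted": heavy_hashes & light_hashes,  # assumed: Hh and Hl
        "clean": light_hashes - heavy_hashes,     # assumed: Hl only
    }

# Hypothetical harvest for one title.
advertised = {
    "attacker": {"h1", "h2", "h3", "h4"},
    "user_a": {"h3", "h5"},
    "user_b": {"h5"},
}
result = classify_versions(advertised, heavy_threshold=3)
print(result)
```

The intuition behind this reading is that ordinary users can end up sharing a polluted file they downloaded, but can never hold a copy of a version that does not exist.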
Heuristic: More
Evaluation   # Download   # Success   Accuracy   False
Polluted     8,450        8,400       99.4%      0.6% (positive)
Poisoned     33,186       1,156       96.5%      3.5% (negative)
Heuristic is accurate & does not involve any downloading!
FastTrack Versions
FastTrack Copies
Overnet Copies
Blacklisting
• Assign reputations to /n subnets
– Bad reputation to subnets with large number of advertised copies of any title
• Obtain reputations locally; share with distributed algorithm
• Locally blacklist /n subnets with bad reputations
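The local part of the blacklisting rule can be sketched with Python's standard `ipaddress` module. This is a rough illustration under stated assumptions: the function names, the prefix length of 24, and the `max_copies` threshold are hypothetical, and the distributed sharing of reputations is not modeled.

```python
import ipaddress
from collections import Counter

def subnet_of(address, prefix_len):
    """Return the /prefix_len network that contains `address`."""
    return ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)

def blacklist(adverts, prefix_len, max_copies):
    """Flag subnets advertising more than `max_copies` copies of any one title.

    adverts: iterable of (title, ip_address) pairs harvested from the network
    """
    counts = Counter((subnet_of(ip, prefix_len), title) for title, ip in adverts)
    return {subnet for (subnet, _title), n in counts.items() if n > max_copies}

# Hypothetical harvested advertisements.
adverts = [
    ("bighit", "111.22.22.1"),
    ("bighit", "111.22.22.2"),
    ("bighit", "111.22.22.3"),
    ("bigparty", "123.12.7.98"),
]
bad = blacklist(adverts, prefix_len=24, max_copies=2)
print(bad)
```

Counting per subnet rather than per address means an attacker cannot evade the rule by spreading advertisements across neighboring addresses in the same block.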
Blacklisting: More
The Inverse Attack
• Attacks on P2P systems:
• But can also exploit P2P systems for DDoS attacks against an innocent host:
Summary & Thank You!