TECS Week, Pune, 5-9 January 2009 1
The User is the Computer: From Decentralized Systems to Social Computing
Peter Druschel
Course overview
Today’s computer systems augment a wide range of human activity, including cooperation among individuals, organizations, businesses
This course deals with some of the technology underlying this trend, as well as the challenges and opportunities that come with it
Course overview
1. Decentralized systems (~2 hours)
   Overlays, object lookup, routing
   Shared state and coordination
   Applications
   Challenges
2. Accountability for distributed systems (~1.5 hours)
   Why and what is accountability?
   How can we implement it?
   How well does it work?
3. Social computing and applications (~1.5 hours)
   Exploiting social networks for distributed computing
   Example: enhancing Web search
   Example: thwarting unwanted communication
Credits
Group members: Andreas Haeberlen, Jeff Hoye, Petr Kuznetsov, Alan Mislove, Animesh Nandi, Ansley Post, Atul Singh, Jim Stewart
Colleagues: Krishna Gummadi (MPI-SWS), Rodrigo Rodrigues (MPI-SWS), Anne-Marie Kermarrec (INRIA), Ant Rowstron (MSRC), Miguel Castro (MSRC), Ion Stoica (UC Berkeley), John Kubiatowicz (UC Berkeley), Frank Dabek (Google), Y. Charlie Hu (Purdue)
Funding: Max Planck Society, National Science Foundation, Intel Research, Microsoft Research, Texas ATP
Decentralized (p2p) systems
Distributed computer system with
  Symmetric components
  Decentralized control and state
  Self-organization
Promise
  “Organic” growth
  Low barrier to deployment
  Resilience to faults, attack
  Resource abundance, diversity
Partly vs. fully decentralized systems
Partly decentralized systems have a dedicated controller node
  Organic growth, abundant/diverse resources
  Limited scalability, resilience
Fully decentralized systems
  Some fully decentralized systems have powerful supernodes
  Increased efficiency, but reduced resilience
Decentralized systems: deployment
Self-organization enables deployment in dynamic networks
Ad hoc wireless networks
  Mobile wireless devices
Delay-tolerant networks
  Devices with intermittent connectivity
Overlay networks (most common)
  Internet-connected devices
Outline
1. Decentralized systems: state-of-the-art
   Overlays, object lookup, routing
   Example: Pastry
   Shared state and coordination: DHTs and Scribe/DOLR
   Challenges
   Putting it all together: ePOST
2. Accountability for distributed systems
3. Social computing and applications
Overlay networks
Overlay links rely on unicast service in the Internet
Topology can be “structured” or “unstructured”
[Figure: overlay network layered on top of the Internet]
Why overlays?
Overcome limitations of Internet architecture
  group communication, content-oriented networking
  enable innovation
Low barrier to deployment
  resource sharing enables “organic” growth
  self-organization simplifies operation
Robustness to faults, attacks, unexpected workloads
  decentralization
  resource diversity, wealth
Decentralized (p2p) systems: What do they enable?
Cooperative computing
  Content sharing/distribution (Kazaa, BitTorrent)
  Streaming media (SOPcast, PPLive, Joost, iPlayer)
  Telephony (Skype), popular scientific computing
  Low barrier to deployment, market entry: Innovation
Digital preservation
  Diversity, abundance of resources provides durability
Autonomous distributed systems
  Self-managing networks of small or mobile devices
  Decentralization is necessary for autonomy
Popular decentralized systems
File sharing, bulk content distribution
  BitTorrent, eDonkey dominate Internet traffic
Streaming media distribution
  PPLive, CoolStreaming, Joost, iPlayer, LiveStation
Skype
Volunteer computing
  BOINC apps perform 1 PFLOPS on average
Decentralized (p2p) systems: State-of-the-art
Decentralized state management
  Object location
  Replication
  Availability, Durability
  Load balancing
Efficient, consistent lookup routing in Internet overlays
Efficient cooperative content distribution
Dependable storage from untrusted components
Security: secure routing, content integrity, incentives
Key problem: Object location
Objects partitioned among participating nodes
Mapping from objects to nodes is dynamic
Unicast routing doesn’t help
  don’t know who to talk to
  don’t know where to store objects
  want to address (data) objects, not nodes!
Solution 1: Unstructured overlay
No assumptions about overlay graph structure
  New node is assumed to know one participant
  Performs random walk to find more nodes to attach to
Object placement
  Inserting node or random walk target
  May leave references along random path
Object lookup
  Scoped flooding or random walk
Examples: Gnutella, Kazaa, eDonkey
Unstructured object location
I inserts an object
  Leave reference on R
S floods a request
  Finds reference at R
Tradeoff between scalability and recall
Popular object easy to find
Solution 2: structured overlay networks
Overlay graph conforms to a specific graph structure
Key-based routing primitive (KBR):
KBR(M, X): route message M to the live node that is currently responsible for the object associated with numerical id X
Basis for content-oriented networking
Examples: Chord, CAN, Pastry, Tapestry, Bamboo, Kademlia, SkipNet, Kelips, Accordion, etc.
Structured vs. unstructured overlays
Unstructured
  Simple overlay formation
  Tradeoff between recall and efficiency
  Robust to churn
Structured
  Pre-determined routes
  Efficient identity lookup, tree formation
  More susceptible to churn
Can be combined:
  Stable nodes form structure
  Others attach randomly
Outline
1. Decentralized systems: state-of-the-art
   Overlays, object lookup, routing
   Example: Pastry
   Shared state and coordination: DHTs and Scribe/DOLR
   Challenges
   Putting it all together: ePOST
2. Accountability for distributed systems
3. Social computing and applications
Pastry: Identifier space
Consistent hashing [Karger et al. ’97]
160 bit circular id space, from 0 to 2^160 - 1
nodeIds (uniform random)
keys (uniform random)
Each key is mapped to the live node with “closest” nodeId
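The key-to-node mapping can be sketched in a few lines. This is a minimal illustration, assuming SHA-1 as the hash (its output happens to be 160 bits) and reading “closest” as shortest distance around the ring; the function names are hypothetical and not Pastry's API.

```python
import hashlib
from typing import List

ID_BITS = 160
ID_SPACE = 1 << ID_BITS  # ids range over 0 .. 2^160 - 1


def to_id(name: str) -> int:
    """Hash an arbitrary name to a 160-bit id (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two ids on the circular id space."""
    d = (a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def responsible_node(key: int, live_node_ids: List[int]) -> int:
    """Return the live nodeId closest to key; ties go to the smaller id."""
    return min(live_node_ids, key=lambda n: (ring_distance(n, key), n))
```

Because distance wraps around, a key just above 0 can belong to a node whose id is just below 2^160 - 1.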
Pastry: lookup
KBR(M, X)
Msg with key X is routed to live node with nodeId closest to X
Problem: complete routing table not scalable
[Figure: circular id space from 0 to 2^160 - 1 with key X]
Pastry: prefix-based routing
Properties
  log_16 N steps
  O(log N) state
[Figure: route of KBR(M, d46a1c) through the id space; labeled nodes: 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1]
Pastry: routing table (node 65a1fcx)
Row 0: 0x 1x 2x 3x 4x 5x 7x 8x 9x ax bx cx dx ex fx
Row 1: 60x 61x 62x 63x 64x 66x 67x 68x 69x 6ax 6bx 6cx 6dx 6ex 6fx
Row 2: 650x 651x 652x 653x 654x 655x 656x 657x 658x 659x 65bx 65cx 65dx 65ex 65fx
Row 3: 65a0x 65a2x 65a3x 65a4x 65a5x 65a6x 65a7x 65a8x 65a9x 65aax 65abx 65acx 65adx 65aex 65afx
log_16 N rows
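The table lookup behind prefix-based routing can be sketched as follows. This is a simplified illustration, assuming hexadecimal string ids and a routing table keyed by (row, next digit); the table contents are made up for the example, and the final delivery step via the neighbor set is left out.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: (row, digit) -> nodeId of a known node that
# shares `row` digits with the local node and has `digit` at position `row`.
RoutingTable = Dict[Tuple[int, str], str]


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(local_id: str, key: str, table: RoutingTable) -> Optional[str]:
    """Pick the entry whose id matches the key in one more digit than we do."""
    row = shared_prefix_len(local_id, key)
    if row == len(key):
        return None  # the local id equals the key
    return table.get((row, key[row]))


def route(start: str, key: str, tables: Dict[str, RoutingTable]) -> List[str]:
    """Follow next hops until no entry with a longer shared prefix is known."""
    path = [start]
    current = start
    while True:
        nxt = next_hop(current, key, tables.get(current, {}))
        if nxt is None or shared_prefix_len(nxt, key) <= shared_prefix_len(current, key):
            return path
        path.append(nxt)
        current = nxt
```

Each hop extends the matched prefix by at least one digit, which is where the log_16 N bound on the number of steps comes from.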
Pastry: prefix-based routing
Similar to Plaxton Trees [Plaxton et al. ‘97]
But added
  Neighbor sets for consistency, robustness, security
  Consistent routing
  Self-organization (dynamic joins, fault tolerance)
  Proximity neighbor selection for efficiency
  Secure routing to defend against malicious nodes
Neighbor sets
Stabilization protocol ensures eventual consistency
  aids routing consistency
  enables secure routing
  localizes fault detection within neighbor sets
  enables application-specific local coordination (e.g., object replica management)
Challenge: Inconsistent routing
Routing consistency: “At any time, at most one overlay node accepts messages with a given key”
Necessary for consistency of mutable data
Complicated by Internet routing anomalies
Example: new node N has informed X, but not yet Y, of its arrival
Challenge: Self-organization
Initializing and maintaining node state (overlay construction and maintenance)
Node addition
Node departure (failure)
Pastry: Node join
New node: d46a1c
[Figure: KBR(Join, d46a1c) routed through the id space; labeled nodes: 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1]
Pastry: Node departure (failure)
Neighbor set members exchange keep-alive messages (failure detection, neighbor set stabilization)
Neighbor set repair (eager): request set from farthest live node in set
Routing table repair (lazy): get table from peers in the same row, then higher rows
Challenge: Overlay route efficiency
Nodes close in id space, but far away in Internet
Goal: choose routing table entries that yield few hops and low latency
[Figure: overlay nodes at sites CA-T1, CCI, Aros, Utah, CMU, MIT, MA-Cable, Cisco, Cornell, NYU, OR-DSL; id prefixes 20x, 80x, 81x, 89x]
Proximity neighbor selection (PNS)
Assumptions:
  scalar proximity metric (e.g., RTT)
  a node can probe distance to any other node
Proximity invariant:
Each routing table entry refers to a node close to the local node (in the physical network), among all nodes with the appropriate nodeId prefix.
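A minimal sketch of how a node might fill one routing table slot under this invariant, assuming it has already measured RTTs to some candidate nodes; the function name and the dictionary of measurements are hypothetical, and a real node only sees the candidates it has learned about.

```python
from typing import Dict, Optional


def pick_entry(candidates_rtt_ms: Dict[str, float], prefix: str) -> Optional[str]:
    """Among known nodes whose id starts with `prefix`, return the one with
    the lowest measured RTT, or None if no known node qualifies."""
    eligible = {n: rtt for n, rtt in candidates_rtt_ms.items() if n.startswith(prefix)}
    if not eligible:
        return None
    return min(eligible, key=lambda n: eligible[n])
```

Low rows of the table have many eligible candidates, so the chosen entries tend to be nearby; higher rows have exponentially fewer candidates and therefore longer expected distances.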
PNS: Routes in delay space
[Figure: Route(d46a1c) shown in NodeId space and in delay space; labeled nodes: 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1]
PNS Properties
1) Low-delay routes: Average delay stretch, relative to IP, is a small constant (1.3 - 2.2) and can be derived from the physical network’s delay distribution
2) Route convergence: Routes of messages sent by nearby nodes with the same key converge at a node near the source nodes
Details in [Castro et al. MSR-TR-2002-82]
Outline
1. Decentralized systems: state-of-the-art
   Overlays, object lookup, routing
   Example: Pastry
   Shared state and coordination: DHTs and Scribe/DOLR
   Challenges
   Putting it all together: ePOST
2. Accountability for distributed systems
3. Social computing and applications
Sharing state: Distributed hash tables (DHT)
Hashtable API: put(obj,key), obj <- get(key)
Layered on top of a structured overlay
  Scalability, Robustness
  Persistent storage
  High availability
Examples: Chord/CFS, Pastry/PAST, Bamboo, Kelips, Kademlia
Distributed hash table (DHT)
Operations:
  insert(k,v)
  v = lookup(k)
Structured overlay maps keys to nodes
Decentralized and self-organizing
Scalable, robust
[Figure: tuples (k1,v1) through (k6,v6) stored on nodes of an overlay network]
DHT: Insertion and replication
Storage invariant: Tuple replicas are stored on r nodes with nodeIds closest to key
Insert(key,value,r), shown with r=4
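The storage invariant can be illustrated with a toy, single-process model. This is a sketch under simplifying assumptions: all nodes are known globally, there is no routing or failure handling, and the class and function names are hypothetical.

```python
from typing import Dict, List, Optional

ID_SPACE = 1 << 160


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two ids on the circular id space."""
    d = (a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def replica_set(key: int, node_ids: List[int], r: int) -> List[int]:
    """The r nodeIds closest to key, nearest first."""
    return sorted(node_ids, key=lambda n: (ring_distance(n, key), n))[:r]


class ToyDHT:
    """In-memory stand-in for a DHT that keeps r replicas per tuple."""

    def __init__(self, node_ids: List[int], r: int = 4) -> None:
        self.r = r
        self.stores: Dict[int, Dict[int, str]] = {n: {} for n in node_ids}

    def insert(self, key: int, value: str) -> None:
        for n in replica_set(key, list(self.stores), self.r):
            self.stores[n][key] = value

    def lookup(self, key: int) -> Optional[str]:
        for n in replica_set(key, list(self.stores), self.r):
            if key in self.stores[n]:
                return self.stores[n][key]
        return None
```

In a real deployment the replica set changes as nodes join and leave, so the invariant has to be re-established by copying tuples between neighbors.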
DHT: Lookup
Object located in log_16 N steps (expected)
Usually locates replica nearest client C
[Figure: Lookup(key) issued by client C reaches one of the r replicas]
DHT: Dynamic caching
Nodes cache tuples in the unused portion of their allocated disk space
Tuples cached on nodes along the route of lookup and insert messages
Goals:
  maximize query throughput for popular tuples
  balance query load
  improve client latency
DHT: Dynamic caching
[Figure: Lookup(key) shown in delay space]
Coordination: Decentralized group management
E.g., SCRIBE [Rowstron et al., JSAC ’02]
  Spanning trees embedded in structured overlay
  Multicast, anycast primitives
  Scalable: large numbers of groups, members, wide range of members/group, dynamic membership
Cooperative group communication
Operations:
  create(g)
  join(g)
  leave(g)
  multicast(g,m)
  anycast(g,m)
groupId g mapped to n0
decentralized membership
robust, scalable
[Figure: multicast tree for group g over nodes n0 to n4]
Scribe
[Figure: Join(groupId) routed toward groupId, shown in delay space]
Structured overlay APIs
KBR: route(M, X)
DHT: insert(k,v), v = lookup(k)
SCRIBE / DOLR: create(g), join(g), leave(g), multicast(g,m), anycast(g,m)
[Dabek et al., IPTPS ’05]
Outline
1. Decentralized systems: state-of-the-art
   Overlays, object lookup, routing
   Example: Pastry
   Shared state and coordination: DHTs and Scribe/DOLR
   Challenges: malicious participants
   Putting it all together: ePOST
2. Accountability for distributed systems
3. Social computing and applications
Malicious participants: threats
Prevent messages from reaching root
  drop or corrupt
  bias routing tables
Cause objects to be placed on faulty nodes
  choose nodeId values
  use many identities (Sybil attack)
  impersonate root
[Figure: nodes A through L on the id ring around a key; one node states “F is my neighbor”]
Securing routing
Secure node identifier assignment
  thwarts Sybil and id choosing attacks
Secure membership protocol
  Prevents routing table bias attacks
Secure routing primitive
  Prevents root impersonation
Can tolerate up to 25% malicious nodes
[Castro et al., OSDI ’02]
Other threats
Freeloading: incentive mechanisms
Data corruption: crypto
Denial-of-service
Several defenses needed
Outline
1. Decentralized systems: state-of-the-art
   Overlays, object lookup, routing
   Example: Pastry
   Shared state and coordination: DHTs and Scribe/DOLR
   Challenges: malicious participants
   Putting it all together: ePOST
2. Accountability for distributed systems
3. Social computing and applications
Putting it all together: ePOST
Decentralized, cooperative email service
Based on users’ desktops/notebooks
Messages transmitted and stored securely
Standard mail clients (IMAP/POP)
Interoperability via SMTP
Nodes may fail arbitrarily
Users only trust their local node
[Mislove et al., EuroSys 06]
HPDC-15, June 21, 2006
Why Email?
Demanding user expectations
  Privacy
  Integrity
  Durability
  Availability
Goal: Demonstrate that a decentralized, cooperative email service can be built that users can entrust with their production email
ePOST: Single-copy store
Emails split into MIME components, stored in the DHT
  Using its content-hash as the key
  Self-certifying (integrity)
  Identical items stored once
  Convergent encryption
Items replicated thrice for availability
Additional erasure-coded replicas for durability (Glacier [Haeberlen et al., NSDI’05])
[Figure: email data split into header, body, and attachments]
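Content-hash keys give both properties listed above: identical items collapse to one key, and any reader can verify an item against its key. A minimal in-memory sketch, assuming SHA-256 as the hash (the slides do not name one) and omitting convergent encryption and replication; the class name is hypothetical.

```python
import hashlib
from typing import Dict


class SingleCopyStore:
    """Toy content-addressed store: the key is the hash of the content."""

    def __init__(self) -> None:
        self.items: Dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        key = hashlib.sha256(content).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self.items.setdefault(key, content)
        return key

    def get(self, key: str) -> bytes:
        content = self.items[key]
        # Self-certifying: recompute the hash and compare it with the key.
        if hashlib.sha256(content).hexdigest() != key:
            raise ValueError("stored content does not match its key")
        return content
```

An attachment sent to many recipients is therefore stored once, and a storage node that alters it is caught by the check in `get`.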
ePOST: Single-writer log
Per-user metadata (folders, inbox, etc.) stored as an update log
  All updates performed by owner
  Stored in the DHT
Entries form a hash chain
Log head is signed with owner’s key
Periodic snapshots stored in log
[Figure: log head and log entries (Insert msg x, Mark msg y read, Insert msg y) referencing email data (header, body, attachments)]
ePOST: Message Delivery
Message notifications are signed and contain encrypted headers and keys to the message’s components
Each user has a Scribe group
  Node joins user’s group if it has a message for the user
  User announces to the group when online
  Pending notifications delivered
ePOST: Security
Users have certificates (public key, node id)
Secure communication (SSL)
All content stored in the DHT is protected
  Authenticity
  Integrity
  Privacy
Incentives to prevent freeloading (Scrivener [Nandi, Middleware’05])
Secure KBR
Deployment and Experience
Rice / MPI rings: reserved for internal members
PlanetLab ring: open membership ring, backed by PlanetLab
Usage
  26 internal users (16 used ePOST as primary email) over more than two years
  40 DHT nodes (Rice / MPI ring), 350 nodes (PlanetLab ring)
Several times, ePOST was available when Rice or MPI-SWS email had failed
No system-wide outages after initial testing phase
Shut down due to overhead of tracking spam filtering
Decentralized systems challenges
Maintaining mutable distributed state remains hard
  Fortunately, lots of useful applications don’t require it
Incentives are basis for cooperation
  Strategy-proof protocols (e.g. tit-for-tat)
  Accountability
Need to control membership
  Certified identities (background check or fee)
  proof-of-work, social networks?
Decentralized systems challenges
Need to protect data
  Durability requires non-decreasing membership
  Scalable storage, high availability, churn resilience: pick two [Blake&Rodrigues, HotOS-IX]
Manageability
  Self-organization reduces administrative effort
  Hardware management is decentralized
  BUT: Evidence that lack of centralized control may make it difficult to manage system-wide disruptions
Outline
1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
   Why accountability?
   What is accountability?
   How can we implement it?
   How well does it work?
   Accountable virtual machines
3. Social computing and applications
PODC, Toronto, 18 August 2008 66
Byzantine faults occur in practice
Not all faults cause a node to stop
  The faulty node continues to operate, but its behavior deviates from that of a correct node
Examples:
  Hardware malfunction
  Misconfiguration
  Software error
  External security attack
  Intentional software modification
Example: LAX airport outage
Aug 2007: 17,000 passengers stranded at LAX
Cause: intermittent fault of a network card
Example: Botnets in the Internet
Compromised computer targets different domain
Admin A must localize fault, then convince admin B that her machine is faulty
[Figure: administrative domains A and B]
Example: Insider attack
Mar 2002: UBS PaineWebber admin disrupts trade for days to weeks
Difficult to detect, defuse logic bombs
Why is detecting faults difficult?
How to detect faults?
How to identify the faulty node?
How to convince others that a node is (not) faulty?
[Figure: an incorrect message and the responsible admin]
Learning from the 'offline' world
Relies on accountability
Example: Banks
  Requirement              Solution
  Commitment               Signed receipts
  Tamper-evident record    Double-entry bookkeeping
  Inspections              Audits
Record can be used to (manually) detect, identify and convince
Is accountability useful in distributed systems? Is it practical?
What does accountability mean?
Accountability := tamper-evident record + automated, reliable fault detection
Is accountability alone useful?
No, if faults are severe and irrecoverable
  need byzantine fault tolerance (see Lorenzo’s course)
Yes, for
  systems that provide “best-effort” service
  systems that assume crash failures
  systems that mask severe/irrecoverable faults
Accountability
  reliably detects and localizes faults
  provides incentives to avoid faults
  builds trust, reputation
Which Systems can benefit?
Internet services (BGP, DNS, NTP, NNTP, SMTP)
Web services
Content distribution networks (CDN)
Grid computing
Peer-to-peer systems
Multi-player games
Cloud computing
Butler Lampson on accountability
"Don’t forget that in the real world, security depends more on police than on locks, so detecting attacks, recovering from them, and punishing the bad guys are more important than prevention." -- Butler Lampson, "Computer Security in the Real World", ACSAC 2000
Outline
1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
   Why accountability?
   What is accountability?
   How can we implement it?
   How well does it work?
   Accountable virtual machines
3. What’s next? Social computing and applications
Ideal accountability
Fault := Node deviates from expected behavior
Our goal is to automatically
  detect faults
  identify the faulty nodes
  convince others that a node is (or is not) faulty
Can we build a system that provides the following guarantee?
  Whenever a node is faulty in any way, the system generates a proof of misbehavior against that node
Can we detect all faults?
Problem: Faults that affect only a node's internal state
  Would require online trusted probes at each node
Focus on observable faults: Faults that affect a correct node
Can detect observable faults without requiring trusted components
Can we always get a proof?
Problem: He-said-she-said
Three possible causes:
  A never sent X
  B refuses to acknowledge X
  X was lost by the network
Cannot get proof of misbehavior!
Generalize to verifiable evidence:
  a proof of misbehavior, or
  a challenge that a faulty node cannot answer
What if the challenged node does not respond?
  Does not prove a fault, but node is suspected until it responds
[Figure: A says “I sent X!”, B says “I never received X!”, and C cannot tell which is true]
Practical accountability
We propose the following requirement for an accountable distributed system:
  Whenever a fault is observed by a correct node, the system eventually generates verifiable evidence against a faulty node
This is useful
  Any (!) fault that affects a correct node is eventually detected and linked to a faulty node
It can be implemented in practice
Outline
1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
   Why accountability?
   What is accountability?
   How can we implement it?
   How well does it work?
   Accountable virtual machines
3. Social computing and applications
An implementation: PeerReview
Adds accountability to a given system
Implemented as a library
  Provides tamper-evident record
  Detects faults via state-machine replay
Assumptions:
  1. Nodes can be modeled as deterministic state machines
  2. Nodes have reference implementations of the state machines
  3. Correct nodes can eventually communicate
  4. Nodes can sign messages
PeerReview from 10,000 feet
All nodes keep logs of their inputs & outputs
  Including all messages
Each node has a set of witnesses, which audit the node periodically
If the witnesses detect misbehavior, they
  generate evidence
  make the evidence available to other nodes
Other nodes check evidence, report fault
[Figure: nodes A to E with A's log and B's log; A's witnesses audit A and report “A is faulty”]
PeerReview detects tampering
What if a node modifies its log entries?
Log entries form a hash chain
  Inspired by secure histories [Maniatis02]
Hash is included with every message authenticator
  Node commits to its current state
  Changes are evident
[Figure: B's log with entries Send(X), Recv(Y), Send(Z), Recv(M) linked by the hash chain H0 to H4; the message and its ACK exchanged between A and B each carry Hash(log)]
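The tamper-evidence argument can be sketched with a small hash-chained log. This is an illustration only: SHA-256, the all-zero initial hash, and the entry encoding are assumptions, and the signatures that turn a top hash into an authenticator are omitted.

```python
import hashlib
from typing import List, Tuple

Entry = Tuple[str, bytes]  # (entry type such as "SEND" or "RECV", content)
H0 = b"\x00" * 32          # assumed initial value of the chain


def chain_hash(prev_hash: bytes, entry: Entry) -> bytes:
    """H_i = Hash(H_{i-1} || type || content)."""
    entry_type, content = entry
    h = hashlib.sha256()
    h.update(prev_hash)
    h.update(entry_type.encode("utf-8") + b"\x00")
    h.update(content)
    return h.digest()


class HashChainedLog:
    def __init__(self) -> None:
        self.entries: List[Entry] = []
        self.top_hash = H0

    def append(self, entry: Entry) -> bytes:
        """Append an entry and return the new top hash, which the node would
        sign and attach to the corresponding message."""
        self.entries.append(entry)
        self.top_hash = chain_hash(self.top_hash, entry)
        return self.top_hash


def matches_commitment(entries: List[Entry], committed_hash: bytes) -> bool:
    """Recompute the chain over `entries` and compare with a committed hash."""
    h = H0
    for entry in entries:
        h = chain_hash(h, entry)
    return h == committed_hash
```

Once a peer holds a committed top hash, the node cannot change or drop an earlier entry without the recomputed chain failing to match.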
PeerReview detects omission
What if a node omits log entries?
  While inspecting A’s log, A’s witnesses send msg authenticators signed by B to B’s witnesses
  Thus, witnesses learn about all messages their node has ever sent or acknowledged
  Omission of a message from the log is a fault
[Figure: authenticators for message M signed by B pass from A's log via A's witnesses to B's witnesses]
PeerReview detects inconsistencies
What if a node
  keeps multiple logs?
  forks its log?
Witnesses check whether all msg authenticators form a single hash chain
Two authenticators not connected by a log segment indicate a fault
[Figure: a forked log with “View #1” (H0 to H4) and “View #2” (H3', H4'); entries include Create X, Read X, Read Z with results OK and Not found]
PeerReview detects faults
How to recognize faults?
Assumption: Nodes can be modeled as deterministic state machines
To audit a node, witness
  Fetches signed log
  Replays inputs to a trusted copy of the state machine
  Checks outputs against the log
[Figure: logged network inputs are replayed to a reference state machine (modules A and B); its outputs are compared with the logged outputs, and a mismatch indicates a fault]
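The audit steps above can be sketched as a replay loop. This is a simplified model, assuming the log has already been fetched and verified, that each input produces exactly one output, and that the reference state machine is given as a pure step function; `counter_step` is a made-up example machine.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

S = TypeVar("S")
I = TypeVar("I")
O = TypeVar("O")


def audit(
    log: List[Tuple[I, O]],
    step: Callable[[S, I], Tuple[S, O]],
    initial_state: S,
) -> Optional[int]:
    """Replay logged inputs through the reference state machine.

    Returns the index of the first entry whose logged output differs from
    the replayed output, or None if the whole log is consistent.
    """
    state = initial_state
    for i, (inp, logged_output) in enumerate(log):
        state, expected_output = step(state, inp)
        if expected_output != logged_output:
            return i
    return None


def counter_step(state: int, inp: int) -> Tuple[int, int]:
    """Example deterministic machine: keep a running sum and output it."""
    new_state = state + inp
    return new_state, new_state
```

Replay only works because the machine is deterministic: any divergence between logged and replayed outputs can then be attributed to the audited node.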
PeerReview guarantees
1) Observable faults will be detected
   If node commits a fault + has a correct witness, then witness obtains a proof of misbehavior (PoM), or a challenge that the faulty node cannot answer
2) Good nodes cannot be accused
   If node is correct, there can never be a PoM, and it can answer any challenge
Formal definitions and proof in the TR
PeerReview is widely applicable
App #1: NFS server in the Linux kernel
  Many small, latency-sensitive requests
  Tampering with files
  Lost updates
App #2: Overlay multicast
  Transfers large volume of data
  Freeloading
  Tampering with content
App #3: P2P email
  Complex, large, decentralized
  Denial of service
  Attacks on DHT routing
Further example faults: Metadata corruption, Incorrect access control, Censorship
More information in [Haeberlen et al., SOSP’07]
Outline
1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
   Why accountability?
   What is accountability?
   How can we implement it?
   How well does it work?
   Accountable virtual machines
3. Social computing and applications
How much does PeerReview cost?
Log storage
  10 – 100 GByte per month, depending on application
Message signatures
  Message latency (e.g. 1.5ms RTT with RSA-1024)
  CPU overhead (embarrassingly parallel)
Log/authenticator transfer, replay overhead
  Depends on # witnesses
  Can be deferred to exploit bursty/diurnal load patterns
P2p email, dedicated witnesses
Dominant cost depends on number of witnesses W
  O(W^2) component
[Figure: average traffic (Kbps/node, 0 to 100) versus number of dedicated witnesses W (baseline, 1 to 5), broken down into baseline traffic, signatures and ACKs, and checking logs]
P2p email, mutual auditing
Small probability of error is inevitable
  Example: Replication
Can use this to optimize PeerReview
  Accept that an instance of a fault is found only with high probability
  Asymptotic complexity: reduced from O(N^2) to O(log N)
[Figure: a small random sample of peers is chosen as witnesses for each node]
PeerReview is scalable
Assumption: up to 10% of nodes can be faulty
Probabilistic guarantees provide scalability
  Example: email system scales to over 10,000 nodes with P=0.999999
[Figure: average traffic (Kbps/node) versus system size (nodes), compared with DSL/cable upstream capacity, for the email system without accountability, with PeerReview at P=0.999999, and with PeerReview at P=1.0; curves annotated O(log N) and O((log N)^2)]
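One way to read the numbers above: if each witness is faulty independently with probability f, a fault goes unchecked only when all w witnesses of a node are faulty, which happens with probability f^w. The sketch below computes the smallest w for a target probability under that independence assumption; it is an illustrative calculation, not necessarily the exact analysis behind the slide, and the function name is hypothetical.

```python
from fractions import Fraction


def witnesses_needed(faulty_fraction: Fraction, target_p: Fraction) -> int:
    """Smallest w with 1 - faulty_fraction**w >= target_p.

    Exact rational arithmetic avoids floating-point rounding at the boundary.
    Requires 0 <= faulty_fraction < 1 and target_p < 1.
    """
    w = 1
    while 1 - faulty_fraction ** w < target_p:
        w += 1
    return w
```

With f = 1/10 and a target of 0.999999, six witnesses suffice, since (1/10)^6 = 0.000001. The required number depends only on f and the target probability, not on the system size.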
PeerReview summary
Accountability is a new approach to handling faults in distributed systems
  detects faults
  identifies the faulty nodes
  produces evidence
PeerReview: A system that enforces accountability
  Offers provable guarantees and is widely applicable
Details in [Haeberlen et al., SOSP ’07]
Challenges
Tension between accountability and privacy
  PeerReview (PR) requires disclosure to witnesses
  Zero-knowledge proofs?
Fault detection
  PR uses state-machine replay for fault detection
  Can’t detect deterministic software bugs
  Different implementations of underspecified protocols may diverge
  Protocol specification or abstract model?
Challenges (cont‘d)
Message signatures
  PR assumes a public-key infrastructure
  Web-of-trust (physical network, social network)?
Partial deployment
  Accountability zones, gateways?
PR requires source code modifications
  To enable deterministic replay
  Accountable virtual machines?
NetReview
Accountability applied to inter-domain routing
Fault detection based on a spec of the routing policy
Web-of-trust-based certificates
Auditing limited to peering partners
Partial deployment: accountability zones
Details in [Haeberlen et al., NSDI’09]
Outline
1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
   Why accountability?
   What is accountability?
   How can we implement it?
   How well does it work?
   Accountable virtual machines
3. What’s next? Social computing and applications
Accountable virtual machines (AVM)
Make unmodified binary VMs accountable
VMM provides deterministic logging/replay
[Figure: an accountable VMM runs an unmodified binary inside a VM (AVM), records a log, and attaches authenticators to packets]
What are AVMs good for?
Accountability for proprietary/legacy software
Accountable cloud computing
  Customer can verify correct execution
Making an entire host computer accountable
  Check for compromised software
  Forensics
Trusted network probes
Making the Internet accountable, one host at a time
[Figure: an accountable workstation connects to the Internet through a cable/DSL modem or the ISP’s DSLAM, which keeps a secure log; packets carry authenticators, and the chain of authenticators validates the log]
Related Work
Accountability [Lampson ’00, Yumerefendi&Chase ’05, Yumerefendi et al. ’07, Argyraki et al. ’07, Michalakis et al. ’07]
Practical byzantine fault tolerance [Castro&Liskov ’00, Ramasamy ’07]
General fault detection [Kihlstrom et al. ’07, Doudou et al. ’99, Malkhi&Reiter ’97]
Intrusion detection, reputation systems [Denning ’87, Ko et al. ’94, Kamvar et al. ’03]
Trusted computing [Garfinkel et al. ’02]
Fault-specific defenses [Cox&Noble ’03, Waldman&Mazieres ’03]
Tamper-evident logs [Schneier&Kelsey ’98, Maniatis&Baker ’02]
Conclusion
Byzantine faults in distributed systems are real
Accountability is a new approach to handling faults
  detects observable faults
  identifies the faulty node
  produces verifiable evidence
Presented a practical definition of accountability
Practical implementations exist
Many challenges remain
Outline
1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
3. Social computing and applications
   Exploiting social networks for distributed computing
   Example: enhancing Web search
   Example: thwarting unwanted communication
From service-centric to user-centric computing
Collaborative, social computing and communication
In peer-to-peer, users share technical resources
In social computing, users share knowledge, opinions, referrals, ratings
User-centric, social computing
Mass collaboration, enabled by technology
Human intelligence aggregated through technology
User contribution is the most important resource
  (Underutilized resource of enormous scale?)
BUT: Outcome depends on user behavior
  depends on cooperation, good will
  vulnerable to spoilers
Social networks: two concepts
Users contribute
  Content
  Opinions, recommendations, ratings (explicit or implicit)
Users form social networks
  Graph connecting users (explicit or implicit)
  Links imply shared interest or trust
What are social networks?
Graphs connecting people
  Edges connect “friends”
  Imply shared interest or trust
  Online friends may have never met in real life
  E.g., email, Skype, IM
Online social networking sites
  Network hosted by a Web site
  Often used to share opinions, advice, ratings, multimedia content
[Figure labels: Social Network; Online Social Network]
Huge opportunity…
…to leverage collective user input, e.g.
  to deal with unwanted communication
  to thwart security attacks
  to enable better organization, filtering, search, ranking, and distribution of content
  may provide an answer to the ever-increasing flood of information
Outline
1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
3. What’s next? Social computing and applications
  Exploiting social networks for distributed computing
  Example: enhancing Web search
  Example: thwarting unwanted communication
What’s it got to do with Systems?
Social networks enhance distributed systems
  Sybil attacks
  Unwanted communication
  Personalization
Social computing may need distribution
  Privacy
  Avoid dependence on a single provider
Leveraging social networks to enhance systems
Trust can help thwart security problems
  Sybil attacks: SybilGuard [SIGCOMM’06]
  Clones unlikely to have diverse links
Trust can help block unwanted communication
  Friends unlikely to send SPAM: RE [NSDI’06]
  Using social networks to thwart SPAM (Ostra)
Shared interest can improve search
  Web search: PeerSpective [HotNets’06]
  Related users likely to visit relevant content
Leveraging social networks: More ideas
Sharing solutions and problem fixes
  Configurations that work
  Fixes that others have found
  “Copy what works for others”
Combine technology and social networks to truly “stand on the shoulders of giants”
Answer to the increasing complexity of the information age?
Outline
1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
3. What’s next? Social computing and applications
  Exploiting social networks for distributed computing
  Example: enhancing Web search
  Example: thwarting unwanted communication
Example: social network based Web search
[Figure labels: PeerSpective, Google]
PeerSpective experiment
  Idea: users can query their friends’ previously viewed pages
  Results from friends appear alongside Google results
PeerSpective implementation
Prototype is a lightweight HTTP proxy
  Runs on the user’s desktop and indexes all browsed content
  When a Google search is performed,
    query other PeerSpective proxies in parallel with Google
    present PeerSpective results alongside Google results
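The query flow described here (ask peers in parallel with the Web search engine, then show both result sets) might look roughly like the sketch below. It is only an illustration under stated assumptions: `search_local_index` and `federated_search` are hypothetical names, each peer's index is modeled as an in-memory dict from URL to page text, and `web_search` is a caller-supplied function standing in for the real search engine request.

```python
from concurrent.futures import ThreadPoolExecutor


def search_local_index(index: dict, query: str) -> list:
    """Return URLs of indexed pages whose text contains every query term."""
    terms = query.lower().split()
    return sorted(url for url, text in index.items()
                  if all(term in text.lower() for term in terms))


def federated_search(query: str, web_search, peer_indexes: list) -> dict:
    """Query the Web engine and all peer indexes in parallel."""
    with ThreadPoolExecutor(max_workers=len(peer_indexes) + 1) as pool:
        web_future = pool.submit(web_search, query)
        peer_futures = [pool.submit(search_local_index, idx, query)
                        for idx in peer_indexes]
        web_results = web_future.result()
        peer_results = []
        for future in peer_futures:
            for url in future.result():
                if url not in peer_results:  # drop pages seen by several peers
                    peer_results.append(url)
    # The two lists are kept separate so they can be shown side by side.
    return {"web": web_results, "peers": peer_results}
```

Keeping the peer results in their own list mirrors the slide's "alongside" presentation; how the real prototype ranks or merges them is not specified here.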
PeerSpective results summary
Explored potential of integrating Web and social network search
Evidence that PeerSpective added value
  Additional coverage for viewed sites
  Improved ranking of results
  Aided in finding content serendipitously
However, just an experiment
  Many challenges remain
  Opportunities as well
Details in [Mislove et al., HotNets ’06]
Outline
1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
3. What’s next? Social computing and applications
  Exploiting social networks for distributed computing
  Example: enhancing Web search
  Example: thwarting unwanted communication
Unwanted communication
Well-known problem
  Email spam
Increasingly affects other systems
  Search-engine spam
  Mislabeled videos plaguing YouTube
  Unwanted invitations in Skype
Existing solutions insufficient
  Content filtering for videos?
Known defenses
Content filtering
  Works very well for email, but
  False positives reduce communication reliability
  Doesn’t work for multimedia
Holding senders accountable
  Requires strong user identities
Imposing a per-communication cost
  Refunded if communication is wanted
  Requires micro-payments/quota market
Ostra: Using social relationships
Assumptions
Cost for acquiring and maintaining social links
  Cannot create links arbitrarily fast
  Cannot maintain arbitrary number of links
Receivers are willing to classify content
  Explicit (Junk button)
  Implicit (Deletion, response)
Ostra: Pair-wise credit exchange
• Credit balance/bound associated with each link
• Credit balances decay at constant rate (10%/day)
• Sum of all credit = 0 (invariant)
[Figure: link annotated with the values -2, 0, 2]
Ostra: Pair-wise credit exchange
Message unwanted -> sender pays receiver one credit
Sending spam exhausts sender’s link balance
[Figure: Sender and Receiver joined by a link annotated with the values -2, 0, 2]
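The pair-wise rules (a bounded balance per link, one credit paid for each unwanted message, 10%/day decay) can be sketched as follows. This is a simplified illustration, not Ostra's actual implementation: the class name `CreditLink` and its methods are hypothetical, the bounds of -2 and 2 are assumed from the figure annotations, and the sketch ignores any credit reserved while a message awaits classification.

```python
from dataclasses import dataclass


@dataclass
class CreditLink:
    """Credit state of one social link, seen from the sender's side."""
    balance: float = 0.0
    lower: float = -2.0  # assumed bound, read off the figure
    upper: float = 2.0   # assumed bound, read off the figure

    def receiver_balance(self) -> float:
        # The receiver sees the mirror image, so the two views sum to zero.
        return -self.balance

    def can_send(self) -> bool:
        # The sender must be able to afford the penalty for an unwanted
        # message, and the receiver must have room to accept the credit.
        return (self.balance - 1 >= self.lower
                and self.receiver_balance() + 1 <= self.upper)

    def deliver(self, wanted: bool) -> bool:
        """Try to send one message; return False if the link is exhausted."""
        if not self.can_send():
            return False
        if not wanted:
            self.balance -= 1  # sender pays receiver one credit
        return True

    def decay(self, days: float, rate: float = 0.10) -> None:
        """Shrink the balance toward zero at a constant daily rate."""
        self.balance *= (1 - rate) ** days
```

With these assumed bounds a sender can have two messages marked unwanted before the link refuses further traffic, and decay gradually restores the ability to send.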
Ostra: End-to-end credit exchange
Rate of spam a user can send is proportional to the number of links (s)he has
[Figure: Sender and Receiver connected through a path of links; link annotations include (-2, 0, 2), (-2, 1, 2), and (-2, -1, 2)]
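When sender and receiver are not directly linked, credit can be passed hop by hop along a path of links. The sketch below is a hedged illustration of that idea: `CreditNetwork` and its methods are hypothetical names, every link is assumed to have the same symmetric bound, paths are found with a plain breadth-first search, and decay is left out.

```python
from collections import deque


class CreditNetwork:
    """Links with mirrored balances; credit moves hop by hop along a path."""

    def __init__(self, bound: int = 2):
        self.bound = bound  # assumed symmetric bound on every link
        self.balance = {}   # (u, v) -> u's balance on the link u-v

    def add_link(self, u, v):
        self.balance[(u, v)] = 0
        self.balance[(v, u)] = 0

    def _can_pay(self, u, v) -> bool:
        return self.balance[(u, v)] - 1 >= -self.bound

    def find_path(self, src, dst):
        """Breadth-first search over links that still have credit available."""
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if u == dst:
                path = []
                while u is not None:
                    path.append(u)
                    u = parent[u]
                return path[::-1]
            for (a, b) in list(self.balance):
                if a == u and b not in parent and self._can_pay(a, b):
                    parent[b] = u
                    queue.append(b)
        return None

    def send(self, src, dst, wanted: bool) -> bool:
        path = self.find_path(src, dst)
        if path is None:
            return False  # no path with available credit: message blocked
        if not wanted:
            # Each hop pays the next one credit; intermediaries break even.
            for u, v in zip(path, path[1:]):
                self.balance[(u, v)] -= 1
                self.balance[(v, u)] += 1
        return True

    def net_credit(self, node):
        return sum(b for (u, _), b in self.balance.items() if u == node)
```

In a small example where a sender has two links, each bounded at 2, at most four unwanted messages get through, which matches the claim that the spam rate scales with the number of links; the intermediate users end with a net credit of zero.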
Sybil attacks are not effective
[Figure label: Sybils]
Total unwanted communication by Sybils is bounded by the number of links with other users
Ostra
Thwarts unwanted communication in existing systems
  Examples: Email, Skype, IM, YouTube
Uses existing relationships among users
  Online social networks
  Graph of email/IM/Skype users
Does not require strong user identities
Does not rely on automatic content classification
Respects recipient’s idea of wanted/unwanted communication
Details in [Mislove et al., NSDI ’08]
SN and applications research agenda
Measurement/Analysis
  Theory of complex networks
  Empirical study of social networks
  Understanding SN evolution
  Understanding SN information flow
Design
  Personalized search, filtering, content distribution
  Using social networks to thwart unwanted behavior
  Online social networks and privacy
Outline
1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
3. Social computing and applications
  Exploiting social networks for distributed computing
  Example: enhancing Web search
  Example: thwarting unwanted communication
Max Planck Institute for Software Systems (MPI-SWS)
Part of Max Planck Society
Academic research institute, publicly funded
Focus on basic research
Kick-off in Aug 2005
17 faculty positions (tenure-track)
~100 doctoral/post-doc positions
Administrative and technical support staff
Top international research institution
MPI-SWS Faculty
Peter Druschel: Distributed systems
Krishna Gummadi: Networked systems
Andrey Rybalchenko: Program analysis and verification
Derek Dreyer: Functional programming
Michael Backes (Fellow): Security and cryptography
Rodrigo Rodrigues: Dependable systems
Paul Francis: Large scale Internet systems
Graduate program (MS/PhD)
Advised by MPI-SWS faculty
Stimulating, competitive environment
International, diverse student body (80%)
English language
Financial aid
Internships available
http://www.mpi-sws.org
Thanks for your attention!