Alex Shraer, Principles of Reliable Distributed Systems, Technion EE, Spring 2008

Principles of Reliable Distributed Systems

Tutorial 4: SkipNet

Spring 2008

Alex Shraer

Reading Material

• SkipNet: A Scalable Overlay Network with Practical Locality Properties
  – Harvey, Jones, Saroiu, Theimer, Wolman
  – Microsoft Research

Reminder: DHT Advantages

• Peer-to-peer: no centralized control or infrastructure

• Scalability: O(log N) routing, routing tables, join time

• Load-balancing

• Overlay robustness

DHT Disadvantages: SkipNet Motivation

• No control where data is stored
  – Data may be stored far from its users
  – Data may be stored outside its administrative domain
    • hard to administer privileges
    • invites different security attacks
  – Local accesses leave the local organization

• In practice, organizations want:
  – Content Locality: explicitly place data where we want (inside the organization)
  – Path Locality: guarantee that local traffic (a user in the organization looks for a file of the organization) remains local

• No prefix search
  – Search(key) returns file whose name has key as prefix

Practical Requirements

• Data Controllability:
  – Organizations want control over their own data
  – Even if local data is globally available

• Manageability:
  – Data control allows for data administration, provisioning and manageability

Practical Requirements (cont’d)

• Security:
  – Content and path locality are key building blocks for dealing with certain external attacks (DoS, traffic analysis)

• Data availability
  – Local data survives network partitions.

• Performance
  – Data can be stored near clients that use it

SkipNet Content Locality

• Place files at nodes according to names

• Name ID space (DNS-like)
  – for files and nodes
  – node name = reverse DNS name of the host (com.microsoft.host1)
  – file names have the same prefix as the node's name

• Problem?

Constrained Load-Balancing

• Data uniformly distributed in designated subset of nodes
  – e.g., inside organization

• How can this be achieved?

• Numeric ID space!
  – similar to Chord, Pastry and others
  – nodes are randomly distributed
  – Hashes of the node names and content identifiers are mapped into the numeric ID space.
  – Content is stored on the node with ID closest to the content’s hashed name.

• Key property of SkipNet: two address spaces
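
A minimal sketch of the two address spaces, not taken from the slides: the name ID is the reversed DNS name, and the numeric ID is derived from a hash of that name. The choice of SHA-1, the 8-bit width, and the host names other than host1.microsoft.com are assumptions made for illustration.

```python
import hashlib

def name_id(host: str) -> str:
    """Name ID: the host's DNS name with its labels reversed."""
    return ".".join(reversed(host.split(".")))

def numeric_id(name: str, bits: int = 8) -> str:
    """Numeric ID: the leading `bits` bits of a hash of the name, as a bit string.

    SHA-1 and the width are illustrative assumptions.
    """
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big") >> (len(digest) * 8 - bits)
    return format(value, f"0{bits}b")

# Hypothetical hosts: sorting by name ID groups an organization together,
# while the numeric IDs scatter the same nodes pseudo-randomly.
hosts = ["host1.microsoft.com", "www.sun.com", "host2.microsoft.com"]
for host in sorted(hosts, key=name_id):
    nid = name_id(host)
    print(nid, numeric_id(nid))
```

Sorting by name ID keeps the two microsoft.com hosts adjacent; their numeric IDs carry no such relationship.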

Skip Lists - Reminder

• In-memory dictionary data structure.
  – Sorted linked list with a subset of nodes having additional links to skip over many list elements

• Perfect (deterministic) skip list:
  – Pointer at level h skips over 2^h elements
  – Search: O(log N), where N is the number of nodes in the list.
  – Insertion/deletion: expensive/awkward

Skip Lists - Reminder

• Probabilistic skip list:

– Node at level h with probability 1/2^h

– Search, Insert, Delete: O (log N) w.h.p.
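
As a reminder of how the probabilistic variant works, here is a small sketch of a skip list with insert and search. The class and method names, the height cap of 16 and the seeded random generator are illustrative choices, not part of the slides.

```python
import random

class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height  # next[h] is the successor at level h

class SkipList:
    MAX_HEIGHT = 16  # illustrative cap on the number of levels

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.head = Node(None, self.MAX_HEIGHT)  # sentinel present at every level

    def _random_height(self):
        # Promote with probability 1/2 per level, so a node appears
        # at level h with probability 1/2^h (up to the cap).
        height = 1
        while height < self.MAX_HEIGHT and self.rng.random() < 0.5:
            height += 1
        return height

    def _predecessors(self, key):
        # Walk top-down; at each level move right while the next key is smaller.
        preds = [self.head] * self.MAX_HEIGHT
        node = self.head
        for h in reversed(range(self.MAX_HEIGHT)):
            while node.next[h] is not None and node.next[h].key < key:
                node = node.next[h]
            preds[h] = node
        return preds

    def insert(self, key):
        preds = self._predecessors(key)
        node = Node(key, self._random_height())
        for h in range(len(node.next)):
            # Splice the new node into the list at each of its levels.
            node.next[h] = preds[h].next[h]
            preds[h].next[h] = node

    def contains(self, key):
        candidate = self._predecessors(key)[0].next[0]
        return candidate is not None and candidate.key == key

sl = SkipList()
for name in ["M", "A", "T", "D", "Z", "O", "X", "V"]:
    sl.insert(name)
print(sl.contains("T"), sl.contains("B"))
```

Note that every search starts at the head sentinel, which is exactly the limitation the next slide raises.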

Skip List: Good for Us?

• The Good:
  – Sorted list: path locality for name-based search
  – O(log N) search with skip pointers
  – Up to log(N) skip pointers: O(log N) insertion

• The Bad:
  – Lookup starts from root only
  – Unequal load
    • nodes on the top levels have a high chance of being on the routing path

SkipNet Global View

[Figure: The full SkipNet routing infrastructure for an 8 node system, including the ring labels.]

  L = 0: Root Ring: A, D, M, O, T, V, X, Z
  L = 1: Ring 0: A, M, T, X; Ring 1: D, O, V, Z
  L = 2: Ring 00: A, T; Ring 01: M, X; Ring 10: O, Z; Ring 11: D, V
  L = 3: Ring 000 to Ring 111, one node each

SkipNet Structure

• Skip Graph = Distributed Skip List
  – Every node belongs to rings at all levels
  – Search can start at any node
  – Use doubly linked lists at each level to account for absence of head and tail nodes.

• Perfect vs. Probabilistic
  – Perfect: Pointers at level h point to nodes that are exactly 2^h nodes to the left and right.
  – Probabilistic: A node in level h probabilistically determines which ring it belongs to.

• All rings are sorted according to Name IDs

• Ring membership is according to Numeric IDs
  – All nodes sharing the same prefix of Numeric IDs of length h are members of the same ring at level h
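
The two rules above (rings sorted by name ID, membership decided by numeric ID prefix) can be sketched for the 8-node example as follows. The first two bits of each numeric ID follow the ring labels of the example; the third bit and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# Numeric IDs for the 8-node example. The first two bits follow the ring
# labels (Ring 00 = A, T and so on); the third bit is an assumption.
NUMERIC_ID = {"A": "000", "T": "001", "M": "010", "X": "011",
              "Z": "100", "O": "101", "D": "110", "V": "111"}

def build_rings(numeric_id):
    """Map each numeric-ID prefix to its ring: member names sorted by name ID."""
    rings = defaultdict(list)
    for name, nid in numeric_id.items():
        for h in range(len(nid) + 1):
            rings[nid[:h]].append(name)  # the empty prefix is the root ring
    return {prefix: sorted(members) for prefix, members in rings.items()}

def routing_table(name, numeric_id, rings):
    """Return (clockwise, counterclockwise) neighbors for levels 0, 1, 2."""
    table = []
    for h in range(len(numeric_id[name])):
        ring = rings[numeric_id[name][:h]]
        i = ring.index(name)
        table.append((ring[(i + 1) % len(ring)], ring[i - 1]))
    return table

rings = build_rings(NUMERIC_ID)
print(rings[""])               # root ring, sorted by name ID
print(rings["0"], rings["1"])  # the two level-1 rings
print(routing_table("A", NUMERIC_ID, rings))
```

Each node only needs its own row of neighbors per level, so the global ring view never has to be stored anywhere.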

SkipNet Routing Tables

[Figure: the ring hierarchy of the 8-node example (levels L = 0 to L = 3), with node A’s routing table highlighted.]

An Alternative View

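SkipNet nodes ordered by name ID. Routing tables of nodes A and V shown.

[Figure: nodes A, D, M, O, T, V, X, Z arranged on a ring in name ID order, labeled with 3-bit numeric IDs (000 to 111).]

Routing table of node A:
  Level 2: T, T
  Level 1: M, X
  Level 0: D, Z

Routing table of node V:
  Level 2: D, D
  Level 1: Z, O
  Level 0: X, T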

Routing By Name ID

• Routing in Skip Graph = Search in Skip Lists

• Simple Rule:
  – Forward the message to the node that is closest to the destination, without going too far.

• Route either clockwise or counterclockwise

• Terminates when the message arrives at a node whose name ID is closest to the destination.

• Number of hops is O(log N) w.h.p.

Example: Routing from A to V

[Figure, shown in three steps on the ring hierarchy of the 8-node example (levels L = 0 to L = 3): a message is routed by name ID from A to V. The second step highlights node T’s routing table.]

Example: Routing to Object

• Route from A to F -> Terminates at E

[Figure: the ring hierarchy (levels L = 0 to L = 3) for nodes A, D, E, O, T, V, X, Z.]

Name ID Routing Algorithm

SendMsg(nameID, msg) {
  if( LongestPrefix(nameID,localNode.nameID)==0 )
    msg.dir = RandomDirection();    // slide callout: Load Balancing
  else if( nameID<localNode.nameID )
    msg.dir = counterClockwise;     // slide callout: Path Locality
  else
    msg.dir = clockwise;            // slide callout: Path Locality
  msg.nameID = nameID;
  RouteByNameID(msg);
}

// Invoked at all nodes (including the source and
// destination nodes) along the routing path.
RouteByNameID(msg) {
  // Forward along the longest pointer
  // that is between us and msg.nameID.
  h = localNode.maxHeight;
  while (h >= 0) {
    nbr = localNode.RouteTable[msg.dir][h];
    if (LiesBetween(localNode.nameID, nbr.nameID, msg.nameID, msg.dir)) {
      SendToNode(msg, nbr);
      return;
    }
    h = h - 1;
  }
  // h<0 implies we are the closest node.
  DeliverMessage(msg.msg);
}
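
A runnable sketch of the same forwarding rule on the 8-node example, simulated centrally rather than node by node. The direction is passed in explicitly because single-letter names share no prefix, so SendMsg would pick it at random. The numeric IDs (in particular the third bit), the helper names and the wrap-around handling in lies_between are assumptions made for illustration.

```python
# Third bit of each numeric ID is an assumption; the first two follow the ring labels.
NUMERIC_ID = {"A": "000", "T": "001", "M": "010", "X": "011",
              "Z": "100", "O": "101", "D": "110", "V": "111"}
NAMES = sorted(NUMERIC_ID)
LEVELS = 3  # levels 0..2 carry useful pointers; level-3 rings hold one node
CLOCKWISE, COUNTERCLOCKWISE = 1, -1

def neighbor(node, level, direction):
    """Next node in `direction` on the ring that `node` belongs to at `level`."""
    prefix = NUMERIC_ID[node][:level]
    ring = [n for n in NAMES if NUMERIC_ID[n].startswith(prefix)]
    return ring[(ring.index(node) + direction) % len(ring)]

def lies_between(start, nbr, dest, direction):
    """True if nbr is reached from start, moving in `direction`, without passing dest."""
    if start == dest:
        return False
    if direction == CLOCKWISE:
        if start < dest:
            return start < nbr <= dest
        return nbr > start or nbr <= dest      # interval wraps past the largest name
    if dest < start:
        return dest <= nbr < start
    return nbr < start or nbr >= dest          # interval wraps past the smallest name

def route_by_name(source, dest, direction):
    """Return the nodes visited when routing by name ID from source toward dest."""
    path, node = [source], source
    while True:
        for level in reversed(range(LEVELS)):  # try the longest pointer first
            nbr = neighbor(node, level, direction)
            if lies_between(node, nbr, dest, direction):
                node = nbr
                path.append(node)
                break
        else:
            return path  # no pointer gets closer: this node is the closest

print(route_by_name("A", "V", CLOCKWISE))
print(route_by_name("A", "N", CLOCKWISE))  # "N" is not a node name
```

When the destination name is not a node, the route stops at the last node that does not overshoot it, which is the behavior of the "Route from A to F" example.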

Routing By Numeric ID

• Numeric IDs are random, no ring is sorted by them
  – We can’t route top-down!

• Bottom-up Routing
  – Routing begins at the level 0 ring and proceeds until a node is found whose numeric ID matches the destination numeric ID in the first digit.
  – Messages are forwarded from a ring at level h, R_h, to a ring at level h+1, R_{h+1}, such that nodes in R_{h+1} share h+1 digits with the destination numeric ID.
  – Terminates when the message is delivered, or when none of the nodes in R_h share h+1 digits with the destination numeric ID; the message then ends at the node in R_h with the closest possible numeric ID.
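
The bottom-up rule can be sketched on the 8-node example as below, again as a central simulation. The third bit of each numeric ID, the function name, always walking rings clockwise, and measuring "closest" as the absolute difference of the binary values are all assumptions made for illustration.

```python
# Third bit of each numeric ID is an assumption; the first two follow the ring labels.
NUMERIC_ID = {"A": "000", "T": "001", "M": "010", "X": "011",
              "Z": "100", "O": "101", "D": "110", "V": "111"}

def route_by_numeric_id(source, target, numeric_id=NUMERIC_ID):
    """Return the nodes visited when routing from `source` to numeric ID `target`."""
    path, node = [source], source
    for level in range(len(target)):
        # Current ring: nodes sharing `level` digits with the target, in name order.
        ring = sorted(n for n, nid in numeric_id.items()
                      if nid.startswith(target[:level]))
        wanted = target[:level + 1]
        if not any(numeric_id[n].startswith(wanted) for n in ring):
            # Nobody in this ring matches one more digit: stop at the node
            # whose numeric ID is closest (assumed metric: absolute difference).
            closest = min(ring, key=lambda n: abs(int(numeric_id[n], 2) - int(target, 2)))
            if closest != node:
                path.append(closest)
            return path
        # Walk clockwise around the ring until a node matches one more digit.
        i = ring.index(node)
        while not numeric_id[ring[i]].startswith(wanted):
            i = (i + 1) % len(ring)
            path.append(ring[i])
        node = ring[i]
    return path

print(route_by_numeric_id("A", "101"))
```

With binary IDs about half the nodes of any ring match the next digit, so each walk around a ring is expected to be short.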

Example: Routing by Numeric ID

– Hash(“Foo.c”) = 101

[Figure: the ring hierarchy of the 8-node example (levels L = 0 to L = 3); the lookup for Foo.c starts at node A.]

Routing by Numeric ID

• The same routing tables are used for routing by name ID and by numeric ID

• When Numeric IDs are binary: in each ring R_h, in expectation only 2 nodes are visited before encountering one belonging to the next ring R_{h+1}
  – The number of message hops is O(log N) w.h.p.

Base (k) for Numeric IDs

• If a higher base k>2 is used for Numeric IDs, the routing is O(k log_k N) w.h.p.

• When we increase k: more rings in each level, fewer levels, fewer pointers in routing table, less state but more hops…

• Optimization - dense routing table (R-Table)
  – Normal (sparse) R-Table + k-1 pointers to contiguous nodes in both directions at each level. More state but fewer hops
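
To make the trade-off for the sparse R-Table concrete, the following back-of-the-envelope sketch evaluates the asymptotic expressions with all constants dropped. The function name and the network size of 2^16 nodes are assumptions; the numbers are orders of magnitude, not measurements.

```python
import math

def sparse_rtable_estimates(n_nodes, k):
    """Order-of-magnitude estimates for a sparse R-Table with base-k numeric IDs."""
    levels = math.log(n_nodes) / math.log(k)  # about log_k N ring levels
    return {
        "levels": levels,
        "pointers": 2 * levels,  # one pointer per direction per level
        "hops": k * levels,      # the O(k log_k N) routing bound, constant dropped
    }

for k in (2, 4, 8, 16):
    est = sparse_rtable_estimates(2 ** 16, k)
    print(k, round(est["levels"], 1), round(est["pointers"], 1), round(est["hops"], 1))
```

As k grows the pointer count shrinks while the hop estimate grows, which is the "less state but more hops" observation above.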

Node Join

• Two-stage process: (1) bottom-up + (2) top-down

• Bottom-up: find the top level ring that matches the node’s numeric ID.

• Top-down: build the new node’s routing table
  – Find a neighbor in the top ring using name ID search.
  – Starting from this neighbor, search for the name ID at the next lower level and thus find neighbors at the lower level.
  – Repeated until the search reaches the root.

• Update of the existing nodes’ routing tables:
  – after the new node has joined the root ring.

Node join illustrated

[Figure: a joining node and the rings Ring P, Ring P0 and Ring P1. Callout: “Only a few in expectation”.]

Node Join - Analysis

• Key ideas:
  – Climb to a weakly populated ring.
  – Search for the node’s neighbors at the lower levels only after finding the neighbors at the higher levels.
  – The range of traversed nodes at a level = the range of neighbors at the next higher level.

• Insertion traverses O(log N) hops w.h.p.
  – Expected O(log N) levels, constant number of neighbors at each level.

Node Departure/Failure

• Graceful (notified) vs crash departure

• Key issue: updating the routing tables

• Key idea: separate vital info from optimizations
  – Routing is correct as long as the root level ring is maintained.
  – Other levels are regarded as optimization hints
  – Does this remind you of something?

• Upper-ring membership is maintained through a background repair process.

Leaf Sets

• Idea = use redundant pointers at level 0:
  – Store L/2 pointers in each direction
  – SkipNet uses L=16
  – Not an original SkipNet idea; used in Pastry.

• Protect from independent failures

• Improve the search performance
  – route directly using the leaf set once within L/2 of the target
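
A small sketch of a leaf set on the root ring of the 8-node example, using L=4 instead of SkipNet's L=16 so that the result fits on a slide. The function names and the failure-handling helper are illustrative assumptions, and the ring is assumed to hold more than L nodes.

```python
ROOT_RING = ["A", "D", "M", "O", "T", "V", "X", "Z"]  # sorted by name ID

def leaf_set(node, ring, size):
    """Return (counterclockwise, clockwise) halves of the leaf set, nearest first."""
    i = ring.index(node)
    half = size // 2
    clockwise = [ring[(i + d) % len(ring)] for d in range(1, half + 1)]
    counterclockwise = [ring[(i - d) % len(ring)] for d in range(1, half + 1)]
    return counterclockwise, clockwise

def next_live_neighbor(node, ring, size, failed):
    """First clockwise leaf-set member that has not failed, or None if all have."""
    _, clockwise = leaf_set(node, ring, size)
    for candidate in clockwise:
        if candidate not in failed:
            return candidate
    return None

print(leaf_set("A", ROOT_RING, 4))
print(next_live_neighbor("A", ROOT_RING, 4, failed={"D"}))
```

With L/2 redundant pointers per direction, the root ring stays connected unless L/2 consecutive neighbors fail at once.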

Constrained Load Balancing (CLB)

• Multiple DHTs with differing scopes using a single SkipNet structure
  – A result of the ability to route in both address spaces

• Divide data object names into two parts with “!”: microsoft.com!skipnet.html
  – CLB Domain (microsoft.com): Name Routing
  – CLB Suffix (skipnet.html): Numeric Routing

• microsoft.com/skipnet.html! – controlled placement

• !microsoft.com/skipnet.html – Global DHT

CLB Example

• File ID = “com.microsoft!skipnet.html”
  – Route by name ID to com.microsoft
  – Inside com.microsoft, route by numeric ID to hash(“skipnet.html”)

[Figure: ring with segments com.sun, edu.ucb, gov.irs and com.microsoft; skipnet.html is placed inside the com.microsoft segment.]
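
The two-step lookup can be sketched as follows: split the name at "!", restrict the candidates to nodes under the CLB domain, then pick the node whose numeric ID is closest to the hash of the suffix. The host names, the SHA-1 based 16-bit numeric ID and the absolute-difference distance are assumptions made for illustration; a real lookup would route hop by hop instead of scanning a node list.

```python
import hashlib

def numeric_id(text, bits=16):
    """Illustrative numeric ID: leading bits of a SHA-1 hash, as an integer."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits)

def split_clb_name(name):
    """Split 'domain!suffix' into its CLB domain and CLB suffix."""
    domain, sep, suffix = name.partition("!")
    if not sep:
        raise ValueError("CLB names must contain '!'")
    return domain, suffix

def clb_lookup(name, nodes):
    """Return the node (by name ID) responsible for a CLB name."""
    domain, suffix = split_clb_name(name)
    # Step 1 (name routing): only nodes inside the CLB domain are candidates.
    # An empty domain means every node is a candidate (global DHT).
    candidates = [n for n in nodes
                  if not domain or n == domain or n.startswith(domain + ".")]
    # Step 2 (numeric routing): closest numeric ID to the hashed suffix.
    target = numeric_id(suffix)
    return min(candidates, key=lambda n: abs(numeric_id(n) - target))

# Hypothetical node names, written as reversed DNS names.
NODES = ["com.microsoft.host1", "com.microsoft.host2", "com.microsoft.research.host3",
         "com.sun.host1", "edu.ucb.host1", "gov.irs.host1"]
print(clb_lookup("com.microsoft!skipnet.html", NODES))
print(clb_lookup("!microsoft.com/skipnet.html", NODES))
```

The first lookup can only land on a com.microsoft node, while the second may land anywhere, matching the constrained and global cases.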

SkipNet Path Locality

• Organizations correspond to contiguous SkipNet segments
  – Internal routing by NameID remains internal

• Nodes have left / right pointers

[Figure: ring with segments com.sun, edu.ucb, gov.irs and com.microsoft; com.microsoft.research lies inside the com.microsoft segment.]
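
The contiguity claim follows from sorting: all names that share a prefix are adjacent in lexicographic order. A short sketch, with hypothetical host names and an illustrative helper name:

```python
import bisect

def segment(sorted_names, prefix):
    """Return (lo, hi) such that sorted_names[lo:hi] are the names starting with prefix."""
    lo = bisect.bisect_left(sorted_names, prefix)  # first name >= prefix
    hi = lo
    while hi < len(sorted_names) and sorted_names[hi].startswith(prefix):
        hi += 1
    return lo, hi

# Hypothetical node names (reversed DNS names), sorted as on the root ring.
names = sorted(["com.sun.host1", "com.microsoft.research.host1", "edu.ucb.host1",
                "com.microsoft.host2", "gov.irs.host1", "com.microsoft.host1"])

lo, hi = segment(names, "com.microsoft")
print(names[lo:hi])
print(names[slice(*segment(names, "com.microsoft.research"))])
```

Because routing by name ID between two nodes of the same segment only visits names that lie between them, the message never leaves the organization's segment.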