University of Nevada, Reno Resolving Anonymous Routers Hakan KARDES CS 790g Complex Networks.


Page 1

University of Nevada, Reno

Resolving Anonymous Routers

Hakan KARDES

CS 790g Complex Networks

Page 2

Outline

• Introduction
• Anonymous router resolution
  – Problem
  – Previous approaches
• Anonymity types
• Anonymity resolution via graph-based induction (GBI)
• Conclusions

Page 3

Internet Topology Measurement

• Internet topology measurement studies
  – Involves topology collection / construction / analysis
• Current state of the research activities
  – Distributed topology data collection studies/platforms
    • Skitter, AMP, iPlane, Dimes, DipZoom, …
    • 20M path traces with over 20M nodes
• Issues in topology construction
  1. Verifying accuracy of path traces
  2. IP alias resolution
  3. Subnet inference
  4. Anonymous router resolution

Page 4

Topology Collection (traceroute)

• Probe packets are carefully constructed to elicit the intended response from a probe destination
• traceroute probes all nodes on a path towards a given destination
  – TTL-scoped probes obtain ICMP error messages from routers on the path
  – Each ICMP message carries the IP address of an intermediate router as its source
• Merging end-to-end path traces yields the network map

[Figure: vantage point S probes destination D through routers A, B and C. Probes with TTL=1, 2, 3, 4 return IPA, IPB, IPC and IPD respectively.]
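The probing loop and the merge step described on this slide can be sketched as a small simulation. This is only an illustration under stated assumptions: no packets are sent, and the names `PATH`, `ADDRESS`, `probe`, `traceroute`, `merge_traces` and the extra hop `IP_E` are hypothetical, not taken from the slides.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical forward path seen from vantage point S, as in the figure:
# S -> A -> B -> C -> D, where D is the probe destination.
PATH: List[str] = ["A", "B", "C", "D"]
ADDRESS: Dict[str, str] = {"A": "IP_A", "B": "IP_B", "C": "IP_C", "D": "IP_D"}


def probe(path: List[str], ttl: int) -> str:
    """Simulate one TTL-scoped probe: the hop reached with this TTL answers,
    and the answer carries that hop's IP address as its source."""
    return ADDRESS[path[ttl - 1]]


def traceroute(path: List[str]) -> List[str]:
    """Probe with TTL = 1, 2, ... until the destination itself answers."""
    return [probe(path, ttl) for ttl in range(1, len(path) + 1)]


def merge_traces(traces: List[List[str]]) -> Set[Tuple[str, str]]:
    """Merge path traces into an undirected edge set, i.e. the network map."""
    edges: Set[Tuple[str, str]] = set()
    for trace in traces:
        for u, v in zip(trace, trace[1:]):
            edges.add((min(u, v), max(u, v)))
    return edges


trace = ["S"] + traceroute(PATH)
# A second, made-up trace that shares its first hop with the first one.
network_map = merge_traces([trace, ["S", "IP_A", "IP_E"]])
```

Storing each edge with its endpoints in sorted order means a link seen in both directions is counted once, which is what merging traces into a map requires.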

Page 5

Outline

• Introduction
• Anonymous router resolution
  – Problem
  – Previous approaches
• Anonymity types
• Anonymity resolution via graph-based induction (GBI)
• Conclusions

Page 6

Problem

• Anonymous routers do not respond to traceroute probes and appear as * in traceroute output
  – The same router may appear as a separate * in multiple traces

Traces when L responds:
  y: S – L – H – x
  x: H – L – S – y

Traces when L is anonymous:
  y: S – * – H – x
  x: H – * – S – y

[Figure: the path y – S – L – H – x, and the resulting graph in which L is replaced by two separate anonymous nodes, 1 and 2, between S and H.]

Current daily raw topology data sets include
• ~20 million path traces with
• ~20 million occurrences of *s along with
• ~500K public IP addresses

The raw topology data is far from representing the underlying sampled network topology

Page 7

Problem

[Figure: Internet2 backbone with routers S, L, U, K, C, H, A, W, N and end hosts x, y, z.]

Traces
• x - H - L - S - y
• x - H - A - W - N - z
• y - S - L - H - x
• y - S - U - K - C - N - z
• z - N - C - K - H - x
• z - N - C - K - U - S - y

Page 8

Problem

[Figure: Internet2 backbone with routers S, L, U, K, C, H, A, W, N and end hosts x, y, z.]

Traces (H, K and N are anonymous)
• x - * - L - S - y
• x - * - A - W - * - z
• y - S - L - * - x
• y - S - U - * - C - * - z
• z - * - C - * - * - x
• z - * - C - * - U - S - y

Page 9

Problem

[Figure: two panels. "Sampled network": routers U, K, C, N, L, H, A, W, S with end hosts d, e, f. "Resulting network": known nodes S, U, L, C, A, W and end hosts d, e, f, with every * drawn as a separate node.]

Traces
• d - * - L - S - e
• d - * - A - W - * - f
• e - S - L - * - d
• e - S - U - * - C - * - f
• f - * - C - * - * - d
• f - * - C - * - U - S - e
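To see how far the resulting network drifts from the sampled one, the six traces above can be turned into a graph in which every * is given its own node. The sketch below assumes a whitespace-separated trace encoding with `*` marking an anonymous hop; `TRACES` and `build_raw_graph` are illustrative names, not part of the original material.

```python
from itertools import count
from typing import FrozenSet, List, Set, Tuple

# The six traces of this slide; "*" marks a hop that did not respond.
TRACES: List[str] = [
    "d * L S e",
    "d * A W * f",
    "e S L * d",
    "e S U * C * f",
    "f * C * * d",
    "f * C * U S e",
]


def build_raw_graph(traces: List[str]) -> Tuple[Set[str], Set[FrozenSet[str]]]:
    """Build the unresolved graph: each '*' becomes a fresh node *1, *2, ..."""
    ids = count(1)
    nodes: Set[str] = set()
    edges: Set[FrozenSet[str]] = set()
    for line in traces:
        hops = [f"*{next(ids)}" if h == "*" else h for h in line.split()]
        nodes.update(hops)
        for u, v in zip(hops, hops[1:]):
            edges.add(frozenset((u, v)))
    return nodes, edges


nodes, edges = build_raw_graph(TRACES)
anonymous = {n for n in nodes if n.startswith("*")}
known = nodes - anonymous
```

With this encoding the three anonymous routers of the sampled network (H, K and N) show up as 11 distinct anonymous nodes next to the 9 known ones, which is the inflation the slide illustrates.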

Page 10

Previous Approaches

• Basic heuristics
  – IP: Combine anonymous nodes between same known nodes [Bilir 05]
    • Limited resolution
  – NM: Combine all anonymous neighbors of a known node [Jin 06]
    • High false positives
• More theoretical approaches
  – Graph minimization approach [Yao 03]
    • Combine *s as long as they do not violate two accuracy conditions: (1) trace preservation condition and (2) distance preservation condition
    • High complexity O(n^5), where n is the number of *s
  – ISOMAP based dimensionality reduction approach [Jin 06]
    • Build an n×n distance matrix, then use ISOMAP to reduce it to an n×5 matrix
    • Distance: (1) hop count or (2) link delay
    • High complexity O(n^3), where n is the number of nodes

[Figure: Internet2 example in four panels: "Sampled network", "Resulting network", and two "After resolution" panels.]
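The NM heuristic and its tendency toward false positives can be illustrated on the Internet2 traces from the earlier slides. The following is a rough sketch of "combine all anonymous neighbors of a known node" using a union-find structure; it is not the implementation of [Jin 06], and `TRACES` and `neighbor_matching` are assumed names.

```python
from collections import defaultdict
from itertools import count
from typing import Dict, List, Set

# Internet2 traces with H, K and N anonymous ("*").
TRACES: List[str] = [
    "d * L S e",
    "d * A W * f",
    "e S L * d",
    "e S U * C * f",
    "f * C * * d",
    "f * C * U S e",
]


def neighbor_matching(traces: List[str]) -> List[Set[str]]:
    """Merge all anonymous neighbors of every known node; return the groups."""
    ids = count(1)
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    anon_neighbors: Dict[str, Set[str]] = defaultdict(set)
    for line in traces:
        hops = [f"*{next(ids)}" if h == "*" else h for h in line.split()]
        for u, v in zip(hops, hops[1:]):
            for known, anon in ((u, v), (v, u)):
                if anon.startswith("*"):
                    parent.setdefault(anon, anon)
                    if not known.startswith("*"):
                        anon_neighbors[known].add(anon)

    # Union every group of anonymous nodes that share a known neighbor.
    for group in anon_neighbors.values():
        first, *rest = sorted(group)
        for other in rest:
            parent[find(other)] = find(first)

    clusters: Dict[str, Set[str]] = defaultdict(set)
    for node in parent:
        clusters[find(node)].add(node)
    return list(clusters.values())


groups = neighbor_matching(TRACES)
```

On these traces the heuristic ends with two groups although three routers are anonymous: C is adjacent to both K and N, so every * next to C is folded into one node. That is the kind of false positive noted above.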

Page 11

Outline

• Introduction
• Anonymous router resolution
  – Problem
  – Previous approaches
• Anonymity types
• Anonymity resolution via graph-based induction (GBI)
• Conclusions

Page 12

Anonymity Types

• Type 1: Do not send any ICMP responses
• Type 2: Rate limit ICMP responses
• Type 3: Do not send ICMP responses when congested
• Type 4: Filtered ICMP responses at border routers
• Type 5: ICMP responses with private source IP address
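Of the five types, Type 5 is the one that can be recognized from a single reply, because the source address falls in a private range. A minimal sketch using Python's standard `ipaddress` module follows; the function name `classify_hop` and its return labels are assumptions made for illustration, and the other types cannot be told apart this way.

```python
import ipaddress
from typing import Optional


def classify_hop(reply_source: Optional[str]) -> str:
    """Label one traceroute hop from the source address of its reply.

    None stands for a hop that produced no reply (Types 1 to 4 look the
    same from a single probe); a private source address is Type 5.
    """
    if reply_source is None:
        return "anonymous: no reply"
    if ipaddress.ip_address(reply_source).is_private:
        return "anonymous: private source address"
    return "known"
```

For example, `classify_hop("10.1.2.3")` is treated as anonymous, since a private address does not identify the router uniquely across networks.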

Page 13

Graph Based Induction (GBI) - Our Approach

• Graph based induction
  – A graph data mining technique
    • Finds frequent substructures in graph data
    • Commonly used in mining biological and chemical graph data
• Use of GBI for anonymous router resolution
  – Observe common graph structures due to anonymous routers
  – Develop localized algorithms with manageable computational and storage overhead
  – Trace Preservation Condition
    • Merge anonymous nodes as long as they cause no loops in path traces
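The trace preservation condition can be read as a simple test: two anonymous nodes that occur in the same path trace cannot be the same router, because merging them would make that trace visit one node twice. The sketch below states that reading in code; the labels `*7`, `*8`, `*9` and the helper name `violates_trace_preservation` are hypothetical.

```python
from typing import List, Set


def violates_trace_preservation(traces: List[List[str]], group: Set[str]) -> bool:
    """True if merging all nodes in `group` puts the merged node on some
    trace more than once, i.e. creates a loop in that trace."""
    return any(sum(hop in group for hop in trace) > 1 for trace in traces)


# Two labelled traces; every anonymous hop already has a unique label.
labelled = [
    ["f", "*7", "C", "*8", "*9", "d"],
    ["d", "*1", "L", "S", "e"],
]
same_trace = violates_trace_preservation(labelled, {"*7", "*8"})
different_traces = violates_trace_preservation(labelled, {"*1", "*9"})
```

Here merging `*7` with `*8` is rejected because both lie on the first trace, while `*1` and `*9` come from different traces and may be merged.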

Page 14

Common Structures

[Figure: examples of the four common structures: Parallel *-substring, Star, Complete Bipartite, Clique.]

Page 15

Parallel *-substring

• Algorithm
  – For each *-substring (a,i,c), represent it as a tuple (a||c, i)
    • a||c is the tuple identifier and a<c
  – Read path traces and build the sorted list L of such tuples
  – Subsequently read tuples are compared to the ones in the list based on tuple identifiers, and duplicates are excluded from L
• Handling anonymity due to ICMP rate limiting or congestion
  – A second scan of path traces looks for substrings of the form (a,b,c) corresponding to (a,i,c) in L

[Figure: nodes a, b and c illustrating a *-substring between a and c and the matching substring (a,b,c).]
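The two scans above can be sketched as follows. Two things here are assumptions rather than the slide's method: a dictionary keyed by the ordered pair (a, c) stands in for the sorted list L, and a match in the second scan is interpreted as resolving the anonymous node to the known node b. `TRACES` and `resolve_parallel` are illustrative names.

```python
from itertools import count
from typing import Dict, List, Tuple

TRACES: List[str] = [
    "d * L S e",
    "d * A W * f",
    "e S L * d",
    "e S U * C * f",
    "f * C * * d",
    "f * C * U S e",
]


def resolve_parallel(traces: List[str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (merged, resolved): anonymous node -> representative, and
    representative -> known node found between the same pair (a, c)."""
    ids = count(1)
    labelled = [[f"*{next(ids)}" if h == "*" else h for h in line.split()]
                for line in traces]

    first_seen: Dict[Tuple[str, str], str] = {}  # identifier a||c with a < c
    merged: Dict[str, str] = {}
    for hops in labelled:
        for a, i, c in zip(hops, hops[1:], hops[2:]):
            if i.startswith("*") and not a.startswith("*") and not c.startswith("*"):
                key = (min(a, c), max(a, c))
                # A duplicate identifier maps to the node already stored.
                merged[i] = first_seen.setdefault(key, i)

    # Second scan: a known b between the same a and c (rate limiting or
    # congestion may have hidden b in the other trace).
    resolved: Dict[str, str] = {}
    for hops in labelled:
        for a, b, c in zip(hops, hops[1:], hops[2:]):
            key = (min(a, c), max(a, c))
            if not b.startswith("*") and key in first_seen:
                resolved[first_seen[key]] = b
    return merged, resolved


merged, resolved = resolve_parallel(TRACES)
_, rate_limited = resolve_parallel(["x a * c y", "x a b c y"])
```

On the Internet2 traces this step alone folds nine of the eleven anonymous nodes into five representatives and merges nothing incorrectly, but it leaves the two adjacent anonymous hops of the fifth trace untouched, in line with the "limited resolution" noted for this kind of heuristic.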

Page 16

Clique

• Generate a new graph G* = (V*,E*)
  – For each *-substring of type (a, e, b),
    • V* ← V* ∪ {a, b}
    • E* ← E* ∪ {e(a,b)}
• First identify 4-cliques and grow them by adding nodes that are connected to at least 4 nodes of the structure
  – Helps in tolerating a few missing links in large cliques
• Then, process all 3-cliques

[Figure: nodes a, c, d and e forming a clique.]
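A brute-force sketch of these steps is given below: build G* from the *-substrings, enumerate 4-cliques, and grow one by adding nodes with at least four links into it. It is meant for small examples only and makes no claim about how the original work implements the search; the toy data and the names `build_gstar`, `find_cliques`, `grow` and `anonymous_nodes_in` are assumptions, and the trace preservation check is left out for brevity.

```python
from collections import defaultdict
from itertools import combinations, count
from typing import Dict, FrozenSet, Iterable, List, Set

GStar = Dict[FrozenSet[str], Set[str]]


def build_gstar(traces: Iterable[str]) -> GStar:
    """Map each G* edge {a, b} to the anonymous nodes seen between a and b."""
    ids = count(1)
    gstar: GStar = defaultdict(set)
    for line in traces:
        hops = [f"*{next(ids)}" if h == "*" else h for h in line.split()]
        for a, e, b in zip(hops, hops[1:], hops[2:]):
            if e.startswith("*") and not a.startswith("*") and not b.startswith("*"):
                gstar[frozenset((a, b))].add(e)
    return gstar


def find_cliques(gstar: GStar, size: int) -> List[Set[str]]:
    """Enumerate cliques of the given size in G* by exhaustive search."""
    vertices = sorted({v for edge in gstar for v in edge})
    return [set(c) for c in combinations(vertices, size)
            if all(frozenset(p) in gstar for p in combinations(c, 2))]


def grow(gstar: GStar, clique: Set[str], min_links: int = 4) -> Set[str]:
    """Add nodes linked to at least `min_links` members of the structure."""
    vertices = {v for edge in gstar for v in edge}
    grown = set(clique)
    changed = True
    while changed:
        changed = False
        for v in sorted(vertices - grown):
            if sum(frozenset((v, u)) in gstar for u in grown) >= min_links:
                grown.add(v)
                changed = True
    return grown


def anonymous_nodes_in(gstar: GStar, structure: Set[str]) -> Set[str]:
    """Anonymous nodes on G* edges inside the structure (merge candidates)."""
    pairs = combinations(sorted(structure), 2)
    return set().union(*(gstar.get(frozenset(p), set()) for p in pairs))


# Toy data: one anonymous router with neighbors a..f; the e-f pair was never traced.
pairs = list(combinations("abcde", 2)) + [("f", x) for x in "abcd"]
gstar = build_gstar(f"{u} * {v}" for u, v in pairs)
seed = find_cliques(gstar, 4)[0]
grown = grow(gstar, seed)
candidates = anonymous_nodes_in(gstar, grown)
```

In the toy data the seed {a, b, c, d} grows to all six neighbors even though the e-f link is missing, which shows the tolerance for a few missing links mentioned above.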

Page 17

Complete Bipartite

• First search for a small complete bipartite structure, i.e., K2,3, in G* and then grow it to a larger one
  – Take each pair of nodes and check whether they are in a K2,3
  – After identifying a K2,3, look for larger complete bipartite graphs K2,m and then Kn,m that contain the identified K2,3
• Then, process all K2,2's

[Figure: a structure over nodes A, C, D, E and F shown in G and in G*.]
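The pairwise search can be sketched with common-neighbor sets: a pair of nodes with at least three common neighbors in G* sits in a K2,3, all of their common neighbors give K2,m, and every node adjacent to all m of those gives Kn,m. The edge list and the function names below are illustrative assumptions, not data from the slides.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Bipartite = Tuple[FrozenSet[str], FrozenSet[str]]


def adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map for G*."""
    adj: Dict[str, Set[str]] = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def find_complete_bipartite(adj: Dict[str, Set[str]]) -> Set[Bipartite]:
    """Find K2,3 seeds and grow each one to K2,m and then Kn,m."""
    found: Set[Bipartite] = set()
    for u, v in combinations(sorted(adj), 2):
        common = adj[u] & adj[v]
        if len(common) >= 3:                 # u and v are in a K2,3
            right = frozenset(common)        # grow to K2,m
            left = frozenset(w for w in adj if right <= adj[w])  # grow to Kn,m
            found.add((left, right))
    return found


# Hypothetical G* edges: A and C are each linked to D, E and F.
edges = [("A", "D"), ("A", "E"), ("A", "F"), ("C", "D"), ("C", "E"), ("C", "F")]
structures = find_complete_bipartite(adjacency(edges))
```

Because G* has no self-loops, no node of the right side can be adjacent to all of the right side, so the two sides stay disjoint.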

Page 18

Star

• Combine anonymous neighbors of a known node under the trace preservation condition
  – Starting from the known nodes with the smallest number of anonymous neighbors

[Figure: star structure around a known node, with nodes A, C, D, E and end hosts w, y, z.]

Note: Operate on G and not on G*
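A greedy version of the star step, run on the Internet2 traces from the earlier slides, might look like the sketch below. It works on the trace graph G itself, visits known nodes in order of increasing number of anonymous neighbors, and merges two groups only if no trace would then contain the merged node twice. The pairwise merge order and the names `TRACES` and `star_merge` are assumptions of this sketch.

```python
from collections import defaultdict
from itertools import combinations, count
from typing import Dict, FrozenSet, List, Set

TRACES: List[str] = [
    "d * L S e",
    "d * A W * f",
    "e S L * d",
    "e S U * C * f",
    "f * C * * d",
    "f * C * U S e",
]


def star_merge(traces: List[str]) -> Set[FrozenSet[str]]:
    """Merge anonymous neighbors of known nodes under trace preservation."""
    ids = count(1)
    labelled = [[f"*{next(ids)}" if h == "*" else h for h in line.split()]
                for line in traces]

    cluster: Dict[str, Set[str]] = {}            # anonymous node -> its group
    anon_neighbors: Dict[str, Set[str]] = defaultdict(set)
    for hops in labelled:
        for h in hops:
            if h.startswith("*"):
                cluster.setdefault(h, {h})
        for u, v in zip(hops, hops[1:]):
            for known, anon in ((u, v), (v, u)):
                if anon.startswith("*") and not known.startswith("*"):
                    anon_neighbors[known].add(anon)

    def preserves_traces(group: Set[str]) -> bool:
        # The merged node may appear at most once in every trace.
        return all(sum(h in group for h in hops) <= 1 for hops in labelled)

    # Known nodes with the fewest anonymous neighbors are handled first.
    order = sorted(anon_neighbors, key=lambda k: (len(anon_neighbors[k]), k))
    for known in order:
        for a, b in combinations(sorted(anon_neighbors[known]), 2):
            if cluster[a] is cluster[b]:
                continue
            candidate = cluster[a] | cluster[b]
            if preserves_traces(candidate):
                for node in candidate:
                    cluster[node] = candidate
    return {frozenset(group) for group in cluster.values()}


groups = star_merge(TRACES)
```

On these traces the condition blocks the merge of K and N around C, and the eleven anonymous nodes end up in three groups, one per anonymous router in the sampled network.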

Page 19

Outline

• Introduction
• Anonymous router resolution
  – Problem
  – Previous approaches
• Anonymity types
• Anonymity resolution via graph-based induction (GBI)
• Conclusions

Page 20

Summary

[Figure: example with nodes A, C, D and E in four panels: "Underlying", "Collected", "GBI", "Neighbor Matching".]

• Router responsiveness reduced in the last decade
• NP-hard problem
• Graph Based Induction Technique
  – Practical approach for anonymous router resolution
  – Identifies common structures
  – Handles all anonymity types
  – Helpful in resolving multiple anonymous routers in a locality
  – Uses subnet info to reduce the false positives

Page 21

References

• M. H. Gunes and K. Sarac. Resolving anonymous routers in internet topology measurement studies. In IEEE INFOCOM, Apr. 2008.
• S. Bilir, K. Sarac, and T. Korkmaz. Intersection characteristics of end-to-end Internet paths and trees. IEEE International Conference on Network Protocols (ICNP), Boston, MA, USA, November 2005.
• A. Broido and K. Claffy. Internet topology: Connectivity of IP graphs. Proceedings of SPIE ITCom Conference, Denver, CO, USA, August 2001.
• B. Cheswick, H. Burch, and S. Branigan. Mapping and visualizing the Internet. ACM USENIX, San Diego, CA, USA, June 2000.
• B. Yao, R. Viswanathan, F. Chang, and D. Waddington. Topology inference in the presence of anonymous routers. IEEE INFOCOM, San Francisco, CA, USA, March 2003.
• P. Tan, M. Steinbach, and V. Kumar. Introduction to data mining. Addison-Wesley, Reading, MA, USA, 2005.
• X. Jin, W.-P. K. Yiu, S.-H. G. Chan, and Y. Wang. Network topology inference based on end-to-end measurements. IEEE Journal on Selected Areas in Communications, special issue on Sampling the Internet, 24(12):2182-2195, Dec. 2006.
• D. Cook and L. Holder. Mining graph data. John Wiley & Sons, 2006.
• T. Matsuda, H. Motoda, and T. Washio. Graph-based induction and its applications. Advanced Engineering Informatics, 16(2):135-143, April 2002.
• M. Kuramochi and G. Karypis. Frequent subgraph discovery. First IEEE International Conference on Data Mining (ICDM'01), pp. 313, 2001.
• M. Kuramochi and G. Karypis. An efficient algorithm for discovering frequent subgraphs. IEEE Transactions on Knowledge and Data Engineering, 16(9):1038-1051, September 2004.
• A. Inokuchi, T. Washio, and H. Motoda. Complete mining of frequent patterns from graphs: Mining graph data. Mach. Learn., 50(3):321-354, Mar. 2003. DOI=http://dx.doi.org/10.1023/A:1021726221443
• A. Inokuchi, T. Washio, and H. Motoda. A general framework for mining frequent subgraphs from labeled graphs. Fundam. Inf., 66(1-2):53-82, Nov. 2004.

Page 22

QUESTIONS