Optimization-Based Approaches to Understanding and Modeling Internet Topology

80
Optimization-Based Approaches to Understanding and Modeling Internet Topology David Alderson California Institute of Technology INFORMS Telecom Conference 2004 March 8, 2004

description

Optimization-Based Approaches to Understanding and Modeling Internet Topology. David Alderson California Institute of Technology INFORMS Telecom Conference 2004 March 8, 2004. Acknowledgments. - PowerPoint PPT Presentation

Transcript of Optimization-Based Approaches to Understanding and Modeling Internet Topology

Page 1: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Optimization-Based Approaches to Understanding and Modeling

Internet Topology

David AldersonCalifornia Institute of Technology

INFORMS Telecom Conference 2004March 8, 2004

Page 2: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Acknowledgments• This talk represents joint work with John Doyle

(Caltech), Lun Li (Caltech), and Walter Willinger (AT&T—Research)

• Many others have contributed to this story– Reiko Tanaka (Caltech)

– Steven Low (Caltech)

– Ramesh Govindan (USC)

– Neil Spring (U.Washington)

– Stanislav Shalunov (Abilene)

– Heather Sherman (CENIC)

– John Dundas (Caltech)

Page 3: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Today’s Agenda• Review recent work of empiricists and theoreticians to

understand the router-level topology of the Internet• Understand the causes and implications of “heavy tails”

in the complex structure of the Internet• Illustrate how recently popular “scale-free” models of

Internet topology are not just wrong, but wildly so• Describe the importance of optimization in the

development of explanatory models of Internet topology• Present the HOT framework as an alternative means to

understanding the “robust, yet fragile” structure of the Internet and other complex engineering systems

• Highlight some open research problems and areas where contributions can be made by the OR/MS community

Page 4: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

First Question:

Why should we care about modeling

the topology of the Internet?

Page 5: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Understanding the topology of the current (and future) Internet is important for many reasons

• Design and evaluation of networking protocols– Topology affects performance, not correctness

• Understanding large-scale network behavior– Closed-loop feedback: topology design vs. protocols

• Is the current design a result of the dominant routing protocols?

• Or are the presently used routing protocols the result of some prevailing network design principles?

– Ability to study what-if scenarios• Operating policy shifts

• Economic changes in the ISP market

• Implications for tomorrow’s networks– Provisioning requirements, Traffic engineering, Operating

and management policies

Page 6: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Performance Evaluation via Simulation

networksimulator(e.g. ns2)

• application-specific

traffic demandinfo

protocolinfo

performancemeasures

networktopology

info• connectivity• bandwidths• delays

(traditional) topology generators provide only connectivity

information!

Page 7: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Performance Evaluation via Simulation

networksimulator(e.g. ns2)

traffic demandinfo

protocolinfo

performancemeasures

• connectivity• bandwidths• delays

annotated networkgraph!

Page 8: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

The Internet as an InfrastructureAs the Internet has grown in capability and importance,

we have become increasingly dependent on it.• Directly: communication (email, instant messenger,

VOIP), information and entertainment, e-commerce• Indirectly: business, education, government have

(permanently) replaced physical/manual methods with electronic processes, many of which rely on the Internet.

The central importance, open architecture, and evolving technology landscape make the Internet an attractive target for asymmetric attack.

Page 9: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

The Internet as a Case StudyThe Internet is a great starting point for the study of

other highly engineered complex network systems:• To the user, it creates the illusion of a simple, robust,

homogeneous resource enabling endless varieties and types of technologies, physical infrastructures, virtual networks, and applications (heterogeneous).

• Its complexity is starting to approach that of simple biological systems

• Our understanding of the underlying technology together with the ability to perform detailed measurements means that most conjectures about its large-scale properties can be unambiguously resolved, though often not without substantial effort.

Page 10: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Question Two:

Why is research on Internet topology interesting/difficult?

Page 11: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

A Challenging ProblemSince the decommissioning of the NSFNet in 1995, it has

been difficult to obtain comprehensive knowledge about the topology of the Internet

• The network has grown dramatically (number of hosts, amount of traffic, number of ISPs, etc.)

• There have been economic incentives for ISPs to maintain secrecy about their topologies

• Direct inspection usually not allowed

Page 12: Optimization-Based Approaches  to Understanding and Modeling Internet Topology
Page 13: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Qwest US Fiber Map – June 2001

(to Europe)

(to Hawaii)

(to Japan)

Source: www.qwest.com

Page 14: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Sprint backbone

Page 15: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Measuring Topology• Marketing documents not very helpful• The task of “discovering” the network has been left to

experimentalists – Must develop sophisticated methods to infer this topology

from appropriate network measurements.

– Many possible measurements that can be made.

Page 16: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Router-Level Topology

Hosts

Routers

• Nodes are machines (routers or hosts) running IP protocol

• Measurements taken from traceroute experiments that infer topology from traffic sent over network

• Subject to sampling errors and bias

• Requires careful interpretation

Page 17: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

AS Topology• Nodes are entire

networks (ASes)• Links = peering

relationships between ASes

• Relationships inferred from Border Gateway Protocol (BGP) information

• Really a measure of business relationships, not network structure

AS1

AS3

AS4

AS2

Page 18: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Measuring Topology• Marketing documents not very helpful• The task of “discovering” the network has been left to

experimentalists – Must develop sophisticated methods to infer this topology

from appropriate network measurements.

– Many possible measurements that can be made.

– Each type of measurement has its own strengths, weaknesses, and idiosyncrasies, and results in a distinct view of the network topology.

• Hard to know what “matters”…

Page 19: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

“Standard” Approach1. Choose a sequence of well-understood metrics or

observed features of interest, such as – hierarchy– node-degree distributions – clustering coefficients

2. Develop a method that matches these metrics

Pros:– Always possible to obtain a good “fit” on a chosen metric.

Cons:– Hard to choose the “right” metric. What is “right” is apt to vary,

depending on the intended use of the topology. – A method that does a good job of matching the chosen metric often

does not fit other metrics well.– No predictive power.

We call this approach descriptive (evocative) modeling.

Page 20: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

An Alternative Approach1. Identify the causal forces at work in the design and

evolution of real topologies.

2. Develop methods that generate and evolve topologies in a manner consistent with these forces.

Pros: – Ability to generate (and design!) topologies at different levels of

hierarchy

– More realistic topology

– Greater predictive power

– Possibly reveal some relationship with routing protocols

Cons: – Difficult to identify the causal forces

– Requires careful development and diligent validation

We call this approach explanatory modeling.

Page 21: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Our Approach: Focus on the ISP

• Capture and represent realistic drivers of Internet deployment and operation at the level of the single ISP

• Many important networking issues are relevant at the level of the ISP (e.g. configuration, management, pricing, provisioning)

• Common topologies represented in terms of ISPs

– Router-level graphs connectivity within the ISP

– AS graphs connectivity between ISPs

• First-Order Objective: the ability to generate a “realistic, but fictitious” ISP topology at different levels of hierarchy

Page 22: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

ISP Driving Forces• Economic Factors:

– Cost of procuring, installing, and maintaining the necessary facilities and equipment

– Limited budget for capital expenditures

– Need to balance expenditures with revenue streams

– Need to leverage investment in existing infrastructure

– Location of customers

• Technological Factors:– Hardware constraints (e.g. router speeds, limited # interfaces

or line cards per router)

– Level 2 Technologies (Sonet, ATM, WDM)

– Existing legacy infrastructure

– Location and availability of dark fiber

Page 23: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Mathematical Framework1. Use combinatorial optimization to represent the problem and

its constraints– Objectives (min cost, max profitability, satisfy demand)

– Constraints (equipment costs/capacities, legacy infrastructure)

– Parameters (pricing, provisioning, facility location)

2. Study and explore how this framework allows for a range of ISP behavior

– Effect of objectives and constraints are the most important

3. Part of a more general framework– Highly Optimized Tolerance (HOT), Carlson and

Doyle, 1999

Page 24: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Question Three:

How is research on Internet topology different from what OR/MS researchers are used to doing

on other complex engineering networks?

Page 25: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

A Different Type of Problem• Researchers in network optimization are used to

problems in which the objectives and constraints are well-defined

• Here, we are trying to uncover the “most significant” drivers of topology evolution so that we can create “fictitious, yet realistic” network counterparts, and also so that we can design and build improved networks

• We will need to iterate between modeling, measurement, and analysis to get it right

• In this sense, this is a bit more like biology• Recent progress has given hope that a comprehensive

theory for the Internet may be possible

Page 26: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Heterogeneity of the Internet• Full of “high variability”

– Link bandwidth: Kbps – Gbps– File sizes: a few bytes – Mega/Gigabytes– Flows: a few packets – 100,000+ packets– In/out-degree (Web graph): 1 – 100,000+– Delay: Milliseconds – seconds and beyond

• Diversity in the technologies that comprise the physical and link layers

• Diversity in the applications and services that are supported

• This heterogeneity has evolved organically from an architecture that was designed to be robust to changes (failures or innovation) and is permanent

Page 27: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

The Internet hourglass

IP

Web FTP Mail News Video Audio ping Kazaa

Applications

TCP SCTP UDP ICMP

Transport protocols

Ethernet 802.11 SatelliteOpticalPower lines BluetoothATM

Link technologies

Page 28: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

IP

The Internet hourglass

Web FTP Mail News Video Audio ping Kazaa

Applications

TCP

Ethernet 802.11 SatelliteOpticalPower lines BluetoothATM

Link technologies

Page 29: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

The Internet hourglass

IP

Web FTP Mail News Video Audio ping Kazaa

Applications

TCP

Ethernet 802.11 SatelliteOpticalPower lines BluetoothATM

Link technologies

Everythingon IP

IP oneverything

Page 30: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

IP

TCP/AQM

Applications

Link

A Theory for the Internet?

• Vertical decomposition of the protocol stack allows for the treatment of layers in isolation (a separation theorem)

• Assume that layers not considered perform in a near-optimal manner

• Use an engineering design-based perspective in two ways:

– Analysis: explain the complex structure that is observed

– Synthesis: suggest changes or improvements to the current design

Page 31: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

IP

TCP/AQM

Applications

Link

A Theory for the Internet?

How to design an application that “performs

well” in meeting user demands subject to the

resources/constraints made available by TCP/IP?

Page 32: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

IP

TCP/AQM

Applications

Link

A Theory for the Internet?

How to design a network that “performs well” and satisfies traffic demands subject to the physical resources/constraints?

Page 33: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

IP

TCP/AQM

Applications

Link

A Theory for the Internet?

If TCP/AQM is the answer, what is the

question?

Primal/dual model of TCP/AQM congestion control…

cRx

xUs

ssxs

subject to

)( max0

gives

gives

??

Page 34: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

IP

TCP/AQM

Applications

Link

A Theory for the Internet?

??If the current topology of the Internet is the answer,

what is the question?

Page 35: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Next Question:

What’s been done to try and understand the large-scale structure of the Internet?

How should we think about the Internet’s router-level topology?

Page 36: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Trends in Topology ModelingObservation Modeling Approach

• Real networks are not random, but have obvious hierarchy (router-level).

• Structural models (GT-ITM Calvert/Zegura, 1996) generate connectivity-only topologies with inherent hierarchy

• Long-range links are expensive (router-level).

• Random graph models (Waxman, 1988) generate connectivity-only topologies

• Router-level and AS graphs exhibit heavy-tailed distributions (power laws) in characteristics such as node-degree.

• Degree-based models (including popular “scale-free” models) generate connectivity-only topologies with inherent power laws in node degree distribution

Page 37: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Power Laws and Internet Topology

Source: Faloutsos et al (1999)

Observed scaling in node degree and other statistics:– Autonomous System (AS) graph– Router-level graph

How to account for high variability in node degree?

Most nodes have few connections

A few nodes have lots of connectionsn

um

ber

of

con

nec

tion

s

rank rank

Page 38: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Power Laws in Topology Modeling• Recent emphasis has been on whether or not a

given topology model/generator can reproduce the same types of macroscopic statistics, especially power law-type degree distributions

• Lots of degree-based models have been proposed– All of them are based on random graphs, usually with

some form of preferential attachment

– All of them are connectivity-only models and tend to ignore engineering-specific system details

• Examples: BRITE, INET, Barabasi-Albert, GLP, PLRG, CMU-generator

Page 39: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Models of Internet Topology• These topology models are merely descriptive

– Measure some feature of interest (connectivity)– Develop a model that replicates that feature– Make claims about the similarity between the real system

and the model– A type of “curve fitting”?

• Unfortunately, by focusing exclusively on node degree distribution, these models that get the story wrong

• We seek something that is explanatory– Consistent with the drivers of topology design and

deployment– Consistent with the engineering-related details– Can be verified through the measurement of appropriate

system-specific details

Page 40: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Our Perspective• Must consider the explicit design of the Internet

– Protocol layers on top of a physical infrastructure– Physical infrastructure constrained by technological

and economic limitations– Emphasis on network performance– Critical role of feedback at all levels

• Consider the ability to match large scale statistics (e.g. power laws) as secondary evidence of having accounted for key factors affecting design

Page 41: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Trends in Topology ModelingObservation Modeling Approach

• Real networks are not random, but have obvious hierarchy (router-level).

• Structural models (GT-ITM Calvert/Zegura, 1996) generate connectivity-only topologies with inherent hierarchy

• Long-range links are expensive (router-level).

• Random graph models (Waxman, 1988) generate connectivity-only topologies

• Router-level and AS graphs exhibit heavy-tailed distributions (power laws) in characteristics such as node-degree.

• Degree-based models (including popular “scale-free” models) generate connectivity-only topologies with inherent power laws in node degree

• Physical networks have hard technological (and economic) constraints.

• Optimization-driven models generate annotated topologies consistent with design tradeoffs of network engineers

Page 42: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

HOTHighly HeavilyHeuristically

Optimized Organized

Tolerance Tradeoffs

• Based on ideas of Carlson and Doyle (1999)• Complex structure (including power laws) of highly

engineered technology (and biological) systems is viewed as the natural by-product of tradeoffs between system-specific objectives and constraints

• Non-generic, highly engineered configurations are extremely unlikely to occur by chance

• Result in “robust, yet fragile” system behavior

Page 43: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Heuristic Network DesignWhat factors dominate network design?

• Economic constraints – User demands – Link costs– Equipment costs

• Technology constraints – Router capacity– Link capacity

Page 44: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

1e-1

1e-2

1

1e1

1e2

1e3

1e4

1e21 1e4 1e6 1e8

Rank (number of users)

Dial-up~56Kbps

BroadbandCable/DSL~500Kbps

Ethernet10-100Mbps

POS/Ethernet1-10Gbps

Con

nec

tion

Sp

eed

(M

bp

s)

most users have low speed

connections

a few users have very high speed

connections

high performancecomputing

academic and corporate

residential and small business

Internet End-User Bandwidths

How to build a network that

satisfies these end user demands?

Page 45: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Economic Constraints• Network operators have a limited budget to

construct and maintain their networks

• Links are tremendously expensive

• Tremendous drive to operate network so that traffic shares the same links– Enabling technology: multiplexing– Resulting feature: traffic aggregation at edges– Diversity of technologies at network edge (Ethernet,

DSL, broadband cable, wireless) is evidence of the drive to provide connectivity and aggregation using many media types

Page 46: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Heuristically Optimal Network

Hosts

Edges

CoresMesh-like core of fast,

low degree routers

High degree nodes are at the edges.

Page 47: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Heuristically Optimal NetworkClaim: economic considerations alone suggest

a structure having– Mesh-like core of high-speed, low degree routers– High degree, low-speed nodes at the edge

• Is this consistent with technology capability?

• Is this consistent with real network design?

Page 48: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Cisco 12000 Series Routers

Chassis Rack size SlotsSwitching Capacity

12416 Full 16 320 Gbps

12410 1/2 10 200 Gbps

12406 1/4 6 120 Gbps

12404 1/8 4 80 Gbps

• Modular in design, creating flexibility in configuration.

• Router capacity is constrained by the number and speed of line cards inserted in each slot.

Source: www.cisco.com

Page 49: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Cisco 12000 Series RoutersTechnology constrains the number and capacity of line cards that can be installed, creating a feasible region.

Page 50: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Cisco 12000 Series RoutersPricing info: State of Washington Master Contract, June 2002

(http://techmall.dis.wa.gov/master_contracts/intranet/routers_switches.asp)

$602,500

$381,500

$212,400

$128,500

$2,762,500

$1,667,500

$932,400

$560,500

Page 51: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

bandwidth

degree1 16

10Gb

155Mb

256log/log

625Mb

2.5GbTechnically

feasible

160Gb

Technological advance

Page 52: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Technologically Feasible Region

0.01

0.1

1

10

100

1000

10000

100000

1000000

1 10 100 1000 10000

degree

Ban

dw

idth

(M

bp

s) cisco 12416

cisco 12410

cisco 12406

cisco 12404

cisco 7500

cisco 7200

cisco 3600/3700

cisco 2600

linksys 4-port router

uBR7246 cmts(cable)cisco 6260 dslam(DSL)cisco AS5850(dialup)

Edge Shared media(LAN, DSL,

Cable, Wireless,Dial-up)

Corebackbone

High-end gateways

Older/cheapertechnology

Page 53: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

SOX

SFGP/AMPATH

U. Florida

U. So. Florida

Miss StateGigaPoP

WiscREN

SURFNet

Rutgers

MANLAN

NorthernCrossroads

Mid-AtlanticCrossroads

Drexel

U. Delaware

PSC

NCNI/MCNC

MAGPI

UMD NGIX

DARPABossNet

GEANT

Seattle

Sunnyvale

Los Angeles

Houston

Denver

KansasCity

Indian-apolis

Atlanta

Wash D.C.

Chicago

New York

OARNET

Northern LightsIndiana GigaPoP

MeritU. Louisville

NYSERNet

U. Memphis

Great Plains

OneNetArizona St.

U. Arizona

Qwest Labs

UNM

OregonGigaPoP

Front RangeGigaPoP

Texas Tech

Tulane U.

North TexasGigaPoP

TexasGigaPoP

LaNet

UT Austin

CENICUniNet

WIDE

AMES NGIX

OC-3 (155 Mb/s)OC-12 (622 Mb/s)GE (1 Gb/s)OC-48 (2.5 Gb/s)OC-192/10GE (10 Gb/s)

Abilene BackbonePhysical Connectivity(as of December 16, 2003)

PacificNorthwestGigaPoP

U. Hawaii

PacificWave

ESnet

TransPAC/APAN

Iowa St.

Florida A&MUT-SWMed Ctr.

NCSA

MREN

SINet

WPI

StarLight

IntermountainGigaPoP

Page 54: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Cisco 750X

Cisco 12008

Cisco 12410

dc1dc2

dc3

hpr

dc1

dc3

hpr

dc2dc1

dc1dc2

hprhpr

SACOAK

SVL

LAX

SDG

SLOdc1

FRGdc1

FREdc1

BAKdc1

TUSdc1

SOLdc1

CORdc1

hprdc1

dc2

dc3

hpr

OC-3 (155 Mb/s)OC-12 (622 Mb/s)GE (1 Gb/s)OC-48 (2.5 Gb/s)10GE (10 Gb/s)

CENIC Backbone (as of January 2004)

AbileneSunnyvale

AbileneLos Angeles

Backbone topologies for both Abilene and CENIC are built as a mesh of high speed, low degree routers.

As one moves from the core out toward the edge, connectivity gets higher, and speeds get lower.

Corporation for Education Network Initiatives in California (CENIC) runs the educational backbone for the State of California

Page 55: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

dc1dc2

dc3

hpr

dc1

dc3

hpr

LAX

SDG

SLOdc1

BAKdc1

TUSdc1

hpr

Chaffey, Crafton Hills, Cypress, Fullerton CC,

Mt. San Jacinto, Rio Hondo, Riverside, San Bernardino CCD, San Bernardino Valley, N.

Orange Cty CCD, Santa Ana College

Chaffey Joint USD

LA USD

UC Irvine

UC San Diego

SDSC

UCLA

UC Riverside

San Diego CC, Soutwestern CC,

Grossmont, Cuyamaca, Imperial Valley, Mira Costa CC, Palomar

CollegeSan Diego COE Johnson & Johnson

CUDI Peer,ESNet Peer

LosNettos

LAAP

UCSSN(Las Vegas)

UC SantaBarbara

MonroviaUSD Gigaman

Antelope Valley CC, Cerritos, Citrus, College of

the Canyons, Compton, East LA, El Camino CC, Glendale, Long Beach City College, Pasadena

CC, Santa Monica, Ventura

College

LA CCD, LA City, LA Harbor, LA Mission, LA Pierce, LA Southwest, LA Trade Tech,

LA Valley, Moorpark, Mt. San Antonio, Oxnard

Los Angeles COE

San Bernardino CSS

Riverside COE

Orange COE

Caltech

Abilene

to Soledad

to Sunnyvale

to Sacramento

to Fremont

CENIC Backbone for Southern California

Page 56: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Heuristically Optimal Network• Mesh-like core of high-speed, low degree routers• High degree, low-speed nodes at the edge

• Claim: consistent with drivers of topology design– Economic considerations (traffic aggregation)

– End user demands

• Claim: consistent with technology constraints• Claim: consistent with real observed networks

Question: How could anyone imagine anything else?

Page 57: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Two opposite views of complexity

Physics:• Pattern formation by

reaction/diffusion• Edge-of-chaos• Order for free• Self-organized criticality• Phase transitions• Scale-free networks• Equilibrium, linear• Nonlinear, heavy tails as

exotica

Engineering and math:• Constraints• Tradeoffs• Structure • Organization• Optimality• Robustness/fragility• Verification• Far from equilibrium• Nonlinear, heavy tails

as tool

Principle Difference:

Random vs. Designed

Page 58: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Models of Internet Topology

• To physicists, scaling relationships are suggestive of critical phenomenon and phase transitions

• The starting assumption is that of randomness, and one looks for emergent behaviors

Page 59: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Random NetworksTwo methods for generating random networks having

power law distributions in node degree• Preferential attachment (“scale-free” networks)

– Inspired by statistical physics– Barabasi et al.; 1999

• Power Law Random Graph (PLRG)– Inspired by graph theory– Aiello, Chung, and Lu; 2000

Page 60: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Summary of “Scale-Free” Story• Fact: “Scale-free” networks have roughly power law

degree distributions• Claim:

– If the Internet has power law degree distribution

– Then it must be “scale-free” (oops)

– Therefore, it has the properties of a “scale-free” network

• Characteristic features of “scale free” networks– High degree central “hubs”

– Network connectivity is robust to loss of random nodes, but fragile to attack on central hubs

– Highly likely to result from various random constructions

Page 61: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

One ofthe most-read papers ever on

the Internet!

Page 62: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Scientists spot Achilles heel of the Internet

• "The reason this is so is because there are a couple of very big nodes and all messages are going through them. But if someone maliciously takes down the biggest nodes you can harm the system in incredible ways. You can very easily destroy the function of the Internet."

• These scientists compared the structure of the Internet to the airline network of the United States.

Page 63: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Key Points• The scale-free story is based critically on the implied

relationship between power laws and a network structure that has highly connected “central hubs”

• Not all networks with power law degree distributions have properties of scale free networks. (The Internet is just one example!)

• Building a model to replicate power law data is no more than curve fitting (descriptive, not explanatory)

• The ubiquity of heavy-tailed (power-law) relationships in highly variable phenomena is to be expected for statistical reasons alone and requires no “special” or “exotic” explanation

Page 64: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

End ResultThe “scale-free” claims of the Internet are not merely wrong, they suggest properties that are

the opposite of the real thing.

Fundamental difference:

random vs. designed

Page 65: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Internet topologies

nodes=routersedges=links

25 interior routers818 end systems

High degree hub-like coreLow degree

mesh-like core

101

102

100

101

degree

rank

identical power-law degrees

How to characterize / compare these two networks?

“scale-rich” vs. “scale-free”

Page 66: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Network PerformanceGiven realistic technology constraints on routers, how

well is the network able to carry traffic?

Step 1: Constrain to be feasible

Abstracted Technologically Feasible Region

1

10

100

1000

10000

100000

1000000

10 100 1000

degree

Ban

dw

idth

(M

bp

s)

Bi

Bj

xij

Step 2: Compute traffic demand

kCxts

BBx

ijrkjikij

ji jijiij

,..

maxmax

:,

, ,

Step 3: Compute max flow

Page 67: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Network LikelihoodHow likely is a particular graph (having given node

degree distribution) to be constructed?

• Notion of likelihood depends on defining an appropriate probability space for random graphs.

• Many methods (all based on probabilistic preferential attachment) for randomly generating graphs having power law degree distributions:– Power Law Random Graph (PLRG) [Aiello et al.]

– Random rewiring (Markov chains)

In both cases, LogLikelihood (LLH) j

connectedji

idd,

Page 68: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Likelihood

HighLow

Fast

Slow

Performance

Why such strikingdifferences with same

node degree distribution?

Page 69: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

1

10

100

1000

10000

100000

10 100

degree

Ban

dw

idth

(M

bp

s)

11

10

100

1000

10000

100000

10 100

degree1

Fast core

High-degree edge

Slow core

Slower edge

Performance LikelihoodLikelihood

HighLow

Fast

Slow

Page 70: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

“Scale-free”• Core: Hub-like, high degree • Edge: Low degree• Robust to random • Fragile to “attack”

HOT “scale-rich”• Core: Mesh-like, low degree • Edge: High degree• Robust to random • Robust to “attack”

• High performance• Low link costs• Unlikely, rare, designed• Destroyed by rewiring• Similar to real Internet

• Low performance• High link costs• Highly likely, generic• Preserved by rewiring• Opposite of real Internet

+ objectives and constraints

Page 71: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

HierarchicalScale-Free (HSF)

RandomHOT

Low LikelihoodLow Performance Most Likely

Page 72: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Key Points

1. High performance networks are extremely unlikely to be found by a random process.

2. Models that focus on “highly likely” constructions will result in graphs that are poorly performing and are not representative of highly engineered networks.

Page 73: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Recap• We do not claim that our “heuristically optimal topology” is

an accurate representation of the real Internet, simply that it captures some basic features observed in real networks.

• But it goes a long way to dispelling much of the confusion about heavy-tailed node degree distributions (i.e. “scale-free” models are fundamentally inconsistent with engineering design despite their ability to match macro-statistics).

• “Scale-free” models may be good representations of other systems, simply not the router-level of the Internet.

• This highlights the importance of “random vs. designed”.• It is remarkable how even simple models based on

fundamental technological and economic tradeoffs can go a long way to explaining large-scale network features.

• These models are a only starting point for a more detailed investigation of Internet topology.

Page 74: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Final Question:

What still needs to be done to understand the large-scale structure of the Internet?

How can researchers in OR/MS help to solve this problem?

Page 75: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

(to be done)• Understand the relationship between

optimization drivers and topology– Example: Papadimitriou’s HOT

• Description of more detailed optimization models that account for real economic and technological considerations (?)

• Tie-in to WDM network optimization currently in vogue

• (Need help thinking this through)

Page 76: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Today’s Agenda• Review recent work of empiricists and theoreticians to

understand the router-level topology of the Internet• Understand the causes and implications of “heavy tails”

in the complex structure of the Internet• Illustrate how recently popular “scale-free” models of

Internet topology are not just wrong, but wildly so• Describe the importance of optimization in the

development of explanatory models of Internet topology• Present the HOT framework as an alternative means to

understanding the “robust, yet fragile” structure of the Internet and other complex engineering systems

• Highlight some open research problems and areas where contributions can be made by the OR/MS community

Page 77: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Thank You

[email protected]

Page 78: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Appendix

Page 79: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Recent Measurement Experiments• R. Govindan and H. Tangmunarunkit. Heuristics for Internet Map

Discovery, Proceeding of IEEE INFOCOM (2000) [often known as Mercator Project]

• L. Gao. On inferring autonomous system relationships in the Internet, in Proc. IEEE Global Internet Symposium, November 2000.

• Route Views, University of Oregon Route Views Project, Available at http://www.antc.uoregon.edu/route-views/.

• A. Broido and k. Claffy. Internet Topology: Connectivity of IP Graphs, Proceeding of SPIE ITCom WWW Conf. (2001) [often known as Skitter Project]

• N. Spring, R. Mahajan, and D.Wetherall. Measuring ISP Topologies with Rocketfuel, Proc. ACM SIGCOMM (2002)

• L. Subramanian, S. Agarwal, J. Rexford, and R. Katz. Characterizing the Internet Hierarchy from Multiple Vantage Points, Proc. IEEE INFOCOM (2002)

• H. Chang, R. Govindan, S. Jamin, S. Shenker, and W. Willinger. Towards Capturing Representative AS-Level Internet Topologies Proc. Of ACM SIGMETRICS (2002)

Page 80: Optimization-Based Approaches  to Understanding and Modeling Internet Topology

Models based on preferential attachment:• INET: use both curve fitting and preferential attachment. It first calculates the

frequency-degree and rank-degree distributions. It then assigns degrees to each node according to these distributions. Finally it matches these degrees according to the linear preferential model.

• BRITE: BRITE incorporates recent preferential attachment and observations of skewed network placement and locality in network connections on the Internet.

• BA: preferential attachment• AB: preferential attachment + adding a third rewiring operation consisting of

choosing links randomly and re-wiring each end of them according to the same linear preference rule as used in BA generator.

• GLP (Generalized Linear Preference): change preferential probability from d_i/sum(d_i) to (d_i-beta)/(sum(d_i-beta)), and beta is a tunable paparmenter between (-\infinity, 1). With some probabilty, add links preferentially and with some other probability, add nodes preferentially.

Other models:• PLRG: given N nodes with expected degree distribution, assign links among

links with probability proprotional to the product of the expected degree of the two end points.

• CMU Power-law generator: two ways: 1. assign a power-law degree distribution to nodes and then place links in the adjacency matrix such that every node obtains the assigned degree. 2. Recusive way: define a probability distribution function that randomly selects a pair of nodes and use it to produce a network graph with real-valued edge weights.