Yashar Ganjali Sprint Advanced Technology Lab. August 5, 2003 Link Failures in IP Networks: A Closer Look

Yashar Ganjali

Sprint Advanced Technology Lab. August 5, 2003

Link Failures in IP Networks: A Closer Look

Previous Work

Link Failures

• Unplanned
  – Individual Link Failures
  – Shared: Router Related, Optical Related, …

• Maintenance

70% of Unplanned

Our Work

• Goal: gaining a deeper understanding of the causes & characteristics of link failures

• How:
  – Using SONET alarm logs
  – Focusing on high failure links/periods

IS-IS Failures, Feb. 2003

Number of Failures per Day

Zoom In

Zoom In (Cont’d)

High Failure Periods

1. Periods with a large number of failures (HNP)

2. Periods in which a large percentage of links are down (HDP)
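
The two definitions above can be sketched in code. This is only an illustration: the slides do not give the period length or the thresholds, so the one-hour bins, the `hnp_min_failures` and `hdp_min_fraction` parameters, and the `classify_periods` helper are all assumptions.

```python
from collections import defaultdict

def classify_periods(failures, n_links, bin_size=3600,
                     hnp_min_failures=10, hdp_min_fraction=0.05):
    """Flag fixed-length time bins as HNP and/or HDP.

    failures: iterable of (link_id, start, end), times in seconds.
    All thresholds are hypothetical; the slides do not state them.
    Returns (hnp_bins, hdp_bins) as sorted lists of bin indices.
    """
    starts_per_bin = defaultdict(int)
    links_down_per_bin = defaultdict(set)
    for link, start, end in failures:
        first_bin = int(start // bin_size)
        last_bin = int(end // bin_size)
        starts_per_bin[first_bin] += 1
        # The link counts as down in every bin from the one containing
        # `start` through the one containing `end`.
        for b in range(first_bin, last_bin + 1):
            links_down_per_bin[b].add(link)
    hnp = sorted(b for b, n in starts_per_bin.items()
                 if n >= hnp_min_failures)
    hdp = sorted(b for b, links in links_down_per_bin.items()
                 if len(links) / n_links >= hdp_min_fraction)
    return hnp, hdp
```

A period can fall into both classes, so the two lists may overlap.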

Number vs. Duration of Failures

High Failure Periods

Periods with High Number of Failures

High Failure Periods (Cont’d)

• We can also classify high failure periods based on spatial distribution of the links failing in those periods.

Spatial Distribution, June 17, 2003

Spatial Distribution, Feb. 26, 2003

Links with High Duration of Failures

Links with High Number of Failures

Matching IS-IS Failures with SONET alarms (SLOS)

IS-IS Failure                   Fraction Matched
STK.Feb-Jun.2003                0.58
STK.Feb-Jun.2003-HDL            0.68
STK.Feb-Jun.2003-HDP            0.19
STK.Feb-Jun.2003-HNL            0.77
STK.Feb-Jun.2003-HNP            0.56
STK.Feb-Jun.2003-NM             0.61
STK.Feb-Jun.2003-NM-HDL         0.69
STK.Feb-Jun.2003-NM-HDP         0.28
STK.Feb-Jun.2003-NM-HNL         0.81
STK.Feb-Jun.2003-NM-HNP         0.65
STK.Feb-Jun.2003-NM-NRR         0.65
STK.Feb-Jun.2003-NM-NRR-HDL     0.85
STK.Feb-Jun.2003-NM-NRR-HDP     0.65
STK.Feb-Jun.2003-NM-NRR-HNL     0.84
STK.Feb-Jun.2003-NM-NRR-HNP     0.78
STK.Feb-Jun.2003-NM-RR          0.29
STK.Feb-Jun.2003-NM-RR-HDL      0.00
STK.Feb-Jun.2003-NM-RR-HDP      0.07
STK.Feb-Jun.2003-NM-RR-HNL      0.16
STK.Feb-Jun.2003-NM-RR-HNP      0.26

Preliminary Results

• About 58% of all failures match with a SLOS alarm.

• High Failure Links have a higher correlation with SLOS alarms (85%).

• Router Related Failures show much less correlation than other classes (0-29%).

• Periods with high number of failures show more correlation than periods with high duration of failures.
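
A matched fraction like the ones reported above could be computed along the following lines. The slides do not state the matching rule, so this sketch assumes that an IS-IS failure matches when a SLOS alarm on the same link falls within a tolerance window around the failure time. The `matched_fraction` helper and the 5-second default `window` are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict

def matched_fraction(failures, alarms, window=5.0):
    """Fraction of IS-IS failures with a SLOS alarm on the same link
    within +/- `window` seconds (assumed matching rule).

    failures, alarms: iterables of (link_id, timestamp_seconds).
    """
    alarms_by_link = defaultdict(list)
    for link, t in alarms:
        alarms_by_link[link].append(t)
    for times in alarms_by_link.values():
        times.sort()

    failures = list(failures)
    if not failures:
        return 0.0

    matched = 0
    for link, t in failures:
        times = alarms_by_link.get(link, [])
        # First alarm at or after the start of the window, if any.
        i = bisect_left(times, t - window)
        if i < len(times) and times[i] <= t + window:
            matched += 1
    return matched / len(failures)
```

Sorting the alarms per link once and using binary search keeps the lookup cheap when the alarm log is much larger than the failure list.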

Unmatched SLOS alarms

Matching SLOS alarms

• Problem: A large percentage (45%) of SLOS alarms do not correspond to any IS-IS failures

• Solution: Remove links which are not up or links which go down forever

• Result: Less than 2% of SLOS alarms do not correspond to any IS-IS failures
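
The filtering step above might look like the sketch below. The slides do not say how "not up" or "down forever" was determined, so this assumes a link is kept only if its most recent IS-IS state in the log is "up"; links with no IS-IS events at all are dropped. The event format and both helper names are hypothetical.

```python
def active_links(isis_events):
    """Return links whose most recent recorded IS-IS state is "up".

    isis_events: iterable of (link_id, timestamp, state),
    with state in {"up", "down"} (assumed log format).
    """
    last = {}
    for link, t, state in isis_events:
        if link not in last or t >= last[link][0]:
            last[link] = (t, state)
    return {link for link, (_, state) in last.items() if state == "up"}

def filter_alarms(alarms, isis_events):
    """Drop SLOS alarms on links that are never up or end the log down.

    alarms: iterable of (link_id, timestamp).
    """
    keep = active_links(isis_events)
    return [(link, t) for link, t in alarms if link in keep]
```

The unmatched percentage would then be recomputed over the filtered alarms only.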

Matching IS-IS Failures & SONET alarms

Unmatched SLOS alarms (considering removed links)

Unmatched SLOS alarms

• Spread over time

• Almost half of them have a matching alarm (SLOS <-> SLOS cleared)

• Minimum time between SLOS and SLOS cleared is 9 seconds.

• More investigation???
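
Pairing each SLOS with its SLOS cleared, as in the bullets above, can be sketched as follows. The event tuple format and the `pair_slos` helper are assumptions. The rule used here pairs a SLOS with the next SLOS cleared on the same link and reports a SLOS as unpaired if another SLOS arrives first or the log ends.

```python
def pair_slos(events):
    """Pair SLOS alarms with the next SLOS cleared on the same link.

    events: iterable of (link_id, timestamp, kind),
    with kind in {"SLOS", "SLOS cleared"} (assumed log format).
    Returns (pairs, unpaired):
      pairs    -- list of (link_id, raised_at, cleared_at)
      unpaired -- list of (link_id, raised_at) with no clear seen
    """
    open_alarm = {}          # link_id -> time of the pending SLOS
    pairs, unpaired = [], []
    for link, t, kind in sorted(events, key=lambda e: e[1]):
        if kind == "SLOS":
            if link in open_alarm:
                # A second SLOS before any clear: the first stays unpaired.
                unpaired.append((link, open_alarm[link]))
            open_alarm[link] = t
        elif kind == "SLOS cleared" and link in open_alarm:
            pairs.append((link, open_alarm.pop(link), t))
    unpaired.extend(open_alarm.items())
    return pairs, unpaired
```

The minimum time between SLOS and SLOS cleared is then `min(cleared - raised for _, raised, cleared in pairs)`.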

Research Direction

• High Failure Periods: Study cascading effects, spatial distribution of failures

• High Failure Links: Predicting future failures based on past ones.

• Network Availability: How do failures affect network availability?

• ???

You always pass failures on the way to success!

Thank you!

Spatial-Temporal Correlation of Failures

Spatial-Temporal Correlation of Failures