Network Troubleshooting: rcc and Beyond Nick Feamster Georgia Tech (joint with Russ Clark, Yiyi...
Network Troubleshooting: rcc and Beyond
Nick Feamster
Georgia Tech
(joint with Russ Clark, Yiyi Huang, Anukool Lakhina)
2
rcc: Router Configuration Checker
• Proactive routing configuration analysis
• Idea: Analyze configuration before deployment
[Diagram: Configure → Detect Faults (rcc) → Deploy]
Many faults can be detected with static analysis.
3
rcc Implementation
[Diagram: Distributed router configurations (Cisco, Avici, Juniper, Procket, etc.) → Preprocessor → Parser → Relational Database (MySQL) → Verifier (checks Constraints) → Faults]
http://nms.csail.mit.edu/rcc/
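The parse, store, and verify flow on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not rcc's actual code: it uses Python's built-in sqlite3 in place of MySQL, and the `sessions` table, its columns, and the "session configured on only one end" constraint are hypothetical choices made for the example.

```python
import sqlite3

# Hypothetical parser output: one row per BGP session statement found in a
# router's configuration, as (router, neighbor it points at).
PARSED_SESSIONS = [
    ("r1", "r2"),
    ("r2", "r1"),
    ("r1", "r3"),  # r3 has no matching statement back to r1
]

def load(conn, sessions):
    """Store parsed configuration facts in a relational table."""
    conn.execute("CREATE TABLE sessions (router TEXT, neighbor TEXT)")
    conn.executemany("INSERT INTO sessions VALUES (?, ?)", sessions)

def dangling_sessions(conn):
    """Example constraint: every session must be configured on both ends."""
    query = """
        SELECT a.router, a.neighbor
        FROM sessions AS a
        LEFT JOIN sessions AS b
          ON b.router = a.neighbor AND b.neighbor = a.router
        WHERE b.router IS NULL
        ORDER BY a.router, a.neighbor
    """
    return conn.execute(query).fetchall()

conn = sqlite3.connect(":memory:")
load(conn, PARSED_SESSIONS)
faults = dangling_sessions(conn)
for router, neighbor in faults:
    print(f"fault: {router} -> {neighbor} is not configured on {neighbor}")
```

Expressing each constraint as a query keeps the verifier independent of vendor syntax: only the parser needs to know about Cisco or Juniper configuration formats.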
4
rcc Interface
5
Parsing Configuration
6
List of Faults
7
Yes, but Surprises Happen!
• Link failures
• Node failures
• Traffic volumes shift
• Network devices “wedged”
• …
• Two problems
– Detection
– Localization
8
A Closer Look
• Proactive analysis
– Fault avoidance
– Policy conformance
• Reactive diagnosis
– Correcting network faults
  • Detection
  • Localization
– Active and passive measurements
– Need user’s perspective
Idea: These analyses should inform each other
9
Detection: Analyze Routing Dynamics
• Idea: Routers exhibit correlated behavior
Blips across signals may be more operationally interesting than any spike in one.
10
Detection: Three Types of Events
• Single-router bursts
• Correlated bursts
• Multi-router bursts
• Common
• Commonly missed using thresholds
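A small worked example shows why correlated bursts slip past per-router thresholds. This is a simplified sketch, not the method used in the talk: it combines per-router z-scores into one network-wide score, and the update counts, the training window, and the threshold of 3 standard deviations are all made-up assumptions. It also assumes each router's baseline is not constant, since a zero standard deviation would divide by zero.

```python
import math
import statistics

# Hypothetical per-router counts of routing updates per time bin.
UPDATES = {
    "r1": [10, 12, 10, 12, 10, 12, 10, 13, 10, 12],
    "r2": [12, 10, 12, 10, 12, 10, 12, 13, 12, 10],
    "r3": [10, 12, 10, 12, 10, 12, 10, 13, 20, 12],
    "r4": [12, 10, 12, 10, 12, 10, 12, 13, 12, 10],
}
TRAIN_BINS = 6     # leading bins used to estimate each router's baseline
THRESHOLD = 3.0    # alarm when a score exceeds this many standard deviations

def z_scores(series, train_bins):
    """Score the bins after the training window against the baseline."""
    base = series[:train_bins]
    mean = statistics.mean(base)
    std = statistics.pstdev(base)
    return [(x - mean) / std for x in series[train_bins:]]

def detect(updates, train_bins, threshold):
    """Return (per-router alarms, network-wide alarms) for the scored bins."""
    scores = {r: z_scores(s, train_bins) for r, s in updates.items()}
    n_test = len(next(iter(scores.values())))
    per_router, network_wide = [], []
    for i in range(n_test):
        t = train_bins + i
        for router in sorted(scores):
            if scores[router][i] > threshold:
                per_router.append((t, router))
        # Sum of z-scores, rescaled so the combined score is comparable
        # to a single router's z-score.
        combined = sum(s[i] for s in scores.values()) / math.sqrt(len(scores))
        if combined > threshold:
            network_wide.append(t)
    return per_router, network_wide

per_router_alarms, network_alarms = detect(UPDATES, TRAIN_BINS, THRESHOLD)
print("per-router alarms:", per_router_alarms)
print("network-wide alarms:", network_alarms)
```

In bin 7 every router rises by only two standard deviations, so no individual threshold fires, yet the combined score reaches 4. In bin 8 the single large spike on `r3` is caught by both checks.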
11
Localization: Joint Dynamic/Static
• Which routers are “border routers” for that burst
• Topological properties of routers in the burst
[Diagram labels: Static, Dynamic; Proactive Analysis, Deployment, Reactive Detection, Diagnosis/Correction]
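One way to picture the joint analysis is to intersect the set of routers in a detected burst (dynamic data) with facts taken from configuration (static data). The sketch below is illustrative only: the router names, the link list, the burst membership, and the reading of “border router” as a router with external sessions are all assumptions, not details from the talk.

```python
# Hypothetical static facts extracted from router configurations.
BORDER_ROUTERS = {"r1", "r4"}   # assumed: routers with external BGP sessions
LINKS = [("r1", "r2"), ("r2", "r3"), ("r3", "r4"), ("r2", "r5")]

# Hypothetical dynamic input: routers that took part in one detected burst.
BURST = {"r1", "r2", "r3"}

def neighbors(links):
    """Build an adjacency map from an undirected link list."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def localize(burst, border_routers, links):
    """Summarize static properties of the routers involved in a burst."""
    adj = neighbors(links)
    return {
        "border_in_burst": sorted(burst & border_routers),
        "degree": {r: len(adj.get(r, ())) for r in sorted(burst)},
        "links_inside_burst": sorted(
            (a, b) for a, b in links if a in burst and b in burst
        ),
    }

report = localize(BURST, BORDER_ROUTERS, LINKS)
for key, value in report.items():
    print(key, value)
```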
12
Configuration Analysis: Next Steps
• BGP/MPLS Layer 3 VPNs
– Need access to these configurations to do this!
– Help needed!
• Firewall and switch configurations
– Take high-level operator policy as input
– Analyze static configuration to see whether configuration matches policy
– Perform active probing experiments to check
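The static half of the firewall check, comparing a configuration against operator policy, might look like the following. It is a minimal sketch under assumptions: the rule dictionaries, first-match semantics with a default deny, and the policy written as concrete flows with an expected outcome are all hypothetical, and the active probing step is not shown.

```python
import ipaddress

# Hypothetical firewall rules, evaluated first-match; unmatched traffic is denied.
RULES = [
    {"src": "10.1.0.0/16", "dst": "10.2.0.10/32", "port": 22, "action": "allow"},
    {"src": "0.0.0.0/0", "dst": "10.2.0.10/32", "port": 22, "action": "deny"},
    {"src": "0.0.0.0/0", "dst": "10.2.0.20/32", "port": 80, "action": "allow"},
]

# Hypothetical high-level policy: (source, destination, port, expected outcome).
POLICY = [
    ("10.1.5.5", "10.2.0.10", 22, "allow"),
    ("192.0.2.7", "10.2.0.10", 22, "deny"),
    ("192.0.2.7", "10.2.0.20", 443, "allow"),  # no rule permits this flow
]

def decide(rules, src, dst, port):
    """Return the action of the first rule matching the flow, else 'deny'."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for rule in rules:
        if (s in ipaddress.ip_network(rule["src"])
                and d in ipaddress.ip_network(rule["dst"])
                and port == rule["port"]):
            return rule["action"]
    return "deny"

def violations(rules, policy):
    """List policy entries whose expected outcome differs from the config."""
    found = []
    for src, dst, port, want in policy:
        got = decide(rules, src, dst, port)
        if got != want:
            found.append((src, dst, port, want, got))
    return found

mismatches = violations(RULES, POLICY)
for src, dst, port, want, got in mismatches:
    print(f"{src} -> {dst}:{port} policy says {want}, config gives {got}")
```

Each mismatch found this way is a natural candidate for a follow-up probe, since it names a specific flow whose observed behavior can be tested.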
13
Firewall Configuration: Case Study
• Georgia Tech Campus Network
– Research and Administrative Network
– 180 buildings
– 130+ firewalls
– 1700+ switches
– 55000+ ports
• Problem: Availability/Reachability
– Flux in firewall, router, switch configurations
– No common authority over changes made
14
Specific Focus: Firewall Configuration
• Difficult to understand and audit configs
• Subject to continual modifications
– Roughly 1-2 touches per day
• Federated policy, distributed dependencies
– Each department has independent policies
– Local changes may affect global behavior