Measuring and Troubleshooting Performance of Global Data Centers at ServiceNow

Geoff Wade, Sr. Network Engineer

© 2016 ServiceNow. All Rights Reserved. Confidential.


Security • Transparency • Worldwide Scalability • Cloud Architecture

Secure, Scalable and Always-On Enterprise Services


Infrastructure and Monitoring – Datacenters

• 2N infrastructure across all datacenters
– Independent duplicate architecture at a physically mirrored location for each physical datacenter to ensure fault tolerance

• All datacenters are Tier 3+ sites
– The infrastructure in these datacenters is concurrently maintainable and will result in zero downtime for any equipment failures located within the datacenter

• Enterprise-class infrastructure and operations

• Monitoring
– Multiple internal monitoring solutions; none show the customer view
– ThousandEyes lets us monitor “The Internet” and isolate that as a fault; if it alerts with no other alerts, we know where to start
– Some homebrew/lesser tools show limited views from outside
– External monitoring of one or two datacenters is easy, but…
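The triage rule in the Monitoring bullets (an outside-in alert with no internal alerts points at the Internet path) can be sketched as a small decision function. This is only an illustration: the alert labels and the returned suggestions are hypothetical, and the handling of the "both alerting" case is an assumption, not something the slides state.

```python
from typing import Iterable

# Hypothetical labels; the slides do not name the internal tools.
EXTERNAL = "external_internet"  # an outside-in monitor such as ThousandEyes
INTERNAL = "internal"           # any in-datacenter monitoring system


def where_to_start(active_alerts: Iterable[str]) -> str:
    """Suggest a starting point for triage from the kinds of alerts firing."""
    alerts = set(active_alerts)
    external = EXTERNAL in alerts
    internal = INTERNAL in alerts
    if external and not internal:
        # The case called out on the slide: only the Internet view is alerting.
        return "internet path"
    if internal and not external:
        return "inside the datacenter"
    if external and internal:
        # Assumption: with both firing, the alerts need to be correlated.
        return "correlate internal and external alerts"
    return "no active alerts"
```

For example, `where_to_start([EXTERNAL])` suggests starting with the Internet path, while `where_to_start([INTERNAL])` points inside the datacenter.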


Worldwide Scalability


Infrastructure – Network Architecture

Because all of this…

[Network diagram: two sites (Site 1 and Site 2), each connected to the Internet and containing Pod 1. Each site shows redundant A/B pairs of Border Routers, DC Interconnects, Customer VPNs, Load Balancers and Firewalls, cores Core-A through Core-D, and DSR-A/DSR-B and TOR-A/TOR-B in front of the rack servers. Legend: internal traffic, external traffic.]


Because all of this doesn’t matter if customers can’t reach you!


Knowing when some customers can’t reach you is crucial for fixing things before it becomes a major crisis.


How We Use ThousandEyes – SRE (Operations) Dashboard

• Improving the SRE signal-to-noise ratio
– SRE has to process *lots* of input
– Logging into multiple screens in a given tool doesn’t work
– ThousandEyes provides a single-page dashboard to quickly show immediate alerts
– The same page also shows some “normal” statistics, useful in case correlation with another problem is needed
– On display full-time in the SOC
– Can easily drill down if something shows red
– Configurable in case more/fewer tests are needed


How We Use ThousandEyes – Active Alerts via E-Mail

• E-mail alerts
– Back-up to the web GUI
– Can show a single issue on the Internet or…
– Can be compiled to show a trend
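Compiling individual e-mail alerts into a trend can be as simple as bucketing them by test and hour. The sketch below assumes the alerts have already been parsed into `(test_name, timestamp)` pairs; that shape, the function name, and the sample data are all made up for illustration and do not reflect the actual alert e-mail format.

```python
from collections import Counter
from datetime import datetime


def alerts_per_hour(alerts):
    """Count alerts per (test name, hour) bucket.

    `alerts` is an iterable of (test_name, timestamp) pairs, a hypothetical
    shape for alerts that were already parsed out of e-mail.
    """
    buckets = Counter()
    for test_name, timestamp in alerts:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[(test_name, hour)] += 1
    return buckets


# Made-up sample data, for illustration only.
sample = [
    ("login-page", datetime(2016, 5, 1, 9, 5)),
    ("login-page", datetime(2016, 5, 1, 9, 40)),
    ("login-page", datetime(2016, 5, 1, 10, 2)),
]
trend = alerts_per_hour(sample)
```

A rising count for the same test across consecutive hours is the kind of trend a single alert would not show on its own.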


How We Use ThousandEyes – Other Views for Other Groups

• Other groups, controlled via log-in, can have their own specific tests.
– ICMP reachability to our ISP interfaces
– TCP (image load) from ADCs
– Specific page loading with more advanced tests
– MX monitoring (soon)
– BGP prefix testing & BGP route visualization
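To make the TCP-level test in the list above concrete, here is a minimal connect-time probe using only the Python standard library. It is a sketch of the general idea, not how ThousandEyes implements its tests; the function name and the default timeout are assumptions.

```python
import socket
import time
from typing import Optional


def tcp_connect_ms(host: str, port: int, timeout: float = 3.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if the connect fails."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return None
```

Calling `tcp_connect_ms("example.com", 443)` from several outside locations gives a rough reachability and latency signal; a `None` from some locations but not others suggests a problem on specific paths rather than at the destination.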


How We Use ThousandEyes – BGP Route Visualization

• Used by network engineers
• Your own POV is easy to track, but…
• TE = historical POV of traffic into your ASN


How We Use ThousandEyes – BGP Route Visualization

• Replay feature over a selected timeline
– Diagnostics during an event
– Forensics immediately after the event
– Makes RFO explanations easy
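The idea behind replaying routes over a timeline can be illustrated with a toy model: keep a log of timestamped route changes per monitor and rebuild the state at any chosen moment. Everything here is hypothetical (the event shape, the monitor names, and the AS numbers, which come from the range reserved for documentation); it does not describe how ThousandEyes stores or replays BGP data.

```python
def routes_at(events, when):
    """Rebuild each monitor's AS path as of time `when`.

    `events` is an iterable of (timestamp, monitor, as_path) tuples, where
    as_path is a tuple of AS numbers, or None for a withdrawal.
    """
    state = {}
    for timestamp, monitor, as_path in sorted(events, key=lambda e: e[0]):
        if timestamp > when:
            break  # Ignore anything after the moment being replayed.
        if as_path is None:
            state.pop(monitor, None)  # Route withdrawn: monitor has no path.
        else:
            state[monitor] = as_path
    return state


# Made-up events: a path is seen, withdrawn, then re-learned via another AS.
events = [
    (0, "monitor-1", (64496, 64500)),
    (10, "monitor-1", None),
    (20, "monitor-1", (64497, 64500)),
]
```

Stepping `when` through 5, 15 and 25 shows the original path, then no route, then the new path, which is the kind of before/during/after story an RFO needs.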


How We Use ThousandEyes – One-Hop View

Sometimes the view from one hop away doesn’t clearly show where the problem is


How We Use ThousandEyes – BGP Route Visualization

• Number of hops (away from you) can be expanded
• Sometimes the problem isn’t with your connection, but upstream
• We like that ThousandEyes nodes are often publicly accessible route servers


How We Use ThousandEyes – BGP Route Visualization

• Number of hops (away from you) can be expanded
• …and then replayed again (same time period as seen before)


How We Use ThousandEyes – BGP Route Visualization

• Focus: only the nodes that saw an issue…
• …as well as only the links that saw the issue.


How We Use ThousandEyes – BGP Route Visualization

• Provides a link to other tests that might be related


How We Use ThousandEyes – BGP Route Visualization

• Can help isolate a failure, e.g., if a failure is seen with one destination prefix but not the other
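Isolating a failure by destination prefix amounts to grouping test results by the prefix that contains each target address and comparing failure counts. The sketch below uses Python's standard `ipaddress` module; the function name, the `(target_ip, ok)` result shape, and the example prefixes (taken from the ranges reserved for documentation) are assumptions for illustration.

```python
import ipaddress
from collections import defaultdict


def failures_by_prefix(prefixes, results):
    """Group results by advertised prefix and return {prefix: (failed, total)}.

    `results` is an iterable of (target_ip, ok) pairs, a hypothetical shape.
    Targets that fall outside every listed prefix are ignored.
    """
    networks = [ipaddress.ip_network(p) for p in prefixes]
    stats = defaultdict(lambda: [0, 0])  # prefix -> [failed, total]
    for target_ip, ok in results:
        addr = ipaddress.ip_address(target_ip)
        for net in networks:
            if addr in net:
                stats[str(net)][1] += 1
                if not ok:
                    stats[str(net)][0] += 1
                break  # Count each target under the first matching prefix.
    return {prefix: tuple(counts) for prefix, counts in stats.items()}


# Made-up results: every test toward one prefix fails, the other is healthy.
summary = failures_by_prefix(
    ["192.0.2.0/24", "198.51.100.0/24"],
    [("192.0.2.10", False), ("192.0.2.20", False), ("198.51.100.5", True)],
)
```

When one prefix shows failures and the other does not, the fault is more likely tied to how that prefix is routed than to the service behind it.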


How We Use ThousandEyes – Summary; Future; Considerations

• Summary
– Provides an excellent view into your network
– Used by many groups (but especially network engineers!)
– Allows specific monitoring of what we need without extra noise
– Helps us isolate and determine the urgency of an event during the event (and reduce MTTR)
– Helps us figure out what happened after the event; very useful for customer RFOs

• Future
– Internet Outage Detection
– Metrics for downtime, etc.

• Considerations
– Internal ThousandEyes nodes
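As an illustration of the downtime metrics mentioned under Future, total downtime, mean time to repair (MTTR) and availability can be derived from a list of outage intervals. This is a minimal sketch under stated assumptions: the function name and input shape are hypothetical, and the outages are assumed to be non-overlapping and to lie fully inside the reporting window.

```python
from datetime import datetime, timedelta


def downtime_metrics(outages, window_start, window_end):
    """Return (total_downtime, mttr, availability) for a reporting window.

    `outages` is a list of (start, end) datetime pairs, assumed not to
    overlap and to fall inside the window.
    """
    durations = [end - start for start, end in outages]
    total = sum(durations, timedelta())
    # MTTR here is simply the mean outage duration.
    mttr = total / len(durations) if durations else timedelta()
    # Dividing one timedelta by another yields a plain float ratio.
    availability = 1.0 - (total / (window_end - window_start))
    return total, mttr, availability


# Made-up example: two outages (30 and 90 minutes) in a 100-hour window.
total, mttr, availability = downtime_metrics(
    [
        (datetime(2016, 5, 1, 9, 0), datetime(2016, 5, 1, 9, 30)),
        (datetime(2016, 5, 3, 14, 0), datetime(2016, 5, 3, 15, 30)),
    ],
    datetime(2016, 5, 1),
    datetime(2016, 5, 5, 4),
)
```

With these sample numbers the total downtime is two hours, the mean outage lasts one hour, and availability over the window is about 98%.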


Thank you

Geoff Wade, Sr. Network Engineer
