The Resilient WAN

*Please visit polls in the ILTA Mobile App or http://ilta.cnf.io/sessions/289*

• Introductions

• Brief discussion of WAN terminology and acronyms

• What does it mean to be “Resilient”?

• Two Case Studies: WAN Upgrades - Before and After
  • Small-midsized firm: Zuckerman Spaeder
  • Midsized to large firm: Lathrop & Gage

• Wrap-up & Questions

The Resilient WAN: Session Overview

Presenters

• Philip Finnerty, CIO, Zuckerman Spaeder

• David Alberico, Network Manager, Lathrop & Gage

• Tim Soto, IT Infrastructure Manager, Lathrop & Gage

Moderator: Charlie Wise, IT Director, Manion Gaynor & Manning

Poll Results

• BGP (Border Gateway Protocol)
  • Protocol used on exterior Internet gateways for routing

• EIGRP (Enhanced Interior Gateway Routing Protocol)
  • Protocol used in internal systems for routing

• N+1 or N+2
  • The number you need plus the number of extras

• QoS (Quality of Service)
  • Allows you to “tag” some traffic as a higher priority

• WAN (“Wide Area Network”)

• Last Mile Carrier/Local Loop
  • Physical connection between the customer premises (demarcation point) and the edge of the carrier's network

• POP/Layer 2 Route Path
  • The path data traverses across the physical network; can include diverse carrier networks.

What acronyms will you use today? (layman’s explanations)
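As a rough illustration of what “tagging” traffic for QoS can look like from an application's side, here is a minimal sketch that marks a UDP socket with a DSCP value. The function names are hypothetical and not from the presentation; `socket.IP_TOS` is a standard option on Linux and macOS, but whether routers and carriers honor the marking depends entirely on their configured policy.

```python
import socket

# Standard DSCP code points: EF (Expedited Forwarding, 46) is commonly
# used for voice; 0 is best effort.
DSCP_EF = 46
DSCP_BEST_EFFORT = 0


def dscp_to_tos(dscp: int) -> int:
    """Return the IP TOS byte for a 6-bit DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    # DSCP occupies the upper six bits of the byte; the low two bits are ECN.
    return dscp << 2


def open_tagged_udp_socket(dscp: int) -> socket.socket:
    """Create an IPv4 UDP socket whose outgoing packets carry the given DSCP.

    Hypothetical helper for illustration; some platforms ignore IP_TOS.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print(dscp_to_tos(DSCP_EF))  # 184 (0xB8)
```

In practice the marking is usually applied or rewritten by phones, switches, and routers rather than by application code, and it only helps if every hop, including the carrier, is configured to act on it.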

• WAN (“Wide Area Network”)
  • Highway that allows data to go from office to office

• Bandwidth
  • Number of lanes on the highway available for data

• Capacity or congestion
  • Amount of traffic at any given moment in time using the lanes

• QoS
  • HOV or Express lanes; prioritized traffic

• Latency
  • The time it takes to travel from point A to point B

• Speed
  • Better definition: a function of all the above
  • Speed limit (theoretically, the max is the speed of light)

Elements of a WAN (Traffic analogy—oversimplified)

NOTE: WAN Acceleration/Optimization can change all of the above
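To make the highway analogy concrete, the sketch below estimates how long a single file transfer takes as a function of bandwidth, latency, and existing congestion. It is a deliberately oversimplified model (one round trip of latency plus raw transmission time, ignoring TCP windowing and protocol overhead). The function name, the 50 MB file size, and the 20 ms round-trip time are assumptions for illustration only; the 6 Mbps and 30 Mbps figures echo circuit labels from the connectivity diagrams.

```python
def transfer_seconds(size_bytes: int, bandwidth_mbps: float,
                     rtt_ms: float, utilization: float = 0.0) -> float:
    """Estimate the time to move size_bytes over a link.

    bandwidth_mbps: number of "lanes" (link rate in megabits per second)
    rtt_ms:         latency, modeled as one round trip before data flows
    utilization:    fraction of the link already used by other traffic
    """
    available_bps = bandwidth_mbps * 1_000_000 * (1.0 - utilization)
    if available_bps <= 0:
        raise ValueError("no bandwidth available")
    return rtt_ms / 1000.0 + (size_bytes * 8) / available_bps


if __name__ == "__main__":
    size = 50_000_000  # hypothetical 50 MB document set
    for mbps in (6, 30):
        idle = transfer_seconds(size, mbps, rtt_ms=20)
        busy = transfer_seconds(size, mbps, rtt_ms=20, utilization=0.5)
        print(f"{mbps} Mbps: {idle:.1f} s idle, {busy:.1f} s at 50% congestion")
```

For large transfers the bandwidth term dominates; for small, chatty exchanges the latency term dominates, which is why "speed" is better described as a function of all of the elements together.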

Resilient

The capacity of a system to absorb disturbance and retain the same essential functions, structure, identity, and user experience

vs Redundant

The provisioning of additional or duplicate circuits, hardware, etc., that function in case a part of the system fails.

You can achieve resiliency by employing redundancy; however, redundant systems are not necessarily resilient.

What do you mean by Resilient?
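The difference between redundant and resilient can be sketched with simple availability arithmetic. The example below assumes circuits fail independently, which is exactly the assumption that breaks when two "diverse" circuits share the same last-mile path or building entry. The availability figures and function names are hypothetical.

```python
def parallel_availability(per_circuit: float, count: int) -> float:
    """Availability of `count` redundant circuits that fail independently."""
    return 1.0 - (1.0 - per_circuit) ** count


def with_shared_dependency(per_circuit: float, count: int,
                           shared: float) -> float:
    """Same circuits, but all of them depend on one shared component
    (for example a single conduit into the building)."""
    return shared * parallel_availability(per_circuit, count)


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one circuit
    n_plus_1 = parallel_availability(single, 2)
    shared = with_shared_dependency(single, 2, shared=0.995)
    print(f"one circuit:            {single:.4%}")
    print(f"N+1, independent:       {n_plus_1:.4%}")
    print(f"N+1, shared entry path: {shared:.4%}")
```

With a shared dependency, adding more circuits cannot raise availability above that of the shared component, which is one way redundant systems can fail to be resilient.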

• What is your tolerance for downtime?

• What applications are running on your WAN (VoIP, video, VDI)?

• What is your budget?

• How much complexity can you manage?

• Any special security requirements?

• What are your anticipated future needs?

• What carriers are in your area?

• Speed costs money; how fast does it need to be (throughput, bandwidth, latency)?

• Where are your locations? Is data centralized or distributed?

WAN Resiliency is like Enlightenment: there are no right answers, only questions.

Can you give me an example of a smaller firm?

• 100 attorneys/200 users total

• 8 people in IT

• 4 offices
  • Washington, DC
  • Baltimore, MD
  • New York, NY
  • Tampa, FL

Vendor Shout Outs (Who helped)

• ARG: Design, carrier selection, contract/price negotiation, and project management of installation

• CDW/Cisco: Design and hardware selection

What were the issues Zuckerman faced?

• Frequent complaints of slowness from remote offices (confirmed by high ping times in SolarWinds)

• Project to move production environment to a new CoLo facility and move DR/offsite backup to home office

• Circuit Renewal

• Aging firewalls that needed to be replaced/upgraded in the near future

• Aging WAAS equipment

• Long term plans for VDI and video to the desktop

• Long term plan to improve security monitoring (possibly as managed service)

Connectivity Before

[Diagram: Connectivity Before. Sites: DC, DR, TA, BA, NY, with five Internet connections. Carriers: Carrier 1, Carrier 2. Circuit labels: 25Mbps, 6Mbps (x3), 3Mbps, 20Mbps, 100Mbps (x2), 1024Mbps. Legend: DR and offsite backup; production environment; WAN acceleration (4); firewall (6); HA pair.]

Goals for new Resilient WAN?

• Improve user experience (perception of performance)

• faster vs. the absence of slowness

• Seamless failover

• Increase Bandwidth

• Big pipes beat better management

• Reduce total cost of ownership

• Monthly recurring fees for circuits

• Maintenance on equipment

• Reduce equipment footprint/retire aging equipment

• Increase security

• Reduce points of entry

• Reduce time spent managing firewalls

Connectivity After / Connectivity Today

[Diagram: Connectivity After. Sites: DC, CoLo, TA, BA, NY, with Internet and Backup Internet connections. Carriers: TW Telecom, ZAYO, with BGP. Circuit labels: 100Mbps (x2), 1024Mbps (x4), 40Mbps, 30Mbps (x3), 3072 Mbps. Legend: DR and offsite backup; production environment; firewall.]

Any regrets or lessons learned?

• Cisco and GNS3 have great simulation tools that allow you to design, test, and export

• Survey fiber runs from building’s point of entry to your suites prior to ordering.

• 1 gig and 10 gig capabilities are nice, but “spendy”

• Sometimes the carriers are smarter than you—BGP challenges with 1 carrier

• Installation of circuits will take longer than you think—negotiate billing delays until all circuits are up

Can you give me an example of a larger firm?

• 380 attorneys/680 users total

• 16 people in IT

• 10 offices

Vendor Shout Outs (Who helped)

• Strategic Telecom Partners: Circuit specs and consulting

• Riverbed: Path Selection Engineering

• Dell: Compellent / Live-Volume

• CDW / Cisco: Hardware selection and hardware sales

What were the issues Lathrop faced?

• DR Data Center was “Cold” and a plane flight away

• Expensive Point to Point replication circuit

• MPLS Circuit refresh

• “Primary” MPLS Circuits were maxed out while passive sat empty

• 180 second SLA for primary MPLS Circuit outage

• Egress QoS, phone quality, Video Conference Quality

• Long Term Plans for Proven DR/BC

• Long Term Plans for Centralized SIP trunking

• Centralized Data Governance

• Single Points of Failure

Issues Lathrop faced (continued): Failure Scenarios

• Environmental: Primary DataCenter down
  • Fail over or bring primary DC back up?
  • If you fail over, can you fail back?

• Host Primary MPLS down
  • 180 second failover for all regional offices
  • Secondary MPLS performance only for all regional offices

• MPLS Cloud / BGP issue exposed
  • At mercy of Provider Routing

[Diagram: failure scenario. Labels: Primary DataCenter (Primary DC); DR DataCenter “Cold” (DR DC); Regional Office; Primary Data Path; Failed Cloud with BGP established.]

Goals for new Resilient WAN?

• Improve User experience

• Rock solid Voice and Video quality

• Seamless Failover for multiple failure scenarios

• Cost efficiency

• More Bandwidth

• Primary MPLS bandwidth

• DataCenter to DataCenter

• Present Secondary MPLS as aggregate with failover

• More WAN Optimization

• Disaster recovery / Business Continuity

• Active/Active DataCenter

• Seamless Production Migration

Compellent Live-Volume Technology

• Used to meet our Disaster recovery / Business Continuity Goals

• Active/Active DataCenter

• Seamless Production Migration

• Live-Volume technology is similar to EMC vPlex, NetApp MetroCluster, or HDS solutions.

• The main concept is that it allows for a LUN to be presented to hosts in both datacenters SIMULTANEOUSLY.

• This allows for vMotion between physical datacenters.

• This capability creates a paradigm shift from DR (Disaster Recovery) to DA (Disaster Avoidance)

• With our Diverse 10Gb WAN ring, this technology helps make our two datacenters look like a single datacenter to our users and upper layer applications.
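Presenting one LUN in two datacenters generally relies on keeping both copies in step, and with synchronous replication each write waits for the remote site to acknowledge. The sketch below estimates the latency that inter-site distance adds to such a write. It is not a model of Live-Volume itself: the function names, the distances, and the 0.5 ms local write time are hypothetical, and the only physical assumption is that light in fiber covers roughly 200 km per millisecond.

```python
# Light in optical fiber travels at roughly two thirds of c,
# which works out to about 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def round_trip_ms(fiber_km: float) -> float:
    """Propagation delay for a request and its acknowledgement."""
    return 2.0 * fiber_km / FIBER_KM_PER_MS


def sync_write_latency_ms(local_write_ms: float, fiber_km: float,
                          equipment_ms: float = 0.0) -> float:
    """Latency of a write that must be acknowledged by the remote site.

    equipment_ms is a placeholder for switch, router, and array overhead.
    """
    return local_write_ms + round_trip_ms(fiber_km) + equipment_ms


if __name__ == "__main__":
    for km in (50, 100, 1000):  # hypothetical fiber path lengths
        total = sync_write_latency_ms(local_write_ms=0.5, fiber_km=km)
        print(f"{km:>5} km of fiber: {total:.2f} ms per synchronous write")
```

Real fiber routes are longer than the straight-line distance between sites, so figures like these are a floor rather than an estimate, and they help explain why asynchronous replication is usually chosen over long distances.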

[Diagram: Lathrop’s Resilient WAN. Labels: Active/Active; Synchronous / Asynchronous Replication; 10 Gig iSCSI; Regional Office.]

Any regrets or lessons learned?

• GNS3 network simulator was a critical tool: real IOS, labs with YouTube how-to videos

• Verify your Provider’s bandwidth

• Link Serialization (Data to Wire speeds)

• Make sure your QoS settings are correct and negotiated with your provider

• Shoot for 100% WAN optimization (it makes a BIG difference)

• Your SNMP 5-minute average is full of LIES

• Research the best solution for YOUR environment; it may or may not be what vendors/consultants will try to sell you (e.g., OTV functionality without OTV). Use a design framework (Zachman).
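Two of the lessons above, link serialization and misleading SNMP averages, lend themselves to a quick calculation. The sketch below uses made-up numbers: a hypothetical 10 Mbps link sampled once per second for five minutes, saturated for 30 of those seconds. The function names are illustrative and do not come from any monitoring product.

```python
def serialization_delay_ms(packet_bytes: int, link_mbps: float) -> float:
    """Time needed to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 / (link_mbps * 1_000_000) * 1000.0


def average_and_peak(samples_bps, link_bps):
    """Return (average, peak) utilization as fractions of link capacity."""
    average = sum(samples_bps) / len(samples_bps) / link_bps
    peak = max(samples_bps) / link_bps
    return average, peak


# Hypothetical 5-minute window of one-second samples on a 10 Mbps link:
# 30 seconds fully saturated, the remaining 270 seconds at 1 Mbps.
LINK_BPS = 10_000_000
SAMPLES = [LINK_BPS] * 30 + [1_000_000] * 270

if __name__ == "__main__":
    avg, peak = average_and_peak(SAMPLES, LINK_BPS)
    print(f"5-minute average: {avg:.0%}, busiest second: {peak:.0%}")
    delay = serialization_delay_ms(1500, 6)
    print(f"1500-byte packet at 6 Mbps: {delay:.1f} ms")
```

In this made-up window the five-minute average reports 19% utilization even though the link was completely full for 30 seconds, long enough to hurt voice and video. The serialization figure shows why slower links add per-packet delay that no amount of QoS can remove: a 1500-byte packet takes 2 ms just to leave a 6 Mbps interface.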

From the considerations you mentioned earlier, what were your top 5 priorities?

Zuckerman Spaeder

1. Security

2. User Experience

3. Cost of circuits (monthly recurring)

4. Cost of other hardware & maintenance

5. Tolerance for outages or interruptions

Lathrop & Gage

1. DR/BC N+1 construction

2. Bandwidth / QoS

3. Seamless Fail-over; Disaster Avoidance

4. Cost / Utilizing passive circuits/equipment

5. Better Reporting / Avoiding problems beyond our control

We’ll now open it up for questions

Questions

Thank You