Wide Area Network Optimization

Introduction

Many of today’s applications are performance-driven and demand everything that the network can provide. Unfortunately, the networks in use today are built upon protocols that were developed one or more decades ago. While technologies march forward, many of these protocols have remained stagnant, which can be the cause of performance issues within networks today.

The networking industry has found ways to mitigate many of the side-effects of the shortcomings of Transmission Control Protocol (TCP) as well as bandwidth-starved and high-latency WAN links. The solution is typically referred to as Wide Area Network (WAN) Optimization. Many different vendors have created WAN optimization solutions and have particular names for the product lines (Cisco WAAS, Citrix WANScaler, Juniper WX, BlueCoat ProxySG, Riverbed Steelhead, Silver Peak NX, etc.).

The Need for WAN Optimization

Before delving into WAN optimization technologies, let’s discuss why WAN optimization is needed. TCP is the primary protocol in use on the Internet and LANs today. TCP was developed during the late 1970s and early 1980s to be used on ARPANET/DARPANET (the predecessor to the Internet). It has remained relatively static since its initial adoption, while networking demands have changed.

The effects mentioned below tend to be magnified on Long Fat Networks (LFNs), causing even greater user application performance issues.

TCP “Chatter”

TCP requires that a connection be established before any data is transmitted. Because of the way that a connection is initiated (a three-way handshake process), this alone can introduce delays within network applications. Many applications will open/close several TCP connections during the transmission of data between the client and server. Modern web-based applications can open up hundreds of connections at a time for a single user.

GTRI WAN Optimization White Paper Issue: 001 Date: 28 APR 2008 Public Information Copyright © 2008 Global Technology Resources, Inc.. All rights reserved. Page 1

GLOBAL TECHNOLOGY RESOURCES INC.® White Paper: Wide Area Network Optimization

Figure 1: TCP three-way connection handshake process

Once the host has determined that it’s done talking, it will close the session via a four-way handshake process. This is additional time and traffic for each TCP connection.

Figure 2: TCP four-way connection teardown process
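The cost of the handshake and teardown described above can be estimated with a few lines of Python. This is a rough sketch under stated assumptions: the handshake is taken to cost one full round trip before data can flow, connections are opened one after another rather than in parallel, and the 80 ms round-trip time and 100 connections are illustrative values, not measurements from this paper.

```python
def connection_overhead(rtt_ms, connections):
    """Estimate setup/teardown cost for short-lived TCP connections.

    Assumes each three-way handshake costs one round trip before the
    client can send data, and that setup plus teardown add 3 + 4 = 7
    segments that carry no application payload.
    """
    handshake_segments = 3   # SYN, SYN-ACK, ACK
    teardown_segments = 4    # FIN, ACK, FIN, ACK
    return {
        "setup_delay_ms": rtt_ms * connections,
        "overhead_segments": (handshake_segments + teardown_segments) * connections,
    }


# Hypothetical example: 100 sequential connections over an 80 ms WAN link.
print(connection_overhead(rtt_ms=80, connections=100))
```

Under these assumptions the user waits eight seconds on handshakes alone, before any application data has been exchanged.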

TCP Slow-start and Window Size

Once a connection is established, TCP will send a predetermined amount of data and wait for an acknowledgement before sending any additional data. This amount is referred to as the TCP window size. This alone makes for slow and inefficient communications between hosts, as the typical window sizes are remnants of the dial-up era.

Because the window field within the TCP header is only 16 bits wide, the maximum window size is limited to 64KB (65,535 bytes). Using the maximum window size over a link with a 50 ms round-trip time, hosts are only able to transfer roughly 10Mbps of traffic on a single TCP connection (65,535 bytes * 8 = 524,280 bits / 50 ms = 10,485,600 bps). This means that whether you have a 100Mbps, 1Gbps or 10Gbps connection between the hosts, they’ll only get about 10Mbps out of each TCP connection at that latency.

TCP window scaling overcomes the 64KB window size barrier, allowing for window sizes up to 1GB (1,073,725,440 bytes). Window scaling relies upon both endpoints supporting and properly negotiating this feature. It is unrealistic to rely on such a feature in a WAN environment, as so many pieces are out of your control. Only if you have control over each intermediary hop between your sites may you be able to satisfactorily overcome this barrier.
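The window-size arithmetic above can be checked with a short calculation. The sketch below assumes one full window is sent per round trip and nothing else limits the connection. The 50 ms round-trip time is the value that reproduces the 10,485,600 bps figure, and the shift of 14 is the largest window-scale factor TCP allows.

```python
MAX_WINDOW_FIELD = 65_535   # largest value of the 16-bit window field
MAX_SCALE_SHIFT = 14        # largest window-scale shift TCP permits


def max_throughput_bps(window_bytes, rtt_ms):
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 * 1000 / rtt_ms


scaled_window = MAX_WINDOW_FIELD << MAX_SCALE_SHIFT

print(max_throughput_bps(MAX_WINDOW_FIELD, 50))   # 10485600.0
print(scaled_window)                              # 1073725440
```

Because the bound is window divided by round-trip time, halving the latency doubles the ceiling, which is why the effect is most visible on long, high-latency links.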

TCP slow-start and congestion avoidance introduce additional inefficiencies in the way devices communicate. A new connection starts with a small congestion window, which grows with each acknowledgement received (roughly doubling every round trip) until a threshold is reached; from then on the window increases in a linear fashion. If packet loss occurs, the device reduces its transmission rate by 50%! Once traffic is flowing again without any packet loss, the rate increases linearly until the next loss, when it is cut by 50% once more. While this design may have been suitable for a low-bandwidth environment, today’s networks offer a tremendous amount of bandwidth, making the 50% transmission rate drop unacceptable.


Figure 3: Example TCP transmission rate (x-axis: number of RTTs; y-axis: congestion window)
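The see-saw shape of the congestion window can be reproduced with a small simulation. This is a simplified model rather than a faithful TCP implementation: the window is counted in whole segments, it doubles per round trip during the initial slow-start phase and grows by one segment per round trip afterwards, and the rounds in which loss occurs (`loss_rounds`) and the starting threshold (`ssthresh`) are chosen arbitrarily for illustration.

```python
def simulate_cwnd(rounds, loss_rounds, ssthresh=16):
    """Return the congestion window (in segments) at the start of each RTT.

    Simplified model: the window doubles per RTT below ssthresh (slow
    start), grows by one segment per RTT otherwise (linear increase), and
    is halved whenever a loss is detected in that round.
    """
    cwnd = 1
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            ssthresh = max(cwnd // 2, 1)
            cwnd = ssthresh                    # 50% reduction on loss
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)     # slow start
        else:
            cwnd += 1                          # linear increase
    return history


trace = simulate_cwnd(rounds=16, loss_rounds={8, 13})
for rtt, cwnd in enumerate(trace):
    print(f"RTT {rtt:2d} | {'#' * cwnd} {cwnd}")
```

Each loss halves the window, and it then takes many round trips of one-segment growth to climb back, which is the pattern sketched in Figure 3.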

TCP Acknowledgements

Most TCP stacks will send a window’s worth of data and wait for an acknowledgement. If packet loss has occurred, the acknowledgement identifies only the last data received in order. The sending host will then resend all data since that point, thus wasting bandwidth. Typically the receiving host already has all of the data except for a small chunk, so resending the entire window is inefficient and unnecessary.
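To put a number on the waste described above, the following sketch compares the bytes resent when everything after the first lost segment is retransmitted with the bytes resent when only the missing segments are. The segment size (1,460 bytes), the window of 44 segments and the position of the lost segment are illustrative assumptions.

```python
def retransmitted_bytes(segment_sizes, lost):
    """Compare bytes resent under two recovery strategies.

    resend_all:  everything from the first lost segment onward is resent.
    resend_lost: only the segments that were actually lost are resent.
    """
    if not lost:
        return 0, 0
    first_lost = min(lost)
    resend_all = sum(segment_sizes[first_lost:])
    resend_lost = sum(segment_sizes[i] for i in lost)
    return resend_all, resend_lost


# 44 segments of 1,460 bytes (about one 64KB window); only segment 3 is lost.
window = [1460] * 44
print(retransmitted_bytes(window, lost={3}))   # (59860, 1460)
```

In this example a single lost segment causes roughly 60KB to be sent again when 1,460 bytes would have been enough.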

File-Sharing Protocol Overhead

The most common protocol used today for accessing network files is the Common Internet File System (CIFS). CIFS utilizes the Server Message Block (SMB) protocol for communicating between hosts. The SMB protocol is extremely chatty, placing a tremendous amount of overhead on the network. As an example, a 47.5KB Word document was opened over a network connection. To open this small file, 64% of the packets were SMB, representing 87.9KB of traffic for a 47.5KB file! The SMB traffic alone is nearly double the actual file size.


Figure 4: Example file transfer protocol statistics

Many other applications have a similar amount of “chatty” protocol overhead, either within the higher-layer protocol itself (e.g., SMB) or inherited from TCP.

WAN Optimization Features

While WAN optimization does decrease bandwidth requirements, the main goal is to increase network application performance by decreasing the response time. WAN optimization is about saving time, not bandwidth (although this is a side-effect). Remember that WAN optimization doesn’t create bandwidth, but it does add intelligence to what is sent.

To demonstrate different WAN optimization features, the following hypothetical network will be used.


Figure 5: Simplified Sample Network with WAN Optimization Devices

WAN optimization devices provide an emulation of a LAN-like environment across WAN connections.

WAN optimization devices offer some (or all) of the following benefits:

• TCP “tweaks”

Rather than go through the pain and misery associated with typical TCP slow-start behaviour and small initial window sizes, the WAN optimization devices set a very large initial TCP window size, allowing for greater amounts of data to be transferred without an acknowledgement. When packet loss is detected, the WAN optimization device doesn’t drop its transmission rate by 50%, but by a very small amount (typically 10% or less). The WAN optimization device also notes at what transmission rate the loss was experienced and attempts to “tread” this rate to keep a sustained transfer rate. This eliminates the “see-saw” effect seen on typical TCP connections.

Many WAN optimization devices will also use selective-acknowledgements, eliminating the need to resend entire chunks of data. When packet loss occurs, the WAN optimization device will simply re-send the missing piece of data (not the entire chunk). This decreases re-assembly time and speeds up final delivery of the data.
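The benefit of a gentler back-off can be approximated with simple arithmetic. The sketch below assumes an idealized sawtooth in which the rate climbs linearly from its post-loss value back up to the peak, so the average rate is the midpoint of the two. The 0.5 and 0.9 factors correspond to the 50% and 10% reductions mentioned above; real links will behave less tidily.

```python
def average_utilization(backoff_factor):
    """Mean rate as a fraction of the peak for an ideal linear sawtooth.

    The rate is assumed to climb linearly from backoff_factor * peak back
    to peak, so its average is the midpoint of the two values.
    """
    return (1 + backoff_factor) / 2


for label, factor in [("standard TCP, 50% cut", 0.5), ("optimized, 10% cut", 0.9)]:
    print(f"{label}: {average_utilization(factor):.0%} of peak")
```

Under this model a 50% cut averages about three quarters of the peak rate, while a 10% cut stays near 95%.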

• TCP off-loading

The WAN optimization devices terminate the TCP connections for the local site. A single TCP connection (or a configurable number depending on vendor) is maintained and kept open between the WAN optimization devices, eliminating the typically-required new-TCP-connection-per-client. This is a tremendous time saver!

• Tokenization

As traffic passes through the WAN optimization device, it looks for common (recurring) binary sequences. When it finds a sequence, it will assign a token to the sequence and transmit the token (in place of the sequence) to the remote WAN optimization device, which will replace the received token with the previously-exchanged binary data (returning the packet to its original binary data). By reducing the amount of data sent over the WAN link (reduction ratios differ amongst vendors, but can range from 100:1 up to 300:1), precious time is saved.
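The token exchange described above can be sketched as a pair of dictionaries kept in step on both sides of the link. This is a toy illustration, not any vendor's algorithm: the fixed 64-byte chunks, the use of a truncated SHA-256 digest as the token, and the names `encode`, `decode` and `wire_bytes` are all assumptions made for the example, and a real device would also have to guard against token collisions.

```python
import hashlib

CHUNK_SIZE = 64   # bytes per chunk; an arbitrary value for illustration
TOKEN_SIZE = 8    # bytes of the SHA-256 digest used as the token


def make_token(chunk):
    return hashlib.sha256(chunk).digest()[:TOKEN_SIZE]


def encode(data, seen):
    """Sender: emit ('raw', chunk) the first time, ('tok', token) afterwards."""
    out = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        token = make_token(chunk)
        if token in seen:
            out.append(("tok", token))
        else:
            seen[token] = chunk
            out.append(("raw", chunk))
    return out


def decode(messages, seen):
    """Receiver: learn raw chunks and expand tokens back into the data."""
    parts = []
    for kind, payload in messages:
        if kind == "raw":
            seen[make_token(payload)] = payload
            parts.append(payload)
        else:
            parts.append(seen[payload])
    return b"".join(parts)


def wire_bytes(messages):
    """Payload bytes that would cross the WAN (framing ignored)."""
    return sum(len(payload) for _, payload in messages)


sender_dict, receiver_dict = {}, {}
document = b"quarterly report boilerplate paragraph. " * 200

first = encode(document, sender_dict)
assert decode(first, receiver_dict) == document
second = encode(document, sender_dict)    # the same file sent again
assert decode(second, receiver_dict) == document
print(len(document), wire_bytes(first), wire_bytes(second))
```

The second transfer consists entirely of tokens, which is where the large reduction ratios on repeated content come from.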

• Compression

The traffic can be compressed using proprietary or industry-standard algorithms, such as the Lempel-Ziv (LZ) compression algorithm, to provide additional savings in the amount of data transmitted (which directly affects the response time).
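As a small illustration of LZ-style compression, the sketch below uses Python's standard `zlib` module, whose DEFLATE format combines an LZ77 (Lempel-Ziv) stage with Huffman coding. The sample payload and host name are made up, and the savings on real traffic depend heavily on how repetitive the data is.

```python
import zlib

# Hypothetical, highly repetitive payload: the same request sent 50 times.
payload = b"GET /reports/2008/q1.html HTTP/1.1\r\nHost: intranet.example\r\n\r\n" * 50

compressed = zlib.compress(payload, 6)   # 6 is zlib's usual default level
restored = zlib.decompress(compressed)

assert restored == payload               # compression is lossless
print(f"{len(payload)} bytes -> {len(compressed)} bytes")
```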

• Caching


WAN optimization devices can accelerate the response times of CIFS clients by caching files, directory structures, etc. Unless the file has been pre-loaded on the WAN optimization device, caching won’t help the first user to request the file (this is when the WAN optimization device caches the file). Subsequent users will see a tremendous performance increase, as the file is being read off of the local WAN optimization device. There are numerous CIFS-specific acceleration features available on WAN optimization devices. Other office network services such as print servers can be accelerated with WAN optimization devices. All of this happens transparently to end-users – no proxies or default router changes need occur on the clients. They simply see a high performance increase in the applications they access!

• Buffering

WAN optimization devices offer functionality similar to the Nagle algorithm, allowing the device to buffer short bursts of traffic so that they can be sent in a consistent and efficient manner (while still ensuring that packets are sent in order).
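The buffering idea can be sketched as coalescing many small writes into fewer, fuller segments. This shows only the packing step: the real Nagle algorithm also takes into account whether earlier data is still unacknowledged, and the 1,460-byte segment size and the function name `coalesce` are assumptions for the example.

```python
def coalesce(writes, mss=1460):
    """Pack small application writes, in order, into segments of up to mss bytes."""
    segments = []
    buffer = b""
    for data in writes:
        buffer += data
        while len(buffer) >= mss:
            segments.append(buffer[:mss])   # a full segment is ready to send
            buffer = buffer[mss:]
    if buffer:
        segments.append(buffer)             # flush whatever is left over
    return segments


writes = [b"x" * 100 for _ in range(40)]    # forty 100-byte writes
segments = coalesce(writes)
print(len(writes), "writes ->", len(segments), "segments")
```

Byte order is preserved, so the receiver sees exactly the same stream, only in fewer packets with less header overhead.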

• Quality of Service

Many WAN optimization devices allow for transparent integration within an existing network infrastructure, allowing existing QoS policies to be maintained. Some devices offer their own QoS features for classifying, marking and rate-shaping network traffic, typically supporting the network (layer-3), transport (layer-4) and application (layer-7) layers, which allows for granular QoS policies. This can allow an organization to control how different applications perform on the WAN, allowing for “managed unfairness”.

• Traffic Avoidance

Why accelerate traffic that is prohibited at the other end of the WAN link? Many WAN optimization devices allow for filtering of traffic (such as web content filtering), allowing for local restriction of traffic that doesn’t comply with the network usage policies within the organization. This can typically be accomplished at the network, transport or application layers, allowing the organization to be very specific on restricting and avoiding prohibited traffic. This can include restricting access to selected files or network shares, URLs or other traffic.

• Mobile Client Support

So far the discussion has surrounded optimizing entire LANs. This is a good start, but much of today’s workforce is increasingly mobile and its demands are growing. Many WAN optimization devices offer client software that can be installed on mobile user systems, allowing for optimization and acceleration of remote users. Many of the mobile WAN optimization clients are offered on Microsoft Windows-based computers (typically used on mobile workforce laptops or SOHO computers).


• Secure Encryption

Several WAN optimization vendors support encryption of data that is stored on the local hard drive (HD) within the device. Support for encryption of site-to-site traffic is offered by certain vendors as well. This is a requirement for many high-security environments and can be satisfied with a properly-implemented and chosen WAN optimization solution.

• Device Auto-Discovery

Many WAN optimization vendors support auto-discovery of their devices. This is typically accomplished by marking a unique value in the options header field as well as altering the sequence number for traffic between WAN optimization devices. By analyzing these fields, the WAN optimization devices are able to see the presence of other devices in the path and intelligently make a decision on what features to use.

WAN Optimization Implementation Options

Since WAN optimization devices affect the traffic between WAN optimization devices (reducing, compressing, tokenizing, etc.), it’s important to ensure that traffic flows through the WAN optimization devices. Keep in mind that WAN optimization devices are required at both ends of the WAN link for a fully-working system. The following outlines several of the typical WAN optimization implementation options:

• In-path (inline)

This is the simplest method. Simply place the WAN optimization device before or after your WAN gateway router so that the different optimizations can take effect. The way to have a redundant configuration is to daisy-chain two WAN optimization devices in series (should one fail, the remaining device will take over). No load-balancing occurs in an inline implementation.

Figure 6: In-Path (Inline) Example

• Out-of-Path (WCCP)

WCCP was developed by Cisco, but is used across the industry as an ideal way of implementing WAN optimization and content-caching devices. WCCP (or WCCPv2) is used to communicate between the WAN optimization device(s) and the gateway router. The benefit that WCCP offers is that it’s very easy to have multiple WAN optimization devices at a single site for redundancy, as well as to load-balance traffic across them. WCCP redirects traffic from the gateway router to the WAN optimization device, which processes the data and sends it to the WAN optimization device at the other end of the WAN link (the traffic must pass back through the gateway router to reach the remote site). Some additional configuration steps (typically only required on the gateway WAN router) are required for this method, but they can provide additional features.

Figure 7: Out-of-Path (WCCP) Example

Some vendors require configuring a Generic Routing Encapsulation (GRE) tunnel between the WAN optimization device and the router running WCCP for return traffic from the WAN optimization device to the gateway router. This can add additional complexity and overhead to the network environment, but may be the best choice for certain environments. Many of the advantages and disadvantages of tunnels are discussed later in this document.

• Out-of-Path (Policy-Based Routing)

This is rather invasive and a difficult method to implement. It’s not possible to load-balance across different WAN optimization devices with this method, although it’s possible to have a redundant configuration. Because of the complexity and difficulty to maintain, this method typically isn’t recommended. The implementation design is very similar in “looks” to the WCCP example, but differs greatly in how traffic gets to the WAN optimization device. While this can be used for HA environments, other implementation options are typically recommended.

• Routed (Default Route for LAN using VRRP)

Some vendors support having the WAN optimization device act as the default gateway for the local LAN. This makes the WAN optimization device act as a router and typically uses a protocol such as VRRP to ensure that should the WAN optimization device fail, the real gateway router would take over and act as the gateway for the LAN.


Figure 8: Routed (Default Route for LAN using VRRP) Example

WAN Optimization Limitations

Types of Traffic

The majority of WAN optimization vendors support optimization of TCP traffic only. While some vendors support optimization of UDP traffic, it’s imperative to closely evaluate the performance constraints of the WAN optimization device, as most UDP traffic is delay-sensitive and might experience issues with the extremely minor delays that a WAN optimization solution might introduce. VoIP traffic is a perfect example of an application that may or may not receive any benefit from a WAN optimization solution.

Network Delays

Depending on the method of implementation, minor delays may be introduced into the network. When a WAN optimization device is implemented in-path (inline), a small amount of delay (~10-15 µs) can be introduced for non-optimized traffic. The exact amount of delay varies between vendors and is so insignificant that it might not adversely impact your environment. Other implementation methods (such as WCCP) won’t introduce any additional delays for non-optimized traffic as the router sends only optimized-eligible traffic to the WAN optimization device.

Proper Device Sizing

Since a great deal of traffic is processed by each WAN optimization device (if properly configured), the device must be sized appropriately for your environment. Each WAN optimization device is designed to support a number of TCP connections and is rated for a certain throughput. If the device isn’t sized properly and a device doesn’t support the required number of TCP connections or throughput rate, some of your traffic will receive the benefits of WAN optimization, while other traffic that exceeds the capacity of the WAN optimization device will not receive any benefits.
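A sizing check along these lines can be expressed as a simple comparison between a device's rated limits and the peak load measured at a site. The sketch below is illustrative only: the `DeviceRating` structure, the function name and all of the numbers are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass
class DeviceRating:
    max_tcp_connections: int
    max_throughput_mbps: float


def sizing_gaps(rating, peak_connections, peak_mbps):
    """List the rated limits a device would exceed at the site's peak load."""
    gaps = []
    if peak_connections > rating.max_tcp_connections:
        gaps.append(f"connections: {peak_connections} > {rating.max_tcp_connections}")
    if peak_mbps > rating.max_throughput_mbps:
        gaps.append(f"throughput: {peak_mbps} Mbps > {rating.max_throughput_mbps} Mbps")
    return gaps


# Hypothetical branch-office unit and measured peak load.
branch_unit = DeviceRating(max_tcp_connections=2000, max_throughput_mbps=10)
print(sizing_gaps(branch_unit, peak_connections=2600, peak_mbps=6))
```

An empty list means the device covers the measured peak; any entry points to traffic that would pass through unoptimized.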

High-Availability

WAN optimization can be implemented in high-availability (HA) environments. Depending on the selected implementation, fault-tolerance and high resiliency can be achieved, allowing for redundancy should a WAN optimization device fail or exceed the number of supported TCP connections. It’s important for HA environments to ensure that the implementation method chosen is the best for their HA situation.

GRE and/or IPSec Tunnels

Some WAN optimization devices support or require tunnels between devices (to/from the gateway router, to/from remote sites, etc.). There are pros and cons of tunnels, many of which are mentioned below. Most of these drawbacks have to do with the fact that the original headers are encapsulated within new headers (to tunnel the data).

• “Hidden” Payload

When tunnelling, the original packet is encapsulated within another header, requiring two decapsulations when reading the original payload. This can cause issues for traffic accounting systems, Intrusion Detection Systems (IDS)/Intrusion Prevention Systems (IPS) solutions as well as firewalls as these devices aren’t able to see within the encapsulated payload.

This may or may not be a real drawback as WAN optimization devices inherently change the original payload when it traverses the WAN (compression, tokenization, etc.). If an IDS/IPS device is between the WAN endpoints, it’s possible that the IDS/IPS won’t provide any benefit for optimized traffic (or worse yet, cause false-positives on the optimized traffic).

• Additional QoS configuration requirements

Since the original packet is encapsulated, the QoS markings are lost. These must either be replicated by the WAN optimization device or the device must support appropriately classifying and marking traffic to ensure that the QoS policies are enforced throughout the network environment.

• Complexity and management overhead

Typically two tunnels are required between each pair of sites that will be involved in WAN optimization. With direct site-to-site tunnels, each site therefore terminates 2 x (number_of_sites - 1) tunnels. If the environment only involves two or three sites, this isn’t much of an administrative burden (2 or 4 tunnels per site). Imagine an implementation involving a dozen sites: every site would terminate 22 tunnels in a fully-meshed WAN optimization network. This tunnel requirement could be reduced by redesigning the network to be in a hub-and-spoke configuration, where each remote site talks only to a core (hub) site. With this type of configuration the formula to calculate the number of tunnels is (number_of_sites - 1). In the example of a 12-site environment, only 11 tunnels would be required for a hub-and-spoke design. When it comes to troubleshooting, tunnels add another layer of troubleshooting steps (to troubleshoot the tunnels themselves).

If QoS is used, another point within the network will be involved in the classification and marking of traffic. QoS best-practices dictate classifying and marking traffic as close to the source as possible. Typical WAN gateway routers will act upon the already-marked traffic utilizing QoS traffic rate management features. Introducing QoS classification and marking into a border device such as a WAN optimization device means that should the access-layer device QoS configuration change, the same changes will need to be replicated to the WAN optimization devices. If this occurs on different platforms from different vendors, this could require additional skillsets and teams within the organization.
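The two tunnel formulas quoted earlier in this item, 2 x (number_of_sites - 1) and (number_of_sites - 1), can be tabulated for a few network sizes. The function names below are hypothetical; the code simply evaluates the formulas as they appear in the text.

```python
def tunnels_direct(number_of_sites):
    """2 x (number_of_sites - 1): two tunnels to every other site."""
    return 2 * (number_of_sites - 1)


def tunnels_hub_and_spoke(number_of_sites):
    """(number_of_sites - 1): one per remote site, all ending at the hub."""
    return number_of_sites - 1


for sites in (2, 3, 12):
    print(sites, tunnels_direct(sites), tunnels_hub_and_spoke(sites))
```

For 12 sites this gives 22 and 11 respectively, matching the figures in the text.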

• Encryption

If site-to-site traffic encryption is a requirement for your environment, tunnels will be required at some point in your network. The WAN optimization devices that create tunnels to/from other WAN optimization devices can typically enable encryption very easily, as tunnels are already created and managed – simply tell the hardware (or software depending on the vendor) to start encrypting the traffic traversing the tunnel.

Without the option of creating tunnels terminating to/from the WAN optimization devices, it becomes necessary to provide encryption via other devices (firewalls, routers, etc. terminating IPSec tunnels).

WAN Optimization for Mobile Users

If your environment has needs for optimizing mobile users (remote laptop or desktop users), it’s important to assess what kind of operating systems you’ll be optimizing. Many vendors today support only MS Windows as the client operating system, although most are working on clients for Apple OS X and Linux. Depending on the operating systems used within your organization, you may or may not be able to deploy support for your mobile users.

Partnering for Success with Global Technology Resources, Inc.

Global Technology Resources, Inc. (GTRI) was founded in 1998, and has quickly become one of the leading high-end solutions providers and technology consulting firms in the U.S. At GTRI we leverage our experience for your business. Our seasoned consultants have an average of 10 years of experience in IT and networking combined with project experience ranging from small business to Fortune 1000 and carrier class networks.

GTRI has developed a framework (called the Strategic Delivery Framework, or SDF for short) that eliminates risk for our customers and ensures a successful project. This framework has several different phases; the phases key to a WAN optimization project are the envisioning, proposal and planning phases. WAN optimization projects benefit greatly not only from the GTRI real-world experience, but also from the holistic way that we approach the project using the SDF.


GTRI offers assistance in the selection of WAN optimization solutions, industry-leading implementation services as well as day-zero support (offered by the GTRI GlobalSure team). This full-cycle approach ensures the following:

• The best solution will be selected for your environment

• The implementation will meet and/or exceed industry best-practices

• Ongoing 24x7x365 support can be achieved through the GTRI GlobalSure program

• Future optimizations your organization may need can easily be accommodated by skilled GTRI professionals

GTRI WAN Optimization Solutions

Since there are numerous WAN optimization vendors in the marketplace today, features differ amongst vendors and models. This can increase the complexity in choosing a suitable WAN optimization solution. It’s critical to choose a partner that thoroughly understands the technologies involved with a WAN optimization solution.

GTRI is well-versed with WAN optimization technologies from the leading vendors. By working with GTRI, you’ll be able to ensure that the solution you choose meets your needs and delivers the benefits it can provide to your organization. Be sure to discuss your particular environment and needs with your local GTRI professional to begin the SDF process towards the selection of a WAN optimization solution that will meet your requirements. The SDF can reduce project risk by going through the proper steps (starting at the envisioning phase and working into the proposal and planning phases).

GTRI maintains a testing lab within the GTRI headquarters office that can demo several different WAN optimization solutions. If you’ve decided that you’re ready to see a practical application of WAN optimization technologies within your actual environment, GTRI has a mobile demo kit containing WAN optimization devices and routers, allowing for simulation of typical WAN links and showing the benefits of WAN optimization technologies within your environment! No settings need be changed on your network – simply plug the kit into your network and it can obtain an IP address automatically. A laptop can be connected to the demo kit, simulating a client on the remote side of the simulated WAN circuit. The software you use in your environment can be installed and executed on this laptop (or laptops) to show what WAN optimization can provide for your environment. Contact GTRI today to have a free WAN optimization demo shown at your location today!

Summary

User perception is king: if users are experiencing unacceptable application response times, both user productivity and morale will suffer. Before contacting your service provider to see about a larger WAN connection, evaluate WAN optimization. A good WAN optimization solution can make a T1 appear like a T3 connection, with no additional recurring line costs. Remember that “big pipes” (LFNs) still suffer from the same protocol deficiencies as “smaller pipes”.

WAN optimization solutions address many different aspects of poor network application performance, without deviating from the TCP RFCs. WAN optimization can be configured in a non-intrusive manner, allowing your organization’s existing QoS and firewall policies to remain in place (no major changes required). This simple, non-intrusive design also reduces the implementation effort, the complexity of the environment and the ongoing maintenance.

Other organizations may prefer to alter the network topology slightly by implementing a solution that utilizes tunnels. Additional benefits can be gained with this type of setup, but it may not be the best fit for your environment. GTRI can help you determine the best solution for your environment.

The integration of WAN optimization components is greater than the sum of the features. This is why it’s critical to have a thorough knowledge of WAN optimization features and solutions as well as the underlying network topology and application requirements. GTRI has highly trained professionals who can provide guidance in the selection of a WAN optimization solution, industry-leading professional implementation services and unique operational tools (SDF) to ensure a successful project.

For more information, please contact:

990 South Broadway, Suite 400

Denver, CO 80209

Email: [email protected]

Phone: (877) 603-1984 (toll-free)

(303) 455-8800 (Colorado)

