Best Practices for ISP Deployment
White Paper
Table of Contents

1 Network Design and Deployment Recommendations
  1.1 Design goals and assumptions
  1.2 Cache farm components and design
    1.2.1 Consumer Edge routers and Policy Based Routing
    1.2.2 L4-7 Load Balancing Switches
    1.2.3 L2 Switches
    1.2.4 ProxySG
    1.2.5 Designing for Bandwidth Gain
2 Proof of Concept
  2.1 PoC
3 Rollout and Service-In
4 Staging Environment and Change Management
  4.1 Staging environment
  4.2 Change management
5 Operations
  5.1 Monitoring
  5.2 Bypass Lists
  5.3 Common issues and resolutions
Appendices
  Appendix A. Recommended Settings for Service Providers
    Trust Destination IP (SGOS version 5.2 or above)
    HTTP Proxy
    HTTP Proxy Profile
    Connection Handling
    SGOS Feature Table
  Appendix B. Tuning Options
    Investigation and Reporting
    Tuning
  Appendix C. Alternative designs
    C.1 WCCP
  Appendix D. Vendor Specific Options
    D.1 F5
      Overview
      Cache Load Balancing
      Optimization options
Introduction
Blue Coat appliances are deployed primarily as a caching solution in Internet
Service Provider environments globally to improve the user experience and
deliver upstream bandwidth savings. Some ISPs also enable the content filtering
option for regulatory or cultural reasons. Deployments range from small ISPs to
Tier-1 providers that utilize dozens of Blue Coat ProxySG appliances.
This document presents recommended design and deployment best practices.
Configuration settings specific to ISPs are explained. Deployments that deviate
from these best practices may see less than optimal bandwidth savings,
stability, or overall performance. Some design alternatives are presented in
the appendix along with the caveats of such alternatives.
Network Design and Deployment Recommendations
1.1 Design goals and assumptions
Internet Service Providers deploy Blue Coat ProxySG devices in order to save
upstream bandwidth costs and improve the end-user experience. The goal of
this design and deployment best practice is to satisfy these requirements while
delivering a scalable and robust cache farm that requires minimal changes to
the existing network, complete transparency to end-users and servers, and can
be supported with the least possible impact to ongoing operations.
In most ISP networks bandwidth gain ranges from 20-30% but is highly
dependent on design decisions such as load balancing algorithms, the nature
of traffic and web access amongst an ISP’s users, as well as the sizing of the
ProxySG devices. Designing for bandwidth gain will be discussed later in this
section.
The assumption with this best practice design is that the ISP network follows
a topology similar to the diagram below. The cache farm should be deployed
at a point in the network that is a logical choke point to redirect web traffic
into the farm while minimizing the chances of asymmetric routing. Blue Coat’s
recommendation is to utilize Policy Based Routing on the core routers that
connect the consumer edge (CE) network. Port 80 (HTTP) egress and ingress
(bi-directional) traffic is routed from the CE routers to L4-7 load balancing
switches that then distribute the HTTP traffic to a “farm” of proxy/caches. The
CE routers are recommended because in most networks they provide a choke
point without asymmetric routes. The cache farm can also be tuned to the
requirements of the specific consumer network behind the CE routers.
[Figure: Typical ISP topology - the access edge (ADSL) connects through edge routers to the consumer edge (CE) routers, which connect to PE routers, BGP peering, the IP core, and Internet exchanges. PBR and the connection to the L4-7 switches should be done on the consumer edge (CE) routers.]
1.2 Cache farm components and design
As noted above the best practice design utilizes Policy Based Routing to L4-7
load-balancing switches that are connected to a cache farm. The design should
include the following:
-> L4-7 switches should be deployed in a redundant configuration off the core switches/routers
-> PBR should be routed to a floating IP/network address on the L4-7 switches for both ingress and egress traffic
-> L2 switches should be used between the L4-7 switches and the ProxySGs
-> ProxySGs should be dual homed to the L2 switches
-> L4-7 switches should be configured with a virtual IP (as in VRRP) to be used as the default gateway for the ProxySGs in the cache farm
-> L4-7 switches should be configured to use destination based load balancing
Based on the ISP topology above the cache farm topology should follow the
design shown below. Solid lines represent primary links and dotted lines
represent backup links.
[Figure: Cache farm topology - PBR on the consumer edge (CE) routers redirects traffic to a redundant pair of L4-7 switches (VIPs), with the CE routers connecting upstream to PE routers, BGP peering, the IP core, and Internet exchanges.]
1.2.1 Consumer Edge routers and Policy Based Routing
In order to eliminate ongoing changes or updates to configuration of core
routers the PBR setting should simply forward all port 80 traffic to the VIP of
the L4-7 switches. Any bypassing of port 80 traffic should be performed by the
L4-7 switches and/or the ProxySGs. PBR will need to be configured both for
egress and ingress.
In some deployments it may be required to only perform PBR on specific
subnets – for example, only the consumer DSL subnets but not the enterprise
customer subnets. Best practice is to perform PBR on the CE routers which
would avoid this issue since each set of CE routers would already be dedicated
to the different types of customers.
Important: Locally hosted Akamai servers or similar internal CDN
infrastructure should be excluded from the PBR.
Configuration notes for specific routers may be found in the Appendix.
1.2.2 L4-7 Load Balancing Switches
The L4-7 switches should be capable of load balancing based on destination.
Optimal caching performance will be obtained if the load balancing key is the
URI or a component of the URI rather than the destination IP.
L4-7 switch load balancing must be flow based instead of packet based.
Packets forwarded from the L4-7 switches must include the client IP address.
Proxies should use the L4-7 switches as their default gateway. The L4-7 switch
must be capable of passing the return traffic to the correct proxy.
Configuration notes for specific L4-7 switches may be found in the Appendix.
1.2.3 L2 Switches
L2 switches are used between the L4-7 switches and Proxy SGs to provide the
active-standby connectivity, future expansion and cost savings. Spanning tree
should not be necessary on the L2 switches and is not recommended due to the
extra overhead. This implies, of course, that the L2 switches are not directly
interconnected but that the L4-7 switches and Proxy SGs are dual-homed to
each L2 switch.
1.2.4 Proxy SG
SGOS version
Blue Coat recommends that all service providers run SGOS4.2 (the current
release as of 10/1/2008 is SGOS4.2.8.6).
ISP specific configuration
There are some configuration options of the Proxy SG recommended for ISPs
that veer from the ProxySG default settings. Please refer to the Appendices for
a detailed review of the recommended changes.
1.2.5 Designing for Bandwidth Gain
Caching is a statistics game. The Internet contains billions and billions of
objects. Each proxy can only cache a finite amount of content; in order to
provide optimal performance and bandwidth savings the most popular content
must be effectively distributed across the largest number of proxies possible.
Cache performance is a function of total number of disk spindles, memory, and
CPU. The bandwidth savings of the entire cache farm is tied to total number of
unique web objects the farm can cache, which in turn is tied to the total disk
spindles. Memory affects both how much data can be cached in RAM as well as
how many connections can be simultaneously handled. CPU dictates the total
throughput of an individual proxy. Sizing should be done to satisfy throughput
requirements as well as caching requirements. The solution must be capable
of handling peak HTTP load, and ideally there should be enough disk space to
store the equivalent of 2-3 days of HTTP traffic.
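To make this sizing guidance concrete, the rough Python sketch below works through the disk arithmetic; all input figures (peak load, utilization, days to store, usable disk per appliance) are hypothetical placeholders, not Blue Coat specifications.

# Rough cache-farm sizing sketch. All figures are hypothetical inputs;
# substitute measured values from your own network and platform data sheets.

def farm_disk_sizing(peak_http_mbps, avg_utilization, days_to_store, disk_per_proxy_gb):
    # Approximate bytes of HTTP traffic per day at the average utilization level.
    bytes_per_day = peak_http_mbps * avg_utilization * 1e6 / 8 * 86400
    gb_per_day = bytes_per_day / 1e9
    # Disk needed to hold the equivalent of `days_to_store` days of traffic.
    required_gb = gb_per_day * days_to_store
    proxies_for_disk = int(-(-required_gb // disk_per_proxy_gb))  # ceiling division
    return gb_per_day, required_gb, proxies_for_disk

if __name__ == "__main__":
    gb_day, total_gb, count = farm_disk_sizing(
        peak_http_mbps=2000,      # hypothetical peak HTTP load
        avg_utilization=0.6,      # average load relative to peak
        days_to_store=3,          # 2-3 days of traffic, per the guidance above
        disk_per_proxy_gb=1200,   # hypothetical usable cache disk per appliance
    )
    print(f"~{gb_day:,.0f} GB of HTTP traffic per day")
    print(f"~{total_gb:,.0f} GB of cache disk across the farm")
    print(f"at least {count} appliances for disk alone; check CPU and peak throughput separately")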
Caching Strategy
The load balancing solution should be capable of load balancing based
on destination, ideally based on components of the URL rather than by IP
destination. This prevents duplicate content from being distributed across the
caching layer. Additionally, the load balancing layer can be used to implement
and refine how the caches are deployed to maximize bandwidth gain. Refer to
the tuning section of the Appendices for more details.
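As a simplified illustration of why partitioning by a URL component matters, the Python sketch below compares hashing on the destination IP with hashing on the URL host; the cache names and addresses are hypothetical.

import hashlib

CACHES = ["cache-1", "cache-2", "cache-3", "cache-4"]

def pick_cache(key: str) -> str:
    # Stable hash of the chosen key (destination IP or URL host) onto the farm.
    digest = hashlib.md5(key.encode()).digest()
    return CACHES[int.from_bytes(digest[:4], "big") % len(CACHES)]

# One site served from several IP addresses (e.g. DNS round robin or a CDN).
requests = [
    ("www.example.com", "203.0.113.10"),
    ("www.example.com", "203.0.113.11"),
    ("www.example.com", "203.0.113.12"),
]

for host, ip in requests:
    print(f"by dest IP -> {pick_cache(ip)}    by URL host -> {pick_cache(host)}")
# Hashing on the destination IP can scatter one site's objects over several
# caches (duplicated content); hashing on the host keeps them on one cache.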
Platform Selection
Based on the above requirements there are two Blue Coat models that make
sense for service providers, the 810-25 and 8100-20. Each has different pros
and cons:
8100-20
• Pros
˚ 8 disks in a single 4U chassis
˚ 4 CPU cores allow 2-3x the throughput of the 810-25
• Cons
˚ Half the disk density of the 810-25 – 8 disks in 4U vs. 16

810-25
• Pros
˚ Allows for the greatest disk density – each system provides 4 spindles in a single rack unit
˚ 2 CPU cores allow peak throughput of 100-150 Mbps
• Cons
˚ Single power supply
˚ More systems to manage as compared to the 8100 platform
Proof of Concept
2.1 PoC
Proof of Concept testing should focus on validation of interoperability,
functionality, basic management, and troubleshooting. Performance testing
should be out of scope for Proof of Concept testing. Information regarding QA
and load testing done for Blue Coat’s products can be provided under NDA.
Due to the nature of testing Blue Coat functionality and features in an ISP
environment it is impractical to attempt to do any sort of simulated testing.
It’s simply not possible to generate traffic which simulates real users and
real internet servers. A simulated test can only give a general sense of how
much load a platform might handle. Bandwidth gain and precise performance
measurements could never be evaluated. Additionally, every ISP has a slightly
different environment: routers, switches, and L4-7 redirection gear. Generally
speaking this equipment is very expensive and cannot be easily or
effectively duplicated in a lab. Blue Coat regularly does simulated tests internally
and information on such tests can be provided to a customer as needed.
However, if part of the POC criteria requires testing under load, Blue Coat
strongly recommends using live traffic. The POC can follow the rough outline
illustrated here.
Lab Testing
In the event that a lab environment is maintained with identical equipment to the
production routing and switching environment then some basic lab integration
testing is recommended. The intent of this testing is to make sure that all the
pieces of equipment work together with the recommended software revisions.
This step is only useful if identical equipment exists in the lab; testing on smaller
versions of the same type of equipment will typically lead to misleading results.
It can be done, but the experience learned from that equipment should be
interpreted with caution; it should not be assumed that lessons learned on lower
classes of equipment apply directly to the high end.
Initial installation
The proxy should be installed in its final location in the network. Because
traffic must be redirected to the proxy, it can do no harm in this location.
At this point basic connectivity to the internet and client networks should
be established. Connections to the management network should also be set up at
this time.
Explicit Proxy Testing
The proxy should be configured with a rule set to deny all traffic except specific
source subnets. Explicit proxy should then be enabled. Basic load testing should
be done to ensure that the proxy has adequate connectivity to DNS, default and
backup routes, etc. Blue Coat typically recommends Web Timer as a simple tool to
do this sort of testing. This can also be used to illustrate performance gains and
bandwidth savings with a small group of web sites.
L4-7 testing
Testing basic L4-7 functionality will have some minimal impact on the
production network; the degree of impact depends on the topology chosen. As
a best practice Blue Coat recommends using policy based routing functionality
to route traffic to the L4-7 switching devices. At this point a specific test
network or set of IP addresses should be identified. In this case a test network
means a network without any real user traffic on it, solely testing workstations.
A policy based route must be added to the production routers to redirect port
80 traffic to the test network to the L4-7 device.
This should generally be configured in stages:
1st stage redirects traffic with a single PBR for outbound port 80 traffic. IP
spoofing is not enabled on ProxySG. Basic L4-7 functionality and load balancing
is then tested, things like failover can also be tested at this stage.
2nd stage adds a second PBR for source port 80 return traffic. This tests to
make sure the L4-7 device is returning the L4-7 traffic to the correct Blue Coat
device and that there are no asymmetric routes.
End to End Functionality testing
At this stage the customer should test from their test network all the basic
functionalities expected from Blue Coat.
Customer testing:
Identify specific customer subnets – redirect them, if desired during off-peak
hours. Monitor via access log, statistics, and actual end user experience from a
real client if possible.
Rollout and Service-In
If the PoC was conducted on the production network the service-in is
straightforward as the equipment is already in place.
Deploying into the live network follows the same steps recommended above for
Proof of Concept.
During the PoC or the rollout it is recommended that operations and support
procedures be validated. Common items to test are:
-> Software upgrade and downgrade/rollback
-> Configuration changes (see the Staging Environment and Change Management section below)
-> Troubleshoot and resolve a « problem website » using bypass lists (see the Operations section below)
-> SNMP traps for CPU load and memory pressure on the Proxy SGs
-> Failover and failback scenarios in the overall caching infrastructure (load balancer outage, proxy outage, etc.)
Staging Environment and Change Management
4.1 Staging environment
Blue Coat highly recommends that all service provider solutions also have a
staging environment. This staging environment does not necessarily need
to replicate the scale and completeness of the production network but can
be a smaller Proxy SG where configuration changes can be validated before
deploying into the live network.
The Proxy SG used for staging can be:
-> standalone in a lab where a client PC is placed on the LAN-side port and transparently proxied to the Internet with the test configuration
-> deployed in the live network but only available to the tester using explicit proxy
-> deployed in the live network alongside the production caches and the test host or network can be explicitly load balanced to it
4.2 Change management
Blue Coat highly recommends a strict change management policy combined
with the above staging environment to ensure minimal disruption to the
production network.
Change management should encompass:
-> SGOS version upgrades
-> Proxy SG configuration changes
-> Bypass list changes/updates
-> Load balancing configuration changes
Operations
5.1 Monitoring
ProxySG comes with industry standard SNMP v1, v2, and v3 support. Private
MIBs are available for download from the Blue Coat website. Monitoring of the
ProxySG is fully described in the document “Critical Resource Monitoring of the ProxySG”.
5.2 Bypass Lists
Bypass lists and the recommended dynamic bypass settings are described in Appendix A under Connection Handling.
5.3 Common issues and resolutions
Content Refresh
The Adaptive Refresh algorithm within ProxySG is the only technology in the
industry that develops a “Model of change” for every Web object in its store.
It also develops a “Model of use” based upon that object’s history of being
requested by users. It then combines these two pieces of information to
determine the refresh pattern appropriate to that object. (The values derived by
the algorithms also adapt to changes produced by these Models over time.)
Using the Adaptive Refresh algorithm the accelerator automatically performs
“freshness checks” with the origin server to ensure that old content is
expunged and replaced with fresh content. For example, if the objects within the
www.nbcnews.com home page are popular among the population of users that
are accessing the accelerator, ProxySG will update the objects that change (e.g.,
“top story” object) but will not refresh those objects that do not change (e.g.,
“NBC logo” object). This ensures that current content will be delivered to end
users quickly.
If an end user complains that stale content is being served from certain
websites, the ISP administrator can put the website URLs onto the local bypass
list. The downside is that ProxySG will then not cache content from those
websites. Another way to handle this situation is to develop CPL to verify each
request to these sites. If the content is the same as the cached copy, ProxySG
serves the cached content; if not, the latest content is pulled from the website.
Sample CPL to verify content:
; ---------------start here-----------------
define condition CHECK-HOSTS
  url.domain=cnn.com
  url.domain=www.bbcnews.com
  url.domain=windowsupdate.microsoft.com
end

<cache>
  condition=CHECK-HOSTS always_verify(yes)
; ----------------end here------------------
Note: Use “Always Verify” only for problematic websites
Appendices
Appendix A. Recommended Settings for Service Providers
Trust Destination IP (SGOS version 5.2 or above)
This feature trusts the client-provided destination IP address in the HTTP
request and does not perform a DNS lookup on it. It has an impact on cache
store performance because the IP address is used as the reference instead of
the domain name. This means the same web content on a load-balanced website
with multiple IP addresses can be stored multiple times, since it is considered
different objects. The cache hit ratio can also be affected for ISP customers
who are used to entering an IP address in the browser instead of a proper
domain URL.
In the case where the load balancing solution is packet based (not flow based),
such as WCCP, this feature is required.
Best Practice
-> Do not turn on unless it is required.
HTTP Proxy
always-verify-source – Ensures that every object is always fresh upon access.
This has a significant impact on performance because HTTP proxy revalidates
requested cached objects with the OCS before serving them to the client,
resulting in a negative impact on response times and bandwidth gain. Therefore,
do not enable this configuration item unless absolutely required.
max-cache-size – The maximum size, in megabytes, of the largest object that
can be stored on the ProxySG. The max-cache-size sets the maximum object size
for both HTTP and FTP.
Refresh Bandwidth – The ProxySG uses as much bandwidth as necessary for
refreshing to achieve the desired access freshness – 99.9% estimated freshness
of the next access. The amount of bandwidth used varies depending on client
demands. If you determine that the ProxySG is using too much bandwidth (by
reviewing the logged statistics and examining current bandwidth used shown in
the Refresh bandwidth field), you can specify a limit to the amount of bandwidth
the ProxySG uses to try to achieve the desired freshness.
This setting should only be changed from the default (Let the ProxySG Appliance
manage refresh bandwidth) to an explicit bandwidth limit if ISP upstream
bandwidth is at a premium.
Best Practice
• Always-verify-source should be disabled (default)
• For SG8100 models that use 300 GB hard drives, set max-cache-size to 100 MB. This allows more objects to be cached on each hard disk.
• For SG8000 models that use 37 GB hard drives, set max-cache-size to 10 MB.
• Use the default for Refresh Bandwidth, but modify if necessary.
Pragma-no-cache – The pragma-no-cache (PNC) header in a client’s request
can affect the efficiency of the proxy from a bandwidth gain perspective. If you
do not want to completely ignore PNC in client requests (which you can do
by using the Substitute Get for PNC configuration), you can lower the impact
of the PNC by enabling the revalidate-pragma-no-cache setting. When the
revalidate-pragma-no-cache setting is enabled, a client’s non-conditional
PNC-GET request results in a conditional GET request sent to the OCS if the
object is already in the cache. This gives the OCS a chance to return the 304
Not Modified response, thus consuming less server-side bandwidth, because
it has not been forced to return full content even though the contents have not
actually changed. By default, the revalidate PNC configuration is disabled and is
not affected by changes in the top-level profile. When the Substitute Get for PNC
configuration is enabled, the revalidate PNC configuration has no effect.
Note: The revalidate pragma-no-cache setting can only be configured through
the CLI.
Best Practice
-> Enable revalidate-pragma-no-cache from the CLI
Byte Range Support – With byte-range support enabled, if the object is already
cached and does not need to be reloaded from the OCS, the ProxySG serves the
byte-range request from the cache only. But if the object is not in the cache,
or if a reload of the object is required, the ProxySG might treat the byte-range
request as if byte-range support is disabled and serve the object from the
cache. If byte-range support is disabled, HTTP treats all byte-range requests
as non-cacheable. Such requests are never served from the cache, even if the
object exists in the cache. The client’s request is sent unaltered to the OCS and
the response is not cached. Thus a byte-range request has no effect on the
cache if byte-range support is disabled.
Best Practice
-> Enable byte range support (the default)
----------------------------------------------------------------------------------
HTTP Proxy Profile
Three preset profiles, Normal, Portal, and Bandwidth Saving, are selectable
on the SG. Each profile includes different settings for HTTP proxy behavior.
The Bandwidth Saving profile is recommended for ISP environments.
Best Practice
-> Enable the Bandwidth Saving profile
-> Disable Substitute GET for PNC (Pragma no cache)
-> Disable Substitute GET for IE (Internet Explorer) reload
----------------------------------------------------------------------------------
Connection Handling
Bypass Lists
A bypass list can be used to completely skip all ProxySG processing of requests
sent to specific destination hosts or subnets. This prevents the appliance
from enforcing any policy on these requests and disables any caching of the
corresponding responses, so it should be used with care. A bypass list allows
traffic to pass through to sites as-is when servers at the site are not properly
adhering to protocol standards or when the processing in the ProxySG is
otherwise causing problems. The bypass list contains IP addresses, subnet
masks, and gateways. When a request matches an IP address and subnet mask
specification in the bypass list, the request is sent to the designated gateway
and is not processed by the ProxySG.
Three types of bypass lists are available:
-> Local
-> Central
-> Dynamic
Local Bypass List
The local bypass list contains a list of IP addresses, subnet masks, and
gateways; the gateways specified in the list must be on the same subnet as the
ProxySG. The local bypass list limit is 10,000 entries. The list can also define
the default bypass gateway to be used by both the local bypass list and the
central bypass list. When you download a bypass list, the list is stored in the
appliance until it is replaced by downloading a new list.
Central Bypass List
The central bypass list is usually a shared list of addresses that is used by
multiple ProxySG Appliances. Because each ProxySG appliance can be located
on a different subnet and can be using different gateways, the central bypass
list should not contain any gateway addresses.
The gateway used for matches in the central bypass list is the gateway specified
by the bypass_gateway command in the local bypass list. If there is no bypass_
gateway option, the ProxySG uses the default gateway defined by the network
configuration.
Dynamic Bypass
Dynamic bypass, available through policy (VPM or CPL), can automatically
compile a list of requested URLs that return various kinds of errors. The
policy-based bypass list is maintained in the Forward Policy file or Local
Policy file. As with the static bypass lists described above, matching requests
skip all ProxySG processing and the corresponding responses are not cached, so
dynamic bypass should be used with care.
Dynamic bypass is most useful in an ISP environment. It keeps its own (dynamic)
list of which connections to bypass, where connections are identified by both
list of which connections to bypass, where connections are identified by both
source and destination rather than just destination. Dynamic bypass can be
based on any combination of policy triggers. In addition, some global settings in
HTTP configuration can be used to selectively enable dynamic bypass based on
specific HTTP response codes. Once an entry exists in the dynamic bypass table
for a specific source/destination IP pair, all connections from that source IP to
that destination IP are bypassed in the same way as connections that match
against the static bypass lists.
With dynamic bypass, the ProxySG adds dynamic bypass entries containing the
specific source/destination IP pair for sites that have returned an error to the
appliance’s local bypass list. For a configured period of time, further requests
for the error-causing URLs are sent immediately to the origin content server
(OCS), saving the ProxySG processing time. The amount of time a dynamic
bypass entry stays in the list and the types of errors that cause the ProxySG to
add a site to the list, as well as several other settings, are configurable from the
CLI. Please refer to Blue Coat Configuration and Management Guide for detail.
Once the dynamic bypass timeout for a URL has ended, the ProxySG removes
the URL from the bypass list. On the next client request for the URL, the
ProxySG attempts to contact the OCS. If the OCS still returns an error, the URL
is once again added to the local bypass list for the configured dynamic bypass
timeout. If the URL does not return an error, the request is handled in the
normal manner.
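The behavior described above can be pictured with the small Python model below of a source/destination bypass table with a timeout; it is only a conceptual sketch, not ProxySG code, and the class and method names are invented for illustration.

import time

class DynamicBypassTable:
    """Toy model of the dynamic bypass behavior described above."""

    def __init__(self, timeout_s=60, max_entries=16000):
        self.timeout_s = timeout_s
        self.max_entries = max_entries
        self._entries = {}          # (client_ip, server_ip) -> expiry timestamp

    def record_error(self, client_ip, server_ip):
        # An error from the OCS adds a source/destination pair to the table.
        if len(self._entries) < self.max_entries:
            self._entries[(client_ip, server_ip)] = time.time() + self.timeout_s

    def should_bypass(self, client_ip, server_ip):
        # Later requests for the same pair go straight to the OCS until expiry.
        expiry = self._entries.get((client_ip, server_ip))
        if expiry is None:
            return False
        if time.time() > expiry:
            del self._entries[(client_ip, server_ip)]   # entry timed out
            return False
        return True

table = DynamicBypassTable(timeout_s=60)
table.record_error("198.51.100.7", "192.0.2.80")
print(table.should_bypass("198.51.100.7", "192.0.2.80"))   # True: bypassed
print(table.should_bypass("198.51.100.8", "192.0.2.80"))   # False: different source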
Default Setting
Dynamic Bypass is disabled
server_bypass_threshold = 16
max_dynamic_bypass_entry = 16,000
dynamic_timeout = 60
Best Practice
-> Enable dynamic bypass with triggers for connection and receive errors.
-> Adjust the default settings to suit network conditions.
-> Individual HTTP response codes (e.g. 404) can be configured as triggers if necessary.
HTTP Timeout
You can configure various network receive timeout settings for HTTP
transactions. You can also configure the maximum time that the HTTP proxy
waits before reusing a client-side or server-side persistent connection. You
must use the CLI to configure these settings. The default HTTP receive timeout
is 120 seconds for the client and 180 seconds for the server. The default HTTP
persistent connection timeout is 360 seconds for the client and 900 seconds for
the server. It is recommended to lower the timeout values in an ISP environment
so that the ProxySG can reuse its client/server workers effectively.
Best Practice
-> http persistent-timeout server 20
-> http persistent-timeout client 10
Error Handling
Default proxy behavior when there are TCP timeouts or DNS failures is to
display an error page generated by the proxy. In an ISP environment it is
preferred to instead terminate the connection and let the end user’s browser
display an error page. The following sample CPL code is used to achieve this:
; ---------------start here-----------------
<exception>
  terminate_connection(yes)
<exception>
  exception.id=tcp_error terminate_connection(yes)
<exception>
  exception.id=!policy_denied terminate_connection(yes)
<exception>
  exception.id=!policy_denied;!user-defined.my_exception terminate_connection(yes)
<exception>
  exception.id=!policy_denied;!user-defined.all terminate_connection(yes)
; ----------------end here------------------
Attack Detection (5.x Proxy Edition)
The ProxySG prevents attacks by limiting the number of simultaneous TCP
connections from each client IP address and either does not respond to
connection attempts from a client already at this limit or resets the connection.
It also limits connections to servers known to be overloaded. You can configure
attack detection for both clients and servers or server groups, such as http://
www.bluecoat.com. The client attack-detection configuration is used to control
the behavior of virus-infected machines behind the ProxySG. The server attack-
detection configuration is used when an administrator knows ahead of time that
a virus is set to attack a specific host. This feature is only available through the
CLI. You cannot use the Management Console to enable attack detection.
It is very common for ISP residential customers to install broadband sharing
routers. The ISP has to weigh its acceptable use policy or terms of service to
decide what the connection limit should be. The global default for the maximum
client connection count is 100. If only HTTP traffic is redirected to the
ProxySG via an L4-7 switch or other methods, the connection limit applies to
HTTP only. If the ProxySG is deployed in-line, the connection limit includes
all TCP connections.
-> Enable Attack Detection for client and modify default connection-limit in CLI based on ISP internal policy.
-> Different connection-limit can be configured for client IP or subnet. If corporate customers are on one IP range, they can be set to have a higher connection limit.
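Purely as a conceptual illustration of per-client connection limiting (not the ProxySG implementation), the Python sketch below applies the global default limit of 100 with a higher override for a hypothetical corporate subnet.

import ipaddress
from collections import defaultdict

DEFAULT_LIMIT = 100                                               # global default noted above
SUBNET_LIMITS = {ipaddress.ip_network("203.0.113.0/24"): 500}     # hypothetical corporate range

active = defaultdict(int)        # client IP -> current simultaneous TCP connections

def limit_for(client_ip: str) -> int:
    addr = ipaddress.ip_address(client_ip)
    for net, limit in SUBNET_LIMITS.items():
        if addr in net:
            return limit
    return DEFAULT_LIMIT

def admit(client_ip: str) -> bool:
    # Refuse (or reset) new connections from clients already at their limit.
    if active[client_ip] >= limit_for(client_ip):
        return False
    active[client_ip] += 1
    return True

def close(client_ip: str) -> None:
    active[client_ip] = max(0, active[client_ip] - 1)

print(admit("198.51.100.20"))    # True until this client reaches 100 connections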
----------------------------------------------------------------------------------
SGOS Feature Table
Feature                          SGOS 4.x   SGOS 5.2   SGOS 5.3   SGOS Mach5
Attack Detection                 Yes        Yes        Yes        After 5.2.4.8
WCCP, L2 Forward, GRE Return     Yes        No         No         No
WCCP, L2 Forward, L2 Return      No         Yes        Yes        Yes
Trust Destination IP             No         Yes        Yes        Yes
Overload Bypass                  No         No         Yes        No
Appendix B. Tuning Options
Tuning is a process. It is typically a combination of making modifications to the
redirection layer combined with modifications to the caching policies on the
ProxySG devices. The basic tuning processes are described here along with
some high level examples. Due to the advanced investigation and configuration
required during this process it’s recommended that tuning be handled using
professional services from qualified partners and/or the vendors.
Investigation and Reporting
No tuning can be done without an investigation into the types of traffic passing
through the Blue Coat proxies. While continuous access logging and reporting is
not recommended in a service provider environment, it can be a powerful tool to
use on an ad hoc basis for tuning.
While the ProxySG default log format can be used for analysis, in depth analysis
would benefit from the following format:
date time time-taken rs-time-taken c-ip sc-status s-action sc-bytes cs-bytes sr-bytes rs-bytes cs-method cs-uri-scheme cs-host cs-uri-port cs-uri-path cs-uri-query s-hierarchy s-supplier-name rs(Content-Type) cs(User-Agent) s-ip
A general picture of the HTTP traffic profile can be determined with just a
day’s worth of log data. More than this will just burden the Reporter server. A
bandwidth gain analysis profile in Blue Coat Reporter can then be used for in
depth analysis of sites, content types, and caching ability.
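As an example of the kind of ad hoc analysis this log format enables, the Python sketch below tallies client-side versus server-side bytes to estimate bandwidth gain; it assumes the exact field order shown above, and the log file name is hypothetical.

# Minimal sketch: estimate bandwidth gain from an access log written in the
# exact field order listed above. "access.log" is a hypothetical file name.
FIELDS = ("date time time-taken rs-time-taken c-ip sc-status s-action "
          "sc-bytes cs-bytes sr-bytes rs-bytes").split()
SC_BYTES, RS_BYTES = FIELDS.index("sc-bytes"), FIELDS.index("rs-bytes")

client_side = server_side = 0
with open("access.log") as log:
    for line in log:
        if line.startswith("#"):                  # skip ELFF header lines
            continue
        parts = line.split()
        if len(parts) <= RS_BYTES:
            continue
        try:
            client_side += int(parts[SC_BYTES])   # bytes delivered to clients
            server_side += int(parts[RS_BYTES])   # bytes fetched from origin servers
        except ValueError:                        # "-" or malformed fields
            continue

if client_side and server_side:
    # One common definition of bandwidth gain: delivered bytes / fetched bytes.
    print(f"bandwidth gain ~ {client_side / server_side:.2f}x "
          f"({(1 - server_side / client_side) * 100:.1f}% upstream savings)")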
Tuning
The investigation stage will highlight the Internet sites meriting further tuning.
In some cases the discoveries here will influence changes to the load balancing
layer, the cache configuration, or both. Sites responsible for large amounts of
traffic such as YouTube.com may merit dedicated caching devices – devices
dedicated to specific sites can achieve much higher hit rates than devices
which have the entire Internet competing for cache space. Certain sites may
benefit from different load balancing hashes as well.
An example of this would be separating out a dedicated set of proxies for
caching YouTube.com. The YouTube host or IPs can be identified at the
redirection layer, balancing traffic across those proxies. Because of the
way YouTube structures its URIs, a standard hash-based load balancing
approach would be sub-optimal; a load balancing scheme that supports
sending the same video to the same proxy is the most important factor. On the
proxies themselves additional policies are required to make YouTube's videos
cacheable.
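A minimal Python sketch of the idea described above, with hypothetical host patterns and pool names: requests for the video site are steered to a dedicated pool, and within that pool the video identifier, rather than the full URL, selects the proxy.

import hashlib
from urllib.parse import urlsplit, parse_qs

GENERAL_POOL = ["cache-1", "cache-2", "cache-3"]
VIDEO_POOL   = ["video-cache-1", "video-cache-2"]          # dedicated devices
VIDEO_HOSTS  = ("youtube.com", "www.youtube.com")           # hypothetical match list

def _pick(pool, key):
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")
    return pool[h % len(pool)]

def route(url: str) -> str:
    parts = urlsplit(url)
    if parts.hostname in VIDEO_HOSTS:
        # Hash on the video identifier so the same video always hits the same
        # proxy, even though the rest of the URL varies per request.
        video_id = parse_qs(parts.query).get("v", [parts.path])[0]
        return _pick(VIDEO_POOL, video_id)
    return _pick(GENERAL_POOL, parts.hostname or "")

print(route("http://www.youtube.com/watch?v=abc123&feature=related"))
print(route("http://www.example.com/index.html"))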
Appendix C. Alternative designs
Design options discussed in this appendix are not recommended by Blue Coat
but are discussed because in some environments they are unavoidable. Issues
such as sub-optimal bandwidth gain are likely to occur when deployments do
not follow best practice.
C.1 WCCP
WCCP is a Cisco®-developed protocol that allows you to establish redirection of
the traffic that flows through routers.
Common reasons for using WCCP are:
-> Legacy. Cisco routers are already deployed in the network and other load balancers are not considered a valid option.
-> Scalability. With no reconfiguration overhead, redirected traffic can be automatically distributed to up to 32 ProxySG Appliances.
-> Redirection safeguards. If no ProxySG Appliances are available, redirection stops and the router forwards traffic to the original destination address.
WCCP has two versions, version 1 and version 2, both of which are supported by
Blue Coat. However, only one protocol version can be active on the ProxySG at a
time. The active WCCP protocol set up in the ProxySG configuration must match
the version running on the WCCP router.
WCCP version 2 is preferred as it supports multiple service groups and
redirection of other TCP ports and protocols. By default, Cisco’s GRE
encapsulation (Generic Routing Encapsulation) is used to forward packets
from the WCCP router to the caches. If you have a version 2 WCCP router,
you can alternatively use Layer 2 (L2) rewrites to forward packets, which is
faster than GRE and saves network bandwidth. Using GRE, redirected packets
are encapsulated in a new IP packet with a GRE header. Using L2, redirected
packets are not encapsulated; the MAC address of the target cache replaces the
packet’s destination MAC address. This different way of directing packets saves
you the overhead of creating the GRE packet at the router and decoding it at the
cache. Also, it saves network bandwidth that would otherwise be consumed by
the GRE header. The SG appliance also supports the L2 packet return method if
the router software does.
Caveats for using L2 redirection:
-> You must use WCCP version 2.
-> If a cache is not connected directly to a router, the router does not allow the cache to negotiate the rewrite method.
-> The same rewrite method must be used for both packet forwarding and packet return.
In a load-balanced SG group, the mask scheme should be used rather than the
hash scheme. WCCP will distribute redirected traffic to each SG in the pool
based on destination IP. This yields a better cache hit ratio, as each SG
caches only a range of IP addresses rather than the entire Internet. The mask
assignment method can be used
only with the Catalyst 6500 Series switches and Cisco 7600 series routers.
Specifically, the “Supervisor Engine II with Policy Feature Card 2 (PFC2 and
newer) and MSFC2, 256-MB memory option” is required.
Best Practice
-> Beware of the minimum Cisco IOS version required for WCCP support
-> Use WCCP version 2
-> Use the mask assignment scheme with destination IP
-> Use L2 forward/return to reduce CPU load and network bandwidth
-> With SGOS4, WCCP can only be supported with IP spoofing combined with source-based load balancing; destination-based load balancing with IP spoofing is not possible with SGOS4.
Notes
The basic WCCP assumption when using IP spoofing is that the client side
source/destination pair must match the server side source/destination pair.
This is required so the packets get load balanced to the correct proxy from both
directions.
If the source mask on the client side is used then this works fine, as the source
mask and return destination mask are always tied to the client IP, and that
client will get locked to a proxy.
However, if destination mask is used on the client side asymmetric load
balancing problems can occur. In this case, the client does a DNS lookup,
resolves server A, and makes a request to server A. Server A gets load balanced
to Proxy A. Now Proxy A does a DNS lookup, and because of DNS load balancing
or location based DNS, DNS resolves to server B. So Proxy A initiates the server
side of the connection to server B. Server B responds, and since ProxySG went
to a different server, it matches the mask differently. The return traffic gets
load balanced to Proxy B. Since Proxy B did not initiate this connection, it
drops the packet.
Trust Destination IP fixes this because it guarantees the proxy uses the same
destination as the client, so the return traffic goes to the same proxy.
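The failure mode described above can be reproduced with the toy Python model below; the addresses and the destination "mask" (here simply the low bit of the last octet) are illustrative stand-ins for the real WCCP assignment.

# Toy model of destination-based WCCP load balancing with IP spoofing.
PROXIES = {0: "proxy-A", 1: "proxy-B"}

def proxy_for(server_ip: str) -> str:
    # Stand-in for the router's destination mask: here just the low bit of the
    # server address's last octet.
    return PROXIES[int(server_ip.rsplit(".", 1)[1]) % 2]

server_a, server_b = "192.0.2.2", "192.0.2.3"   # same site, DNS-load-balanced IPs

# The client resolved server A, so its outbound request is redirected based on
# server A's address and lands on one proxy:
forward_proxy = proxy_for(server_a)

# That proxy does its own DNS lookup, gets server B, and opens the server-side
# connection there. With IP spoofing, the return traffic from server B is
# redirected based on server B's address:
return_proxy = proxy_for(server_b)

print(forward_proxy, return_proxy)
if forward_proxy != return_proxy:
    print("Return traffic reaches a proxy that never opened the connection and is dropped.")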
Appendix D. Vendor Specific Options
D.1 F5
Overview
F5 Networks, Inc. originally manufactured some of the first load balancing
products. Today, F5 still holds the leading position and has extended its reach
beyond load balancing, producing products for what is known today as
“Application Traffic Management”. These products go beyond routing traffic to
the most available server; they improve the delivery of the application itself
by making the application run faster and more securely.
Such functionality includes:
-> SSL Acceleration
-> Intelligent Compression
-> L7 Rate Shaping
-> Caching
-> Local traffic management
-> Global traffic management
-> Application Firewall
-> Link/Internet Service Provider (ISP) Load balancing
-> Software Development Kit (SDK) and Application Programming Interface (API)
-> Network Management
-> Global Product Support Network – being the de facto standard in Oracle and Microsoft designs
Cache Load Balancing
The key features required for cache load balancing are:
1 Single-node persistency (most important)
2 The least impactful redistribution of traffic
3 Simple implementation
Load balancing of cache and proxy servers has long been a standard task for
F5’s BIG-IP LTM. However, cache and proxies have unique requirements over
other types of applications. It’s not realistically possible to cache an entire site,
let alone the Internet, on a single cache device. The solution is to intelligently
load balance requests and ‘persist’ them to a specific cache not based on who
the client is, but based on the content requested. In a forward proxy example, all
requests for site1.com would go to cache1 while all requests for site2.com go
to cache2. Since each cache gets all the requests for their respective sites, they
will have the most up-to-date version of the content cached. This also allows
the caches to scale more efficiently, as each only needs to cache N percent
of the possible destination content. In BIG-IP LTM nomenclature, adding more
members (nodes or servers) to the pool reduces the percentage of content, such
as URIs, that any cache server needs to store.
While this section addresses cache load balancing, the concepts and scripts are
easily adapted to a wide range of uses. The solutions discussed cover all
BIG-IP LTM 9.x versions, except for the Election Hash iRule, which takes
advantage of commands added in version 9.4.2.
Hash Persistence on BIG-IP LTM
There are unlimited variations of hashing available with the LTM’s iRules. This
section will focus on three of the most representative of these options, pointing
out the strengths and weaknesses of each. The goal of each is the same, to
ensure an equal and consistent distribution of requested content across an
array of devices. This document focuses on hashing the HTTP URI, however this
can easily be substituted for other content such as [HTTP::uri], [HTTP::host],
[HTTP::cookie <cookiename>] or [IP::client_addr], to name a few. For a site
doing virtual hosting, it may be worthwhile to use [HTTP::host][HTTP::uri] which
would capture a request such as: [www.site1.com][/index.html]
Hashing is both a persistence and load balancing algorithm. While you can
configure a load balancing algorithm within a pool, it will always be overridden
by the hash if everything is working correctly.
By default the LTM load balances new connections, which are different than
requests. OneConnect must be enabled in order to have the LTM separate and
distinctly direct unique HTTP requests over an existing TCP connection. See
Devcentral post for more information.
Basic Hash
To set up hash load balancing based on an HTTP URI, first create a new iRule
such as below.
when HTTP_REQUEST {
  persist hash [HTTP::uri]
}
Second, create a new Persistence Profile, which inherits its properties from
the Hash Persistence method. Select the iRule as the one created in the first
step. Set the timeout to a relevant value, which likely should be set fairly high
for hash based persistence. The other properties can be ignored. From within a
test virtual server’s resources section, set the persistence type to the new Hash
Persistence profile just created.
While Basic Hash is easy to set up, it has many drawbacks.
-> The LTM is maintaining each object as an entry in memory. This persistence table could potentially exceed the available memory of the LTM.
-> You must specify a length of time for which the LTM will keep an object persisted to a node. If the object isn’t requested for that period of time, the entry will expire and is eligible to be reload-balanced on a subsequent request.
-> A node newly added to a pool will not receive requests. If all objects are already persisted to existing nodes, the new node will remain unused until the content changes or the content’s persistence timer expires.
-> The persistence table is volatile and must be mirrored to the standby LTM, causing overhead.
Typical Hash
Instead of relying on maintaining a record in the LTM’s memory, a more robust
alternative is to use a mathematical equation to make the hash decision. This
ensures that requests for the same content are sent to the same server without
the drawbacks of the previous persistence method.
Using the following calculation the LTM converts the requested content into
a number (crc32), then matches that number with the Nth server in the pool
(modulo ‘%’).
Selected Node = Position_In_Pool = [crc32 [ URI ]] % [Number-Of-Pool-Members]
The following example fills in the variables and steps through the operation
used to select the desired node.
Nodes available in pool = 4
URI = index.html
Applying the variables: [crc32 [ index.html ]] % 4
Performing the crc32 hash: [ 350811804 ] % 4
Performing the modulo: 350811804 / 4 = 87702951 remainder 0
Selected node is: 0
This operation returns an answer of 0, which tells the BIG-IP to send this
request for index.html to the first cache server in the pool (Node 0), which
in Figure 1.1 is server 172.29.4.53.
URI: index.html

Node           Position
172.29.4.53    0
172.29.4.54    1
172.29.4.55    2
172.29.4.56    3

Figure 1.1
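The same selection logic can be sketched in Python; zlib's CRC-32 stands in for the LTM crc32 command, so the absolute hash value will differ from the worked example above, but the modulo selection behaves the same way.

import zlib

POOL = ["172.29.4.53", "172.29.4.54", "172.29.4.55", "172.29.4.56"]

def pick_node(uri: str, pool) -> str:
    # Typical hash: crc32 of the requested content, modulo the number of members.
    return pool[zlib.crc32(uri.encode()) % len(pool)]

print(pick_node("/index.html", POOL))         # always the same node...
print(pick_node("/index.html", POOL[:-1]))    # ...until pool membership changes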
What’s the catch? A problem arises when the node list of the pool is altered.
This could be the result of a health check changing the state of a node or an
operator adding or removing a server. In figure 1.1 all four nodes are available,
while in figure 1.2 the .56 node has been marked down. The same formula
continues to be applied. If the hashed URI returns a score of …04, then with 4
nodes in the pool the BIG-IP would pick node 0 (172.29.4.53). If a node becomes
unavailable, as in figure 1.2, the resulting modulo calculation is now performed
against the 3 remaining nodes. Node 0 (172.29.4.53) has the desired fresh content
cached, but now all requests for this object are sent to Node 1, 172.29.4.54,
which does not have the object cached.
URI: index.html

Node           Position
172.29.4.53    0
172.29.4.54    1
172.29.4.55    2
172.29.4.56    down

Figure 1.2
As shown above between figures 1.1 and 1.2, whenever there is a pool
membership change, all hashing is recalculated against the new number and
order of pool members. This is devastating for caches as it may take hours
or days to expire and repopulate their contents with the new data. If another
change in pool membership occurs again it’s possible that the servers never
reach their potential. Some variations of this method add an additional layer
that minimizes the impact from a node being marked up or down from health
checks but they add load to the system without correcting the underlying
problems.
F5 documents this limitation in Solution ID SOL6586. This problem is not unique
to F5 or the BIG-IP LTM. Customers have long been plagued by this industry-
wide limitation due to the nature of simple hashing.
Pros
• Unlimited number of cache objects; it's an algorithm, not a memory-based solution
• A simple math operation can scale to a very high number of requests per second
Cons
• Cache hit rate is extremely dependent upon server stability. Adding one server to a pool can invalidate every cache's stored content
• Hash disruption can cause caching servers to overwhelm origin servers and cause site failures
Typical Hash iRule: Remember to change <pool> to the name of your pool.
when HTTP_REQUEST {
  set picked [lindex [active_members -list <pool>] [expr [crc32 [HTTP::uri]] % [active_members <pool>]]]
  pool <pool> member [lindex $picked 0] [lindex $picked 1]
}
Election Hash
Instead of a single calculation to determine node position, the requested content
plus the node IP addresses are calculated. The LTM numerically ranks the
resulting answers across all available nodes, similar to an election. So instead
of a single calculation, a separate calculation is made for each of the possible
nodes in the pool.
Typical hash algorithm:
Selected Node = Position-In-Pool = [crc32 [ URI ]] % [Number-Of-Pool-Members]
Election hash algorithm:
Selected Node = Highest-Node-Score = [crc32 [Node-IP-Address + URI ]]
A given node will always attain the same score for a unique object regardless of
its order within the pool. This calculation is performed for every member within
the pool, using its unique IP within the calculation. The request is sent to the
node with the highest “score”, highlighted in yellow in the following examples.
Node            url_1   url_2   url_3   url_4
172.29.4.53     8       4       5       0
172.29.4.54     2       9       1       7
172.29.4.55     7       0       7       6
172.29.4.56     4       4       6       8

Figure 1.3 (all four nodes available)
Node            url_1   url_2   url_3   url_4
172.29.4.53     8       4       5       0    (disabled)
172.29.4.54     2       9       1       7
172.29.4.55     7       0       7       6
172.29.4.56     4       4       6       8

Figure 1.4 (the .53 node has been disabled)
Compare figures 1.3 and 1.4. The formula has not changed, so each remaining
node would calculate the same answer. The request for url_1 can’t be won by
the .53 server since it has been disabled, so the next highest answer will be
from the .55 node. The only traffic shift that would occur is for requests that
had previously been directed to the .53 node. Requests for other content remain
unaffected by the pool membership change.
Node            url_1   url_2   url_3   url_4
172.29.4.53     8       4       5       0
172.29.4.54     2       9       1       7
172.29.4.55     8       0       7       6
172.29.4.56     4       4       6       8
172.29.4.57     3       3       9       2

Figure 1.5 (a new .57 node added to the pool)
When a new node is added to the pool it would participate in the numeric
election system. In figure 1.5, the new .57 node would take over an average of
20% of the requests. The only traffic shift that would occur is for elections the
.57 node has now won.
This system scales infinitely to the number of unique requests that can be made
by a client. Pool membership changes from adding, removing, enabling, or
disabling a node have no greater effect on the remaining members than the
assumption or relinquishment of that node's portion of the traffic.
Pros
• Scales to an infinite number of requested objects
• Gracefully handles pool membership changes
Cons
• Instead of one calculation per request, the LTM must perform one calculation per request per node
Election Hash iRule: Note that this iRule requires version 9.4.2+. Change <pool> to
your pool name.
when HTTP_REQUEST {
  set High_Score -9999999999
  set Node_Picked ""
  foreach Cur_Node [active_members -list <pool>] {
    if { [crc32 $Cur_Node[HTTP::uri]] > $High_Score } {
      set High_Score [crc32 $Cur_Node[HTTP::uri]]
      set Node_Picked $Cur_Node
    }
  }
  pool <pool> member [lindex $Node_Picked 0] [lindex $Node_Picked 1]
}
LTM log results with four nodes active in the pool:
Rule CARP : Node: 172.29.4.53/32 Score: 917
Rule CARP : Node: 172.29.4.54/32 Score: 67
Rule CARP : Node: 172.29.4.55/32 Score: 74
Rule CARP : Node: 172.29.4.56/32 Score: 577
Rule CARP : Picked Node: 172.29.4.53 URI: /images/gray-bullet.gif Highest Score: 917
The results when node .53 is unavailable:
Rule CARP : Node: 172.29.4.54/32 Score: 67
Rule CARP : Node: 172.29.4.55/32 Score: 74
Rule CARP : Node: 172.29.4.56/32 Score: 577
Rule CARP : Picked Node: 172.29.4.56 URI: /images/gray-bullet.gif Highest Score: 577
The iRule correctly scored each possible node against the object requested and
selected the highest scoring server. Removing a node only affected traffic which
that node had previously won.
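The election approach is essentially highest-score (rendezvous, or CARP-style) hashing, and its disruption behavior can be compared against simple modulo hashing with the short Python simulation below; zlib's CRC-32 again stands in for the LTM crc32 command, so only the relative behavior, not the scores, matches the log output above.

import zlib

def election_pick(uri: str, nodes) -> str:
    # Score every node against the requested object; the highest score wins.
    return max(nodes, key=lambda node: zlib.crc32((node + uri).encode()))

def modulo_pick(uri: str, nodes) -> str:
    return nodes[zlib.crc32(uri.encode()) % len(nodes)]

nodes = ["172.29.4.53", "172.29.4.54", "172.29.4.55", "172.29.4.56"]
uris = [f"/object{i}.html" for i in range(10000)]

def moved(pick, before, after):
    return sum(pick(u, before) != pick(u, after) for u in uris) / len(uris)

# Remove one node and measure how many objects change owner.
print(f"modulo hash  : {moved(modulo_pick,   nodes, nodes[:-1]):.0%} of objects move")
print(f"election hash: {moved(election_pick, nodes, nodes[:-1]):.0%} of objects move")
# Election hashing moves only roughly the departed node's share (~25% here),
# while simple modulo hashing reshuffles most of the objects.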
How Does Election Scale
It is possible to estimate the percentage of CPU required to run this iRule a
specific number of times per second. This number is directly affected by the
number of nodes within the pool, as this dictates the number of crc32 and
variable set operations that need to be performed per query.
For information regarding iRule performance, start with Devcentral.f5.com's
posts regarding performance and the "timing on" command. For additional
troubleshooting, take advantage of the log capabilities within the iRule. Keep in
mind that leaving timing and log commands in the iRule during production may
have a significant impact on overall iRule CPU usage.
Limit the size of the object that is being hashed against the crc32. For instance,
use [HTTP::uri] instead of [HTTP::host][HTTP::uri] which may not provide any
additional level of uniqueness. If the object being queried is a dynamic site,
consider using [HTTP::path] instead of [HTTP::uri]. This will ignore additional
parameters that may not be relevant to uniquely identify the page. Be sure to
check out Devcentral for additional information on these options.
An important note regarding performance numbers such as those illustrated in
Figure 1.6: your mileage will vary. These are simulated examples of the iRule
utilizing 100% of the LTM CPU, which is not a realistic scenario.
LTM will always need to reserve CPU to push packets once the load balancing
decision has been made. The amount of CPU required for pushing packets
or other services will depend on factors such as the model of LTM, single
processor versus CMP, the number of requests per second, the desired MB/s
of throughput, and the requested object size. The matrix in Figure 1.6 shows
the relationship between maximum iRule executions per second versus the
number of nodes in the pool. When load balancing requests for a 32k object
across 8 nodes, the task of transferring the object might consume 5 times more
CPU than the iRule. The important point is that while the task of transferring the
requested object remains the same, the impact on the iRule is directly affected
by the number of nodes in the pool.
Figure 1.6
Alternate Rule for More Nodes
As figure 1.6 shows, the requests per second are directly affected by the
number of nodes. In order to improve upon the scalability, the iRule needs to be
hashing the request against a smaller group of nodes at a time. The following
example will step through 200 nodes distributed across multiple smaller pools
of nodes.
1 Instead of a single pool of 200 nodes, create 10 pools of 20 nodes each. Name the pools “pool_0”, “pool_1”, “pool_2” .… “pool_19”. The actual number of pools can be defined as per the iRule below.
2 When a new request is made the iRule performs a simple URI hash and modulo calculation producing a number between 0 – 9. The rule uses this number to determine which pool to use.
3 LTM performs the more expensive URI + Node IP hashing routine against the 20 nodes within the chosen pool, rather than against the entire 200 nodes available.
Pros
• This variation can accommodate any reasonable number of nodes
• Adding nodes to existing pools maintains the minimal impact of the Election Hash method
• Easy to tune load to servers of varying disk and processor capacity. For instance, if one pool has 10 powerful servers and another pool has 20 smaller servers, on average the servers in the second pool would receive half the load / unique requests.
Cons
• If a node becomes unavailable, only the remaining nodes within the selected pool divide the load. In the current example, the remaining 19 nodes would take over the requests for the missing node, rather than all 199.
• Adding a new pool of nodes incurs the same drawbacks previously discussed with Typical Hashing. Specifically, this resets the persistence distribution and should be done during an outage window with a controlled ramp-up after making such a change. If the environment calls for adding and removing large numbers of nodes seamlessly, it is possible to use the election hashing method to distribute across the multiple pools and then use the election hashing method again against the individual nodes within the chosen pool. Performing this function twice will impact performance, but it does allow for large-scale changes in environments that are sensitive to downtime.
Alternate Election Hash iRule: Note that this iRule requires version 9.4.2+. Adjust the pool name prefix (pool_) and the number of pools to match your environment.
when HTTP_REQUEST {
  set High_Score -9999999999
  set Node_Picked ""
  set Pool_Picked pool_[expr {[crc32 [HTTP::uri]] % 10}]
  foreach Cur_Node [active_members -list $Pool_Picked] {
    if { [crc32 $Cur_Node[HTTP::uri]] > $High_Score } {
      set High_Score [crc32 $Cur_Node[HTTP::uri]]
      set Node_Picked $Cur_Node
    }
  }
  pool $Pool_Picked member [lindex $Node_Picked 0] [lindex $Node_Picked 1]
}
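A compact Python sketch of the same two-level scheme, using a hypothetical layout of 10 pools of 20 nodes and zlib's CRC-32 as a stand-in hash:

import zlib

# Hypothetical layout: 10 pools ("pool_0" .. "pool_9") of 20 nodes each.
POOLS = {f"pool_{p}": [f"10.0.{p}.{n}" for n in range(1, 21)] for p in range(10)}

def pick(uri):
    # Stage 1: cheap modulo hash of the URI chooses one of the 10 pools.
    pool_name = f"pool_{zlib.crc32(uri.encode()) % len(POOLS)}"
    # Stage 2: election (highest score) hash across only that pool's 20 nodes.
    node = max(POOLS[pool_name], key=lambda n: zlib.crc32((n + uri).encode()))
    return pool_name, node

print(pick("/images/gray-bullet.gif"))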
Optimization options
In BIG-IP 9.x you can process HTTP traffic using various profiles, including
TCP+HTTP, FastHTTP, and FastL4. Each profile, or combination of profiles,
offers distinct advantages, limitations, and features.
F5 Networks recommends that you assess the needs of each HTTP virtual
server individually, using the following information to determine which profile,
or profile combination, will best meet the requirements for each virtual server.
Important: The HTTP profile will work in all cases; however, the HTTP profile
places BIG-IP in full Layer 7 inspection mode, which can be unnecessarily
intensive when used on simple load balancing virtual servers. Thus, the other
profile options provided should be considered in instances where the full Layer
7 engine is not necessary for a particular virtual server.
TCP+HTTP
Profiles: TCP+HTTP
Advantage: The HTTP profile can take full advantage of all of BIG-IP’s Layers
4 - 7 HTTP/HTTPS features.
When to use: The HTTP profile is used when any of the following features are
required:
TCPexpress and content spooling features reduce server load
Full OneConnect functionality (including HTTP 1.0 transformations)
Layer 7 persistence (cookie, hash, universal, and iRule)
Full HTTP iRules logic
HTTP FastCache
HTTP Compression
HTTP pipelining
Virtual Server Authentication
Redirect Rewriting
Limitations:
More CPU-intensive
Memory utilization
FastCache: Provisions user-defined memory allocation for cache content for
each virtual server that utilizes the given HTTP profile with FastCache enabled.
Compression: Larger buffer sizes can increase memory utilization when
compressing large objects.
TCP offloading/content spooling: Can increase memory utilization in cases
where either the client-side or the server-side of the connection is slower than
the other. The BIG-IP will hold the data in the buffer until the slower side of the
connection is able to retrieve it.
Note: For more information about the TCP profile, see below.
FastHTTP
Profile: FastHTTP
Advantage: Faster than HTTP profile
When to use: The FastHTTP profile is recommended when it is not necessary to use persistence and/or maintain source IP addresses. FastHTTP also adds a subset
of OneConnect features to reduce the number of connections opened to the
backend HTTP servers. The FastHTTP profile requires that the clients’ source
addresses are translated. If an explicit SNAT or SNAT pool is not specified, the
appropriate self IP address is used.
Note: Typically, server efficiency increases as the number of SNAT addresses
available to the virtual server increases. At the same time, the increase in SNAT
addresses available to the virtual server also decreases the likelihood that the
virtual server will reach the point of ephemeral port exhaustion (65535 open
connections per SNAT address).
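As a rough illustration of that ceiling (figures are illustrative only): a single SNAT address limits the virtual server to about 65,535 concurrent server-side connections, while a SNAT pool of four addresses raises the limit to roughly 4 x 65,535, or about 262,000 concurrent connections, in addition to spreading the translated traffic across the four addresses.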
Limitations:
Requires client source address translation
Not compatible with persistence
Limited iRules support: Layer 4 events only, plus a subset of HTTP header operations and pool/pool member selection
No compression
No virtual server authentication
No support for HTTP pipelining
No TCP optimizations
Note: FastHTTP is optimized for ideal traffic conditions, but may not be an
appropriate profile to use when network conditions are less than optimal. For
more information about the FastHTTP profile, refer to SOL8024: Overview of the
FastHTTP profile.
FastL4
Profile: FastL4
Advantage: Uses the Packet Velocity ASIC (PVA) to process packets
When to use: FastL4 is limited in functionality to socket-level decisions (for example, src_ip:port and dst_ip:port). Thus, use FastL4 only when socket-level information for each connection is sufficient for the virtual server's decisions.
Limitations:
No HTTP optimizations
No TCP optimizations for server offloading
SNAT/SNAT pools demote PVA acceleration setting level to Assisted
iRules limited to L4 events, such as CLIENT_ACCEPTED and SERVER_CONNECTED (see the sketch after this list)
No OneConnect
Source address or destination address based persistence only
No compression
No Virtual Server Authentication
No support for HTTP pipelining
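As an illustration of how little is available to an iRule under FastL4 (a hypothetical sketch; the pool names are placeholders, not part of the recommended design), the example below runs at connection time and can only key its decision on socket-level data such as the client source address:

when CLIENT_ACCEPTED {
    # Hypothetical sketch: only socket-level information (addresses and ports)
    # is available with FastL4, so the pool is chosen by hashing the client
    # source address across two placeholder pools (cache_farm_0 / cache_farm_1).
    pool cache_farm_[expr {[crc32 [IP::client_addr]] % 2}]
}

Keying on the source address in this way is consistent with the source-address persistence noted above, since every connection from a given client hashes to the same pool.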
The TCP profile
The TCP profile allows you to manage TCP network traffic destined for the BIG-
IP LTM system. The TCP profile can be used by itself to manage traffic, or paired
with other profiles, such as the HTTP profile for processing Layer 7 traffic, or
the SSL profiles for processing SSL traffic.
When the TCP profile is defined for a virtual server, the virtual server processes
traffic using a full proxy architecture. The full proxy architecture allows the
BIG-IP LTM to appear as a TCP peer to both the client and the server by
associating two independent TCP connections with the end-to-end TCP session.
Therefore, the client connection terminates on the BIG-IP LTM, and the BIG-IP
LTM maintains its own TCP connection behavior to the server side, such as data
packaging, sequencing, buffering, and TCP options.
Note: Depending on the BIG-IP configuration, certain client information such as
the source IP address, and source TCP port may be reused on the server-side of
the connection.
The following flow diagram illustrates the TCP connection flow between an
internet-based client, a BIG-IP LTM configured with a standard virtual server
and a default TCP profile, and a pool member.
The default TCP profile is appropriate for most traffic conditions; however, it can be optimized as needed. In the service provider case the clients connect across a WAN, so the LTM can offer some benefit to the connection by using the tcp-wan-optimized TCP profile outlined below.
Beginning with BIG-IP versions 9.3 and 9.4, the tcp-wan-optimized profile is a
new, pre-configured profile type.
If the traffic profile is strictly WAN-based, and a standard virtual server with a
TCP profile is required, you can configure your virtual server to use the tcp-
wan-optimized profile to enhance WAN-based traffic. For example, in many
cases, the client connects to the BIG-IP virtual server over a WAN link, which
is generally slower than the connection between the BIG-IP system and the
pool member servers. As a result, the BIG-IP system can accept the data more
quickly, allowing resources on the pool member servers to remain available.
By configuring your virtual server to use the tcp-wan-optimized profile, you
can increase the amount of data the BIG-IP system will buffer while waiting for
a remote client to accept it. Additionally, you can increase network throughput
by reducing the number of short TCP segments the BIG-IP system sends on
the network.
Settings and definitions in the tcp-wan-optimized profile
The values in the tcp-wan-optimized profile are different from a default TCP
profile. The following table describes the settings found in the tcp-wan-
optimized profile:
Setting | Value | Description
Proxy Buffer Low | 131072 | Specifies the proxy buffer level at which the receive window is opened. For more information, refer to SOL3422: Overview of content spooling.
Proxy Buffer High | 131072 | Specifies the proxy buffer level at which the receive window is no longer advanced. For more information, refer to SOL3422: Overview of content spooling.
Send Buffer | 65535 | Specifies the send buffer size, in bytes. This setting should be at least 64K so that the BIG-IP system can output more data at a time, if allowed by the congestion window.
Receive Window | 65535 | Specifies the receive window size, in bytes. If this setting is too low, it can cause delays, as some systems inhibit data transfers when the receive window is too small.
Selective ACKs | Enabled | When enabled, the BIG-IP system can inform the data sender about all segments that it has received, allowing the sender to retransmit only the segments that have been lost.
Nagle's Algorithm | Enabled | When enabled, the BIG-IP system applies Nagle's algorithm to reduce the number of short segments on the network by holding data until the peer system acknowledges outstanding segments.
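To put the 65,535-byte receive window in context (a back-of-the-envelope illustration, not a sizing guideline): a single TCP connection's throughput is bounded by roughly the window size divided by the round-trip time, so a 65,535-byte window over a 100 ms WAN path caps each connection at about 65,535 x 8 / 0.1, or roughly 5.2 Mbps, whereas the same window over a 1 ms LAN path allows on the order of 100 times that. This is one reason the profile also enables selective ACKs and large proxy buffers rather than relying on the window size alone.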
Configuring the virtual server to use the tcp-wan-optimized profile
To configure your virtual server to use the tcp-wan-optimized profile, perform
the following procedure:
1. Log in to the Configuration utility.
2. Click Local Traffic, then click Virtual Servers.
3. Select the virtual server that processes WAN-based traffic.
4. From the Configuration menu, select Advanced.
5. From the Protocol Profile menu, select tcp-wan-optimized.
6. Click Update.
Blue Coat Systems, Inc. • 1.866.30.BCOAT • +1.408.220.2200 Direct • +1.408.220.2250 Fax • www.bluecoat.com
Copyright © 2008 Blue Coat Systems, Inc. All rights reserved worldwide. No part of this document may be reproduced by any means nor translated to any electronic medium without the written consent of Blue Coat Systems, Inc. Specifications are subject to change without notice. Information contained in this document is believed to be accurate and reliable; however, Blue Coat Systems, Inc. assumes no responsibility for its use. Blue Coat is a registered trademark of Blue Coat Systems, Inc. in the U.S. and worldwide. All other trademarks mentioned in this document are the property of their respective owners.