
AVOIDING 3AM TROUBLESHOOTING

Jack Fenimore

FSE, Central and Southern Ohio

http://xkcd.com/979/

Where am I starting from?

What are we troubleshooting?

Did this work before?

Does the traffic go through the F5?

Is it reproducible?

Is there a log server?

Did the timing of the issue coincide with any other changes?

Before beginning, determine which devices are involved

Obtain or create a network diagram covering the path from the client, through the F5, to the pool members

Network Map

Module Statistics

Statistics -> Performance

iHealth

• Displays a snapshot of the BIG-IP system configuration in a user-friendly format

• Evaluates the configuration against a database of known issues, common errors, and published F5 best practices

• Provides tailored feedback about configuration issues, a description of the issue, recommendations for resolution, and a link to additional information in the AskF5 Knowledge Base

What does BIG-IP iHealth do?

Displays System Configuration Snapshot

View all uploaded qkview files

iHealth Diagnostics Page

Reports configuration issues and provides a link to additional information in AskF5

Packet Flow Review

• Routing to a listener on the BIG-IP

• Listeners are:

• Self IPs

• SNATs

• NATs

• Virtual Servers

How Does Traffic Enter a BIG-IP?

[Diagram: traffic from the Internet arriving on the External VLAN. Addresses shown: 10.2.2.100:80, 10.2.2.1, and 10.2.2.50 (NAT to 192.168.4.8)]

1. Existing connection in connection table

2. Packet filter rule

3. Virtual server

4. SNAT

5. NAT

6. Self-IP

7. Drop

Packet Processing Priority
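The lookup order in the numbered list can be sketched as a first-match search. This is only an illustration of the ordering, not how TMM is implemented; the `classify` function, the packet dictionary, and the matcher callables are hypothetical names introduced here.

```python
from typing import Callable, Dict

# Order copied from the slide; "drop" is what remains when nothing matches.
PRIORITY = [
    "existing connection",
    "packet filter rule",
    "virtual server",
    "SNAT",
    "NAT",
    "self IP",
]


def classify(packet: dict, matchers: Dict[str, Callable[[dict], bool]]) -> str:
    """Return the first listener type, in priority order, that accepts the packet."""
    for name in PRIORITY:
        matcher = matchers.get(name)
        if matcher is not None and matcher(packet):
            return name
    return "drop"


# Hypothetical matchers: the addresses are illustrative assumptions.
example_matchers = {
    "virtual server": lambda p: (p["dst"], p["port"]) == ("10.2.2.100", 80),
    "self IP": lambda p: p["dst"] == "10.2.2.1",
}

print(classify({"dst": "10.2.2.100", "port": 80}, example_matchers))
print(classify({"dst": "10.2.2.77", "port": 80}, example_matchers))
```

A packet that matches several entries is handled by the one that appears earliest in the list, which is why a virtual server wins over a self IP on the same address.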

• Standard

• Forwarding IP

BIG-IP Virtual Server Types

[Diagram: standard virtual server example]

• Client: 3.3.3.3

• VLAN External: IP 2.2.2.254

• Virtual server http_vs: 2.2.2.2:80

• VLAN Internal: IP 1.1.1.254

• Pool http_pool: RED 1.1.1.1:8080, BLUE 1.1.1.2:8080

1. HTTP request from the client: SRC 3.3.3.3, DST 2.2.2.2:80

2. BIG-IP LTM chooses RED

3. HTTP request to the server: SRC 3.3.3.3, DST 1.1.1.1:8080

4. HTTP response from the server: SRC 1.1.1.1:8080, DST 3.3.3.3

5. HTTP response to the client: SRC 2.2.2.2:80, DST 3.3.3.3

The default gateway for the RED and BLUE servers is 1.1.1.254 on BIG-IP LTM
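The address rewriting in this example can be sketched in a few lines. This is a simplified model using the addresses from the diagram; the function names and the dictionary representation of a packet are assumptions made for illustration. It shows why the servers need the BIG-IP as their default gateway: the client source address is left unchanged, so the response is addressed to 3.3.3.3 and has to return through the BIG-IP to have its source rewritten.

```python
VIRTUAL_SERVER = "2.2.2.2:80"
RED = "1.1.1.1:8080"
CLIENT = "3.3.3.3"


def translate_request(packet: dict, virtual_server: str, pool_member: str) -> dict:
    """Client-side request to server-side request: only the destination changes."""
    if packet["dst"] != virtual_server:
        raise ValueError("packet is not addressed to the virtual server")
    return {"src": packet["src"], "dst": pool_member}


def translate_response(packet: dict, virtual_server: str, pool_member: str) -> dict:
    """Server-side response to client-side response: only the source changes."""
    if packet["src"] != pool_member:
        raise ValueError("packet did not come from the selected pool member")
    return {"src": virtual_server, "dst": packet["dst"]}


request_in = {"src": CLIENT, "dst": VIRTUAL_SERVER}
request_out = translate_request(request_in, VIRTUAL_SERVER, RED)
response_in = {"src": RED, "dst": request_out["src"]}
response_out = translate_response(response_in, VIRTUAL_SERVER, RED)

print(request_out)
print(response_out)
```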

Standard Virtual Server Packet Flow

[Diagram: standard virtual server processing stack. Labels shown: IPv4/IPv6, VS listener, TCP Express, SSL, HTTP, RAM Cache, Proxy, and load balancing algorithms, with iRules available at each stage on both sides of the proxy]

Forwarding (IP) VS Packet Flow

[Diagram: forwarding (IP) virtual server processing stack. Labels shown: IPv4/IPv6, VS listener, iRules, TCP, UDP, Forward Request]

1. Specific IP address and specific port 10.0.33.199:80

2. Specific IP address and all ports 10.0.33.199:*

3. Network IP address and specific port 10.0.33.0:443 netmask 255.255.255.0

4. Network IP address and all ports 10.0.33.0:* netmask 255.255.255.0

5. All networks and specific port 0.0.0.0:80 netmask 0.0.0.0

6. All networks and all ports 0.0.0.0:* netmask 0.0.0.0

Virtual Server Priority
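The six-step order can be modeled as "most specific address first, then specific port over wildcard port". The sketch below is a simplified model of the list only, not the BIG-IP's actual matching code, and newer software versions may take additional criteria into account. The `VirtualServer` tuple, the `select_virtual_server` function, and the virtual server names are hypothetical.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class VirtualServer(NamedTuple):
    name: str
    network: str          # CIDR form, e.g. "10.0.33.199/32" or "0.0.0.0/0"
    port: Optional[int]   # None stands for all ports (*)


def select_virtual_server(
    servers: List[VirtualServer], dst_ip: str, dst_port: int
) -> Optional[VirtualServer]:
    """Pick the matching virtual server with the longest prefix, preferring a specific port."""
    ip = ipaddress.ip_address(dst_ip)
    candidates = [
        vs for vs in servers
        if ip in ipaddress.ip_network(vs.network) and vs.port in (None, dst_port)
    ]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda vs: (ipaddress.ip_network(vs.network).prefixlen, vs.port is not None),
    )


# Hypothetical virtual servers loosely mirroring the six cases in the list.
SERVERS = [
    VirtualServer("host_80", "10.0.33.199/32", 80),
    VirtualServer("host_any", "10.0.33.199/32", None),
    VirtualServer("net_443", "10.0.33.0/24", 443),
    VirtualServer("net_any", "10.0.33.0/24", None),
    VirtualServer("all_80", "0.0.0.0/0", 80),
    VirtualServer("all_any", "0.0.0.0/0", None),
]

print(select_virtual_server(SERVERS, "10.0.33.5", 80).name)
```

Note the case printed above: a request to 10.0.33.5:80 is handled by the network virtual server on all ports rather than by the wildcard virtual server on port 80, because address specificity is evaluated before the port.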

Up Through the Layers

Layer 1

tmsh show /net interface 1.1 all-properties field-fmt

Layer 2

tmsh show /net arp dynamic

tmsh show /net fdb

Layer 3

• Ping

• Check routes

• Tracepath utility

• Traceroute from both directions

• Telnet to the remote port
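Where telnet is not installed, the "telnet to the remote port" check can be approximated with a short script. This is a minimal sketch using only the Python standard library; the function name `tcp_port_open` and the default timeout are assumptions. It only confirms that a TCP handshake completes, which is the same thing a bare telnet connection tells you.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `tcp_port_open("10.80.0.51", 8080)` would test a pool member directly, bypassing the virtual server. Running the same check from both the client side and the BIG-IP helps narrow down where the path breaks.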

[root@3900-1:Active:In Sync] config # tracepath 10.0.180.1

1: 10.50.0.221 (10.50.0.221) 0.175ms pmtu 1500

1: 10.0.180.1 (10.0.180.1) 2.981ms reached

Resume: pmtu 1500 hops 1 back 1

Connections
10.0.180.250:59918 - 10.50.220.101:80 - any6.any - any6.any
-----------------------------------------------------------
  TMM           2
  Type          any
  Acceleration  none
  Protocol      tcp
  Idle Time     52
  Idle Timeout  300
  Unit ID       1
  Lasthop       /Common/external 00:18:19:9e:b4:75
  Virtual Path  10.50.220.101:80

                ClientSide          ServerSide
  Client Addr   10.0.180.250:59918  any6.any
  Server Addr   10.50.220.101:80    any6.any
  Bits In       5.4K                0
  Bits Out      4.8K                0
  Packets In    6                   0
  Packets Out   5                   0

10.0.180.140:51711 - 10.50.220.100:80 - 10.80.0.220:51711 - 10.80.0.51:8080
---------------------------------------------------------------------------
  TMM           3
  Type          any
  Acceleration  none
  Protocol      tcp
  Idle Time     2
  Idle Timeout  300
  Unit ID       1
  Lasthop       /Common/external 00:18:19:9e:b4:75
  Virtual Path  10.50.220.100:80

                ClientSide          ServerSide
  Client Addr   10.0.180.140:51711  10.80.0.220:51711
  Server Addr   10.50.220.100:80    10.80.0.51:8080
  Bits In       52.9K               67.2K
  Bits Out      134.9K              54.2K
  Packets In    21                  15
  Packets Out   36                  25

[root@3900-1:Active:Changes Pending] config # tmsh show ltm persistence persist-records client-addr 10.0.180.140
Sys::Persistent Connections
source-address  10.50.220.100:80  10.80.0.51:8080  3
Total records returned: 1

tmsh show /sys conn

tmsh show /ltm persistence persist-records

Situations Specific to F5

MAC Masquerade

• Unique MAC assigned to a traffic group

• Minimize ARP communication or dropped packets during failover by using a consistent MAC address

• Improve reliability and failover speed

• Improve interoperability with switches that are slow to process gratuitous ARPs (gARPs)

• When a BIG-IP becomes active, it sends a gARP for every virtual IP for which it is now active. If link down on failover is set, it also performs an interface reset, dropping carrier momentarily

• SOL13502 (SOL7214 for v10.x)

Review of Auto Last Hop

• Tracks the source MAC address and VLAN of incoming connections

• Return traffic from pools is sent to the MAC address that transmitted the request, even if the routing table points to a different network or interface

• The BIG-IP can send return traffic to clients even if there is no matching route

• Auto Last Hop is a desired behavior, so it is enabled by default

• F5 Networks recommends leaving it enabled

• Under rare circumstances you may want to disable Auto Last Hop

• If it is disabled, the routing table is used to forward the packet

• SOL11796: Overview of the Auto Last Hop setting
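The decision described in the bullets above can be sketched as follows. This is a conceptual model only, not F5 code; the `LastHopTable` class, its method names, and the routed next hop value are hypothetical. The VLAN and MAC address come from the Lasthop field of the connection table output shown earlier.

```python
from typing import Dict, Optional, Tuple

Hop = Tuple[str, str]  # (VLAN, MAC address)


class LastHopTable:
    """Remembers where each client connection came from."""

    def __init__(self, auto_last_hop: bool = True) -> None:
        self.auto_last_hop = auto_last_hop
        self._seen: Dict[str, Hop] = {}

    def record_request(self, client: str, vlan: str, src_mac: str) -> None:
        """Store the VLAN and source MAC the client's request arrived with."""
        self._seen[client] = (vlan, src_mac)

    def return_path(self, client: str, routed_next_hop: Optional[Hop]) -> Optional[Hop]:
        """Choose where the response goes: the recorded last hop, else the route lookup."""
        if self.auto_last_hop and client in self._seen:
            return self._seen[client]
        return routed_next_hop


table = LastHopTable(auto_last_hop=True)
table.record_request("10.0.180.140:51711", "/Common/external", "00:18:19:9e:b4:75")

# No matching route (None), yet the response still has somewhere to go.
print(table.return_path("10.0.180.140:51711", routed_next_hop=None))
```

With `auto_last_hop=False` the same call returns whatever the route lookup produced, which is `None` here and corresponds to the "no matching route" case.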

TCP Reset Cause

• Reports where and why a TCP reset was generated (SOL13223)

• A diagnostic enhancement

• Use as necessary for troubleshooting

• Added for all profiles which could cause a TCP RST

• HTTP

• Stream

• FastL4

• FastHTTP

• etc.

3900-1 err tmm3[8641]: 01230140:3: RST sent from 10.80.0.50:80 to 10.80.0.221:1115, [0x173b10d:5961] TCP RST from remote system
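When many resets are logged, it can help to pull the fields out of each line and count the causes. The following is a minimal sketch that assumes log lines follow the format of the sample above with IPv4 addresses; the regular expression and the function name `parse_rst_cause` are illustrative and not part of any F5 tool.

```python
import re
from typing import Optional

# Pattern derived from the single sample line; other formats are not handled.
RST_RE = re.compile(
    r"RST sent from (?P<src>\S+?):(?P<sport>\d+) "
    r"to (?P<dst>\S+?):(?P<dport>\d+), "
    r"\[(?P<code>[^\]]+)\] (?P<cause>.+)$"
)


def parse_rst_cause(line: str) -> Optional[dict]:
    """Extract endpoints and the reset cause from one /var/log/ltm line."""
    match = RST_RE.search(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["sport"] = int(fields["sport"])
    fields["dport"] = int(fields["dport"])
    return fields


SAMPLE = (
    "3900-1 err tmm3[8641]: 01230140:3: RST sent from 10.80.0.50:80 "
    "to 10.80.0.221:1115, [0x173b10d:5961] TCP RST from remote system"
)
print(parse_rst_cause(SAMPLE))
```

Feeding every matching line through `collections.Counter` keyed on the `cause` field gives a quick view of which reset reason dominates.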

Viewing Reset Cause

• Insert the cause into the TCP reset packet (visible in packet captures)

- tmsh mod sys db tm.rstcause.pkt {value "enable"}

- The default is "disabled"

• Send to syslog (/var/log/ltm)

- tmsh mod sys db tm.rstcause.log {value "enable"}

- The default is "disabled"

• Show reset cause stats

- tmsh show net rst-cause

RST Packets Containing Data (RFC1122)

• What do the RFCs have to say about this?

• A TCP SHOULD allow a received RST to include data.

• It has been suggested that a RST segment could contain ASCII text that encoded and explained the cause of the RST. No standard has yet been established for such data.

• Some other stacks do the same (e.g., HP-UX and MacOS)

• Has been known to cause issues in the field

Wireshark Plug-In

• Available from devcentral.f5.com

Pool action on service down

How the system should respond when the target pool member becomes unavailable (a pool object property).

• None: Specifies that the system maintains existing connections, but does not send new traffic to the member (default)

• Reject: Use "Reject" when you want LTM to explicitly close both sides of the connection when the server goes DOWN

• Drop: Specifies that the system simply cleans up the connection; no reset will be sent

• Reselect: Specifies that the system manages established client connections by moving them to an alternative pool member
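The four settings can be summarized as a small decision function. This is an illustrative model of the descriptions above, not BIG-IP code; the enum, the function name, and the pool member addresses are hypothetical, and the behavior when Reselect finds no alternative member is an assumption made only so the sketch is complete.

```python
from enum import Enum
from typing import List, Optional, Tuple


class ServiceDownAction(Enum):
    NONE = "none"
    REJECT = "reject"
    DROP = "drop"
    RESELECT = "reselect"


def handle_member_down(
    action: ServiceDownAction, current_member: str, available_members: List[str]
) -> Tuple[str, Optional[str]]:
    """Return (effect on the existing connection, pool member it uses afterwards)."""
    if action is ServiceDownAction.NONE:
        return ("connection kept", current_member)
    if action is ServiceDownAction.REJECT:
        return ("both sides explicitly closed", None)
    if action is ServiceDownAction.DROP:
        return ("connection removed, no reset sent", None)
    if action is ServiceDownAction.RESELECT:
        others = [m for m in available_members if m != current_member]
        if others:
            return ("connection moved", others[0])
        # Assumption: the source does not say what happens with no alternative.
        return ("no alternative member available", None)
    raise ValueError(f"unknown action: {action!r}")


# Hypothetical pool members.
print(handle_member_down(
    ServiceDownAction.RESELECT, "10.80.0.51:8080", ["10.80.0.51:8080", "10.80.0.50:8080"]
))
```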

I did ABC and now when I log in to the GUI I see: “The configuration has not yet loaded. If this message persists, it may indicate a configuration problem.”

To determine what is wrong:

tmsh load /sys config partitions all

Attack Prevention and Dynamic Reaping

• SYN flood, DDoS, DoS attack prevention

• SYN Cookies*

• Dynamic Reaping

• Continually monitors existing TCP connections to ensure the integrity of the connection table

• Removes the oldest idle connections when it needs to free memory

• Protects the BIG-IP against SYN attacks from non-spoofed IP addresses that fully negotiate a connection

• Avoid changing default values without Support assistance
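The idea behind dynamic reaping, removing the longest-idle connections first when resources run short, can be illustrated with a toy model. This is not how the BIG-IP implements it: the real trigger is memory pressure, and `max_entries` below is a hypothetical stand-in for that threshold. The connection keys and idle times are taken from the connection table output shown earlier.

```python
from typing import Dict, List


def reap_oldest_idle(idle_seconds: Dict[str, int], max_entries: int) -> List[str]:
    """Delete the longest-idle connections until the table fits max_entries.

    Returns the removed connection keys, most idle first.
    """
    excess = len(idle_seconds) - max_entries
    if excess <= 0:
        return []
    victims = sorted(idle_seconds, key=idle_seconds.get, reverse=True)[:excess]
    for key in victims:
        del idle_seconds[key]
    return victims


connections = {
    "10.0.180.250:59918": 52,  # idle time in seconds
    "10.0.180.140:51711": 2,
}
print(reap_oldest_idle(connections, max_entries=1))
print(connections)
```

An attacker who completes handshakes and then goes silent ends up holding the most idle entries, so those are the first to be reclaimed while active client connections survive.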

[Diagram: OSI model layers: 7 Application, 6 Presentation, 5 Session, 4 Transport, 3 Network, 2 Data Link, 1 Physical]

* The article at http://cr.yp.to/syncookies.html provides a detailed explanation of SYN cookies

Tips on General Configuration

• Set DNS and NTP

• Re-activate your license before upgrading (*Will impact traffic)

• Adjust the Number of Records Per Screen

• Set up a floating IP address on each VLAN

• Understand that the BIG-IP operates in STP pass-through mode

• Virtual Address vs Virtual Server, disabling ARP

• Nagle's algorithm