2011 CPUG CON EUROPE Loukine Valeri All About ClusterXL



    ClusterXL

    Under the hood

    Wednesday, September 14, 2011

    About author

    Valeri Loukine

    CCMA 0019

    Ex-Check Point Senior Security Consultant - Dimension Data

    Email: [email protected]

    Blog: http://checkpoint-master-architect.blogspot.com/


    Agenda

    Understanding the Cluster Elements

    CCP

    State synchronization

    Pnote

    Check Point solutions, aka ClusterXL (HA, Load Sharing)

    Advanced features and problematic scenarios

    3rd party clusters

    Some Troubleshooting


    CCP

    Cluster Control Protocol (CCP) runs over UDP port 8116.

    CCP runs on all cluster interfaces (in ClusterXL).

    Note: When VLANs are used, CCP runs only on the lowest VLAN ID (not true for VSX).


    CCP is in charge of

    Health status reports

    Cluster member probing

    State change commands

    Querying for cluster membership

    State table synchronization


    CCP modes

    Multicast or Broadcast

    To change: cphaconf set_ccp STATE (where STATE is multicast or broadcast)

    Stored in $FWDIR/boot/ha_boot.conf

    The MAC address used for multicast is determined by a special algorithm.


    Checking CCP state

    # cphaprob -a if

    Required interfaces: 3
    Required secured interfaces: 1

    eth0 UP non sync(non secured), multicast
    eth1 UP non sync(non secured), multicast
    eth2 UP sync(secured), multicast

    Virtual cluster interfaces: 2

    eth0 192.168.10.1
    eth1 10.1.1.1
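For scripted health checks, the interface rows above can be parsed to confirm which interface carries sync and which CCP mode each one uses. This is a minimal sketch, not a Check Point tool: the regular expression is inferred only from the sample output on the slide, and the `DOWN` link state is an assumption, since the slide shows only `UP`.

```python
import re

# Sample taken from the slide output above.
SAMPLE = """\
Required interfaces: 3
Required secured interfaces: 1
eth0 UP non sync(non secured), multicast
eth1 UP non sync(non secured), multicast
eth2 UP sync(secured), multicast
Virtual cluster interfaces: 2
eth0 192.168.10.1
eth1 10.1.1.1
"""

# One interface row: name, link state, sync role, secured flag, CCP mode.
# "DOWN" is assumed; only "UP" appears on the slide.
ROW = re.compile(
    r"^(\S+)\s+(UP|DOWN)\s+(non sync|sync)\((non secured|secured)\),\s*(\w+)$"
)

def parse_interfaces(text):
    """Return one dict per monitored interface row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            name, state, role, _secured, mode = m.groups()
            rows.append({
                "name": name,
                "up": state == "UP",
                "sync": role == "sync",
                "ccp_mode": mode,
            })
    return rows

interfaces = parse_interfaces(SAMPLE)
sync_ifaces = [i["name"] for i in interfaces if i["sync"]]
modes = {i["ccp_mode"] for i in interfaces}
print(sync_ifaces, modes)
```

The virtual interface rows (name plus IP address) do not match the pattern and are skipped on purpose.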


    State Sync

    Used to exchange kernel table information between cluster members

    Composed of two phases: Full Sync and Delta Sync


    Full Sync

    Happens upon boot

    Uses fwd communication on TCP port 256

    Does not have to be on the Sync interface


    Delta Sync

    Done over CCP (UDP 8116)

    Updates changes in kernel tables incrementally

    Happens with every operation done to a synchronized kernel table


    How it works

    When a cluster member starts, it requests Full Sync before becoming a Standby member

    Full Sync replicates all existing kernel tables and existing connection information


    Upon Full Sync completion, the cluster member changes its state to Standby

    From now on, Delta Sync occurs

    Only changes are synced

    Some changes may not be synced (configurable)
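The two phases can be pictured with a toy model: a joining member first copies every table from its peer, then only applies incremental changes. This is an illustrative sketch only. The `Member` class, the table layout, and the `set`/`delete` operations are hypothetical and do not reflect the real kernel table format or the CCP wire format.

```python
import copy

class Member:
    """Toy cluster member holding named tables (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.state = "Initializing"
        self.tables = {}  # table name -> {key: value}

    def full_sync_from(self, peer):
        """Full Sync: replicate every table from the peer, then go Standby."""
        self.tables = copy.deepcopy(peer.tables)
        self.state = "Standby"

    def apply_delta(self, table, op, key, value=None):
        """Delta Sync: apply one incremental change received from the peer."""
        entries = self.tables.setdefault(table, {})
        if op == "set":
            entries[key] = value
        elif op == "delete":
            entries.pop(key, None)

active = Member("A")
active.state = "Active"
active.tables["connections"] = {
    ("10.0.0.1", 1025, "10.0.0.2", 80): "ESTABLISHED",
}

standby = Member("B")
standby.full_sync_from(active)

# A new connection on the active member is propagated as a single delta.
new_conn = ("10.0.0.3", 4000, "10.0.0.2", 443)
active.tables["connections"][new_conn] = "SYN_SENT"
standby.apply_delta("connections", "set", new_conn, "SYN_SENT")
```

After the delta is applied, both members hold identical tables without a second full copy.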


    Tuning Sync


    Sync summary

    Global - supports all kernel table operations

    Transparent - does not require direct awareness of its existence

    Serves both ClusterXL and third parties without significant changes


    User-mode application information is not synced! (Security Servers, etc.)

    May cost some performance and bandwidth

    Can be tuned: the administrator can choose some services not to be synced


    fw ctl pstat - sync

    Sync:
        Version: new
        Status: Able to Send/Receive sync packets
        Sync packets sent:
         total : 209693, retransmitted : 166, retrans reqs : 129, acks : 54
        Sync packets received:
         total : 134755, were queued : 221, dropped by net : 101
         retrans reqs : 29, received 26 acks
         retrans reqs for illegal seq : 0
         dropped updates as a result of sync overload: 0
        Callback statistics: handled 11 cb, average delay : 1, max delay : 1
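To judge sync health from these counters, it helps to turn them into numbers and look at ratios rather than raw totals. The sketch below parses only the "name : value" pairs shown on the slide; the key naming (`sent.` / `received.` prefixes) is an invention for this example, and no threshold for a "bad" ratio is implied.

```python
import re

# Counter lines copied from the slide output above.
SAMPLE = """\
Sync packets sent:
 total : 209693, retransmitted : 166, retrans reqs : 129, acks : 54
Sync packets received:
 total : 134755, were queued : 221, dropped by net : 101
 retrans reqs : 29, received 26 acks
 retrans reqs for illegal seq : 0
 dropped updates as a result of sync overload: 0
"""

def parse_counters(text):
    """Collect 'name : value' pairs, prefixed by the section they appear in."""
    counters, section = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Sync packets sent"):
            section = "sent"
        elif line.startswith("Sync packets received"):
            section = "received"
        elif section:
            for name, value in re.findall(r"([A-Za-z ]+?)\s*:\s*(\d+)", line):
                counters[f"{section}.{name.strip()}"] = int(value)
    return counters

counters = parse_counters(SAMPLE)
retrans_ratio = counters["sent.retransmitted"] / counters["sent.total"]
drop_ratio = counters["received.dropped by net"] / counters["received.total"]
print(f"retransmitted: {retrans_ratio:.4%}, dropped by net: {drop_ratio:.4%}")
```

Fields without a colon, such as "received 26 acks", are ignored by this pattern.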


    Under the hood


    Pnote

    A critical device is also known as a Problem Notification (pnote)

    If a critical device stops functioning, this is defined as a Failure

    fwd and cphad are predefined

    Also checked: policy (filter), sync and interfaces


    To check: cphaprob list

    Can be used to cause a failover by adding a new faulty device


    cphaprob list

    Built-in Devices:

    Device Name: Interface Active Check
    Current state: OK

    Registered Devices:

    Device Name: cphad
    Registration number: 2
    Timeout: 2 sec
    Current state: OK
    Time since last report: 0 sec

    Device Name: fwd
    Registration number: 3
    Timeout: 2 sec
    Current state: OK
    Time since last report: 0.8 sec
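A monitoring script could read this listing and report any device whose state is not OK. The following is a rough sketch based only on the field names visible on the slide; the helper names are made up for the example, and the non-OK state text used in the usage note (`problem`) is an assumption about what a failed device reports.

```python
# Registered-devices portion of the slide output above.
SAMPLE = """\
Registered Devices:

Device Name: cphad
Registration number: 2
Timeout: 2 sec
Current state: OK
Time since last report: 0 sec

Device Name: fwd
Registration number: 3
Timeout: 2 sec
Current state: OK
Time since last report: 0.8 sec
"""

def parse_devices(text):
    """Group 'key: value' lines into one dict per 'Device Name' entry."""
    devices, current = [], None
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Device Name":
            current = {"name": value}
            devices.append(current)
        elif current is not None and value:
            current[key] = value
    return devices

def failed_devices(devices):
    """Names of devices whose 'Current state' is anything other than OK."""
    return [d["name"] for d in devices if d.get("Current state") != "OK"]

devices = parse_devices(SAMPLE)
print([d["name"] for d in devices], failed_devices(devices))
```

Feeding it a listing in which one device reports a state other than OK (for instance the assumed text `problem`) returns that device's name.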


    Register new device

    cphaprob -d DEVICE -t TIMEOUT -s STATE [-p] register

    (STATE is ok, init or problem; TIMEOUT is in seconds)


    Cluster basic requirements

    OS must be the same.

    FW-1 version must be the same.

    Installed products must be the same.

    NOTE : Check Point recommends that customers use the same hardware.


    ClusterXL basics


    ClusterXL

    Check Point clustering product; uses CCP (UDP 8116)

    Same for both HA and LS solutions

    Supports Solaris, SPLAT and Linux, not IPSO

    4 modes of operation:

    HA: Legacy and New

    LS: Multicast and Unicast


    HA new mode


    Active - Standby roles

    CCP runs on multicast by default

    The Active member answers ARP who-has requests for the VIP with its physical MAC address


    Sync is done

    If Active fails, Standby takes over and becomes Active

    By default, no secondary failover

    CCP can be switched to broadcast (flooding the VIP segment)


    # cphaprob stat

    Cluster Mode: New High Availability (Active Up)

    Number     Unique Address  Assigned Load  State
    1 (local)  172.18.100.5    100%           active
    2          172.18.100.6    0%             standby
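The member table is easy to consume from a script, for example to alert when the local member is not in the expected state. This is a sketch under the assumption that the output keeps the column order shown on the slide; `parse_stat` and the dictionary keys are names made up for the example.

```python
import re

# Output copied from the slide above.
SAMPLE = """\
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 (local) 172.18.100.5 100% active
2 172.18.100.6 0% standby
"""

# Member row: number, optional "(local)" marker, address, load percentage, state.
ROW = re.compile(
    r"^(\d+)\s+(\(local\)\s+)?(\d+\.\d+\.\d+\.\d+)\s+(\d+)%\s+(.+)$"
)

def parse_stat(text):
    """Return (cluster mode, list of member dicts)."""
    mode, members = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Cluster Mode:"):
            mode = line.split(":", 1)[1].strip()
            continue
        m = ROW.match(line)
        if m:
            members.append({
                "id": int(m.group(1)),
                "local": m.group(2) is not None,
                "address": m.group(3),
                "load": int(m.group(4)),
                "state": m.group(5).strip(),
            })
    return mode, members

mode, members = parse_stat(SAMPLE)
local = next(m for m in members if m["local"])
print(mode, local["state"])
```

Because the state column is captured to the end of the line, the same pattern also accepts a state such as `active (pivot)` from the Load Sharing unicast output.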


    HA legacy mode


    Linux only

    Both members are configured to have the same IP addresses and the SAME MAC addresses on clustered interfaces

    Managed through private interfaces


    LS multicast


    LS multicast mode

    All members process traffic

    ARP who-has requests are answered with a virtual multicast MAC shared among members

    All members receive the packet

    Random decision on which member processes it
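Since every member receives every packet, each one has to decide on its own whether to process or drop it, and exactly one member must say yes. The slide calls the decision random; the sketch below swaps in a deterministic hash of the address pair as a stand-in, so that all members reach the same answer without talking to each other. This is an assumption made for illustration and not Check Point's actual decision function.

```python
import hashlib

def owner(src_ip, dst_ip, member_count):
    """Pick the member index that should process this address pair."""
    # Sort the endpoints so both directions of a connection hash identically.
    key = "|".join(sorted([src_ip, dst_ip])).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % member_count

def should_process(my_index, src_ip, dst_ip, member_count):
    """Each member runs this locally on every packet it receives."""
    return owner(src_ip, dst_ip, member_count) == my_index

# Every member evaluates the same packet; exactly one of them accepts it.
decisions = [should_process(i, "10.0.0.1", "192.10.0.9", 3) for i in range(3)]
print(decisions)
```

Hashing the sorted address pair keeps both directions of a flow on the same member, which is the property a sticky decision needs.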


    # cphaprob stat

    Cluster Mode: Load Sharing (Multicast)

    Number     Unique Address  Assigned Load  State
    1          192.10.0.1      33%            active
    2          192.10.0.2      33%            active
    3 (local)  192.10.0.3      33%            active


    LS pivot (unicast)


    LS pivot mode

    The pivot always answers ARP who-has requests with its physical MAC

    It always gets the packets, but forwards some of them to other cluster members

    Forwarding is done on the receiving network; the original source MAC is replaced

    Load is not equally shared


    # cphaprob stat

    Cluster Mode: Load Sharing (Unicast)

    Number     Unique Address  Assigned Load  State
    1 (local)  10.10.10.57     30%            active (pivot)
    2          10.10.10.61     70%            active


    Advanced parameters


    Asymmetric Routing

    Session from standby (Forwarding)

    Block new conns

    Different subnet

    Magic MAC

    Disconnected interfaces


    Asymmetric Routing

    The C2S (client-to-server) packet goes through one cluster member

    The S2C (server-to-client) packet goes through another


    What's the problem?

    Race conditions (SYN/SYN-ACK/ACK)

    Features without sync (Security Servers)

    NATed and encrypted connections

    Data connections


    Resolution:

    Flush and Ack mechanism: holds a packet that made a change in the kernel table until the change is synced successfully

    Sticky Decision Function
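The Flush and Ack idea, holding a table-changing packet until its sync update is confirmed, can be sketched as a small queue. This is a conceptual model only: the class, the sequence numbering, and the cumulative acknowledgement scheme are assumptions made for the example and are not taken from the ClusterXL implementation.

```python
from collections import deque

class FlushAndAck:
    """Toy model: hold table-changing packets until their sync is acked."""

    def __init__(self):
        self.held = deque()   # (sync sequence number, packet) awaiting an ack
        self.released = []    # packets allowed to leave this member
        self.next_seq = 0

    def handle(self, packet, changes_table):
        """Release the packet at once unless it changed a synced table."""
        if not changes_table:
            self.released.append(packet)
            return None
        seq = self.next_seq
        self.next_seq += 1
        self.held.append((seq, packet))
        return seq  # sequence number of the sync update sent to the peers

    def on_ack(self, acked_seq):
        """Release every held packet whose update is covered by this ack."""
        while self.held and self.held[0][0] <= acked_seq:
            self.released.append(self.held.popleft()[1])

fa = FlushAndAck()
syn_seq = fa.handle("SYN", changes_table=True)   # creates a connection entry
fa.handle("data", changes_table=False)           # no table change, passes through
before_ack = list(fa.released)
fa.on_ack(syn_seq)                               # peer confirms the update
after_ack = list(fa.released)
print(before_ack, after_ack)
```

With the SYN held until the peer knows about the new connection, a SYN-ACK arriving at the other member no longer races the sync update.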


    Decision Function


    SDF - when?

    FTP - The data connections are passed through the same cluster member as the control connection

    NATed connections, including Static NAT and Hide NAT

    VPN, including encrypted connections generated from SecuRemote/SecureClient or from another VPN gateway.


    SDF - limitations

    Some connection types are not recognized by SDF; the default DF will be used

    SDF does not work with SecureXL: acceleration will be stopped

    Does not work for VPN routing


    Session from Standby

    If a session starts from Standby:

    To the server, it goes directly from Standby

    From the server, it goes to the Active member, which then forwards the connection to Standby


    Block new conns

    If sync is at risk, new connections should not be processed.

    Error message:

    FW-1: State synchronization is in risk. Please examine your synchronization network to avoid further problems!


    fw_sync_block_new_conns

    Enable load detection - set to 0

    Disable load detection - set to -1

    FW-1 default is -1; VSX default is 0


    Different Subnet

    When the VIP is not on the same subnet as the physical member IP addresses

    Automatic ARP is not supported; local.arp is required

    May need some additional static routes


    Magic MAC

    Used by CCP on Layer 2

    Belongs to all members on all interfaces

    Forward MAC is used to forward packets


    fwha_mac_magic 0xfe

    fwha_mac_forward_magic 0xfd


    Disconnected Interfaces

    Interfaces that do not run CCP

    The Sync interface must NOT be disconnected

    In 3rd-party clusters: all interfaces except for the sync interface

    Will not be monitored


    List them in $FWDIR/conf/discntd.if, then reboot

    May be defined in the topology as private

    No need to list them in 3rd-party clusters


    3rd party clusters

    There were many vendors

    Now Crossbeam and IPSO, what else?

    ClusterXL provides only Sync


    Cluster member state


    Active

    Active Attention

    Down

    Ready

    Standby

    Initializing


    Active

    Everything is good

    Passing traffic


    Active Attention

    Something is wrong in the cluster

    This member is passing traffic


    Down

    One of the critical devices is down

    Not passing traffic


    Ready

    This member is upgraded; an old-version member is Active

    Not passing traffic


    Standby

    Everything is good

    Not passing traffic


    Initializing

    Cluster member is booting up

    ClusterXL product is already running

    VPN-1 Pro is not yet ready

    Full Sync is not completed
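The six states and what each says about traffic can be summarized in a small lookup, which is handy when interpreting the state column in scripts. This sketch encodes only what the slides state; the enum and function names are made up for the example, and Initializing is mapped to `None` because the slides do not say whether such a member passes traffic.

```python
from enum import Enum

class MemberState(Enum):
    ACTIVE = "Active"
    ACTIVE_ATTENTION = "Active Attention"
    DOWN = "Down"
    READY = "Ready"
    STANDBY = "Standby"
    INITIALIZING = "Initializing"

# True / False as stated on the slides; None where the slides do not say.
PASSES_TRAFFIC = {
    MemberState.ACTIVE: True,
    MemberState.ACTIVE_ATTENTION: True,
    MemberState.DOWN: False,
    MemberState.READY: False,
    MemberState.STANDBY: False,
    MemberState.INITIALIZING: None,
}

def passes_traffic(state_text):
    """Look up a state by its display name, e.g. 'Active Attention'."""
    return PASSES_TRAFFIC[MemberState(state_text)]

for state in MemberState:
    print(f"{state.value}: {PASSES_TRAFFIC[state]}")
```

An unknown state name raises `ValueError` from the enum lookup, which is usually what a monitoring script wants.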


    Troubleshooting tools


    CLI

    cphaprob list

    cphaprob -a if

    cphaprob state

    fw ctl pstat (check sync data)

    fw ctl debug -m cluster xxx


    fw ctl debug flags

    conf - Configuration-related kdebug messages

    if - Interface tracking and validation

    stat - Cluster module state change

    select - Packet selection including DF

    ccp - Cluster control packet handling

    pnote - Pnote device


    mac - MAC address sync

    forward - forwarding layer debug

    df - decision function

    drop - drops caused by SDF
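When picking flags for a debug session, a lookup of the names listed on these slides helps avoid typos. The sketch below only validates flag names against that list and assembles a command string in the form the CLI slide gives (`fw ctl debug -m cluster xxx`); the helper name is hypothetical, the list is limited to what the slides mention, and the exact command syntax on a given version should be checked before use.

```python
# Flags and descriptions as listed on the slides (not a complete list).
CLUSTER_DEBUG_FLAGS = {
    "conf": "Configuration-related kdebug messages",
    "if": "Interface tracking and validation",
    "stat": "Cluster module state change",
    "select": "Packet selection including DF",
    "ccp": "Cluster control packet handling",
    "pnote": "Pnote device",
    "mac": "MAC address sync",
    "forward": "Forwarding layer debug",
    "df": "Decision function",
    "drop": "Drops caused by SDF",
}

def debug_command(*flags):
    """Build the command string for the given flags, rejecting unknown names."""
    if not flags:
        raise ValueError("at least one flag is required")
    unknown = [f for f in flags if f not in CLUSTER_DEBUG_FLAGS]
    if unknown:
        raise ValueError(f"unknown cluster debug flag(s): {', '.join(unknown)}")
    return "fw ctl debug -m cluster " + " ".join(flags)

command = debug_command("ccp", "pnote")
print(command)
```

The function only returns a string; running it on a gateway is left to the operator.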


    Other tips

    Snoop (still using UDP port 8116 traffic)

    fw monitor (forwarded packets may cause confusion)


    Questions and Answers


    Thank You For Your Time!
