Tetration Analytics - Network...

Post on 09-Apr-2018


Tetration Analytics - Network Analytics & Machine Learning Enhancing Data Center Security and Operations

Mike Herbert, Principal Engineer, INSBU

BRKDCN-2040

© 2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

• Tetration (or hyper-4) is the next hyperoperation after exponentiation, and is defined as iterated exponentiation

• It’s bigger than a Google [sic] (Googol)

• And yes the developers are a bunch of mathematical geeks

Okay what does Tetration Mean?
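To make "iterated exponentiation" concrete, here is a minimal Python sketch. The function name `tetrate` and the choice to treat a tower of height 0 as 1 are illustrative assumptions, not something taken from the slides.

```python
def tetrate(base: int, height: int) -> int:
    """Return a power tower made of `height` copies of `base`."""
    if height < 0:
        raise ValueError("height must be non-negative")
    result = 1  # assumption: the empty tower is taken to be 1
    for _ in range(height):
        result = base ** result  # add one more level to the tower
    return result


print(tetrate(2, 3))  # 2 ** (2 ** 2) = 16
print(tetrate(2, 4))  # 2 ** 16 = 65536
print(tetrate(2, 5) > 10 ** 100)  # 2 ** 65536 already exceeds a googol
```

Even a base of 2 passes a googol at height 5, which is the point of the "bigger than a Googol" remark.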


Tetration Analytics Platform

Introduction


We Are at the Cusp of a Major Shift

[Figure: adoption curve running from the Traditional Data Centre to the Cloud Data Centre, with a "We are here" marker. Timeline: 2000, 2010, 2015, The Next 5+ Years. Labels: Consolidation, Virtualisation, Automation, Hybrid Clouds, IT as a Service (IaaS | PaaS | SaaS | XaaS), Flexible Consumption Models, Digital Experiences, Efficiency, Simplicity | Speed.]


Modern data centers are getting increasingly complex

• Zero trust model

• Multi cloud orchestration

• Application portability

Hybrid cloud

• Increase in east-west traffic

• Expanded attack surface

• Open source

Big and fast data

• Continuous development

• Application mobility

• Micro services

Rapid app deployment


What if you could actually look at every data packet header that has ever traversed the network without sampling?


Tetration Analytics Platform: Every Packet, Every Flow, Every Speed


Cisco Tetration Analytics™

Network: Pervasive Visibility and Forensics

Application Insight

Policy Compliance


Cisco Tetration Analytics

Application Insights

Policy Simulation and Impact Assessment

Automated Whitelist Policy Generation

Forensics: Every Packet, Every Flow, Every Speed

Policy Compliance and Auditability


Cisco Tetration Analytics: Pervasive Sensor Framework

Provides correlation of data sources across entire application infrastructure

Enables identification of point events and provides insight into overall systems behavior

Monitors end-to-end lifecycle of application connectivity


Information about Consumer – Provider and type of traffic

Detailed information about the flow

Datacenter Wide Traffic Flow Visibility


Application Discovery and Endpoint Grouping

[Diagram: Cisco Tetration Analytics™ Platform collecting telemetry from groups of bare-metal (BM) and VM workloads. Labels: Brownfield; Cisco Nexus® 9000 Series; Bare-metal, VM, & switch telemetry; VM telemetry (AMI …); Bare-metal & VM telemetry.]

Network-only sensors, host-only sensors, or both (preferred)

Bare metal and VM

On-premises and cloud workloads (AWS)

Unsupervised machine learning

Behavior analysis


Whitelist Policy Recommendation

Application Discovery

AppTier

DBTier

Storage

WebTier

Storage

Policy Enforcement (Future Roadmap)

Whitelist Policy Recommendation (Available in JSON, XML, and YAML)


Real-Time and Historical Policy Simulation

• Validating policy impact assessment in real time

• Simulating policy changes over historic traffic

• View traffic “outliers” for quick intelligence

• Audit becomes a function of continuous machine learning

[Diagram: Cisco Tetration Analytics™ Platform monitoring VM and bare-metal (BM) workloads]


Policy Compliance

• Identify policy deviations in real-time

• Review and update whitelist policy with one click

• Policy lifecycle management

[Diagram: Cisco Tetration Analytics™ Platform monitoring VM and bare-metal (BM) workloads]


Tetration Analytics

[Diagram labels: Servers; Network; Compute; Ecosystem Partners; Network flows; Buffer Stats; Process; User; Tetration Analytics Engine (PB Scale Secure Appliance); Application Insights; Policy; Forensics; Application Dependency; Application Performance; Automation & Compliance; Enforcement; Infrastructure; Behavioral Anomalies]


Tetration Analytics Platform

Architecture - Sensors


Tetration Analytics Architecture Overview

Data Collection: Host Sensors (VM), Network Sensors, 3rd-Party Metadata Sources

Network sensor platforms shown: Cisco Nexus® 92160YC-X, Cisco Nexus 93180YC-EX

Tetration Telemetry; Configuration Data

Analytics Engine: Cisco Tetration Analytics™ Platform

Visualization and Reporting: Web GUI, REST API, Push Events


Pervasive Sensors

Host Sensors NW Sensors 3rd Party

Geo

Whois

IP Watch Lists

Load Balancers

Linux VM

Windows Server VM

Bare Metal (Linux and Windows Server)

Hypervisors

Containers

Available at FCS | Next Generation 9K switches | Future releases | 3rd party Data Sources

Low CPU Overhead (SLA enforced)

Low Network Overhead (SLA enforced)

Highly Secure (Code Signed, Authenticated)

Every flow (No sampling), NO PAYLOAD

Nexus 9200-X

Nexus 9300-EX


Traditional Monitoring Is Showing Its Age: Not suited for Modern Network and Security Operations

Where Data Is Created | Where Data Is Useful

[Diagram: devices expose SNMP, CLI and Syslog; an SNMP server, a Syslog collector and scripts feed Storage & Analysis. Non-real-time.]

Strong burden on back-end

Normalize different encodings, transports, data models, timestamps


Streaming Telemetry is a game changer: Monitoring becomes a big data problem

Where Data Is Created Where Data Is Useful

• Streaming paradigm

• Dense Sensor Framework

• Increased Data Granularity

• Update on every event

• Multiple Data Sources

Volume – Scale of Data

Velocity – Analysis of Streaming Data

Variety – Different Forms of Data

Removing limitations and complexity

Big Data and Machine Learning Problem

Real time


Why Multiple Sensors? Example: monitoring temperature in a room

Lamp Sensor Plug Sensor

Heater


Tetration Sensors: Locations

[Diagram: sensors on the 9732C-EX line card (LC), the Nexus 92160YC-X and 93180YC-EX switches, and hypervisor hosts, all reporting to the Tetration Cluster]

Software Sensor

Processes & Socket

Packet and Flow Events

Hardware Sensor

Packet and Flow Events

Buffer and Switch State

Tetration Cluster


• Embedded Module (Flow Cache)

• Nexus 92160YC-X

• Nexus 93180YC-EX & 9732C-EX Line Cards

• Extracts Meta-Data from the forwarding pipeline

• No latency impact, no performance impact

Hardware Sensor

PRX LUA LUB

Flow Cache

LUC


• Not in the data path

• Sits in User Space

• Designed by Kernel Developers

• Secure
  • Code Signed

• SLA Enforcement
  • CPU and BW throttling

• FCS availability
  • Windows: 2008 / 2008 R2 / 2012 / 2012 R2
  • Linux
    • RedHat (5.3+, 6.x)
    • CentOS (5.11+, 6.x)
    • Ubuntu (12.04, 14.04, 14.10)

Software Sensor

NIC

Driver

Network Stack

Application

libpcap

Tetration Sensor


• Tetration Cluster runs an internal PKI

• Root CA is per cluster, inserted at Image creation

• Not accessible outside the cluster

• Cannot connect to an external PKI

• Certificate based authentication is performed for the Control Channel

• CN of the certificate is the IP address

• Certificates are rotated every 60 days

• Sensors are code signed

• Signature Authority is Cisco’s code signing certificate

• Code Signature is validated at process start

PKI within the Cluster/Sensor


How Does a Sensor Communicate with the Cluster the First Time?

Config Server

Collector

Rails

Sensor

Register with web server via ssl

Assign UUID

Register with web server via ssl

Download config

Send meta data to collectors


Components & Communication: Hardware Sensor

Cisco Nexus 9000: ASIC, NXOS Agent, Guest Shell

Tetration Cluster

Control Channel: TCP/443

Sensor Data: UDP/5640

Agent Communication: Unix Socket


Components & Communication: Software Sensor

Software Sensor (LINUX/Windows/…)

Tetration Cluster

Control Channel: TCP-SSL 443

Sensor Data: TCP-SSL 5640

Agent Communication: Unix Socket


Currently Supported Platforms

• Windows 2008: Datacenter, Enterprise, Essentials, Standard

• Windows 2008 R2: Datacenter, Enterprise, Essentials, Standard

• Windows 2012: Datacenter, Enterprise, Essentials, Standard

• Windows 2012 R2: Datacenter, Enterprise, Essentials, Standard

• RedHat Enterprise Server: 5.3 & above, 6.x

• CentOS: 5.11 & above, 6.x

• Ubuntu: 12.04, 14.04, 14.10

This list ‘will’ grow based on what you need and ask for


Methods to deploy the sensor


Coming soon to a GitHub near you

github.com/datacenter


Tetration Analytics Platform

Architecture - Sensor Data


Looking Beyond Connectivity: Application Processes and Sockets

Consumer Process: Chrome, Socket > 1023

Provider/Service Process: NGINX, Socket = 443

• Application developers implement business logic as code that runs as processes and threads

• TCP/IP, which forms a foundation of the Internet, was designed to allow these application processes to interact via sockets

• Application logic can be viewed on one level as the interaction between a group of processes and their associated sockets

• Understanding the inter-process communication and mapping that directly to the infrastructure provides a direct correlation between the application and the infrastructure


Looking Beyond Connectivity: Application Processes and Sockets

# Provider (server) side
import socket

# create an INET, STREAMing socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# bind the socket to a public host and a well-known port
# (binding to port 80 normally requires elevated privileges)
serversocket.bind((socket.gethostname(), 80))
# become a server socket, queueing up to 5 pending connections
serversocket.listen(5)

# Consumer (client) side
import socket

# create an INET, STREAMing socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# now connect to the web server on port 80 - the normal http port
s.connect(("www.mcmillan-inc.com", 80))

Consumer Process: Chrome, Socket > 1023

Provider/Service Process: NGINX, Socket = 80


What do we mean by Application Visibility: Internet Stack

[Diagram: two end hosts, each with a full stack (Application, Transport, Network, Data Link, Physical) and processes attached through sockets, connected across two intermediate devices (Network, Data Link, Physical)]


What Does the Tetration Sensor Collect: Socket Connectivity, the data flows

[Diagram: two end hosts, each with a full stack (Application, Transport, Network, Data Link, Physical) and processes attached through sockets, connected across two intermediate devices (Network, Data Link, Physical)]


What does the Sensor Collect: Context

[Diagram: two end hosts, each with a full stack (Application, Transport, Network, Data Link, Physical) and processes attached through sockets, connected across two intermediate devices (Network, Data Link, Physical)]

Process Information: Which process is it, who started it, etc.

Device Information: Buffer/ACL Drops, etc.


Sensor Data: Process Information

• Host Sensor collects information about the consumer and provider processes

• /proc

• runtime system information (e.g. system memory, devices mounted, hardware configuration, etc.)


Additional Context: External Data Sources

[Diagram: two end hosts, each with a full stack (Application, Transport, Network, Data Link, Physical) and processes attached through sockets, connected across two intermediate devices (Network, Data Link, Physical)]

Tetration Analytics Engine

CMDB, DNS, whois, Talos (future), etc.

Pervasive Sensors

APIC


What does the Sensor Collect: Socket Level Flow Information + Context Information

• Understanding of what happens TO ‘and’ INSIDE a flow

• Distributions (packet sizes, TCP windows…)

• Burstiness

Length

66

Length

9000

Accumulated Flow Information (Volume…)

Per Packet Variations

• Anomaly detection

• Latency (application and network)

• Events

• VXLAN information


Full vs. Sampled: What happens when you sample?

Full Packet Stream

Flow A

Flow B

Flow C

SYN SYNACK ACK FIN

Flow D


Full vs. Sampled: Reasons and Use Cases for Both

• Sampling has its use cases, in SP environments for example

• High Volume, no behavioral analysis

• Sampling provides a good statistical model

• For Trends

• For Traffic Visibility

• For Volume Indication

• Depending on the number of flows and type of flows

• Mice flows can go completely unseen

• Connection Oriented flows may not be tracked properly (missed flags)

• Accuracy of the flow increases with the packet count

• Type of sampling and quality of entropy

• Entropy is very important

Sampled Full
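The points about mice flows and missed flags can be illustrated with a small sketch. It assumes simple deterministic 1-in-N packet sampling and a made-up packet stream; the function name `observed_flags` and the flow names are hypothetical, and real samplers may behave differently.

```python
from collections import defaultdict


def observed_flags(packets, every_nth=1):
    """Return {flow_id: set of TCP flags seen} when keeping 1 packet in `every_nth`."""
    seen = defaultdict(set)
    for index, (flow_id, flag) in enumerate(packets):
        if index % every_nth == 0:  # deterministic 1-in-N sampling
            seen[flow_id].add(flag)
    return dict(seen)


# Hypothetical stream: a long "elephant" flow with a short "mouse" flow inside it.
packets = [("elephant", "SYN")] + [("elephant", "ACK")] * 4
packets += [("mouse", "SYN"), ("mouse", "ACK"), ("mouse", "FIN")]
packets += [("elephant", "ACK")] * 3 + [("elephant", "FIN")]

full = observed_flags(packets, every_nth=1)
sampled = observed_flags(packets, every_nth=4)

print(sorted(full))                 # ['elephant', 'mouse']
print(sorted(sampled))              # ['elephant']: the mouse flow is never seen
print(sorted(sampled["elephant"]))  # ['ACK', 'SYN']: the FIN was missed
```

With every packet examined, both flows and all of their flags are visible; with 1-in-4 sampling the short flow disappears and the long flow appears never to close.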


Tetration Examines every packet

Full Packet Stream

• Variability ‘within’ the flow

• Variability ‘between’ the flows

• Changes ‘within’ the flow


VXLAN-encapsulated packet: Ethernet Header | IP Header | UDP Header | VXLAN Header | Ethernet Header | IP Header | TCP Header | Payload

TCP packet: Ethernet Header | IP Header | TCP Header | Payload

UDP packet: Ethernet Header | IP Header | UDP Header | Payload

Meta-Data – Including Overlay VXLAN/GRE/IPinIP Encapsulated Header

Privacy Risk

Collects the Meta-Data not the Packet


• COS

• Overlay Type (Native, 802.1q / 802.1p, VXLAN, iVXLAN, NVGRE, NSH, other)

• Source TEP or Port ID

• Destination TEP

• Disposition (RPF or Port Security failure, Policy drop, redirect or span)

• Port type (spine to leaf or leaf to host)

Sensor Data: Flow Data – Forwarding


• Bytes, Packet Count

• IP options present

• IP length error

• DF bit set

• Fragment seen

• Last TTL

• Accumulated TCP flags

• Last ACK / SEQ

• Sampled Packet length

• Sampled Packet ID

Sensor Data Accumulated Flow Information


• Flow Cache has the notion of “bins” to build histograms

• TCP options length (8 bits)

• Payload length (12 bits)

• Receive window (6 bits)

• This means more visibility into the activity of the flow

• Bin sizes are configurable

• Bins don’t need to be of equal size (but need to be contiguous)

• Last bin will capture the configured size and above

Sensor Data Histogram Bins

1 0 1 0 1 0 0 0

#1

82 bits

#2

82 bits

#3

0 bits

#4

165 bits

Export

0 0 1 1 1 0 0 0

#5

82 bits

#6

82 bits

#7

130 bits

#8

165 bits

Export

= Histogram of the flow
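The binning rules above (configurable, contiguous, not necessarily equal in size, with an open-ended last bin) can be sketched in a few lines of Python. The bin edges and packet lengths below are hypothetical values chosen for illustration; they are not the Flow Cache defaults.

```python
from bisect import bisect_right


def bin_index(value, lower_edges):
    """Index of the bin holding `value`; the last bin also takes everything above it."""
    if value < lower_edges[0]:
        raise ValueError("value is below the first bin")
    return bisect_right(lower_edges, value) - 1


def histogram(values, lower_edges):
    """Count how many values fall into each contiguous bin."""
    counts = [0] * len(lower_edges)
    for value in values:
        counts[bin_index(value, lower_edges)] += 1
    return counts


# Hypothetical payload-length bins: [0,64) [64,128) [128,512) [512,1024) [1024,1500) [1500,...)
payload_edges = [0, 64, 128, 512, 1024, 1500]
lengths = [66, 66, 0, 178, 1299, 9000]

print(histogram(lengths, payload_edges))  # [1, 2, 1, 0, 1, 1]
```

Storing only the lower edge of each bin keeps the bins contiguous by construction, and the 9000-byte packet lands in the last bin because that bin has no upper limit.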


• Measure the “burstiness” of a flow

• Current Burst

• Max Burst

• Burst Index

• Flowlets

• Bursts are measured in 32k intervals

• Each export period is divided by 128

• Flowlets are activity after a silence period (configurable)

Sensor Data Burst

0 1 2 3 128

Current – 128

Max – 128

Burst Index - 0

Current – 256

Max – 256

Burst Index - 3

Current – 1024

Max – 1024

Burst Index - 80

8030

Current – 32

Max – 256

Burst Index - 3

Current – 0

Max – 1024

Burst Index - 80

Max Burst occurred at 62.5ms with a value of 1024 and 2 flowlets

SilenceFlowlet #1 Flowlet #2
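The burst counters in this example can be reproduced with a short sketch. It assumes a 100 ms export period (inferred from burst index 80 of 128 corresponding to 62.5 ms), a hypothetical silence threshold of 40 intervals, and made-up traffic in the intervals the slide does not show.

```python
def burst_stats(bytes_per_interval, silence_intervals):
    """Return (max_burst, burst_index, flowlets) for one export period."""
    max_burst = 0
    burst_index = 0
    flowlets = 0
    idle = silence_intervals  # treat the time before the period as silence
    for index, current in enumerate(bytes_per_interval):
        if current > max_burst:
            max_burst, burst_index = current, index
        if current > 0:
            if idle >= silence_intervals:
                flowlets += 1  # activity after a long enough silence starts a flowlet
            idle = 0
        else:
            idle += 1
    return max_burst, burst_index, flowlets


EXPORT_PERIOD_MS = 100  # assumption, see lead-in
INTERVALS = 128         # each export period is divided by 128

traffic = [0] * INTERVALS
traffic[0], traffic[3], traffic[30], traffic[80] = 128, 256, 32, 1024

max_burst, burst_index, flowlets = burst_stats(traffic, silence_intervals=40)
burst_time_ms = burst_index * EXPORT_PERIOD_MS / INTERVALS

print(max_burst, burst_index, flowlets, burst_time_ms)  # 1024 80 2 62.5
```

The gap between intervals 3 and 30 is shorter than the assumed threshold, so it stays inside the first flowlet; the longer gap before interval 80 starts the second.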


• TTL changed

• IP reserved flags are not 0

• DF bit has changed

• Ping of death

• Fragment is too small to contain L4 header (TCP, UDP and SCTP)

• TCP SYN and FIN are set

• TCP SYN and RST are set

• TCP FIN, PSH and URG are set

• TCP flags are zero’d

• TCP SYN with data

• TCP FIN with no ACK

• TCP RST with no ACK

• TCP SYN, FIN, RST and ACK zero’d

• URG set but no URG pointer

• URG pointer with no URG flag

• TCP seq outside the expected range

• TCP seq is less than expected (rexmit)

Sensor Data Anomaly List
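Several of the TCP flag checks in this list are simple bit tests. The sketch below covers only that subset and is not the sensor's implementation; the function name and the message strings are illustrative, while the flag bit values are the standard TCP header ones.

```python
# Standard TCP header flag bits.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def tcp_flag_anomalies(flags, payload_len=0, urgent_pointer=0):
    """Return the flag-related anomalies that apply to one TCP segment."""
    found = []
    if flags & SYN and flags & FIN:
        found.append("TCP SYN and FIN are set")
    if flags & SYN and flags & RST:
        found.append("TCP SYN and RST are set")
    if flags & FIN and flags & PSH and flags & URG:
        found.append("TCP FIN, PSH and URG are set")
    if flags == 0:
        found.append("TCP flags are zeroed")
    if flags & SYN and payload_len > 0:
        found.append("TCP SYN with data")
    if flags & FIN and not flags & ACK:
        found.append("TCP FIN with no ACK")
    if flags & RST and not flags & ACK:
        found.append("TCP RST with no ACK")
    if flags & URG and urgent_pointer == 0:
        found.append("URG set but no URG pointer")
    if not flags & URG and urgent_pointer != 0:
        found.append("URG pointer with no URG flag")
    return found


print(tcp_flag_anomalies(SYN | ACK))        # [] for a normal handshake reply
print(tcp_flag_anomalies(FIN | PSH | URG))  # the classic "Xmas" pattern
```

The sequence-number and fragment checks in the list need per-flow state and are left out of this sketch.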


• Way of approximating the RTT based on specific packet characteristics

• Preset ACK & SEQ

• Approximation as this includes the OS network stack

• Uses sampling, sample taken every 8192 bytes (by default, configurable)

• Tracks ACK for these specific SEQ and creates an event for each

• By using this global configuration, if return path is via another switch the ACK is still tracked

Sensor Data Events RTT Sample


RTT Sample – Example

TCP SEQN 8192

Event Triggered

TCP SEQN 100

Event time stamped

TCP ACK 100 TCP ACK 8192

Event time stamped

RTT = Event ACK TS – Event SEQ TS
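As a rough sketch of the idea, the class below timestamps a data segment each time the stream crosses another 8192 bytes and reports RTT = ACK timestamp minus SEQ timestamp when the matching ACK is seen. The class and method names are hypothetical, timestamps are plain integers in microseconds, and sequence-number wraparound is ignored.

```python
class RttSampler:
    """Approximate RTT from sampled SEQ/ACK pairs of one TCP flow."""

    def __init__(self, sample_every_bytes=8192):
        self.sample_every_bytes = sample_every_bytes
        self.next_sample_seq = sample_every_bytes
        self.pending = {}  # expected ACK number -> timestamp of the sampled segment

    def on_segment(self, seq, length, timestamp):
        """Timestamp the segment if it reaches the next sampling boundary."""
        end_seq = seq + length
        if end_seq >= self.next_sample_seq:
            self.pending[end_seq] = timestamp
            self.next_sample_seq = end_seq + self.sample_every_bytes

    def on_ack(self, ack, timestamp):
        """Return an RTT sample for every pending segment this ACK covers."""
        samples = []
        for expected in sorted(self.pending):
            if expected <= ack:
                samples.append(timestamp - self.pending.pop(expected))
        return samples


sampler = RttSampler()
sampler.on_segment(seq=0, length=4096, timestamp=1000)     # below the boundary, not sampled
sampler.on_segment(seq=4096, length=4096, timestamp=1500)  # reaches 8192, event time stamped
print(sampler.on_ack(ack=4096, timestamp=1800))  # [] because the sampled data is not yet acknowledged
print(sampler.on_ack(ack=8192, timestamp=2300))  # [800] microseconds
```

Because the timestamps are taken where the packets are observed, the result includes the time spent in the receiver's network stack, which is why the slide calls it an approximation.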


• Mouse Packet

• Export the first “n” packets of a flow (configurable)

• Analytics Changed

• A parameter of the flow has changed (bit mask comparison), 1 mask configurable

• Packet Value Match

• A packet field contains a specific value, 1 field configurable (mask + value)

Sensor Data Events


• Let’s take this Web page request as an example

• Assumption is that it’s the first connection, this is a new flow

• One flow is created per direction

Sensor Data Example

76 77 78 79 82 83 84 85 86 88 89

Flow Export

Flow A A A A A AB B B B B


First Packet Event

76 77 78 79 82 83 84 85 86 88 89

Event Triggered


Mouse Packet Event

76 77 78 79 82 83 84 85 86 88 89

Length 78 74 66 178 66 1299 66 66 66 66 66

• n = 2 (2nd packet of a flow, within an export interval)

Event Triggered


Analytics Changed Event

76 77 78 79 82 83 84 85 86 88 89

Event Triggered

• Bitmask = sampled packet length (in the flow analytics TCAM)

Length 78 74 66 178 66 1299 66 66 66 66 66

Sampled 78 78 66 66 66 1299 66 66 66 66 66


Packet Value Match Event

76 77 78 79 82 83 84 85 86 88 89

Event Triggered

TTL 64 58 64 64 58 58 64 64 58 58 64

• TTL = 64
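The mask + value comparison behind this event can be sketched as follows, using the packet numbers and TTL values from this example. The helper name and the 0xFF mask (compare the whole 8-bit TTL field) are assumptions made for illustration.

```python
def packet_value_match(field_value, mask, value):
    """True when the masked packet field equals the configured value."""
    return (field_value & mask) == value


packet_ids = [76, 77, 78, 79, 82, 83, 84, 85, 86, 88, 89]
ttls = [64, 58, 64, 64, 58, 58, 64, 64, 58, 58, 64]

triggered = [
    packet_id
    for packet_id, ttl in zip(packet_ids, ttls)
    if packet_value_match(ttl, mask=0xFF, value=64)
]
print(triggered)  # [76, 78, 79, 84, 85, 89]
```

A narrower mask would let the same mechanism match on part of a field, for example a single flag bit.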


Pervasive Visibility: Flow Search and Forensics


Tetration Analytics Platform

Architecture - Cluster


Tetration Analytics Architecture Overview

Data Collection: Host Sensors (VM), Network Sensors, 3rd-Party Metadata Sources

Network sensor platforms shown: Cisco Nexus® 92160YC-X, Cisco Nexus 93180YC-EX

Tetration Telemetry; Configuration Data

Analytics Engine: Cisco Tetration Analytics™ Platform

Visualization and Reporting: Web GUI, REST API, Push Events


The Analytics Cluster: Components

• Hadoop Based Platform

• Self managed

• One touch deployment

• Tiered System

• Heavy Compute for Machine Learning

• Caching for light speed queries

• Extensibility (future)

• Messaging Bus

• API Access

Long Term Storage (Data Lake)

Caching (Search)

Front End

Compute (Data Cleaning and Analytics)


• The Analytics Cluster operates as an appliance

• Avoids the need for in house Big Data, Analytics expertise

• Supported by Cisco TAC

• Self Monitoring

• The cluster leverages a sensor architecture to track its state and provides event based notifications for

• Software upgrades and full install are all automated

The Analytics Cluster: Appliance


Cluster Monitoring and Maintenance


Collector Monitoring and Maintenance


Sensor Monitoring and Maintenance: Sensor Throttled


Hardware Sensor Monitoring


FCS Analytics Cluster Configurations

4 x 3-Phase PDU, 22.5 KW Peak Power

4 x 1-Phase PDU, 11.5 KW Peak Power


Options for Future Cluster Models


Analytics Engine: The Platform

• Hadoop Based Platform

• Self managed

• One touch deployment

• Tiered System

• Heavy Compute for Machine Learning

• Caching for light speed queries

• Extensibility (future)

• Messaging Bus

• API Access

Long Term Storage (Data Lake)

Caching (Search)

Front End

Compute (Data Cleaning and Analytics)


Front End: GUI, RESTful API, Messaging BUS

• Servers hosting front end processes

• GUI and Operational Interfaces

• RESTful API (post FCS)

• Messaging BUS (post FCS)


Data Processing: Pipeline

• Data Ingest and Processing

• Multiple Pipelines for different processing activities

• Scaled to Millions of events per second


Caching Layer: Natural Language Search


Caching Layer: Search

• Caching Layer provides a large in memory and flash based data store for real time searches

e.g. 16 weeks of policy delta data accessible for real time search


Data Lake HDFS Storage

• Long Term Storage for collected observations, for pipeline processing tasks, etc

• Usage is based on

• Time Based Retention

• Space Based Retention

• Greedy Retention

• Max possible Retention period will depend on cluster size and observation rate

14.10 K hours of available capacity at the current collection rates (587 days)


Standard Data Analytics Pipeline: Tetration Data Analysis

Pipeline stages: Data Prep & Cleansing → Data Aggregation → Statistical Analysis & Prediction Tools → Automated Data Discovery & Evaluation → Reporting, Visualization or Alerts

De-duplication, unification of uni-directional flows into bi-directional, annotate flows with context information, etc.

Sensor Collectors

Various Pipelines (e.g. ADM) process the data to derive appropriate insights

GUI, REST API, Kafka, Policy Export, …


Data Collection: Sensor to Collector

Pipeline stages: Data Prep & Cleansing → Data Aggregation → Statistical Analysis & Prediction Tools → Automated Data Discovery & Evaluation → Reporting, Visualization or Alerts


Data Prep

Pipeline stages: Data Prep & Cleansing → Data Aggregation → Statistical Analysis & Prediction Tools → Automated Data Discovery & Evaluation → Reporting, Visualization or Alerts

• De-duplication, unification of uni-directional flows into bi-directional, annotate flows with context information, etc.
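One way to picture the unification step is to give both directions of a conversation the same key and sum their counters. The sketch below assumes simple dictionary records with hypothetical field names and values; it does not handle de-duplicating the same flow reported by several sensors.

```python
def flow_key(record):
    """Direction-independent key: protocol plus the two endpoints in sorted order."""
    a = (record["src_ip"], record["src_port"])
    b = (record["dst_ip"], record["dst_port"])
    low, high = sorted([a, b])
    return (record["protocol"], low, high)


def unify(records):
    """Merge uni-directional flow records into bi-directional totals."""
    flows = {}
    for record in records:
        entry = flows.setdefault(flow_key(record), {"packets": 0, "bytes": 0})
        entry["packets"] += record["packets"]
        entry["bytes"] += record["bytes"]
    return flows


# Hypothetical records: one per direction of the same TCP conversation.
records = [
    {"src_ip": "10.0.0.5", "src_port": 49152, "dst_ip": "10.0.0.9", "dst_port": 443,
     "protocol": "tcp", "packets": 6, "bytes": 1761},
    {"src_ip": "10.0.0.9", "src_port": 443, "dst_ip": "10.0.0.5", "dst_port": 49152,
     "protocol": "tcp", "packets": 5, "bytes": 330},
]

flows = unify(records)
print(len(flows))  # 1 bi-directional flow
print(flows[("tcp", ("10.0.0.5", 49152), ("10.0.0.9", 443))])  # {'packets': 11, 'bytes': 2091}
```

Annotation with context (process, hostname, external metadata) would then attach to the single merged entry rather than to each direction separately.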

[Diagram: two end hosts, each with a full stack (Application, Transport, Network, Data Link, Physical) and processes attached through sockets, connected across two intermediate devices (Network, Data Link, Physical), reporting to two Collectors]


Analyzing the Data

Pipeline stages: Data Prep & Cleansing → Data Aggregation → Statistical Analysis & Prediction Tools → Automated Data Discovery & Evaluation → Reporting, Visualization or Alerts

• Endpoints are iteratively compared with each other to find which “profiles” are most similar

• Sensor Data: Ports provided and consumed, Addresses sent to and received from, Properties of network flows, Running processes, Process originating flow, Hostname

• External Context: Load balancers / DNS / route tags

• Human approved clusters from current or other workspaces and base cluster definition

• This is an example of where we use machine learning


Machine Learning

Cognitive Computing - Finding and remembering all the relationships between data, querying the matrix of relationships (Watson)

Machine Learning - Remember what has happened before and then look at new data coming in, in that context, to try to find patterns; build up a body of knowledge and then use it to make a decision based on the new data. Can machines remember and apply what they remember to new data?

Deep Learning - Not trying to maintain data and relationships over time, but analyzing the data through better representations and creating models to learn these representations from large-scale unlabeled data. Succession analysis


Machine Learning

A “field of study that gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959)

The programmer’s construction of algorithms that can learn from and make predictions on data (as opposed to static programming instructions).

7:00 am = 65 degrees

8:00 am = 75 degrees

9:00 am = 85 degrees

How warm will it be at 8:30 am tomorrow?

80 degrees (continuing the trend of 10 degrees per hour)

Supervised learning: Linear regression, Logistic regression, SVMs

Unsupervised learning: K-means, PCA, Anomaly detection
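The temperature question can be worked as a tiny supervised-learning example. The sketch below fits an ordinary least-squares line to the three readings and evaluates it at 8:30; choosing a straight line as the model is an assumption, and a different model would give a different prediction.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


hours = [7.0, 8.0, 9.0]     # 7:00, 8:00 and 9:00 am
temps = [65.0, 75.0, 85.0]  # degrees

slope, intercept = fit_line(hours, temps)
prediction = slope * 8.5 + intercept  # 8:30 am

print(slope, intercept, prediction)  # 10.0 -5.0 80.0
```

The three readings lie exactly on a line rising 10 degrees per hour, so the straight-line prediction for 8:30 am is 80 degrees.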


ADM Clustering: Machine Learning Example


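K-means Algorithm: Finding the Clusters

Randomly initialize $K$ cluster centroids $\mu_1, \mu_2, \dots, \mu_K \in \mathbb{R}^n$

Repeat {

for $i = 1$ to $m$: $c^{(i)}$ := index (from 1 to $K$) of cluster centroid closest to $x^{(i)}$

for $k = 1$ to $K$: $\mu_k$ := average (mean) of points assigned to cluster $k$

}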
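The two alternating steps of K-means (assign each point to its closest centroid, then move each centroid to the mean of its points) can be written as a short plain-Python sketch. The function names, the fixed iteration count, the seed parameter, and the sample points are illustrative assumptions; this is not the ADM implementation.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20, seed=0):
    """Return (centroids, assignments) after a fixed number of K-means iterations."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initialization from the data
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        for i, point in enumerate(points):
            assignments[i] = min(range(k), key=lambda j: sq_dist(point, centroids[j]))
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = kmeans(points, k=2)
print(assignments)  # the first three points share one label, the last three the other
```

In practice the loop usually stops when the assignments no longer change, and the result can depend on the random initialization, which is one reason the resulting clusters are validated afterwards.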


Silhouetting: Validation of the Cluster

https://en.wikipedia.org/wiki/Silhouette_(clustering)

• The silhouette value is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation)

• Gives greater confidence that the clustering is representative
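For a single point, the silhouette value is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster (cohesion) and b is the lowest mean distance to the members of any other cluster (separation). The sketch below follows that standard definition; the function name and sample points are illustrative, and `math.dist` needs Python 3.8 or later.

```python
from math import dist


def silhouette(points, labels, i):
    """Silhouette value of point i, between -1 (poor fit) and 1 (good fit)."""
    own = labels[i]
    same = [dist(points[i], p) for j, p in enumerate(points)
            if labels[j] == own and j != i]
    if not same:
        return 0.0  # convention for a cluster with a single member
    a = sum(same) / len(same)  # cohesion
    b = min(                   # separation: the nearest other cluster
        sum(dist(points[i], p) for p, l in zip(points, labels) if l == other)
        / labels.count(other)
        for other in set(labels) if other != own
    )
    return (b - a) / max(a, b)


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = silhouette(points, [0, 0, 1, 1], 0)
bad = silhouette(points, [0, 1, 0, 1], 0)
print(round(good, 2), round(bad, 2))  # 0.93 -0.44
```

Values near 1 indicate that a point sits comfortably inside its cluster; values below 0 suggest it would fit better in a neighbouring one.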


Results of the Clustering Machine Learning


Tuning Cluster Granularity: Tuning the Algorithms



Analyzing the Data: Fitting the Curve

Pipeline stages: Data Prep & Cleansing → Data Aggregation → Statistical Analysis & Prediction Tools → Automated Data Discovery & Evaluation → Reporting, Visualization or Alerts

• Every data set (e.g. flow) is examined to find the best function that describes its behaviour

• Comparison within and between ‘flows’ can be used to find ‘outlier’ or anomaly conditions


Visual Query with Flow Exploration

Replay flow details like a DVR

Information mapped across 25 different dimensions

Thick lines indicate common flows

Faint lines indicate uncommon flows


Outliers: What does not look like it ‘fits’

• Switch on Outlier view to highlight uncommon flows

• Outlier dimension is highlighted with a purple circle


Tetration Application Insight


Why This Approach Is Different

App Insight derived based on actual communication

Automated grouping of similar endpoints in a cluster

Flexibility of using hardware or software sensors

Keep your App Insight up-to-date based on application evolution


Dependencies

Why should I understand them?


What can I do with this information?


Why should I understand dependencies?

Identify a single point of failure that should be replicated

Find all the parts of a service that should be migrated together to the cloud

Replace infrastructure components of an undocumented application

Derive ACI application profiles, endpoint groups, and contracts based on applications
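The first use case, identifying a single point of failure, can be sketched from observed flows alone. This is an illustrative assumption about how such a check might look, not the product's algorithm: the flow tuple layout, host names and service names are hypothetical.

```python
from collections import defaultdict


def single_points_of_failure(flows):
    """Map each service served by exactly one host to (host, consumers).

    `flows` is an iterable of (consumer, provider_host, service) tuples.
    """
    providers = defaultdict(set)
    consumers = defaultdict(set)
    for consumer, provider_host, service in flows:
        providers[service].add(provider_host)
        consumers[service].add(consumer)
    return {
        service: (next(iter(hosts)), sorted(consumers[service]))
        for service, hosts in providers.items()
        if len(hosts) == 1
    }


# Hypothetical observed flows: two app servers behind a load balancer, one database.
flows = [
    ("lb-1", "app-1", "http"),
    ("lb-1", "app-2", "http"),
    ("app-1", "db-1", "mysql"),
    ("app-2", "db-1", "mysql"),
]
report = single_points_of_failure(flows)
```

Here only the `mysql` service is reported, because every consumer depends on the single host `db-1`.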

Application Dependency Mapping

Diagram: Load Balancer, App, Database


Understand the communication

Diagram: Load Balancer, App, Database

Initial recommendations

Diagram: Load Balancer, App, Database, Cache

Optional and minimal human supervision

Diagram: Load Balancer, App, Database, Cache

Approve the clustering

Diagram: Load Balancer, App, Database

Enforcement Anywhere

Cisco Tetration Analytics™

Cisco ACI™ and Cisco Nexus® 9000 Series

Standalone

Linux and Microsoft Windows

Servers and VM

Public Cloud

Data

Whitelist policy

{
  "src_name": "App",
  "dst_name": "Web",
  "whitelist": [
    {"port": [0, 0], "proto": 1, "action": "ALLOW"},
    {"port": [80, 80], "proto": 6, "action": "ALLOW"},
    {"port": [443, 443], "proto": 6, "action": "ALLOW"}
  ]
}

• Cisco ACI EPG/Contract Integration via Cisco ACI Toolkit

• Traditional Network ACL

• Firewall Rules

• Host Firewall Rules
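To make the whitelist policy concrete, the sketch below parses the JSON shown on this slide and renders one human-readable rule per entry. The rule text is a generic illustration, not the syntax of any particular ACL, firewall or ACI contract; the function name is hypothetical, and the protocol numbers follow the IANA assignments (1 = ICMP, 6 = TCP, 17 = UDP).

```python
import json

# IANA protocol numbers for the protocols used in the slide's example.
PROTOCOL_NAMES = {1: "icmp", 6: "tcp", 17: "udp"}

POLICY_JSON = """
{
  "src_name": "App",
  "dst_name": "Web",
  "whitelist": [
    {"port": [0, 0], "proto": 1, "action": "ALLOW"},
    {"port": [80, 80], "proto": 6, "action": "ALLOW"},
    {"port": [443, 443], "proto": 6, "action": "ALLOW"}
  ]
}
"""


def render_rules(policy):
    """Turn one whitelist policy into generic, human-readable rule strings."""
    rules = []
    for entry in policy["whitelist"]:
        proto = PROTOCOL_NAMES.get(entry["proto"], str(entry["proto"]))
        low, high = entry["port"]
        if proto == "icmp":
            ports = ""  # ICMP has no ports; the slide encodes this as [0, 0].
        elif low == high:
            ports = f" port {low}"
        else:
            ports = f" ports {low}-{high}"
        rules.append(
            f'{entry["action"]} {proto} {policy["src_name"]} -> {policy["dst_name"]}{ports}'
        )
    return rules


rules = render_rules(json.loads(POLICY_JSON))
```

A translator for a real enforcement point would replace the string formatting with that platform's own rule format.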

Amazon Web Services

Microsoft Azure

Google Cloud

Application Centric, Okay but how do I get there?


Policy Creation Flow

Export Clusters and Policies in JSON/XML format

Import Policy using ACI Toolkit

Automatic creation of EPGs and Contracts

Diagram: Cisco Tetration Analytics™, ACI Toolkit, APIC, Nexus 9K (Data, Network Policy, Application Policy)

ACI Toolkit

• Simple toolkit built on top of APIC API

• Set of simple python classes

• Python Library

• Used to generate REST API calls

• Runs locally

• Small number of classes

• ~30 currently

• “Intuitive” names

• Not full functionality, most common

• Focused primarily on configuration

• Preserves ACI basic concepts

• Tenants, EPGs, Contracts, etc.

Diagram: APIC, ACI Toolkit, Linux Commands, NX-OS-like CLI, Custom Python Scripts

Configpush Application

• Runs as a command line tool or a REST service. The initial expected usage is as a command line tool.

• https://github.com/datacenter/acitoolkit/tree/master/applications/configpush

• Command line tool is here:

• https://github.com/datacenter/acitoolkit/blob/master/applications/configpush/apic_tool.py

• Takes the JSON provided by Tetration and pushes it to the APIC. It requires the APIC credentials and the tenant/app profile in which to place the EPGs.

• https://acitoolkit.readthedocs.io/en/latest/

Configpush Application Syntax

• python apic_tool.py -h

usage: apic_tool.py [-h] [--maxlogfiles MAXLOGFILES]
                    [--debug [{verbose,warnings,critical}]] [--config CONFIG]
                    [-u URL] [-l LOGIN] [-p PASSWORD] [--displayonly]
                    [--tenant TENANT] [--app APP]

• optional arguments:

  -h, --help            show this help message and exit
  --maxlogfiles MAXLOGFILES
                        Maximum number of log files (default is 10)
  --debug [{verbose,warnings,critical}]
                        Enable debug messages.
  --config CONFIG       Configuration file
  -u URL, --url URL     APIC IP address
  -l LOGIN, --login LOGIN
                        APIC login ID.
  -p PASSWORD, --password PASSWORD
                        APIC login password.
  --displayonly         Only display the JSON configuration. Do not
                        actually push to the APIC.
  --tenant TENANT       Tenant name for the configuration
  --app APP             Application profile name for the configuration
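As a usage sketch, the helper below assembles an apic_tool.py command line from the options listed in the help text above. The helper name, the file name policy.json, the APIC URL, the credentials, and the tenant and application profile names are all hypothetical placeholders; only the flags come from the usage output.

```python
import shlex


def build_apic_tool_command(config, url, login, password, tenant, app, display_only=True):
    """Build the argument list for apic_tool.py using the documented flags."""
    argv = [
        "python", "apic_tool.py",
        "--config", config,
        "-u", url,
        "-l", login,
        "-p", password,
        "--tenant", tenant,
        "--app", app,
    ]
    if display_only:
        # Dry run: only display the JSON configuration, do not push to the APIC.
        argv.append("--displayonly")
    return argv


# Hypothetical values; replace with the exported policy file and real APIC details.
argv = build_apic_tool_command(
    "policy.json", "https://apic.example.com", "admin", "secret", "Demo", "WebApp"
)
print(shlex.join(argv))
```

Starting with `--displayonly` lets you review the generated configuration before running the same command without it.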

Policy Simulation and Compliance


We know the expected communication

Diagram: Load Balancer, App, Database


Publish, export, and enforce Policy

Load Balancer, App, Database

172.31.185.158

172.31.185.152

172.31.185.154

172.31.185.156

172.31.185.149

172.31.185.150

172.31.185.151

Database: Provides Port 3306

Load Balancer: Provides Port 3306


But how do we map this to real life?

Load Balancer, App, Database

172.31.185.158

172.31.185.152

172.31.185.154

172.31.185.156

172.31.185.149

172.31.185.150

172.31.185.151

Escaped out of policy flow!

Misdropped packets!

Policy Compliance Verification & Simulation

What was seen on the network that was out of Policy

Permitted Traffic Seen on the network
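The compliance check can be sketched as a comparison between observed flows and the whitelist. The definitions used here are an interpretation of the slide labels, not a documented algorithm: "misdropped" is taken to mean traffic the policy permits but that was dropped, and "escaped" to mean traffic outside the policy that was nevertheless delivered. The tuple layout and names are hypothetical; the addresses are taken from the earlier slides.

```python
def classify_flows(observed, permitted):
    """Sort observed flows into compliance buckets.

    `observed` holds (src, dst, proto, port, delivered) tuples and
    `permitted` is a set of (src, dst, proto, port) whitelist entries.
    """
    report = {"permitted": [], "misdropped": [], "escaped": [], "rejected": []}
    for src, dst, proto, port, delivered in observed:
        flow = (src, dst, proto, port)
        allowed = flow in permitted
        if allowed and delivered:
            report["permitted"].append(flow)   # In policy and seen on the network.
        elif allowed:
            report["misdropped"].append(flow)  # In policy but dropped.
        elif delivered:
            report["escaped"].append(flow)     # Out of policy but got through.
        else:
            report["rejected"].append(flow)    # Out of policy and dropped, as intended.
    return report


# Hypothetical whitelist and observations.
permitted = {
    ("172.31.185.152", "172.31.185.149", "tcp", 3306),
    ("172.31.185.154", "172.31.185.149", "tcp", 3306),
}
observed = [
    ("172.31.185.152", "172.31.185.149", "tcp", 3306, True),
    ("172.31.185.154", "172.31.185.149", "tcp", 3306, False),
    ("172.31.185.158", "172.31.185.149", "tcp", 22, True),
]
report = classify_flows(observed, permitted)
```

Running the same classification against a proposed whitelist, rather than the enforced one, gives a simple form of policy simulation.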

Summary


ACI Architecture

Intent (May)

Assurance (Can)

Analytics (Did)

Configuration Analysis

“Very Large State-Space”

Traffic Analysis

“Lots of Data”

Guarantees

Compliance

Consistency

ACI

ADM

Security

Forensics


Summary

Pervasive flow telemetry that supports infrastructure for multiple data centers at scale

Ready-to-use solution to address critical data center operational use cases

Self-monitoring, eliminating the need for in-house big data expertise

Open platform and northbound APIs enable transparent integration

Accelerated adoption and comprehensive Solution support with Services

Q & A


Complete Your Online Session Evaluation

Give us your feedback and receive a Cisco 2016 T-Shirt by completing the Overall Event Survey and 5 Session Evaluations.

– Directly from your mobile device on the Cisco Live Mobile App

– By visiting the Cisco Live Mobile Site http://showcase.genie-connect.com/ciscolivemelbourne2016/

– Visit any Cisco Live Internet Station located throughout the venue

T-Shirts can be collected Friday 11 March at Registration

Learn online with Cisco Live!

Visit us online after the conference for full access to session videos and presentations.

www.CiscoLiveAPAC.com

Continue Your Education

• Demos in the Cisco campus

• Walk-in Self-Paced Labs

• Lunch & Learn

• Meet the Engineer 1:1 meetings

• Related sessions


Thank you