
Enriching Network Security Analysis with Time Travel

Gregor Maier1, Robin Sommer2, Holger Dreger3, Anja Feldmann1, Vern Paxson4, Fabian Schneider1

ACM SIGCOMM 2008

1TU Berlin / DT Lab, 2ICSI / LBNL, 3Siemens AG Corporate Technology, 4ICSI / UC Berkeley


2008/9/5 Speaker: Li-Ming Chen

Reference

Stefan Kornexl, Vern Paxson, Holger Dreger, Anja Feldmann, Robin Sommer, “Building a Time Machine for Efficient Recording and Retrieval of High-Volume Network Traffic,” 5th ACM IMC 2005.

Stefan Kornexl, “High-Performance Packet Recording for Network Intrusion Detection,” Master Thesis, 2005.

Gregor Maier, Robin Sommer, Holger Dreger, Anja Feldmann, Vern Paxson, Fabian Schneider, “Enriching Network Security Analysis with Time Travel,” ACM SIGCOMM 2008.

Time Machine webpage:

http://www.net.t-labs.tu-berlin.de/research/tm/


Outline

Introduction
Time Machine (TM) Design
Performance Evaluation
Coupling TM with a Network Intrusion Detection System (NIDS)
Discussion
Conclusion & comments


Introduction

Definition
  Time Travel is the capability that allows us to conveniently “travel back in time”
  A Time Machine is the system that provides the “Time Travel” capability

This paper presents a Time Machine (TM) for network traffic to enable later inspection of activity that becomes interesting only in retrospect

Benefit for network security monitoring?
  Security forensics
  Network troubleshooting
  Event correlation


Problems

(Storage) Wholesale recording and retention of entire data streams is infeasible
  A Gigabit network produces several TB per day
  However, network traces with full packet content provide the most information for investigating security incidents

(Data selection) Only a very small subset of the traffic is relevant for later analysis
  How to decide beforehand what data will be crucial?

(Analysis) Data retrieval is like finding a needle in a haystack
  It is time-consuming and cumbersome
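
The storage problem can be made concrete with a back-of-the-envelope calculation. The sketch below converts a sustained link rate into a daily volume. The utilization figures are illustrative assumptions, not measurements from the slides.

```python
def tb_per_day(rate_bps: float) -> float:
    """Daily volume in terabytes (10**12 bytes) for a sustained bit rate."""
    seconds_per_day = 24 * 60 * 60
    return rate_bps / 8 * seconds_per_day / 1e12

# A fully utilized 1 Gbps link (assumed worst case).
full = tb_per_day(1e9)
# The same link at an assumed 30% average utilization.
partial = tb_per_day(0.3e9)
print(f"{full:.1f} TB/day at 100%, {partial:.2f} TB/day at 30%")
```

Even at partial utilization the result lands in the "several TB per day" range, which is why recording everything does not scale.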


Common Practice at LBNL (Before using TM)

LBNL: Lawrence Berkeley National Laboratory
  About 10,000 hosts
  10 Gbps Internet connectivity
  1-2 TB per day
  320 Mbps (37 Kpps) at busy hour (IMC’05)

Bulk recording with tcpdump
  Due to storage constraints:
    Omit key services (HTTP, FTP, etc.)
    Omit some high-volume hosts

Manual analysis of traces after an incident
  The omissions constitute a blind spot during analysis
  An increasing number of attacks are carried out over HTTP


Objective

Design a Time Machine (prototype) (IMC’05)
  Record raw packets (not only headers but full contents; no aggregation or attribution)
  Leverage heavy tails to capture nearly all of the likely-interesting traffic while storing only a small fraction of the total volume

A better Time Machine!! (SIGCOMM’08)
  Re-architected for better performance based on real-world experience
  Coupled with a rich query interface
  Facilitates both manual (operator-driven) and automated (NIDS-driven) retrospective analysis


Time Machine (Key Insight)

“Heavy-tailed” distribution in network traffic
  Most network connections are quite short
    91% of connections < 10 KB
  A minority of connections carry most of the volume
    Bulk data transfer (video, audio, etc.)

Relevant/interesting data is mostly at the beginning
  Handshakes, application protocol headers…
  The compromise happens at the beginning of most attacks
  For forensics and troubleshooting applications, the beginning of a large connection contains the most significant information


Time Machine (Employ Cutoff Limit)

Exploit the “heavy-tailed” nature to partition the traffic stream into a small subset of high interest vs. a large remainder of low interest
  Then record the small subset and discard the rest

Cutoff limit, N: only store the first N bytes per connection
  Greatly reduces the traffic we must buffer
  Retains full context for small connections and the beginning of large connections
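
A minimal sketch of the cutoff idea, assuming a per-connection byte counter. The `Packet` type, the `conn_id` key, and the choice to store the packet that crosses the limit in full are illustrative assumptions, not the TM's actual implementation.

```python
from collections import defaultdict
from typing import NamedTuple

class Packet(NamedTuple):
    conn_id: tuple      # hypothetical key, e.g. the connection 5-tuple
    payload: bytes

class CutoffFilter:
    """Keep packets until a connection has contributed `cutoff` bytes."""

    def __init__(self, cutoff: int):
        self.cutoff = cutoff
        self.seen = defaultdict(int)   # bytes observed so far per connection

    def accept(self, pkt: Packet) -> bool:
        # Decide on the bytes seen before this packet, so the packet that
        # crosses the limit is still stored whole (a simplification).
        keep = self.seen[pkt.conn_id] < self.cutoff
        self.seen[pkt.conn_id] += len(pkt.payload)
        return keep

# One short connection (5 KB) and one bulk transfer (150 KB).
f = CutoffFilter(cutoff=15_000)
small = [Packet(("a",), b"x" * 1_000) for _ in range(5)]
large = [Packet(("b",), b"x" * 1_500) for _ in range(100)]
stored = [p for p in small + large if f.accept(p)]
stored_bytes = sum(len(p.payload) for p in stored)
total_bytes = sum(len(p.payload) for p in small + large)
print(stored_bytes, "of", total_bytes, "bytes stored")
```

The short connection is kept in full, while the bulk transfer contributes only its first 15,000 bytes.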


TM “Multi-threaded” Architecture

Capture using libpcap
Map packets to connections and enforce the cutoff for each connection
Separate storage classes; different classes can have different cutoffs and buffer budgets
Manage buffer budgets; subject to the budget constraints, the TM always stores the most recent packets
Support efficient queries; indexes can be configured for any subset of a packet’s header fields (depending on the queries)
Manage indexes
Queries must refer to indexed fields
Support 2 delivery methods
Support query subscription
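
The budget and index behavior described above can be sketched as a FIFO buffer with a byte budget and a single index. This is an illustrative sketch only: the class name, the choice of source IP as the indexed field, and the in-memory layout are assumptions, not the TM's real data structures.

```python
from collections import defaultdict, deque

class BudgetedStore:
    """FIFO packet buffer with a byte budget and one index on source IP."""

    def __init__(self, budget_bytes: int):
        self.budget = budget_bytes
        self.used = 0
        self.packets = deque()               # oldest packet on the left
        self.by_src = defaultdict(deque)     # index: src_ip -> packets

    def add(self, src_ip: str, data: bytes) -> None:
        entry = (src_ip, data)
        self.packets.append(entry)
        self.by_src[src_ip].append(entry)
        self.used += len(data)
        # Evict the oldest packets until the budget holds again, so the
        # buffer always contains the most recent traffic.
        while self.used > self.budget:
            old_src, old_data = self.packets.popleft()
            self.by_src[old_src].popleft()   # same entry: order is preserved
            if not self.by_src[old_src]:
                del self.by_src[old_src]
            self.used -= len(old_data)

    def query(self, src_ip: str) -> list:
        """Return stored payloads for src_ip, oldest first, via the index."""
        return [data for _, data in self.by_src.get(src_ip, ())]

store = BudgetedStore(budget_bytes=10)
store.add("10.0.0.1", b"aaaa")
store.add("10.0.0.2", b"bbbb")
store.add("10.0.0.1", b"cccc")   # 12 bytes > 10: the oldest packet is evicted
print(store.query("10.0.0.1"))
```

A query on a field without an index would need a full scan of the buffer, which is one way to read the requirement that queries refer to indexed fields.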


TM live deployments at MWN and LBNL

Institution         MWN                                  LBNL
Environment
  # hosts           ~50,000                              ~10,000
  Uplink capacity   10 Gbps                              10 Gbps
  Traffic volume    3~6 TB/day                           1~2 TB/day
TM setting
  Cutoff limit      15 KB                                15 KB
  Memory budget     750 MB                               150 MB
  Disk budget       2.1 TB                               500 GB
  CPU               Dual-CPU AMD Opteron 244, 1.8 GHz    Dual-core Intel Pentium D, 3.7 GHz
  RAM               4 GB                                 4 GB
  Kernel            Linux 2.6.15.1                       FreeBSD 6.2
  NIC               1 Gbps Endace DAG card               Neterion 10 Gbps NIC

Endace DAG network monitoring card: http://www.endace.com/our-products/dag-network-monitoring-cards/


Recording: Cutoff vs. Data Volume

Average data rate
  Bulk data transfer in MWN
  Connections at LBNL are more lightweight

Data reduction rate
  LBNL exhibits higher variability (shows a diurnal variation)


Recording: Does the TM Have Sufficient CPU Resources for Query Processing?

For recording and indexing, CPU utilization is low


Recording: Retention Time (how long do we store packet data?)

(original 3~6 TB/day)

Avg. 4 days

LBNL has a longer retention time, even though its budgets are smaller
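
As a rough cross-check, retention time can be estimated as the disk budget divided by the volume stored per day. The sketch below assumes a full buffer in steady state. The 2.1 TB budget and 3~6 TB/day are the MWN figures listed for the deployment, while the 90% reduction is an assumed round number.

```python
def retention_days(budget_tb: float, raw_tb_per_day: float, reduction: float) -> float:
    """Days of history that fit in the budget once the buffer is full."""
    stored_per_day = raw_tb_per_day * (1.0 - reduction)
    return budget_tb / stored_per_day

# MWN: 2.1 TB disk budget, 3~6 TB/day of raw traffic, assumed 90% reduction.
low_load = retention_days(2.1, 3.0, 0.90)
high_load = retention_days(2.1, 6.0, 0.90)
print(f"{high_load:.1f} to {low_load:.1f} days")
```

Under these assumptions the estimate spans a few days to a week, the same order of magnitude as the average reported on the slide.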


Querying: number of queries the TM can handle

At LBNL, focus on in-memory queries

Suffices to cope with the number of automated queries generated by a NIDS (mentioned later)


Querying: latency between issuing queries and receiving the corresponding replies

At LBNL, with live traffic

Naturally, we wish to keep the latency low, both to provide timely responses and to ensure accessibility of the data (in-memory queries)

Results shown separately for in-memory and on-disk queries


Experiences Operating the “Original” TM (IMC’05) at LBNL

1.) Manual querying is infeasible
  Many NIDS alerts require the analyst to manually interact with the TM to extract the corresponding traffic before inspecting it
  Provide a direct interface between the NIDS and the TM to extract the relevant traffic

2.) Dynamic adaptation of the TM is required
  Sometimes the analyst needs access to more details of problematic connections through bulk recording
  The NIDS can automatically instruct the TM to suspend the cutoff


Experiences Operating the “Original” TM (IMC’05) at LBNL (cont’d)

3.) Support a two-tiered analysis strategy
  Use cheap, preliminary heuristics to find a pool of possibly problematic connections, and then perform much more expensive analysis on just that pool
  Coupling the TM with a NIDS enables the NIDS to perform retrospective analysis

4.) Fine-tune the TM’s performance
  Accommodate the interactions among recording, indexing, and random queries under rigorous real-time requirements
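
The two-tiered strategy in point 3 can be sketched as a cheap filter followed by an expensive check on the survivors. The connection records and both heuristics below are hypothetical placeholders. In a real deployment the second tier would presumably fetch the stored packets from the TM before analyzing them.

```python
from typing import Callable, Iterable, List

def two_tier(conns: Iterable[dict],
             cheap: Callable[[dict], bool],
             expensive: Callable[[dict], bool]) -> List[dict]:
    """Run `expensive` only on connections that `cheap` flags."""
    pool = [c for c in conns if cheap(c)]          # tier 1: cheap heuristic
    return [c for c in pool if expensive(c)]       # tier 2: deep analysis

conns = [
    {"id": 1, "port": 80,   "payload": b"GET / HTTP/1.1"},
    {"id": 2, "port": 6667, "payload": b"NICK bot"},
    {"id": 3, "port": 4444, "payload": b"/bin/sh -i"},
]
calls = []  # records which connections reached the expensive tier

def cheap(c: dict) -> bool:
    # Hypothetical heuristic: anything off the usual web ports is suspicious.
    return c["port"] not in (80, 443)

def expensive(c: dict) -> bool:
    # Hypothetical stand-in for a costly payload analysis.
    calls.append(c["id"])
    return b"/bin/sh" in c["payload"]

hits = two_tier(conns, cheap, expensive)
print([c["id"] for c in hits], "expensive calls:", calls)
```

Only two of the three connections reach the expensive tier, which is the point of the strategy.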


Prototype Deployment at LBNL

Improve forensics support:
• NIDS controls TM
• NIDS retrieves data from TM
• Support retrospective analysis

NIDS: Bro

2-week experience:
• Network traffic: 22.7 TB
• TM records 0.6 TB
• Retention time: 11 days
• NIDS reports 66K alerts
• 98% of alerts are due to scanning activity
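
The two-week figures imply a reduction ratio and a rough count of non-scan alerts. The short calculation below assumes that the 0.6 TB is the total volume recorded over the same two weeks and treats "66K" as exactly 66,000. Both are assumptions.

```python
total_tb = 22.7        # traffic seen in two weeks
recorded_tb = 0.6      # volume the TM recorded (assumed same period)
alerts = 66_000        # "66K" alerts, treated as a round number
scan_share = 0.98      # share of alerts due to scanning

reduction = 1 - recorded_tb / total_tb
non_scan_alerts = round(alerts * (1 - scan_share))
print(f"reduction: {reduction:.1%}, non-scan alerts: about {non_scan_alerts}")
```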


NIDS Controls the TM

The NIDS dynamically changes the TM’s parameters
  Change the storage class of the IP address the attacker is coming from to a more conservative set of parameters
    Higher cutoff
    Larger budget (longer retention time)

Storage classes:
  Original (benign): 15 KB cutoff
  Scanners (for scan notifications): 50 KB cutoff
  Alarms (for non-scan notifications): cutoff disabled
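
The three storage classes can be modeled as a small lookup that the NIDS updates when it raises a notification. This is a sketch under stated assumptions: the class and method names are hypothetical, KB is taken as 1000 bytes, and the "never downgrade" rule is a design choice of this example, not something the slides specify.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class StorageClass:
    name: str
    cutoff: Optional[int]    # bytes per connection; None means no cutoff

KB = 1000  # assumption: the slides do not say whether KB means 1000 or 1024

ORIGINAL = StorageClass("original", 15 * KB)
SCANNERS = StorageClass("scanners", 50 * KB)
ALARMS = StorageClass("alarms", None)

class ClassTable:
    """Maps source IPs to storage classes; unknown IPs stay in ORIGINAL."""

    def __init__(self):
        self.by_ip = {}

    def on_notification(self, src_ip: str, is_scan: bool) -> None:
        new = SCANNERS if is_scan else ALARMS
        # Never downgrade: an IP already in ALARMS stays there.
        if self.by_ip.get(src_ip) is not ALARMS:
            self.by_ip[src_ip] = new

    def cutoff_for(self, src_ip: str) -> Optional[int]:
        return self.by_ip.get(src_ip, ORIGINAL).cutoff

table = ClassTable()
before = table.cutoff_for("192.0.2.7")
table.on_notification("192.0.2.7", is_scan=True)
after_scan = table.cutoff_for("192.0.2.7")
table.on_notification("192.0.2.7", is_scan=False)
after_alarm = table.cutoff_for("192.0.2.7")
print(before, after_scan, after_alarm)
```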


NIDS Retrieves Data from TM

The NIDS queries the TM for the relevant packets
  The packets are then fed back to the NIDS, which stores the reassembled payload stream on disk

Eases subsequent manual inspection of the activity, e.g.,
  HTTP 200 OK
  Applications running on non-standard ports

Also designed a web interface to notifications and their corresponding network traffic


Retrospective Analysis

A tighter integration of TM and NIDS

Recovering from packet drops
  The NIDS may incur measurement drops
  The NIDS can query for connections that are missing packets and reprocess them

Offloading the NIDS
  Address the tradeoffs between analysis and resource usage of the NIDS

Broadening the analysis context
  Analyze traffic from the past


Deployment Tradeoffs

Risk of evasion (fundamental limitation)
  Solution: use different storage classes, use a randomized cutoff limit

Network load
  Solution: better hardware, TM clustering

Floods
  DDoS might stress the TM’s connection handling, undermine the capture of useful packets, reduce retention time…
  Solution: flood detection & mitigation

Retrieval time
  Take care: disk queries are resource-consuming

NIDS and cutoff
  The NIDS controls the TM only for future activity; what about the past?


Conclusion

Built and evaluated an efficient Time Machine
  Supports commodity hardware for Gigabit networks
  Used operationally

Cutoff heuristic: keep the first N bytes of every connection
  Reduces volume typically by more than 90%
  Retains days/weeks of full-payload traffic traces

Coupled TM with a NIDS (Bro)
  Improved forensics support
  Automatic queries for deeper inspection


Future Work

Mitigate evasion risk
  Use a randomized cutoff
  Keep some packets even after the cutoff is hit
  Use the NIDS to disable the cutoff

Cutoff processing in hardware
  E.g., NetFPGA

Aggregation instead of direct eviction
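
Two of the evasion mitigations listed above, a randomized cutoff and keeping some packets after the cutoff, might look as follows. The slides only name the ideas, so the uniform distribution, the `spread` and `sample_rate` parameters, and the function names are all assumptions made for illustration.

```python
import random

def randomized_cutoff(base: int, spread: float, rng: random.Random) -> int:
    """Draw a per-connection cutoff uniformly from base * (1 +/- spread)."""
    low = int(base * (1 - spread))
    high = int(base * (1 + spread))
    return rng.randint(low, high)

def keep_after_cutoff(rng: random.Random, sample_rate: float) -> bool:
    """After the cutoff is hit, still keep a packet with probability sample_rate."""
    return rng.random() < sample_rate

rng = random.Random(0)   # fixed seed so the sketch is reproducible
cutoffs = [randomized_cutoff(15_000, 0.5, rng) for _ in range(1000)]
sampled = sum(keep_after_cutoff(rng, 0.01) for _ in range(10_000))
print(min(cutoffs), max(cutoffs), sampled)
```

An attacker who cannot predict the exact cutoff cannot reliably push the malicious part of a connection just past it, and sampling packets after the cutoff leaves some evidence even when that happens.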


Comments

Privacy concerns in full-payload recording

Performance evaluations only for the original TM
  When coupled with a NIDS, how do recording and querying perform?
  Data volume, retention time, query latency?

The NIDS controls the TM for deeper inspection; when to stop it?

Where is the critical evidence of attacks?
  (TM) For connections, interesting data is mostly at the beginning
  (Gestalt) For connections/associations, interesting data is mostly at procedure violations
  (My research) For hosts, interesting data is mostly at contact-activity violations
  What else…?