FloVis: Flow Visualization System - Dalhousie University · PDF fileFloVis: Flow Visualization...

Click here to load reader

  • date post

    03-Apr-2018
  • Category

    Documents

  • view

    215
  • download

    2

Embed Size (px)

Transcript of FloVis: Flow Visualization System - Dalhousie University · PDF fileFloVis: Flow Visualization...

  • FloVis: Flow Visualization System

    Teryl TaylorDalhousie UniversityHalifax, NS, Canada

    teryl@cs.dal.ca

    Diana PatersonDalhousie UniversityHalifax, NS, Canadapaterson@cs.dal.ca

    Joel GlanfieldDalhousie UniversityHalifax, NS, Canada

    joel@joelglanfield.com

    Carrie GatesCA LabsNew York

    carrie.gates@ca.com

    Stephen BrooksDalhousie UniversityHalifax, NS, Canadasbrooks@cs.dal.ca

    John McHughDalhousie UniversityHalifax, NS, Canadamchugh@cs.dal.ca

    Abstract

    NetFlow data is routinely captured at the border of amany enterprise networks. Although not as rich as fullpacket capture data, Netflow provides a compact record ofthe interactions between host pairs on either side of themonitored border. Analysis of this data presents a chal-lange to the security analyst due to its sheer volume. Wereport preliminary results on the development of a suite ofvisualization tools that are intended to complement the com-mand line tools, such as those from the SiLK Tools, that arecurrently used by analysts to perform forensic analysis ofNetFlow data. The current version of the tool set draws onthree visual paradigms: Activity diagrams that display var-ious aspects of multiple individual host behaviors as colorcoded time series, connection bundles that show the inter-actions among hosts and groups of hosts and the netbytesviewer that allows detailed examination of the port and vol-ume behaviors of an individual host over a period of time.The system supports drill down for additional detail andpivoting that allows the analyst to examine the relationshipsamong the displays. SiLK data is preprocessed into a rela-tional database to drive the display modes, and the toolscan interact with the SiLK system to extract additional dataas necessary.

    1. Introduction

    A security administrator or anaylst has a difficult job. Heor she must examine vast amounts of network data, hostand IDS logs and information from other sources in orderto understand what is occurring on the network. Buried inthis mass of data are the signs of intrusive behaviour. Thetasks associated with this analysis can be time consuming

    and cumbersome as it is difficult to find patterns in the dataand trends over time.

    Recent research has focused on using visualization tech-niques to aid security administrators in gaining a mentalimage of how data is flowing along the network. Visual-ization is powerful because it allows us to view a significantamount of data at once and utilize our cognitive and pre-cognitive abilities to find patterns much more quickly thansifting through raw packet contents.

    The real challenge for using visualization is creating im-ages that are more than just pretty pictures. They must pro-vide insight into the underlying data to be useful. It is possi-ble to show too much information in a picture which couldserve to confuse rather than help.

    Some years ago, we developed a number of static vi-sualization approaches[13] that provided insight into bothnetwork wide and host behaviors. The work reported hereextends these preliminary ideas to an interactive tool that isintegrated with the SiLK netflow tools to allow the analysista rich and flexible visualization capability that complementsthe textual features of the usual SiLK analysis.

    FloVis is a suite of interactive visualizations designedto show various aspects of network data flow. Each imagecomplements the others and shows the data in a differentway such that if one image does not show a specific patternanother might.

    This paper describes the FloVis application features andsome of the key visualizations offered in the suite. In Sec-tion 2 we discuss our overall philosophy and our approachthe the visual representation of flow data. This is followedby the introduction of our three key visual paradigms inSection 3. The representations capture time series behav-ior for a group of hosts as color coded blocks representingcategorical aspects of hourly behavior, as bundles of flowsfrom related sources to related destinations, and as detailed,

  • Figure 1. FloVis Bundle Diagram which shows data flows between IP entities.

    pseudo three dimensional displays of per port volume activ-ity over time. Using data available from a local network, wedescribe an analysis scenario in which the ability to visual-ize connection behavior leasd to the discovery of a com-promised host in Section 4. Section 5 gives an overviewof related work in visual analysis for security and the pa-per ends with our preliminary conclusions and out plans forfuture work in Section 6.

    2 Approach

    FloVis works along with the SiLK Toolkit1 for networkdata analysis. The SiLK tools are used to filter the raw flowdata and to aggregate the selected Netflow data into countedsets or SiLK bags. The bags contain counts of flows, pack-ets, or bytes and can be indexed by IP addresses, ports, pro-tocols and other scalar fields contained in flow data. Theycan also be indexed by composite vaules composed of up to32 bits taken from any two scalar fields of the flow records.These can be used to indicate connections at the subnet tosubnet (/16 to /16, for example), subnet to host (within agiven subnet, /24 prefix to hosts in a given /24, for exam-ple) or host to host (within two prescribed subnets, hosts

    1The SiLK Tools are available as open source under the GPL fromhttp://tools.netsa.cert.org/silk/index.html.

    within a given /8 to hosts within a /24, for example). Bagsare typically accumulated on a per hour or per day basisand contain a volume measure associated with the index.Volumes are usually counts of flow records or sums of thepacket or byte counts contained in the records. In a produc-tion environment, hourly bags for various fields of interestwould be created using command scripts that are scheduledfor periodic execution. The resulting bags are then format-ted into a data format readable by the various FloVis visu-alizations. Some visualizations utilize an SQL (MySQL atpresent) database for its quick and powerful queries whileothers read from the processed bag files directly. As the sys-tem progresses, we hope to use the database wherever pos-sible because it buffers the visualization from direct depen-dence on the SiLK Tools, allowing data form other sourcesto be treated in similar ways.

    FloVis itself is a client based application designed to runon a client workstation. Its built in OpenGL and utilizes aplatform independent windowing system which should al-low it to be built on most operating systems. FlowVis cur-rently contains three different types of visualizations: FlowBundle Diagrams, the NetBytes Entity Viewer and an Exis-tence Graph. Each presents different aspects of host or net-work behavior, but they share linkages through the databaseand, ultimately, through an underlying SiLK repository.The approach allows explicit interactions among the differ-

  • ent views, allowing a interesting region of one visualizationto trigger visualizations of related behaviors. The underly-ing philosophy of the system is to aid the analyst in develop-ing insight into the behaviors that are seen on the networkand hosts being monitored. We assume that the analyst isalready familiar with most aspects of intrusive behavior andis familiar with the command line operation of the SiLKtools allowing us to translate drill down requests that gobeyond the precomputed summaries into scripts to extractadditional data from the SiLK (or a similar) repository andprocess it for either visualization or subsequent, manual orautomated, analysis outside the FloVis system. The scriptswill be parameterized and can be saved in a library for sub-sequent execution or modification as circumstances dictate.

    3 Sample Visualizations

    3.1 Bundle Diagrams

    The Bundle Diagram as shown in Figure 1 focuses ondisplaying Netflow connections between entities (i.e. hostsor subnetworks) on a network. The knock against currentapplications that visualize network connections is that theycan have significant visual occlusion when showing largedatasets. The Bundle Diagram attempts to mitigate this bybundling connections [7] together and utilizing node aggre-gation to limit occlusion. Figure 1 shows a set of circularrings. The outer ring contains 512 points. Each point rep-resents the highest 8 bits of an entitys address (could bean individual host or a subnet) on the network. The linedividing the circle represents a network border where net-flow data crosses from (for instance) an internal network toan external network (or vice versa). Connections are rep-resented by B-spline curves connecting points (entities) onthe circle. Colour is used to represent flow direction (dataflows from blue to red) while transparency is used to indi-cate flow volume. The more opaque a connection line is,the higher the percentage of the maximum volume traffic isrunning along it (for a specific time period).

    The user can choose to loosen the bundles by using aslider control as shown in Figure 2. Bundle looseningstraightens the bundles essentially separating them so theuser can get a better look at the individual netflows.

    The inner rings of the bundle diagram play an importantpart in the overall visualization. They create a radial treelayout for the underlying data source which facilitates theability to drill down in the data. The bundle diagram sup-ports a drill down mode as show in Figure 3. As seen fromthe Figure, the diagram activates a set of transparent greenovals overtop of points on the inner rings. This allows theuser to select a specific branch on the diagram (highlightedin a purple ring in the figure). Clicking on a specific branchdrills down in that branch launching another bundle visu-

    alization with that particular branch taking the full portionof the semi-circle. Drilling down essentially means to slidethe highest 8 bits of