Post on 25-Dec-2015
1
Data Mining & Intrusion Detection
Shan Bai, Instructor: Dr. Yingshu Li
CSC 8712, Spring 08
2
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
3
What is an intrusion?
An intrusion can be defined as "any set of actions that attempt to compromise the integrity, confidentiality, or availability of a resource"
[Figure: Incidents reported to the Computer Emergency Response Team/Coordination Center, 1990-2002; y-axis from 0 to 90,000]
[Figure: Spread of the SQL Slammer worm 10 minutes after its deployment]
4
Intrusion Examples
Trojan horse, worm
Address spoofing: a malicious user uses a fake IP address to send malicious packets to a target
DoS: denial of service
R2L: unauthorized access from a remote machine, e.g., password guessing
U2R: unauthorized access to local superuser (root) privileges, e.g., various buffer overflow attacks
Probing: surveillance and other probing, e.g., port scanning
Many others...
5
Intrusion Detection System (IDS)
Intrusion Detection System: a combination of software and hardware that attempts to perform intrusion detection and raises an alarm when a possible intrusion happens
6
IDS Categories
Intrusion detection systems are split into two groups
Anomaly detection systems: identify malicious traffic based on deviations from established normal network behavior
Misuse detection systems: identify intrusions based on known patterns (signatures) of the malicious activity
7
Anomaly Detection
[Figure: activity measures (CPU, process size, ...) on a 0-90 scale; the normal profile is compared with an abnormal one, and a large deviation marks a probable intrusion]
Baseline the normal traffic, then look for things that are out of the norm
Relatively high false positive rate: anomalies can just be new normal activities
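The baseline idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular IDS: the measure names (`cpu`, `process_size`), the sample values, and the three-standard-deviation threshold are all hypothetical.

```python
from statistics import mean, stdev

def build_profile(normal_samples):
    """Compute (mean, standard deviation) per activity measure from attack-free samples."""
    measures = normal_samples[0].keys()
    profile = {}
    for m in measures:
        values = [s[m] for s in normal_samples]
        profile[m] = (mean(values), stdev(values))
    return profile

def anomaly_flags(profile, sample, threshold=3.0):
    """Return the measures that deviate from the baseline by more than `threshold` deviations."""
    flagged = []
    for m, (mu, sigma) in profile.items():
        if sigma > 0 and abs(sample[m] - mu) / sigma > threshold:
            flagged.append(m)
    return flagged

# Hypothetical attack-free observations used to baseline "normal"
normal = [
    {"cpu": 20, "process_size": 30},
    {"cpu": 25, "process_size": 35},
    {"cpu": 22, "process_size": 32},
    {"cpu": 23, "process_size": 31},
]
profile = build_profile(normal)

print(anomaly_flags(profile, {"cpu": 24, "process_size": 33}))  # close to the baseline
print(anomaly_flags(profile, {"cpu": 85, "process_size": 34}))  # CPU far outside the norm
```

A new but legitimate workload would be flagged in exactly the same way, which is where the high false positive rate comes from.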
8
Misuse Detection
[Figure: activities are pattern-matched against known intrusion patterns; a match indicates an intrusion]
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity; modification of system files; permission modifications
Example: if (src_ip == dst_ip) then "land attack"
Can't detect new attacks
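The land attack rule can be written as one entry in a small signature table. This is a sketch only: the packet is assumed to be a plain dictionary with `src_ip` and `dst_ip` keys, and the `SIGNATURES` table and `match_signatures` helper are hypothetical names, not the API of Snort or any other tool.

```python
def is_land_attack(pkt):
    """Signature from the slide: source and destination address are identical."""
    return pkt["src_ip"] == pkt["dst_ip"]

# Each signature is a (name, predicate) pair; only known patterns can be listed here
SIGNATURES = [
    ("land attack", is_land_attack),
]

def match_signatures(pkt, signatures=SIGNATURES):
    """Return the names of all signatures the packet matches."""
    return [name for name, predicate in signatures if predicate(pkt)]

print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"}))
print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"}))
```

An attack with no entry in the table returns an empty list, which is the "can't detect new attacks" limitation in code form.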
9
Goal of Intrusion Detection Systems (IDS)
To detect an intrusion as it happens and be able to respond to it
False positives
  A false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion
  Too many false positives: the user will quit monitoring the IDS because of the noise
False negatives
  A false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it
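Both error types can be counted once alerts are compared with ground truth. The sketch below assumes two parallel lists of booleans, one for whether the IDS raised an alert and one for whether the event really was an intrusion; the function name and the sample data are hypothetical.

```python
def error_rates(alerts, truths):
    """Return (false positive rate, false negative rate) for parallel boolean lists."""
    false_pos = sum(1 for a, t in zip(alerts, truths) if a and not t)
    false_neg = sum(1 for a, t in zip(alerts, truths) if not a and t)
    negatives = sum(1 for t in truths if not t)  # events that were not intrusions
    positives = sum(1 for t in truths if t)      # events that were intrusions
    fp_rate = false_pos / negatives if negatives else 0.0
    fn_rate = false_neg / positives if positives else 0.0
    return fp_rate, fn_rate

alerts = [True, True, False, False, False, True]
truths = [True, False, False, True, False, False]
print(error_rates(alerts, truths))
```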
11
Why do we need Data Mining?
Despite the enormous amount of data, particular events of interest are still quite rare: their frequency ranges from 0.1% to less than 10%
We are drowning in data, but starving for knowledge
12
Data Mining vs KDD
Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data
Data Mining: the use of algorithms to extract the information and patterns derived by the KDD process
Data mining is the core of the knowledge discovery process
13
KDD Process
Selection: obtain data from various sources
Preprocessing: cleanse data
Transformation: convert to a common format, transform to a new format
Data Mining: obtain desired results
Interpretation/Evaluation: present results to the user in a meaningful manner
14
Data Mining: A KDD Process
Data mining: core of the knowledge discovery process
[Figure: KDD pipeline: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]
15
Typical Data Mining Architecture
[Figure: layered architecture, bottom to top: databases and data warehouse (with data cleaning, data integration, and filtering); database or data warehouse server; data mining engine; pattern evaluation; graphical user interface; with a knowledge base alongside]
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help?
Learn from traffic data
  Supervised learning: learn precise models from past intrusions
  Unsupervised learning: identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites
  Helps detect sophisticated attacks not identifiable by single-site analyses
Analysis of long-term data (months/years)
  Uncover suspicious stealth activities (e.g., insiders leaking/modifying information)
19
Intrusion Detection
Traditional intrusion detection system (IDS) tools (e.g., SNORT) are based on signatures of known attacks
Limitations
  The signature database has to be manually revised for each new type of discovered intrusion
  They cannot detect emerging cyber threats
  Substantial latency in deployment of newly created signatures across the computer system
20
Data Mining for Intrusion Detection: Techniques and Applications
Frequent pattern mining
Classification
Clustering
Mining data streams
21
Frequent pattern mining
Patterns that occur frequently in a database
Mining frequent patterns: finding regularities
Process of mining frequent patterns for intrusion detection
  Phase I: mine a repository of normal frequent itemsets for attack-free data
  Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
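The two phases can be sketched as follows. This is an illustration under assumptions, not a production miner: connections are modeled as tuples of hypothetical `attribute=value` strings, itemsets are counted by brute force up to size 2, and the 50% support threshold is arbitrary.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(records, min_support, max_size=2):
    """Brute-force count of all itemsets up to max_size; keep those meeting min_support."""
    counts = Counter()
    for rec in records:
        items = sorted(set(rec))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    n = len(records)
    return {itemset for itemset, c in counts.items() if c / n >= min_support}

# Phase I: mine the normal profile from attack-free data (hypothetical records)
attack_free = [
    ("dst_port=80", "flag=SF"),
    ("dst_port=80", "flag=SF"),
    ("dst_port=25", "flag=SF"),
    ("dst_port=80", "flag=SF"),
]
normal_profile = frequent_itemsets(attack_free, min_support=0.5)

# Phase II: mine the last n connections and keep only patterns absent from the profile
recent = [
    ("dst_port=80", "flag=SF"),
    ("dst_port=8888", "flag=S0"),
    ("dst_port=8888", "flag=S0"),
    ("dst_port=8888", "flag=S0"),
]
suspicious = frequent_itemsets(recent, min_support=0.5) - normal_profile
print(suspicious)
```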
22
Frequent pattern mining
Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  – A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  – If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
  – Many item combinations can be pruned
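A compact level-wise sketch shows how the anti-monotone property prunes candidates. It is a teaching version under assumptions (a small in-memory transaction list and an absolute `min_count` threshold), not an optimized implementation.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return all itemsets contained in at least min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    # Level 1: frequent single items
    current = {frozenset([i]) for i in items
               if sum(i in t for t in transactions) >= min_count}
    frequent = set(current)
    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        # Count step: scan the transactions only for the surviving candidates
        current = {c for c in candidates
                   if sum(c <= t for t in transactions) >= min_count}
        frequent |= current
        k += 1
    return frequent

baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts"},
    {"milk"},
]
result = apriori(baskets, min_count=2)
print(sorted(sorted(s) for s in result))
```

Because `milk` is infrequent on its own, no itemset containing it is ever generated or counted.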
23
Sequential Pattern Analysis
Models sequence patterns
  (Temporal) order is important in many situations
  Time-series databases and sequence databases
  Frequent patterns → (frequent) sequential patterns
Sequential patterns for intrusion detection
  Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences, find the complete set of frequent subsequences
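The building block of this task is counting how many sequences contain a candidate pattern as an ordered (not necessarily contiguous) subsequence. The sketch below shows only that support count; it is not a complete miner, and the event names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the events of pattern appear in sequence in the same order, gaps allowed."""
    it = iter(sequence)
    # `event in it` advances the iterator, so later events must come after earlier ones
    return all(event in it for event in pattern)

def support(pattern, sequences):
    """Number of sequences that contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)

sessions = [
    ["syn", "ack", "data", "fin"],
    ["syn", "data", "fin"],
    ["syn", "syn", "syn"],
]
print(support(["syn", "fin"], sessions))  # order matters: syn before fin
print(support(["fin", "syn"], sessions))
print(support(["syn", "syn"], sessions))
```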
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
Model construction: describe a set of predetermined classes
  Training dataset: tuples for model construction
  Each tuple/sample belongs to a predefined class
  Classification rules, decision trees, or math formulae
Model application: classify unseen objects
  Estimate the accuracy of the model using an independent test set
  If the accuracy is acceptable, apply the model to classify data tuples with unknown class labels
27
Classification
28
Classification: Decision Tree
A node in the tree: a test of some attribute
A branch: a possible value of the attribute
Classification
  Start at the root
  Test the attribute
  Move down the tree branch
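The traversal can be sketched with a nested dictionary standing in for the tree. The tree below is hand-built for illustration (it was not learned from data), and the attribute names and values are hypothetical.

```python
# Internal nodes test one attribute; branches map attribute values to subtrees or class labels
TREE = {
    "attribute": "protocol",
    "branches": {
        "icmp": "normal",
        "tcp": {
            "attribute": "flag",
            "branches": {"SF": "normal", "S0": "intrusion"},
        },
    },
}

def classify(tree, record):
    """Start at the root, test the node's attribute, and follow the matching branch to a leaf."""
    node = tree
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node  # a leaf is a class label

print(classify(TREE, {"protocol": "tcp", "flag": "S0"}))
print(classify(TREE, {"protocol": "icmp", "flag": "SF"}))
```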
29
Neural Classification: HIDE
"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.
Five major components
  Probes collect traffic data
  Event preprocessor preprocesses traffic data and feeds the statistical model
  Statistical processor maintains a model for normal activities and generates vectors for new events
  Neural network classifies the vectors of new events
  Post processor generates reports
30
Clustering
What Is Clustering?
Group data into clusters
  – Similar to one another within the same cluster
  – Dissimilar to the objects in other clusters
  – Unsupervised learning: no predefined classes
31
Clustering
What Is a Good Clustering?
High intra-class similarity and low inter-class similarity
  Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches
Partitioning algorithms
  – Partition the objects into k clusters
  – Iteratively reallocate objects to improve the clustering
Hierarchy algorithms
  – Agglomerative: each object is a cluster, merge clusters to form larger ones
  – Divisive: all objects are in a cluster, split it up into smaller clusters
33
Clustering
K-Means Example
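Since the original figure is not available, a minimal k-means sketch can stand in for it. The points and the initial centroids are hypothetical, and the initial centroids are passed in explicitly (rather than chosen at random) so that the run is repeatable.

```python
def kmeans(points, centroids, iterations=10):
    """Partition points into len(centroids) clusters by iterative reallocation."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster (keep it if empty)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```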
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities
  The profiles of normal activities may drift
Identifying novel attacks
  Identifying clusters and outliers in traffic data streams
Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
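One simple way to let a normal profile drift with the stream is an exponentially weighted moving average. This is a sketch of that general idea only, not a technique attributed to any system in these slides; the class name, the smoothing factor `alpha`, and the fixed `tolerance` are assumptions.

```python
class DriftingProfile:
    """Running mean of one traffic measure that slowly follows normal behavior."""

    def __init__(self, initial_mean, alpha=0.1, tolerance=20.0):
        self.mean = initial_mean
        self.alpha = alpha          # weight given to each new normal observation
        self.tolerance = tolerance  # allowed absolute deviation from the profile

    def observe(self, value):
        """Return True if value is an outlier; otherwise fold it into the profile."""
        is_outlier = abs(value - self.mean) > self.tolerance
        if not is_outlier:
            # Only observations judged normal are allowed to move the profile
            self.mean = (1 - self.alpha) * self.mean + self.alpha * value
        return is_outlier

profile = DriftingProfile(initial_mean=100.0, alpha=0.5, tolerance=20.0)
for value in [110, 120, 200]:
    print(value, profile.observe(value), profile.mean)
```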
35
Data Mining for Intrusion Detection
Misuse detection
  Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
  These models can be more sophisticated and precise than manually created signatures
  Recent research: e.g., JAM (Java Agents for Metalearning)
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signatures of attacks, so data mining in JAM builds a misuse detection model
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection
  Identifies anomalies as deviations from "normal" behavior
  E.g., ADAM: Audit Data Analysis and Mining
  MINDS: MINnesota INtrusion Detection System
40
ADAM: Audit Data Analysis and Mining
Detecting intrusion by data mining
Combination of association rules and classification rules
First, ADAM collects known frequent datasets with an off-line algorithm
Second, ADAM runs an online algorithm
  Finds the most recent frequent connection records
  Compares them with the known mined data
  Discards those that seem to be normal
  Forwards the suspicious ones to the classifier
The trained classifier then classifies the suspicious data as one of the following
  Known type of attack
  Unknown type of attack
  False alarm
41
ADAM: Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM's model has two phases
1st phase: train the classifier
  Offline process
  Takes place only once, before the main experiment
2nd phase: use the trained classifier
  The trained classifier is then used to detect anomalies
  Online process
43
The MINDS Project
MINDS: MINnesota INtrusion Detection System
Learning from rare class: building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk
Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
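The support and confidence of the two rules can be checked directly against the five transactions in the table. The helper names below are hypothetical; the data is exactly the table's.

```python
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def count(itemset, transactions=TRANSACTIONS):
    """Number of transactions that contain every item of itemset."""
    return sum(itemset <= t for t in transactions)

def support(lhs, rhs, transactions=TRANSACTIONS):
    """Fraction of all transactions that contain both sides of the rule."""
    return count(lhs | rhs, transactions) / len(transactions)

def confidence(lhs, rhs, transactions=TRANSACTIONS):
    """Fraction of the transactions containing lhs that also contain rhs."""
    return count(lhs | rhs, transactions) / count(lhs, transactions)

# {Milk} --> {Coke}: 3 of the 4 Milk transactions also contain Coke
print(support({"Milk"}, {"Coke"}), confidence({"Milk"}, {"Coke"}))
# {Diaper, Milk} --> {Beer}: 2 of the 3 Diaper-and-Milk transactions also contain Beer
print(support({"Diaper", "Milk"}, {"Beer"}), confidence({"Diaper", "Milk"}, {"Beer"}))
```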
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e., anomalous behavior
  Identify normal behavior
  Construct a useful set of features
  Define a similarity function
  Use an outlier detection algorithm
    Nearest neighbor approach
    Density-based schemes
    Unsupervised Support Vector Machines (SVM)
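The nearest neighbor approach can be illustrated by scoring each point with its distance to its k-th nearest neighbor. This is a generic sketch of that idea, not the MINDS implementation; the two-dimensional feature vectors, Euclidean distance as the similarity function, and `k=2` are assumptions.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor (larger = more anomalous)."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores

# Four connections with similar features and one far from the rest (hypothetical values)
features = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_outlier_scores(features, k=2)
for point, score in zip(features, scores):
    print(point, round(score, 2))
```

Ranking connections by this score and inspecting only the top of the list is one way to keep the analyst's workload manageable.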
46
Experimental Evaluation
[Figure: evaluation pipeline: network → NetFlow data from Cisco routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis; Snort, an open source signature-based network IDS (www.snort.org), is shown for comparison]
10-minute cycle, 2 million connections
Anomaly detection is applied 4 times a day, with a 10-minute time window
• Publicly available data set: the DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Ranked connections from the anomaly detection system are labeled attack or normal and passed to the discriminating association pattern generator, which produces rules such as "R1: TCP, DstPort=1863 → Attack" ... "R100: TCP, DstPort=80 → Normal" and updates the knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120...180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120...180 (c1=177, c2=0)
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns... (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189...200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189...200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
...
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns... (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns... (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
SNORT has static knowledge, manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
  Defining normal behavior
  Feature extraction
  Similarity functions
  Outlier detection
  Result summarization
  Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack: policy violation
Outsider attack: network intrusion
54
IDS Using Both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
A distance-based outlier detection algorithm is used to detect deviant behavior in the collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
As an example, consider a window in which we observe a 3-event sequence: E, D, and F
An FER is generated as E → D, F with confidence level c = freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If a occurs in 5% of the windows and the joint event a and b occurs in 4%, there is a 0.04/0.05 = 80% chance that D and F will follow in the same window
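The 80% figure can be reproduced by counting over observation windows. The sketch below takes the rule confidence to be the fraction of windows containing the LHS event that also contain the RHS events afterwards, which is the standard definition and is treated here as an assumption; the 100 synthetic windows and the helper names are hypothetical.

```python
def occurs_in_order(events, window):
    """True if events appear in window in the given order (other events may intervene)."""
    it = iter(window)
    return all(e in it for e in events)

def fer_stats(windows, lhs, rhs):
    """Return (support, confidence) of the frequent episode rule lhs -> rhs."""
    n_lhs = sum(occurs_in_order(lhs, w) for w in windows)
    n_joint = sum(occurs_in_order(lhs + rhs, w) for w in windows)
    support = n_joint / len(windows)
    confidence = n_joint / n_lhs if n_lhs else 0.0
    return support, confidence

# 100 windows: E occurs in 5 of them, and in 4 of those it is followed by D and F
windows = [["E", "D", "F"]] * 4 + [["E", "G"]] * 1 + [["G"]] * 95
support, confidence = fer_stats(windows, lhs=["E"], rhs=["D", "F"])
print(support, confidence)
```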
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)
The events D and F may be two sequential smtp requests, denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window of w = 2 sec
The three joint traffic events account for a support level of s = 10% out of all the network connections being evaluated
This FER is formally stated as follows
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting relationships inside a single connection record
In general, an FER is specified by the following expression
L1, L2, ..., Ln → R1, ..., Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, ..., Ln the LHS episode and R1, ..., Rm the RHS of the episode rule
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We summarized those system models, their technologies, and their validation methods
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set is in KDD Cup '99; the DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
Zhang, J. and Zulkernine, M. 2006. "A Hybrid Network Intrusion Detection Technique Using Random Forests." In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
W. Lee et al. "A data mining framework for building intrusion detection models." In Information and System Security, Vol. 3, No. 4, 2000.
Ertoz, L. et al. "MINDS - Minnesota Intrusion Detection System." Next Generation Data Mining, Chapter 3, 2004.
Lu, C.-T., Boedihardjo, A. P., Manalwar, P. "Exploiting efficient data mining techniques to enhance intrusion detection systems." IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
2
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
3
What is an intrusion
An intrusion can be defined as ldquoany set of actions that attempt to compromise the Integrity confidentiality or availability of a resourcerdquo
0
10000
20000
30000
40000
50000
60000
70000
80000
90000
1 2 3 4 5 6 7 8 9 10 11 12 13 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002
Incidents Reported to Computer Emergency Response
TeamCoordination Center
Spread of SQL Slammer worm 10 minutes
after its deployment
4
Intrusion Examples Trojan horse worm Address spoofing
a malicious user uses a fake IP address to send malicious packets to a target
Many othershellip
DOS denial-of-service
R2L unauthorized access from a
remote machine eg guessing password
U2R unauthorized access to local
super user (root) privileges eg various ``buffer overflow attacks
Probing surveillance and other probing
eg port scanning
5
Intrusion Detection System (IDS)
Intrusion Detection System combination of software and hardware that attempts to
perform intrusion detection raises the alarm when possible intrusion happens
6
IDS Categories
Intrusion detection systems are split into two groups Anomaly detection systems
Identify malicious traffic based on deviations from established normal network
Misuse detection systems Identify intrusions based on a known pattern
(signatures) for the malicious activity
7
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
8
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
9
Goal of Intrusion Detection Systems (IDS) To detect an intrusion as it happens and be able to respond to it
False positives A false positive is a situation where something abnormal (as
defined by the IDS) happens but it is not an intrusion Too many false positives
User will quit monitoring IDS because of noise False negatives
A false negative is a situation where an intrusion is really happening but IDS doesnt catch it
10
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
11
Why do we need Data Mining
Despite the enormous amount of data particular events of interest are still quite rare frequency ranges from 01 to less than 10
We are drowning in data but starving for knowledge1048714
12
Data Mining vs KDD
Knowledge Discovery in Databases (KDD) The whole process of finding useful information and patterns in data
Data Mining Use of algorithms to extract the information and patterns derived by the KDD process
Data mining is the core of the knowledge discovery process
13
KDD Process
Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform
to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in
meaningful manner
14
Data Mining A KDD Processndash Data mining core of
knowledge discovery process
Data Cleaning
Data Integration
Databases
Data Warehouse
Task-relevant Data
Selection
Data Mining
Pattern Evaluation
15
Typical Data Mining Architecture
Data Warehouse
Data cleaning amp data integration Filtering
Databases
Database or data warehouse server
Data mining engine
Pattern evaluation
Graphical user interface
Knowledge-base
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help
Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses
Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying
information)
19
Intrusion Detection
Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks
LimitationsSignature database has to be manually revised
for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created
signatures across the computer system
20
Data Mining for Intrusion Detection Techniques and Applications
Frequent pattern mining Classification Clustering Mining data streams
21
Patterns that occur frequently in a database
Mining Frequent patterns ndash finding regularities
Process of Mining Frequent patterns for intrusion de
tection Phase I mine a repository of normal frequent itemsets for a
ttack-free data
Phase II find frequent itemsets in the last n connections an
d compare the patterns to the normal profile
Frequent pattern mining
22
Frequent pattern mining
Apriori bull Any subset of a frequent itemset must be also freque
nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent
bull No superset of any infrequent itemset should be generated or tested
ndash Many item combinations can be pruned
23
Sequential Pattern Analysis
Models sequence patterns (Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns (frequent) sequential patterns
Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification A Two-Step Process Model construction describe a set of predetermined
classes Training dataset tuples for model construction
Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae
Model application classify unseen objects Estimate accuracy of the model using an independent test
set Acceptable accuracy apply the model to classify data tu
ples with unknown class labels
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering High intra-class similarity and low interclasssimilar
ity Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches Partitioning algorithms
ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin
g Hierarchy algorithms
ndash Agglomerative each object is a cluster merge clusters to form larger ones
ndash Divisive all objects are in a cluster split it up into smaller clusters
33
Clustering
K-Means Example
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities The profiles of normal activities may drift
Identifying novel attacks Identifying clusters and outliers in traffic data
streams Reduce the future alarm load by writing
filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detectionPredictive models are built from labeled data sets (instances
are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually
created signatures Recent research eg JAM (Java Agents for Metalearning)
36
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model
Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt
rusion Detection System
39
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
40
ADAM: Audit Data Analysis and Mining
Detecting intrusion by data mining: a combination of association rules and classification
First, ADAM collects known frequent itemsets using an off-line algorithm
Second, ADAM runs an online algorithm:
  Finds the frequent itemsets in the most recent connection records
  Compares them with the known mined data
  Discards those that seem to be normal
  Forwards the suspicious ones to the classifier
The trained classifier then classifies the suspicious data as one of the following:
  Known type of attack
  Unknown type of attack
  False alarm
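The first two steps (mine a normal profile off-line, then compare recent traffic against it) can be sketched as follows. This is an illustration only, not ADAM's implementation: the itemsets are counted by brute force rather than with a dedicated mining algorithm, the field names and records are hypothetical, and the final classifier stage is left out.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support, max_size=2):
    """Return the itemsets (frozensets of (field, value) pairs) whose relative
    support in `records` is at least `min_support`. Brute-force counting."""
    counts = Counter()
    for record in records:
        items = sorted(record.items())
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(records)
    return {itemset for itemset, n in counts.items() if n >= threshold}


# Off-line step: profile mined from attack-free connection records.
normal = ([{"service": "http", "flag": "SF"}] * 8
          + [{"service": "smtp", "flag": "SF"}] * 2)
profile = frequent_itemsets(normal, min_support=0.2)

# Online step: itemsets frequent in the latest window but absent from the
# profile are treated as suspicious and would be passed to the classifier.
window = ([{"service": "http", "flag": "SF"}] * 5
          + [{"service": "telnet", "flag": "S0"}] * 5)
suspicious = frequent_itemsets(window, min_support=0.3) - profile
```

In this toy window the http itemsets are discarded as normal, and only the three itemsets involving telnet and the S0 flag remain suspicious.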
41
ADAM: Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
1st phase: train the classifier. Offline process; takes place only once, before the main experiment
2nd phase: use the trained classifier. Online process; the trained classifier is used to detect anomalies
43
The MINDS Project
MINDS: MINnesota INtrusion Detection System
Learning from rare class: building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered: {Milk} -> {Coke}; {Diaper, Milk} -> {Beer}
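The two rules can be checked against the five transactions by computing their support and confidence. The sketch below uses the standard definitions (support is the fraction of transactions containing all items of the rule; confidence is that count divided by the count of transactions containing the left-hand side); the helper names are illustrative.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions)


def support(itemset):
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset) / len(transactions)


def confidence(lhs, rhs):
    """Fraction of the transactions containing `lhs` that also contain `rhs`."""
    return count(lhs | rhs) / count(lhs)


milk_coke = (support({"Milk", "Coke"}), confidence({"Milk"}, {"Coke"}))
diaper_milk_beer = (support({"Diaper", "Milk", "Beer"}),
                    confidence({"Diaper", "Milk"}, {"Beer"}))
```

{Milk} -> {Coke} has support 3/5 = 0.6 and confidence 3/4 = 0.75; {Diaper, Milk} -> {Beer} has support 2/5 = 0.4 and confidence 2/3, about 0.67.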
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams: intrusions are sequences of events
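A small worked example shows why skewed class distributions defeat standard models and metrics. The numbers are hypothetical: 10 attacks among 10,000 connections (0.1%), scored against a trivial model that labels everything as normal.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall on the attack class); labels are 1 = attack,
    0 = normal. Assumes y_true contains at least one attack."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    true_positives = sum(t == 1 and p == 1 for t, p in pairs)
    return correct / len(y_true), true_positives / sum(y_true)


labels = [1] * 10 + [0] * 9990        # 10 attacks in 10,000 connections
always_normal = [0] * len(labels)     # a model that never raises an alarm
acc, rec = accuracy_and_recall(labels, always_normal)
```

The trivial model scores 99.9% accuracy while detecting none of the attacks (recall 0), so accuracy alone says little about performance on the rare class.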
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous behavior
Identify normal behavior
Construct useful set of features
Define similarity function
Use outlier detection algorithm:
  Nearest neighbor approach
  Density based schemes
  Unsupervised Support Vector Machines (SVM)
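The nearest neighbor approach can be illustrated by scoring each connection with the distance to its k-th nearest neighbor. This is a generic sketch, not the algorithm used in MINDS: the two features (packets, bytes), the Euclidean distance and k = 2 are assumptions, and a real system would normalize the features first.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbor.
    Larger scores mean the point is more isolated. Needs more than k points."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q)
                           for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores


# Hypothetical (packets, bytes) features for five connections.
connections = [(3, 150), (3, 152), (4, 149), (3, 151), (40, 9000)]
scores = knn_outlier_scores(connections)
most_anomalous = scores.index(max(scores))
```

The last connection sits far from all others and receives the largest score, so it would be ranked first for an analyst to inspect.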
46
Experimental Evaluation
[Figure: MINDS system pipeline. Labels: network; net-flow data using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; open source signature-based network IDS (www.snort.org)]
10-minute cycle, 2 million connections
Anomaly detection is applied 4 times a day on a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Labels: anomaly detection system; ranked connections labeled attack or normal (R1: TCP, DstPort=1863, Attack; …; R100: TCP, DstPort=80, Normal); discriminating association pattern generator; update; knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.
Having an unauthorized application increases the vulnerability of the system.
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (a signature of the SubSeven worm) and 12345 (a signature of the NetBus worm).
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.
Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world.
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863.
Further analysis reveals that the remote IP block is owned by Hotmail.
Flag=0 is unusual for TCP traffic.
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
[Figure labels: worm/virus detection after infection; insider attack: policy violation; outsider attack: network intrusion]
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
A distance-based outlier detection algorithm is used to detect deviant behavior among the collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
For example, we envision a window where we observe a 3-event sequence: E, D and F. An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.
If event a occurs with a frequency of 5% and the joint event a ∪ b occurs with a frequency of 4%, there is a (0.04/0.05) = 80% chance that D and F will follow E in the same window.
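The 0.04/0.05 = 80% arithmetic can be reproduced by counting windows. The sketch below uses the standard definition of rule confidence (frequency of the joint episode divided by the frequency of the LHS); the window contents are made up so that E occurs in 5% of windows and E followed by D, F in 4%.

```python
def contains_in_order(window, events):
    """True if `events` occur in `window` in the given order (as a subsequence)."""
    remaining = iter(window)
    return all(event in remaining for event in events)


def fer_confidence(windows, lhs, rhs):
    """Estimate the confidence of lhs -> rhs as freq(lhs then rhs) / freq(lhs)."""
    n_lhs = sum(contains_in_order(w, lhs) for w in windows)
    n_joint = sum(contains_in_order(w, lhs + rhs) for w in windows)
    return n_joint / n_lhs


# 100 hypothetical windows: 4 contain E, D, F; 1 contains E alone; 95 lack E.
windows = [["E", "D", "F"]] * 4 + [["E", "X"]] * 1 + [["X"]] * 95
c = fer_confidence(windows, lhs=["E"], rhs=["D", "F"])
```

With these counts, E appears in 5 of 100 windows and the full episode in 4, so c evaluates to 4/5 = 0.8.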
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests denoted by (service = smtp).
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% of all the network connections being evaluated. This FER is formally stated as follows:
(service = authentication) → (service = smtp) (service = smtp) (0.8, 0.1, 2 sec) (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
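The general form (2) maps naturally onto a small record type. This is a hypothetical representation for illustration, not the CAIDS data structure: the class and field names are assumptions, and events are written as plain "attribute=value" strings.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: Tuple[str, ...]   # ordered LHS events L1..Ln
    rhs: Tuple[str, ...]   # ordered RHS events R1..Rm
    confidence: float      # c
    support: float         # s
    window_sec: float      # window length in seconds

    def __str__(self):
        return (f"{', '.join(self.lhs)} -> {', '.join(self.rhs)} "
                f"({self.confidence}, {self.support}, {self.window_sec} sec)")


# Rule (1): two smtp requests follow an authentication within 2 seconds.
rule = FrequentEpisodeRule(
    lhs=("service=authentication",),
    rhs=("service=smtp", "service=smtp"),
    confidence=0.8,
    support=0.1,
    window_sec=2,
)
```

Printing rule gives "service=authentication -> service=smtp, service=smtp (0.8, 0.1, 2 sec)", which mirrors the notation of rule (1).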
59
A cooperative anomaly and intrusion detection system (CAIDS)
[Figure: architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We summarized those system models, their technologies and their validation methods
We hope to give an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set in KDD Cup ’99; DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.
Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
9
Goal of Intrusion Detection Systems (IDS)
To detect an intrusion as it happens and be able to respond to it
False positives
- A false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion
- Too many false positives: users will quit monitoring the IDS because of the noise
False negatives
- A false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it
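The two error types can be made concrete with a small calculation. The sketch below is illustrative only: the function name `alarm_rates` and the sample labels are assumptions, not part of the original slides.

```python
def alarm_rates(labels, alarms):
    """labels[i] is True for a real intrusion; alarms[i] is True when the IDS alarmed."""
    # False positive: alarm raised, but no intrusion took place.
    fp = sum(1 for y, a in zip(labels, alarms) if a and not y)
    # False negative: intrusion took place, but no alarm was raised.
    fn = sum(1 for y, a in zip(labels, alarms) if y and not a)
    negatives = sum(1 for y in labels if not y)
    positives = sum(1 for y in labels if y)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


# Hypothetical ground truth and IDS output for six events.
labels = [False, False, False, False, True, True]
alarms = [True, False, False, False, True, False]
fpr, fnr = alarm_rates(labels, alarms)
print(fpr, fnr)
```

Here one of four harmless events triggers an alarm and one of two intrusions is missed.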
10
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
11
Why do we need Data Mining
Despite the enormous amount of data, particular events of interest are still quite rare: frequency ranges from 0.1% to less than 10%
We are drowning in data but starving for knowledge
12
Data Mining vs KDD
Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data
Data Mining: use of algorithms to extract the information and patterns derived by the KDD process
Data mining is the core of the knowledge discovery process
13
KDD Process
- Selection: obtain data from various sources
- Preprocessing: cleanse data
- Transformation: convert to common format; transform to new format
- Data Mining: obtain desired results
- Interpretation/Evaluation: present results to user in meaningful manner
14
Data Mining: A KDD Process
Data mining: core of the knowledge discovery process
[Figure: KDD pipeline. Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]
15
Typical Data Mining Architecture
[Figure: layered architecture, bottom to top. Databases and Data Warehouse (data cleaning & data integration, filtering) → Database or data warehouse server → Data mining engine → Pattern evaluation → Graphical user interface, with a Knowledge base alongside]
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help
- Learn from traffic data
  - Supervised learning: learn precise models from past intrusions
  - Unsupervised learning: identify suspicious activities
- Maintain models on dynamic data
- Correlation of suspicious events across network sites
  - Helps detect sophisticated attacks not identifiable by single-site analyses
- Analysis of long-term data (months/years)
  - Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)
19
Intrusion Detection
Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks
Limitations
- Signature database has to be manually revised for each new type of discovered intrusion
- They cannot detect emerging cyber threats
- Substantial latency in deployment of newly created signatures across the computer system
20
Data Mining for Intrusion Detection: Techniques and Applications
- Frequent pattern mining
- Classification
- Clustering
- Mining data streams
21
Frequent pattern mining
Patterns that occur frequently in a database
Mining frequent patterns: finding regularities
Process of mining frequent patterns for intrusion detection
- Phase I: mine a repository of normal frequent itemsets from attack-free data
- Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
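The two phases can be sketched in a few lines. This is a minimal illustration, not the method of any particular system: the function `frequent_itemsets`, the brute-force counting, the thresholds, and the sample connection records are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support, max_size=2):
    """Return itemsets (as sorted tuples) whose relative support >= min_support."""
    counts = Counter()
    for rec in records:
        items = sorted(set(rec))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    n = len(records)
    return {s for s, c in counts.items() if c / n >= min_support}


# Phase I: build the normal profile from attack-free connection records.
normal = [("tcp", "port80"), ("tcp", "port80"), ("tcp", "port443"), ("udp", "port53")]
profile = frequent_itemsets(normal, min_support=0.25)

# Phase II: mine the last n connections and keep patterns absent from the profile.
recent = [("tcp", "port27374"), ("tcp", "port27374"), ("tcp", "port80")]
suspicious = frequent_itemsets(recent, min_support=0.5) - profile
print(suspicious)
```

Only the itemsets involving the unfamiliar port survive the comparison; `("tcp",)` is frequent in both phases and is therefore treated as normal.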
22
Frequent pattern mining
Apriori
- Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  - A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  - If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
- No superset of any infrequent itemset should be generated or tested
  - Many item combinations can be pruned
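A compact level-wise implementation shows where the pruning happens. It is a sketch under simple assumptions (absolute support counts, in-memory transactions); the function name `apriori` and the sample baskets are chosen for illustration.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support count} for every itemset seen at least min_count times."""
    transactions = [frozenset(t) for t in transactions]
    level = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while level:
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        survivors = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(survivors)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in survivors for b in survivors if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset.
        level = {c for c in candidates
                 if all(frozenset(s) in survivors for s in combinations(c, k - 1))}
    return frequent


baskets = [{"beer", "diaper", "nuts"}, {"beer", "diaper"}, {"beer", "nuts"}, {"diaper", "milk"}]
result = apriori(baskets, min_count=2)
print(sorted(tuple(sorted(s)) for s in result))
```

With these baskets, {diaper, nuts} is infrequent, so the candidate {beer, diaper, nuts} is pruned without ever being counted.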
23
Sequential Pattern Analysis
- Models sequence patterns
  - (Temporal) order is important in many situations
  - Time-series databases and sequence databases
  - Frequent patterns → (frequent) sequential patterns
- Sequential patterns for intrusion detection
  - Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
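As a rough illustration of the problem statement, the sketch below counts order-preserving (not necessarily contiguous) subsequences by brute force. Real sequential pattern miners avoid this enumeration; the function name, the `max_len` cap, and the sample event names are assumptions for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_subsequences(sequences, min_count, max_len=3):
    """Count each subsequence once per sequence; keep those in >= min_count sequences."""
    counts = Counter()
    for seq in sequences:
        seen = set()
        for k in range(1, max_len + 1):
            # Index combinations are increasing, so event order is preserved.
            for idx in combinations(range(len(seq)), k):
                seen.add(tuple(seq[i] for i in idx))
        counts.update(seen)
    return {s: c for s, c in counts.items() if c >= min_count}


# Hypothetical per-session event sequences.
sessions = [
    ["syn", "login_fail", "login_fail", "login_ok"],
    ["syn", "login_fail", "login_ok"],
    ["syn", "login_ok"],
]
patterns = frequent_subsequences(sessions, min_count=2)
print(patterns[("syn", "login_fail", "login_ok")])
```

Because of the `max_len` cap, this returns the complete set of frequent subsequences only up to that length.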
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
- Model construction: describe a set of predetermined classes
  - Training dataset: tuples for model construction
  - Each tuple/sample belongs to a predefined class
  - Classification rules, decision trees, or math formulae
- Model application: classify unseen objects
  - Estimate accuracy of the model using an independent test set
  - Acceptable accuracy: apply the model to classify data tuples with unknown class labels
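The two steps can be illustrated with a deliberately simple model: one rule per attribute value, chosen by majority vote. The functions `build_model`, `classify`, and `accuracy` and the service/label data are assumptions for this sketch, not a model used by any system in these slides.

```python
from collections import Counter, defaultdict


def build_model(training):
    """Step 1: learn 'value -> majority class' rules from (value, label) tuples."""
    by_value = defaultdict(Counter)
    for value, label in training:
        by_value[value][label] += 1
    default = Counter(label for _, label in training).most_common(1)[0][0]
    rules = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
    return rules, default


def classify(model, value):
    """Step 2: apply the rules; unseen values fall back to the majority class."""
    rules, default = model
    return rules.get(value, default)


def accuracy(model, test):
    return sum(classify(model, v) == y for v, y in test) / len(test)


train = [("http", "normal"), ("http", "normal"), ("telnet", "intrusive"),
         ("telnet", "intrusive"), ("smtp", "normal")]
test = [("http", "normal"), ("telnet", "intrusive"), ("smtp", "intrusive"), ("ftp", "normal")]
model = build_model(train)
acc = accuracy(model, test)
print(acc)
```

The accuracy is estimated on tuples that were not used for training, which is the point of keeping an independent test set.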
27
Classification
28
Classification: Decision Tree
- A node in the tree: a test of some attribute
- A branch: a possible value of the attribute
- Classification
  - Start at the root
  - Test the attribute
  - Move down the tree branch
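The classification walk can be sketched as follows. The tree itself, its attribute names, and the helper `classify_with_tree` are hypothetical and hand-built for illustration; a real tree would be learned from training data.

```python
# Internal nodes are dicts that test one attribute; leaves are class labels.
tree = {
    "attribute": "service",
    "branches": {
        "http": "normal",
        "telnet": {
            "attribute": "failed_logins",
            "branches": {"low": "normal", "high": "intrusive"},
        },
    },
}


def classify_with_tree(node, record):
    """Start at the root, test the node's attribute, follow the matching branch."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node


print(classify_with_tree(tree, {"service": "telnet", "failed_logins": "high"}))
```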
29
Neural classification: HIDE
"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.
Five major components
- Probes collect traffic data
- Event preprocessor preprocesses traffic data and feeds the statistical model
- Statistical processor maintains a model for normal activities and generates vectors for new events
- Neural network classifies the vectors of new events
- Post processor generates reports
30
Clustering
What Is Clustering?
Group data into clusters
- Similar to one another within the same cluster
- Dissimilar to the objects in other clusters
- Unsupervised learning: no predefined classes
31
Clustering
What Is A Good Clustering?
- High intra-class similarity and low inter-class similarity
  - Depending on the similarity measure
- The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches
- Partitioning algorithms
  - Partition the objects into k clusters
  - Iteratively reallocate objects to improve the clustering
- Hierarchy algorithms
  - Agglomerative: each object is a cluster; merge clusters to form larger ones
  - Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
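The original figure for this slide did not survive extraction. As a stand-in, here is a minimal k-means sketch in plain Python; the sample points, the fixed initial centers, and the iteration count are assumptions chosen to keep the run deterministic.

```python
def kmeans(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-centering."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers)
```

This is the partitioning approach: the number of clusters k is fixed up front (here k = 2) and objects are iteratively reallocated.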
34
Mining Data Streams for Intrusion Detection
- Maintaining profiles of normal activities
  - The profiles of normal activities may drift
- Identifying novel attacks
  - Identifying clusters and outliers in traffic data streams
- Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detection
- Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
- These models can be more sophisticated and precise than manually created signatures
- Recent research: e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against intrusion patterns to flag an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.
The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model.
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.
38
Data Mining for Intrusion Detection
Anomaly detection
- Identifies anomalies as deviations from "normal" behavior
- E.g. ADAM: Audit Data Analysis and Mining; MINDS: MINnesota INtrusion Detection System
39
Anomaly Detection
[Figure: bar chart of activity measures (CPU, process size) on a 0 to 90 scale, comparing the normal profile with abnormal values that indicate a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
40
ADAM Audit Data Analysis and Mining
Detecting intrusions by data mining: a combination of association rules and classification rules
- First, ADAM collects known frequent datasets with an off-line algorithm
- Second, ADAM runs an online algorithm
  - Finds the latest frequent connection records
  - Compares them with the known mined data
  - Discards those which seem to be normal
  - Forwards the suspicious ones to the classifier
- The trained classifier then classifies the suspicious data as one of the following
  - Known type of attack
  - Unknown type of attack
  - False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM has two phases in its model
- 1st phase: train the classifier
  - Offline process
  - Takes place only once, before the main experiment
- 2nd phase: use the trained classifier
  - The trained classifier is then used to detect anomalies
  - Online process
43
The MINDS Project
MINDS: MINnesota INtrusion Detection System
- Learning from Rare Class: building rare class prediction models
- Anomaly/outlier detection
- Summarization of attacks using association pattern analysis

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
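Support and confidence for the two rules can be checked directly against the five transactions in the table. The helper names `support` and `confidence` are assumptions for this sketch; the definitions are the standard ones for association rules.

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}


def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for items in transactions.values() if itemset <= items) / len(transactions)


def confidence(lhs, rhs):
    """Among transactions containing lhs, the fraction that also contain rhs."""
    return support(lhs | rhs) / support(lhs)


conf_milk_coke = confidence({"Milk"}, {"Coke"})
conf_dm_beer = confidence({"Diaper", "Milk"}, {"Beer"})
print(round(conf_milk_coke, 2), round(conf_dm_beer, 2))
```

Milk appears in four transactions and three of those also contain Coke, so the first rule has confidence 3/4; Diaper and Milk appear together three times, twice with Beer, giving 2/3 for the second.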
44
MINDS - Learning from Rare Class
Problem Building models for rare network attacks (Mining needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior
- Identify normal behavior
- Construct useful set of features
- Define similarity function
- Use outlier detection algorithm
  - Nearest neighbor approach
  - Density based schemes
  - Unsupervised Support Vector Machines (SVM)
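A nearest-neighbor outlier score is the simplest of the listed approaches to sketch: each point is scored by the distance to its k-th nearest neighbor. This is a generic illustration, not the MINDS implementation; the function name, the Euclidean similarity function, and the two-dimensional sample features are assumptions.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor (larger = more anomalous)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical feature vectors for five connections; the last one sits far from the rest.
flows = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (10.0, 10.0)]
scores = knn_outlier_scores(flows, k=2)
most_anomalous = max(range(len(flows)), key=scores.__getitem__)
print(most_anomalous, scores[most_anomalous])
```

Ranking connections by such a score is what allows an analyst to inspect only the top few anomalies.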
46
Experimental Evaluation
[Figure: MINDS pipeline. Network → net-flow data collected using Cisco routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis. Annotations: 10-minute cycle, 2 million connections; anomaly detection is applied 4 times a day with a 10-minute time window. Snort, the open source signature-based network IDS (www.snort.org), is shown alongside.]
- Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment
- Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Ranked connections from the anomaly detection system, labeled attack or normal, feed the discriminating association pattern generator; the resulting rules (R1: TCP, DstPort=1863 → Attack; …; R100: TCP, DstPort=80 → Normal) update the knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
- At first glance, Rule 1 appears to describe a Web scan
- Rule 2 indicates an attack on a specific machine
- Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
- This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
- Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
- Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
- This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
- Further analysis showed that there were no fewer than five machines scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
- This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
- Port 6667 is where IRC (Internet Relay Chat) is typically run
- Further analysis reveals that there are many small packets from/to various IRC servers around the world
- Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
- This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
- This pattern indicates a large number of anomalous TCP connections on port 1863
- Further analysis reveals that the remote IP block is owned by Hotmail
- Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
- Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
- SNORT has static knowledge manually updated by human analysts
- MINDS anomaly detection algorithms are adaptive in nature
- MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
- Defining normal behavior
- Feature extraction
- Similarity functions
- Outlier detection
- Result summarization
- Detection of attacks originating from multiple sites
[Figure: attack categories covered. Worm/virus detection after infection; insider attack (policy violation); outsider attack (network intrusion)]
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.
A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data.
For misuse detection, it has a very large set of collected data patterns which can be matched against scanned network data.
This large set of data patterns is scanned using a data mining classification algorithm (decision tree).
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
For example, consider a window in which we observe a 3-event sequence E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.
If a occurs in 5% of the windows and the joint event a and b occurs in 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window.
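The 80% figure can be reproduced with a small count over observation windows. The sketch uses the standard rule-confidence definition (windows containing LHS followed by RHS, divided by windows containing LHS); the function `fer_confidence` and the synthetic windows are assumptions built to match the 5% and 4% frequencies in the example.

```python
def fer_confidence(windows, lhs, rhs):
    """windows: list of event sequences, each observed within one time window."""
    def contains_in_order(window, pattern):
        # Consume the window left to right so the pattern must appear in order.
        it = iter(window)
        return all(event in it for event in pattern)

    lhs_count = sum(1 for w in windows if contains_in_order(w, lhs))
    joint_count = sum(1 for w in windows if contains_in_order(w, lhs + rhs))
    return joint_count / lhs_count if lhs_count else 0.0


# 100 synthetic windows: E occurs in 5 of them, and E followed by D, F in 4.
windows = [["E", "D", "F"]] * 4 + [["E", "D"]] + [["D", "F"]] * 95
conf = fer_confidence(windows, lhs=["E"], rhs=["D", "F"])
print(conf)
```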
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF).
The events D, F may be two sequential smtp requests, denoted by (service = smtp).
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level of s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)    (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record.
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window)    (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule.
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.
We have summarized those system models, their technologies, and their validation methods.
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.
61
Reference
- DARPA 1998 data set. A cleansed set in KDD Cup '99; the DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
- Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
- Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
- W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
- Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.
- Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
- Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
5
Intrusion Detection System (IDS)
Intrusion Detection System combination of software and hardware that attempts to
perform intrusion detection raises the alarm when possible intrusion happens
6
IDS Categories
Intrusion detection systems are split into two groups Anomaly detection systems
Identify malicious traffic based on deviations from established normal network
Misuse detection systems Identify intrusions based on a known pattern
(signatures) for the malicious activity
7
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
8
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
9
Goal of Intrusion Detection Systems (IDS) To detect an intrusion as it happens and be able to respond to it
False positives A false positive is a situation where something abnormal (as
defined by the IDS) happens but it is not an intrusion Too many false positives
User will quit monitoring IDS because of noise False negatives
A false negative is a situation where an intrusion is really happening but IDS doesnt catch it
10
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
11
Why do we need Data Mining
Despite the enormous amount of data particular events of interest are still quite rare frequency ranges from 01 to less than 10
We are drowning in data but starving for knowledge1048714
12
Data Mining vs KDD
Knowledge Discovery in Databases (KDD) The whole process of finding useful information and patterns in data
Data Mining Use of algorithms to extract the information and patterns derived by the KDD process
Data mining is the core of the knowledge discovery process
13
KDD Process
Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform
to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in
meaningful manner
14
Data Mining A KDD Processndash Data mining core of
knowledge discovery process
Data Cleaning
Data Integration
Databases
Data Warehouse
Task-relevant Data
Selection
Data Mining
Pattern Evaluation
15
Typical Data Mining Architecture
Data Warehouse
Data cleaning amp data integration Filtering
Databases
Database or data warehouse server
Data mining engine
Pattern evaluation
Graphical user interface
Knowledge-base
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help
Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses
Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying
information)
19
Intrusion Detection
Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks
LimitationsSignature database has to be manually revised
for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created
signatures across the computer system
20
Data Mining for Intrusion Detection Techniques and Applications
Frequent pattern mining Classification Clustering Mining data streams
21
Patterns that occur frequently in a database
Mining Frequent patterns ndash finding regularities
Process of Mining Frequent patterns for intrusion de
tection Phase I mine a repository of normal frequent itemsets for a
ttack-free data
Phase II find frequent itemsets in the last n connections an
d compare the patterns to the normal profile
Frequent pattern mining
22
Frequent pattern mining
Apriori bull Any subset of a frequent itemset must be also freque
nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent
bull No superset of any infrequent itemset should be generated or tested
ndash Many item combinations can be pruned
23
Sequential Pattern Analysis
Models sequence patterns (Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns (frequent) sequential patterns
Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification A Two-Step Process Model construction describe a set of predetermined
classes Training dataset tuples for model construction
Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae
Model application classify unseen objects Estimate accuracy of the model using an independent test
set Acceptable accuracy apply the model to classify data tu
ples with unknown class labels
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering High intra-class similarity and low interclasssimilar
ity Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches Partitioning algorithms
ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin
g Hierarchy algorithms
ndash Agglomerative each object is a cluster merge clusters to form larger ones
ndash Divisive all objects are in a cluster split it up into smaller clusters
33
Clustering
K-Means Example
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities The profiles of normal activities may drift
Identifying novel attacks Identifying clusters and outliers in traffic data
streams Reduce the future alarm load by writing
filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detectionPredictive models are built from labeled data sets (instances
are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually
created signatures Recent research eg JAM (Java Agents for Metalearning)
36
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model
Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt
rusion Detection System
39
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
40
ADAM Audit Data Analysis and Mining
Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line
algorithm Secondly ADAM runs an online algorithm
Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one
of the following Known type of attack Unknown type of attack False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM has two phases in their model
1st Phase Train the classifier Offline process Takes place only once Before the main experiment
2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process
43
The MINDS Project
MINDS ndash MINnesota INtrusion Detection System
Learning from Rare Class ndash Building rare class prediction models
Anomalyoutlier detection
Summarization of attacks using association pattern analysis
TID Items
1 Bread Coke Milk
2 Beer Bread
3 Beer Coke Diaper Milk
4 Beer Bread Diaper Milk
5 Coke Diaper Milk
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
44
MINDS - Learning from Rare Class
Problem Building models for rare network attacks (Mining needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous, behavior
Identify normal behavior
Construct a useful set of features
Define a similarity function
Use an outlier detection algorithm:
Nearest neighbor approach
Density-based schemes
Unsupervised Support Vector Machines (SVM)
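A minimal sketch of the nearest neighbor approach, assuming Euclidean distance as the similarity function and two made-up features per connection (number of packets, kilobytes). It illustrates the idea only and is not the MINDS algorithm itself: each point is scored by the distance to its k-th nearest neighbor, so points far from everything else get high scores.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical feature vectors: (number of packets, kilobytes transferred).
connections = [(3, 1.2), (3, 1.5), (4, 1.4), (3, 1.3), (40, 9.0)]
scores = knn_outlier_scores(connections, k=2)

# Rank connections from most to least anomalous.
ranked = sorted(zip(scores, connections), reverse=True)
for score, conn in ranked:
    print(f"{conn}: {score:.2f}")
```

The connection (40, 9.0) is ranked first because even its second-nearest neighbor is far away; in practice the features would need scaling so that no single attribute dominates the distance.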
46
Experimental Evaluation
[Figure: evaluation pipeline. Net-flow data collected from the network using Cisco routers goes through data preprocessing into MINDS anomaly detection; the resulting anomaly scores feed association pattern analysis. The figure also shows Snort, an open source signature-based network IDS (www.snort.org)]
10-minute cycle
2 million connections
Anomaly detection is applied 4 times a day with a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. The anomaly detection system produces ranked connections labeled attack or normal; the discriminating association pattern generator derives rules R1 ... R100 from them (e.g. R1: TCP, DstPort=1863 → Attack; R100: TCP, DstPort=80 → Normal), which update the knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets 3, NoBytes 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets 3, NoBytes 120…180 (c1=177, c2=0)
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
SNORT has static knowledge, manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack: policy violation
Outsider attack: network intrusion
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse and anomaly detection
A distance-based outlier detection algorithm is used to detect deviating behavior in the collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm, the decision tree
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
For example, we envision a window where we observe a 3-event sequence: E, D and F
An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If b occurs with a probability of 5% and the joint event of a and b has a probability of 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window
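The arithmetic behind the 80% figure can be checked with a few lines. The sketch assumes 100 observed windows so that the slide's 5% and 4% become whole counts; the function name is illustrative. Note that association-rule confidence is conventionally defined with the frequency of the LHS in the denominator, so the denominator is passed in as a parameter and either reading of the formula above can be checked.

```python
def rule_confidence(joint_count, base_count):
    """Confidence = occurrences of the joint event / occurrences of the base event."""
    if base_count == 0:
        raise ValueError("the base event never occurs, confidence is undefined")
    return joint_count / base_count

total_windows = 100   # assumed number of observed windows
base_windows = 5      # windows containing the base event (5%)
joint_windows = 4     # windows containing both a and b (4%)

fer_confidence = rule_confidence(joint_windows, base_windows)
fer_support = joint_windows / total_windows
print(f"confidence = {fer_confidence:.0%}, support of the joint event = {fer_support:.0%}")
```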
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)
The events D, F may be two sequential smtp requests, denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec
The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated
This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
59
A cooperative anomaly and intrusion detection system (CAIDS)
[Figure: Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We have summarized those system models, their technologies and their validation methods
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set
A cleansed set in KDDCup'99; DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu, "ADAM: Detecting Intrusions by Data Mining", Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M., 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al., A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al., MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A. P., Manalwar, P., Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration, Conf. 2005, IRI-2005, IEEE International Conference on, 15-17 Aug. 2005, pp. 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category). In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
6
IDS Categories
Intrusion detection systems are split into two groups:
Anomaly detection systems: identify malicious traffic based on deviations from established normal network traffic
Misuse detection systems: identify intrusions based on a known pattern (signature) for the malicious activity
7
Anomaly Detection
[Figure: bar chart of activity measures (e.g. CPU, process size) comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic, then look for things that are out of the norm
8
Misuse Detection
[Figure: activities are compared against known intrusion patterns by pattern matching; a match signals an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
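The land attack example can be written as a tiny signature matcher. This is a sketch only: the Packet type and the match_signatures function are hypothetical names for illustration, and a real misuse detector such as Snort uses a much richer rule language.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    src_ip: str
    dst_ip: str

def match_signatures(packet):
    """Return the names of all known signatures that the packet matches."""
    alerts = []
    # Signature from the slide: source and destination address are identical.
    if packet.src_ip == packet.dst_ip:
        alerts.append("land attack")
    return alerts

spoofed = Packet(src_ip="10.0.0.5", dst_ip="10.0.0.5")
ordinary = Packet(src_ip="10.0.0.7", dst_ip="10.0.0.5")
print(match_signatures(spoofed))
print(match_signatures(ordinary))
```

A packet that matches no signature produces no alert, which is exactly why this style of detection cannot flag attacks it has no pattern for.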
9
Goal of Intrusion Detection Systems (IDS)
To detect an intrusion as it happens and be able to respond to it
False positives
A false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion
Too many false positives: the user will quit monitoring the IDS because of the noise
False negatives
A false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it
10
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
11
Why do we need Data Mining
Despite the enormous amount of data, particular events of interest are still quite rare: frequency ranges from 0.1% to less than 10%
We are drowning in data but starving for knowledge
12
Data Mining vs. KDD
Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data
Data Mining: use of algorithms to extract the information and patterns derived by the KDD process
Data mining is the core of the knowledge discovery process
13
KDD Process
Selection: obtain data from various sources
Preprocessing: cleanse data
Transformation: convert to common format; transform to new format
Data Mining: obtain desired results
Interpretation/Evaluation: present results to user in meaningful manner
14
Data Mining: A KDD Process
Data mining: core of knowledge discovery process
[Figure: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]
15
Typical Data Mining Architecture
[Figure: layered architecture, bottom to top: databases and data warehouse (data cleaning & data integration, filtering); database or data warehouse server; data mining engine; pattern evaluation; graphical user interface; with a knowledge base alongside]
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help?
Learn from traffic data
Supervised learning: learn precise models from past intrusions
Unsupervised learning: identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites
Helps detect sophisticated attacks not identifiable by single-site analyses
Analysis of long-term data (months/years)
Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)
19
Intrusion Detection
Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks
Limitations:
The signature database has to be manually revised for each new type of discovered intrusion
They cannot detect emerging cyber threats
Substantial latency in deployment of newly created signatures across the computer system
20
Data Mining for Intrusion Detection: Techniques and Applications
Frequent pattern mining
Classification
Clustering
Mining data streams
21
Frequent pattern mining
Patterns that occur frequently in a database
Mining frequent patterns: finding regularities
Process of mining frequent patterns for intrusion detection
Phase I: mine a repository of normal frequent itemsets for attack-free data
Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
22
Frequent pattern mining
Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
- A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
- If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
- Many item combinations can be pruned
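The pruning idea can be sketched as a small level-wise miner. This is a teaching-sized illustration under assumed data (the baskets and the minimum count of 2 are made up), not an optimized Apriori implementation: candidates of size k are built only from frequent itemsets of size k-1 and dropped as soon as one of their subsets is infrequent.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: count} for all itemsets occurring in at least
    `min_count` transactions, generated level by level."""
    items = {item for t in transactions for item in t}
    level = {frozenset([item]) for item in items}
    frequent = {}
    k = 1
    while level:
        counts = {c: sum(c <= t for t in transactions) for c in level}
        current = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = {c for c in candidates
                 if all(frozenset(s) in current for s in combinations(c, k - 1))}
    return frequent

baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts"},
    {"milk", "nuts"},
    {"beer", "milk"},
]
frequent = apriori(baskets, min_count=2)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

Because {beer, milk} is infrequent in this data, no three-item candidate containing both beer and milk is ever counted.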
23
Sequential Pattern Analysis
Models sequence patterns
(Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns → (frequent) sequential patterns
Sequential patterns for intrusion detection
Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
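A brute-force sketch of the problem statement, restricted to patterns of length two to keep it short. The event names and sessions are invented for illustration, and real miners (for example GSP or PrefixSpan) avoid enumerating every candidate the way this does.

```python
from itertools import product

def is_subsequence(pattern, sequence):
    """True if the events of `pattern` occur in `sequence` in the same order
    (not necessarily adjacent)."""
    remaining = iter(sequence)
    return all(event in remaining for event in pattern)

def sequence_support(pattern, sequences):
    """Number of sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)

def frequent_pairs(sequences, min_count):
    """All ordered event pairs supported by at least `min_count` sequences."""
    alphabet = sorted({event for s in sequences for event in s})
    result = {}
    for pair in product(alphabet, repeat=2):
        n = sequence_support(pair, sequences)
        if n >= min_count:
            result[pair] = n
    return result

# Hypothetical per-connection event sequences.
sessions = [
    ["syn", "syn", "ack", "fin"],
    ["syn", "rst"],
    ["syn", "ack", "rst"],
    ["ack", "syn"],
]
pairs = frequent_pairs(sessions, min_count=2)
print(pairs)
```

Order matters: the last session contains both "ack" and "syn" but does not support the pattern ("syn", "ack").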
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
Model construction: describe a set of predetermined classes
Training dataset: tuples for model construction
Each tuple/sample belongs to a predefined class
Classification rules, decision trees, or math formulae
Model application: classify unseen objects
Estimate accuracy of the model using an independent test set
Acceptable accuracy: apply the model to classify data tuples with unknown class labels
27
Classification
28
Classification: Decision Tree
A node in the tree: a test of some attribute
A branch: a possible value of the attribute
Classification:
Start at the root
Test the attribute
Move down the tree branch
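The three classification steps map directly onto a short loop. The tree below is hand-written for illustration (the attributes flag and dst_port and the class labels are assumptions, not a tree learned from data): inner nodes are dictionaries holding the attribute to test, leaves are class labels.

```python
# Hypothetical hand-built tree: inner nodes test one attribute, leaves are labels.
tree = {
    "attribute": "flag",
    "branches": {
        "SYN": {
            "attribute": "dst_port",
            "branches": {80: "normal", 27374: "intrusion"},
        },
        "SF": "normal",
    },
}

def classify(node, record, default="unknown"):
    """Start at the root, test the node's attribute, follow the matching branch,
    and repeat until a leaf (a class label) is reached."""
    while isinstance(node, dict):
        value = record.get(node["attribute"])
        node = node["branches"].get(value, default)
    return node

print(classify(tree, {"flag": "SYN", "dst_port": 27374}))
print(classify(tree, {"flag": "SF", "dst_port": 25}))
print(classify(tree, {"flag": "RST", "dst_port": 80}))
```

Records whose attribute value has no branch fall back to the default label; a learned tree would usually return the majority class of the node instead.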
29
Neural classification: HIDE
"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.
Five major components:
Probes: collect traffic data
Event preprocessor: preprocesses traffic data and feeds the statistical model
Statistical processor: maintains a model for normal activities and generates vectors for new events
Neural network: classifies the vectors of new events
Post processor: generates reports
30
Clustering
What Is Clustering?
Group data into clusters
- Similar to one another within the same cluster
- Dissimilar to the objects in other clusters
- Unsupervised learning: no predefined classes
31
Clustering
What Is A Good Clustering?
High intra-class similarity and low inter-class similarity
Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches
Partitioning algorithms
- Partition the objects into k clusters
- Iteratively reallocate objects to improve the clustering
Hierarchy algorithms
- Agglomerative: each object is a cluster; merge clusters to form larger ones
- Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
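Since the slide's figure is not reproduced here, the following is a small stand-in sketch of k-means on made-up two-dimensional points. It is a plain-Python illustration with fixed initial centroids and a fixed number of iterations, both of which are assumptions for the example: each round assigns every point to its nearest centroid and then moves each centroid to the mean of its cluster.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate assignment and centroid update for a fixed number of rounds."""
    for _ in range(iterations):
        # Assignment step: put each point into the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (9, 8)])
print(centroids)
print(clusters)
```

With these points the assignment stabilizes after the first round: the three points near the origin form one cluster and the three points near (8, 8) form the other.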
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities
The profiles of normal activities may drift
Identifying novel attacks
Identifying clusters and outliers in traffic data streams
Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detection
Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
These models can be more sophisticated and precise than manually created signatures
Recent research, e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are compared against known intrusion patterns by pattern matching; a match signals an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks. Thus, data mining in JAM builds a misuse detection model
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection
Identifies anomalies as deviations from "normal" behavior
E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (e.g. CPU, process size) comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic, then look for things that are out of the norm
7
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
8
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
9
Goal of Intrusion Detection Systems (IDS) To detect an intrusion as it happens and be able to respond to it
False positives A false positive is a situation where something abnormal (as
defined by the IDS) happens but it is not an intrusion Too many false positives
User will quit monitoring IDS because of noise False negatives
A false negative is a situation where an intrusion is really happening but IDS doesnt catch it
10
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
11
Why do we need Data Mining
Despite the enormous amount of data particular events of interest are still quite rare frequency ranges from 01 to less than 10
We are drowning in data but starving for knowledge1048714
12
Data Mining vs KDD
Knowledge Discovery in Databases (KDD) The whole process of finding useful information and patterns in data
Data Mining Use of algorithms to extract the information and patterns derived by the KDD process
Data mining is the core of the knowledge discovery process
13
KDD Process
Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform
to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in
meaningful manner
14
Data Mining A KDD Processndash Data mining core of
knowledge discovery process
Data Cleaning
Data Integration
Databases
Data Warehouse
Task-relevant Data
Selection
Data Mining
Pattern Evaluation
15
Typical Data Mining Architecture
Data Warehouse
Data cleaning amp data integration Filtering
Databases
Database or data warehouse server
Data mining engine
Pattern evaluation
Graphical user interface
Knowledge-base
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help
Learn from traffic data
- Supervised learning: learn precise models from past intrusions
- Unsupervised learning: identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites
- Helps detect sophisticated attacks not identifiable by single-site analyses
Analysis of long-term data (months/years)
- Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)
19
Intrusion Detection
Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks
Limitations
- Signature database has to be manually revised for each new type of discovered intrusion
- They cannot detect emerging cyber threats
- Substantial latency in deployment of newly created signatures across the computer system
20
Data Mining for Intrusion Detection: Techniques and Applications
Frequent pattern mining
Classification
Clustering
Mining data streams
21
Frequent pattern mining
Patterns that occur frequently in a database
Mining frequent patterns: finding regularities
Process of mining frequent patterns for intrusion detection
- Phase I: mine a repository of normal frequent itemsets for attack-free data
- Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
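The two phases can be sketched as follows. This is a minimal illustration, not the method of any particular system: connections are modelled as sets of hypothetical "attribute=value" strings, itemsets are enumerated by brute force up to size 2, and the support thresholds are arbitrary.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(connections, min_support, max_size=2):
    """Return itemsets (as frozensets) whose support meets min_support."""
    counts = Counter()
    for conn in connections:
        items = sorted(conn)
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    n = len(connections)
    return {s for s, c in counts.items() if c / n >= min_support}


# Phase I: profile mined from (hypothetical) attack-free connections.
attack_free = [
    {"proto=tcp", "dport=80"},
    {"proto=tcp", "dport=80"},
    {"proto=tcp", "dport=443"},
    {"proto=udp", "dport=53"},
]
normal_profile = frequent_itemsets(attack_free, min_support=0.25)

# Phase II: frequent itemsets in the last n connections, minus the profile.
recent = [
    {"proto=tcp", "dport=80"},
    {"proto=tcp", "dport=27374"},
    {"proto=tcp", "dport=27374"},
    {"proto=tcp", "dport=27374"},
]
suspicious = frequent_itemsets(recent, min_support=0.5) - normal_profile
```

Itemsets that are frequent in the recent window but absent from the normal profile (here, those involving port 27374) are the ones worth a closer look.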
22
Frequent pattern mining
Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
- A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
- If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
- Many item combinations can be pruned
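A compact level-wise sketch of the Apriori idea, showing where the pruning happens. The basket data and the minimum count are hypothetical, and the join step is written for clarity rather than speed.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise frequent itemset mining with Apriori pruning."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}   # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Count the support of each size-k candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune every
        # candidate that has an infrequent k-subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "nuts"},
    {"diaper", "milk"},
]
result = apriori(baskets, min_count=2)
```

In this run {beer, diaper, nuts} is never counted against the data: its subset {diaper, nuts} is infrequent, so the candidate is pruned before the scan.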
23
Sequential Pattern Analysis
Models sequence patterns
- (Temporal) order is important in many situations
- Time-series databases and sequence databases
- From frequent patterns to (frequent) sequential patterns
Sequential patterns for intrusion detection
- Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
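As a small illustration, the support of one candidate subsequence can be counted as below. This sketch only checks a single candidate; a full miner would enumerate candidates and keep those above a threshold. The event names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the events of pattern appear in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # "event in remaining" advances the iterator past the first match,
    # so later events must occur after earlier ones.
    return all(event in remaining for event in pattern)


def support(pattern, sequences):
    """Fraction of sequences that contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences) / len(sequences)


sessions = [
    ["syn", "login_fail", "login_fail", "login_ok"],
    ["syn", "login_ok"],
    ["syn", "login_fail", "login_ok"],
]
fail_then_ok = support(["login_fail", "login_ok"], sessions)
```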
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
Model construction: describe a set of predetermined classes
- Training dataset: tuples for model construction
- Each tuple/sample belongs to a predefined class
- Classification rules, decision trees, or math formulae
Model application: classify unseen objects
- Estimate accuracy of the model using an independent test set
- Acceptable accuracy: apply the model to classify data tuples with unknown class labels
27
Classification
28
Classification: Decision Tree
A node in the tree: a test of some attribute
A branch: a possible value of the attribute
Classification
- Start at the root
- Test the attribute
- Move down the tree branch
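The traversal described above can be sketched with a hand-written tree. The attributes, branch values, and class labels are hypothetical; in practice the tree would be learned from training data.

```python
# Internal nodes are (attribute, {value: subtree}); leaves are class labels.
tree = ("protocol", {
    "icmp": "probe",
    "tcp": ("flag", {"SYN": "suspicious", "ACK": "normal"}),
    "udp": "normal",
})


def classify(node, record, default="unknown"):
    """Walk from the root, testing one attribute per node."""
    while isinstance(node, tuple):
        attribute, branches = node
        # Follow the branch matching the record's value for this attribute.
        node = branches.get(record.get(attribute), default)
    return node


label = classify(tree, {"protocol": "tcp", "flag": "SYN"})
```

A record whose attribute value has no matching branch falls through to the default label instead of raising an error.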
29
Neural classification: HIDE
"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.
Five major components
- Probes collect traffic data
- Event preprocessor preprocesses traffic data and feeds the statistical model
- Statistical processor maintains a model for normal activities and generates vectors for new events
- Neural network classifies the vectors of new events
- Post processor generates reports
30
Clustering
What Is Clustering?
Group data into clusters
- Similar to one another within the same cluster
- Dissimilar to the objects in other clusters
- Unsupervised learning: no predefined classes
31
Clustering
What Is a Good Clustering?
High intra-class similarity and low inter-class similarity
- Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches
Partitioning algorithms
- Partition the objects into k clusters
- Iteratively reallocate objects to improve the clustering
Hierarchy algorithms
- Agglomerative: each object is a cluster, merge clusters to form larger ones
- Divisive: all objects are in a cluster, split it up into smaller clusters
33
Clustering
K-Means Example
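Since the original figure is not available, here is a minimal k-means sketch on hypothetical 2-D points. It assumes the caller supplies the initial centroids and runs a fixed number of iterations; it is not the example from the slide.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points with caller-supplied initial centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

The partitioning approach is visible in the loop: objects are reallocated between the k clusters until the centroids stop moving.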
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities
- The profiles of normal activities may drift
Identifying novel attacks
- Identifying clusters and outliers in traffic data streams
Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
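One simple way to let a normal profile follow drift on a stream is an exponentially weighted mean and variance. The sketch below is an assumption for illustration only (the class name, the decay factor, and the three-sigma threshold are all hypothetical), not a technique named in the slides.

```python
class DriftingProfile:
    """Running mean/variance of one traffic measure, with exponential decay."""

    def __init__(self, mean, variance, alpha=0.1):
        self.mean, self.variance, self.alpha = mean, variance, alpha

    def is_outlier(self, value, threshold=3.0):
        # Flag values more than `threshold` standard deviations from the mean.
        return abs(value - self.mean) > threshold * self.variance ** 0.5

    def update(self, value):
        # Only values judged normal should be fed back, so that an attack
        # does not pull the profile toward itself.
        delta = value - self.mean
        self.mean += self.alpha * delta
        self.variance = (1 - self.alpha) * (self.variance + self.alpha * delta ** 2)


profile = DriftingProfile(mean=100.0, variance=25.0)
flags = []
for connections_per_minute in [102, 98, 105, 400, 101]:
    outlier = profile.is_outlier(connections_per_minute)
    flags.append(outlier)
    if not outlier:
        profile.update(connections_per_minute)
```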
35
Data Mining for Intrusion Detection
Misuse detection
- Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
- These models can be more sophisticated and precise than manually created signatures
- Recent research, e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are checked by pattern matching against known intrusion patterns; a match indicates an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity; modification of system files; permission modifications
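The land-attack rule from the example can be written as a tiny signature matcher. Only that one rule comes from the slide; the packet layout (a dict with src_ip and dst_ip keys) and the signature table are assumptions for illustration.

```python
def land_attack(packet):
    """The slide's rule: source and destination address are identical."""
    return packet["src_ip"] == packet["dst_ip"]


# Signature table: name -> predicate. A real misuse detector would hold
# many more entries than this single hypothetical one.
SIGNATURES = {"land attack": land_attack}


def match_signatures(packet):
    """Return the names of all signatures the packet matches."""
    return [name for name, rule in SIGNATURES.items() if rule(packet)]


alerts = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"})
```

A packet that matches no entry in the table produces no alert, which is exactly why this style of detection cannot catch new attacks.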
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks, so data mining in JAM builds a misuse detection model
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection
- Identifies anomalies as deviations from "normal" behavior
- E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (e.g. CPU, process size) on a 0-90 scale, comparing the normal profile with abnormal values; a large deviation is marked as a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
40
ADAM: Audit Data Analysis and Mining
Detecting intrusion by data mining: a combination of association rules and classification rules
First, ADAM collects known frequent datasets using an off-line algorithm
Second, ADAM runs an online algorithm
- Finds the latest frequent connection records
- Compares them with the known mined data
- Discards those that seem to be normal
- Forwards suspicious ones to the classifier
The trained classifier then classifies the suspicious data as one of the following
- Known type of attack
- Unknown type of attack
- False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM has two phases in its model
1st phase: train the classifier
- Offline process
- Takes place only once, before the main experiment
2nd phase: use the trained classifier
- The trained classifier is then used to detect anomalies
- Online process
43
The MINDS Project
MINDS: MINnesota INtrusion Detection System
- Learning from rare class: building rare class prediction models
- Anomaly/outlier detection
- Summarization of attacks using association pattern analysis
TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk
Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
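The support and confidence of the two discovered rules can be checked directly against the five transactions listed above. The helper name is hypothetical; the data is the table from the slide.

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}


def rule_stats(lhs, rhs):
    """Return (support, confidence) of the rule lhs --> rhs."""
    lhs, rhs = set(lhs), set(rhs)
    # Transactions containing the left-hand side.
    n_lhs = sum(1 for items in transactions.values() if lhs <= items)
    # Transactions containing both sides of the rule.
    n_both = sum(1 for items in transactions.values() if lhs | rhs <= items)
    return n_both / len(transactions), n_both / n_lhs


milk_coke = rule_stats({"Milk"}, {"Coke"})
diaper_milk_beer = rule_stats({"Diaper", "Milk"}, {"Beer"})
```

Milk --> Coke holds in 3 of the 4 transactions that contain Milk (support 3/5), and Diaper, Milk --> Beer holds in 2 of the 3 transactions that contain both Diaper and Milk (support 2/5).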
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior
- Identify normal behavior
- Construct useful set of features
- Define similarity function
- Use outlier detection algorithm
  - Nearest neighbor approach
  - Density based schemes
  - Unsupervised Support Vector Machines (SVM)
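A minimal sketch of the nearest neighbor approach: the anomaly score of a connection is its distance to the k-th nearest connection in a reference set taken as normal. The feature choice (packets, kilobytes), the data, and k are hypothetical, and this is not claimed to be the scoring used by MINDS itself.

```python
import math


def knn_score(point, reference, k=2):
    """Anomaly score: distance from point to its k-th nearest reference point."""
    distances = sorted(math.dist(point, r) for r in reference)
    return distances[k - 1]


# Hypothetical (packets, kilobytes) features for connections taken as normal.
normal = [(3, 1.2), (4, 1.5), (3, 1.0), (5, 1.8), (4, 1.3)]
typical = knn_score((4, 1.4), normal)
unusual = knn_score((40, 90.0), normal)
```

Here Euclidean distance plays the role of the similarity function; a connection far from all normal ones receives a much higher score than one sitting inside the normal cloud.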
46
Experimental Evaluation
[Figure: evaluation pipeline. Net-flow data collected from the network using Cisco routers passes through data preprocessing to MINDS anomaly detection, which outputs anomaly scores for association pattern analysis; also shown is Snort, the open source signature-based network IDS (www.snort.org)]
- Net-flow data: 10-minute cycle, 2 million connections
- Anomaly detection is applied 4 times a day, on a 10-minute time window
- Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
- Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. The anomaly detection system outputs ranked connections labelled attack or normal; the discriminating association pattern generator derives rules from them (e.g. R1: TCP, DstPort=1863 --> Attack; ... R100: TCP, DstPort=80 --> Normal) and updates the knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
- Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
- SNORT has static knowledge, manually updated by human analysts
- MINDS anomaly detection algorithms are adaptive in nature
- MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
- Defining normal behavior
- Feature extraction
- Similarity functions
- Outlier detection
- Result summarization
- Detection of attacks originating from multiple sites
[Figure: attack categories: worm/virus detection after infection; insider attack (policy violation); outsider attack (network intrusion)]
54
IDS Using Both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse and anomaly detection
A distance-based outlier detection algorithm is used to detect deviant behavior in the collected network data
For misuse detection, it holds a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a decision tree classification algorithm
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
For example, we envision a window in which we observe a 3-event sequence: E, D, and F
An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If b occurs with 5% probability and the joint event a and b with 4% probability, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window
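The arithmetic of the confidence level can be reproduced in a few lines. Note one assumption: the slide divides the joint frequency by the frequency of b, whereas rule confidence is conventionally defined with the frequency of the antecedent in the denominator, so the sketch takes the denominator as a plain parameter and does not settle which event it refers to.

```python
import math


def rule_confidence(joint_freq, base_freq):
    """Ratio of the joint frequency freq(a ∪ b) to a base frequency."""
    if base_freq <= 0:
        raise ValueError("base frequency must be positive")
    return joint_freq / base_freq


# Numbers from the slide: the joint event occurs in 4% of windows,
# the base event in 5%.
confidence = rule_confidence(0.04, 0.05)
is_eighty_percent = math.isclose(confidence, 0.8)
```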
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes: (service = authentication, flag = SF)
The events D, F may be two sequential smtp requests, denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec
The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated
This FER is formally stated as follows
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
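One possible in-memory representation of expression (2) is sketched below, filled in with the authentication/smtp rule stated as (1). The class name, field names, and the is_strong helper are hypothetical and are not taken from the CAIDS implementation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Event = Dict[str, str]  # attribute name -> value for one connection event


@dataclass
class FrequentEpisodeRule:
    lhs: Tuple[Event, ...]   # ordered events L1..Ln
    rhs: Tuple[Event, ...]   # ordered events R1..Rm
    confidence: float        # c
    support: float           # s
    window_seconds: float    # window

    def is_strong(self, min_confidence, min_support):
        """True if the rule meets both thresholds."""
        return self.confidence >= min_confidence and self.support >= min_support


# Rule (1): authentication followed by two smtp requests within 2 seconds.
smtp_after_auth = FrequentEpisodeRule(
    lhs=({"service": "authentication"},),
    rhs=({"service": "smtp"}, {"service": "smtp"}),
    confidence=0.8,
    support=0.1,
    window_seconds=2.0,
)
```

Keeping both sides as ordered tuples preserves the event order that distinguishes an episode rule from an association rule over a single record.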
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report, we studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We summarized those system models, their technologies, and their validation methods
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set is in KDDCup'99; the DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
57
A cooperative anomaly and intrusiondetection system (CAIDS)
In practice the event E could be an authentication service characterized by two attributes
(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b
y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t
hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows
(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)
58
A cooperative anomaly and intrusiondetection system (CAIDS)
An association rule is aimed at finding interesting intra-relationship inside a single connection record
In general an FER is specified by the following expression
L1 L2hellip Ln R1hellip Rm (c s window) (2)
Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events
We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule
59
A cooperative anomaly and intrusiondetection system (CAIDS)
Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied basic concept and some classic system models like ADAM MINDSin this area
To make summary of those system models their technologies and their validation methods
Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection
61
Reference DARPA 1998 data set
A cleansed set in KDDCuprsquo99 DARPA 1991 data set is also available httpwwwllmiteduISTidevaldatadata_indexhtml
Daniel Barbara Julia Couto Sushil Jajodia Leonard Popyack Ningning Wu ldquoADAM Detecting Intrusions by Data Miningrdquo Proceedings of the 2001 IEEE Workshop on Information Assurance and Security United States Military Academy West Point NY 5-6 June 2001
Zhang J and Zulkernine M 2006 A Hybrid Network Intrusion Detection Technique Using Random Forests In Proceedings of the First international Conference on Availability Reliability and Security (April 20 - 22 2006)
W Lee et al A data mining framework for building intrusion detection models In Information and System Security Vol 3 No 4 2000
Ertoz L et Al MINDS - Minnesota Intrusion Detection System Next Generation Data Mining Chapter 3 2004
Exploiting efficient data mining techniques to enhance intrusion detection systems Lu C-T Boedihardjo AP Manalwar P Information Reuse and Integration Conf 2005 IRI -2005 IEEE International Conference on Volume Issue 15-17 Aug 2005 Page(s) 512 - 517
Sal Stolfo Andreas Prodromidis Shelley Tselepis Wenke Lee Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97) Newport Beach CA August 1997
62
Questions amp Comments
9
Goal of Intrusion Detection Systems (IDS)
• To detect an intrusion as it happens and be able to respond to it
• False positives
  – A false positive is a situation where something abnormal (as defined by the IDS) happens, but it is not an intrusion
  – With too many false positives, users will stop monitoring the IDS because of the noise
• False negatives
  – A false negative is a situation where an intrusion is really happening, but the IDS doesn't catch it
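To make the two error types concrete, the sketch below counts false positives and false negatives from paired flags and turns them into rates. The function names and the toy data are assumptions made for illustration; they do not come from any particular IDS.

```python
from collections import Counter

def confusion_counts(alerts, truths):
    """Count outcomes for paired flags: (did the IDS alert?, was it an intrusion?)."""
    counts = Counter()
    for alerted, intrusion in zip(alerts, truths):
        if alerted and intrusion:
            counts["true_positive"] += 1
        elif alerted and not intrusion:
            counts["false_positive"] += 1
        elif not alerted and intrusion:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts

def error_rates(counts):
    """False positive rate over harmless events, false negative rate over intrusions."""
    harmless = counts["false_positive"] + counts["true_negative"]
    attacks = counts["false_negative"] + counts["true_positive"]
    fpr = counts["false_positive"] / harmless if harmless else 0.0
    fnr = counts["false_negative"] / attacks if attacks else 0.0
    return fpr, fnr

# Hypothetical toy data: 8 events, 2 of which are real intrusions.
alerts = [True, True, False, False, True, False, False, False]
truths = [True, False, False, False, False, True, False, False]
counts = confusion_counts(alerts, truths)
fpr, fnr = error_rates(counts)
```

On this toy data the IDS raises two false alarms among six harmless events and misses one of the two real intrusions.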
10
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
11
Why do we need Data Mining?
• Despite the enormous amount of data, particular events of interest are still quite rare (frequency ranges from 01 to less than 10)
• We are drowning in data, but starving for knowledge
12
Data Mining vs. KDD
• Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data
• Data Mining: use of algorithms to extract the information and patterns derived by the KDD process
• Data mining is the core of the knowledge discovery process
13
KDD Process
• Selection: obtain data from various sources
• Preprocessing: cleanse data
• Transformation: convert to a common format; transform to a new format
• Data Mining: obtain desired results
• Interpretation/Evaluation: present results to the user in a meaningful manner
14
Data Mining: A KDD Process
– Data mining: core of the knowledge discovery process
[Figure: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]
15
Typical Data Mining Architecture
[Figure: layered architecture, bottom to top: databases and data warehouse (with data cleaning, data integration, and filtering), database or data warehouse server, data mining engine, pattern evaluation, graphical user interface; a knowledge base sits alongside these layers]
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help?
• Learn from traffic data
  – Supervised learning: learn precise models from past intrusions
  – Unsupervised learning: identify suspicious activities
• Maintain models on dynamic data
• Correlation of suspicious events across network sites
  – Helps detect sophisticated attacks not identifiable by single-site analyses
• Analysis of long-term data (months/years)
  – Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)
19
Intrusion Detection
• Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks
• Limitations
  – The signature database has to be manually revised for each new type of discovered intrusion
  – They cannot detect emerging cyber threats
  – Substantial latency in deployment of newly created signatures across the computer system
20
Data Mining for Intrusion Detection: Techniques and Applications
• Frequent pattern mining
• Classification
• Clustering
• Mining data streams
21
Frequent pattern mining
• Patterns that occur frequently in a database
• Mining frequent patterns: finding regularities
• Process of mining frequent patterns for intrusion detection
  – Phase I: mine a repository of normal frequent itemsets for attack-free data
  – Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
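The two phases can be sketched as follows. This is a minimal illustration, assuming each connection is a set of attribute=value strings and using brute-force enumeration of small itemsets rather than an optimized miner; the traffic records and the `frequent_itemsets` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(records, min_support, max_size=2):
    """Return itemsets whose support (fraction of records containing them)
    is at least min_support. Brute force: enumerates all small subsets."""
    counts = Counter()
    for record in records:
        items = sorted(record)
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    n = len(records)
    return {itemset for itemset, c in counts.items() if c / n >= min_support}

# Phase I: build the normal profile from attack-free connections.
normal_traffic = [
    {"DstPort=80", "Protocol=TCP"},
    {"DstPort=80", "Protocol=TCP"},
    {"DstPort=53", "Protocol=UDP"},
    {"DstPort=80", "Protocol=TCP"},
]
normal_profile = frequent_itemsets(normal_traffic, min_support=0.5)

# Phase II: mine the last n connections and keep what the profile lacks.
recent = [
    {"DstPort=27374", "Protocol=TCP"},
    {"DstPort=27374", "Protocol=TCP"},
    {"DstPort=80", "Protocol=TCP"},
    {"DstPort=27374", "Protocol=TCP"},
]
suspicious = frequent_itemsets(recent, min_support=0.5) - normal_profile
```

Itemsets that are frequent in the recent window but absent from the normal profile (here, the ones involving port 27374) are the candidates worth a closer look.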
22
Frequent pattern mining
Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  – A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  – If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
  – Many item combinations can be pruned
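A compact sketch of level-wise mining with the Apriori pruning rule is shown below. It is written for readability rather than speed, and the transactions and the `min_count` threshold are made up for illustration.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Level-wise frequent itemset mining with Apriori pruning.
    Returns a dict mapping each frequent itemset to its count."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {c: n for c, n in count(items).items() if n >= min_count}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: a candidate with any infrequent (k-1)-subset is never counted.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        frequent = {c: n for c, n in count(candidates).items() if n >= min_count}
        result.update(frequent)
        k += 1
    return result

transactions = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "nuts", "diaper"},
    {"milk", "diaper"},
]
frequent = apriori(transactions, min_count=2)
```

Because {milk} is infrequent here, no itemset containing milk is ever generated as a candidate, which is the pruning the slide describes.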
23
Sequential Pattern Analysis
• Models sequence patterns
• (Temporal) order is important in many situations
  – Time-series databases and sequence databases
  – Frequent patterns → (frequent) sequential patterns
• Sequential patterns for intrusion detection
  – Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
• Given a set of sequences, find the complete set of frequent subsequences
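As a small illustration of the problem statement, the sketch below enumerates every subsequence up to a fixed length and keeps those contained in enough sequences. It is a brute-force approach chosen for clarity, not one of the published sequential pattern algorithms, and the event names are hypothetical.

```python
from itertools import combinations

def frequent_subsequences(sequences, min_count, max_len=3):
    """Keep subsequences (order preserved, gaps allowed) of length <= max_len
    that are contained in at least min_count of the input sequences."""
    support = {}
    for seq in sequences:
        seen = set()
        for length in range(1, max_len + 1):
            for idx in combinations(range(len(seq)), length):
                seen.add(tuple(seq[i] for i in idx))
        for sub in seen:  # a sequence counts at most once per subsequence
            support[sub] = support.get(sub, 0) + 1
    return {sub: n for sub, n in support.items() if n >= min_count}

sessions = [
    ["syn", "login_fail", "login_fail", "login_ok"],
    ["syn", "login_fail", "login_ok"],
    ["syn", "login_ok"],
]
patterns = frequent_subsequences(sessions, min_count=2)
```

Order matters: ("syn", "login_ok") is frequent in this toy data while ("login_ok", "syn") never occurs.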
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
• Model construction: describe a set of predetermined classes
  – Training dataset: tuples for model construction
  – Each tuple/sample belongs to a predefined class
  – Classification rules, decision trees, or math formulae
• Model application: classify unseen objects
  – Estimate accuracy of the model using an independent test set
  – Acceptable accuracy → apply the model to classify data tuples with unknown class labels
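The two steps can be sketched with a deliberately simple model. The example below assumes a nearest-centroid classifier and two made-up numeric features; any classifier could take its place, and the 0.9 accuracy threshold is an arbitrary illustrative choice.

```python
def train_centroids(training_set):
    """Step 1, model construction: one mean feature vector per class."""
    sums, counts = {}, {}
    for features, label in training_set:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(model, features):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def accuracy(model, test_set):
    """Step 2a: estimate accuracy on an independent, labeled test set."""
    correct = sum(1 for features, label in test_set if classify(model, features) == label)
    return correct / len(test_set)

# Hypothetical features: (packets per connection, distinct ports contacted)
training_set = [((10, 1), "normal"), ((12, 2), "normal"),
                ((3, 40), "intrusive"), ((4, 55), "intrusive")]
test_set = [((11, 1), "normal"), ((2, 60), "intrusive")]

model = train_centroids(training_set)
test_accuracy = accuracy(model, test_set)

# Step 2b: only if the accuracy is acceptable, label tuples of unknown class.
label = classify(model, (3, 48)) if test_accuracy >= 0.9 else None
```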
27
Classification
28
Classification: Decision Tree
• A node in the tree: a test of some attribute
• A branch: a possible value of the attribute
• Classification
  – Start at the root
  – Test the attribute
  – Move down the tree branch
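The classification walk described above can be sketched in a few lines. The tree below is hand-written for illustration (it was not learned from data), and its attribute names are assumptions.

```python
# Internal node: (attribute, {value: subtree}); leaf: a class label string.
tree = ("protocol", {
    "icmp": "normal",
    "tcp": ("flag", {
        "SYN": ("same_src_dst", {True: "intrusion", False: "normal"}),
        "ACK": "normal",
    }),
})

def classify(node, record):
    """Start at the root, test the node's attribute, follow the matching branch,
    and repeat until a leaf (class label) is reached."""
    while not isinstance(node, str):
        attribute, branches = node
        node = branches[record[attribute]]
    return node

verdict = classify(tree, {"protocol": "tcp", "flag": "SYN", "same_src_dst": True})
```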
29
Neural classification: HIDE
“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.
Five major components
• Probes collect traffic data
• Event preprocessor preprocesses traffic data and feeds the statistical model
• Statistical processor maintains a model for normal activities and generates vectors for new events
• Neural network classifies the vectors of new events
• Post processor generates reports
30
Clustering
What Is Clustering?
• Group data into clusters
  – Similar to one another within the same cluster
  – Dissimilar to the objects in other clusters
  – Unsupervised learning: no predefined classes
31
Clustering
What Is A Good Clustering?
• High intra-class similarity and low inter-class similarity
  – Depending on the similarity measure
• The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches
• Partitioning algorithms
  – Partition the objects into k clusters
  – Iteratively reallocate objects to improve the clustering
• Hierarchy algorithms
  – Agglomerative: each object is a cluster; merge clusters to form larger ones
  – Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
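Since the slide's figure did not survive extraction, here is a minimal k-means sketch in its place. It assumes 2-D points and caller-supplied starting centroids, both chosen for illustration; real implementations usually pick the starting centroids at random.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append((
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                ))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (2, 1)])
```

Even with both starting centroids placed inside the lower-left group, the iterative reallocation separates the two groups after a couple of rounds.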
34
Mining Data Streams for Intrusion Detection
• Maintaining profiles of normal activities
  – The profiles of normal activities may drift
• Identifying novel attacks
  – Identifying clusters and outliers in traffic data streams
• Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
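One simple way to keep a profile that follows drift is an exponentially weighted mean and variance, sketched below. This is an illustrative technique, not one the slides attribute to a specific system; the class name, `alpha`, `threshold`, and `warmup` are all assumptions.

```python
class DriftingProfile:
    """Exponentially weighted running mean and variance of one traffic measure."""

    def __init__(self, alpha=0.1, threshold=4.0, warmup=5):
        self.alpha = alpha          # weight given to each new observation
        self.threshold = threshold  # outlier cutoff, in standard deviations
        self.warmup = warmup        # observations to learn before flagging
        self.mean, self.var, self.seen = 0.0, 0.0, 0

    def observe(self, value):
        """Return True if value is an outlier; otherwise fold it into the profile."""
        if self.seen >= self.warmup:
            std = self.var ** 0.5
            if abs(value - self.mean) > self.threshold * max(std, 1e-9):
                return True  # reported, and deliberately not learned
        if self.seen == 0:
            self.mean = value
        else:
            delta = value - self.mean
            self.mean += self.alpha * delta
            self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)
        self.seen += 1
        return False

profile = DriftingProfile()
stream = [100, 102, 98, 101, 99, 101, 100, 500, 101]  # e.g. connections per minute
flags = [profile.observe(v) for v in stream]
```

Because normal values keep updating the mean and variance, slow changes in traffic shift the profile instead of raising alarms, while the isolated spike is flagged.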
35
Data Mining for Intrusion Detection
Misuse detection
• Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
• These models can be more sophisticated and precise than manually created signatures
• Recent research, e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against intrusion patterns; a match indicates an intrusion]
• Can't detect new attacks
• Example: if (src_ip == dst_ip) then “land attack”
• Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity; modification of system files; permission modifications
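The land-attack rule above can be expressed as a tiny signature matcher. The record fields are assumptions, and the second signature is a hypothetical rule added only to show how several signatures are checked; it uses port 27374, which the deck later associates with SubSeven.

```python
# Each signature is a (name, predicate) pair evaluated on one connection record.
SIGNATURES = [
    ("land attack", lambda pkt: pkt["src_ip"] == pkt["dst_ip"]),
    # Hypothetical second rule, for illustration only.
    ("SubSeven port probe",
     lambda pkt: pkt["protocol"] == "TCP" and pkt["dst_port"] == 27374),
]

def match_signatures(pkt):
    """Return the names of all known signatures that the record matches."""
    return [name for name, predicate in SIGNATURES if predicate(pkt)]

land = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5", "protocol": "TCP", "dst_port": 139}
benign = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "protocol": "TCP", "dst_port": 80}
land_hits = match_signatures(land)
benign_hits = match_signatures(benign)
```

A record that matches no signature returns an empty list, which is exactly why this approach cannot detect new attacks.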
37
JAM (Java Agents for Metalearning)
• JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.
• The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.
• The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model.
• Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.
• The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.
38
Data Mining for Intrusion Detection
Anomaly detection
• Identifies anomalies as deviations from “normal” behavior
• E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (CPU, process size; scale 0 to 90) comparing the normal profile with abnormal values that indicate a probable intrusion]
• Relatively high false positive rate: anomalies can just be new normal activities
• Baseline the normal traffic and then look for things that are out of the norm
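A minimal sketch of "baseline, then look for deviations" follows. It assumes a per-measure mean and standard deviation as the normal profile and a cutoff of k standard deviations; the measures, numbers, and function names are illustrative.

```python
from statistics import mean, stdev

def build_profile(baseline):
    """baseline: dict mapping each measure to values seen during normal operation."""
    return {measure: (mean(values), stdev(values)) for measure, values in baseline.items()}

def abnormal_measures(profile, observation, k=3.0):
    """Return the measures whose observed value is more than k standard
    deviations away from the baseline mean."""
    flagged = []
    for measure, value in observation.items():
        mu, sigma = profile[measure]
        if sigma > 0 and abs(value - mu) > k * sigma:
            flagged.append(measure)
    return flagged

baseline = {"cpu": [20, 22, 19, 21, 18], "process_size": [40, 42, 41, 39, 38]}
profile = build_profile(baseline)
flagged = abnormal_measures(profile, {"cpu": 85, "process_size": 41})
```

A legitimate new workload that pushes CPU to 85 would be flagged in exactly the same way, which is the source of the high false positive rate noted above.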
40
ADAM: Audit Data Analysis and Mining
Detecting Intrusion by Data Mining
• Combination of association rules and classification rules
• First, ADAM collects known frequent datasets using an off-line algorithm
• Second, ADAM runs an online algorithm
  – Finds the last frequent connection records
  – Compares them with the known mined data
  – Discards those which seem to be normal
  – Forwards the suspicious ones to the classifier
• The trained classifier then classifies the suspicious data as one of the following
  – Known type of attack
  – Unknown type of attack
  – False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
• 1st phase: train the classifier
  – Offline process
  – Takes place only once, before the main experiment
• 2nd phase: use the trained classifier
  – The trained classifier is then used to detect anomalies
  – Online process
43
The MINDS Project
MINDS – MINnesota INtrusion Detection System
• Learning from Rare Class: building rare class prediction models
• Anomaly/outlier detection
• Summarization of attacks using association pattern analysis

TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk

Rules Discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
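The strength of the two rules can be checked directly against the five transactions. The sketch below uses the usual definitions of support and confidence; the slide itself does not state which thresholds were used to discover the rules.

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}

def support(itemset):
    """Fraction of transactions that contain every item of itemset."""
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)

def confidence(lhs, rhs):
    """Of the transactions containing lhs, the fraction that also contain rhs."""
    return support(lhs | rhs) / support(lhs)

milk_coke = (support({"Milk", "Coke"}), confidence({"Milk"}, {"Coke"}))
diaper_milk_beer = (support({"Diaper", "Milk", "Beer"}),
                    confidence({"Diaper", "Milk"}, {"Beer"}))
```

{Milk} --> {Coke} holds in 3 of the 4 transactions that contain Milk, and {Diaper, Milk} --> {Beer} holds in 2 of the 3 transactions that contain both Diaper and Milk.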
44
MINDS - Learning from Rare Class
• Problem: building models for rare network attacks (mining a needle in a haystack)
  – Standard data mining models are not suitable for rare classes
  – Models must be able to handle skewed class distributions
  – Learning from data streams: intrusions are sequences of events
45
MINDS - Anomaly Detection
• Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous behavior
  – Identify normal behavior
  – Construct a useful set of features
  – Define a similarity function
  – Use an outlier detection algorithm
      – Nearest neighbor approach
      – Density based schemes
      – Unsupervised Support Vector Machines (SVM)
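The nearest neighbor approach can be sketched as follows: each connection is scored by the distance to its k-th nearest neighbor, so points far from everything else get high scores. This is a generic illustration, not the MINDS implementation; the features and the choice of Euclidean distance are assumptions.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by its Euclidean distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical features per connection: (number of packets, number of bytes / 100)
connections = [(3, 1.2), (3, 1.5), (4, 1.4), (3, 1.3), (40, 9.0)]
scores = knn_outlier_scores(connections, k=2)
most_anomalous = max(range(len(connections)), key=scores.__getitem__)
```

Ranking connections by this score yields the kind of ordered list that an analyst, or a later summarization step, can work through from the top.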
46
Experimental Evaluation
[Figure: network → net-flow data collected using CISCO routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis. Annotations: 10-minute cycle, 2 million connections. SNORT, the open source signature-based network IDS (www.snort.org), is also shown.]
• Anomaly detection is applied 4 times a day, with a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Ranked connections from the anomaly detection system are labeled attack or normal and passed to the discriminating association pattern generator, which produces rules such as R1: TCP, DstPort=1863 → Attack … R100: TCP, DstPort=80 → Normal and updates the knowledge base.]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
• At first glance, Rule 1 appears to describe a Web scan
• Rule 2 indicates an attack on a specific machine
• Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
• This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
• Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
• Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
• This pattern indicates a large number of scans on ports 27374 (a signature of the SubSeven worm) and 12345 (a signature of the NetBus worm)
• Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
• This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
• Port 6667 is where IRC (Internet Relay Chat) is typically run
• Further analysis reveals that there are many small packets from/to various IRC servers around the world
• Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
• This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
• This pattern indicates a large number of anomalous TCP connections on port 1863
• Further analysis reveals that the remote IP block is owned by Hotmail
• Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
• Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
• SNORT has static knowledge, manually updated by human analysts
• MINDS anomaly detection algorithms are adaptive in nature
• MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
• Defining normal behavior
• Feature extraction
• Similarity functions
• Outlier detection
• Result summarization
• Detection of attacks originating from multiple sites
[Figure labels: worm/virus detection after infection; insider attack (policy violation); outsider attack (network intrusion)]
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
• RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
• The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China
• RIDS makes use of both intrusion detection techniques: misuse and anomaly detection
• A distance-based outlier detection algorithm is used to detect deviational behavior among the collected network data
• For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
• This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
• CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS), operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
• A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
• As an example, we envision a window where we observe a 3-event sequence: E, D, and F. An FER is generated as $E \rightarrow D, F$ with confidence level $c = \mathrm{freq}(a \cup b) / \mathrm{freq}(a) = 0.8$, where $a$ represents the event E on the LHS and $b$ corresponds to the two events D and F on the RHS of the rule.
• If $a$ occurs with 5% probability and the joint event of $a$ and $b$ has a 4% chance to occur, there is a $0.04 / 0.05 = 80\%$ chance that D and F will follow in the same window.
57
A cooperative anomaly and intrusion detection system (CAIDS)
• In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests, denoted by (service = smtp).
• Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:
$(\text{service} = \text{authentication}) \rightarrow (\text{service} = \text{smtp}), (\text{service} = \text{smtp}) \quad (0.8,\ 0.1,\ 2\ \text{sec})$   (1)
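The confidence and support of such a rule can be computed from windowed traffic, as in the sketch below. It assumes the connections have already been grouped into 2-second windows, takes confidence as the fraction of windows containing the LHS in which the RHS follows, and takes support as the fraction of all windows containing the whole episode. The window data is fabricated to reproduce the numbers in rule (1).

```python
def contains_episode(window, episode):
    """True if the events of episode occur in window in the given order (gaps allowed)."""
    remaining = iter(window)
    return all(event in remaining for event in episode)

def fer_stats(windows, lhs, rhs):
    """Return (confidence, support) of the episode rule lhs -> rhs."""
    lhs_count = sum(1 for w in windows if contains_episode(w, lhs))
    joint_count = sum(1 for w in windows if contains_episode(w, lhs + rhs))
    support = joint_count / len(windows)
    confidence = joint_count / lhs_count if lhs_count else 0.0
    return confidence, support

# Hypothetical traffic: 40 windows, 5 contain an authentication event,
# and in 4 of those two smtp requests follow.
windows = (
    [["authentication", "smtp", "smtp"]] * 4
    + [["authentication", "http"]]
    + [["http"]] * 35
)
confidence, support = fer_stats(windows, ["authentication"], ["smtp", "smtp"])
```

With these windows the rule comes out at c = 0.8 and s = 0.1.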
58
A cooperative anomaly and intrusion detection system (CAIDS)
• An association rule is aimed at finding interesting intra-relationships inside a single connection record
• In general, an FER is specified by the following expression:
$L_1, L_2, \ldots, L_n \rightarrow R_1, \ldots, R_m \quad (c,\ s,\ \text{window})$   (2)
• $L_i$ ($1 \le i \le n$) and $R_j$ ($1 \le j \le m$) are ordered traffic connection events
• We call $L_1, L_2, \ldots, L_n$ the LHS episode and $R_1, \ldots, R_m$ the RHS of the episode rule
59
A cooperative anomaly and intrusion detection system (CAIDS)
[Figure: Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]
60
Conclusion
• In this report, we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
• We summarized those system models, their technologies, and their validation methods
• We hope to give an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
• DARPA 1998 data set; a cleansed set is in KDD Cup '99. The DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
• Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining”. Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
• Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
• W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
• Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.
• Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
• Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
10
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
11
Why do we need Data Mining
Despite the enormous amount of data particular events of interest are still quite rare frequency ranges from 01 to less than 10
We are drowning in data but starving for knowledge1048714
12
Data Mining vs KDD
Knowledge Discovery in Databases (KDD) The whole process of finding useful information and patterns in data
Data Mining Use of algorithms to extract the information and patterns derived by the KDD process
Data mining is the core of the knowledge discovery process
13
KDD Process
Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform
to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in
meaningful manner
14
Data Mining A KDD Processndash Data mining core of
knowledge discovery process
Data Cleaning
Data Integration
Databases
Data Warehouse
Task-relevant Data
Selection
Data Mining
Pattern Evaluation
15
Typical Data Mining Architecture
Data Warehouse
Data cleaning amp data integration Filtering
Databases
Database or data warehouse server
Data mining engine
Pattern evaluation
Graphical user interface
Knowledge-base
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help
Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses
Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying
information)
19
Intrusion Detection
Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks
LimitationsSignature database has to be manually revised
for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created
signatures across the computer system
20
Data Mining for Intrusion Detection Techniques and Applications
Frequent pattern mining Classification Clustering Mining data streams
21
Patterns that occur frequently in a database
Mining Frequent patterns ndash finding regularities
Process of Mining Frequent patterns for intrusion de
tection Phase I mine a repository of normal frequent itemsets for a
ttack-free data
Phase II find frequent itemsets in the last n connections an
d compare the patterns to the normal profile
Frequent pattern mining
22
Frequent pattern mining
Apriori bull Any subset of a frequent itemset must be also freque
nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent
bull No superset of any infrequent itemset should be generated or tested
ndash Many item combinations can be pruned
23
Sequential Pattern Analysis
Models sequence patterns (Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns (frequent) sequential patterns
Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification A Two-Step Process Model construction describe a set of predetermined
classes Training dataset tuples for model construction
Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae
Model application classify unseen objects Estimate accuracy of the model using an independent test
set Acceptable accuracy apply the model to classify data tu
ples with unknown class labels
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering High intra-class similarity and low interclasssimilar
ity Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches Partitioning algorithms
ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin
g Hierarchy algorithms
ndash Agglomerative each object is a cluster merge clusters to form larger ones
ndash Divisive all objects are in a cluster split it up into smaller clusters
33
Clustering
K-Means Example
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities The profiles of normal activities may drift
Identifying novel attacks Identifying clusters and outliers in traffic data
streams Reduce the future alarm load by writing
filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detectionPredictive models are built from labeled data sets (instances
are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually
created signatures Recent research eg JAM (Java Agents for Metalearning)
36
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model
Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection
Identifies anomalies as deviations from "normal" behavior
E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (e.g. CPU, process size) comparing the normal profile with abnormal values; a large deviation is a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
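The "baseline, then look for deviations" idea can be sketched with a simple statistical profile. The z-score test and the threshold of 3 standard deviations are assumptions chosen for illustration; the slides do not prescribe a particular measure.

```python
from statistics import mean, stdev


def build_profile(samples):
    """Summarize normal activity as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, profile, threshold=3.0):
    """Flag a measurement that lies more than `threshold` deviations from the mean."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical CPU utilization readings (percent) taken during normal operation.
normal_cpu = [20, 22, 19, 21, 23, 20, 18, 22]
profile = build_profile(normal_cpu)

print(is_anomalous(21, profile))  # False
print(is_anomalous(85, profile))  # True
```

A legitimate new workload that pushes CPU to 85% would be flagged in the same way, which is the false positive problem noted above.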
40
ADAM: Audit Data Analysis and Mining
Detecting intrusion by data mining: a combination of association rules and classification
First, ADAM collects known frequent itemsets (an off-line algorithm)
Second, ADAM runs an online algorithm:
– Finds the latest frequent connection records
– Compares them with the known mined data
– Discards those which seem to be normal
– Forwards the suspicious ones to the classifier
The trained classifier then classifies the suspicious data as one of the following:
– Known type of attack
– Unknown type of attack
– False alarm
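The online step can be sketched as follows. This is a simplified illustration, not ADAM's actual implementation: each connection record is treated as one itemset (a real miner would also consider sub-itemsets), and the field names, the profile contents and the support threshold are assumptions.

```python
from collections import Counter


def suspicious_itemsets(window, normal_profile, min_support):
    """Return itemsets that are frequent in the window but absent from the normal profile."""
    counts = Counter(frozenset(rec.items()) for rec in window)
    threshold = min_support * len(window)
    return [
        dict(items)
        for items, n in counts.items()
        if n >= threshold and items not in normal_profile
    ]


# Profile mined off-line from attack-free data (hypothetical content).
normal_profile = {frozenset({("dst_port", 80), ("flag", "SF")})}

# Most recent connection records seen by the online algorithm.
window = (
    [{"dst_port": 80, "flag": "SF"}] * 3
    + [{"dst_port": 31337, "flag": "S0"}] * 3
    + [{"dst_port": 22, "flag": "SF"}]
)

print(suspicious_itemsets(window, normal_profile, min_support=0.3))
# [{'dst_port': 31337, 'flag': 'S0'}]
```

Only the itemsets returned here would be handed to the trained classifier, which then decides between a known attack, an unknown attack and a false alarm.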
41
ADAM: Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
1st phase: train the classifier. This is an offline process that takes place only once, before the main experiment
2nd phase: use the trained classifier. The trained classifier is then used to detect anomalies in an online process
43
The MINDS Project
MINDS – MINnesota INtrusion Detection System
Learning from Rare Class – Building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules Discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
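As a worked check of the two rules, the sketch below computes support and confidence from the five transactions in the table. The helper names are illustrative, and the definitions are the usual ones: support is the fraction of transactions containing all items of the rule, and confidence is the fraction of transactions containing the left-hand side that also contain the right-hand side.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions)


def support(itemset):
    return count(itemset) / len(transactions)


def confidence(lhs, rhs):
    return count(lhs | rhs) / count(lhs)


print(support({"Milk", "Coke"}), confidence({"Milk"}, {"Coke"}))  # 0.6 0.75
print(support({"Diaper", "Milk", "Beer"}))                        # 0.4
print(round(confidence({"Diaper", "Milk"}, {"Beer"}), 3))         # 0.667
```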
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior
Identify normal behavior
Construct useful set of features
Define similarity function
Use outlier detection algorithm
– Nearest neighbor approach
– Density based schemes
– Unsupervised Support Vector Machines (SVM)
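A minimal sketch of the nearest neighbor approach: score each point by the distance to its k-th nearest neighbor, so that isolated points receive high scores. Euclidean distance, k = 2 and the two-dimensional feature vectors are assumptions made for the illustration; MINDS defines its own features and similarity function.

```python
import math


def knn_outlier_score(point, others, k=2):
    """Distance from `point` to its k-th nearest neighbor among `others`."""
    dists = sorted(math.dist(point, o) for o in others)
    return dists[k - 1]


def rank_outliers(points, k=2):
    """Return (score, point) pairs, most anomalous first."""
    scored = []
    for i, p in enumerate(points):
        others = points[:i] + points[i + 1:]
        scored.append((knn_outlier_score(p, others, k), p))
    return sorted(scored, reverse=True)


# Hypothetical feature vectors, one per connection.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (10, 10)]
print(rank_outliers(points)[0][1])  # (10, 10)
```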
46
Experimental Evaluation
[Figure: MINDS evaluation setup. Net-flow data collected from the network using Cisco routers goes through data preprocessing into MINDS anomaly detection; the resulting anomaly scores feed association pattern analysis. Snort, an open source signature-based network IDS (www.snort.org), appears alongside. Annotations: 10-minute cycle, 2 million connections; anomaly detection is applied 4 times a day on a 10-minute time window]
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. The anomaly detection system outputs ranked connections labeled attack or normal; the discriminating association pattern generator turns them into rules, from "R1: TCP, DstPort=1863 → Attack" through "R100: TCP, DstPort=80 → Normal", which update the knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=177, c2=0)
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack: policy violation
Outsider attack: network intrusion
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse and anomaly detection
A distance-based outlier detection algorithm is used to detect deviant behavior among the collected network data
For misuse detection it has a very large set of collected data patterns, which can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
For example, we envision a window where we observe a 3-event sequence E, D and F
An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If a occurs in 5% of the windows and the joint event a and b occurs in 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)
The events D, F may be two sequential smtp requests denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec
The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated
This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
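The numbers in rule (1) can be reproduced with a small calculation. The sketch assumes the usual definitions (confidence is the joint frequency divided by the frequency of the LHS; support is the joint frequency over all windows evaluated); the EpisodeRule class and the window counts are hypothetical and chosen only so that the result matches c = 0.8 and s = 0.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeRule:
    lhs: tuple          # ordered events on the left-hand side
    rhs: tuple          # ordered events on the right-hand side
    confidence: float   # c
    support: float      # s
    window_sec: float   # window size in seconds


def rule_stats(lhs_count, joint_count, total_windows):
    """Confidence and support of LHS -> RHS from window counts."""
    confidence = joint_count / lhs_count
    support = joint_count / total_windows
    return confidence, support


# Hypothetical counts: 1000 windows, 125 contain the authentication event,
# and 100 of those are followed by two smtp requests.
c, s = rule_stats(lhs_count=125, joint_count=100, total_windows=1000)
rule = EpisodeRule(
    lhs=("service=authentication",),
    rhs=("service=smtp", "service=smtp"),
    confidence=c,
    support=s,
    window_sec=2.0,
)
print(rule.confidence, rule.support, rule.window_sec)  # 0.8 0.1 2.0
```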
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We have summarized those system models, their technologies and their validation methods
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set is available in KDD Cup '99; the DARPA 1999 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M. 2006. "A Hybrid Network Intrusion Detection Technique Using Random Forests". In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al. "A data mining framework for building intrusion detection models". In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al. "MINDS - Minnesota Intrusion Detection System". Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A.P., Manalwar, P. "Exploiting efficient data mining techniques to enhance intrusion detection systems". Information Reuse and Integration (IRI-2005), IEEE International Conference on, 15-17 Aug. 2005, pages 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category). In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
11
Why do we need Data Mining?
Despite the enormous amount of data, particular events of interest are still quite rare: their frequency ranges from 0.1% to less than 10%
We are drowning in data but starving for knowledge
12
Data Mining vs. KDD
Knowledge Discovery in Databases (KDD): the whole process of finding useful information and patterns in data
Data Mining: use of algorithms to extract the information and patterns derived by the KDD process
Data mining is the core of the knowledge discovery process
13
KDD Process
Selection: obtain data from various sources
Preprocessing: cleanse data
Transformation: convert to common format; transform to new format
Data Mining: obtain desired results
Interpretation/Evaluation: present results to user in meaningful manner
14
Data Mining: A KDD Process
– Data mining: core of knowledge discovery process
[Figure: KDD pipeline. Databases → data cleaning and data integration → data warehouse → selection → task-relevant data → data mining → pattern evaluation]
15
Typical Data Mining Architecture
[Figure: layered architecture, from bottom to top: databases and data warehouse (with data cleaning, data integration and filtering); database or data warehouse server; data mining engine; pattern evaluation; graphical user interface. A knowledge base sits alongside the layers]
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help?
Learn from traffic data
– Supervised learning: learn precise models from past intrusions
– Unsupervised learning: identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites
– Helps detect sophisticated attacks not identifiable by single-site analyses
Analysis of long-term data (months/years)
– Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)
19
Intrusion Detection
Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks
Limitations
– Signature database has to be manually revised for each new type of discovered intrusion
– They cannot detect emerging cyber threats
– Substantial latency in deployment of newly created signatures across the computer system
20
Data Mining for Intrusion Detection: Techniques and Applications
Frequent pattern mining
Classification
Clustering
Mining data streams
21
Frequent pattern mining
Patterns that occur frequently in a database
Mining frequent patterns – finding regularities
Process of mining frequent patterns for intrusion detection
– Phase I: mine a repository of normal frequent itemsets for attack-free data
– Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
22
Frequent pattern mining
Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
– A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
– If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
– Many item combinations can be pruned
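The pruning rule can be sketched in a compact level-wise miner. This is a minimal, unoptimized illustration of the Apriori idea; the function name, the absolute min_count threshold and the sample transactions are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return all itemsets contained in at least `min_count` transactions."""
    items = sorted({i for t in transactions for i in t})
    current = [
        frozenset([i]) for i in items
        if sum(i in t for t in transactions) >= min_count
    ]
    frequent = list(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: only the surviving candidates are tested against the data.
        current = [
            c for c in candidates
            if sum(c <= t for t in transactions) >= min_count
        ]
        frequent.extend(current)
        k += 1
    return frequent


baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts"},
    {"nuts", "milk"},
]
result = apriori(baskets, min_count=2)
print(frozenset({"beer", "diaper", "nuts"}) in result)  # True
print(any("milk" in s for s in result))                 # False
```

Because milk is infrequent on its own, no itemset containing milk is ever generated or counted.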
23
Sequential Pattern Analysis
Models sequence patterns
(Temporal) order is important in many situations
– Time-series databases and sequence databases
– Frequent patterns → (frequent) sequential patterns
Sequential patterns for intrusion detection
– Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
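A brute-force sketch of this task for short patterns: enumerate every candidate sequence up to a maximum length and keep those that occur as a subsequence in enough input sequences. Real miners prune candidates instead of enumerating them all; the event names and the thresholds are assumptions made for the illustration.

```python
from itertools import product


def is_subsequence(pattern, sequence):
    """True if the events of `pattern` occur in `sequence` in the same order."""
    it = iter(sequence)
    return all(event in it for event in pattern)


def frequent_subsequences(sequences, min_count, max_len=3):
    """Map each frequent pattern (a tuple of events) to its support count."""
    events = sorted({e for s in sequences for e in s})
    found = {}
    for length in range(1, max_len + 1):
        for pattern in product(events, repeat=length):
            n = sum(is_subsequence(pattern, s) for s in sequences)
            if n >= min_count:
                found[pattern] = n
    return found


# Hypothetical packet-flag sequences, one per connection.
flows = [
    ["syn", "syn", "ack", "fin"],
    ["syn", "ack", "rst"],
    ["syn", "rst"],
]
patterns = frequent_subsequences(flows, min_count=2, max_len=2)
print(patterns[("syn", "ack")])      # 2
print(("ack", "rst") in patterns)    # False
```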
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
Model construction: describe a set of predetermined classes
– Training dataset: tuples for model construction
– Each tuple/sample belongs to a predefined class
– Classification rules, decision trees or math formulae
Model application: classify unseen objects
– Estimate accuracy of the model using an independent test set
– Acceptable accuracy → apply the model to classify data tuples with unknown class labels
27
Classification
28
Classification: Decision Tree
A node in the tree: a test of some attribute
A branch: a possible value of the attribute
Classification
– Start at the root
– Test the attribute
– Move down the tree branch
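The three classification steps can be sketched as a walk over a nested dictionary. The tree below is hand-written and purely hypothetical (it was not learned from data); the attribute names and the "unknown" fallback are assumptions.

```python
# Internal nodes test an attribute; leaves are class labels.
tree = {
    "attribute": "protocol",
    "branches": {
        "icmp": "normal",
        "tcp": {
            "attribute": "flag",
            "branches": {"SF": "normal", "S0": "intrusive"},
        },
    },
}


def classify(node, record, default="unknown"):
    """Start at the root, test the node's attribute, follow the matching branch."""
    while isinstance(node, dict):
        value = record.get(node["attribute"])
        node = node["branches"].get(value, default)
    return node


print(classify(tree, {"protocol": "tcp", "flag": "S0"}))  # intrusive
print(classify(tree, {"protocol": "icmp"}))               # normal
print(classify(tree, {"protocol": "udp"}))                # unknown
```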
29
Neural classification: HIDE
"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.
Five major components
– Probes collect traffic data
– Event preprocessor preprocesses traffic data and feeds the statistical model
– Statistical processor maintains a model for normal activities and generates vectors for new events
– Neural network classifies the vectors of new events
– Post processor generates reports
30
Clustering
What Is Clustering?
Group data into clusters
– Similar to one another within the same cluster
– Dissimilar to the objects in other clusters
– Unsupervised learning: no predefined classes
31
Clustering
What Is A Good Clustering?
High intra-class similarity and low inter-class similarity
– Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches
Partitioning algorithms
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering
Hierarchy algorithms
– Agglomerative: each object is a cluster; merge clusters to form larger ones
– Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
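A minimal k-means sketch, as an instance of the partitioning approach: assign each point to its nearest centroid, recompute the centroids, and repeat. The fixed iteration count, the caller-supplied starting centroids and the sample points are assumptions made to keep the example deterministic.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Return (centroids, clusters) after a fixed number of reallocation rounds."""
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (9, 8)])
print(clusters[0])  # [(1, 1), (1, 2), (2, 1)]
print(clusters[1])  # [(8, 8), (9, 8), (8, 9)]
```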
12
Data Mining vs KDD
Knowledge Discovery in Databases (KDD) The whole process of finding useful information and patterns in data
Data Mining Use of algorithms to extract the information and patterns derived by the KDD process
Data mining is the core of the knowledge discovery process
13
KDD Process
Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform
to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in
meaningful manner
14
Data Mining A KDD Processndash Data mining core of
knowledge discovery process
Data Cleaning
Data Integration
Databases
Data Warehouse
Task-relevant Data
Selection
Data Mining
Pattern Evaluation
15
Typical Data Mining Architecture
Data Warehouse
Data cleaning amp data integration Filtering
Databases
Database or data warehouse server
Data mining engine
Pattern evaluation
Graphical user interface
Knowledge-base
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help
Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses
Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying
information)
19
Intrusion Detection
Traditional intrusion detection system (IDS) tools (e.g., SNORT) are based on signatures of known attacks
Limitations
  – The signature database has to be manually revised for each new type of discovered intrusion
  – They cannot detect emerging cyber threats
  – Substantial latency in deployment of newly created signatures across the computer system
20
Data Mining for Intrusion Detection: Techniques and Applications
• Frequent pattern mining
• Classification
• Clustering
• Mining data streams
21
Frequent pattern mining
Frequent patterns: patterns that occur frequently in a database
Mining frequent patterns – finding regularities
Process of mining frequent patterns for intrusion detection
  – Phase I: mine a repository of normal frequent itemsets for attack-free data
  – Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
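The two phases can be illustrated with a minimal Python sketch. It counts attribute-value combinations by brute force rather than with a real frequent-itemset miner, and the record fields (`service`, `flag`), the thresholds, and the function name `frequent_itemsets` are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(records, min_support, max_size=2):
    """Return itemsets (tuples of (field, value) pairs) whose relative
    support in `records` is at least `min_support`."""
    counts = Counter()
    for record in records:
        items = sorted(record.items())
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    threshold = min_support * len(records)
    return {itemset for itemset, n in counts.items() if n >= threshold}

# Phase I: build the normal profile from attack-free data (hypothetical records).
normal = [{"service": "http", "flag": "SF"}] * 8 + [{"service": "smtp", "flag": "SF"}] * 2
profile = frequent_itemsets(normal, min_support=0.2)

# Phase II: mine the last n connections and keep what the profile does not explain.
window = [{"service": "http", "flag": "SF"}] * 3 + [{"service": "telnet", "flag": "REJ"}] * 3
suspicious = frequent_itemsets(window, min_support=0.3) - profile
```

Here `suspicious` holds only the itemsets involving the rejected telnet connections, since the http itemsets are already part of the normal profile.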
22
Frequent pattern mining
Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  – A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  – If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
  – Many item combinations can be pruned
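The pruning idea can be sketched in a few lines of Python. This is a minimal level-wise Apriori written for illustration, not code from the slides; the basket data and the absolute `min_support` count are assumed.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support_count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate and keep those meeting the support threshold.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: build (k+1)-item candidates from frequent k-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "nuts"},
    {"diaper", "milk"},
]
result = apriori(baskets, min_support=2)
```

In this toy data {diaper, nuts} is infrequent, so the superset {beer, diaper, nuts} is pruned without ever being counted.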
23
Sequential Pattern Analysis
• Models sequence patterns
  – (Temporal) order is important in many situations
  – Time-series databases and sequence databases
• Frequent patterns → (frequent) sequential patterns
• Sequential patterns for intrusion detection
  – Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
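A brute-force Python sketch of this task follows. It enumerates every order-preserving subsequence up to a length limit, which is only practical for tiny inputs; real sequential pattern miners prune candidates far more aggressively. The session data and the names `is_subsequence` and `frequent_subsequences` are hypothetical.

```python
from itertools import combinations

def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(event in it for event in pattern)

def frequent_subsequences(sequences, min_support, max_len=3):
    """Return {pattern: number of sequences containing it}."""
    candidates = set()
    for seq in sequences:
        for length in range(1, max_len + 1):
            # combinations() preserves order, so each tuple is a subsequence.
            candidates.update(combinations(seq, length))
    result = {}
    for pattern in candidates:
        support = sum(is_subsequence(pattern, seq) for seq in sequences)
        if support >= min_support:
            result[pattern] = support
    return result

sessions = [
    ("syn", "auth", "smtp", "smtp"),
    ("syn", "auth", "smtp"),
    ("syn", "scan", "scan"),
]
patterns = frequent_subsequences(sessions, min_support=2)
```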
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
• Model construction: describe a set of predetermined classes
  – Training dataset: tuples for model construction
  – Each tuple/sample belongs to a predefined class
  – Classification rules, decision trees, or math formulae
• Model application: classify unseen objects
  – Estimate accuracy of the model using an independent test set
  – Acceptable accuracy: apply the model to classify data tuples with unknown class labels
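As a rough illustration of the two steps, the Python sketch below trains a one-attribute rule model (a majority vote per attribute value, chosen here only because it is tiny) and then estimates accuracy on a separate test set. The data, the `flag` attribute, and the default label are assumptions.

```python
from collections import Counter, defaultdict

def train_one_rule(rows, attribute):
    """Step 1, model construction: map each attribute value to its majority class."""
    votes = defaultdict(Counter)
    for features, label in rows:
        votes[features[attribute]][label] += 1
    return {value: counter.most_common(1)[0][0] for value, counter in votes.items()}

def accuracy(model, rows, attribute, default="normal"):
    """Step 2, model application: fraction of test tuples labeled correctly."""
    hits = sum(model.get(f[attribute], default) == label for f, label in rows)
    return hits / len(rows)

train = [
    ({"flag": "SF"}, "normal"),
    ({"flag": "SF"}, "normal"),
    ({"flag": "SF"}, "intrusion"),
    ({"flag": "REJ"}, "intrusion"),
    ({"flag": "REJ"}, "intrusion"),
]
test = [
    ({"flag": "SF"}, "normal"),
    ({"flag": "REJ"}, "intrusion"),
    ({"flag": "REJ"}, "normal"),
    ({"flag": "S0"}, "normal"),   # unseen value falls back to the default label
]
model = train_one_rule(train, "flag")
test_accuracy = accuracy(model, test, "flag")
```

If the estimated accuracy is acceptable, the same `model` would then be applied to tuples whose class labels are unknown.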
27
Classification
28
Classification: Decision Tree
• A node in the tree: a test of some attribute
• A branch: a possible value of the attribute
• Classification
  – Start at the root
  – Test the attribute
  – Move down the tree branch
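The traversal described above can be sketched as follows in Python. The tree itself (attributes `protocol` and `flag`, and the leaf labels) is a made-up example, and the nested-dictionary representation is just one convenient assumption.

```python
def classify(tree, record):
    """Start at the root, test the node's attribute, follow the matching branch,
    and repeat until a leaf (a plain class label) is reached."""
    node = tree
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node

# Hypothetical tree: inner nodes test an attribute, leaves are class labels.
tree = {
    "attribute": "protocol",
    "branches": {
        "icmp": "normal",
        "tcp": {
            "attribute": "flag",
            "branches": {"SF": "normal", "REJ": "intrusion"},
        },
    },
}
label = classify(tree, {"protocol": "tcp", "flag": "REJ"})
```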
29
Neural classification: HIDE
“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.
Five major components
  – Probes: collect traffic data
  – Event preprocessor: preprocesses traffic data and feeds the statistical model
  – Statistical processor: maintains a model for normal activities and generates vectors for new events
  – Neural network: classifies the vectors of new events
  – Post processor: generates reports
30
Clustering
What Is Clustering?
Group data into clusters
  – Similar to one another within the same cluster
  – Dissimilar to the objects in other clusters
  – Unsupervised learning: no predefined classes
31
Clustering
What Is a Good Clustering?
• High intra-class similarity and low inter-class similarity
  – Depending on the similarity measure
• The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches
• Partitioning algorithms
  – Partition the objects into k clusters
  – Iteratively reallocate objects to improve the clustering
• Hierarchy algorithms
  – Agglomerative: each object is a cluster; merge clusters to form larger ones
  – Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
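Since the slide's figure did not survive, here is a small Python sketch of k-means as a partitioning algorithm: assign each point to its nearest centroid, recompute the centroids, and repeat. The sample points, the fixed iteration count, and the naive choice of the first k points as initial centroids are all assumptions made to keep the example deterministic.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means on tuples of floats; returns (centroids, clusters)."""
    centroids = [points[i] for i in range(k)]  # naive initialization (assumption)
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
```

With these points the two clusters separate cleanly into the group near (1, 1) and the group near (9, 9).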
34
Mining Data Streams for Intrusion Detection
• Maintaining profiles of normal activities
  – The profiles of normal activities may drift
• Identifying novel attacks
  – Identifying clusters and outliers in traffic data streams
• Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detection
• Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
• These models can be more sophisticated and precise than manually created signatures
• Recent research: e.g., JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against known Intrusion Patterns; a match is reported as an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then “land attack”
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity; modification of system files; permission modifications
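The land attack example can be written as a tiny signature matcher in Python. Only the `src_ip == dst_ip` rule comes from the slide; the packet dictionary layout and the names `SIGNATURES` and `match_signatures` are assumptions for illustration.

```python
def match_signatures(packet, signatures):
    """Return the names of all signatures whose predicate matches the packet."""
    return [name for name, predicate in signatures.items() if predicate(packet)]

# Each signature is a named predicate over a packet (a dict of header fields).
SIGNATURES = {
    "land attack": lambda p: p["src_ip"] == p["dst_ip"],
}

alerts = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"}, SIGNATURES)
```

A packet that matches no entry produces an empty list, which is exactly why this approach cannot flag attacks that have no signature yet.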
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection
• Identifies anomalies as deviations from “normal” behavior
• E.g., ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (CPU, Process Size) on a 0–90 scale, comparing the normal profile with abnormal values that indicate a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
40
ADAM: Audit Data Analysis and Mining
Detecting intrusion by data mining: a combination of association rules and classification rules
• First, ADAM collects known frequent datasets with an off-line algorithm
• Second, ADAM runs an online algorithm
  – Finds the most recent frequent connection records
  – Compares them with the known mined data
  – Discards those that seem to be normal
  – Forwards the suspicious ones to the classifier
• The trained classifier then classifies the suspicious data as one of the following
  – Known type of attack
  – Unknown type of attack
  – False alarm
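The online step can be pictured with the short Python sketch below. It is not ADAM's implementation: the itemset encoding as strings, the function `adam_online_step`, and especially `toy_classifier` (a hand-written stand-in for the trained classifier) are hypothetical.

```python
def adam_online_step(window_itemsets, normal_profile, classify):
    """Discard itemsets already in the normal profile, classify the rest."""
    suspicious = [s for s in window_itemsets if s not in normal_profile]
    return {s: classify(s) for s in suspicious}

def toy_classifier(itemset):
    # Hypothetical stand-in for the trained classifier.
    if "dst_port=27374" in itemset:
        return "known attack"
    if "flag=REJ" in itemset:
        return "unknown attack"
    return "false alarm"

normal_profile = {frozenset({"service=http", "flag=SF"})}
window_itemsets = [
    frozenset({"service=http", "flag=SF"}),       # matches the profile, discarded
    frozenset({"dst_port=27374", "flag=SYN"}),
    frozenset({"service=telnet", "flag=REJ"}),
    frozenset({"service=dns", "flag=SF"}),
]
verdicts = adam_online_step(window_itemsets, normal_profile, toy_classifier)
```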
41
ADAM Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
• 1st phase: train the classifier
  – Offline process
  – Takes place only once, before the main experiment
• 2nd phase: use the trained classifier
  – The trained classifier is then used to detect anomalies
  – Online process
43
The MINDS Project
MINDS – MINnesota INtrusion Detection System
• Learning from Rare Class – building rare class prediction models
• Anomaly/outlier detection
• Summarization of attacks using association pattern analysis

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules discovered:
  {Milk} → {Coke}
  {Diaper, Milk} → {Beer}
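To make the two rules concrete, the Python sketch below computes their support counts and confidence from the five transactions in the table. The helper names `count` and `confidence` are illustrative; the standard definition confidence = count(LHS ∪ RHS) / count(LHS) is assumed.

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}

def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for items in transactions.values() if itemset <= items)

def confidence(lhs, rhs):
    """Fraction of transactions containing `lhs` that also contain `rhs`."""
    return count(lhs | rhs) / count(lhs)

milk_to_coke = confidence({"Milk"}, {"Coke"})                   # 3 of 4 Milk baskets
diaper_milk_to_beer = confidence({"Diaper", "Milk"}, {"Beer"})  # 2 of 3 baskets
```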
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e., anomalous behavior
• Identify normal behavior
• Construct useful set of features
• Define similarity function
• Use outlier detection algorithm
  – Nearest neighbor approach
  – Density-based schemes
  – Unsupervised Support Vector Machines (SVM)
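As one illustration of the nearest neighbor approach, the Python sketch below scores each point by the distance to its k-th nearest neighbor, so isolated points get high scores. This is a generic textbook scheme, not the specific algorithm MINDS uses; the feature vectors and `k` are assumed.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores

# Hypothetical two-feature connection records; the last one is far from the rest.
flows = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (10.0, 10.0)]
scores = knn_outlier_scores(flows, k=2)
most_anomalous = scores.index(max(scores))
```

Euclidean distance plays the role of the similarity function here; a real deployment would choose features and a distance suited to network flows.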
46
Experimental Evaluation
[Figure: MINDS evaluation pipeline: network, net-flow data collected using Cisco routers, data preprocessing, MINDS anomaly detection, anomaly scores, association pattern analysis. Snort, the open source signature-based network IDS (www.snort.org), also appears in the diagram. Annotations: 10-minute cycle, 2 million connections; anomaly detection is applied 4 times a day on a 10-minute time window]
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Ranked connections from the Anomaly Detection System, labeled attack or normal, feed a Discriminating Association Pattern Generator, which produces rules such as R1: TCP, DstPort=1863 → Attack; …; R100: TCP, DstPort=80 → Normal, and updates the Knowledge Base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
• At first glance, Rule 1 appears to describe a Web scan
• Rule 2 indicates an attack on a specific machine
• Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
49
Discovered Real-life Association Patterns… (ctd)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
• This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
• Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
• Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
• This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
• Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
• This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
• Port 6667 is where IRC (Internet Relay Chat) is typically run
• Further analysis reveals that there are many small packets from/to various IRC servers around the world
• Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
  – This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
• This pattern indicates a large number of anomalous TCP connections on port 1863
• Further analysis reveals that the remote IP block is owned by Hotmail
• Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
• Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
• SNORT has static knowledge manually updated by human analysts
• MINDS anomaly detection algorithms are adaptive in nature
• MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
• Defining normal behavior
• Feature extraction
• Similarity functions
• Outlier detection
• Result summarization
• Detection of attacks originating from multiple sites
[Figure labels: Worm/virus detection after infection; Insider attack, Policy violation; Outsider attack, Network intrusion]
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
A distance-based outlier detection algorithm is used to detect deviant behavior among the collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
For example, we envision a window where we observe a 3-event sequence: E, D, and F
An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If b occurs with 5% probability and the joint event a and b occurs with 4% probability, there is a (0.04 / 0.05) = 80% chance that D and F will follow in the same window
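The arithmetic above can be reproduced with a small Python sketch that follows the slide's ratio freq(a ∪ b) / freq(b) literally. The 100 windows below are invented so that the joint event appears in 4% of them and b in 5%; each window is modeled as a set of event names, so event order inside a window is ignored for simplicity.

```python
def fer_confidence(windows, lhs, rhs):
    """Ratio of windows containing lhs and rhs to windows containing rhs,
    mirroring freq(a U b) / freq(b) as written on the slide."""
    joint = sum(1 for w in windows if lhs <= w and rhs <= w)
    rhs_only = sum(1 for w in windows if rhs <= w)
    return joint / rhs_only

windows = (
    [{"E", "D", "F"}] * 4    # joint event a and b: 4% of windows
    + [{"D", "F"}] * 1       # b without a, so b occurs in 5% overall
    + [{"E"}] * 95           # remaining traffic
)
confidence = fer_confidence(windows, lhs={"E"}, rhs={"D", "F"})
```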
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes: (service = authentication, flag = SF)
The events D, F may be two sequential smtp requests denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec
The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated
This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)   (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window)   (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
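One way to hold an FER of form (2) in code is sketched below in Python, using the values of rule (1) as the example. The class name `FrequentEpisodeRule`, its `matches` method, and the `(timestamp, event)` trace format are assumptions for illustration; CAIDS itself may represent and evaluate rules differently.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: tuple         # ordered LHS events L1..Ln
    rhs: tuple         # ordered RHS events R1..Rm
    confidence: float  # c
    support: float     # s
    window: float      # window length in seconds

    def matches(self, events):
        """True if the LHS followed by the RHS occurs, in order, within
        `window` seconds. `events` is a time-sorted list of (timestamp, name)."""
        pattern = self.lhs + self.rhs
        for start, (t0, name) in enumerate(events):
            if name != pattern[0]:
                continue
            matched = 1
            for t, later in events[start + 1:]:
                if t - t0 > self.window:
                    break
                if matched < len(pattern) and later == pattern[matched]:
                    matched += 1
            if matched == len(pattern):
                return True
        return False

rule = FrequentEpisodeRule(("authentication",), ("smtp", "smtp"), 0.8, 0.1, 2.0)
fast = [(0.0, "authentication"), (0.5, "smtp"), (1.4, "smtp")]
slow = [(0.0, "authentication"), (0.5, "smtp"), (3.0, "smtp")]
hit, miss = rule.matches(fast), rule.matches(slow)
```

The second trace fails only because its last smtp request arrives after the 2-second window has closed.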
59
A cooperative anomaly and intrusion detection system (CAIDS)
[Figure: Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We summarized those system models, their technologies, and their validation methods
We hope this gives an overview of current development in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set; a cleansed set in KDD Cup '99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.
Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration, 2005 (IRI-2005), IEEE International Conference on, 15-17 Aug. 2005, pages 512-517.
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
13
KDD Process
Selection Obtain data from various sources Preprocessing Cleanse data Transformation Convert to common format Transform
to new format Data Mining Obtain desired results InterpretationEvaluation Present results to user in
meaningful manner
14
Data Mining A KDD Processndash Data mining core of
knowledge discovery process
Data Cleaning
Data Integration
Databases
Data Warehouse
Task-relevant Data
Selection
Data Mining
Pattern Evaluation
15
Typical Data Mining Architecture
Data Warehouse
Data cleaning amp data integration Filtering
Databases
Database or data warehouse server
Data mining engine
Pattern evaluation
Graphical user interface
Knowledge-base
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help
Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses
Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying
information)
19
Intrusion Detection
Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks
LimitationsSignature database has to be manually revised
for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created
signatures across the computer system
20
Data Mining for Intrusion Detection Techniques and Applications
Frequent pattern mining Classification Clustering Mining data streams
21
Patterns that occur frequently in a database
Mining Frequent patterns ndash finding regularities
Process of Mining Frequent patterns for intrusion de
tection Phase I mine a repository of normal frequent itemsets for a
ttack-free data
Phase II find frequent itemsets in the last n connections an
d compare the patterns to the normal profile
Frequent pattern mining
22
Frequent pattern mining
Apriori bull Any subset of a frequent itemset must be also freque
nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent
bull No superset of any infrequent itemset should be generated or tested
ndash Many item combinations can be pruned
23
Sequential Pattern Analysis
Models sequence patterns (Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns (frequent) sequential patterns
Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification A Two-Step Process Model construction describe a set of predetermined
classes Training dataset tuples for model construction
Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae
Model application classify unseen objects Estimate accuracy of the model using an independent test
set Acceptable accuracy apply the model to classify data tu
ples with unknown class labels
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering High intra-class similarity and low interclasssimilar
ity Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches Partitioning algorithms
ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin
g Hierarchy algorithms
ndash Agglomerative each object is a cluster merge clusters to form larger ones
ndash Divisive all objects are in a cluster split it up into smaller clusters
33
Clustering
K-Means Example
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities The profiles of normal activities may drift
Identifying novel attacks Identifying clusters and outliers in traffic data
streams Reduce the future alarm load by writing
filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detectionPredictive models are built from labeled data sets (instances
are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually
created signatures Recent research eg JAM (Java Agents for Metalearning)
36
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model
Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt
rusion Detection System
39
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
40
ADAM Audit Data Analysis and Mining
Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line
algorithm Secondly ADAM runs an online algorithm
Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one
of the following Known type of attack Unknown type of attack False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM has two phases in their model
1st Phase Train the classifier Offline process Takes place only once Before the main experiment
2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process
43
The MINDS Project
MINDS ndash MINnesota INtrusion Detection System
Learning from Rare Class ndash Building rare class prediction models
Anomalyoutlier detection
Summarization of attacks using association pattern analysis
TID Items
1 Bread Coke Milk
2 Beer Bread
3 Beer Coke Diaper Milk
4 Beer Bread Diaper Milk
5 Coke Diaper Milk
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
44
MINDS - Learning from Rare Class
Problem Building models for rare network attacks (Mining needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
46
Experimental Evaluation
network
net-flow data using CISCO
routers
Data preprocessing
MINDSanomaly detection
helliphellip
Anomaly
scores Association pattern analysis
Open source signature-based network IDS wwwsnortor
g
10 minutes cycle
2 millions connections
Anomaly detection is applied
4 times a day
10 minutes time window
bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment
bull Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
Anomaly Detection System
attack
normal
R1 TCP DstPort=1863 Attack
hellip
hellip
hellip
hellip
R100 TCP DstPort=80 Normal
Discriminating Association
Pattern Generator
1 Build normal profile
2 Study changes in normal behavior
3 Create attack summary
4 Detect misuse behavior
5 Understand nature of the attack
update
Knowledge Base
Ranked connections
MINDS association analysis module
48
Discovered Real-life Association Patterns
At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first
followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)
Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)
49
DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
Discovered Real-life Association Patterns
50
SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)
SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)
SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)
helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)
Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window
Discovered Real-life Association Patternshellip(ctd)
51
DstPort=6667 Protocol=TCP (c1=254 c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets
fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged
as anomalous is interesting This might indicate that the IRC server has been taken down (by a
DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)
Discovered Real-life Association Patternshellip(ctd)
52
DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)
DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)
DstPort=1863 Protocol=TCP (c1=606 c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
Discovered Real-life Association Patternshellip(ctd)
53
MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be
detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks
originating from multiple sites
Wormvirus detectionafter infection
Insider attack Policy violation
Outsider attack Network intrusion
54
IDS Using both Misuse and Anomaly DetectionRIDS-100
RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China
The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China
RIDS make the use of both intrusion detection technique misuse and anomaly detection
Distance based outlier detection algorithm is used for detection deviational behavior among collected network data
For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection
This large amount of data pattern is scanned using data mining classification Decision Tree algorithm
httpwwwrising-globalcom
55
A cooperative anomaly and intrusiondetection system (CAIDS)
built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusiondetection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events
For an example we envision a window where we observe a 3-event sequence
E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t
o the two events D and F on the RHS of the rule
If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window
57
A Cooperative Anomaly and Intrusion Detection System (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes: (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp).
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window of w = 2 sec. The three joint traffic events have a support level of s = 10% among all the network connections being evaluated. This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
58
A Cooperative Anomaly and Intrusion Detection System (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record.
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS episode of the rule.
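To make the (c, s, window) notation concrete, here is a small sketch of how an FER could be represented and checked against a stream of connection events. The class and function names are hypothetical, events are reduced to a service name, and gaps between matched events are allowed as long as everything falls inside the window; none of these choices is taken from the CAIDS implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: tuple         # ordered events L1..Ln
    rhs: tuple         # ordered events R1..Rm
    confidence: float  # c
    support: float     # s
    window: float      # window length in seconds

def rule_fires(rule, events):
    """events: list of (timestamp, event) pairs sorted by timestamp.

    Returns True if the LHS episode followed by the RHS episode occurs,
    in order, within rule.window seconds of the first matched event.
    """
    pattern = rule.lhs + rule.rhs
    for i, (start, first) in enumerate(events):
        if first != pattern[0]:
            continue
        matched = 1
        for when, event in events[i + 1:]:
            if when - start > rule.window:
                break
            if matched < len(pattern) and event == pattern[matched]:
                matched += 1
        if matched == len(pattern):
            return True
    return False

# Rule (1): authentication followed by two smtp requests within 2 seconds.
rule = FrequentEpisodeRule(("authentication",), ("smtp", "smtp"), 0.8, 0.1, 2.0)
hits = [(0.0, "authentication"), (0.5, "http"), (0.9, "smtp"), (1.7, "smtp")]
misses = [(0.0, "authentication"), (0.9, "smtp"), (2.5, "smtp")]
print(rule_fires(rule, hits), rule_fires(rule, misses))  # True False
```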
59
A Cooperative Anomaly and Intrusion Detection System (CAIDS)
[Figure: Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.
We summarized those system models, their technologies, and their validation methods.
We hope this gives an overview of current developments in this area and of how data mining is evolving in the field of network intrusion detection.
61
Reference
DARPA 1998 data set. A cleansed set is in KDD Cup '99. The DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.
Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
14
Data Mining: A KDD Process
Data mining: core of the knowledge discovery process
[Figure: KDD process. Stages: Databases; Data Cleaning; Data Integration; Data Warehouse; Selection; Task-relevant Data; Data Mining; Pattern Evaluation]
15
Typical Data Mining Architecture
Data Warehouse
Data cleaning & data integration, filtering
Databases
Database or data warehouse server
Data mining engine
Pattern evaluation
Graphical user interface
Knowledge-base
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
The number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help?
Learn from traffic data
Supervised learning: learn precise models from past intrusions
Unsupervised learning: identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites: helps detect sophisticated attacks not identifiable by single-site analyses
Analysis of long-term data (months/years): uncover suspicious stealth activities (e.g., insiders leaking/modifying information)
19
Intrusion Detection
Traditional intrusion detection system (IDS) tools (e.g., SNORT) are based on signatures of known attacks
Limitations:
Signature database has to be manually revised for each new type of discovered intrusion
They cannot detect emerging cyber threats
Substantial latency in deployment of newly created signatures across the computer system
20
Data Mining for Intrusion Detection: Techniques and Applications
Frequent pattern mining
Classification
Clustering
Mining data streams
21
Frequent pattern mining
Frequent patterns: patterns that occur frequently in a database
Mining frequent patterns: finding regularities
Process of mining frequent patterns for intrusion detection:
Phase I: mine a repository of normal frequent itemsets for attack-free data
Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
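The two phases can be sketched as follows. This is an illustration under simplifying assumptions: connection records are tuples of attribute values, itemsets are limited to at most two items and counted by brute force, and the record values and the 0.5 support threshold are made up for the example.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(records, min_support, max_size=2):
    """Itemsets (frozensets) of up to max_size items whose support is at least min_support."""
    counts = Counter()
    for record in records:
        items = sorted(set(record))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s for s, c in counts.items() if c / len(records) >= min_support}

# Phase I: build the normal profile from attack-free connections (hypothetical data).
normal = [("tcp", "http", "SF"), ("tcp", "http", "SF"),
          ("tcp", "smtp", "SF"), ("udp", "dns", "SF")]
profile = frequent_itemsets(normal, min_support=0.5)

# Phase II: mine the last n connections and keep itemsets absent from the profile.
recent = [("tcp", "http", "S0"), ("tcp", "http", "S0"),
          ("tcp", "http", "S0"), ("tcp", "http", "SF")]
suspicious = frequent_itemsets(recent, min_support=0.5) - profile
print(sorted(sorted(s) for s in suspicious))
```

Here the itemsets involving the `S0` flag are frequent in the recent window but not in the normal profile, so they are the ones reported.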
22
Frequent pattern mining
Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
- A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
- If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
- Many item combinations can be pruned
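A compact sketch of level-wise Apriori shows where the pruning happens. The function name, the use of an absolute `min_count` instead of a relative support, and the sample transactions are choices made for this illustration.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset contained in at least min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

baskets = [{"beer", "diaper", "nuts"}, {"beer", "diaper"},
           {"beer", "nuts"}, {"diaper", "milk"}]
result = apriori(baskets, min_count=2)
print(len(result))  # 5
```

In this run {diaper, nuts} occurs only once, so the candidate {beer, diaper, nuts} is pruned without ever being counted against the transactions.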
23
Sequential Pattern Analysis
Models sequence patterns
(Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns → (frequent) sequential patterns
Sequential patterns for intrusion detection: capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences, find the complete set of frequent subsequences
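As a rough illustration of the problem statement, the sketch below grows patterns one item at a time and only extends patterns that are already frequent. It assumes each sequence element is a single item (for example a packet type) and that gaps between matched items are allowed; the names and sample sequences are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order, not necessarily contiguously."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def frequent_subsequences(sequences, min_count):
    """Return {pattern: count} for patterns contained in at least min_count sequences."""
    if min_count < 1:
        raise ValueError("min_count must be at least 1")
    alphabet = sorted({item for seq in sequences for item in seq})
    frequent = {}
    current = [(item,) for item in alphabet]
    while current:
        level = {}
        for pattern in current:
            count = sum(1 for seq in sequences if is_subsequence(pattern, seq))
            if count >= min_count:
                level[pattern] = count
        frequent.update(level)
        # Only frequent patterns are extended; infrequent ones cannot have frequent extensions.
        current = [p + (item,) for p in level for item in alphabet]
    return frequent

packets = [("syn", "syn_ack", "ack"), ("syn", "ack"), ("syn", "syn", "rst")]
found = frequent_subsequences(packets, min_count=2)
print(found)
```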
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
Model construction: describe a set of predetermined classes
Training dataset: tuples for model construction
Each tuple/sample belongs to a predefined class
Classification rules, decision trees, or math formulae
Model application: classify unseen objects
Estimate accuracy of the model using an independent test set
If the accuracy is acceptable, apply the model to classify data tuples with unknown class labels
27
Classification
28
Classification: Decision Tree
A node in the tree: a test of some attribute
A branch: a possible value of the attribute
Classification:
Start at the root
Test the attribute
Move down the tree branch
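The classification walk described above can be sketched in a few lines. The tree is written by hand as nested dictionaries rather than learned from data, and the attribute names and values are hypothetical.

```python
def classify(tree, record):
    """Walk from the root: test the node's attribute, follow the matching branch, stop at a leaf."""
    node = tree
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node  # a leaf holds the class label

# Hand-built tree for illustration only.
tree = {
    "attribute": "service",
    "branches": {
        "http": "normal",
        "telnet": {
            "attribute": "failed_logins",
            "branches": {"low": "normal", "high": "intrusion"},
        },
    },
}

print(classify(tree, {"service": "telnet", "failed_logins": "high"}))  # intrusion
```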
29
Neural classification: HIDE
"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.
Five major components:
Probes: collect traffic data
Event preprocessor: preprocesses traffic data and feeds the statistical model
Statistical processor: maintains a model for normal activities and generates vectors for new events
Neural network: classifies the vectors of new events
Post processor: generates reports
30
Clustering
What is clustering? Group data into clusters
- Similar to one another within the same cluster
- Dissimilar to the objects in other clusters
- Unsupervised learning: no predefined classes
31
Clustering
What is a good clustering?
High intra-class similarity and low inter-class similarity
Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering approaches
Partitioning algorithms
- Partition the objects into k clusters
- Iteratively reallocate objects to improve the clustering
Hierarchy algorithms
- Agglomerative: each object is a cluster; merge clusters to form larger ones
- Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
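Since the slide's figure did not survive, here is a minimal k-means sketch of the partition-and-reallocate loop. The initial centroids are passed in explicitly to keep the run deterministic, and the points are made up; this is not the example from the original slide.

```python
import math

def k_means(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, recompute centroids, repeat until stable."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
print(clusters)
```

An empty cluster keeps its previous centroid, which is one simple way to avoid a division by zero; other strategies, such as re-seeding, are equally valid.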
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities
The profiles of normal activities may drift
Identifying novel attacks
Identifying clusters and outliers in traffic data streams
Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detection
Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
These models can be more sophisticated and precise than manually created signatures
Recent research, e.g., JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against known intrusion patterns to flag an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
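The land attack example can be turned into a tiny signature matcher. Only the land attack rule comes from the slide; the packet layout (a dictionary with `src_ip` and `dst_ip` keys) and the function name are assumptions for this sketch.

```python
# Each signature is a (name, predicate) pair; only "land attack" is taken from the slide.
SIGNATURES = [
    ("land attack", lambda pkt: pkt["src_ip"] == pkt["dst_ip"]),
]

def match_signatures(packet, signatures=SIGNATURES):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in signatures if predicate(packet)]

print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"}))  # ['land attack']
print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"}))  # []
```

A packet that matches no entry in the list passes silently, which is exactly why this style of detection cannot catch new attacks.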
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.
The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model.
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.
38
Data Mining for Intrusion Detection
Anomaly detection
Identifies anomalies as deviations from "normal" behavior
E.g., ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (CPU, Process Size; scale 0 to 90) comparing the normal profile with abnormal activity, which marks a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
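One simple way to baseline an activity measure and flag departures from it is a mean and standard deviation profile. The slide does not prescribe a method, so the three-sigma threshold, the function names, and the CPU readings below are assumptions.

```python
import statistics

def build_profile(samples):
    """Summarize normal activity as (mean, sample standard deviation)."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a measurement that lies more than threshold standard deviations from the mean."""
    mean, stdev = profile
    return abs(value - mean) > threshold * stdev

# Hypothetical CPU utilization readings (percent) observed during normal operation.
cpu_profile = build_profile([20, 22, 19, 21, 23, 20, 18, 22])
print(is_anomalous(85, cpu_profile))  # True
print(is_anomalous(24, cpu_profile))  # False
```

A profile this rigid also illustrates the false positive problem: if normal load legitimately rises, every new reading is flagged until the baseline is rebuilt.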
40
ADAM: Audit Data Analysis and Mining
Detecting intrusion by data mining
Combination of association rules and classification rules
First, ADAM collects known frequent datasets (an off-line algorithm)
Second, ADAM runs an online algorithm:
Finds the latest frequent connection records
Compares them with the known mined data
Discards those that seem to be normal
Suspicious ones are forwarded to the classifier
The trained classifier then classifies the suspicious data as one of the following:
Known type of attack
Unknown type of attack
False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
1st phase: train the classifier
Offline process
Takes place only once, before the main experiment
2nd phase: use the trained classifier
The trained classifier is then used to detect anomalies
Online process
43
The MINDS Project
MINDS: MINnesota INtrusion Detection System
Learning from Rare Class: building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
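The two rules can be checked against the five transactions by computing their support and confidence directly. The sketch uses the standard definitions (support is the fraction of transactions containing all items of the rule; confidence is that count divided by the count for the left-hand side); the helper names are illustrative.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(lhs, rhs):
    return count(lhs | rhs) / len(transactions)

def confidence(lhs, rhs):
    return count(lhs | rhs) / count(lhs)

# {Milk} --> {Coke}: Milk is in 4 transactions, 3 of which also contain Coke.
print(support({"Milk"}, {"Coke"}), confidence({"Milk"}, {"Coke"}))  # 0.6 0.75
# {Diaper, Milk} --> {Beer}: 3 transactions contain both, 2 of which also contain Beer.
print(support({"Diaper", "Milk"}, {"Beer"}), confidence({"Diaper", "Milk"}, {"Beer"}))
```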
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e., anomalous behavior
Identify normal behavior
Construct useful set of features
Define similarity function
Use outlier detection algorithm:
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
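As an illustration of the nearest neighbor approach, the sketch below scores each connection by the distance to its k-th nearest neighbor, so isolated points receive high scores. This is a generic formulation and not necessarily the scoring used inside MINDS; Euclidean distance stands in for the similarity function, and the feature vectors are made up.

```python
import math

def knn_anomaly_scores(points, k):
    """Score each point by the distance to its k-th nearest other point (larger = more anomalous)."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores

# Hypothetical two-feature connection records; the last one sits far from the rest.
connections = [(1, 1), (1, 2), (2, 1), (2, 2), (9, 9)]
scores = knn_anomaly_scores(connections, k=2)
ranked = sorted(range(len(connections)), key=lambda i: scores[i], reverse=True)
print(ranked[0])  # 4
```

Ranking by score rather than applying a fixed cutoff lets an analyst review only the top few connections.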
46
Experimental Evaluation
[Figure: evaluation pipeline. Components: network; net-flow data collected using CISCO routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; Snort, an open source signature-based network IDS (www.snort.org)]
10-minute cycle, 2 million connections
Anomaly detection is applied 4 times a day, with a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Components: anomaly detection system; ranked connections labeled attack or normal; discriminating association pattern generator; knowledge base (update). Example rules: R1: TCP, DstPort=1863 → Attack; …; R100: TCP, DstPort=80 → Normal]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help
Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses
Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying
information)
19
Intrusion Detection
Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks
LimitationsSignature database has to be manually revised
for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created
signatures across the computer system
20
Data Mining for Intrusion Detection Techniques and Applications
Frequent pattern mining Classification Clustering Mining data streams
21
Patterns that occur frequently in a database
Mining Frequent patterns ndash finding regularities
Process of Mining Frequent patterns for intrusion de
tection Phase I mine a repository of normal frequent itemsets for a
ttack-free data
Phase II find frequent itemsets in the last n connections an
d compare the patterns to the normal profile
Frequent pattern mining
22
Frequent pattern mining
Apriori bull Any subset of a frequent itemset must be also freque
nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent
bull No superset of any infrequent itemset should be generated or tested
ndash Many item combinations can be pruned
23
Sequential Pattern Analysis
Models sequence patterns (Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns (frequent) sequential patterns
Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification A Two-Step Process Model construction describe a set of predetermined
classes Training dataset tuples for model construction
Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae
Model application classify unseen objects Estimate accuracy of the model using an independent test
set Acceptable accuracy apply the model to classify data tu
ples with unknown class labels
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering High intra-class similarity and low interclasssimilar
ity Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches Partitioning algorithms
ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin
g Hierarchy algorithms
ndash Agglomerative each object is a cluster merge clusters to form larger ones
ndash Divisive all objects are in a cluster split it up into smaller clusters
33
Clustering
K-Means Example
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities The profiles of normal activities may drift
Identifying novel attacks Identifying clusters and outliers in traffic data
streams Reduce the future alarm load by writing
filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detectionPredictive models are built from labeled data sets (instances
are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually
created signatures Recent research eg JAM (Java Agents for Metalearning)
36
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model
Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt
rusion Detection System
39
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
40
ADAM Audit Data Analysis and Mining
Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line
algorithm Secondly ADAM runs an online algorithm
Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one
of the following Known type of attack Unknown type of attack False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM has two phases in their model
1st Phase Train the classifier Offline process Takes place only once Before the main experiment
2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process
43
The MINDS Project
MINDS ndash MINnesota INtrusion Detection System
Learning from Rare Class ndash Building rare class prediction models
Anomalyoutlier detection
Summarization of attacks using association pattern analysis
TID Items
1 Bread Coke Milk
2 Beer Bread
3 Beer Coke Diaper Milk
4 Beer Bread Diaper Milk
5 Coke Diaper Milk
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
44
MINDS - Learning from Rare Class
Problem Building models for rare network attacks (Mining needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
46
Experimental Evaluation
network
net-flow data using CISCO
routers
Data preprocessing
MINDSanomaly detection
helliphellip
Anomaly
scores Association pattern analysis
Open source signature-based network IDS wwwsnortor
g
10 minutes cycle
2 millions connections
Anomaly detection is applied
4 times a day
10 minutes time window
bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment
bull Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
Anomaly Detection System
attack
normal
R1 TCP DstPort=1863 Attack
hellip
hellip
hellip
hellip
R100 TCP DstPort=80 Normal
Discriminating Association
Pattern Generator
1 Build normal profile
2 Study changes in normal behavior
3 Create attack summary
4 Detect misuse behavior
5 Understand nature of the attack
update
Knowledge Base
Ranked connections
MINDS association analysis module
48
Discovered Real-life Association Patterns
At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first
followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)
Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)
49
DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
Discovered Real-life Association Patterns
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm).
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.
Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world.
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863.
Further analysis reveals that the remote IP block is owned by Hotmail.
Flag=0 is unusual for TCP traffic.
53
MINDS Conclusion
• Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
• SNORT has static knowledge, manually updated by human analysts
• MINDS anomaly detection algorithms are adaptive in nature
• MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
• Defining normal behavior
• Feature extraction
• Similarity functions
• Outlier detection
• Result summarization
• Detection of attacks originating from multiple sites
[Figure labels: worm/virus detection after infection; insider attack; policy violation; outsider attack; network intrusion]
54
IDS Using Both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
A distance-based outlier detection algorithm is used to detect deviating behavior in the collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
For example, we envision a window where we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.
If b occurs with 5% and the joint event a and b has 4% to occur, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window.
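The confidence arithmetic above can be reproduced by counting windows. The following is a minimal sketch, not the CAIDS implementation: the window representation (a list of event names) and the helper names `contains_episode` and `fer_stats` are assumptions made for illustration. It divides the joint count by the count of windows matching the rule's LHS, which is the conventional choice and an assumption about how the 5% figure is meant to be read.

```python
def contains_episode(window, episode):
    """True if the events of `episode` occur in `window` in the same order (gaps allowed)."""
    remaining = iter(window)
    return all(event in remaining for event in episode)


def fer_stats(windows, lhs, rhs):
    """Return (confidence, support) of the rule lhs -> rhs over a list of event windows."""
    n_lhs = sum(contains_episode(w, lhs) for w in windows)
    n_joint = sum(contains_episode(w, lhs + rhs) for w in windows)
    confidence = n_joint / n_lhs if n_lhs else 0.0
    support = n_joint / len(windows) if windows else 0.0
    return confidence, support


# Hypothetical data: 100 windows, 5 contain E, and in 4 of those E is followed by D then F.
windows = [["E", "D", "F"]] * 4 + [["E", "D"]] + [["D", "F"]] * 95
confidence, support = fer_stats(windows, ["E"], ["D", "F"])
print(confidence, support)
```

With these made-up counts the joint episode appears in 4% of the windows and the conditioning event in 5%, which gives the 80% figure quoted in the text.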
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests, denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events have a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:
(service = authentication) → (service = smtp) (service = smtp) (0.8, 0.1, 2 sec) (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record.
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule.
59
A cooperative anomaly and intrusion detection system (CAIDS)
[Figure: architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]
60
Conclusion
In this report we have studied basic concepts and some classic system models in this area, such as ADAM and MINDS
We summarized those system models, their technologies and their validation methods
We hope to give an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set is available in KDD Cup '99; the DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M. 2006. "A Hybrid Network Intrusion Detection Technique Using Random Forests". In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al. "A data mining framework for building intrusion detection models". In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al. "MINDS - Minnesota Intrusion Detection System". Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A. P., Manalwar, P. "Exploiting efficient data mining techniques to enhance intrusion detection systems". IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug 2005, pages 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
16
Outline
Intrusion Detection
Data Mining
Data Mining in Intrusion Detection
Reference
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help
• Learn from traffic data
– Supervised learning: learn precise models from past intrusions
– Unsupervised learning: identify suspicious activities
• Maintain models on dynamic data
• Correlation of suspicious events across network sites
– Helps detect sophisticated attacks not identifiable by single-site analyses
• Analysis of long-term data (months/years)
– Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)
19
Intrusion Detection
Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks
Limitations
• Signature database has to be manually revised for each new type of discovered intrusion
• They cannot detect emerging cyber threats
• Substantial latency in deployment of newly created signatures across the computer system
20
Data Mining for Intrusion Detection: Techniques and Applications
• Frequent pattern mining
• Classification
• Clustering
• Mining data streams
21
Frequent pattern mining
Frequent patterns: patterns that occur frequently in a database
Mining frequent patterns – finding regularities
Process of mining frequent patterns for intrusion detection
– Phase I: mine a repository of normal frequent itemsets for attack-free data
– Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
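The two phases can be sketched as follows. This is an illustrative sketch under stated assumptions, not a specific system's algorithm: connections are modeled as sets of `attribute=value` strings, itemsets are counted by brute force up to size 2, and the function name `frequent_itemsets`, the sample records and the 20% threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support, max_size=2):
    """Brute-force count of itemsets up to max_size; keep those meeting min_support."""
    counts = Counter()
    for record in records:
        items = sorted(record)
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    threshold = min_support * len(records)
    return {itemset for itemset, count in counts.items() if count >= threshold}


# Phase I: profile mined from attack-free connections (made-up data).
normal = [{"proto=tcp", "port=80"}] * 8 + [{"proto=udp", "port=53"}] * 2
profile = frequent_itemsets(normal, min_support=0.2)

# Phase II: itemsets frequent in the last n connections but absent from the profile.
recent = [{"proto=tcp", "port=80"}] * 5 + [{"proto=tcp", "port=27374"}] * 5
suspicious = frequent_itemsets(recent, min_support=0.2) - profile
print(sorted(suspicious))
```

Only the itemsets involving port 27374 remain after the set difference, because everything else was already frequent in the attack-free data.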
22
Frequent pattern mining
Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
– A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
– If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
– Many item combinations can be pruned
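A compact level-wise search shows how the anti-monotone property prunes candidates. This is a teaching sketch rather than an optimized Apriori implementation; the function name `apriori`, the sample transactions and the minimum count of 2 are assumptions chosen for illustration.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise search: size-k candidates are built only from frequent (k-1)-itemsets."""
    transactions = [frozenset(t) for t in transactions]
    level = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while level:
        # Count how many transactions contain each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in level}
        current = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1
        # Join step: union pairs of frequent itemsets into size-k candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset.
        level = {c for c in candidates
                 if all(frozenset(s) in current for s in combinations(c, k - 1))}
    return frequent


transactions = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts"},
    {"nuts", "milk"},
]
result = apriori(transactions, min_count=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Because `milk` is infrequent at the first level, no itemset containing it is ever generated or counted at later levels.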
23
Sequential Pattern Analysis
• Models sequence patterns
• (Temporal) order is important in many situations
– Time-series databases and sequence databases
– Frequent patterns → (frequent) sequential patterns
• Sequential patterns for intrusion detection
– Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
• Model construction: describe a set of predetermined classes
– Training dataset: tuples for model construction
– Each tuple/sample belongs to a predefined class
– Classification rules, decision trees, or math formulae
• Model application: classify unseen objects
– Estimate accuracy of the model using an independent test set
– Acceptable accuracy → apply the model to classify data tuples with unknown class labels
27
Classification
28
Classification: Decision Tree
• A node in the tree: a test of some attribute
• A branch: a possible value of the attribute
• Classification
– Start at the root
– Test the attribute
– Move down the tree branch
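The classification walk described above can be sketched with a hand-built tree. The tree itself, its attributes (`protocol`, `flag`) and its labels are hypothetical and chosen only to show the traversal; a real tree would be learned from training data.

```python
# Internal nodes test one attribute; branches map attribute values to subtrees or labels.
tree = {
    "attribute": "protocol",
    "branches": {
        "icmp": "normal",
        "tcp": {
            "attribute": "flag",
            "branches": {"SYN": "intrusive", "SF": "normal"},
        },
    },
}


def classify(node, record, default="unknown"):
    """Start at the root, test the node's attribute, and follow the matching branch."""
    while isinstance(node, dict):
        value = record.get(node["attribute"])
        node = node["branches"].get(value, default)
    return node


print(classify(tree, {"protocol": "tcp", "flag": "SYN"}))
print(classify(tree, {"protocol": "udp"}))
```

A record whose attribute value has no branch falls through to the `default` label instead of raising an error.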
29
Neural classification: HIDE
"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al
Five major components
• Probes collect traffic data
• Event preprocessor preprocesses traffic data and feeds the statistical model
• Statistical processor maintains a model for normal activities and generates vectors for new events
• Neural network classifies the vectors of new events
• Post processor generates reports
30
Clustering
What Is Clustering
• Group data into clusters
– Similar to one another within the same cluster
– Dissimilar to the objects in other clusters
– Unsupervised learning: no predefined classes
31
Clustering
What Is A Good Clustering
• High intra-class similarity and low inter-class similarity
– Depending on the similarity measure
• The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches
• Partitioning algorithms
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering
• Hierarchy algorithms
– Agglomerative: each object is a cluster; merge clusters to form larger ones
– Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
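Since the slide's figure did not survive, a minimal k-means sketch stands in for it. This is a plain-Python illustration under assumptions: the points, the two starting centers and the fixed iteration count are made up, and no library implementation is implied.

```python
def kmeans(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and recomputing centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Each new center is the mean of its cluster; an empty cluster keeps its old center.
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers)
print(clusters)
```

The partitioning approach is visible in the loop: objects are reallocated on every pass until the centers stop moving, which happens after the first pass for this toy data.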
34
Mining Data Streams for Intrusion Detection
• Maintaining profiles of normal activities
– The profiles of normal activities may drift
• Identifying novel attacks
– Identifying clusters and outliers in traffic data streams
• Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detection
• Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
• These models can be more sophisticated and precise than manually created signatures
• Recent research, e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against known intrusion patterns; a match is reported as an intrusion]
Example: if (src_ip == dst_ip) then "land attack"
Can't detect new attacks
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
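The land attack rule above can be written as a small signature matcher. This is a sketch under assumptions: packets are modeled as dictionaries with `src_ip` and `dst_ip` keys, and the names `SIGNATURES` and `match_signatures` are hypothetical rather than taken from any IDS.

```python
# Each signature pairs a name with a predicate over a packet record.
SIGNATURES = [
    ("land attack", lambda pkt: pkt["src_ip"] == pkt["dst_ip"]),
]


def match_signatures(packet, signatures=SIGNATURES):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in signatures if predicate(packet)]


print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"}))
print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"}))
```

The limitation named on the slide is visible here: a packet that matches no entry in the list is silently accepted, so an attack without a signature goes undetected.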
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.
The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model.
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.
38
Data Mining for Intrusion Detection
Anomaly detection
• Identifies anomalies as deviations from "normal" behavior
• E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (CPU, process size, …) on a 0-90 scale, comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]
Baseline the normal traffic and then look for things that are out of the norm
Relatively high false positive rate: anomalies can just be new normal activities
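Baselining and deviation checking can be sketched with per-measure means and standard deviations. This is one simple way to do it, assumed here for illustration: the measure names, the history values and the threshold of 3 standard deviations are hypothetical, and real systems use richer profiles.

```python
from statistics import mean, stdev


def build_profile(history):
    """history maps a measure name to values observed during normal operation."""
    return {name: (mean(values), stdev(values)) for name, values in history.items()}


def abnormal_measures(profile, observation, threshold=3.0):
    """Return the measures whose value lies more than `threshold` deviations from the mean."""
    flagged = []
    for name, value in observation.items():
        mu, sigma = profile[name]
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            flagged.append(name)
    return flagged


history = {"cpu": [20, 22, 19, 21, 18], "process_size": [50, 52, 49, 51, 48]}
profile = build_profile(history)
print(abnormal_measures(profile, {"cpu": 85, "process_size": 50}))
```

A legitimate but new workload that pushes CPU usage up would be flagged in exactly the same way, which is the source of the high false positive rate noted above.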
40
ADAM: Audit Data Analysis and Mining
Detecting intrusion by data mining: a combination of association rules and classification rules
• First, ADAM collects known frequent datasets using an off-line algorithm
• Second, ADAM runs an online algorithm
– Finds the most recent frequent connection records
– Compares them with the known mined data
– Discards those that seem to be normal
– Forwards suspicious ones to the classifier
• The trained classifier then classifies the suspicious data as one of the following
– Known type of attack
– Unknown type of attack
– False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
• 1st phase: train the classifier
– Offline process
– Takes place only once, before the main experiment
• 2nd phase: use the trained classifier
– Trained classifier is then used to detect anomalies
– Online process
43
The MINDS Project
MINDS – MINnesota INtrusion Detection System
• Learning from rare class – building rare class prediction models
• Anomaly/outlier detection
• Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered:
{Milk} → {Coke}
{Diaper, Milk} → {Beer}
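The two discovered rules can be checked against the five transactions by computing support and confidence directly. The helper name `rule_stats` is hypothetical, and the definitions used are the standard ones (support is the fraction of transactions containing both sides, confidence is that count divided by the count containing the left-hand side).

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def rule_stats(lhs, rhs):
    """Return (support, confidence) of the rule lhs -> rhs over the transactions."""
    n_lhs = sum(lhs <= t for t in transactions)
    n_both = sum((lhs | rhs) <= t for t in transactions)
    support = n_both / len(transactions)
    confidence = n_both / n_lhs if n_lhs else 0.0
    return support, confidence


print(rule_stats({"Milk"}, {"Coke"}))
print(rule_stats({"Diaper", "Milk"}, {"Beer"}))
```

Milk appears in four transactions and three of them also contain Coke, so the first rule holds with confidence 3/4; Diaper and Milk appear together in three transactions, two of which contain Beer, giving 2/3 for the second rule.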
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
• Standard data mining models are not suitable for rare classes
• Models must be able to handle skewed class distributions
• Learning from data streams: intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior
• Identify normal behavior
• Construct useful set of features
• Define similarity function
• Use outlier detection algorithm
– Nearest neighbor approach
– Density based schemes
– Unsupervised Support Vector Machines (SVM)
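The nearest neighbor approach can be illustrated by scoring each point with the distance to its k-th nearest neighbor. This is a generic sketch of that idea and an assumption for illustration, not the specific outlier algorithm used in MINDS; the feature vectors, Euclidean distance as the similarity function, and `k=2` are made up.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor (larger = more anomalous)."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores


# Four connections with similar features and one far from the rest.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_outlier_scores(points, k=2)
ranked = sorted(zip(scores, points), reverse=True)
print(ranked[0])
```

Sorting by score gives a ranked list of connections, so an analyst can inspect the most anomalous ones first.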
46
Experimental Evaluation
[Figure: evaluation pipeline. Network → NetFlow data collected using Cisco routers → data preprocessing]
17
Network intrusion detection
Number of intrusions on the network is typically a very small fraction of the total network traffic
18
Why Can Data Mining Help
Learn from traffic data Supervised learning learn precise models from past intrusions Unsupervised learning identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites Helps detect sophisticated attacks not identifiable by single site analyses
Analysis of long term data (monthsyears) Uncover suspicious stealth activities (eg insiders leakingmodifying
information)
19
Intrusion Detection
Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks
LimitationsSignature database has to be manually revised
for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created
signatures across the computer system
20
Data Mining for Intrusion Detection Techniques and Applications
Frequent pattern mining Classification Clustering Mining data streams
21
Patterns that occur frequently in a database
Mining Frequent patterns ndash finding regularities
Process of Mining Frequent patterns for intrusion de
tection Phase I mine a repository of normal frequent itemsets for a
ttack-free data
Phase II find frequent itemsets in the last n connections an
d compare the patterns to the normal profile
Frequent pattern mining
22
Frequent pattern mining
Apriori bull Any subset of a frequent itemset must be also freque
nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent
bull No superset of any infrequent itemset should be generated or tested
ndash Many item combinations can be pruned
23
Sequential Pattern Analysis
Models sequence patterns (Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns (frequent) sequential patterns
Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification A Two-Step Process Model construction describe a set of predetermined
classes Training dataset tuples for model construction
Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae
Model application classify unseen objects Estimate accuracy of the model using an independent test
set Acceptable accuracy apply the model to classify data tu
ples with unknown class labels
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering High intra-class similarity and low interclasssimilar
ity Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches Partitioning algorithms
ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin
g Hierarchy algorithms
ndash Agglomerative each object is a cluster merge clusters to form larger ones
ndash Divisive all objects are in a cluster split it up into smaller clusters
33
Clustering
K-Means Example
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities The profiles of normal activities may drift
Identifying novel attacks Identifying clusters and outliers in traffic data
streams Reduce the future alarm load by writing
filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detectionPredictive models are built from labeled data sets (instances
are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually
created signatures Recent research eg JAM (Java Agents for Metalearning)
36
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model
Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt
rusion Detection System
39
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
40
ADAM Audit Data Analysis and Mining
Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line
algorithm Secondly ADAM runs an online algorithm
Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one
of the following Known type of attack Unknown type of attack False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
1st phase: train the classifier
  Offline process
  Takes place only once, before the main experiment
2nd phase: use the trained classifier
  The trained classifier is used to detect anomalies
  Online process
43
The MINDS Project
MINDS: MINnesota INtrusion Detection System
Learning from rare classes: building rare-class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
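As a worked check of the two rules, the sketch below computes support and confidence over the five transactions in the table. The helper names are chosen for this example; the definitions are the standard ones (support is the fraction of transactions containing an itemset, confidence is the joint count divided by the left-hand-side count).

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(set(itemset) <= t for t in transactions)

def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)

def confidence(lhs, rhs, transactions):
    """Of the transactions containing `lhs`, the fraction that also contain `rhs`."""
    return count(set(lhs) | set(rhs), transactions) / count(lhs, transactions)

milk_coke = confidence({"Milk"}, {"Coke"}, transactions)                   # 3 of 4
diaper_milk_beer = confidence({"Diaper", "Milk"}, {"Beer"}, transactions)  # 2 of 3
```

Milk appears in four transactions and Coke in three of those, so {Milk} --> {Coke} has confidence 3/4; {Diaper, Milk} appears in three transactions and Beer in two of those, giving 2/3.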
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous, behavior
  Identify normal behavior
  Construct a useful set of features
  Define a similarity function
  Use an outlier detection algorithm
    Nearest neighbor approach
    Density based schemes
    Unsupervised Support Vector Machines (SVM)
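A nearest neighbor outlier score can be sketched as below. This is a generic illustration of the idea (distance to the k-th nearest neighbor as the anomaly score), not the MINDS algorithm itself; the feature vectors, the Euclidean similarity function, and the choice k=2 are assumptions for the example.

```python
import math

def knn_outlier_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbor (larger = more anomalous)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical two-feature connection summaries, e.g. (packets, kilobytes)
points = [(3, 1.2), (3, 1.3), (4, 1.5), (3, 1.1), (40, 0.1)]
scores = knn_outlier_scores(points, k=2)

# Index of the connection that lies farthest from its neighbors
most_anomalous = max(range(len(points)), key=scores.__getitem__)
```

Connections are then ranked by score, and only the top of the ranking needs to be inspected by an analyst.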
46
Experimental Evaluation
[Figure: MINDS evaluation pipeline. Network -> net-flow data collected using Cisco routers -> data preprocessing -> MINDS anomaly detection -> anomaly scores -> association pattern analysis. Also shown: an open source signature-based network IDS (www.snort.org)]
10-minute cycle, 2 million connections
Anomaly detection is applied 4 times a day on a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. The anomaly detection system produces ranked connections labeled attack or normal (R1: TCP, DstPort=1863, Attack; ...; R100: TCP, DstPort=80, Normal); a discriminating association pattern generator processes them and updates the knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120...180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120...180 (c1=177, c2=0)
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.
Having an unauthorized application increases the vulnerability of the system.
50
Discovered Real-life Association Patterns... (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189...200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189...200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
...
This pattern indicates a large number of scans on ports 27374 (a signature of the SubSeven worm) and 12345 (a signature of the NetBus worm).
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.
51
Discovered Real-life Association Patterns... (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.
Port 6667 is where IRC (Internet Relay Chat) typically runs. Further analysis reveals that there are many small packets from/to various IRC servers around the world.
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).
52
Discovered Real-life Association Patterns... (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863.
Further analysis reveals that the remote IP block is owned by Hotmail.
Flag=0 is unusual for TCP traffic.
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
  SNORT has static knowledge manually updated by human analysts
  MINDS anomaly detection algorithms are adaptive in nature
  MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
  Defining normal behavior
  Feature extraction
  Similarity functions
  Outlier detection
  Result summarization
  Detection of attacks originating from multiple sites
[Figure labels: worm/virus detection after infection; insider attack; policy violation; outsider attack; network intrusion]
54
IDS Using Both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.
A distance-based outlier detection algorithm is used to detect deviant behavior among the collected network data.
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.
This large set of data patterns is scanned using a data mining classification algorithm, the decision tree.
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
For example, we envision a window in which we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.
If a occurs with a frequency of 5% and the joint event a and b occurs with a frequency of 4%, there is a 0.04 / 0.05 = 80% chance that D and F will follow in the same window.
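The 80% figure can be reproduced with a small counting sketch. It assumes the conventional definition of rule confidence (windows containing the whole episode divided by windows containing the LHS episode), and the windows and helper names below are made up for illustration; they are not taken from the CAIDS paper.

```python
def contains_in_order(window, episode):
    """True if the events of `episode` occur in `window` in order (not necessarily adjacent)."""
    remaining = iter(window)
    return all(event in remaining for event in episode)

def fer_confidence(windows, lhs, rhs):
    """Fraction of windows containing `lhs` in which `rhs` follows."""
    lhs_count = sum(contains_in_order(w, lhs) for w in windows)
    joint_count = sum(contains_in_order(w, lhs + rhs) for w in windows)
    return joint_count / lhs_count if lhs_count else 0.0

# Hypothetical event windows: E occurs in five of them, E followed by D and F in four
windows = [
    ["E", "D", "F"],
    ["E", "D", "F"],
    ["E", "X", "D", "F"],
    ["E", "D", "F"],
    ["E", "D"],
    ["A", "B"],
]
conf = fer_confidence(windows, lhs=["E"], rhs=["D", "F"])
```

With these windows the joint count is 4 and the LHS count is 5, so the rule E → D, F gets confidence 4 / 5 = 0.8.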
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests, denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level of s = 10% of all the network connections being evaluated. This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record.
In general, an FER is specified by the following expression:
L1, L2, ..., Ln → R1, ..., Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.
We call L1, L2, ..., Ln the LHS episode and R1, ..., Rm the RHS episode of the rule.
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.
We summarized those system models, their technologies, and their validation methods.
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.
61
Reference
DARPA 1998 data set. A cleansed set is in KDD Cup '99; the DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
J. Zhang and M. Zulkernine. "A Hybrid Network Intrusion Detection Technique Using Random Forests". In Proceedings of the First International Conference on Availability, Reliability and Security, April 20-22, 2006.
W. Lee et al. "A data mining framework for building intrusion detection models". In Information and System Security, Vol. 3, No. 4, 2000.
L. Ertoz et al. "MINDS - Minnesota Intrusion Detection System". Next Generation Data Mining, Chapter 3, 2004.
C.-T. Lu, A. P. Boedihardjo, P. Manalwar. "Exploiting efficient data mining techniques to enhance intrusion detection systems". IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
18
Why Can Data Mining Help?
Learn from traffic data
  Supervised learning: learn precise models from past intrusions
  Unsupervised learning: identify suspicious activities
Maintain models on dynamic data
Correlation of suspicious events across network sites
  Helps detect sophisticated attacks not identifiable by single-site analyses
Analysis of long-term data (months/years)
  Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)
19
Intrusion Detection
Traditional intrusion detection system (IDS) tools (e.g. SNORT) are based on signatures of known attacks
Limitations
  Signature database has to be manually revised for each new type of discovered intrusion
  They cannot detect emerging cyber threats
  Substantial latency in deployment of newly created signatures across the computer system
20
Data Mining for Intrusion Detection: Techniques and Applications
Frequent pattern mining
Classification
Clustering
Mining data streams
21
Frequent pattern mining
Patterns that occur frequently in a database
Mining frequent patterns: finding regularities
Process of mining frequent patterns for intrusion detection
  Phase I: mine a repository of normal frequent itemsets for attack-free data
  Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
22
Frequent pattern mining
Apriori
• Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  – A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  – If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
• No superset of any infrequent itemset should be generated or tested
  – Many item combinations can be pruned
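The pruning idea can be sketched as a small level-wise Apriori. This is a simplified illustration rather than an optimized implementation; the baskets, the function name, and the absolute-count threshold are assumptions for the example.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset found in at least `min_count` transactions."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep the frequent ones for this level
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Hypothetical market baskets
baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"milk", "nuts"},
]
result = apriori(baskets, min_count=2)
```

Because milk occurs only once, it is infrequent at the first level, so no itemset containing milk is ever generated or counted afterwards.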
23
Sequential Pattern Analysis
Models sequence patterns
(Temporal) order is important in many situations
  Time-series databases and sequence databases
  Frequent patterns → (frequent) sequential patterns
Sequential patterns for intrusion detection
  Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences, find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
Model construction: describe a set of predetermined classes
  Training dataset: tuples for model construction
  Each tuple/sample belongs to a predefined class
  Classification rules, decision trees, or math formulae
Model application: classify unseen objects
  Estimate accuracy of the model using an independent test set
  Acceptable accuracy → apply the model to classify data tuples with unknown class labels
27
Classification
28
Classification: Decision Tree
A node in the tree: a test of some attribute
A branch: a possible value of the attribute
Classification
  Start at the root
  Test the attribute
  Move down the tree branch
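The classification walk described above can be sketched as follows. The tree is hand-built for illustration only: its attributes, values, class labels, and the "default" fallback for unseen values are assumptions, not a model learned from real traffic.

```python
def classify(node, record):
    """Start at the root, test the node's attribute, and follow the matching branch to a leaf."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"].get(value, node["default"])
    return node  # a leaf is simply a class label

# Hypothetical tree: inner nodes are dicts, leaves are class labels
tree = {
    "attribute": "protocol",
    "default": "normal",
    "branches": {
        "icmp": "probe",
        "tcp": {
            "attribute": "flag",
            "default": "normal",
            "branches": {"SYN_only": "probe", "SF": "normal"},
        },
    },
}

label_scan = classify(tree, {"protocol": "tcp", "flag": "SYN_only"})
label_web = classify(tree, {"protocol": "tcp", "flag": "SF"})
label_dns = classify(tree, {"protocol": "udp", "flag": "SF"})
```

In a real system the tree would come from the model construction step, learned from a labeled training set, rather than being written by hand.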
29
Neural classification: HIDE
"A hierarchical network intrusion detection system using statistical processing and neural network classification" by Zheng et al.
Five major components
  Probes collect traffic data
  Event preprocessor preprocesses traffic data and feeds the statistical model
  Statistical processor maintains a model for normal activities and generates vectors for new events
  Neural network classifies the vectors of new events
  Post processor generates reports
30
Clustering
What is clustering?
Group data into clusters
  – Similar to one another within the same cluster
  – Dissimilar to the objects in other clusters
  – Unsupervised learning: no predefined classes
31
Clustering
What is a good clustering?
High intra-class similarity and low inter-class similarity
  Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering approaches
Partitioning algorithms
  – Partition the objects into k clusters
  – Iteratively reallocate objects to improve the clustering
Hierarchy algorithms
  – Agglomerative: each object is a cluster, merge clusters to form larger ones
  – Divisive: all objects are in a cluster, split it up into smaller clusters
33
Clustering
K-Means Example
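Since the original slide's figure is not available, here is a small stand-in sketch of k-means as a partitioning algorithm. The points, the starting centroids, and the fixed iteration count are assumptions chosen for the example; this is plain Lloyd-style iteration, not a tuned implementation.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Repeatedly assign points to the nearest centroid, then move each centroid to its cluster mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D points forming two visible groups
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

The two groups end up in separate clusters, with each centroid at the mean of its group.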
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities
  The profiles of normal activities may drift
Identifying novel attacks
  Identifying clusters and outliers in traffic data streams
Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detection
  Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
  These models can be more sophisticated and precise than manually created signatures
  Recent research, e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against known intrusion patterns; a match indicates an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity; modification of system files; permission modifications
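The land attack rule from the example can be written as a tiny signature matcher. The Packet fields, the signature table, and the sample addresses are hypothetical and only serve to show the pattern-matching idea.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int

# Each signature is a name plus a predicate over a packet; more rules could be appended here
SIGNATURES = [
    ("land attack", lambda p: p.src_ip == p.dst_ip),
]

def match_signatures(packet, signatures=SIGNATURES):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in signatures if predicate(packet)]

alerts = match_signatures(Packet("10.0.0.5", "10.0.0.5", 80))
no_alerts = match_signatures(Packet("10.0.0.5", "10.0.0.9", 80))
```

A packet matching no entry in the table passes silently, which is why this style of detection misses attacks that have no signature yet.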
19
Intrusion Detection
Traditional intrusion detection system IDS tools (eg SNORT) are based on signatures of known attacks
LimitationsSignature database has to be manually revised
for each new type of discovered intrusionThey cannot detect emerging cyber threatsSubstantial latency in deployment of newly created
signatures across the computer system
20
Data Mining for Intrusion Detection Techniques and Applications
Frequent pattern mining Classification Clustering Mining data streams
21
Patterns that occur frequently in a database
Mining Frequent patterns ndash finding regularities
Process of Mining Frequent patterns for intrusion de
tection Phase I mine a repository of normal frequent itemsets for a
ttack-free data
Phase II find frequent itemsets in the last n connections an
d compare the patterns to the normal profile
Frequent pattern mining
22
Frequent pattern mining
Apriori bull Any subset of a frequent itemset must be also freque
nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent
bull No superset of any infrequent itemset should be generated or tested
ndash Many item combinations can be pruned
23
Sequential Pattern Analysis
Models sequence patterns (Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns (frequent) sequential patterns
Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification A Two-Step Process Model construction describe a set of predetermined
classes Training dataset tuples for model construction
Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae
Model application classify unseen objects Estimate accuracy of the model using an independent test
set Acceptable accuracy apply the model to classify data tu
ples with unknown class labels
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering High intra-class similarity and low interclasssimilar
ity Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches Partitioning algorithms
ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin
g Hierarchy algorithms
ndash Agglomerative each object is a cluster merge clusters to form larger ones
ndash Divisive all objects are in a cluster split it up into smaller clusters
33
Clustering
K-Means Example
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities The profiles of normal activities may drift
Identifying novel attacks Identifying clusters and outliers in traffic data
streams Reduce the future alarm load by writing
filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detectionPredictive models are built from labeled data sets (instances
are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually
created signatures Recent research eg JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against known intrusion patterns; a match is reported as an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
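The land attack example can be written as a tiny signature matcher. This is a sketch only: the packet is modeled as a dictionary with hypothetical field names (src_ip, dst_ip), and the signature list holds just the one rule given on the slide.

```python
# Each signature is (name, predicate over a packet dictionary).
# The field names src_ip and dst_ip are hypothetical.
SIGNATURES = [
    ("land attack", lambda packet: packet["src_ip"] == packet["dst_ip"]),
]


def match_signatures(packet, signatures=SIGNATURES):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in signatures if predicate(packet)]


print(match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"}))
```

A packet that matches no signature returns an empty list, which is exactly why this approach cannot detect new attacks.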
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.
The classifiers build the signature of attacks, so data mining in JAM builds a misuse detection model.
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.
38
Data Mining for Intrusion Detection
Anomaly detection
- Identifies anomalies as deviations from "normal" behavior
- E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (e.g. CPU, process size) on a 0 to 90 scale, comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
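Baselining can be sketched as a per-measure mean and standard deviation, with a sample flagged when it lies too many standard deviations from the mean. The measure names, the sample values and the threshold of 3 are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def build_profile(normal_samples):
    """Baseline: (mean, standard deviation) of each activity measure."""
    measures = list(normal_samples[0].keys())
    return {
        m: (mean([s[m] for s in normal_samples]), stdev([s[m] for s in normal_samples]))
        for m in measures
    }


def deviations(profile, sample, threshold=3.0):
    """Return the measures whose z-score exceeds the threshold."""
    flagged = {}
    for measure, (mu, sigma) in profile.items():
        z = abs(sample[measure] - mu) / sigma if sigma > 0 else 0.0
        if z > threshold:
            flagged[measure] = z
    return flagged


# Invented measurements of normal activity.
NORMAL = [
    {"cpu": 20, "process_size": 40},
    {"cpu": 25, "process_size": 42},
    {"cpu": 22, "process_size": 41},
    {"cpu": 24, "process_size": 39},
    {"cpu": 21, "process_size": 43},
]
profile = build_profile(NORMAL)
print(deviations(profile, {"cpu": 85, "process_size": 41}))
```

A legitimate but new workload would be flagged in the same way, which illustrates the false positive problem noted on the slide.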
40
ADAM: Audit Data Analysis and Mining
Detecting Intrusion by Data Mining
Combination of association rules and classification rules
First, ADAM collects known frequent datasets (an off-line algorithm)
Second, ADAM runs an online algorithm
- Finds the last frequent connection records
- Compares them with the known mined data
- Discards those which seem to be normal
- Forwards the suspicious ones to the classifier
The trained classifier then classifies the suspicious data as one of the following
- Known type of attack
- Unknown type of attack
- False alarm
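The online step can be sketched as: mine frequent itemsets in the current window, discard those already in the normal profile, and hand the rest to a classifier. This is a simplified illustration and not ADAM's actual implementation; the record encoding ("attribute=value" strings), the function names, the support threshold and the stub classifier are all assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support, max_size=2):
    """Itemsets of up to max_size items occurring in at least min_support of the records."""
    counts = Counter()
    for record in records:
        items = sorted(record)
        for size in range(1, max_size + 1):
            counts.update(frozenset(c) for c in combinations(items, size))
    threshold = min_support * len(records)
    return {itemset for itemset, n in counts.items() if n >= threshold}


def adam_online_step(window, normal_profile, classifier, min_support=0.5):
    """Keep only itemsets absent from the normal profile and classify them."""
    suspicious = frequent_itemsets(window, min_support) - normal_profile
    return {itemset: classifier(itemset) for itemset in suspicious}


# Offline phase: profile mined from (invented) attack-free records.
ATTACK_FREE = [{"proto=tcp", "dst_port=80"}] * 4
NORMAL_PROFILE = frequent_itemsets(ATTACK_FREE, min_support=0.5)

# Online phase: the most recent (invented) connection records.
WINDOW = [
    {"proto=tcp", "dst_port=80"},
    {"proto=tcp", "dst_port=27374"},
    {"proto=tcp", "dst_port=27374"},
    {"proto=tcp", "dst_port=27374"},
]
result = adam_online_step(WINDOW, NORMAL_PROFILE, classifier=lambda itemset: "unknown attack")
print(sorted(sorted(itemset) for itemset in result))
```

The stub classifier always answers "unknown attack"; in the scheme described on the slide it would be a trained classifier that also returns "known attack" or "false alarm".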
41
ADAM Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
1st phase: train the classifier
- Offline process
- Takes place only once, before the main experiment
2nd phase: use the trained classifier
- The trained classifier is then used to detect anomalies
- Online process
43
The MINDS Project
MINDS: MINnesota INtrusion Detection System
- Learning from rare class: building rare class prediction models
- Anomaly/outlier detection
- Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}
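The two rules can be checked against the five transactions by computing support and confidence. The sketch below uses the standard definitions (support is the fraction of transactions containing all items of the rule; confidence is that count divided by the count of transactions containing the left-hand side); the function names are illustrative.

```python
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset, transactions=TRANSACTIONS):
    """Number of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions)


def support(lhs, rhs, transactions=TRANSACTIONS):
    """Fraction of all transactions containing both sides of the rule."""
    return count(lhs | rhs, transactions) / len(transactions)


def confidence(lhs, rhs, transactions=TRANSACTIONS):
    """Among transactions containing lhs, the fraction that also contain rhs."""
    return count(lhs | rhs, transactions) / count(lhs, transactions)


for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    print(sorted(lhs), "-->", sorted(rhs), support(lhs, rhs), confidence(lhs, rhs))
```

Milk appears in four transactions and three of them also contain Coke, so the first rule has support 3/5 and confidence 3/4; the second has support 2/5 and confidence 2/3.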
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
- Standard data mining models are not suitable for rare classes
- Models must be able to handle skewed class distributions
- Learning from data streams: intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior
- Identify normal behavior
- Construct a useful set of features
- Define a similarity function
- Use an outlier detection algorithm
  - Nearest neighbor approach
  - Density based schemes
  - Unsupervised Support Vector Machines (SVM)
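Of the outlier detection options listed, the nearest neighbor approach is the simplest to sketch: score each point by the distance to its k-th nearest neighbor, so isolated points get high scores. This illustrates the general idea only and is not claimed to be the algorithm MINDS uses; Euclidean distance as the similarity function, k=2 and the sample points are assumptions.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by its distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores


# Invented feature vectors: four close together and one far away.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_outlier_scores(POINTS)
print(scores.index(max(scores)))
```

Connections can then be ranked by score, with the highest-scoring ones examined first.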
46
Experimental Evaluation
[Figure: MINDS evaluation pipeline. Network, then net-flow data collected using CISCO routers, then data preprocessing, then MINDS anomaly detection, then anomaly scores, then association pattern analysis]
Figure annotations:
- Snort: open source signature-based network IDS (www.snort.org)
- 10-minute cycle, 2 million connections
- Anomaly detection is applied 4 times a day, on a 10-minute time window
Data sets:
- Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
- Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. The anomaly detection system outputs ranked connections labeled attack or normal; a discriminating association pattern generator turns them into rules (R1: TCP, DstPort=1863 --> Attack ... R100: TCP, DstPort=80 --> Normal) that update the knowledge base]
Uses of the discovered patterns:
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120...180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120...180 (c1=177, c2=0)
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns... (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189...200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189...200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
......
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns... (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns... (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
- Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
- SNORT has static knowledge manually updated by human analysts
- MINDS anomaly detection algorithms are adaptive in nature
- MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
- Defining normal behavior
- Feature extraction
- Similarity functions
- Outlier detection
- Result summarization
- Detection of attacks originating from multiple sites
[Figure labels: worm/virus detection after infection; insider attack (policy violation); outsider attack (network intrusion)]
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
A distance based outlier detection algorithm is used to detect deviating behavior in the collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (Decision Tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
For example, we envision a window where we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.
If a occurs in 5% of the windows and the joint event a and b occurs in 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window.
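The 80% figure can be reproduced by counting windows. The sketch below reads confidence as a conditional frequency: among windows in which the LHS episode occurs, the fraction in which the RHS events follow in order. That reading is the standard association-rule definition and is an assumption here, as are the function names and the invented windows.

```python
def contains_in_order(window, episode):
    """True if the episode's events occur in the window in the given order."""
    remaining = iter(window)
    return all(event in remaining for event in episode)


def fer_confidence(windows, lhs, rhs):
    """Fraction of windows containing lhs in which rhs follows lhs."""
    with_lhs = [w for w in windows if contains_in_order(w, lhs)]
    if not with_lhs:
        return 0.0
    joint = [w for w in with_lhs if contains_in_order(w, lhs + rhs)]
    return len(joint) / len(with_lhs)


# 100 invented windows: E occurs in 5 of them, and E followed by D, F in 4.
WINDOWS = [["E", "D", "F"]] * 4 + [["E", "X"]] + [["X"]] * 95
print(fer_confidence(WINDOWS, lhs=("E",), rhs=("D", "F")))
```

With these counts the joint frequency is 4/100 and the LHS frequency is 5/100, which gives the ratio 0.04/0.05.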
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp).
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)   (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
L1, L2, ..., Ln → R1, ..., Rm (c, s, window)   (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, ..., Ln the LHS episode and R1, ..., Rm the RHS of the episode rule
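Expression (2) maps naturally onto a small record type. The sketch below is one possible representation, not the CAIDS data structure; the class and field names are hypothetical, and the example instance is the smtp rule stated as expression (1).

```python
from dataclasses import dataclass
from typing import Tuple

# An event is a tuple of (attribute, value) pairs, e.g. (("service", "smtp"),).
Event = Tuple[Tuple[str, str], ...]


def format_event(event):
    return "(" + ", ".join(f"{name} = {value}" for name, value in event) + ")"


@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: Tuple[Event, ...]   # ordered events L1 ... Ln
    rhs: Tuple[Event, ...]   # ordered events R1 ... Rm
    confidence: float        # c
    support: float           # s
    window_sec: float        # window size in seconds

    def __str__(self):
        lhs = ", ".join(format_event(e) for e in self.lhs)
        rhs = ", ".join(format_event(e) for e in self.rhs)
        return (
            f"{lhs} -> {rhs} "
            f"({self.confidence:g}, {self.support:g}, {self.window_sec:g} sec)"
        )


SMTP = (("service", "smtp"),)
rule = FrequentEpisodeRule(
    lhs=((("service", "authentication"),),),
    rhs=(SMTP, SMTP),
    confidence=0.8,
    support=0.1,
    window_sec=2,
)
print(rule)
```

Keeping the LHS and RHS as ordered tuples preserves the event order that distinguishes an episode rule from an ordinary association rule.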
59
A cooperative anomaly and intrusion detection system (CAIDS)
[Figure: architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We have summarized those system models, their technologies and their validation methods
We hope to give an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
- DARPA 1998 data set. A cleansed set in KDD Cup '99. DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html
- Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
- Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
- W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000
- Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004
- Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration, IRI-2005, IEEE International Conference, 15-17 Aug. 2005, pages 512-517
- Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
20
Data Mining for Intrusion Detection: Techniques and Applications
- Frequent pattern mining
- Classification
- Clustering
- Mining data streams
21
Frequent pattern mining
Patterns that occur frequently in a database
Mining frequent patterns: finding regularities
Process of mining frequent patterns for intrusion detection
- Phase I: mine a repository of normal frequent itemsets for attack-free data
- Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
22
Frequent pattern mining
Apriori
- Any subset of a frequent itemset must also be frequent: an anti-monotone property
  - A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  - If {beer, diaper, nuts} is frequent, {beer, diaper} must also be frequent
- No superset of any infrequent itemset should be generated or tested
  - Many item combinations can be pruned
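A compact level-wise sketch shows where the pruning happens: a candidate of size k is counted only if all of its (k-1)-subsets survived the previous level. The function name, the absolute count threshold and the sample transactions are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset in at least min_count transactions."""
    items = {item for t in transactions for item in t}
    level = {frozenset([item]) for item in items}
    frequent = {}
    k = 1
    while level:
        counts = {c: sum(c <= t for t in transactions) for c in level}
        survivors = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(survivors)
        k += 1
        # Join step: unions of surviving itemsets that have exactly k items.
        candidates = {a | b for a in survivors for b in survivors if len(a | b) == k}
        # Prune step: no superset of an infrequent itemset is generated or tested.
        level = {
            c for c in candidates
            if all(frozenset(s) in survivors for s in combinations(c, k - 1))
        }
    return frequent


# Invented transactions built around the beer, diaper, nuts example.
BASKETS = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts"},
    {"milk"},
]
frequent = apriori(BASKETS, min_count=2)
print(len(frequent))
```

Because milk is infrequent on its own, no itemset containing milk is ever counted.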
23
Sequential Pattern Analysis
Models sequence patterns
- (Temporal) order is important in many situations
- Time-series databases and sequence databases
- Frequent patterns → (frequent) sequential patterns
Sequential patterns for intrusion detection
- Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences, find the complete set of frequent subsequences
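A brute-force sketch of the task: grow candidate patterns one item at a time and keep those that occur, in order, in enough sequences. It is bounded by a max_len parameter, so it finds the complete set only up to that length; the names, the threshold and the sample packet sequences are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the pattern's items occur in the sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def frequent_subsequences(sequences, min_count, max_len=3):
    """Return {pattern: count} for patterns of up to max_len items."""
    alphabet = sorted({item for s in sequences for item in s})
    frequent = {}
    level = [(item,) for item in alphabet]
    for _ in range(max_len):
        survivors = []
        for pattern in level:
            n = sum(is_subsequence(pattern, s) for s in sequences)
            if n >= min_count:
                frequent[pattern] = n
                survivors.append(pattern)
        # Extend only frequent patterns: an infrequent prefix cannot become frequent.
        level = [p + (item,) for p in survivors for item in alphabet]
    return frequent


# Invented sequences of TCP flags seen on three connections.
SEQUENCES = [["syn", "syn", "rst"], ["syn", "rst"], ["syn", "ack", "fin"]]
print(frequent_subsequences(SEQUENCES, min_count=2))
```

Dedicated sequential pattern mining algorithms avoid this exhaustive extension, but the pruning idea is the same.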
25
Apriori Property in Sequences
26
Classification: A Two-Step Process
Model construction: describe a set of predetermined classes
- Training dataset: tuples for model construction
  - Each tuple/sample belongs to a predefined class
- Classification rules, decision trees, or math formulae
Model application: classify unseen objects
- Estimate accuracy of the model using an independent test set
- Acceptable accuracy → apply the model to classify data tuples with unknown class labels
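The two steps can be sketched with a deliberately simple model. A nearest-centroid classifier is swapped in here purely for brevity (the slides name rules, decision trees and formulae); the feature vectors and labels are invented.

```python
import math
from collections import defaultdict


def train(training_set):
    """Step 1, model construction: one centroid per predefined class."""
    groups = defaultdict(list)
    for features, label in training_set:
        groups[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in groups.items()
    }


def predict(model, features):
    """Assign the class whose centroid is closest to the feature vector."""
    return min(model, key=lambda label: math.dist(features, model[label]))


def accuracy(model, test_set):
    """Step 2, model application: estimate accuracy on an independent test set."""
    return sum(predict(model, f) == label for f, label in test_set) / len(test_set)


# Invented two-feature connection records.
TRAINING = [
    ((1.0, 2.0), "normal"),
    ((1.2, 1.8), "normal"),
    ((9.0, 8.5), "intrusive"),
    ((8.5, 9.5), "intrusive"),
]
TEST = [((1.1, 2.2), "normal"), ((9.2, 9.0), "intrusive")]

model = train(TRAINING)
print(accuracy(model, TEST))
```

Only if the estimated accuracy is acceptable would the model be applied to tuples whose class labels are unknown.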
27
Classification
21
Patterns that occur frequently in a database
Mining Frequent patterns ndash finding regularities
Process of Mining Frequent patterns for intrusion de
tection Phase I mine a repository of normal frequent itemsets for a
ttack-free data
Phase II find frequent itemsets in the last n connections an
d compare the patterns to the normal profile
Frequent pattern mining
22
Frequent pattern mining
Apriori bull Any subset of a frequent itemset must be also freque
nt mdash an anti-monotone propertyndash A transaction containing beer diaper nuts alsocontains beer diaperndash beer diaper nuts is frequent beer diaper mustalso be frequent
bull No superset of any infrequent itemset should be generated or tested
ndash Many item combinations can be pruned
23
Sequential Pattern Analysis
Models sequence patterns (Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns (frequent) sequential patterns
Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification A Two-Step Process Model construction describe a set of predetermined
classes Training dataset tuples for model construction
Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae
Model application classify unseen objects Estimate accuracy of the model using an independent test
set Acceptable accuracy apply the model to classify data tu
ples with unknown class labels
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What is a good clustering?
- High intra-class similarity and low inter-class similarity
- Depending on the similarity measure
- The ability to discover some or all of the hidden patterns
32
Clustering
Clustering approaches
Partitioning algorithms
- Partition the objects into k clusters
- Iteratively reallocate objects to improve the clustering
Hierarchy algorithms
- Agglomerative: each object is a cluster; merge clusters to form larger ones
- Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
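Since the original figure for this slide did not survive, a small k-means sketch may help. It assumes Euclidean distance and seeds the centroids with the first k points to stay deterministic (real implementations usually pick seeds randomly); the 2-D points are invented for illustration.

```python
import math

def kmeans(points, k, iterations=10):
    """Partition points into k clusters by iterative reallocation."""
    centroids = [points[i] for i in range(k)]  # assumption: first k points as seeds
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
```

On this data the two groups separate after the first pass and the centroids stop moving.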
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities
- The profiles of normal activities may drift
Identifying novel attacks
- Identifying clusters and outliers in traffic data streams
Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detection
- Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
- These models can be more sophisticated and precise than manually created signatures
- Recent research, e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against known intrusion patterns; a match signals an intrusion]
Can’t detect new attacks
Example: if (src_ip == dst_ip) then “land attack”
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, and I/O utilization; file system activity; modification of system files; permission modifications
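The land-attack example can be turned into a tiny signature matcher. This is only a sketch of the pattern-matching idea: the packet is a plain dictionary, and the names `SIGNATURES` and `match_signatures` are assumptions, not part of any real IDS.

```python
# Each signature is a (name, predicate) pair over a packet dictionary.
SIGNATURES = [
    ("land attack", lambda p: p["src_ip"] == p["dst_ip"]),
]

def match_signatures(packet, signatures=SIGNATURES):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, predicate in signatures if predicate(packet)]

suspicious = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"})
benign = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"})
```

A packet that matches no entry is silently passed, which is why this style of detector cannot flag attacks it has no signature for.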
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.
The classifiers build the signatures of attacks; thus, data mining in JAM builds a misuse detection model.
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.
The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data.
38
Data Mining for Intrusion Detection
Anomaly detection
- Identifies anomalies as deviations from “normal” behavior
- E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (e.g. CPU, process size) on a 0-90 scale, comparing the normal profile with abnormal values that indicate a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
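One simple way to baseline a measure and flag deviations is a mean and standard deviation profile. The sketch below is an assumption chosen for illustration (the slides do not say which statistic is used); the CPU samples and the three-sigma threshold are hypothetical.

```python
import statistics

def build_profile(samples):
    """Baseline: mean and sample standard deviation of a normal-period measure."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, profile, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean, stdev = profile
    return abs(value - mean) > threshold * stdev

# Hypothetical CPU utilization (%) observed during normal operation.
cpu_baseline = [20, 22, 19, 21, 23, 20, 22, 21]
profile = build_profile(cpu_baseline)
spike_flagged = is_anomalous(85, profile)
usual_flagged = is_anomalous(22, profile)
```

A legitimate new workload that pushes CPU to 85% would be flagged just the same, which illustrates the false positive problem noted above.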
40
ADAM: Audit Data Analysis and Mining
Detecting intrusions by data mining: a combination of association rules and classification rules
First, ADAM collects known frequent datasets (an off-line algorithm)
Second, ADAM runs an online algorithm
- Finds the latest frequent connection records
- Compares them with the known mined data
- Discards those that seem to be normal
- Forwards suspicious ones to the classifier
The trained classifier then classifies the suspicious data as one of the following
- Known type of attack
- Unknown type of attack
- False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
1st phase: train the classifier
- Offline process
- Takes place only once, before the main experiment
2nd phase: use the trained classifier
- The trained classifier is then used to detect anomalies
- Online process
43
The MINDS Project
MINDS - MINnesota INtrusion Detection System
Learning from Rare Class - Building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
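The two discovered rules can be checked directly against the five transactions in the table. The sketch below uses the standard definitions of support and confidence; the helper names are assumptions.

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}

def count(itemset):
    """Number of transactions that contain every item in the itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)

def support(lhs, rhs):
    """Fraction of all transactions that contain both sides of the rule."""
    return count(lhs | rhs) / len(transactions)

def confidence(lhs, rhs):
    """Fraction of transactions containing lhs that also contain rhs."""
    return count(lhs | rhs) / count(lhs)

milk_coke = (support({"Milk"}, {"Coke"}), confidence({"Milk"}, {"Coke"}))
diaper_milk_beer = (
    support({"Diaper", "Milk"}, {"Beer"}),
    confidence({"Diaper", "Milk"}, {"Beer"}),
)
```

Milk appears in four transactions and three of them also contain Coke, so {Milk} --> {Coke} has support 3/5 and confidence 3/4; {Diaper, Milk} --> {Beer} has support 2/5 and confidence 2/3.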
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous, behavior
- Identify normal behavior
- Construct a useful set of features
- Define a similarity function
- Use an outlier detection algorithm
  - Nearest neighbor approach
  - Density based schemes
  - Unsupervised Support Vector Machines (SVM)
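As an illustration of the nearest neighbor approach, the sketch below scores each point by the distance to its k-th nearest neighbor, so isolated points receive high scores. This is a generic textbook scheme and not necessarily the exact MINDS algorithm; the feature vectors are invented and Euclidean distance stands in for the similarity function.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical 2-D connection features: four similar points and one outlier.
features = [(1, 1), (1, 2), (2, 1), (2, 2), (10, 10)]
scores = knn_outlier_scores(features, k=2)
most_anomalous = features[scores.index(max(scores))]
```

Ranking connections by such a score yields the list of top anomalies that an analyst would then inspect.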
46
Experimental Evaluation
[Figure: MINDS evaluation setup. Labels: network; net-flow data using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; open source signature-based network IDS (www.snort.org). Annotations: 10-minute cycle, 2 million connections; anomaly detection is applied 4 times a day; 10-minute time window]
• Publicly available data set: the DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Labels: anomaly detection system; ranked connections (attack, normal); discriminating association pattern generator; rules R1: TCP, DstPort=1863, Attack ... R100: TCP, DstPort=80, Normal; update; knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
- Defining normal behavior
- Feature extraction
- Similarity functions
- Outlier detection
- Result summarization
- Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack: policy violation
Outsider attack: network intrusion
54
IDS Using Both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall and intrusion detection technologies, and security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
For example, we envision a window where we observe a 3-event sequence: E, D, and F
An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If b occurs with 5% probability and the joint event a and b has a 4% probability to occur, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)
The events D, F may be two sequential smtp requests denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec
The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated
This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)    (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window)    (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
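A rule of this form maps naturally onto a small record type. The sketch below is one possible representation, not the CAIDS implementation: the class name, field names, and the greedy in-window matching in `matches` are assumptions, and events are reduced to bare service names with timestamps in seconds.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: Tuple[str, ...]   # ordered LHS events L1..Ln
    rhs: Tuple[str, ...]   # ordered RHS events R1..Rm
    confidence: float      # c
    support: float         # s
    window_sec: float      # window

    def matches(self, events: List[Tuple[float, str]]) -> bool:
        """True if LHS then RHS occur in order within one window.

        `events` is a time-sorted list of (timestamp_sec, event_name).
        """
        pattern = self.lhs + self.rhs
        for i, (start, name) in enumerate(events):
            if name != pattern[0]:
                continue
            pos = 1
            for t, later in events[i + 1:]:
                if t - start > self.window_sec:
                    break  # past the end of the window that began at `start`
                if pos < len(pattern) and later == pattern[pos]:
                    pos += 1
            if pos == len(pattern):
                return True
        return False

# The example rule: authentication -> smtp, smtp (0.8, 0.1, 2 sec).
rule = FrequentEpisodeRule(("authentication",), ("smtp", "smtp"), 0.8, 0.1, 2.0)
in_window = rule.matches([(0.0, "authentication"), (0.5, "smtp"), (1.2, "smtp")])
too_late = rule.matches([(0.0, "authentication"), (0.5, "smtp"), (3.0, "smtp")])
```

Keeping c, s, and the window on the rule itself mirrors the (c, s, window) triple in the expression above.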
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report, we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We have summarized those system models, their technologies, and their validation methods
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set in KDD Cup ’99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration (IRI 2005), IEEE International Conference on, 15-17 Aug. 2005, pp. 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
23
Sequential Pattern Analysis
Models sequence patterns (Temporal) order is important in many situations
Time-series databases and sequence databases
Frequent patterns (frequent) sequential patterns
Sequential patterns for intrusion detection Capture the signatures for attacks in a series of packets
24
Sequential Pattern Mining
Given a set of sequences find the complete set of frequent subsequences
25
Apriori Property in Sequences
26
Classification A Two-Step Process Model construction describe a set of predetermined
classes Training dataset tuples for model construction
Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae
Model application classify unseen objects Estimate accuracy of the model using an independent test
set Acceptable accuracy apply the model to classify data tu
ples with unknown class labels
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering?
High intra-class similarity and low inter-class similarity
Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches
Partitioning algorithms
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering
Hierarchy algorithms
– Agglomerative: each object is a cluster, merge clusters to form larger ones
– Divisive: all objects are in a cluster, split it up into smaller clusters
33
Clustering
K-Means Example
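The figure for this slide did not survive extraction. As a stand-in, here is a minimal k-means sketch in plain Python, assuming two-dimensional points, squared Euclidean distance, and hand-picked initial centroids. The data and function name are illustrative assumptions.

```python
def kmeans(points, centroids, iterations=10):
    """Partition points into len(centroids) clusters by iterative reallocation."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
```

The result depends on the initial centroids; real uses typically try several initializations.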
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities
The profiles of normal activities may drift
Identifying novel attacks
Identifying clusters and outliers in traffic data streams
Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detection
Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
These models can be more sophisticated and precise than manually created signatures
Recent research, e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against known intrusion patterns; a match signals an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
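The land-attack example can be written as a small signature table. This is a sketch assuming packets are dictionaries with `src_ip` and `dst_ip` keys; the table layout and function name are hypothetical.

```python
# Each signature is a (name, predicate) pair; only the land-attack rule comes from the slide.
SIGNATURES = [
    ("land attack", lambda pkt: pkt["src_ip"] == pkt["dst_ip"]),
]

def match_signatures(pkt):
    """Return the names of all known signatures the packet matches."""
    return [name for name, rule in SIGNATURES if rule(pkt)]

alerts = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.5"})
clean = match_signatures({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9"})
```

A packet that matches no entry produces no alert, which is exactly why this approach cannot detect new attacks.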
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.
The classifiers build the signatures of attacks, so data mining in JAM builds a misuse detection model.
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.
The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data.
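To illustrate the idea of combining several base classifiers, here is a sketch that uses a plain majority vote as a simple stand-in. This is an assumption: the slides do not describe JAM's actual combiner, and a real meta-learner is itself trained on the base classifiers' predictions. The three rules and the record fields are hypothetical.

```python
from collections import Counter

# Hypothetical base classifiers, each a hand-written rule over one field.
def rule_logins(rec):
    return "intrusive" if rec["failed_logins"] > 3 else "normal"

def rule_port(rec):
    return "intrusive" if rec["dst_port"] in {27374, 12345} else "normal"

def rule_volume(rec):
    return "intrusive" if rec["bytes"] > 10_000 else "normal"

def meta_classify(record, base_classifiers):
    """Combine the base predictions by majority vote."""
    votes = Counter(clf(record) for clf in base_classifiers)
    return votes.most_common(1)[0][0]

base = [rule_logins, rule_port, rule_volume]
first = meta_classify({"failed_logins": 5, "dst_port": 27374, "bytes": 200}, base)
second = meta_classify({"failed_logins": 0, "dst_port": 80, "bytes": 50_000}, base)
```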
38
Data Mining for Intrusion Detection
Anomaly detection
Identifies anomalies as deviations from "normal" behavior
E.g. ADAM: Audit Data Analysis and Mining; MINDS – MINnesota INtrusion Detection System
39
Anomaly Detection
[Figure: bar chart of activity measures (CPU, process size; scale 0 to 90) comparing the normal profile with abnormal values that indicate a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
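A minimal sketch of "baseline, then look for deviations", assuming a single activity measure (CPU utilization in percent) and a three-standard-deviation cutoff. The cutoff, the sample values, and the function names are assumptions.

```python
from statistics import mean, stdev

def build_profile(samples):
    """Baseline step: summarize normal activity by its mean and standard deviation."""
    return mean(samples), stdev(samples)

def is_anomalous(value, profile, k=3.0):
    """Flag values more than k standard deviations away from the baseline mean."""
    mu, sigma = profile
    return abs(value - mu) > k * sigma

# Hypothetical CPU utilization samples recorded during normal operation.
cpu_profile = build_profile([20, 22, 19, 21, 23, 20, 18, 22])
spike = is_anomalous(85, cpu_profile)
typical = is_anomalous(21, cpu_profile)
```

A new but legitimate workload would also exceed the cutoff, which is the false-positive problem noted on the slide.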
40
ADAM: Audit Data Analysis and Mining
Detecting Intrusion by Data Mining
Combination of Association Rule and Classification Rule
Firstly, ADAM collects known frequent datasets: an off-line algorithm
Secondly, ADAM runs an online algorithm
Finds last frequent connection records
Compares them with known mined data
Discards those which seem to be normal
Suspicious ones are forwarded to the classifier
Trained classifier then classifies the suspicious data as one of the following:
Known type of attack
Unknown type of attack
False alarm
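The offline/online flow above can be sketched as follows. This is a simplification under stated assumptions: connections are reduced to (destination port, flag) pairs, "frequent" means a relative frequency above a fixed support, and the trained classifier is replaced by a hypothetical lookup table that does not separate unknown attacks from false alarms.

```python
from collections import Counter

def frequent_items(records, min_support):
    """Return the records whose relative frequency is at least min_support."""
    counts = Counter(records)
    return {item for item, n in counts.items() if n / len(records) >= min_support}

# Offline step: mine attack-free training data for the known-normal profile.
training = [(80, "SF")] * 6 + [(25, "SF")] * 4
normal_profile = frequent_items(training, min_support=0.2)

# Online step: mine the latest window and discard what the profile already explains.
window = [(80, "SF")] * 5 + [(27374, "S0")] * 4 + [(22, "SF")]
suspicious = frequent_items(window, min_support=0.2) - normal_profile

# Hypothetical stand-in for the trained classifier.
KNOWN_ATTACKS = {(27374, "S0"): "known attack"}

def classify(item):
    return KNOWN_ATTACKS.get(item, "unknown attack or false alarm")

verdicts = {item: classify(item) for item in suspicious}
```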
41
ADAM: Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
1st Phase: Train the classifier
Offline process
Takes place only once, before the main experiment
2nd Phase: Using the trained classifier
Trained classifier is then used to detect anomalies
Online process
43
The MINDS Project
MINDS – MINnesota INtrusion Detection System
Learning from Rare Class – Building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules Discovered: {Milk} --> {Coke}, {Diaper, Milk} --> {Beer}
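For the five transactions above, the support and confidence of the two discovered rules can be checked directly. A minimal sketch; the helper names are assumptions.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def count(itemset):
    """Number of transactions containing every item of itemset."""
    return sum(itemset <= t for t in transactions)

def support(lhs, rhs):
    """Fraction of all transactions containing both sides of the rule."""
    return count(lhs | rhs) / len(transactions)

def confidence(lhs, rhs):
    """Among transactions containing lhs, the fraction that also contain rhs."""
    return count(lhs | rhs) / count(lhs)

milk_coke = (support({"Milk"}, {"Coke"}), confidence({"Milk"}, {"Coke"}))
diaper_milk_beer = (
    support({"Diaper", "Milk"}, {"Beer"}),
    confidence({"Diaper", "Milk"}, {"Beer"}),
)
```

{Milk} --> {Coke} holds in 3 of the 4 transactions containing Milk; {Diaper, Milk} --> {Beer} holds in 2 of the 3 transactions containing both Diaper and Milk.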
44
MINDS - Learning from Rare Class
Problem: Building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior
Identify normal behavior
Construct useful set of features
Define similarity function
Use outlier detection algorithm
– Nearest neighbor approach
– Density based schemes
– Unsupervised Support Vector Machines (SVM)
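A sketch of the nearest neighbor approach, assuming Euclidean distance as the similarity function and the distance to the k-th nearest neighbor as the outlier score. The feature vectors and the choice of k are illustrative; this is not the MINDS implementation.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor (larger = more anomalous)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical two-feature connection records; the last one is far from the rest.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (10, 10)]
scores = knn_outlier_scores(points, k=2)
most_anomalous = scores.index(max(scores))
```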
46
Experimental Evaluation
[Figure: MINDS evaluation pipeline. Network → net-flow data collected using Cisco routers → data preprocessing → MINDS anomaly detection → anomaly scores → association pattern analysis. Labels: 10-minute cycle, 2 million connections; anomaly detection is applied 4 times a day on a 10-minute time window. Also shown: Snort (www.snort.org), an open source signature-based network IDS.]
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment
• Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. The anomaly detection system produces ranked connections, labeled attack or normal. A discriminating association pattern generator derives rules from them (R1: TCP, DstPort=1863 → Attack; …; R100: TCP, DstPort=80 → Normal), which update the knowledge base.]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack: policy violation
Outsider attack: network intrusion
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall and intrusion detection technologies, and security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse and anomaly detection
A distance-based outlier detection algorithm is used to detect deviating behavior among collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
For example, consider a window in which we observe a 3-event sequence: E, D, and F
An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If a occurs with probability 5% and the joint event a and b occurs with probability 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window
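The 80% figure can be reproduced by counting windows, using the standard definition of rule confidence, freq(a ∪ b)/freq(a): among windows containing the LHS event, the fraction in which the RHS events follow. The sketch assumes each window is a tuple of event names; the data is made up to match the 5% and 4% frequencies.

```python
def contains_in_order(window, pattern):
    """True if the events of pattern occur in window in the given order."""
    it = iter(window)
    return all(event in it for event in pattern)

def fer_confidence(windows, lhs, rhs):
    """Confidence of lhs -> rhs: windows with lhs followed by rhs, over windows with lhs."""
    n_lhs = sum(contains_in_order(w, lhs) for w in windows)
    n_joint = sum(contains_in_order(w, lhs + rhs) for w in windows)
    return n_joint / n_lhs

# 100 hypothetical windows: E occurs in 5 of them, E followed by D and F in 4.
windows = [("E", "D", "F")] * 4 + [("E", "X")] + [("X",)] * 95
c = fer_confidence(windows, lhs=("E",), rhs=("D", "F"))
```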
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)
The events D, F may be two sequential smtp requests, denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec
The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated
This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
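One way to hold a rule of the form (2) in code is a small record type with the LHS episode, the RHS episode, and the (c, s, window) triple. This is a sketch with assumed names; events are simplified to service names, whereas CAIDS events carry several attributes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FER:
    lhs: tuple          # ordered LHS events L1..Ln
    rhs: tuple          # ordered RHS events R1..Rm
    confidence: float   # c
    support: float      # s
    window: float       # window length in seconds

    def holds_in(self, events):
        """events: (timestamp, event) pairs sorted by time.
        True if the LHS followed by the RHS occurs in order within one window."""
        pattern = self.lhs + self.rhs
        for start, (t0, event) in enumerate(events):
            if event != pattern[0]:
                continue
            pos = 1
            for t, e in events[start + 1:]:
                if t - t0 > self.window:
                    break
                if pos < len(pattern) and e == pattern[pos]:
                    pos += 1
            if pos == len(pattern):
                return True
        return False

# Rule (1) from the earlier slide: authentication -> smtp, smtp (0.8, 0.1, 2 sec).
rule = FER(("authentication",), ("smtp", "smtp"), 0.8, 0.1, 2.0)
within = rule.holds_in([(0.0, "authentication"), (0.5, "smtp"), (1.2, "smtp")])
too_late = rule.holds_in([(0.0, "authentication"), (0.5, "smtp"), (3.0, "smtp")])
```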
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We summarized those system models, their technologies, and their validation methods
We hope to give an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set in KDDCup'99; DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu, "ADAM: Detecting Intrusions by Data Mining", Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M., 2006, "A Hybrid Network Intrusion Detection Technique Using Random Forests", In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al., "A data mining framework for building intrusion detection models", In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al., "MINDS - Minnesota Intrusion Detection System", Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A.P., Manalwar, P., "Exploiting efficient data mining techniques to enhance intrusion detection systems", IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category), In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
25
Apriori Property in Sequences
26
Classification A Two-Step Process Model construction describe a set of predetermined
classes Training dataset tuples for model construction
Each tuplesample belongs to a predefined class Classification rules decision trees or math formulae
Model application classify unseen objects Estimate accuracy of the model using an independent test
set Acceptable accuracy apply the model to classify data tu
ples with unknown class labels
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering High intra-class similarity and low interclasssimilar
ity Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches Partitioning algorithms
ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin
g Hierarchy algorithms
ndash Agglomerative each object is a cluster merge clusters to form larger ones
ndash Divisive all objects are in a cluster split it up into smaller clusters
33
Clustering
K-Means Example
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities The profiles of normal activities may drift
Identifying novel attacks Identifying clusters and outliers in traffic data
streams Reduce the future alarm load by writing
filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detectionPredictive models are built from labeled data sets (instances
are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually
created signatures Recent research eg JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against known intrusion patterns; a match is reported as an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
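The land attack example can be written as a small signature table. This is only a sketch: the packet record layout (a dict with src_ip and dst_ip keys) and the SIGNATURES table are assumptions, not the format of any particular IDS.

```python
def is_land_attack(packet):
    """Signature from the slide: source and destination address are identical."""
    return packet["src_ip"] == packet["dst_ip"]


# Hypothetical signature table: name -> predicate over a packet record.
SIGNATURES = {"land attack": is_land_attack}


def match_signatures(packet):
    """Return the names of all known signatures that the packet matches."""
    return [name for name, rule in SIGNATURES.items() if rule(packet)]


print(match_signatures({"src_ip": "192.0.2.1", "dst_ip": "192.0.2.1"}))
```

Because detection is a lookup against known patterns, traffic that matches no entry in the table passes unnoticed, which is the "can't detect new attacks" limitation.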
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signatures of attacks, so data mining in JAM builds a misuse detection model
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data
38
Data Mining for Intrusion Detection
Anomaly detection
Identifies anomalies as deviations from "normal" behavior
E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (e.g. CPU, process size) comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
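One simple way to baseline a single activity measure is to keep its mean and standard deviation and flag values that fall far outside them. The sketch below assumes this z-score style test, a threshold of 3 standard deviations, and made-up CPU readings; none of these come from the slides.

```python
import statistics


def build_profile(samples):
    """Normal profile: mean and sample standard deviation of one activity measure."""
    return statistics.mean(samples), statistics.stdev(samples)


def is_anomalous(value, profile, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean."""
    mean, std = profile
    return abs(value - mean) > threshold * std


cpu_baseline = [20, 22, 19, 21, 20, 23, 18, 21]  # made-up CPU utilization (%)
profile = build_profile(cpu_baseline)
print(is_anomalous(85, profile), is_anomalous(22, profile))
```

A legitimate new workload that pushes CPU usage up would also be flagged, which illustrates why the false positive rate tends to be high.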
40
ADAM Audit Data Analysis and Mining
Detecting Intrusion by Data Mining
Combination of association rules and classification rules
Firstly, ADAM collects known frequent datasets using an off-line algorithm
Secondly, ADAM runs an online algorithm
- Finds the most recent frequent connection records
- Compares them with the known mined data
- Discards those which seem to be normal
- Suspicious ones are forwarded to the classifier
The trained classifier then classifies the suspicious data as one of the following
- Known type of attack
- Unknown type of attack
- False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM has two phases in its model
1st phase: train the classifier
- Offline process
- Takes place only once, before the main experiment
2nd phase: use the trained classifier
- The trained classifier is then used to detect anomalies
- Online process
43
The MINDS Project
MINDS - MINnesota INtrusion Detection System
Learning from Rare Class - Building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID | Items
1 | Bread, Coke, Milk
2 | Beer, Bread
3 | Beer, Coke, Diaper, Milk
4 | Beer, Bread, Diaper, Milk
5 | Coke, Diaper, Milk
Rules Discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
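The support and confidence of the two discovered rules can be checked directly against the five transactions. This sketch reads the rules as {Milk} --> {Coke} and {Diaper, Milk} --> {Beer}, and uses the usual definitions: support is the fraction of transactions containing every item of the rule, and confidence is that count divided by the count of transactions containing the left-hand side.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions)


def support(lhs, rhs):
    return count(lhs | rhs) / len(transactions)


def confidence(lhs, rhs):
    return count(lhs | rhs) / count(lhs)


for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    print(sorted(lhs), "-->", sorted(rhs),
          "support", support(lhs, rhs),
          "confidence", round(confidence(lhs, rhs), 2))
```

Under this reading, {Milk} --> {Coke} holds in 3 of the 4 transactions that contain Milk, and {Diaper, Milk} --> {Beer} in 2 of the 3 that contain both Diaper and Milk.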
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
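A small numeric illustration of why skewed class distributions are a problem. The data are made up (a 1% attack rate), and the "model" is a trivial one that labels every connection as normal.

```python
labels = ["attack"] * 10 + ["normal"] * 990  # made-up data, 1% attacks
predictions = ["normal"] * 1000              # trivial "always normal" model

# Accuracy: fraction of all connections labeled correctly.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall on the rare class: fraction of attacks that were actually detected.
detected = sum(p == "attack" and y == "attack" for p, y in zip(predictions, labels))
recall = detected / labels.count("attack")

print(accuracy, recall)
```

The trivial model is 99% accurate and detects no attacks at all, so accuracy alone is not a useful measure for rare classes.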
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous, behavior
Identify normal behavior
Construct useful set of features
Define similarity function
Use outlier detection algorithm
- Nearest neighbor approach
- Density based schemes
- Unsupervised Support Vector Machines (SVM)
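A minimal sketch of the nearest neighbor approach: score each connection by the distance to its k-th nearest neighbor, so that points far from all others get high scores. Euclidean distance as the similarity function, k = 2, and the two made-up features per connection are assumptions for illustration; this is not the scoring used by MINDS itself.

```python
import math


def knn_outlier_score(point, others, k=2):
    """Distance from `point` to its k-th nearest neighbor among `others`."""
    dists = sorted(math.dist(point, o) for o in others)
    return dists[k - 1]


def score_all(points, k=2):
    """Score every point against all the other points."""
    return [
        knn_outlier_score(p, points[:i] + points[i + 1:], k)
        for i, p in enumerate(points)
    ]


# Made-up features per connection: (number of packets, kilobytes transferred).
connections = [(3, 1.2), (3, 1.3), (4, 1.2), (3, 1.1), (40, 9.0)]
scores = score_all(connections)
print(scores.index(max(scores)))  # index of the most anomalous connection
```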
46
Experimental Evaluation
[Figure: MINDS evaluation pipeline: network, net-flow data collected using Cisco routers, data preprocessing, MINDS anomaly detection, anomaly scores, association pattern analysis. Also shown: Snort, the open source signature-based network IDS (www.snort.org)]
10-minute cycle, 2 million connections
Anomaly detection is applied 4 times a day with a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment
• Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. The anomaly detection system outputs ranked connections labeled attack or normal; a discriminating association pattern generator produces rules (R1: TCP, DstPort=1863 → Attack; …; R100: TCP, DstPort=80 → Normal) that update the knowledge base]
1 Build normal profile
2 Study changes in normal behavior
3 Create attack summary
4 Detect misuse behavior
5 Understand nature of the attack
48
Discovered Real-life Association Patterns
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
- Defining normal behavior
- Feature extraction
- Similarity functions
- Outlier detection
- Result summarization
- Detection of attacks originating from multiple sites
- Worm/virus detection after infection
- Insider attack: policy violation
- Outsider attack: network intrusion
54
IDS Using Both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
A distance-based outlier detection algorithm is used to detect deviating behavior in the collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm, the decision tree
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
For example, we envision a window where we observe a 3-event sequence E, D and F. An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If a occurs with probability 5% and the joint event a and b occurs with probability 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window
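The arithmetic behind the 80% figure can be sketched with window counts. This assumes the usual definition of rule confidence (windows containing the whole episode divided by windows containing the LHS event E); the counts themselves are made up to match the 5% and 4% in the example.

```python
def fer_stats(joint_count, lhs_count, total):
    """Return (confidence, support) of an episode rule LHS -> RHS.

    joint_count: windows in which the LHS is followed by the RHS
    lhs_count:   windows in which the LHS occurs at all
    total:       all windows observed
    """
    confidence = joint_count / lhs_count
    support = joint_count / total
    return confidence, support


# Made-up counts: 100 windows, E occurs in 5 of them, E followed by D, F in 4.
c, s = fer_stats(joint_count=4, lhs_count=5, total=100)
print(c, s)
```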
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% of all the network connections being evaluated. This FER is formally stated as follows
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We have summarized those system models, their technologies and their validation methods
We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set in KDD Cup '99; DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
27
Classification
28
Classification Decision Tree
A node in the tree a test of some attribute A branch a possible value of the attribute Classification
Start at the root Test the attribute Move down the tree branch
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering High intra-class similarity and low interclasssimilar
ity Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches Partitioning algorithms
ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin
g Hierarchy algorithms
ndash Agglomerative each object is a cluster merge clusters to form larger ones
ndash Divisive all objects are in a cluster split it up into smaller clusters
33
Clustering
K-Means Example
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities The profiles of normal activities may drift
Identifying novel attacks Identifying clusters and outliers in traffic data
streams Reduce the future alarm load by writing
filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detectionPredictive models are built from labeled data sets (instances
are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually
created signatures Recent research eg JAM (Java Agents for Metalearning)
36
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model
Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt
rusion Detection System
39
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
40
ADAM: Audit Data Analysis and Mining
Detecting intrusions by data mining: a combination of association rules and classification rules
First, ADAM collects known frequent datasets (an off-line algorithm)
Second, ADAM runs an online algorithm:
– Finds the latest frequent connection records
– Compares them with the known mined data
– Discards those that seem to be normal
– Forwards the suspicious ones to the classifier
The trained classifier then classifies the suspicious data as one of the following:
– Known type of attack
– Unknown type of attack
– False alarm
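The online step described above (compare, discard, forward, classify) can be sketched as follows. This is not ADAM's actual implementation: the itemset encoding, `adam_online_step` and the rule-of-thumb `toy_classifier` are hypothetical stand-ins that only mirror the control flow.

```python
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple

Itemset = FrozenSet[str]


def adam_online_step(frequent_now: Iterable[Itemset],
                     normal_profile: Set[Itemset],
                     classify: Callable[[Itemset], str]) -> List[Tuple[Itemset, str]]:
    """Drop itemsets already known as normal; classify the remaining suspicious ones."""
    results = []
    for itemset in frequent_now:
        if itemset in normal_profile:
            continue  # seen in the off-line profile, so treated as normal
        results.append((itemset, classify(itemset)))
    return results


def toy_classifier(itemset: Itemset) -> str:
    """Hypothetical stand-in for the trained classifier."""
    if "DstPort=27374" in itemset:
        return "known attack"
    if "Flag=SYN" in itemset:
        return "unknown attack"
    return "false alarm"


normal = {frozenset({"DstPort=80", "Protocol=TCP"})}
window = [
    frozenset({"DstPort=80", "Protocol=TCP"}),
    frozenset({"DstPort=27374", "Flag=SYN"}),
    frozenset({"DstPort=53", "Protocol=UDP"}),
]
for itemset, label in adam_online_step(window, normal, toy_classifier):
    print(sorted(itemset), "->", label)
```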
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM's model has two phases:
1st phase: train the classifier. This is an offline process that takes place only once, before the main experiment.
2nd phase: use the trained classifier. The trained classifier is then used to detect anomalies. This is an online process.
43
The MINDS Project
MINDS – MINnesota INtrusion Detection System
Learning from Rare Class – Building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
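The two discovered rules can be checked against the five transactions by computing support and confidence. This is a minimal sketch; the helper names `count`, `support` and `confidence` are illustrative, and the measures follow the standard association-rule definitions.

```python
from typing import List, Set

# The five market-basket transactions from the table (TID 1 to 5).
transactions: List[Set[str]] = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset: Set[str]) -> int:
    """Number of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions)


def support(itemset: Set[str]) -> float:
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset) / len(transactions)


def confidence(lhs: Set[str], rhs: Set[str]) -> float:
    """Fraction of transactions containing `lhs` that also contain `rhs`."""
    return count(lhs | rhs) / count(lhs)


for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    print(sorted(lhs), "-->", sorted(rhs),
          f"support={support(lhs | rhs):.2f}",
          f"confidence={confidence(lhs, rhs):.2f}")
```

Milk appears in four transactions and together with Coke in three, so the first rule has confidence 3/4; Diaper and Milk appear together in three transactions, two of which also contain Beer, giving 2/3.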
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous, behavior
– Identify normal behavior
– Construct useful set of features
– Define similarity function
– Use outlier detection algorithm
  – Nearest neighbor approach
  – Density based schemes
  – Unsupervised Support Vector Machines (SVM)
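The nearest neighbor approach can be illustrated with a distance-to-k-th-neighbor score. This is a generic sketch, not the MINDS algorithm itself: Euclidean distance stands in for the similarity function, and the two-dimensional feature vectors and the choice k = 2 are made up for illustration.

```python
import math
from typing import List, Sequence


def knn_outlier_score(point: Sequence[float],
                      others: List[Sequence[float]],
                      k: int = 2) -> float:
    """Distance from `point` to its k-th nearest neighbor; larger means more isolated."""
    distances = sorted(math.dist(point, other) for other in others)
    return distances[k - 1]


# Hypothetical feature vectors for five connections.
connections = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]

for i, conn in enumerate(connections):
    rest = connections[:i] + connections[i + 1:]
    print(conn, round(knn_outlier_score(conn, rest), 2))
```

The four points near the origin get a score of 1.0, while the isolated point at (10, 10) scores far higher and would be ranked as the most anomalous.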
46
Experimental Evaluation
[Figure labels: network; net-flow data using CISCO routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis]
Snort: open source signature-based network IDS (www.snort.org)
10-minute cycle, 2 million connections
Anomaly detection is applied 4 times a day on a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Labels: anomaly detection system; ranked connections (attack / normal); discriminating association pattern generator; rules R1: TCP, DstPort=1863, Attack … R100: TCP, DstPort=80, Normal; update; knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine. Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
49
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.
Having an unauthorized application increases the vulnerability of the system.
Discovered Real-life Association Patterns
50
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm).
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.
Discovered Real-life Association Patterns… (ctd)
51
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.
Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world.
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).
Discovered Real-life Association Patterns… (ctd)
52
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863.
Further analysis reveals that the remote IP block is owned by Hotmail.
Flag=0 is unusual for TCP traffic.
Discovered Real-life Association Patterns… (ctd)
53
MINDS Conclusion
– Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
– SNORT has static knowledge, manually updated by human analysts
– MINDS anomaly detection algorithms are adaptive in nature
– MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
– Defining normal behavior
– Feature extraction
– Similarity functions
– Outlier detection
– Result summarization
– Detection of attacks originating from multiple sites
– Worm/virus detection after infection
– Insider attack: policy violation
– Outsider attack: network intrusion
54
IDS Using Both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.
The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China.
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.
A distance-based outlier detection algorithm is used to detect deviant behavior among the collected network data.
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.
This large set of data patterns is scanned using a data mining classification algorithm, the decision tree.
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator.
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
For example, we envision a window where we observe a 3-event sequence E, D and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.
If a occurs with 5% frequency and the joint event of a and b occurs with 4% frequency, there is a (0.04 / 0.05) = 80% chance that D and F will follow in the same window.
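The confidence arithmetic above can be checked in a few lines. This is a minimal sketch: the window counts (5 and 4 out of 100) are hypothetical values chosen only to reproduce the 5% and 4% frequencies, and integer counts are used so the ratio is not affected by floating-point rounding of 0.04 / 0.05.

```python
def rule_confidence(joint_count: int, condition_count: int) -> float:
    """Confidence = windows containing the whole episode / windows containing the condition."""
    if condition_count == 0:
        raise ValueError("the conditioning event never occurs")
    return joint_count / condition_count


condition_windows = 5  # 5% of 100 observed windows
joint_windows = 4      # 4% of 100 observed windows contain E, D and F together

confidence = rule_confidence(joint_windows, condition_windows)
print(f"confidence = {confidence:.0%}")
```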
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests denoted by (service = smtp).
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level of s = 10% of all the network connections being evaluated. This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record.
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule.
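One way to hold an FER in code is a small record with the LHS and RHS episodes plus the (c, s, window) triple. This is a sketch only: `EpisodeRule` and `matches` are hypothetical names, and the matcher is a simplified greedy pass anchored on the first LHS event, not the CAIDS rule engine.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EpisodeRule:
    lhs: Tuple[str, ...]   # ordered LHS events L1..Ln
    rhs: Tuple[str, ...]   # ordered RHS events R1..Rm
    confidence: float      # c
    support: float         # s
    window_sec: float      # window


def matches(rule: EpisodeRule, events: List[Tuple[float, str]]) -> bool:
    """Check whether LHS then RHS occur in order within the rule's window.

    `events` is a list of (timestamp_sec, service) pairs sorted by time.
    """
    pattern = rule.lhs + rule.rhs
    idx, start = 0, 0.0
    for ts, name in events:
        if name != pattern[idx]:
            continue
        if idx == 0:
            start = ts  # the window opens at the first LHS event
        if ts - start > rule.window_sec:
            return False
        idx += 1
        if idx == len(pattern):
            return True
    return False


# Rule (1): authentication followed by two smtp requests within 2 seconds.
rule1 = EpisodeRule(("authentication",), ("smtp", "smtp"), 0.8, 0.1, 2.0)

print(matches(rule1, [(0.0, "authentication"), (0.5, "smtp"), (1.2, "smtp")]))
print(matches(rule1, [(0.0, "authentication"), (0.5, "smtp"), (3.0, "smtp")]))
```

The first trace completes the episode inside the 2-second window; in the second, the final smtp request arrives too late.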
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.
We have summarized those system models, their technologies and their validation methods.
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.
61
Reference
DARPA 1998 data set. A cleansed set is in KDD Cup ’99; the DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
Zhang, J. and Zulkernine, M. 2006. “A Hybrid Network Intrusion Detection Technique Using Random Forests.” In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
W. Lee et al. “A data mining framework for building intrusion detection models.” In Information and System Security, Vol. 3, No. 4, 2000.
Ertoz, L. et al. “MINDS - Minnesota Intrusion Detection System.” Next Generation Data Mining, Chapter 3, 2004.
Lu, C.-T., Boedihardjo, A.P., Manalwar, P. “Exploiting efficient data mining techniques to enhance intrusion detection systems.” IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
28
Classification: Decision Tree
– A node in the tree: a test of some attribute
– A branch: a possible value of the attribute
– Classification
  – Start at the root
  – Test the attribute
  – Move down the tree branch
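The classification walk (start at the root, test the attribute, move down the branch) can be sketched with a hand-built tree. The tree below is hypothetical: its attributes (`service`, `flag`) and labels are invented for illustration and were not learned from data.

```python
from typing import Dict, Union

# An internal node tests one attribute; each branch is a possible value of it.
# A leaf is simply a class label.
Node = Union[str, Dict]

tree: Node = {
    "attribute": "service",
    "branches": {
        "smtp": "normal",
        "http": {
            "attribute": "flag",
            "branches": {"SF": "normal", "S0": "intrusive"},
        },
    },
}


def classify(node: Node, record: Dict[str, str]) -> str:
    """Walk from the root to a leaf, following the branch that matches the record."""
    while isinstance(node, dict):
        value = record[node["attribute"]]  # test the attribute
        node = node["branches"][value]     # move down the matching branch
    return node


print(classify(tree, {"service": "http", "flag": "S0"}))
print(classify(tree, {"service": "smtp", "flag": "SF"}))
```

In this sketch a value with no matching branch raises `KeyError`; a practical classifier would fall back to a default class instead.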
29
Neural classification: HIDE
“A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.
Five major components:
– Probes collect traffic data
– Event preprocessor preprocesses traffic data and feeds the statistical model
– Statistical processor maintains a model for normal activities and generates vectors for new events
– Neural network classifies the vectors of new events
– Post processor generates reports
30
Clustering
What is clustering? Group data into clusters
– Similar to one another within the same cluster
– Dissimilar to the objects in other clusters
– Unsupervised learning: no predefined classes
31
Clustering
What is a good clustering?
– High intra-class similarity and low inter-class similarity
– Depending on the similarity measure
– The ability to discover some or all of the hidden patterns
32
Clustering
Clustering approaches
Partitioning algorithms
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering
Hierarchy algorithms
– Agglomerative: each object is a cluster; merge clusters to form larger ones
– Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
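Since the slide's figure is not available here, a small k-means run can serve as the example. This is a plain-Python sketch under stated assumptions: the six points are invented, the first k points are used as initial centroids (real implementations usually pick them randomly), and a fixed iteration count replaces a convergence test.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point], k: int, iterations: int = 10):
    """Partition points into k clusters by repeated assign-and-update steps."""
    centroids: List[Point] = list(points[:k])  # simplistic initialization
    clusters: List[List[Point]] = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


data = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(data, k=2)
for centroid, members in zip(centroids, clusters):
    print([round(c, 2) for c in centroid], sorted(members))
```

The reallocation step is what moves (1, 2) out of its initial cluster: it starts as a centroid of its own but joins the other two low points after the first update.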
34
Mining Data Streams for Intrusion Detection
– Maintaining profiles of normal activities
  – The profiles of normal activities may drift
– Identifying novel attacks
  – Identifying clusters and outliers in traffic data streams
– Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
29
Neural classification HIDE
ldquoA hierarchical network intrusion detection system using statistical processing and neural network classificationrdquo by Zheng et al
Five major components Probes collect traffic data Event preprocessor preprocesses traffic data and feeds the
statistical model Statistical processor maintains a model for normal activities
and generates vectors for new events Neural network classifies the vectors of new events Post processor generates reports
30
Clustering
What Is Clustering Group data into clusters
ndash Similar to one another within the same cluster ndash Dissimilar to the objects in other clusters ndash Unsupervised learning no predefined classes
31
Clustering
What Is A Good Clustering High intra-class similarity and low interclasssimilar
ity Depending on the similarity measure
The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches Partitioning algorithms
ndash Partition the objects into k clusters ndash Iteratively reallocate objects to improve the clusterin
g Hierarchy algorithms
ndash Agglomerative each object is a cluster merge clusters to form larger ones
ndash Divisive all objects are in a cluster split it up into smaller clusters
33
Clustering
K-Means Example
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities The profiles of normal activities may drift
Identifying novel attacks Identifying clusters and outliers in traffic data
streams Reduce the future alarm load by writing
filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detectionPredictive models are built from labeled data sets (instances
are labeled as ldquonormalrdquo or ldquointrusiverdquo)These models can be more sophisticated and precise than manually
created signatures Recent research eg JAM (Java Agents for Metalearning)
36
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model
Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt
rusion Detection System
39
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
40
ADAM Audit Data Analysis and Mining
Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line
algorithm Secondly ADAM runs an online algorithm
Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one
of the following Known type of attack Unknown type of attack False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM has two phases in their model
1st Phase Train the classifier Offline process Takes place only once Before the main experiment
2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process
43
The MINDS Project
MINDS ndash MINnesota INtrusion Detection System
Learning from Rare Class ndash Building rare class prediction models
Anomalyoutlier detection
Summarization of attacks using association pattern analysis
TID Items
1 Bread Coke Milk
2 Beer Bread
3 Beer Coke Diaper Milk
4 Beer Bread Diaper Milk
5 Coke Diaper Milk
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
44
MINDS - Learning from Rare Class
Problem Building models for rare network attacks (Mining needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
46
Experimental Evaluation
[Figure: MINDS evaluation pipeline. Labels: network; net-flow data collected using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; Snort, an open source signature-based network IDS (www.snort.org)]
• 10-minute cycle, 2 million connections
• Anomaly detection is applied 4 times a day, with a 10-minute time window
• Publicly available data set: the DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab, includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Labels: Anomaly Detection System; ranked connections (attack / normal); Discriminating Association Pattern Generator; Knowledge Base (update). Example rules: R1: TCP, DstPort=1863 --> Attack; …; R100: TCP, DstPort=80 --> Normal]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan, and Rule 2 indicates an attack on a specific machine. Together, the two rules indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=177, c2=0)
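The slides do not define c1 and c2. The sketch below assumes, and this is only an assumption, that c1 counts connections matching the pattern among those ranked anomalous and c2 counts matches among normal ones. Under that reading, a pattern is discriminating when almost all of its matches are anomalous. The counts are copied from patterns quoted in this deck; `anomalous_fraction` is a hypothetical helper.

```python
def anomalous_fraction(c1, c2):
    """Share of a pattern's matches that fall in the anomalous class.

    Assumes c1 = matches among anomalous connections and
    c2 = matches among normal connections (not stated on the slides).
    """
    return c1 / (c1 + c2)


# (c1, c2) pairs as quoted on the slides.
patterns = {
    "Rule 1 (DstPort=80, SYN)": (256, 1),
    "Rule 2 (DstPort=80, SYN, single DstIP)": (177, 0),
    "DstPort=6667": (254, 1),
    "DstPort=1863": (606, 8),
}

ranked = sorted(
    patterns,
    key=lambda name: anomalous_fraction(*patterns[name]),
    reverse=True,
)
for name in ranked:
    print(f"{name}: {anomalous_fraction(*patterns[name]):.3f}")
```

A real system would also weigh how many connections a pattern covers, since a pattern with c1=1 and c2=0 is perfectly discriminating but summarizes almost nothing.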
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
• This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
• Follow-up analysis of the connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
• Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
• This pattern indicates a large number of scans on ports 27374 (a signature for the SubSeven worm) and 12345 (a signature for the NetBus worm)
• Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
• This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
• Port 6667 is where IRC (Internet Relay Chat) is typically run
• Further analysis reveals that there are many small packets from/to various IRC servers around the world
• Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
• This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
• This pattern indicates a large number of anomalous TCP connections on port 1863
• Further analysis reveals that the remote IP block is owned by Hotmail
• Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
• Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
• SNORT has static knowledge, manually updated by human analysts
• MINDS anomaly detection algorithms are adaptive in nature
• MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
• Defining normal behavior
• Feature extraction
• Similarity functions
• Outlier detection
• Result summarization
• Detection of attacks originating from multiple sites
[Figure labels: worm/virus detection after infection; insider attack, policy violation; outsider attack, network intrusion]
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
• RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
• The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China
• RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
• A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data
• For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
• This large set of data patterns is scanned using a data mining classification algorithm, the decision tree
http://www.rising-global.com
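The slide does not say which distance-based outlier algorithm RIDS uses. One common formulation, shown here only as an illustration, calls a point an outlier when at least a given fraction of the other points lie farther away than a chosen radius. The feature values, `radius` and `min_far_fraction` below are made up for the example.

```python
import math


def distance_based_outliers(points, radius, min_far_fraction):
    """Return indices of points for which at least min_far_fraction of
    the other points lie farther away than radius."""
    outliers = []
    for i, p in enumerate(points):
        others = [q for j, q in enumerate(points) if j != i]
        far = sum(1 for q in others if math.dist(p, q) > radius)
        if far / len(others) >= min_far_fraction:
            outliers.append(i)
    return outliers


# Hypothetical (packets, bytes) measurements per connection.
samples = [(10, 200), (12, 210), (11, 190), (9, 205), (300, 5000)]
outlier_indices = distance_based_outliers(
    samples, radius=50.0, min_far_fraction=0.9
)
print(outlier_indices)
```

Only the last sample is far from at least 90% of the others, so it is the only one reported. In a hybrid system, traffic that matches no known pattern could be passed through a check of this kind.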
55
A cooperative anomaly and intrusion detection system (CAIDS)
CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
• A frequent episode rule (FER) is generated from a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
• For example, consider a window in which we observe a 3-event sequence: E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.
• If a occurs with 5% frequency and the joint event a and b occurs with 4% frequency, there is a 0.04 / 0.05 = 80% chance that D and F will follow E in the same window.
57
A cooperative anomaly and intrusion detection system (CAIDS)
• In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, denoted by (service = smtp).
• Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window of w = 2 sec. The three joint traffic events account for a support level of s = 10% of all the network connections being evaluated. This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
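The confidence and support of rule (1) can be reproduced on toy data. In the sketch below each inner list stands for the services seen, in order, within one 2-second window. The 40 windows are invented so that the numbers come out to 0.8 and 0.1, and support is measured per window as a simplification. The function names are hypothetical, not taken from CAIDS.

```python
def contains_in_order(window, episode):
    """True if the events of episode occur in window in the same order
    (not necessarily adjacent)."""
    events = iter(window)
    return all(event in events for event in episode)


def fer_confidence_and_support(windows, lhs, rhs):
    """Confidence = windows with LHS followed by RHS / windows with LHS.
    Support = windows with LHS followed by RHS / all windows."""
    lhs_count = sum(contains_in_order(w, lhs) for w in windows)
    joint_count = sum(contains_in_order(w, lhs + rhs) for w in windows)
    confidence = joint_count / lhs_count if lhs_count else 0.0
    support = joint_count / len(windows)
    return confidence, support


# Invented traffic: 4 windows match the full rule, 1 has only the LHS,
# and 35 contain unrelated traffic.
windows = (
    [["authentication", "smtp", "smtp"]] * 4
    + [["authentication", "http"]]
    + [["http"]] * 35
)

confidence, support = fer_confidence_and_support(
    windows, lhs=["authentication"], rhs=["smtp", "smtp"]
)
print(f"c={confidence:.1f} s={support:.1f}")
```

Four of the five windows containing the authentication event are followed by two smtp events (c = 0.8), and those four are 10% of the 40 windows (s = 0.1).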
58
A cooperative anomaly and intrusion detection system (CAIDS)
• An association rule is aimed at finding interesting intra-relationships inside a single connection record
• In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
• Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
• We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
• In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
• We summarized those system models, their technologies, and their validation methods
• We hope to give an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
• DARPA 1998 data set. A cleansed set is in KDD Cup '99; the DARPA 1991 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html
• Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining.” Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
• Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
• W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
• Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.
• Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
• Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
30
Clustering
What Is Clustering?
• Group data into clusters
  - Similar to one another within the same cluster
  - Dissimilar to the objects in other clusters
  - Unsupervised learning: no predefined classes
31
Clustering
What Is A Good Clustering?
• High intra-class similarity and low inter-class similarity
  - Depending on the similarity measure
• The ability to discover some or all of the hidden patterns
32
Clustering
Clustering Approaches
• Partitioning algorithms
  - Partition the objects into k clusters
  - Iteratively reallocate objects to improve the clustering
• Hierarchy algorithms
  - Agglomerative: each object is a cluster; merge clusters to form larger ones
  - Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
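The original slide showed k-means as a figure. A minimal sketch of the partitioning idea is given below: assign each object to the nearest centroid, recompute the centroids, and repeat. The points and starting centroids are invented for illustration, and a fixed iteration count stands in for a proper convergence test.

```python
import math


def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate assignment and centroid update."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster
            else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

With these starting centroids the two groups separate in the first pass and the centroids settle at the mean of each group.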
34
Mining Data Streams for Intrusion Detection
• Maintaining profiles of normal activities
  - The profiles of normal activities may drift
• Identifying novel attacks
  - Identifying clusters and outliers in traffic data streams
• Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
• Misuse detection
  - Predictive models are built from labeled data sets (instances are labeled as “normal” or “intrusive”)
  - These models can be more sophisticated and precise than manually created signatures
  - Recent research: e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure labels: intrusion patterns; activities; pattern matching; intrusion]
• Can't detect new attacks
• Example: if (src_ip == dst_ip) then “land attack”
• Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
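The land-attack example can be written as a tiny signature table. This is only a sketch of the pattern-matching idea: `Packet`, `SIGNATURES` and `match_signatures` are hypothetical names, and a real IDS rule would inspect many more fields than the two addresses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str


# Each signature maps a name to a predicate over a packet.
SIGNATURES = {
    "land attack": lambda pkt: pkt.src_ip == pkt.dst_ip,
}


def match_signatures(pkt):
    """Return the names of all signatures the packet matches."""
    return [name for name, rule in SIGNATURES.items() if rule(pkt)]


spoofed = Packet(src_ip="10.0.0.5", dst_ip="10.0.0.5")
ordinary = Packet(src_ip="10.0.0.7", dst_ip="10.0.0.5")
print(match_signatures(spoofed))
print(match_signatures(ordinary))
```

A packet that matches no entry is silently passed, which is exactly why this approach cannot detect attacks without a known signature.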
37
JAM (Java Agents for Metalearning)
• JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.
• The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from the output of both algorithms and used to compute models of intrusion behavior.
• The classifiers build the signatures of attacks. Thus, data mining in JAM builds a misuse detection model.
• Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.
• The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.
38
Data Mining for Intrusion Detection
• Anomaly detection
  - Identifies anomalies as deviations from “normal” behavior
  - E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (e.g. CPU, process size) on a 0 to 90 scale, comparing the normal profile with abnormal values that indicate a probable intrusion]
• Relatively high false positive rate: anomalies can just be new normal activities
• Baseline the normal traffic and then look for things that are out of the norm
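Baselining can be illustrated with a single activity measure: learn the mean and standard deviation from normal observations, then flag values that deviate by more than a chosen number of standard deviations. The CPU readings and the threshold of 3 are assumptions made for this example.

```python
import statistics


def build_baseline(samples):
    """Summarize normal activity as (mean, sample standard deviation)."""
    return statistics.mean(samples), statistics.stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag values more than threshold standard deviations from the mean."""
    mean, stdev = baseline
    return abs(value - mean) > threshold * stdev


# Hypothetical CPU utilization (%) observed during normal operation.
normal_cpu = [20, 22, 19, 21, 23, 20, 18, 22]
baseline = build_baseline(normal_cpu)

print(is_anomalous(85, baseline))  # far above the normal profile
print(is_anomalous(24, baseline))  # slightly high but within the norm
```

The false-positive problem noted on the slide shows up directly here: a legitimate new workload that runs at 85% CPU would be flagged until the baseline is rebuilt.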
40
ADAM: Audit Data Analysis and Mining
Detecting Intrusion by Data Mining
• Combination of association rules and classification rules
• First, ADAM collects known frequent datasets using an off-line algorithm
• Second, ADAM runs an online algorithm
  - Finds the most recent frequent connection records
  - Compares them with the known mined data
  - Discards those that seem to be normal
  - Forwards suspicious ones to the classifier
• The trained classifier then classifies the suspicious data as one of the following:
  - Known type of attack
  - Unknown type of attack
  - False alarm
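The offline/online split can be sketched as follows. This is a much simplified illustration of the idea, not ADAM's actual algorithm: frequent attribute combinations are mined from attack-free training records, then the combinations that are frequent in the current window but absent from that profile are kept as suspicious. The field names, the support threshold and the data are invented, and the final classifier step is left out.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support, max_size=2):
    """Return attribute=value combinations present in at least
    min_support of the records."""
    counts = Counter()
    for record in records:
        items = sorted(record.items())
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {
        itemset
        for itemset, n in counts.items()
        if n / len(records) >= min_support
    }


# Offline: profile of frequent itemsets in attack-free traffic.
training = [{"dst_port": 80, "flag": "SF"}] * 10
profile = frequent_itemsets(training, min_support=0.3)

# Online: itemsets frequent in the latest window but not in the profile.
window = (
    [{"dst_port": 80, "flag": "SF"}] * 6
    + [{"dst_port": 27374, "flag": "S0"}] * 4
)
suspicious = frequent_itemsets(window, min_support=0.3) - profile
for itemset in sorted(suspicious):
    print(dict(itemset))
```

In the scheme described above, the suspicious itemsets would then go to the trained classifier, which labels each as a known attack, an unknown attack, or a false alarm.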
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM has two phases in their model
1st Phase Train the classifier Offline process Takes place only once Before the main experiment
2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process
43
The MINDS Project
MINDS ndash MINnesota INtrusion Detection System
Learning from Rare Class ndash Building rare class prediction models
Anomalyoutlier detection
Summarization of attacks using association pattern analysis
TID Items
1 Bread Coke Milk
2 Beer Bread
3 Beer Coke Diaper Milk
4 Beer Bread Diaper Milk
5 Coke Diaper Milk
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
44
MINDS - Learning from Rare Class
Problem Building models for rare network attacks (Mining needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
46
Experimental Evaluation
network
net-flow data using CISCO
routers
Data preprocessing
MINDSanomaly detection
helliphellip
Anomaly
scores Association pattern analysis
Open source signature-based network IDS wwwsnortor
g
10 minutes cycle
2 millions connections
Anomaly detection is applied
4 times a day
10 minutes time window
bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment
bull Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
Anomaly Detection System
attack
normal
R1 TCP DstPort=1863 Attack
hellip
hellip
hellip
hellip
R100 TCP DstPort=80 Normal
Discriminating Association
Pattern Generator
1 Build normal profile
2 Study changes in normal behavior
3 Create attack summary
4 Detect misuse behavior
5 Understand nature of the attack
update
Knowledge Base
Ranked connections
MINDS association analysis module
48
Discovered Real-life Association Patterns
At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first
followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)
Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)
49
DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
Discovered Real-life Association Patterns
50
SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)
SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)
SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)
helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)
Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window
32
Clustering
Clustering Approaches
Partitioning algorithms
– Partition the objects into k clusters
– Iteratively reallocate objects to improve the clustering
Hierarchy algorithms
– Agglomerative: each object is a cluster; merge clusters to form larger ones
– Divisive: all objects are in a cluster; split it up into smaller clusters
33
Clustering
K-Means Example
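The figure for this slide did not survive extraction. As a stand-in, here is a minimal sketch of k-means as a partitioning algorithm: pick k centroids, assign every object to its nearest centroid, then reallocate iteratively. The function names, the 2-D tuples, and the choice of initial centroids are assumptions made for illustration, not part of the original deck.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def assign(points, centroids):
    """Assignment step: attach every point to its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, max_iterations=50, seed=0):
    """Partition points into k clusters by iterative reallocation."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                # Update step: move the centroid to the mean of its cluster.
                new_centroids.append((sum(x for x, _ in cluster) / len(cluster),
                                      sum(y for _, y in cluster) / len(cluster)))
            else:
                new_centroids.append(centroids[i])  # keep centroid of an empty cluster
        if new_centroids == centroids:
            break  # no centroid moved, so the partition is stable
        centroids = new_centroids
    return centroids, assign(points, centroids)
```

For intrusion detection the points would be feature vectors built from connection records rather than 2-D coordinates; small or distant clusters are then candidates for closer inspection.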
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities
The profiles of normal activities may drift
Identifying novel attacks
Identifying clusters and outliers in traffic data streams
Reduce the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detection
Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
These models can be more sophisticated and precise than manually created signatures
Recent research, e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: observed activities are pattern-matched against known intrusion patterns; a match is reported as an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM, I/O utilization; file system activity, modification of system files, permission modifications
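The "land attack" example above can be written as a tiny signature matcher. This is only a sketch: the `Packet` and `Signature` types and the `detect` function are hypothetical names introduced here, and a real signature-based IDS uses a much richer rule language.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int


@dataclass
class Signature:
    name: str
    matches: Callable[[Packet], bool]  # predicate describing the known pattern


# Known intrusion patterns; only the slide's example rule is included.
SIGNATURES: List[Signature] = [
    Signature("land attack", lambda p: p.src_ip == p.dst_ip),
]


def detect(packet: Packet, signatures: List[Signature] = SIGNATURES) -> List[str]:
    """Return the names of all signatures the packet matches."""
    return [s.name for s in signatures if s.matches(packet)]
```

A packet that matches no signature yields an empty list, which is exactly why this approach cannot detect new attacks.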
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signatures of attacks.
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from the output of both algorithms and used to compute models of intrusion behavior.
The classifiers build the signatures of attacks, so data mining in JAM builds a misuse detection model.
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.
The system has been tested with data from Sendmail-based attacks and with network attacks using tcpdump data.
38
Data Mining for Intrusion Detection
Anomaly detection
Identifies anomalies as deviations from "normal" behavior
E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (e.g. CPU, process size) comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
40
ADAM: Audit Data Analysis and Mining
Detecting Intrusion by Data Mining
Combination of association rules and classification rules
First, ADAM collects known frequent datasets using an off-line algorithm
Second, ADAM runs an online algorithm:
Finds the latest frequent connection records
Compares them with the known mined data
Discards those which seem to be normal
Suspicious ones are forwarded to the classifier
The trained classifier then classifies the suspicious data as one of the following:
Known type of attack
Unknown type of attack
False alarm
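The off-line and online steps above can be sketched as follows. This is a simplified illustration, not ADAM's actual implementation: the record fields, the `min_support` threshold, and the restriction to itemsets of size one and two are assumptions, and the final classifier stage is left out.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support):
    """Itemsets of size 1 or 2 whose relative support is at least min_support."""
    counts = Counter()
    for record in records:
        items = sorted(record.items())  # (field, value) pairs in a stable order
        for size in (1, 2):
            for combo in combinations(items, size):
                counts[combo] += 1
    total = len(records)
    return {itemset for itemset, c in counts.items() if c / total >= min_support}


def suspicious_itemsets(normal_profile, window, min_support):
    """Online step: keep itemsets frequent in the current window but absent
    from the profile mined off-line; these would go to the classifier."""
    return frequent_itemsets(window, min_support) - normal_profile


# Off-line step: mine a profile from attack-free traffic (hypothetical data).
training = [{"proto": "TCP", "dst_port": 80}] * 10
profile = frequent_itemsets(training, min_support=0.3)

# Online step: a window in which port 27374 suddenly becomes frequent.
window = ([{"proto": "TCP", "dst_port": 80}] * 6
          + [{"proto": "TCP", "dst_port": 27374}] * 4)
suspects = suspicious_itemsets(profile, window, min_support=0.3)
```

Here `suspects` contains only the itemsets involving port 27374; the port-80 itemsets are discarded because they already appear in the normal profile.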
41
ADAM: Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM's model has two phases
1st phase: train the classifier
Offline process
Takes place only once, before the main experiment
2nd phase: use the trained classifier
The trained classifier is then used to detect anomalies
Online process
43
The MINDS Project
MINDS – MINnesota INtrusion Detection System
Learning from Rare Class – Building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
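To make the two discovered rules concrete, the sketch below computes their support and confidence over the five transactions in the table. The helper names `count`, `support`, and `confidence` are introduced here for illustration only.

```python
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset, transactions=TRANSACTIONS):
    """Number of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions)


def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(lhs, rhs, transactions=TRANSACTIONS):
    """Fraction of transactions containing lhs that also contain rhs."""
    return count(lhs | rhs, transactions) / count(lhs, transactions)


milk_coke = confidence({"Milk"}, {"Coke"})                  # 3 of 4 Milk baskets
diaper_milk_beer = confidence({"Diaper", "Milk"}, {"Beer"})  # 2 of 3 baskets
```

In the intrusion detection setting the "items" become attribute-value pairs of a connection record, such as DstPort=80 or Protocol=TCP.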
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams: intrusions are sequences of events
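A short sketch of why skewed class distributions are a problem: with 1% attacks, a model that always predicts "normal" looks accurate while detecting nothing. The 10-in-1000 split and the `evaluate` helper are made up for illustration.

```python
def evaluate(y_true, y_pred, positive="attack"):
    """Return (accuracy, precision, recall) for the positive (rare) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical traffic: 10 attacks among 1000 connections.
labels = ["attack"] * 10 + ["normal"] * 990
always_normal = ["normal"] * 1000  # a trivial model that never raises an alarm

accuracy, precision, recall = evaluate(labels, always_normal)
```

Accuracy comes out at 0.99 while recall on the attack class is 0.0, so rare-class models need to be judged on precision and recall rather than accuracy.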
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior
Identify normal behavior
Construct useful set of features
Define similarity function
Use outlier detection algorithm
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
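As an illustration of the nearest neighbor approach, the sketch below scores each point by the distance to its k-th nearest neighbor, using Euclidean distance as the similarity function. This is a generic technique chosen for brevity and an assumption on my part; it is not necessarily the scheme MINDS itself uses.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor.

    A larger score means the point lies farther from its neighbors and is
    therefore more anomalous.
    """
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores


# Hypothetical 2-D feature vectors: four similar connections and one outlier.
features = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_outlier_scores(features, k=2)
most_anomalous = features[scores.index(max(scores))]
```

The brute-force loop is quadratic in the number of points, so it only suits small windows of traffic.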
46
Experimental Evaluation
[Figure: MINDS pipeline. Network -> net-flow data collected using Cisco routers -> data preprocessing -> MINDS anomaly detection -> anomaly scores -> association pattern analysis. One cycle covers 10 minutes, about 2 million connections.]
Snort: open source signature-based network IDS (www.snort.org)
Anomaly detection is applied 4 times a day on a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: the anomaly detection system outputs ranked connections labeled attack or normal; the discriminating association pattern generator (MINDS association analysis module) turns them into rules such as "R1: TCP, DstPort=1863 -> Attack" ... "R100: TCP, DstPort=80 -> Normal", which update the knowledge base.]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan, and Rule 2 indicates an attack on a specific machine. Together, the two rules indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120...180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120...180 (c1=177, c2=0)
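The slides do not define c1 and c2. The sketch below assumes they count the connections covered by a pattern in the anomalous (top-ranked) set and in the normal set respectively, so that a pattern with large c1 and small c2 discriminates attacks well. The connection records and the `pattern_counts` helper are hypothetical.

```python
def pattern_counts(pattern, anomalous, normal):
    """Return (c1, c2): how many anomalous and how many normal connections
    match every attribute-value pair of the pattern (assumed meaning)."""
    def covers(connection):
        return all(connection.get(field) == value for field, value in pattern.items())

    c1 = sum(covers(c) for c in anomalous)
    c2 = sum(covers(c) for c in normal)
    return c1, c2


# Hypothetical connection records split by the anomaly detector's ranking.
anomalous = [{"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"}] * 3 + [
    {"DstPort": 22, "Protocol": "TCP", "Flag": "SYN"},
]
normal = [
    {"DstPort": 80, "Protocol": "TCP", "Flag": "ACK"},
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
]

c1, c2 = pattern_counts({"DstPort": 80, "Flag": "SYN"}, anomalous, normal)
```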
49
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.
Having an unauthorized application increases the vulnerability of the system.
Discovered Real-life Association Patterns
50
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189...200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189...200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
...
This pattern indicates a large number of scans on ports 27374 (a signature of the SubSeven Trojan) and 12345 (a signature of the NetBus Trojan).
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.
Discovered Real-life Association Patterns... (ctd)
51
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.
Port 6667 is where IRC (Internet Relay Chat) typically runs. Further analysis reveals many small packets from/to various IRC servers around the world. Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).
Discovered Real-life Association Patterns... (ctd)
52
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863.
Further analysis reveals that the remote IP block is owned by Hotmail.
Flag=0 is unusual for TCP traffic.
Discovered Real-life Association Patterns... (ctd)
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack: policy violation
Outsider attack: network intrusion
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.
A distance-based outlier detection algorithm is used to detect deviating behavior in the collected network data.
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.
This large set of data patterns is scanned using a data mining classification algorithm, the decision tree.
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator.
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
For example, consider a window in which we observe a 3-event sequence E, D, and F. An FER is generated as $E \rightarrow D, F$ with confidence level $c = \mathrm{freq}(a \cup b)/\mathrm{freq}(a) = 0.8$, where $a$ represents the event E on the LHS and $b$ corresponds to the two events D and F on the RHS of the rule.
If $a$ occurs with frequency 5% and the joint event $a \cup b$ with frequency 4%, there is a $0.04/0.05 = 80\%$ chance that D and F will follow E in the same window.
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests denoted by (service = smtp). Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% of all the network connections being evaluated. This FER is formally stated as follows:
$(\text{service} = \text{authentication}) \rightarrow (\text{service} = \text{smtp}), (\text{service} = \text{smtp}) \quad (0.8,\ 0.1,\ 2\ \text{sec}) \qquad (1)$
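The confidence of rule (1) can be estimated from a time-ordered event stream. The sketch below uses a simplified count (one trial per occurrence of the LHS event) rather than the sliding-window frequency definition of the CAIDS authors; the event tuples and function names are assumptions made for illustration.

```python
def episode_follows(events, start, rhs, window):
    """True if the services in rhs occur, in order, after events[start]
    and within `window` seconds of it. Events are (timestamp, service)."""
    t0 = events[start][0]
    matched = 0
    for timestamp, service in events[start + 1:]:
        if timestamp - t0 > window:
            break  # events are time-ordered, so nothing later can qualify
        if service == rhs[matched]:
            matched += 1
            if matched == len(rhs):
                return True
    return False


def rule_confidence(events, lhs, rhs, window):
    """Fraction of lhs occurrences that are followed by the rhs episode."""
    starts = [i for i, (_, service) in enumerate(events) if service == lhs]
    if not starts:
        return 0.0
    hits = sum(episode_follows(events, i, rhs, window) for i in starts)
    return hits / len(starts)


# Hypothetical stream: five authentications, four of them followed by two
# smtp requests within 2 seconds; the last one gets its second smtp too late.
events = []
for n in range(5):
    base = 10.0 * n
    events.append((base, "authentication"))
    events.append((base + 0.5, "smtp"))
    events.append((base + (1.0 if n < 4 else 5.0), "smtp"))

c = rule_confidence(events, "authentication", ["smtp", "smtp"], window=2.0)
```

With this stream `c` evaluates to 0.8, matching the 80% confidence level in rule (1).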
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record, whereas an FER relates multiple connection events.
In general, an FER is specified by the following expression:
$L_1, L_2, \ldots, L_n \rightarrow R_1, \ldots, R_m \quad (c,\ s,\ \text{window}) \qquad (2)$
$L_i$ ($1 \le i \le n$) and $R_j$ ($1 \le j \le m$) are ordered traffic connection events.
We call $L_1, L_2, \ldots, L_n$ the LHS episode and $R_1, \ldots, R_m$ the RHS of the episode rule.
59
A cooperative anomaly and intrusion detection system (CAIDS)
[Figure: architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.
We have summarized these system models, their technologies, and their validation methods.
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.
61
Reference
DARPA 1998 data set. A cleansed set is in KDD Cup '99. The DARPA 1999 data set is also available. http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.
Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI 2005), 15-17 Aug. 2005, pp. 512-517.
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
34
Mining Data Streams for Intrusion Detection
Maintaining profiles of normal activities: the profiles of normal activities may drift
Identifying novel attacks
Identifying clusters and outliers in traffic data streams
Reducing the future alarm load by writing filtering rules that automatically discard well-understood false positives
35
Data Mining for Intrusion Detection
Misuse detection
Predictive models are built from labeled data sets (instances are labeled as "normal" or "intrusive")
These models can be more sophisticated and precise than manually created signatures
Recent research, e.g. JAM (Java Agents for Metalearning)
36
Misuse Detection
[Figure: activities are pattern-matched against known intrusion patterns to flag an intrusion]
Can't detect new attacks
Example: if (src_ip == dst_ip) then "land attack"
Look for known indicators: ICMP scans, port scans, connection attempts; CPU, RAM and I/O utilization; file system activity, modification of system files, permission modifications
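The land-attack example above can be sketched as a tiny signature matcher. This is only an illustration: the `Packet` type, the `SIGNATURES` table and `match_signatures` are hypothetical names, not part of any IDS mentioned in the slides.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str


# Hypothetical signature table: (attack name, predicate over a packet).
SIGNATURES: List[Tuple[str, Callable[[Packet], bool]]] = [
    ("land attack", lambda p: p.src_ip == p.dst_ip),
]


def match_signatures(packet: Packet) -> List[str]:
    """Return the names of every known signature the packet matches."""
    return [name for name, matches in SIGNATURES if matches(packet)]
```

A packet that matches no entry returns an empty list, which is exactly the limitation noted above: an attack without a signature goes unnoticed.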
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks. Thus data mining in JAM builds a misuse detection model
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
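The meta-learning idea can be illustrated with a very small sketch, assuming base classifiers are plain functions and the meta-classifier is a lookup table from their combined predictions to the majority true label. This is not JAM's actual implementation; `train_meta` and `predict_meta` are hypothetical names.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

Record = Dict[str, float]
Classifier = Callable[[Record], str]


def train_meta(base: Sequence[Classifier], records: List[Record],
               labels: List[str]) -> Dict[Tuple[str, ...], str]:
    """Learn which true label goes with each combination of base predictions."""
    votes = defaultdict(Counter)
    for record, label in zip(records, labels):
        key = tuple(clf(record) for clf in base)
        votes[key][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


def predict_meta(base: Sequence[Classifier], table: Dict[Tuple[str, ...], str],
                 record: Record, default: str = "normal") -> str:
    """Classify a record from the base predictions; unseen combinations fall back to `default`."""
    key = tuple(clf(record) for clf in base)
    return table.get(key, default)
```

The point of the second level is that it can learn when a base classifier is unreliable, instead of trusting every base prediction equally.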
38
Data Mining for Intrusion Detection
Anomaly detection: identifies anomalies as deviations from "normal" behavior
E.g. ADAM (Audit Data Analysis and Mining), MINDS (MINnesota INtrusion Detection System)
39
Anomaly Detection
[Figure: bar chart of activity measures (e.g. CPU, process size) comparing the normal profile with abnormal activity that marks a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
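A minimal sketch of "baseline, then look for deviations", assuming a single numeric activity measure and a mean plus-or-minus k standard deviations rule. The threshold `k = 3` and the function names are illustrative assumptions, not something the slides specify.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def build_profile(samples: Sequence[float]) -> Tuple[float, float]:
    """Summarize normal activity as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value: float, profile: Tuple[float, float], k: float = 3.0) -> bool:
    """Flag a measurement that lies more than k standard deviations from the mean."""
    mu, sigma = profile
    return abs(value - mu) > k * sigma
```

A legitimate but new workload would also exceed the threshold, which is where the high false positive rate comes from.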
40
ADAM: Audit Data Analysis and Mining
Detecting Intrusion by Data Mining
Combination of association rules and classification rules
Firstly, ADAM collects known frequent datasets (an off-line algorithm)
Secondly, ADAM runs an online algorithm:
Finds the last frequent connection records
Compares them with the known mined data
Discards those which seem to be normal
Suspicious ones are forwarded to the classifier
The trained classifier then classifies the suspicious data as one of the following:
Known type of attack
Unknown type of attack
False alarm
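The online filtering step described above can be sketched roughly as follows. This is a simplification under stated assumptions: each whole connection record is treated as one itemset, the normal profile is a set of such itemsets, and `min_support` is an arbitrary illustrative threshold. None of the names come from ADAM itself, and the final classification step is not shown.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Set

Record = Dict[str, object]


def as_itemset(record: Record) -> FrozenSet:
    """Turn one connection record into a hashable set of (field, value) items."""
    return frozenset(record.items())


def suspicious_itemsets(window: List[Record],
                        normal_profile: Set[FrozenSet],
                        min_support: float = 0.3) -> List[Record]:
    """Keep itemsets that are frequent in the window but absent from the profile."""
    counts = Counter(as_itemset(rec) for rec in window)
    total = len(window)
    return [
        dict(items)
        for items, count in counts.items()
        if count / total >= min_support and items not in normal_profile
    ]
```

Whatever this function returns would then be handed to the trained classifier, which decides between known attack, unknown attack and false alarm.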
41
ADAM: Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
1st phase: train the classifier. Offline process; takes place only once, before the main experiment
2nd phase: use the trained classifier. The trained classifier is then used to detect anomalies; online process
43
The MINDS Project
MINDS – MINnesota INtrusion Detection System
Learning from Rare Class: building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
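To make the two discovered rules concrete, the sketch below computes their support and confidence over the five transactions in the table, using the standard definitions (support = fraction of transactions containing all items of the rule; confidence = that count divided by the count of transactions containing the left-hand side). The function names are illustrative.

```python
from typing import List, Set

TRANSACTIONS: List[Set[str]] = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset: Set[str], transactions: List[Set[str]]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(lhs: Set[str], rhs: Set[str], transactions: List[Set[str]]) -> float:
    """Fraction of all transactions that contain both sides of the rule."""
    return count(lhs | rhs, transactions) / len(transactions)


def confidence(lhs: Set[str], rhs: Set[str], transactions: List[Set[str]]) -> float:
    """Among transactions containing lhs, the fraction that also contain rhs."""
    return count(lhs | rhs, transactions) / count(lhs, transactions)
```

On this table, {Milk} --> {Coke} holds in 3 of the 4 transactions that contain Milk, and {Diaper, Milk} --> {Beer} in 2 of the 3 that contain both Diaper and Milk.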
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
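Why standard models struggle with rare classes can be seen from a small worked example, with assumed numbers: 10 attacks among 1000 connections and a trivial model that labels everything "normal". The helper name is hypothetical.

```python
from typing import List, Tuple


def accuracy_and_recall(y_true: List[str], y_pred: List[str],
                        positive: str = "attack") -> Tuple[float, float]:
    """Return (overall accuracy, recall on the rare positive class)."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    positives = sum(1 for t in y_true if t == positive)
    recall = true_pos / positives if positives else 0.0
    return correct / len(y_true), recall


# Assumed skewed data: 1% attacks, and a model that never predicts "attack".
y_true = ["attack"] * 10 + ["normal"] * 990
y_pred = ["normal"] * 1000
acc, rec = accuracy_and_recall(y_true, y_pred)
```

Here the accuracy is 99% while the recall on attacks is zero, so accuracy alone says little about a model for a skewed class distribution.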
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous behavior
Identify normal behavior
Construct useful set of features
Define similarity function
Use outlier detection algorithm:
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
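As an illustration of the nearest neighbor approach, the sketch below scores each point by the distance to its k-th nearest neighbor, so isolated points get high scores. It assumes numeric feature vectors and Euclidean distance as the similarity function; it is a generic textbook scheme, not the specific algorithm used in MINDS.

```python
import math
from typing import List, Sequence


def knn_outlier_scores(points: Sequence[Sequence[float]], k: int = 2) -> List[float]:
    """Score each point by the distance to its k-th nearest neighbor.

    Requires more than k points; a larger score means a more isolated point.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores
```

For example, with four points clustered near the origin and one at (10, 10), the far point receives the largest score and would be ranked first for inspection.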
46
Experimental Evaluation
[Figure: evaluation pipeline: network, net-flow data collected using Cisco routers, data preprocessing, MINDS anomaly detection, anomaly scores, association pattern analysis; also shown: Snort, an open source signature-based network IDS (www.snort.org)]
10-minute cycle: 2 million connections
Anomaly detection is applied 4 times a day, on a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: the anomaly detection system outputs ranked connections labeled attack or normal; the discriminating association pattern generator of the MINDS association analysis module produces rules from them (R1: TCP, DstPort=1863, Attack; …; R100: TCP, DstPort=80, Normal) and updates the knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed no fewer than five machines scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack: policy violation
Outsider attack: network intrusion
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway and server security solutions for virus protection, firewall and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse and anomaly detection
A distance-based outlier detection algorithm is used to detect deviant behavior among the collected network data
For misuse detection, it has a very large set of collected data patterns which can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
For example, we envision a window where we observe a 3-event sequence E, D and F
An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If a occurs with frequency 5% and the joint event a and b with frequency 4%, there is a (0.04 / 0.05) = 80% chance that D and F will follow E in the same window
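The 80% figure can be reproduced with a small sketch. It assumes the usual rule-confidence definition (joint frequency divided by the frequency of the episode the rule conditions on, here E) and treats each window as an ordered list of event labels; the helper names and the synthetic windows are illustrative only.

```python
from typing import List, Sequence


def contains_episode(window: Sequence[str], episode: Sequence[str]) -> bool:
    """True if the events of `episode` occur in `window` in the same order."""
    remaining = iter(window)
    # `event in remaining` advances the iterator, which enforces the ordering.
    return all(event in remaining for event in episode)


def rule_confidence(windows: List[Sequence[str]],
                    lhs: Sequence[str], rhs: Sequence[str]) -> float:
    """Fraction of windows containing lhs in which rhs follows."""
    lhs_count = sum(1 for w in windows if contains_episode(w, lhs))
    joint = list(lhs) + list(rhs)
    joint_count = sum(1 for w in windows if contains_episode(w, joint))
    return joint_count / lhs_count if lhs_count else 0.0


# Synthetic data: E appears in 5 of 100 windows, E followed by D, F in 4 of them.
windows = [["E", "D", "F"]] * 4 + [["E", "D"]] + [["D", "F"]] * 95
conf = rule_confidence(windows, ["E"], ["D", "F"])
```

With these counts the confidence is 4 / 5 = 0.8, matching the 0.04 / 0.05 calculation.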
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)
The events D, F may be two sequential smtp requests, denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec
The three joint traffic events account for a support level s = 10% of all the network connections being evaluated
This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)   (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window)   (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS episode of the rule
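One possible way to represent an FER of the form (2) and test whether its episode occurs in a stream of timestamped events is sketched below. The `FER` class, the `episode_occurs` function and the event encoding (time in seconds, event label) are assumptions made for illustration; they are not taken from the CAIDS implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FER:
    lhs: Tuple[str, ...]   # L1 ... Ln, ordered
    rhs: Tuple[str, ...]   # R1 ... Rm, ordered
    confidence: float      # c
    support: float         # s
    window_sec: float      # window length in seconds


def episode_occurs(rule: FER, events: List[Tuple[float, str]]) -> bool:
    """True if lhs followed by rhs occurs, in order, within one window.

    `events` must be sorted by timestamp.
    """
    episode = rule.lhs + rule.rhs
    for i, (t0, label) in enumerate(events):
        if label != episode[0]:
            continue
        # Labels of all events that fall inside the window opened at t0.
        in_window = (lab for t, lab in events[i:] if t - t0 <= rule.window_sec)
        if all(step in in_window for step in episode):
            return True
    return False


# Rule (1) from the slides, written in this hypothetical structure.
RULE_1 = FER(("authentication",), ("smtp", "smtp"), 0.8, 0.1, 2.0)
```

Traffic in which the second smtp request arrives after the 2-second window does not count as an occurrence of the episode.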
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We have summarized those system models, their technologies and their validation methods
We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set (a cleansed set is in KDD Cup '99); DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
36
Misuse Detection
Intrusion Patterns
activities
pattern matching
intrusion
Canrsquot detect new attacks
Example if (src_ip == dst_ip) then ldquoland attackrdquo
look for known indicators ICMP Scans port scans connection attempts CPU RAM IO Utilization File system activity modification of system files permission modifications
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions It then applies a meta-learning classifier to learn the signature of attacks
The association rules algorithm determines relationships between fields in the audit trail records and the frequent episodes algorithm models sequential patterns of audit events Features are then extracted from both algorithms and used to compute models of intrusion behavior
The classifiers build the signature of attacks So thus data mining in JAM builds misuse detection model
Classifiers in the JAM are generated by using rule learning program on training data of system usage After training resulting classification rules is used to recognize anomalies and detect known intrusions
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data
38
Data Mining for Intrusion Detection
Anomaly detection Identifies anomalies as deviations from ldquonormalrdquo behavior Eg ADAM Audit Data Analysis and Mining MINDS ndash MINnesota INt
rusion Detection System
39
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
40
ADAM Audit Data Analysis and Mining
Detecting Intrusion by Data MiningCombination of Association Rule and Classification Rule Firstly ADAM collects known frequent datasetsan off-line
algorithm Secondly ADAM runs an online algorithm
Finds last frequent connection records Compare them with known mined data Discards those which seems to be normal Suspicious ones are forwarded to the classifier Trained classifier then classify the suspicious data as one
of the following Known type of attack Unknown type of attack False alarm
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM has two phases in their model
1st Phase Train the classifier Offline process Takes place only once Before the main experiment
2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process
43
The MINDS Project
MINDS ndash MINnesota INtrusion Detection System
Learning from Rare Class ndash Building rare class prediction models
Anomalyoutlier detection
Summarization of attacks using association pattern analysis
TID Items
1 Bread Coke Milk
2 Beer Bread
3 Beer Coke Diaper Milk
4 Beer Bread Diaper Milk
5 Coke Diaper Milk
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
44
MINDS - Learning from Rare Class
Problem Building models for rare network attacks (Mining needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
46
Experimental Evaluation
network
net-flow data using CISCO
routers
Data preprocessing
MINDSanomaly detection
helliphellip
Anomaly
scores Association pattern analysis
Open source signature-based network IDS wwwsnortor
g
10 minutes cycle
2 millions connections
Anomaly detection is applied
4 times a day
10 minutes time window
bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment
bull Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
Anomaly Detection System
attack
normal
R1 TCP DstPort=1863 Attack
hellip
hellip
hellip
hellip
R100 TCP DstPort=80 Normal
Discriminating Association
Pattern Generator
1 Build normal profile
2 Study changes in normal behavior
3 Create attack summary
4 Detect misuse behavior
5 Understand nature of the attack
update
Knowledge Base
Ranked connections
MINDS association analysis module
48
Discovered Real-life Association Patterns
At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first
followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)
Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.
Having an unauthorized application increases the vulnerability of the system.
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
…
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm).
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.
Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world.
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863.
Further analysis reveals that the remote IP block is owned by Hotmail.
Flag=0 is unusual for TCP traffic.
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack: policy violation
Outsider attack: network intrusion
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.
A distance-based outlier detection algorithm is used to detect deviating behavior in the collected network data.
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.
This large set of data patterns is scanned using a data mining classification algorithm, the decision tree.
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events.
For example, consider a window in which we observe a 3-event sequence E, D, and F. An FER is generated as E → D, F with confidence level freq(a ∪ b)/freq(a) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule.
If a occurs with 5% frequency and the joint event a and b occurs with 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window.
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests, denoted by (service = smtp).
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec. The three joint traffic events account for a support level s = 10% of all the network connections being evaluated. This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec) (1)
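To make the numbers in rule (1) concrete, the sketch below estimates confidence and support from a set of observation windows. It is a simplified illustration, not the CAIDS algorithm: it uses the standard rule-confidence definition (windows containing LHS followed by RHS, divided by windows containing LHS), takes support as the fraction of windows containing the whole episode, and the 40 windows are invented so that the result reproduces c = 0.8 and s = 0.1.

```python
def contains_in_order(window, episode):
    """True if the events of `episode` occur in `window` in the given order."""
    remaining = iter(window)
    # `event in remaining` consumes the iterator, which enforces ordering
    return all(event in remaining for event in episode)

def fer_stats(windows, lhs, rhs):
    """Return (confidence, support) for the episode rule lhs -> rhs."""
    n_lhs = sum(contains_in_order(w, lhs) for w in windows)
    n_joint = sum(contains_in_order(w, lhs + rhs) for w in windows)
    support = n_joint / len(windows)
    confidence = n_joint / n_lhs if n_lhs else 0.0
    return confidence, support

# Hypothetical 2-second windows of connection events, labeled by service
windows = (
    [["authentication", "smtp", "smtp"]] * 4   # LHS followed by RHS
    + [["authentication", "http"]]             # LHS without RHS
    + [["http"]] * 35                          # unrelated traffic
)

confidence, support = fer_stats(windows, ["authentication"], ["smtp", "smtp"])
```

Here the authentication event appears in 5 of 40 windows and the full episode in 4, which gives the 80% confidence and 10% support quoted in the rule.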
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record.
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events.
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule.
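One way to hold a rule of form (2) in code is a small record type with the LHS episode, the RHS episode, and the (c, s, window) triple. The class and field names below are hypothetical, chosen only for illustration; the example instance encodes rule (1).

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: Tuple[str, ...]   # ordered events L1..Ln
    rhs: Tuple[str, ...]   # ordered events R1..Rm
    confidence: float      # c
    support: float         # s
    window_sec: float      # window length in seconds

    def __str__(self):
        return (f"{', '.join(self.lhs)} -> {', '.join(self.rhs)} "
                f"({self.confidence}, {self.support}, {self.window_sec} sec)")

# Rule (1): authentication followed by two smtp requests within 2 seconds
rule_1 = FrequentEpisodeRule(
    lhs=("service=authentication",),
    rhs=("service=smtp", "service=smtp"),
    confidence=0.8,
    support=0.1,
    window_sec=2,
)
```

Keeping the episodes as tuples preserves event order, which is what separates an episode rule from an unordered association rule.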
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.
We have summarized those system models, their technologies, and their validation methods.
We hope to give an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.
61
Reference
DARPA 1998 data set. A cleansed set in KDD Cup '99; DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining”. Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration, IRI-2005, IEEE International Conference on, 15-17 Aug. 2005, Page(s) 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
37
JAM (Java Agents for Metalearning)
JAM (developed at Columbia University) uses data mining techniques to discover patterns of intrusions. It then applies a meta-learning classifier to learn the signature of attacks.
The association rules algorithm determines relationships between fields in the audit trail records, and the frequent episodes algorithm models sequential patterns of audit events. Features are then extracted from both algorithms and used to compute models of intrusion behavior.
The classifiers build the signature of attacks. Thus data mining in JAM builds a misuse detection model.
Classifiers in JAM are generated by running a rule learning program on training data of system usage. After training, the resulting classification rules are used to recognize anomalies and detect known intrusions.
The system has been tested with data from Sendmail-based attacks and with network attacks using TCP dump data.
38
Data Mining for Intrusion Detection
Anomaly detection: identifies anomalies as deviations from “normal” behavior
E.g. ADAM: Audit Data Analysis and Mining
MINDS – MINnesota INtrusion Detection System
39
Anomaly Detection
[Figure: bar chart of activity measures (CPU, Process Size) comparing the normal profile with abnormal values; a large deviation marks a probable intrusion]
Relatively high false positive rate: anomalies can just be new normal activities
Baseline the normal traffic and then look for things that are out of the norm
40
ADAM: Audit Data Analysis and Mining
Detecting Intrusion by Data Mining
Combination of Association Rule and Classification Rule
Firstly, ADAM collects known frequent datasets (an off-line algorithm)
Secondly, ADAM runs an online algorithm
Finds last frequent connection records
Compares them with known mined data
Discards those which seem to be normal
Suspicious ones are forwarded to the classifier
Trained classifier then classifies the suspicious data as one of the following:
Known type of attack
Unknown type of attack
False alarm
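The online steps above can be sketched as a filter followed by a classifier. This is only an outline of the flow described on the slide, not ADAM's actual code: the itemset encoding, the contents of the normal profile, and the toy classifier rules are all hypothetical stand-ins.

```python
def adam_online(window_itemsets, normal_profile, classify):
    """Drop itemsets already in the normal profile, then label the remainder."""
    suspicious = [s for s in window_itemsets if s not in normal_profile]
    return {s: classify(s) for s in suspicious}

# Frequent itemsets mined off-line from attack-free traffic (hypothetical)
normal_profile = {frozenset({"DstPort=80", "Protocol=TCP"})}

def toy_classifier(itemset):
    # Hypothetical stand-in for the trained classifier
    if "DstPort=27374" in itemset:
        return "known attack"
    if "DstPort=80" in itemset:
        return "false alarm"
    return "unknown attack"

# Frequent itemsets found in the most recent window of connections (hypothetical)
window = [
    frozenset({"DstPort=80", "Protocol=TCP"}),
    frozenset({"DstPort=27374", "Protocol=TCP"}),
    frozenset({"DstPort=4444", "Protocol=TCP"}),
]

labels = adam_online(window, normal_profile, toy_classifier)
```

The first itemset is discarded because it matches the normal profile; only the two remaining ones reach the classifier.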
41
ADAM Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM has two phases in its model
1st Phase: Train the classifier
Offline process
Takes place only once, before the main experiment
2nd Phase: Use the trained classifier
Trained classifier is then used to detect anomalies
Online process
43
The MINDS Project
MINDS – MINnesota INtrusion Detection System
Learning from Rare Class – Building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules Discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
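The two discovered rules can be checked directly against the five transactions in the table. The sketch below uses the standard definitions of support (fraction of transactions containing all items of the rule) and confidence (that count divided by the count of transactions containing the left-hand side); the function names are illustrative.

```python
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions)

def support(lhs, rhs):
    return count(lhs | rhs) / len(transactions)

def confidence(lhs, rhs):
    return count(lhs | rhs) / count(lhs)

# {Milk} --> {Coke}: 3 of 5 transactions hold both, 4 hold Milk
milk_coke = (support({"Milk"}, {"Coke"}), confidence({"Milk"}, {"Coke"}))

# {Diaper, Milk} --> {Beer}: 2 of 5 hold all three, 3 hold Diaper and Milk
diaper_milk_beer = (support({"Diaper", "Milk"}, {"Beer"}),
                    confidence({"Diaper", "Milk"}, {"Beer"}))
```

The first rule has support 0.6 and confidence 0.75; the second has support 0.4 and confidence 2/3.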
44
MINDS - Learning from Rare Class
Problem: Building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
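A small calculation shows why skewed class distributions are a problem for standard models. In the hypothetical data below, 1% of connections are attacks; a model that labels everything as normal scores 99% accuracy yet detects no attack, which is why recall and precision on the rare class are more informative than accuracy.

```python
def evaluate(y_true, y_pred, positive="attack"):
    """Return accuracy, plus precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical skewed data set: 990 normal connections, 10 attacks
y_true = ["normal"] * 990 + ["attack"] * 10

# A trivial model that never raises an alarm
y_pred = ["normal"] * 1000

accuracy, precision, recall = evaluate(y_true, y_pred)
```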
39
Anomaly Detection
activity measures
0102030405060708090
CPU ProcessSize
normal profileabnormal
probable intrusion
Relatively high false positive rate - anomalies can just be new normal activities
baseline the normal traffic and then look for things that are out of the norm
40
ADAM: Audit Data Analysis and Mining
Detecting intrusions by data mining: a combination of association rules and classification rules
First, ADAM collects the known frequent datasets (an off-line algorithm)
Second, ADAM runs an on-line algorithm:
  Finds the most recent frequent connection records
  Compares them with the known mined data
  Discards those that seem to be normal
  Forwards the suspicious ones to the classifier
The trained classifier then classifies the suspicious data as one of the following:
  Known type of attack
  Unknown type of attack
  False alarm
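The two ADAM steps above can be sketched roughly as follows. This is a minimal illustration, not ADAM's actual implementation: the record fields, the `min_support` threshold, the brute-force itemset counting, and the lookup-table `classify` stand-in for the trained classifier are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(records, min_support, max_size=2):
    """Return the itemsets (frozensets) whose support is at least min_support."""
    counts = Counter()
    for record in records:
        items = sorted(record)
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    n = len(records)
    return {itemset for itemset, c in counts.items() if c / n >= min_support}

# Off-line step: itemsets that are frequent in (hypothetical) attack-free traffic.
normal_traffic = [
    {"dst_port=80", "flag=SF"},
    {"dst_port=80", "flag=SF"},
    {"dst_port=25", "flag=SF"},
    {"dst_port=80", "flag=SF"},
]
normal_profile = frequent_itemsets(normal_traffic, min_support=0.5)

# On-line step: itemsets frequent in the latest window of connection records.
window = [
    {"dst_port=80", "flag=SF"},
    {"dst_port=27374", "flag=S0"},
    {"dst_port=27374", "flag=S0"},
    {"dst_port=27374", "flag=S0"},
]
recent = frequent_itemsets(window, min_support=0.5)

# Discard what the normal profile already explains; the rest is suspicious.
suspicious = recent - normal_profile

def classify(itemset, labels):
    """Hypothetical stand-in for the trained classifier (a plain lookup table)."""
    return labels.get(itemset, "unknown type of attack")

labels = {frozenset({"dst_port=27374", "flag=S0"}): "known type of attack"}
verdicts = {itemset: classify(itemset, labels) for itemset in suspicious}
```

Here only the itemsets built from port 27374 and flag S0 survive the comparison with the normal profile, so only those reach the classifier.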
41
ADAM Detecting Intrusion by Data Mining
42
ADAM: Audit Data Analysis and Mining
ADAM's model has two phases
1st phase: train the classifier. Offline process; takes place only once, before the main experiment
2nd phase: use the trained classifier. Online process; the trained classifier is used to detect anomalies
43
The MINDS Project
MINDS – MINnesota INtrusion Detection System
Learning from rare class – building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
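As an illustration of how such rules are scored, the sketch below computes the standard support and confidence measures for the two rules on the five transactions of the market-basket table. The helper names `count`, `support`, and `confidence` are chosen for this example only.

```python
# The five transactions from the market-basket table (TID -> items).
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}

def count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)

def support(itemset):
    """Fraction of all transactions that contain itemset."""
    return count(itemset) / len(transactions)

def confidence(lhs, rhs):
    """Of the transactions containing lhs, the fraction that also contain rhs."""
    return count(lhs | rhs) / count(lhs)

rules = [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]
for lhs, rhs in rules:
    print(sorted(lhs), "-->", sorted(rhs),
          "support", round(support(lhs | rhs), 2),
          "confidence", round(confidence(lhs, rhs), 2))
```

Milk appears in four transactions and three of those also contain Coke, so the first rule has support 3/5 and confidence 3/4; the second rule has support 2/5 and confidence 2/3.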
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from “normal”, i.e. anomalous, behavior
Identify normal behavior
Construct a useful set of features
Define a similarity function
Use an outlier detection algorithm:
  Nearest neighbor approach
  Density based schemes
  Unsupervised Support Vector Machines (SVM)
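The nearest neighbor approach listed above can be sketched as follows. This is a generic k-th-nearest-neighbor distance score, not the specific algorithm used in MINDS; the two features, the sample values, and the choice of Euclidean distance as the similarity function are assumptions made for illustration.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor.

    A point far from all others gets a large score and is a likely outlier.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical feature vectors: (packets per connection, hundreds of bytes).
connections = [(3, 1.4), (3, 1.5), (4, 1.6), (3, 1.6), (40, 90.0)]
scores = knn_outlier_scores(connections, k=2)

# Index of the connection with the highest anomaly score.
most_anomalous = max(range(len(connections)), key=scores.__getitem__)
```

In practice the features would need scaling so that no single attribute dominates the distance; that step is omitted here.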
46
Experimental Evaluation
[Figure: MINDS evaluation pipeline. Network; net-flow data collected using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis. Snort, the open source signature-based network IDS (www.snort.org), is also shown.]
10-minute cycle: 2 million connections
Anomaly detection is applied 4 times a day, on a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from the University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Components: anomaly detection system; ranked connections (attack, normal); discriminating association pattern generator; rules R1 (TCP, DstPort=1863 --> Attack) ... R100 (TCP, DstPort=80 --> Normal); knowledge base (update)]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2=0)
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
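The slides do not define c1 and c2. The sketch below assumes that c1 is the number of connections ranked as anomalous that a pattern covers and c2 the number of connections ranked as normal that it covers; under that reading, a pattern with large c1 and small c2 discriminates well. The connection records, field names, and values are hypothetical.

```python
def matches(pattern, connection):
    """True if the connection has every field=value pair of the pattern."""
    return all(connection.get(field) == value for field, value in pattern.items())

def pattern_counts(pattern, anomalous, normal):
    """Return (c1, c2): covered anomalous and covered normal connections.

    Assumed meaning of c1 and c2; the slides do not define them.
    """
    c1 = sum(matches(pattern, c) for c in anomalous)
    c2 = sum(matches(pattern, c) for c in normal)
    return c1, c2

# Hypothetical connection records, already split by the anomaly detector.
anomalous = [
    {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "WWWW", "DstPort": 22, "Protocol": "TCP", "Flag": "SYN"},
]
normal = [
    {"SrcIP": "VVVV", "DstPort": 80, "Protocol": "TCP", "Flag": "ACK"},
    {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
]

pattern = {"SrcIP": "XXXX", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"}
c1, c2 = pattern_counts(pattern, anomalous, normal)
```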
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack: policy violation
Outsider attack: network intrusion
54
IDS Using Both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
A distance-based outlier detection algorithm is used to detect deviating behavior in the collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
For example, we envision a window where we observe a 3-event sequence: E, D, and F
An FER is generated as E → D, F with confidence level freq(a ∪ b) / freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If b occurs with a frequency of 5% and the joint event a and b occurs with a frequency of 4%, there is a (0.04/0.05) = 80% chance that D and F will follow in the same window
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF)
The events D, F may be two sequential smtp requests, denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec
The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated
This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)   (1)
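One way to picture rule (1) is as a small record plus a check for whether its episode occurs in a stream of connection events. The sketch below is illustrative only and is not the CAIDS implementation: the class name, the `episode_occurs` helper, and the reduction of each event to a (timestamp, service) pair are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: tuple         # ordered events on the left-hand side
    rhs: tuple         # ordered events on the right-hand side
    confidence: float  # c
    support: float     # s
    window: float      # w, in seconds

def episode_occurs(rule, events):
    """Check whether the LHS events followed by the RHS events appear in order
    within rule.window seconds of the first LHS event.

    events: list of (timestamp_seconds, service), sorted by timestamp.
    """
    episode = rule.lhs + rule.rhs
    for start, (t0, service) in enumerate(events):
        if service != episode[0]:
            continue
        idx = 1  # next episode position to match
        for t, s in events[start + 1:]:
            if idx == len(episode) or t - t0 > rule.window:
                break
            if s == episode[idx]:
                idx += 1
        if idx == len(episode):
            return True
    return False

# Rule (1): authentication -> smtp, smtp with (c, s, w) = (0.8, 0.1, 2 sec).
rule = FrequentEpisodeRule(("authentication",), ("smtp", "smtp"), 0.8, 0.1, 2.0)

in_window = [(0.0, "authentication"), (0.5, "smtp"), (1.2, "http"), (1.8, "smtp")]
too_late = [(0.0, "authentication"), (0.5, "smtp"), (3.0, "smtp")]
```

With these inputs, `episode_occurs(rule, in_window)` holds because both smtp events fall inside the 2-second window, while `episode_occurs(rule, too_late)` does not because the second smtp event arrives after 3 seconds.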
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule is aimed at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window)   (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS episode of the rule
59
A cooperative anomaly and intrusion detection system (CAIDS)
[Figure: architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We have summarized those system models, their technologies, and their validation methods
We hope to have given an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set in KDD Cup ’99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. “ADAM: Detecting Intrusions by Data Mining”. Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
41
ADAM Detecting Intrusion by Data Mining
42
ADAM Audit Data Analysis and Mining
ADAM has two phases in their model
1st Phase Train the classifier Offline process Takes place only once Before the main experiment
2nd Phase Using the trained classifier Trained classifier is then used to detect anomalies Online process
43
The MINDS Project
MINDS ndash MINnesota INtrusion Detection System
Learning from Rare Class ndash Building rare class prediction models
Anomalyoutlier detection
Summarization of attacks using association pattern analysis
TID Items
1 Bread Coke Milk
2 Beer Bread
3 Beer Coke Diaper Milk
4 Beer Bread Diaper Milk
5 Coke Diaper Milk
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
Rules Discovered Milk --gt Coke Diaper Milk --gt Beer
44
MINDS - Learning from Rare Class
Problem Building models for rare network attacks (Mining needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
46
Experimental Evaluation
network
net-flow data using CISCO
routers
Data preprocessing
MINDSanomaly detection
helliphellip
Anomaly
scores Association pattern analysis
Open source signature-based network IDS wwwsnortor
g
10 minutes cycle
2 millions connections
Anomaly detection is applied
4 times a day
10 minutes time window
bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment
bull Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
Anomaly Detection System
attack
normal
R1 TCP DstPort=1863 Attack
hellip
hellip
hellip
hellip
R100 TCP DstPort=80 Normal
Discriminating Association
Pattern Generator
1 Build normal profile
2 Study changes in normal behavior
3 Create attack summary
4 Detect misuse behavior
5 Understand nature of the attack
update
Knowledge Base
Ranked connections
MINDS association analysis module
48
Discovered Real-life Association Patterns
At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first
followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)
Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)
49
DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
Discovered Real-life Association Patterns
50
SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)
SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)
SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)
helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)
Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window
Discovered Real-life Association Patternshellip(ctd)
51
DstPort=6667 Protocol=TCP (c1=254 c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets
fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged
as anomalous is interesting This might indicate that the IRC server has been taken down (by a
DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)
Discovered Real-life Association Patternshellip(ctd)
52
DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)
DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)
DstPort=1863 Protocol=TCP (c1=606 c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
Discovered Real-life Association Patternshellip(ctd)
53
MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be
detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks
originating from multiple sites
Wormvirus detectionafter infection
Insider attack Policy violation
Outsider attack Network intrusion
54
IDS Using both Misuse and Anomaly DetectionRIDS-100
RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China
The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China
RIDS make the use of both intrusion detection technique misuse and anomaly detection
Distance based outlier detection algorithm is used for detection deviational behavior among collected network data
For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection
This large amount of data pattern is scanned using data mining classification Decision Tree algorithm
httpwwwrising-globalcom
55
A cooperative anomaly and intrusiondetection system (CAIDS)
built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusiondetection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events
For an example we envision a window where we observe a 3-event sequence
E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t
o the two events D and F on the RHS of the rule
If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window
57
A cooperative anomaly and intrusiondetection system (CAIDS)
In practice the event E could be an authentication service characterized by two attributes
(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b
y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t
hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows
(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)
58
A cooperative anomaly and intrusiondetection system (CAIDS)
An association rule is aimed at finding interesting intra-relationship inside a single connection record
In general an FER is specified by the following expression
L1 L2hellip Ln R1hellip Rm (c s window) (2)
Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events
We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule
59
A cooperative anomaly and intrusiondetection system (CAIDS)
Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied basic concept and some classic system models like ADAM MINDSin this area
To make summary of those system models their technologies and their validation methods
Hope to a overview on currently development in this area and how data mining is evolving into the field of network intrusion detection
61
Reference DARPA 1998 data set
A cleansed set in KDDCuprsquo99 DARPA 1991 data set is also available httpwwwllmiteduISTidevaldatadata_indexhtml
Daniel Barbara Julia Couto Sushil Jajodia Leonard Popyack Ningning Wu ldquoADAM Detecting Intrusions by Data Miningrdquo Proceedings of the 2001 IEEE Workshop on Information Assurance and Security United States Military Academy West Point NY 5-6 June 2001
Zhang J and Zulkernine M 2006 A Hybrid Network Intrusion Detection Technique Using Random Forests In Proceedings of the First international Conference on Availability Reliability and Security (April 20 - 22 2006)
W Lee et al A data mining framework for building intrusion detection models In Information and System Security Vol 3 No 4 2000
Ertoz L et Al MINDS - Minnesota Intrusion Detection System Next Generation Data Mining Chapter 3 2004
Exploiting efficient data mining techniques to enhance intrusion detection systems Lu C-T Boedihardjo AP Manalwar P Information Reuse and Integration Conf 2005 IRI -2005 IEEE International Conference on Volume Issue 15-17 Aug 2005 Page(s) 512 - 517
Sal Stolfo Andreas Prodromidis Shelley Tselepis Wenke Lee Dave Fan and Phil Chan (Honorable mention (runner-up) for Best Paper Award in Applied Research Category) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97) Newport Beach CA August 1997
62
Questions & Comments
42
ADAM: Audit Data Analysis and Mining
The ADAM model has two phases
1st phase: train the classifier. Offline process that takes place only once, before the main experiment
2nd phase: use the trained classifier to detect anomalies. Online process
43
The MINDS Project
MINDS - MINnesota INtrusion Detection System
Learning from rare class - building rare class prediction models
Anomaly/outlier detection
Summarization of attacks using association pattern analysis
TID   Items
1     Bread, Coke, Milk
2     Beer, Bread
3     Beer, Coke, Diaper, Milk
4     Beer, Bread, Diaper, Milk
5     Coke, Diaper, Milk
Rules discovered: {Milk} --> {Coke}; {Diaper, Milk} --> {Beer}
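The two rules can be checked against the five transactions with a few lines of Python. This is a minimal sketch, not part of MINDS: the helper names `support` and `confidence` are chosen here for illustration, and confidence is taken in the usual textbook sense (support of both sides together divided by support of the left-hand side).

```python
from fractions import Fraction

# The five transactions from the table above.
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(lhs, rhs, transactions=TRANSACTIONS):
    """Support of lhs and rhs together, divided by the support of lhs."""
    return support(lhs | rhs, transactions) / support(lhs, transactions)


milk_coke = confidence({"Milk"}, {"Coke"})
diaper_milk_beer = confidence({"Diaper", "Milk"}, {"Beer"})
print(milk_coke, diaper_milk_beer)  # 3/4 2/3
```

Milk appears in four transactions and three of them also contain Coke, so the first rule holds with confidence 3/4. Diaper and Milk appear together in three transactions, two of which contain Beer, giving 2/3.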
44
MINDS - Learning from Rare Class
Problem: building models for rare network attacks (mining a needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
45
MINDS - Anomaly Detection
Detect novel attacks/intrusions by identifying them as deviations from "normal", i.e. anomalous, behavior
Identify normal behavior
Construct useful set of features
Define similarity function
Use outlier detection algorithm
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
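As an illustration of the nearest neighbor approach, the sketch below scores a connection by its mean distance to the k closest records in a set assumed to be normal. It is a plain k-nearest-neighbor distance score under assumed inputs, not the MINDS algorithm itself: the function name, the two-feature records, and the Euclidean similarity function are all choices made here for the example.

```python
import math


def knn_anomaly_score(point, reference, k=2):
    """Mean Euclidean distance from point to its k nearest reference points."""
    distances = sorted(math.dist(point, r) for r in reference)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


# Hypothetical two-feature connection records, for illustration only.
normal = [(3.0, 1.5), (3.0, 1.6), (4.0, 1.9), (4.0, 2.0), (3.0, 1.4)]
typical = (3.0, 1.7)
unusual = (40.0, 30.0)

scores = {
    "typical": knn_anomaly_score(typical, normal),
    "unusual": knn_anomaly_score(unusual, normal),
}
print(scores)
```

A record far from every normal record gets a large score and would be ranked near the top for an analyst to inspect. In practice the features would need scaling so that no single attribute dominates the distance.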
46
Experimental Evaluation
[Figure: evaluation pipeline. Labels: network; net-flow data collected using Cisco routers; data preprocessing; MINDS anomaly detection; anomaly scores; association pattern analysis; Snort, an open source signature-based network IDS (www.snort.org); 10-minute cycle, 2 million connections]
Anomaly detection is applied 4 times a day on a 10-minute time window
• Publicly available data set: DARPA 1998 Intrusion Detection Evaluation Data Set, prepared and managed by MIT Lincoln Lab; includes a wide variety of intrusions simulated in a military network environment
• Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
[Figure: MINDS association analysis module. Labels: anomaly detection system; ranked connections marked attack or normal (R1: TCP, DstPort=1863, Attack; ...; R100: TCP, DstPort=80, Normal); discriminating association pattern generator; knowledge base (update)]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
48
Discovered Real-life Association Patterns
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120...180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120...180 (c1=177, c2=0)
At first glance, Rule 1 appears to describe a Web scan. Rule 2 indicates an attack on a specific machine.
Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker.
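The slides do not define c1 and c2. The sketch below assumes that c1 is the number of connections ranked as anomalous that match a pattern and c2 is the number of normal connections that match it, so a discriminating pattern has a large c1 and a small c2. The function names and the sample records are hypothetical.

```python
def matches(pattern, connection):
    """True when the connection has every attribute value listed in the pattern."""
    return all(connection.get(key) == value for key, value in pattern.items())


def pattern_counts(pattern, anomalous, normal):
    """Return (c1, c2): matches among anomalous and among normal connections.

    The meaning of c1 and c2 is an assumption, not stated on the slides.
    """
    c1 = sum(1 for c in anomalous if matches(pattern, c))
    c2 = sum(1 for c in normal if matches(pattern, c))
    return c1, c2


# Hypothetical connection records, for illustration only.
anomalous = [
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"DstPort": 6667, "Protocol": "TCP", "Flag": "ACK"},
]
normal = [
    {"DstPort": 80, "Protocol": "TCP", "Flag": "ACK"},
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
]

web_syn = {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"}
print(pattern_counts(web_syn, anomalous, normal))  # (2, 1)
```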
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns ... (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189...200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189...200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
...
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns ... (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns ... (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack: policy violation
Outsider attack: network intrusion
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
A distance-based outlier detection algorithm is used to detect deviating behavior among collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a data mining classification algorithm (decision tree)
http://www.rising-global.com
55
A cooperative anomaly and intrusion detection system (CAIDS)
Built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
For example, we envision a window where we observe a 3-event sequence E, D, and F. An FER is generated as E → D, F with confidence level freq(a U b) / freq(b) = 0.8, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If b occurs with probability 5% and the joint event of a and b occurs with probability 4%, there is a (0.04 / 0.05) = 80% chance that D and F will follow in the same window
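The arithmetic can be reproduced as follows. This is a small sketch using the slide's own ratio, freq(a U b) / freq(b); note that textbook association-rule confidence usually divides by the frequency of the left-hand side instead, so the denominator here follows the slide rather than that convention. The function name is chosen for illustration, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def fer_confidence(freq_joint, freq_b):
    """Confidence ratio as written on the slide: freq(a U b) / freq(b)."""
    if freq_b <= 0:
        raise ValueError("freq_b must be positive")
    return freq_joint / freq_b


# 4% for the joint event, 5% for b, as in the example above.
c = fer_confidence(Fraction(4, 100), Fraction(5, 100))
print(c, float(c))  # 4/5 0.8
```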
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D, F may be two sequential smtp requests denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window w = 2 sec
The three joint traffic events account for a support level s = 10% out of all the network connections being evaluated. This FER is formally stated as follows:
(service = authentication) → (service = smtp), (service = smtp) (0.8, 0.1, 2 sec)   (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule aims to find interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
L1, L2, ..., Ln → R1, ..., Rm (c, s, window)   (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, ..., Ln the LHS episode and R1, ..., Rm the RHS episode of the rule
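Expression (2) maps naturally onto a small record type. The sketch below is one possible encoding, not the CAIDS implementation: the class name, the field names, and the threshold check `passes` are assumptions made for illustration. Events are kept as plain strings in their original order.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FrequentEpisodeRule:
    lhs: Tuple[str, ...]   # ordered events L1 ... Ln
    rhs: Tuple[str, ...]   # ordered events R1 ... Rm
    confidence: float      # c
    support: float         # s
    window_sec: float      # window length in seconds

    def passes(self, min_confidence: float, min_support: float) -> bool:
        """True when the rule meets both (hypothetical) thresholds."""
        return self.confidence >= min_confidence and self.support >= min_support

    def __str__(self) -> str:
        left = ", ".join(self.lhs)
        right = ", ".join(self.rhs)
        return (f"{left} -> {right} "
                f"({self.confidence}, {self.support}, {self.window_sec} sec)")


# Rule (1) from the authentication/smtp example.
rule_1 = FrequentEpisodeRule(
    lhs=("(service = authentication)",),
    rhs=("(service = smtp)", "(service = smtp)"),
    confidence=0.8,
    support=0.1,
    window_sec=2,
)
print(rule_1)
```

Keeping the LHS and RHS as ordered tuples preserves the episode order, which is what distinguishes an FER from an association rule over a single connection record.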
45
MINDS - Anomaly Detection
Detect novel attacksintrusions by identifying them as deviations from ldquonormalrdquo ie anomalous behaviorIdentify normal behaviorConstruct useful set of featuresDefine similarity functionUse outlier detection algorithm
Nearest neighbor approach
Density based schemes
Unsupervised Support Vector Machines (SVM)
46
Experimental Evaluation
network
net-flow data using CISCO
routers
Data preprocessing
MINDSanomaly detection
helliphellip
Anomaly
scores Association pattern analysis
Open source signature-based network IDS wwwsnortor
g
10 minutes cycle
2 millions connections
Anomaly detection is applied
4 times a day
10 minutes time window
bull Publicly available data setDARPA 1998 Intrusion Detection Evaluation Data Set prepared and managed by MIT Lincoln Lab includes a wide variety of intrusions simulated in a military network environment
bull Real network data from University of Minnesota
47
MINDS - Framework for Mining Associations
Anomaly Detection System
attack
normal
R1 TCP DstPort=1863 Attack
hellip
hellip
hellip
hellip
R100 TCP DstPort=80 Normal
Discriminating Association
Pattern Generator
1 Build normal profile
2 Study changes in normal behavior
3 Create attack summary
4 Detect misuse behavior
5 Understand nature of the attack
update
Knowledge Base
Ranked connections
MINDS association analysis module
48
Discovered Real-life Association Patterns
At first glance, Rule 1 appears to describe a Web scan, and Rule 2 indicates an attack on a specific machine. Together, the two rules indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=256, c2=1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=120…180 (c1=177, c2=0)
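The slides do not define c1 and c2. A minimal sketch, assuming c1 is the number of connections matching the pattern among those ranked as anomalous and c2 the number matching among those ranked as normal, shows how such counts could be computed. The sample connections, addresses, and function names are hypothetical.

```python
def matches(pattern, conn):
    """True if the connection has every attribute value the pattern requires."""
    return all(conn.get(key) == value for key, value in pattern.items())

def pattern_counts(pattern, anomalous, normal):
    """Count pattern matches in the anomalous set (c1) and the normal set (c2)."""
    c1 = sum(matches(pattern, c) for c in anomalous)
    c2 = sum(matches(pattern, c) for c in normal)
    return c1, c2

# Hypothetical connection records, already split by anomaly ranking.
anomalous = [
    {"SrcIP": "10.0.0.1", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "10.0.0.1", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"SrcIP": "10.0.0.2", "DstPort": 22, "Protocol": "TCP", "Flag": "ACK"},
]
normal = [
    {"SrcIP": "10.0.0.3", "DstPort": 80, "Protocol": "TCP", "Flag": "ACK"},
    {"SrcIP": "10.0.0.1", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
]

pattern = {"SrcIP": "10.0.0.1", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"}
c1, c2 = pattern_counts(pattern, anomalous, normal)
```

Under this assumption, a pattern with a high c1 and a c2 near zero discriminates well between anomalous and normal traffic, which is the shape of the rules shown on these slides.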
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature-based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack / policy violation
Outsider attack / network intrusion
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection
A distance-based outlier detection algorithm is used to detect deviant behavior in the collected network data
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data
This large set of data patterns is scanned using a decision tree, a data mining classification algorithm
http://www.rising-global.com
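The slides do not say which distance-based outlier algorithm RIDS uses. As a rough illustration of the general idea only, the sketch below flags a point as an outlier when too small a fraction of the other points lies within a given radius. The function name, the parameters `radius` and `min_fraction`, and the sample data are all assumptions made for this example.

```python
import math

def distance_based_outliers(points, radius, min_fraction):
    """Return indices of points that have too few neighbors within `radius`.

    A point is flagged when the fraction of the other points lying within
    `radius` of it is below `min_fraction`.
    """
    n = len(points)
    outliers = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        if neighbors / (n - 1) < min_fraction:
            outliers.append(i)
    return outliers

# Hypothetical, already-normalized feature vectors for five connections.
data = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 9.0)]
flagged = distance_based_outliers(data, radius=1.0, min_fraction=0.5)
```

This brute-force version compares every pair of points, so it is only practical for small samples; it is meant to show the criterion, not an efficient implementation.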
55
A cooperative anomaly and intrusion detection system (CAIDS)
CAIDS is built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusion detection system (CAIDS)
A frequent episode rule (FER) is generated from a collection of frequent episodes. The FER is defined over episode sequences with multiple connection events
For example, consider a window in which we observe a 3-event sequence E, D, and F. An FER is generated as $E \rightarrow D, F$ with confidence level $\mathrm{freq}(a \cup b) / \mathrm{freq}(b) = 0.8$, where a represents the event E on the LHS and b corresponds to the two events D and F on the RHS of the rule
If b occurs with a frequency of 5% and the joint event of a and b occurs with a frequency of 4%, there is a $0.04 / 0.05 = 80\%$ chance that D and F will follow in the same window
57
A cooperative anomaly and intrusion detection system (CAIDS)
In practice, the event E could be an authentication service characterized by two attributes (service = authentication, flag = SF). The events D and F may be two sequential smtp requests, each denoted by (service = smtp)
Thus we can derive an FER with a confidence level of c = 80% that two smtp services will follow the authentication service within a window of w = 2 sec. The three joint traffic events have a support level of s = 10% of all the network connections being evaluated. This FER is formally stated as follows:
$(\text{service} = \text{authentication}) \rightarrow (\text{service} = \text{smtp}), (\text{service} = \text{smtp}) \quad (0.8,\ 0.1,\ 2\ \text{sec})$ (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule aims at finding interesting intra-relationships inside a single connection record
In general, an FER is specified by the following expression:
$L_1, L_2, \ldots, L_n \rightarrow R_1, \ldots, R_m \quad (c,\ s,\ \text{window})$ (2)
where $L_i$ ($1 \le i \le n$) and $R_j$ ($1 \le j \le m$) are ordered traffic connection events
We call $L_1, L_2, \ldots, L_n$ the LHS episode and $R_1, \ldots, R_m$ the RHS of the episode rule
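To make the rule format concrete, here is a small sketch of how an FER could be represented and checked against a trace of timestamped events. It simplifies each connection event to a single service name and treats the rule as satisfied when the LHS events followed by the RHS events occur in order within the window. The class `EpisodeRule`, the function `follows_rule`, and the sample traces are hypothetical and are not taken from the CAIDS system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EpisodeRule:
    lhs: tuple         # ordered LHS events L1..Ln
    rhs: tuple         # ordered RHS events R1..Rm
    confidence: float  # c
    support: float     # s
    window: float      # window length in seconds

def follows_rule(rule, events):
    """Check whether LHS then RHS occur in order within rule.window seconds.

    `events` is a list of (timestamp, event) pairs sorted by timestamp.
    Other events may be interleaved between the matched ones.
    """
    episode = rule.lhs + rule.rhs
    for start, (t0, first) in enumerate(events):
        if first != episode[0]:
            continue
        pos = 1  # number of episode events matched so far
        for t, event in events[start + 1:]:
            if t - t0 > rule.window:
                break
            if pos < len(episode) and event == episode[pos]:
                pos += 1
        if pos == len(episode):
            return True
    return False

# Rule (1) from the previous slide, with events reduced to service names.
rule = EpisodeRule(lhs=("authentication",), rhs=("smtp", "smtp"),
                   confidence=0.8, support=0.1, window=2.0)

in_window = [(0.0, "authentication"), (0.5, "smtp"), (1.4, "smtp")]
too_slow = [(0.0, "authentication"), (0.5, "smtp"), (3.0, "smtp")]

matched = follows_rule(rule, in_window)       # all three events within 2 sec
not_matched = follows_rule(rule, too_slow)    # second smtp arrives too late
```

The confidence and support fields are carried as rule metadata only; estimating them requires counting episodes over many windows of training traffic, which this sketch does not attempt.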
59
A cooperative anomaly and intrusion detection system (CAIDS)
[Figure: Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset]
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We have summarized those system models, their technologies, and their validation methods
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set is in KDDCup'99. The DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining". Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A.P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments
48
Discovered Real-life Association Patterns
At first glance Rule 1 appears to describe a Web scan Rule 2 indicates an attack on a specific machine Both rules together indicate that a scan is performed first
followed by an attack on a specific machine identified as vulnerable by the attacker
Rule 1 SrcIP=XXXX DstPort=80 Protocol=TCP Flag=SYN NoPackets 3 NoBytes120hellip180 (c1=256 c2 = 1)
Rule 2 SrcIP=XXXX DstIP=YYYY DstPort=80 Protocol=TCPFlag=SYN NoPackets 3 NoBytes 120hellip180 (c1=177 c2 = 0)
49
DstIP=ZZZZ DstPort=8888 Protocol=TCP (c1=369 c2=0)DstIP=ZZZZ DstPort=8888 Protocol=TCP Flag=SYN (c1=291 c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
Having an unauthorized application increases the vulnerability of the system
Discovered Real-life Association Patterns
50
SrcIP=XXXX DstPort=27374 Protocol=TCP Flag=SYN NoPackets=4 NoBytes=189hellip200 (c1=582 c2=2)
SrcIP=XXXX DstPort=12345 NoPackets=4 NoBytes=189hellip200 (c1=580 c2=3)
SrcIP=YYYY DstPort=27374 Protocol=TCP Flag=SYN NoPackets=3 NoBytes=144 (c1=694 c2=3)
helliphellip This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for NetBus worm)
Further analysis showed that no fewer than five machines scanning for one or both of these ports in any time window
Discovered Real-life Association Patternshellip(ctd)
51
DstPort=6667 Protocol=TCP (c1=254 c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run Further analysis reveals that there are many small packets
fromto various IRC servers around the world Although IRC traffic is not unusual the fact that it is flagged
as anomalous is interesting This might indicate that the IRC server has been taken down (by a
DOS attack for example) or it is a rogue IRC server (it could be involved in some hacking activity)
Discovered Real-life Association Patternshellip(ctd)
52
DstPort=1863 Protocol=TCP Flag=0 NoPackets=1 NoByteslt139 (c1=498 c2=6)
DstPort=1863 Protocol=TCP Flag=0 (c1=587 c2=6)
DstPort=1863 Protocol=TCP (c1=606 c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863
Further analysis reveals that the remote IP block is owned by Hotmail
Flag=0 is unusual for TCP traffic
Discovered Real-life Association Patternshellip(ctd)
53
MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be
detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks
originating from multiple sites
Wormvirus detectionafter infection
Insider attack Policy violation
Outsider attack Network intrusion
54
IDS Using both Misuse and Anomaly DetectionRIDS-100
RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China
The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China
RIDS make the use of both intrusion detection technique misuse and anomaly detection
Distance based outlier detection algorithm is used for detection deviational behavior among collected network data
For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection
This large amount of data pattern is scanned using data mining classification Decision Tree algorithm
httpwwwrising-globalcom
55
A cooperative anomaly and intrusiondetection system (CAIDS)
built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusiondetection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events
For an example we envision a window where we observe a 3-event sequence
E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t
o the two events D and F on the RHS of the rule
If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window
57
A cooperative anomaly and intrusiondetection system (CAIDS)
In practice the event E could be an authentication service characterized by two attributes
(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b
y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t
hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows
(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)
58
A cooperative anomaly and intrusiondetection system (CAIDS)
An association rule is aimed at finding interesting intra-relationship inside a single connection record
In general an FER is specified by the following expression
L1 L2hellip Ln R1hellip Rm (c s window) (2)
Li (1 le i le n) and Rj (1 le j le m) are ordered traffic connection events
We call L1 L2hellip Ln the LHS episode and R1hellip Rm the RHS of the episode rule
59
A cooperative anomaly and intrusiondetection system (CAIDS)
Architecture of the CAIDS simulator built with a 2000-signature Snortand an anomaly detection subsystem (ADS) with 60 FERs after 2 weeksof rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we have studied the basic concepts and some classic system models in this area, such as ADAM and MINDS.
We have summarized these system models, their technologies, and their validation methods.
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection.
61
Reference
DARPA 1998 data set (a cleansed set is in KDDCup'99). The DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001.
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006).
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004.
Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. IEEE International Conference on Information Reuse and Integration (IRI-2005), 15-17 Aug. 2005, pp. 512-517.
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997.
62
Questions & Comments
49
Discovered Real-life Association Patterns
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ.
Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol.
Having an unauthorized application increases the vulnerability of the system.
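The pair (c1, c2) attached to each pattern can be illustrated with a small counting sketch. The slides in this section do not define c1 and c2, so the reading used here is an assumption: c1 is the number of connections flagged as anomalous that the pattern covers, and c2 is the number of normal connections it covers. The record layout, function names, and data are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]  # one connection summary, e.g. {"DstPort": 8888, "Protocol": "TCP"}


def covers(pattern: Record, record: Record) -> bool:
    """A pattern covers a record if the record agrees on every field the pattern names."""
    return all(record.get(k) == v for k, v in pattern.items())


def pattern_counts(pattern: Record,
                   anomalous: List[Record],
                   normal: List[Record]) -> Tuple[int, int]:
    """Return (c1, c2): matches among anomalous and among normal records (assumed meaning)."""
    c1 = sum(covers(pattern, r) for r in anomalous)
    c2 = sum(covers(pattern, r) for r in normal)
    return c1, c2


# Made-up records for illustration only.
anomalous = [
    {"DstIP": "ZZZZ", "DstPort": 8888, "Protocol": "TCP", "Flag": "SYN"},
    {"DstIP": "ZZZZ", "DstPort": 8888, "Protocol": "TCP", "Flag": "SYN"},
    {"DstIP": "ZZZZ", "DstPort": 8888, "Protocol": "TCP", "Flag": "ACK"},
    {"DstIP": "AAAA", "DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
]
normal = [
    {"DstIP": "AAAA", "DstPort": 80, "Protocol": "TCP", "Flag": "ACK"},
    {"DstIP": "BBBB", "DstPort": 25, "Protocol": "TCP", "Flag": "ACK"},
]

broad = pattern_counts({"DstIP": "ZZZZ", "DstPort": 8888, "Protocol": "TCP"}, anomalous, normal)
narrow = pattern_counts({"DstIP": "ZZZZ", "DstPort": 8888, "Protocol": "TCP", "Flag": "SYN"},
                        anomalous, normal)
print(broad, narrow)
```

Under this reading, a pattern with a large c1 and a c2 near zero summarizes many anomalous connections while covering almost no normal traffic, and adding a field to a pattern can only keep or lower both counts.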
50
Discovered Real-life Association Patterns… (ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4, NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200 (c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3, NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm).
Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window.
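The follow-up analysis described here, counting how many distinct machines scan the two ports in each time window, can be sketched as follows. This is an illustrative example only: the tuple layout, the fixed-size window bucketing, and the sample connections are assumptions, not part of MINDS.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

SCAN_PORTS = {27374, 12345}  # ports named in the pattern above


def scanners_per_window(connections: Iterable[Tuple[float, str, int]],
                        window_seconds: float) -> Dict[int, Set[str]]:
    """Group source IPs that touched a scan port by fixed-size time window index."""
    result: Dict[int, Set[str]] = defaultdict(set)
    for timestamp, src_ip, dst_port in connections:
        if dst_port in SCAN_PORTS:
            result[int(timestamp // window_seconds)].add(src_ip)
    return dict(result)


# Made-up connections: (timestamp in seconds, source IP, destination port).
connections = [
    (1.0, "10.0.0.1", 27374),
    (2.0, "10.0.0.2", 12345),
    (3.0, "10.0.0.1", 12345),
    (4.0, "10.0.0.3", 80),
    (65.0, "10.0.0.4", 27374),
]
by_window = scanners_per_window(connections, window_seconds=60.0)
print({w: len(srcs) for w, srcs in by_window.items()})
```

An analyst could then compare the number of distinct sources per window against a threshold such as the five machines mentioned above.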
51
Discovered Real-life Association Patterns… (ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector.
Port 6667 is where IRC (Internet Relay Chat) is typically run. Further analysis reveals that there are many small packets from/to various IRC servers around the world.
Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting. This might indicate that the IRC server has been taken down (by a DoS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity).
52
Discovered Real-life Association Patterns… (ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139 (c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP connections on port 1863.
Further analysis reveals that the remote IP block is owned by Hotmail.
Flag=0 is unusual for TCP traffic.
53
MINDS Conclusion
Data mining based algorithms are capable of detecting intrusions that cannot be detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research
Defining normal behavior
Feature extraction
Similarity functions
Outlier detection
Result summarization
Detection of attacks originating from multiple sites
Worm/virus detection after infection
Insider attack, policy violation
Outsider attack, network intrusion
54
IDS Using both Misuse and Anomaly Detection: RIDS-100
RIDS (Rising Intrusion Detection System) is provided by Rising Tech, a leader in antivirus and content security software and services in China.
The company is a leading provider of client, gateway, and server security solutions for virus protection, firewall, and intrusion detection technologies, and of security services to enterprises and service providers around China.
RIDS makes use of both intrusion detection techniques: misuse detection and anomaly detection.
A distance-based outlier detection algorithm is used to detect deviating behavior among the collected network data.
For misuse detection, it has a very large set of collected data patterns that can be matched against scanned network data.
This large set of data patterns is scanned using a data mining classification algorithm, the decision tree.
http://www.rising-global.com
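The slide does not say which distance-based outlier algorithm RIDS uses. As a generic illustration of the idea, the sketch below scores each point by its distance to its k-th nearest neighbor, so that points far from everything else get high scores. The feature vectors, the choice of k, and the function name are assumptions made for the example.

```python
import math
from typing import List, Sequence


def knn_outlier_scores(points: List[Sequence[float]], k: int) -> List[float]:
    """Score each point by the Euclidean distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Made-up 2-D feature vectors: four clustered points and one far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_outlier_scores(points, k=2)
most_deviating = scores.index(max(scores))
print(most_deviating, scores)
```

This brute-force version compares every pair of points, which is fine for a toy example but would need indexing or sampling on real traffic volumes.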
55
A cooperative anomaly and intrusion detection system (CAIDS)
built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
53
MINDS Conclusion Data mining based algorithms are capable of detecting intrusions that cannot be
detected by state-of-the-art signature based methods
SNORT has static knowledge manually updated by human analysts
MINDS anomaly detection algorithms are adaptive in nature
MINDS anomaly detection algorithms can also be effective in detecting anomalous behavior originating from a compromised or infected machine
MINDS Research Defining normal behavior Feature extraction Similarity functions Outlier detection Result summarization Detection of attacks
originating from multiple sites
Wormvirus detectionafter infection
Insider attack Policy violation
Outsider attack Network intrusion
54
IDS Using both Misuse and Anomaly DetectionRIDS-100
RIDS( Rising Intrusion Detection System) is provided by Rising Tech It is a leader in antivirus and content security software and services in China
The company is a leading provider of client gateway and server security solutions for virus protection firewall and intrusion detection technologies and security services to enterprises and service providers around China
RIDS make the use of both intrusion detection technique misuse and anomaly detection
Distance based outlier detection algorithm is used for detection deviational behavior among collected network data
For misuse detection it has very vast set of collected data pattern which can be matched with scanned network data for misuse detection
This large amount of data pattern is scanned using data mining classification Decision Tree algorithm
httpwwwrising-globalcom
55
A cooperative anomaly and intrusiondetection system (CAIDS)
built with a network-based intrusion detection system (NIDS) and an anomaly detection system (ADS) operating interactively through a signature generator
56
A cooperative anomaly and intrusiondetection system (CAIDS)
A frequent episode rule (FER) is generated out of a collection of frequent episodes The FER is defined over episode sequences with multiple connection events
For an example we envision a window where we observe a 3-event sequence
E D and F An FER is generated as E rarr D F confidence level freq (a U b)freq (b)=08 where a represents the event E on the LHS and b corresponds t
o the two events D and F on the RHS of the rule
If the b occurs with 5 and the joint event a and b has 4 to occur there is a (004005) = 80 chance that D and F will follow in the same window
57
A cooperative anomaly and intrusiondetection system (CAIDS)
In practice the event E could be an authentication service characterized by two attributes
(service =authentication flag=SF) The events D F may be two sequential smtp requests denoted b
y (service = smtp) Thus we can derive an FER with a confidence level of c = 80 t
hat two smtp services will follow the authentication service within a window w = 2 sec The three joint traffic events accounts with a support level s = 10 out of all the network connections being evaluated This FER is formally stated as follows
(service = authentication) rarr (service = smtp) (service = smtp) (08 01 2 sec) (1)
58
A cooperative anomaly and intrusion detection system (CAIDS)
An association rule aims at finding interesting intra-relationships inside a single connection record, whereas an FER relates multiple connection events
In general, an FER is specified by the following expression:
L1, L2, …, Ln → R1, …, Rm (c, s, window) (2)
Li (1 ≤ i ≤ n) and Rj (1 ≤ j ≤ m) are ordered traffic connection events
We call L1, L2, …, Ln the LHS episode and R1, …, Rm the RHS of the episode rule
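One possible way to hold the general form (2) in code is a small record type with an ordered LHS, an ordered RHS, and the three parameters c, s, and window. The Python sketch below is an assumption about a convenient representation, not the data structure used in CAIDS; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeRule:
    lhs: tuple         # ordered LHS events L1 ... Ln
    rhs: tuple         # ordered RHS events R1 ... Rm
    confidence: float  # c, as a fraction in [0, 1]
    support: float     # s, as a fraction in [0, 1]
    window: float      # window size in seconds

    def __post_init__(self):
        if not self.lhs or not self.rhs:
            raise ValueError("both sides of the rule need at least one event")
        if not (0.0 <= self.confidence <= 1.0 and 0.0 <= self.support <= 1.0):
            raise ValueError("confidence and support must lie in [0, 1]")

    def __str__(self):
        left = ", ".join(self.lhs)
        right = ", ".join(self.rhs)
        params = f"({self.confidence}, {self.support}, {self.window} sec)"
        return f"{left} -> {right} {params}"


# The authentication/smtp rule written in the general form.
rule = EpisodeRule(
    lhs=("service=authentication",),
    rhs=("service=smtp", "service=smtp"),
    confidence=0.8,
    support=0.1,
    window=2,
)
print(rule)
```

Keeping both sides as tuples preserves the event order that the definition requires, and freezing the record lets rules be stored in sets or used as dictionary keys.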
59
A cooperative anomaly and intrusion detection system (CAIDS)
Architecture of the CAIDS simulator, built with a 2000-signature Snort and an anomaly detection subsystem (ADS) with 60 FERs after 2 weeks of rule training over the Lincoln Lab IDS evaluation dataset
60
Conclusion
In this report we studied the basic concepts and some classic system models in this area, such as ADAM and MINDS
We summarized those system models, their technologies, and their validation methods
We hope this gives an overview of current developments in this area and of how data mining is evolving into the field of network intrusion detection
61
Reference
DARPA 1998 data set. A cleansed set in KDD Cup '99. DARPA 1991 data set is also available: http://www.ll.mit.edu/IST/ideval/data/data_index.html
Daniel Barbara, Julia Couto, Sushil Jajodia, Leonard Popyack, Ningning Wu. "ADAM: Detecting Intrusions by Data Mining." Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, United States Military Academy, West Point, NY, 5-6 June 2001
Zhang, J. and Zulkernine, M. 2006. A Hybrid Network Intrusion Detection Technique Using Random Forests. In Proceedings of the First International Conference on Availability, Reliability and Security (April 20-22, 2006)
W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000
Ertoz, L. et al. MINDS - Minnesota Intrusion Detection System. Next Generation Data Mining, Chapter 3, 2004
Lu, C.-T., Boedihardjo, A. P., Manalwar, P. Exploiting efficient data mining techniques to enhance intrusion detection systems. Information Reuse and Integration (IRI-2005), IEEE International Conference on, 15-17 Aug. 2005, pp. 512-517
Sal Stolfo, Andreas Prodromidis, Shelley Tselepis, Wenke Lee, Dave Fan, and Phil Chan. (Honorable mention (runner-up) for Best Paper Award in Applied Research Category.) In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97), Newport Beach, CA, August 1997
62
Questions & Comments