Mining Specifications of Malicious Behavior
![Page 1: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/1.jpg)
Mining Specifications of Malicious Behavior
Mihai Christodorescu, IBM Research (work done at University of Wisconsin)
Somesh Jha, University of Wisconsin
Christopher Kruegel, Technical University Vienna
![Page 2: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/2.jpg)
ESEC/FSE 2007, Sept. 5 Mihai Christodorescu 2
Wide Spectrum of Detectors
• Static detectors:
– Semantics-aware malware detection [Christodorescu et al. 2005]
– Model checking-based malware detection [Kinder et al. 2005]
• Dynamic/hybrid detectors, host IDS:
– Behavior-based spyware detection [Kirda et al. 2006]
– Shadow Honeypots [Anagnostakis et al. 2005]
……
![Page 3: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/3.jpg)
Misuse Detection
Distinct techniques, fundamentally similar…
They all require high-quality specifications of malicious behavior.
……
![Page 4: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/4.jpg)
Current Specification Generation
• Specifications are manually developed by experts.
Two issues:
– Time consuming
– Error prone
? Spec
![Page 5: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/5.jpg)
Problem 1: Specification Delay
Time from appearance of new malware to availability of specification:
• Manual analysis of the binary
• Testing of the specification
Anti-virus industry: 4-18 hours to generate a new specification.
window of vulnerability
![Page 6: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/6.jpg)
Problem 2: Spec. Imprecision
Too general = false positives → Angry users
Too specific = false negatives → Infected machines
![Page 7: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/7.jpg)
Our Solution
MINIMAL: a technique for mining malicious specifications
• Automatic
• Flexible specification language
• Fast
• Performs well (compared to a human expert)
![Page 8: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/8.jpg)
Specification Language
![Page 9: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/9.jpg)
What’s In a Specification?
Requirements for obfuscation resilience:
1. Describe only relevant operations
2. Capture dependencies where present
3. Preserve independence of operations
![Page 10: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/10.jpg)
Specifying Malicious Operations
• We chose system calls
– Compatible with specifications for behavior-based detectors
– Define interface between trusted OS and untrusted programs
• Mining algorithm is not restricted to the system-call interface.
![Page 11: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/11.jpg)
Specifying Malicious Constraints
• Program operations are insufficient to distinguish malicious from benign.
• We need to capture relations between operations:
F=open(“file”) ; read(F,buf) ; send(S,buf)
Constraints = logical formulas over system-call arguments
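As a minimal sketch of what "logical formulas over system-call arguments" could look like in practice, the Python below checks the open/read/send example from this slide. The `Call` record and the function `file_leak_constraint` are hypothetical names introduced only for illustration; they are not the representation used by MINIMAL.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Call:
    """Hypothetical record of one observed system call."""
    name: str           # system-call name
    args: tuple         # observed argument values
    ret: object = None  # observed return value, if any

def file_leak_constraint(open_call, read_call, send_call):
    """Formula: read uses the handle returned by open, AND
    send transmits the same buffer that read filled."""
    same_handle = read_call.args[0] == open_call.ret
    same_buffer = send_call.args[1] == read_call.args[1]
    return same_handle and same_buffer

# F=open("file") ; read(F,buf) ; send(S,buf)
related = [
    Call("open", ("file",), ret=3),
    Call("read", (3, "buf")),
    Call("send", (7, "buf")),
]
# Same three operations, but the arguments are unrelated.
unrelated = [
    Call("open", ("file",), ret=3),
    Call("read", (4, "buf")),
    Call("send", (7, "other")),
]

print(file_leak_constraint(*related))
print(file_leak_constraint(*unrelated))
```

Both traces contain the same operations; only the constraint over the arguments separates the file-leaking sequence from the unrelated one, which is the point the slide makes.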
![Page 12: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/12.jpg)
A Sample Specification (Malspec)
• Mass-mailing malware:
[Figure: malspec dependence graph]
Operations: S:=process_name() ; Z:=open(S2) ; Y:=read(Z2) ; X:=socket() ; connect(X2) ; send(X3,“EHLO”) ; send(X4,“DATA”) ; send(X5,T)
Constraints: S = S2 ; Z = Z2 ; X = X2 ; X2 = X3 ; X3 = X4 ; X4 = X5 ; StringEqual(T, Base64(Y))
![Page 13: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/13.jpg)
Mining Algorithm
![Page 14: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/14.jpg)
The Specification Mining Problem
[Figure: known malware + known benign programs → MINIMAL → specification of malicious behavior (the mass-mailing malspec)]
![Page 15: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/15.jpg)
The Basic Mining Operation
Step 1: Compute dependence graphs (a malware dependence graph from known malware, a benign dependence graph from a known benign program)
Step 2: Compute graph difference, yielding the minimal malspec
[Figure: malware dependence graph, benign dependence graph, and the resulting minimal malspec]
![Page 16: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/16.jpg)
Multi-Program Mining
Maximal union of malspecs:
[Figure: example malspec graphs illustrating the maximal union]
![Page 17: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/17.jpg)
System-Call Dependence Graph
• We use a dynamic analysis to construct the dependence graph
– Static analysis too imprecise on binary code
• Steps:
1. Collect system-call trace
2. Infer dependencies between system calls
3. Construct (an underapproximation of the) dependence graph
Step 1
![Page 18: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/18.jpg)
Discovering Dependences
NtOpenKey( 372, 0x20019, {24, 356, "ActiveComputerName", 0x40, 0, 0} )
NtQueryValueKey( 372, "ComputerName", Full, {TitleIdx=0, Type=1, Name="ComputerName", Data="Z..."}, 108, 76 )
NtClose( 372 )
Def-Use Dependences
Substring Dependences
Step 1
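The two kinds of dependences named on this slide can be illustrated with a short Python sketch. It assumes each traced call has already been split into the values it defines and the values it uses; the `Call` tuple and the functions `def_use_edges` and `substring_edges` are hypothetical helpers, and treating 372 as a handle defined by NtOpenKey is an assumption about the trace format.

```python
from typing import NamedTuple

class Call(NamedTuple):
    name: str
    defs: tuple  # values the call produces (assumed: returned handles)
    uses: tuple  # values the call consumes (handles, strings)

def def_use_edges(trace):
    """Link each use of a value to the most recent call that defined it."""
    edges, last_def = [], {}
    for i, call in enumerate(trace):
        for value in call.uses:
            if value in last_def:
                edges.append((last_def[value], i, "def-use"))
        for value in call.defs:
            last_def[value] = i
    return edges

def _strings(call):
    # Non-empty string values appearing anywhere in the call.
    return [v for v in call.defs + call.uses if isinstance(v, str) and v]

def substring_edges(trace):
    """Link two calls when a string of one is contained in a string of the other."""
    edges = []
    for i, earlier in enumerate(trace):
        for j in range(i + 1, len(trace)):
            if any(a in b or b in a
                   for a in _strings(earlier) for b in _strings(trace[j])):
                edges.append((i, j, "substring"))
    return edges

trace = [
    Call("NtOpenKey", defs=(372,), uses=("ActiveComputerName",)),
    Call("NtQueryValueKey", defs=(), uses=(372, "ComputerName")),
    Call("NtClose", defs=(), uses=(372,)),
]
print(def_use_edges(trace))
print(substring_edges(trace))
```

On the three-call trace above, the sketch reports def-use edges from NtOpenKey to both later calls (through handle 372) and one substring edge, because "ComputerName" is contained in "ActiveComputerName". Only dependences visible in the observed values are found, which is consistent with the slide's remark that the resulting graph is an underapproximation.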
![Page 19: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/19.jpg)
Discovering Local Constraints
• Access to well-defined resources:
– Windows registry
– Access to self
– System files/directories
Step 1
NtCreateFile ( …, { …, "I-Worm.Mydoom.l.exe", … }, … )
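One way to picture a local constraint is as a label attached to a single call when its path argument names a well-defined resource. The sketch below is only an illustration: the function `local_constraint`, the `\Registry\` prefix test, and the default system root `C:\WINNT` are assumptions chosen for the example, not details taken from the slides.

```python
import ntpath  # Windows path rules, usable on any platform

def local_constraint(path, process_image, system_root=r"C:\WINNT"):
    """Return a label for the resource class a path argument refers to.

    Hypothetical classification covering the three classes on the slide:
    access to self, Windows registry, system files/directories.
    """
    p = path.lower()
    if ntpath.basename(p) == ntpath.basename(process_image).lower():
        return "self"      # the program touches its own executable
    if p.startswith("\\registry\\"):
        return "registry"  # assumed native registry path prefix
    if p.startswith(system_root.lower() + "\\"):
        return "system"    # assumed system directory
    return None            # no local constraint recognized

print(local_constraint(r"C:\temp\I-Worm.Mydoom.l.exe", "I-Worm.Mydoom.l.exe"))
print(local_constraint(r"C:\WINNT\system32\drivers\etc\hosts", "sample.exe"))
```

For the NtCreateFile example on the slide, a file name equal to the running sample's own image name would be labeled as an access to self.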
![Page 20: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/20.jpg)
Graph Differencing
Problem: Find the smallest subgraph of malicious operations that does not appear in any benign graph.
Solution: Minimal Contrast Subgraph [Ting, Bailey SDM 2006]
Step 2
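The problem statement can be made concrete with a deliberately simplified Python sketch. It assumes every dependence graph is reduced to a set of labeled edges (source call, target call) and ignores subgraph isomorphism and connectivity, so it is a brute-force stand-in for the idea rather than the minimal contrast subgraph algorithm of Ting and Bailey. The function name `minimal_contrast_edge_sets` is hypothetical.

```python
from itertools import combinations

def minimal_contrast_edge_sets(malware_edges, benign_graphs):
    """Smallest sets of malware edges not contained in any benign edge set.

    Brute force over subsets by increasing size, so the cost grows
    exponentially with the number of malware edges.
    """
    edges = sorted(set(malware_edges))
    for size in range(1, len(edges) + 1):
        found = [set(subset) for subset in combinations(edges, size)
                 if not any(set(subset) <= benign for benign in benign_graphs)]
        if found:
            return found  # all contrast sets of the minimal size
    return []

malware = {("open", "read"), ("read", "send"), ("socket", "connect")}
benign = [
    {("open", "read"), ("socket", "connect")},
    {("socket", "connect"), ("connect", "send")},
]
print(minimal_contrast_edge_sets(malware, benign))
```

Here the single edge from read to send is the minimal contrast, because every other malware edge also occurs in a benign graph. When each edge occurs in some benign graph but no benign graph contains them all, the search moves on to larger subsets.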
![Page 21: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/21.jpg)
Minimal Contrast Subgraphs
• Idea: Minimal contrast subgraphs and maximal common edge sets are duals.
• Vertex and edge labels (i.e., system calls and constraint formulas) help the search.
Step 2
![Page 22: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/22.jpg)
Mining Contrast Subgraphs
• Size of graphs: 100K-1.5M nodes, similar for edges
• Worst-case complexity: O(N!)
Step 2
[Figure: malware dependence graph and benign dependence graph]
![Page 23: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/23.jpg)
Heuristics Reduce Problem Size
• Normalize dependence graph
– Replace system-call sequences with shorter equivalents
• Eliminate disconnected subgraphs
• Eliminate trivial subgraphs
[see paper for details]
![Page 24: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/24.jpg)
Evaluation
![Page 25: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/25.jpg)
Evaluating MINIMAL
• Goals:
– Compare MINIMAL malspecs with those from human expert
– Use mined malspecs with behavior-based detector
![Page 26: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/26.jpg)
Experimental Setup
• Trace collection in Windows 2000:
– Malware samples run with no user input (cf. expected execution model)
– Benign samples run with normal user input
– Execution for 1 or 2 minutes
• 16 malware samples:
– Netsky, MyDoom, Beagle
• 6 benign programs:
– Firefox, Thunderbird, installers
![Page 27: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/27.jpg)
MINIMAL vs. Human Expert
Average success rate: 77.26%
Average mining time: 8 minutes

| Malware | MINIMAL malspecs | Behavioral features as given by Symantec website |
|---|---|---|
| Netsky.A | 5 | 7 |
| Netsky.H | 5 | 6 |
| Bagle.C | 7 | 12 |
| MyDoom.E | 7 | 10 |
![Page 28: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/28.jpg)
Mined Malspecs for Netsky.A
MINIMAL malspecs
Create mutex
Self-installation
Modify boot sequence
Terminate antivirus
Email self as ZIP file
Copy self to network drive
![Page 29: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/29.jpg)
Limitations of MINIMAL
• Sensitive to test environment
– Malicious behavior might not be observed during tracing.
• Underapproximation of dependence graph
– Complex constraints are not discovered.
• Sensitive to test-set selection
– Not all differences are malicious behaviors.
Future Work
![Page 30: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/30.jpg)
Mining Specifications of Malicious Behavior
Mihai Christodorescu
![Page 31: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/31.jpg)
Here Be Dragons!
![Page 32: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/32.jpg)
Mining Malspecs from Malware
Dynamic differential analysis
Malware system-call trace
Benign program system-call trace
Malspec:
![Page 33: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/33.jpg)
Specifying Malicious Behavior
• Example
• Now manual
• NEED: automatic
![Page 34: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/34.jpg)
A C Q A O O O Q C F O Q O M C P F O M C
![Page 35: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/35.jpg)
1. Only Relevant Operations
• System calls
Approach: Collect system-call traces.
![Page 36: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/36.jpg)
2. Dependences
![Page 37: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/37.jpg)
3. Independence of Operations
![Page 38: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/38.jpg)
Good Specifications
• One can write specifications satisfying the requirements.
• The algorithm to generate specifications must write specifications satisfying the requirements.
![Page 39: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/39.jpg)
Misuse-Detection Fundamentals
• Database of specifications (“signatures”) defines what is malicious.
[Figure: unknown executable, malware detector with SigDB, protected computers]
![Page 40: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/40.jpg)
Failures of Current Detectors
• Byte signatures are not rich enough to capture malicious behavior.
Hackers evade detection through obfuscation.
[Figure: the set of all mass-mailing worms, with an underapproximating byte signature and an overapproximating byte signature]
![Page 41: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/41.jpg)
Behavior-Based Detectors
• New detection techniques use higher-level specifications:
– Semantics-Aware Malware Detection (2005)
– Model Checking-based Malware Detection (2005)
– Behavior-based Spyware Detection (2006)
These still depend on the quality of the specifications!
![Page 42: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/42.jpg)
MINIMAL: Malware-Mining Algorithm
Known malware
NtOpenKey( 372, 0x20019, {24, 356, "ActiveComputerName", 0x40, 0, 0} )
Monitor system calls during execution
System-call trace: A C Q A O O O Q C F O Q O M C P F O M C
Step 1
![Page 43: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/43.jpg)
MINIMAL: Malware-Mining Algorithm
Discover dependences between syscalls
System-call trace: A C Q A O O O Q C F O Q O M C P F O M C
[Figure: data dependence graph built from the system-call trace]
Step 2
![Page 44: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/44.jpg)
MINIMAL: Malware-Mining Algorithm
Find malicious–benign graph difference
[Figure: malware dependence graph, benign dependence graph, and the malspec obtained as their difference]
Step 3
![Page 45: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/45.jpg)
Dependence Graph Normalization
• Many sequences of system calls are equivalent:
read( …, 1 ) read( …, 1 ) read( …, 1 ) read( …, 1 ) read( …, 1 ) ≡ read( …, 5 )
Aggregation replaces such sequences with a canonical sequence. (see paper for details)
Step 2
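The read example can be sketched in a few lines of Python. The sketch assumes each traced call is a (name, handle, byte count) triple and merges only consecutive reads on the same handle; `aggregate_reads` is a hypothetical helper, and the actual normalization rules are the ones given in the paper.

```python
def aggregate_reads(trace):
    """Replace runs of consecutive reads on one handle with a single read.

    Each trace entry is an assumed (name, handle, nbytes) triple.
    """
    canonical = []
    for name, handle, nbytes in trace:
        prev = canonical[-1] if canonical else None
        if name == "read" and prev and prev[0] == "read" and prev[1] == handle:
            # Extend the previous read instead of adding a new node.
            canonical[-1] = ("read", handle, prev[2] + nbytes)
        else:
            canonical.append((name, handle, nbytes))
    return canonical

five_small_reads = [("read", 3, 1)] * 5
print(aggregate_reads(five_small_reads))
```

Five one-byte reads on the same handle collapse into one five-byte read, so a program that reads byte by byte and one that reads in a single call yield the same canonical node.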
![Page 46: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/46.jpg)
MINIMAL Specs in Detection
• Missed malspecs:
– Due to unrecovered dependences (e.g., ZIP compression of data)
– Due to incompleteness of trace data (e.g., certain behaviors did not execute)
• Using mined malspecs in semantics-aware malware detection: Netsky.A malspec → Netsky.D, E, F, …
![Page 47: Mining Specifications of Malicious Behavior](https://reader035.fdocuments.net/reader035/viewer/2022062500/56815383550346895dc186bd/html5/thumbnails/47.jpg)
Conclusions
• The mining of malicious behavior can be automated
• Mined malspecs compare well with those from human experts
• Mining time significantly reduced over manual specification