CISC 879 - Machine Learning for Solving Systems Problems
Presented by: Suparna Manjunath
Dept of Computer & Information Sciences
University of Delaware
Behavioral Detection of Malware on Mobile Handsets
Abhijit Bose, Xin Hu, Kang G. Shin, Taejoon Park
Malware on Mobile Handsets
Like PCs, mobile handsets are becoming more intelligent and complex in functionality
Exposure to malicious programs and other risks increases with the new capabilities of handsets
Cabir, the first mobile worm, appeared in June 2004
WinCE.Duts, a Windows CE virus, was the first file injector on mobile handsets capable of infecting all the executables in the device’s root directory
Limitations of current anti-virus solutions for mobile devices
Rely primarily on signature-based detection
Useful mostly for post-infection cleanup
Example:
Scan the system directory for the presence of files with specific extensions (.APP, .RSC and .MLD on Symbian-based devices)
Less efficient on handsets due to differences between mobile and traditional desktop environments
Why are conventional anti-virus solutions less efficient for mobile devices?
Mobile devices generally have limited resources such as CPU, memory, and battery power
Most published studies on the detection of internet malware focus on its network signatures
Mobile OSes have important differences in the way file permissions and modifications to the OS are handled
Goal
Develop a detection framework that
Overcomes the limitations of signature-based detection
Addresses the unique features and constraints of mobile handsets
Approach
A behavioral detection approach is used to detect malware on mobile handsets
Behavioral Detection
Run-time behavior of an application is monitored and compared against malicious and/or normal behavior profiles
More resilient to polymorphic worms and code obfuscation
Database of behavior profiles is much smaller than that needed for storing signature-based profiles
Suitable for resource-limited handsets
Has potential for detecting new malware
System Overview
Malicious Behavior Signatures
Behavior Signature: Manifestation of a specification of resource accesses and events generated by applications
It is not sufficient to monitor a single event of a process in isolation in order to classify an activity as malicious
Temporal Pattern: The precedence order of the events and resource accesses is the key to detecting malicious intent
Temporal Patterns - Example
Consider a simple file transfer by calling the Bluetooth OBEX system call in Symbian OS
On its own, any such call appears harmless
Temporal Pattern:
(received file is of type .SIS) and (that file is executed later) and (installer process seeks to overwrite files in the system directory)
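The temporal pattern above can be sketched as an ordered check over a stream of monitored events. This is only an illustration: the event labels (obex_receive, execute, overwrite), the Event record, and the c:\system\ prefix standing in for the system directory are assumptions, not names taken from the presented framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: int    # logical timestamp assigned by the monitor
    kind: str    # hypothetical event label, e.g. "obex_receive"
    target: str  # file or path the event refers to


SYSTEM_DIR = "c:\\system\\"  # assumed location of the system directory


def matches_temporal_pattern(events):
    """Return True only if the three steps occur in this order:
    a .SIS file is received, that file is executed later, and a
    system-directory file is overwritten after the execution."""
    received = set()    # .SIS files received so far
    installing = False  # becomes True once a received file is executed
    for ev in sorted(events, key=lambda e: e.time):
        if ev.kind == "obex_receive" and ev.target.lower().endswith(".sis"):
            received.add(ev.target)
        elif ev.kind == "execute" and ev.target in received:
            installing = True
        elif (ev.kind == "overwrite" and installing
              and ev.target.lower().startswith(SYSTEM_DIR)):
            return True
    return False
```

The same three events in a different order do not match, which is the point of the slide: each call is harmless on its own, and only the precedence order signals malicious intent.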
Representation of Malicious Behavior
Simple Behavior: ordering the corresponding actions using a vector clock and applying the “and” operator to the actions
Complex Behavior: specified using temporal logic instead of classical propositional logic
The specification language of TLCK (Temporal Logic of Causal Knowledge) is used to represent malicious behaviors within the context of a handset environment
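One way to read the "simple behavior" case is sketched below: each action carries a vector clock, the actions must all be observed (the "and"), and each must causally precede the next. The function names and the tuple encoding of clocks are assumptions made for illustration; the TLCK machinery for complex behaviors is not modeled here.

```python
def happened_before(a, b):
    """Vector-clock precedence: a precedes b if every component of a
    is <= the matching component of b and the clocks differ."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def simple_behavior(actions):
    """actions: list of (observed, clock) pairs, listed in the order the
    signature requires. True if every action was observed and each one
    happened before its successor."""
    all_observed = all(observed for observed, _ in actions)
    in_order = all(
        happened_before(actions[i][1], actions[i + 1][1])
        for i in range(len(actions) - 1)
    )
    return all_observed and in_order
```

Concurrent actions, such as clocks (1, 0) and (0, 1), are not ordered by happened_before, so a signature that requires one to precede the other does not match.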
Behavior Signature
A finite set of propositional variables interposed using TLCK
Each variable (when true) confirms the execution of either
- A single or an aggregation of system calls
- An event such as read/write access to a given file descriptor, directory structure or memory location
$PS = \{p_1, p_2, \ldots, p_m\} \cup \{i \mid i \in \mathbb{N}\}$
Operators used to define Malicious Behavior
Logical Operators:
Temporal Operators:
Example: Commwarrior Worm – Behavior Signature
Atomic Propositional Variables
Higher Level Signatures
Harmless Signatures:
Harmful Signatures:
Generalized Behavior Signatures
Studied more than 25 distinct families of mobile viruses and worms targeting the Symbian OS
The most common signature elements were extracted and stored in a database
Malware actions were placed into 3 categories:
- User Data Integrity
- System Data Integrity
- Trojan-like Actions
Run-Time Construction of Behavior Signatures
Proxy DLL to capture API call arguments
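The proxy idea can be illustrated with a wrapper that records each call's name and arguments before forwarding to the real function. This is a Python analogy under stated assumptions, not the Symbian proxy DLL itself: proxied, call_log, and send_file are hypothetical names.

```python
import functools

call_log = []  # (api_name, args, kwargs) tuples, in call order


def proxied(api_name):
    """Decorator that logs a call's arguments, then forwards the call
    unchanged to the wrapped function."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            call_log.append((api_name, args, kwargs))
            return func(*args, **kwargs)
        return wrapper
    return decorate


@proxied("send_file")
def send_file(path, device):
    # Stand-in for a platform API; hypothetical.
    return f"sent {path} to {device}"
```

The caller sees the same return value as before, while call_log accumulates the raw material from which behavior signatures could be constructed at run time.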
Major Components of Monitoring System
Behavior Classification By Machine Learning Algorithm
Behavior signatures for the complete life cycle of malware are placed in the behavior database for run-time classification
To activate early response mechanisms, the malicious behavior database must also contain partial signatures that have a high probability of eventually manifesting as malicious behavior
The behavior detection system can detect even new malware or variants of existing malware whose behavior only partially matches the signatures in the database
An SVM, trained on data from both normal and malicious applications, is used to classify partial behavior signatures
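A minimal sketch of the classification step, assuming each partial signature is encoded as a binary vector of atomic behaviors. The four feature names and the six training rows are invented for illustration and are not the data from the presented work. The trainer is a plain full-batch subgradient method for a linear soft-margin SVM, written with the standard library only; it is not the SVM implementation used in the study.

```python
# Hypothetical features: [received_sis, installed, overwrote_system_file, sent_mms]
X = [
    [1, 1, 1, 0], [1, 1, 0, 1], [1, 1, 1, 1],  # illustrative malicious rows
    [1, 0, 0, 0], [0, 0, 0, 1], [1, 1, 0, 0],  # illustrative normal rows
]
y = [1, 1, 1, -1, -1, -1]  # +1 = malicious, -1 = normal


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=20000):
    """Minimize lam/2 * ||w||^2 + mean hinge loss by subgradient descent.
    Subgradient steps are not monotone, so the best iterate is returned."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    best = (float("inf"), list(w), b)
    for _ in range(epochs):
        margins = [yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
                   for xi, yi in zip(X, y)]
        objective = (0.5 * lam * sum(wj * wj for wj in w)
                     + sum(max(0.0, 1.0 - m) for m in margins) / n)
        if objective < best[0]:
            best = (objective, list(w), b)
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi, m in zip(X, y, margins):
            if m < 1.0:  # point is inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - eta * gj for wj, gj in zip(w, grad_w)]
        b -= eta * grad_b
    return best[1], best[2]


def predict(w, b, x):
    """Return +1 (malicious) or -1 (normal) for feature vector x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


w, b = train_linear_svm(X, y)
```

Because the classifier scores a vector of observed behaviors rather than an exact signature match, a partial signature can be flagged before the full malicious sequence completes.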
Possible Evasions
Program behavior can be obfuscated by:
Behavior reordering
File or directory renaming
Normal behavior insertion
Equivalent behavior replacement
Limitations
Detection might fail if most behaviors of a piece of mobile malware are completely new or identical to those of normal programs
The system can be circumvented by malware that can bypass the API monitoring or modify the framework configuration
Evaluation
The monitor agent (platform dependent) and the behavior detection agent (platform independent) are evaluated
Program behavior is emulated and then tested against real-world worms
5 malware applications (Cabir, Mabir, Lasco, Commwarrior, and a generic worm that spreads by sending messages via MMS and Bluetooth) and 3 legitimate applications (Bluetooth OBEX file transfer, MMS client, and the MakeSIS utility in Symbian OS) were built
Applications (Malware + Legitimate) → Set of Behavior Signatures → Obtain Partial/Full Signatures → Remove Redundant Signatures → Training Dataset and Testing Dataset
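The dataset preparation steps can be sketched as follows. Two assumptions are made for illustration: a partial signature is taken to be a prefix of a full signature's ordered actions, and "redundant" is taken to mean an exact duplicate. The function names and action labels are hypothetical.

```python
def partial_signatures(full_signature):
    """All non-empty prefixes of an ordered sequence of actions;
    the last prefix is the full signature itself."""
    return [tuple(full_signature[:k])
            for k in range(1, len(full_signature) + 1)]


def build_dataset(labelled_signatures):
    """labelled_signatures: list of (actions, label) pairs.
    Returns (prefix, label) pairs with exact duplicates removed,
    preserving first-seen order."""
    seen = set()
    dataset = []
    for actions, label in labelled_signatures:
        for prefix in partial_signatures(actions):
            key = (prefix, label)
            if key not in seen:
                seen.add(key)
                dataset.append(key)
    return dataset
```

The resulting list could then be split into training and testing portions.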
Classification Accuracy of Known Worms
Detection Accuracy (%) of Unknown Worms
Evaluation with Real-world Mobile Worms
Two Symbian worms, Cabir and Lasco, are considered
Behavior signatures are collected by compiling and running them on a Symbian emulator
- SVC achieved 100% detection of all worm instances
The framework’s resilience to variations and obfuscation is tested using variants of Cabir
- The variants are easily detectable as the behavioral detection abstracts away the name details
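A small sketch of what abstracting away name details could look like: each file event is reduced to its action, a coarse directory class, and the file extension, so a renamed payload yields the same abstract event. The abstract_event function, the c:\system prefix, and the example paths are assumptions for illustration, not the framework's actual representation.

```python
import ntpath  # Windows-style path handling, used here for drive-letter paths


def abstract_event(kind, path):
    """Drop the concrete file and folder names; keep the action,
    whether the path is under the assumed system directory, and
    the lowercased extension."""
    directory, filename = ntpath.split(path)
    _, ext = ntpath.splitext(filename)
    in_system = directory.lower().startswith("c:\\system")
    return (kind, "system_dir" if in_system else "other_dir", ext.lower())
```

Under this abstraction, a write to c:\system\apps\worm\worm.app and a write to c:\system\apps\renamed\other.APP map to the same event, which is why renaming files or directories alone does not change the signature.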
Conclusions
Due to fewer signatures, the malware database is compact and can be placed on a handset
Can potentially detect new malware and their variants
Behavioral detection results in high detection rates
Thank You