CISC 879 - Machine Learning for Solving Systems Problems

Presented by: Suparna Manjunath

Dept of Computer & Information Sciences

University of Delaware

Behavioral Detection of Malware on Mobile Handsets

Abhijit Bose, Xin Hu, Kang G. Shin, Taejoon Park

Malware on Mobile Handsets

Like PCs, mobile handsets are becoming more intelligent and complex in functionality

Exposure to malicious programs and risks increases with the new capabilities of handsets

Cabir, the first mobile worm, appeared in June 2004

WinCE.Duts, a Windows CE virus, was the first file injector on mobile handsets capable of infecting all the executables in the device’s root directory

Limitations of current anti-virus solutions for mobile devices

Rely primarily on signature-based detection

Useful mostly for post-infection cleanup

Example: scan the system directory for the presence of files with specific extensions (.APP, .RSC and .MDL in Symbian-based devices)

Due to differences between mobile and traditional desktop environments
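The extension scan given as the example on this slide can be sketched in a few lines. This is only an illustration of the signature-style check, not code from the paper: the function name, the file names, and the two-entry watch list in the demo are hypothetical, and case-insensitive matching is an assumption.

```python
import os


def flag_by_extension(filenames, extensions):
    """Return the file names whose extension appears in the watch list.

    Matching is case-insensitive (an assumption made for this sketch).
    """
    watch = {ext.lower() for ext in extensions}
    flagged = []
    for name in filenames:
        _, ext = os.path.splitext(name)
        if ext.lower() in watch:
            flagged.append(name)
    return flagged


# Hypothetical directory listing and a subset of extensions, for illustration only.
listing = ["game.APP", "game.rsc", "notes.txt", "photo.jpg"]
suspects = flag_by_extension(listing, [".app", ".rsc"])
```

A check like this only fires once the files are already on the device, which is why the slide describes it as useful mostly for post-infection cleanup.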

Why are conventional anti-virus solutions less efficient for mobile devices?

Mobile devices generally have limited resources such as CPU, memory, and battery power

Most published studies on the detection of internet malware focus on their network signatures

Mobile OSes have important differences in the way file permissions and modifications to the OS are handled

Goal

Develop a detection framework that

Overcomes the limitations of signature-based detection

Addresses the unique features and constraints of mobile handsets

Approach

A behavioral detection approach is used to detect malware on mobile handsets

Behavioral Detection

Run-time behavior of an application is monitored and compared against malicious and/or normal behavior profiles

More resilient to polymorphic worms and code obfuscation

Database of behavior profiles is much smaller than that needed for storing signature-based profiles

Suitable for resource-limited handsets

Has potential for detecting new malware

System Overview

Malicious Behavior Signatures

Behavior Signature: Manifestation of a specification of resource accesses and events generated by applications

It is not sufficient to monitor a single event of a process in isolation in order to classify an activity as malicious

Temporal Pattern: The precedence order of the events and resource accesses is the key to detecting malicious intent

Temporal Patterns - Example

Consider a simple file transfer made by calling the Bluetooth OBEX system call in Symbian OS

On its own, any such call appears harmless

Temporal Pattern:

(received file is of type .SIS) and (that file is executed later) and (installer process seeks to overwrite files in the system directory)
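A minimal sketch of how such a temporal pattern could be checked against an event log. The event names and the flat-list log format are hypothetical, and the check only tests precedence order (other events may be interleaved); it is not the TLCK machinery the paper uses.

```python
def matches_in_order(events, pattern):
    """Return True if every step of `pattern` occurs in `events` in that order.

    Unrelated events may appear between the steps. Consuming a single
    iterator guarantees each step is searched for only after the previous one.
    """
    remaining = iter(events)
    return all(any(event == step for event in remaining) for step in pattern)


# Hypothetical event names mirroring the three conditions on the slide.
SIS_INSTALL_PATTERN = [
    "receive_sis_file",
    "execute_received_file",
    "overwrite_system_dir",
]

suspicious_log = [
    "obex_connect",
    "receive_sis_file",
    "open_inbox",
    "execute_received_file",
    "overwrite_system_dir",
]
harmless_log = ["obex_connect", "receive_sis_file", "open_inbox"]
```

Here `matches_in_order(suspicious_log, SIS_INSTALL_PATTERN)` holds, while the harmless log, which contains the same OBEX transfer but none of the later steps, does not match.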

Representation of Malicious Behavior

Simple Behavior: specified by ordering the corresponding actions using a vector clock and applying the “and” operator to the actions

Complex Behavior: specified using temporal logic instead of classical propositional logic

The specification language of TLCK (Temporal Logic of Causal Knowledge) is used to represent malicious behaviors within the context of a handset environment
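One plausible reading of the "simple behavior" case, sketched under stated assumptions: each observed action carries a vector clock (a tuple with one counter per process), the standard happened-before comparison orders the actions, and the behavior holds when all required actions are present ("and") in causal order. The action names and clock values are hypothetical; the slides do not give the paper's actual data structures.

```python
def happened_before(a, b):
    """Standard vector-clock order: a precedes b if a <= b component-wise and a != b."""
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def simple_behavior(observed, required):
    """Check that all required actions were observed and are causally ordered.

    observed: dict mapping action name -> vector clock (tuple of ints)
    required: list of action names in the expected order
    """
    if not all(name in observed for name in required):
        return False  # the "and" over the actions fails
    clocks = [observed[name] for name in required]
    return all(happened_before(c1, c2) for c1, c2 in zip(clocks, clocks[1:]))


# Hypothetical observations from two processes.
observed = {
    "receive_file": (1, 0),
    "install_file": (2, 1),
    "overwrite_system": (2, 2),
}
order = ["receive_file", "install_file", "overwrite_system"]
```

Actions whose clocks are concurrent, such as `(1, 0)` and `(0, 1)`, are not ordered by `happened_before`, so in this sketch they never satisfy a simple behavior.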

Behavior Signature

A finite set of propositional variables interposed using TLCK

Each variable (when true) confirms the execution of either

- A single or an aggregation of system calls

- An event such as read/write access to a given file descriptor, directory structure or memory location

$PS = \{p_1, p_2, \ldots, p_m\} \cup \{\, i \mid i \in N \,\}$

Operators used to define Malicious Behavior

Logical Operators:

Temporal Operators:

Example: Commwarrior Worm – Behavior Signature

Atomic Propositional Variables

Higher Level Signatures

Harmless Signatures:

Harmful Signatures:

Generalized Behavior Signatures

Studied more than 25 distinct families of mobile viruses and worms targeting the Symbian OS

The most common signature elements were extracted and a database was created

Malware actions were placed into 3 categories:

- User Data Integrity

- System Data Integrity

- Trojan-like Actions

Run-Time Construction of Behavior Signatures

Proxy DLL to capture API call arguments
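The interception idea behind a proxy DLL can be mimicked in Python with a wrapper that records each call's name and arguments before forwarding it to the real function. This is an analogy only: the actual system hooks Symbian APIs through a DLL, and `send_file`, its arguments, and `call_log` are hypothetical stand-ins.

```python
import functools

call_log = []  # hypothetical in-memory record of intercepted calls


def proxied(func):
    """Wrap `func` so its name and arguments are logged before the call is forwarded."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append((func.__name__, args, kwargs))
        return func(*args, **kwargs)

    return wrapper


@proxied
def send_file(path, destination):
    """Hypothetical stand-in for a platform API call."""
    return f"sent {path} to {destination}"


result = send_file("/inbox/app.sis", destination="bt:nearby-device")
```

The caller still gets the original return value, while the log now holds the call name and arguments from which behavior signatures can be built at run time.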

Major Components of Monitoring System

Behavior Classification By Machine Learning Algorithm

Behavior signatures for the complete life cycle of malware are placed in the behavior database for run-time classification

To activate early response mechanisms, the malicious behavior database must also contain partial signatures that have a high probability of eventually manifesting as malicious behavior

The behavior detection system can detect even new malware or variants of existing malware whose behavior only partially matches the signatures in the database

An SVM is trained on data from both normal and malicious applications to classify partial behavior signatures
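To make the classification step concrete, here is a small linear SVM trained by batch subgradient descent on the regularized hinge loss, written in plain Python. It is a teaching sketch rather than the paper's setup: the two binary features, the four training vectors, and the hyperparameters are all assumptions, and a real system would use an established SVM library.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.05, iterations=2000):
    """Minimize lam/2 * ||w||^2 + mean(hinge loss) by batch subgradient descent.

    labels are +1 (malicious) or -1 (normal). Returns (weights, bias).
    """
    n, dim = len(samples), len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(iterations):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:  # inside the margin: hinge loss is active
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (malicious) or -1 (normal) for feature vector x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical features: (file received over Bluetooth, installer overwrites system files)
X = [(1, 1), (0, 1), (1, 0), (0, 0)]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

In this toy data only the second feature separates the classes, so the learned weights put almost all of their mass on it; a partial signature that already shows that feature is flagged even if the rest of the behavior has not yet been observed.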

Possible Evasions

Program behavior can be obfuscated by:

Behavior reordering

File or directory renaming

Normal behavior insertion

Equivalent behavior replacement

Limitations

Detection might fail if most behaviors of a piece of mobile malware are completely new or the same as those of normal programs

The system can be circumvented by malware that can bypass the API monitoring or modify the framework configuration

Evaluation

The monitor agent (platform-dependent) and the behavior detection agent (platform-independent) are evaluated

Program behavior is emulated and then tested against real-world worms

5 malware applications (Cabir, Mabir, Lasco, Commwarrior, and a generic worm that spreads by sending messages via MMS and Bluetooth) and 3 legitimate applications (Bluetooth OBEX file transfer, MMS client, and the MakeSIS utility in Symbian OS) were built

Evaluation pipeline (figure): Applications (malware + legitimate) → Set of behavior signatures → Obtain partial/full signatures → Remove redundant signatures → Training dataset and testing dataset
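The dataset-construction steps named in this pipeline (obtain partial and full signatures, remove redundant ones) can be sketched as follows. The slides do not define "partial signature"; treating it as a prefix of the full event sequence is an assumption, and the event names and labels are hypothetical.

```python
def partial_signatures(full_signature):
    """All non-empty prefixes of a full signature; the last prefix is the full one."""
    return [tuple(full_signature[:k]) for k in range(1, len(full_signature) + 1)]


def build_dataset(applications):
    """Expand each application into partial/full signatures and drop duplicates.

    applications: list of (label, full_signature) pairs
    Returns unique (signature, label) pairs in first-seen order.
    """
    seen = set()
    dataset = []
    for label, full in applications:
        for signature in partial_signatures(full):
            if (signature, label) not in seen:  # remove redundant signatures
                seen.add((signature, label))
                dataset.append((signature, label))
    return dataset


# Hypothetical applications described by their event sequences.
apps = [
    ("malware", ["recv", "install", "overwrite"]),
    ("malware", ["recv", "install", "send_mms"]),
    ("legit", ["recv", "open"]),
]
dataset = build_dataset(apps)
```

The two malware samples share the prefixes `("recv",)` and `("recv", "install")`, so those are stored once; the resulting list can then be divided into training and testing sets.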

Classification Accuracy of Known Worms

Detection Accuracy (%) of Unknown Worms

Evaluation with Real-world Mobile Worms

Two Symbian worms, Cabir and Lasco, are considered

Behavior signatures are collected by compiling and running them on a Symbian emulator

- SVC achieved 100% detection of all worm instances

The framework’s resilience to variations and obfuscation is tested using variants of Cabir

- The variants are easily detectable because behavioral detection abstracts away the naming details

Conclusions

Due to fewer signatures, the malware database is compact and can be placed on a handset

Can potentially detect new malware and their variants

Behavioral detection results in high detection rates

Thank You