Improving accuracy of malware detection by filtering evaluation dataset based on its similarity


Description

In recent years, it has become more difficult to detect malware with traditional methods such as pattern matching because malware has grown more sophisticated. Machine-learning-based detection has therefore been introduced, and various studies report that it achieves a higher detection rate than traditional methods. However, it is well known that detection accuracy degrades significantly on data that differ from the training dataset. This study provides a method to improve detection accuracy by filtering the evaluation dataset based on its similarity to the training dataset.

Transcript of Improving accuracy of malware detection by filtering evaluation dataset based on its similarity

FFRI, Inc. (Fourteenforty Research Institute, Inc.)
http://www.ffri.jp

Improving accuracy of malware detection by filtering evaluation dataset based on its similarity

Junichi Murakami, Director of Advanced Development, FFRI, Inc.

Preface

• These slides were used for a presentation at CSS2013
  – http://www.iwsec.org/css/2013/english/index.html
• Please refer to the original paper for the detailed data
  – http://www.ffri.jp/assets/files/research/research_papers/MWS2013_paper.pdf (written in Japanese, but the figures are the same)
• Contact information
  – research-feedback@ffri.jp
  – @FFRI_Research (Twitter)

Agenda

• Background
• Problem
• Scope and purpose
• Experiment 1
• Experiment 2
• Experiment 3
• Consideration
• Conclusion

Background – malware and its detection

[Diagram: targeted attacks (unknown malware), malware generators, and obfuscators drive the increase in malware; signature matching reaches its limits, leading to other methods: heuristics, cloud reputation, and machine learning over big data]

Background – Related works

• Mainly focusing on combinations of the factors below
  – Feature selection and modification, parameter settings
• Some good results are reported (TPR: 90%+, FPR: below 1%)

  Features:   static information / dynamic information / hybrid
  Algorithms: SVM / Naive Bayes / Perceptron, etc.
  Evaluation: TPR/FPR, ROC curve, accuracy, precision, etc.

Problem

• General theory of machine learning:
  – Classification accuracy declines if the trends of the training and testing data differ
• Does the same hold for malware and benign files?

Scope and purpose

① Investigating the difference between the similarity distributions of malware and benign files (Experiment-1)
② Investigating how that difference affects classification accuracy (Experiment-2)
③ Based on the results above, confirming the effect of removing data whose similarity to the training data is low (Experiment-3)

Experiment-1 (1/3)

• Used the FFRI Dataset 2013 and benign files we collected as datasets
• Calculated the pairwise similarity of the malware and of the benign files (Jubatus, MinHash)
• Feature vector: counts of 4-grams of sequential API calls
  – e.g. NtCreateFile_NtWriteFile_NtWriteFile_NtClose: n times
  – e.g. NtSetInformationFile_NtClose_NtClose_NtOpenMutext: m times

Pairwise similarity matrix (computed within the malware set and within the benign set):

        A     B     C    ...
  A     –    0.8   0.52  ...
  B     –     –    1.0   ...
  C     –     –     –    ...
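To make the similarity computation concrete, here is a minimal sketch of the pipeline: extract 4-grams from a sequence of API calls, then estimate Jaccard similarity with MinHash. It treats the 4-grams as a set and builds hash functions from seeded MD5; both are illustrative assumptions, not the Jubatus implementation used in the experiments.

import hashlib

def api_4grams(trace):
    # 4-grams of sequential API calls, e.g.
    # "NtCreateFile_NtWriteFile_NtWriteFile_NtClose"
    return {"_".join(trace[i:i + 4]) for i in range(len(trace) - 3)}

def minhash_signature(grams, num_hashes=128):
    # For each seeded hash function, keep the minimum hash value over
    # all 4-grams; matching slots estimate Jaccard similarity.
    def h(seed, gram):
        return int.from_bytes(
            hashlib.md5(f"{seed}:{gram}".encode()).digest()[:8], "big")
    return [min(h(seed, g) for g in grams) for seed in range(num_hashes)]

def similarity(sig_a, sig_b):
    # Fraction of matching signature slots.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

# Hypothetical traces for illustration only.
trace_a = ["NtCreateFile", "NtWriteFile", "NtWriteFile", "NtClose", "NtClose"]
trace_b = ["NtCreateFile", "NtWriteFile", "NtWriteFile", "NtClose",
           "NtSetInformationFile"]
print(similarity(minhash_signature(api_4grams(trace_a)),
                 minhash_signature(api_4grams(trace_b))))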

Experiment-1 (2/3)

Grouping malware and benign files based on their similarities

[Figure: malware and benign files are grouped by sweeping a similarity threshold from 0.0 to 1.0]
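The grouping can be sketched as follows: at a given threshold, call a sample "unique" if no other sample in the same set reaches that similarity, and "not unique" otherwise. The pairwise_sim nested mapping is an assumed data structure for illustration.

def split_by_uniqueness(names, pairwise_sim, threshold):
    # A sample is "unique" if its best similarity to any other sample
    # in the same set stays below the threshold.
    unique, not_unique = [], []
    for a in names:
        best = max((pairwise_sim[a][b] for b in names if b != a),
                   default=0.0)
        (unique if best < threshold else not_unique).append(a)
    return unique, not_unique

Sweeping the threshold over 0.8, 0.85, 0.9, 0.95, and 1.0 yields the proportions shown on the next slide.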

Experiment-1 (3/3)

[Chart: for similarity thresholds 0.8, 0.85, 0.9, 0.95, and 1.0, the proportion (0%–100%) of samples with a similar peer ("not unique") vs. without one ("unique"), shown separately for benign files and malware]

It is more difficult to find similar benign files than similar malware

Experiment-2 (1/3)

• How much does the difference affect the results?
• 50% of the malware/benign files are assigned to a training dataset, the rest to a testing dataset (Jubatus, AROW)

[Figure: Jubatus is trained on the training dataset and then classifies the testing dataset, yielding TPR and FPR]

TPR: True Positive Rate, FPR: False Positive Rate
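A minimal sketch of the measurement, assuming a trained binary classifier with a predict() method that returns True for malware (a stand-in interface; the experiments used Jubatus with the AROW algorithm):

def evaluate(classifier, testing):
    # testing: iterable of (features, is_malware) pairs.
    # Returns (TPR, FPR); assumes both classes are present.
    tp = fn = fp = tn = 0
    for features, is_malware in testing:
        predicted = classifier.predict(features)
        if is_malware:
            tp, fn = tp + predicted, fn + (not predicted)
        else:
            fp, tn = fp + predicted, tn + (not predicted)
    return tp / (tp + fn), fp / (fp + tn)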


Experiment-2 (3/3)

Accuracy declines if the trends of the training and testing data differ

             not unique    unique     change
  TPR (%)      97.996      81.297    −16.699
  FPR (%)       0.624       4.49      +3.866

Experiment-3 (1/6) – After training

[Figure: feature space after training: benign(train) and malware(train) clusters separated by the learned dividing line]

Experiment-3 (2/6) – After classification

[Figure: the testing data, benign(test) and malware(test), plotted against the dividing line; test points on the wrong side become FP (benign classified as malware) or FN (malware classified as benign)]

Experiment-3 (3/6) – Low similarity data

[Figure: test data with low similarity to the training data fall far from the trained clusters; most become FN, and a few land on the correct side only by chance (accidental TP)]
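Experiment-3 removes such low-similarity data from the testing dataset before classification. A minimal sketch of that filtering step, reusing the similarity() helper from the Experiment-1 sketch (the signature representation and the names here are assumptions):

def filter_testing_data(testing, training_signatures, threshold):
    # Keep only test samples whose best similarity to any training
    # sample reaches the threshold; the rest stay unclassified.
    kept, filtered_out = [], []
    for sample, signature in testing:
        best = max(similarity(signature, t) for t in training_signatures)
        (kept if best >= threshold else filtered_out).append(sample)
    return kept, filtered_out

Raising the threshold trades coverage for accuracy: fewer samples get classified, but TPR and FPR on the remaining samples improve, which is what the next three slides measure.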

Experiment-3 (4/6) – Effect on TPR

[Chart: the number of classified data (TP and FN counts, left axis 0–1400) and TPR (right axis 0.88–1.00) vs. the similarity threshold (0, then 0.6–1.0)]

Experiment-3 (5/6) – Effect on FPR

[Chart: the number of classified data (TN and FP counts, left axis 0–2500) and FPR (right axis 0.000–0.014) vs. the similarity threshold (0, then 0.6–1.0)]

Experiment-3 (6/6)

Transition of the number of classified data

[Chart: classified data as a share of all testing data (0%–120%) vs. the similarity threshold (0, then 0.6–1.0), for malware and for benign software]

Consideration (1/3)

• In a real scenario:
  – we try to classify whether an unknown file/process is benign or not
• If we apply Experiment-3:
  – files are classified only if similar data has already been trained on
  – otherwise files are not classified, which results in
    • FN if the file is malware
    • TN if the file is benign (all right as a result)
• Therefore the problem is the "TPR for unique malware" (unique malware is likely to go undetected)

Consideration (2/3)

• If malware continues to have many variants, as it does now
  – ML-based detection works well
• Having many variants ∝ use of malware generators/obfuscators
• We have to investigate
  – trends in the usage of the tools above
  – the possibility of countermeasures against machine-learning-based detection

Consideration (3/3)

• How to deal with unclassified (filtered-out) data:
  1. Use other feature vectors
  2. Enlarge the training dataset (unique → not unique)
  3. Use other methods besides ML

Conclusion

• The similarity distributions of malware and benign files differ (Experiment-1)
• Accuracy declines if the trends of the training and testing data differ (Experiment-2)
• The TPR for unique malware declines when we remove low-similarity data (Experiment-3)
• Continual investigation of trends in malware and its related tools is required
• (It might be necessary to develop technology to determine benign files)