BinGraph: Discovering Mutant Malware using Hierarchical … · 2014. 9. 2. · BinGraph:...

BinGraph: Discovering Mutant Malware usingHierarchical Semantic Signatures

Jonghoon Kwon, Heejo Lee

Div. of Computer & Communication EngineeringKorea University

Seoul, Republic of Korea{signalnine, heejo}@korea.ac.kr

Abstract— Malware landscape has been dramatically elevatedover the last decade. The main reason of the increase is thatnew malware variants can be produced easily using simple codeobfuscation techniques. Once the obfuscation is applied, themalware can change their syntactics while preserving semantics,and bypass anti-virus (AV) scanners. Malware authors, thus,commonly use the code obfuscation techniques to generate meta-morphic malware. Nevertheless, signature based AV techniquesare limited to detect the metamorphic malware since they arecommonly based on the syntactic signature matching. In thispaper, we propose BinGraph, a new mechanism that accuratelydiscovers metamorphic malware. BinGraph leverages the seman-tics of malware, since the mutant malware is able to manipulatetheir syntax only. To this end, we first extract API calls frommalware and convert to a hierarchical behavior graph thatrepresents with identical 128 nodes based on the semantics. Later,we extract unique subgraphs from the hierarchical behaviorgraphs as semantic signatures representing common behaviorsof a specific malware family. To evaluate BinGraph, we analyzeda total of 827 malware samples that consist of 10 malwarefamilies with 1,202 benign binaries. Among the malware, 20%samples randomly chosen from each malware family were usedfor extracting semantic signatures, and rest of them were usedfor assessing detection accuracy. Finally, only 32 subgraphs wereselected as the semantic signatures. BinGraph discovered malwarevariants with 98% of detection accuracy.

I. INTRODUCTION

Malware is a software that implemented to damage com-puter systems or networks without any user’s awareness [1].Such malware like trojan, virus, worm and bot leads mostkinds of cyber crimes, such as DDoS attacks, spam, click fraudand information theft. Even through there are numerical effortsto detect malware, malware threat is rampant.

The major difficulty of malware detection is the rapidincrease of malware variants. Symantec and McAfee securityreports state that the number of new malware signatures hasshown an extreme growth by more than doubling on a year-to-year between 2006 and 2011. Moreover, new signaturecreation in 2011 is figured about 403 Million [2], [3]. The mainreason for the dramatic growth is that the malware variants canbe easily produced by adopting new techniques such as codeobfuscation and modularization [4]–[7].

The code obfuscation technique supports that malware canchange their instruction sequences and also even their sig-natures with preserving their functionalities. Thus, the codeobfuscation provides the chance to evade existing signature-based AV scanners with inexpensive cost [8]. This is the whynew malware having explosively increase, and AV vendersneed to pay the amount of cost for generating new signature.Taha and Jacob et al. state that, over 50% of new malware areobfuscated version of existing known malware [9], [10].

Modularization of malware is another important issue ofthe malware landscape [11]. Riech et al. and Thomas et al.state that Malware authors often build their codes with severalmodules [7], [12]. It allows them to easily create new malware.For example, malware authors, who are lack of abilities toimplement their own malware, can build new malware bycombining existing malware modules. In addition, the mod-ularization also offers possible ways to evade AV scanners aswell. If one of the modules is labeled as a malicious by AVvendors, all the malware authors have to do is only simplemodification or substitution of the particular module.

Currently, most of AV venders have used signature scanningfor malware detection. It is known to be efficient because itguarantees less time, small overhead, and low false positiverate. In spite of the advantages, the signature-based malwaredetection have a fatal failure. It is not able to detect brand newmalware, and even their mutants that their signatures have notbeen updated yet. Therefore, malware analysis for signatureupdating is the most important task to AV venders.

When a new suspicious binary is shown up, malware expertsconduct behavior analysis through several ways to determinewhether or not the binary has malicious attempts. More pre-cisely, they execute the binary on the controlled environmentand monitor what the binary do, such as accessing specificregistries, creating files into windows system directories, tryingto kill the antivirus processes and so on. If any maliciousattempts are discovered, a signature of the malware is extractedand updated. However, the signature update is not an easy tasksince malware analysis is commonly time consuming task, andthere are too many malware to be analyzed. In the meantime,unspecified individuals are just exposed to the malware. A newtechnique to reduce the time for signature update is required.

To this end, we proposes BinGraph, a new behavior-basedmechanism for analyzing and classifying metamorphic mal-ware by automatic way. BinGraph is to mainly reduce theanalysis time for the signature update. We utilize a system-call sequence as a binary characteristic. This is based on theassumption that the same API call sequence S is appeared inan original malware M and its new variant malware M ′ aswell. That is, even if the M ′ is generated by applying codeobfuscation to M and the syntax of M ′ is quite different fromM , the semantics of M ′ cannot be deviated from the semanticsof M .

We extract only several instructions related to the systemcall sequence in a binary, and represent as a form of a directedgraph. After that, the graph is separated into several subgraphsdepending on their functionalities, and abstracted based onbehavioral semantics. We gather the subgraphs from not onlyseveral malware variants but also benign executables, andapply the graph mining to determine which subgraphs areonly present in malware. Finally, the subgraphs are used formalware detection as the semantic signatures.

We implemented BinGraph and evaluated with a total of827 real-world malware variants that can be classified as10 malware families, and 1,202 benign executables includingWindows system programs. The malware samples were givento us by a AV vendor who collect the binaries from submissionto a public web site on May 2012. Among the malwaresamples, we randomly chose 20% of malware sample fromeach malware family. 166 binaries were chosen to extractsemantic signatures, and rest of them were used for assessingwhether the extracted semantic signatures can detect mutantmalware binaries accurately.

In the experiments, we collected 8,673 subgraphs from theselected malware samples and 161,328 subgraphs from 1,002benign executables. By the graph mining step, 1,897 uniquesubgraphs were extracted as the malicious behavior, and only32 of them were addressed as the semantic signatures. Thesematic signatures are, finally, matched with 661 malwaresamples as well as 200 benign binaries. BinGraph successfullydiscovered malware variants with 98% of detection accuracyusing only the 32 semantic signatures, and there is no falsepositive.

The main contributions of this paper are threefold:

• We present a novel approach that discover malwareeven though the malware adopts code obfusctation. Ourapproach leverages a common semantic characteristic ofa malware family, thus it is not influenced whether or notthe malware manipulate their instructions.

• We can detect modularized malware. Our approach de-fines a binary with a hierarchical structure and separatesinto tiny pieces of code based on functionality. Therefore,we can determine whether or not a binary has a maliciousattempt even though the binary contains small piece ofmalicious code.

• We reduce the number of malware signatures. Our exper-

imental results exhibit that each malware families havedistinct semantic signature(s), so we can detect muchmalware variants using few signatures. This reductioncan solve the problem of extremely increasing malware.Furthermore, analysis times and storages to be needed aredramatically reduced as well.

The remainder of this paper is organized as follows.In Section 2, the background of this research and relatedstudies are discussed. And, we describe the architecture ofour mechanism, BinGraph in Section 3. Section 4 evaluateBinGraph, and finally, we conclude this paper and provide anoutline of future work in Section 5.

II. BACKGROUND

Malware and anti-malware warfare is endless. Malwareauthors adopt new techniques, such as code obfuscation tech-niques, which can manipulate their codes to avoid detec-tion by defenders, since traditional anti-malware solutionsare commonly based on the signature scanning. Accordingly,defenders have studied on the issue of obfuscated malware toimprove the detection performance.

Malware detection has been studied with mainly two ap-proaches, static analysis and dynamic analysis, and the ap-proaches have advantages and disadvantages for each. Dy-namic analysis [13]–[17] executes malware samples in acontrolled, isolated environment such as Virtual Machine(VM), monitors malwares behaviors, and reports automati-cally. ANUBIS [18], CWSandbox [19] and TTAnalyze [20]are the popular dynamic analysis tools. However, dynamicanalysis involves system infection, and malware authors canapply the anti-VM techniques to prevent code analysis. Mostimportantly, many of the current malware are operated byattacker’s commands. Thus, even the malware is loaded onthe analysis system, there exist a possibility that the malwaredoes not work. This is un-ignorable limitation.

Binary pattern matching [21], data flow [22], and code flowanalysis [23], [24] are the examples of static analysis. Theyanalyze the malware without malwares execution, thus theycommonly guarantee fast and safe analysis. In addition, staticanalysis can cover the entire malware code. Of course, staticanalysis also has limitations. Static analysis has difficulty toanalyze code obfuscated malware. Using the code obfusca-tion technique, the malware can change their codes withoutchanging functionality.

To overcome the limitations [24], code normalization [25],[26] and semantic approaches have been proposed [27], [28].Even though the approaches are helpful, code normalizationsare depending on specific obfuscating techniques, and thesemantic approaches have shown a weakness to new type ofcode obfuscation.

Recent studies have shown that API calls can be useful formalware detection since the API calls reflect the functionalitiesof a program. Bai et al. presented a malware detection researchbased on critical API-calling graph(CAG) [29]. The approach

deals with matching of the CAG rather than considering allAPI calls. Eskandari and Hashemi proposed a metamorphicmalware detection using API calls on the Control FlowGraph(CFG) [30]. They converted the resulted sparse graphto a vector and adopted few machine learning techniques.However, the works are only focused on the syntactic structureof API call sequences, not the semantics of the API calls.Therefore, their approaches cannot perform well for detectingmalware variants created by replacing the APIs with otherAPIs or reordering the APIs while preserving the same func-tionalities of the program.

CodeGraph [31], our previous work, has been proposed todetect metamorphic malware using semantic signatures. Asone of the static analysis, CodeGraph converts the API callsequence of the malware into a graph to extract the semantic,and converts the graph to a code graph used for the semanticsignature. Finally, the semantic signature is compared withother semantic signatures to calculate similarity. Unfortunately,CodeGraph is susceptible to new types of malware, especiallymodularized ones, since it only concerns whole behaviors ofmalware as one semantic signature.

To unveil the modularized malware, the semantic signaturehas to be represented as the functionality level rather than thecode level, and also contain unique behavioral features. Toaddress this, our work is focused on;

1) How to build subgraphs which can be distinguishedbased on their functionalities,

2) How to address their semantics in the graph structure,and

3) How to extract semantic signatures which representspecific malware or malware families.

This work is basically an extension of the previous work,where the concept of building the semantic signature wasproposed for detecting metamorphic malware.

III. BINGRAPH SYSTEM

BinGraph is a system to detect metamorphic malwareusing semantic signatures. Fig. 1 shows the working flowsof BinGraph, and it has two phases: semantic signatureextraction and malware detection. At the first phase, weconstruct the semantic signatures using both malwaresamples and benign executables. Then, malware detectionagainst unknown binaries is performed by matching withthe semantic signatures at next phase. Our system consistsof five different analysis steps, behavior graph construction,subgraph extraction, graph abstraction, graph mining, andsignature matching. In this section, we describe the details ofBinGraph from the perspective of the different analysis steps.

A. Behavior Graph Construction

Current malware detection is commonly based on the syn-tactic signature matching, and malware authors adopt newtechniques such as code obfuscation to evade the malware

Behavior Graph Construction

Subgraph Extraction

Graph Abstraction

Graph Mining

Malware Executable

Benign Executable

Semantic Signature

Behavior Graph Construction

Subgraph Extraction

Graph Abstraction

Signature Matching

Unknown Executable(Testing Sample)

Detect Result

Sematic Signature Extraction Malware Detection

Fig. 1. BinGraph architecture with two phase, semantic signature extractionand malware detection.

detection system. It is surely effective since the techniquechanges syntactics of malware without semantics. This is themotivation that BinGraph is focused on semantic characteris-tics of malware rather than syntatic characteristics.

The system call features are critical importance for under-standing and identifying the semantic of malware. In the Win-dows system, applications access the system resources, such asprocess, memory, registry, and network, through Win32 API.Malware is also able to access the system resources by usingAPI functions to implement their tasks, and semantics of theAPI functions are clear. Hence, BinGraph utilizes the systemcall information, especially call sequence, as the semanticcharacteristics of binaries.

To represent the semantic of a binary, BinGraph constructa behavior graph using the API call sequence. Therefore,the extracting accurate call sequence is required. The codeanalyzer [31], [32], fortunately, provides the extraction from aexecutable. The code analyzer first extracts instructions relatedto the system call sequence in the binary, and represents theresult into the set of nodes and edges. The result of codeanalyzer is a call graph, G = (V,E), where V is a set ofsystem calls selectively chosen among the system calls. AndE is a set of calling relations of the system calls in V , e.g.,E = {(vi, vj)}|vi, vj ∈ V }, where vi denotes the caller, andvj denotes the callee.

We build the set of node V , which contains the systemcalls, through the Import Address Table (IAT) in a executable.Next, we generate the set of edges E for the call graph Gbased on the system call sequence, where vi is the caller andvj is the callee.

B. Subgraph Extraction

Many malware are implemented as several modules, andthe modules are shared to create other malware or attached

Program

Function

Basic block

API call Function Fn

…

Program Pi

Function F3

Function F2

Function F1

Basic block B1

Basic block B2

Basic Block B3

Function Fi

Layer 1

Layer 2

Layer 3

Layer 4

Fig. 2. A hierarchical structure of binaries with four layers.

into benign executables. To detect such kinds of malware, weneed to understand the malware in terms of functionalities.Moreover, we need to understand the hierarchical architectureof malware and duplicated behaviors across the malwarevariants.

To do this, we classify the binary into four layers, which isin order of API calls, basic blocks, functions, and a program.Fig. 2 represents the hierarchy. A program P consists offunctions F , a function F consists of basic blocks B, and abasic block B consists of API calls v. The API calls representthe basic nodes composing a behavior graph, however a singleAPI call is not sufficient to express a specific behavior.

The basic block [33] is a portion of the code with certaindesirable properties. Usually, the basic block has an entry pointand an exit point, meaning there is no jump instruction andonly the last instruction can cause other basic block to execute.We consider the basic block as a basic subgraph GB , since itcan be the smallest piece of code embracing a behavior witha purpose.

In few cases, the basic blocks cannot represent specificbehaviors. For example, the size of all of them in a binaryare too small or only small number of API call is arisen inthe basic blocks. To handle this, if all the basic blocks are notsufficient to exhibit the semantic characteristics of a binary andwe are not able to generate unique subgraphs for the binary,we move to the upper layer, i.e., function layer, to producesubgraph GF that expresses the function. More precisely, if afunction Fi consists of basic blocks B1, B2 and B3, and allthe basic blocks cannot present the semantic characteristics ofthe binary, GFi is considered as one of the subgraphs.GP is a super graph that represents all behaviors in program

P , and it is widely used in previous researches which appliedAPI call based binary analysis [34]. However, in BinGraph,the GP is considered as a semantic behavior graph if andonly if there is no sufficient subgraphs for representing uniquesemantics. This hierarchical subgraph approach guarantees atleast same detection performance with the previous works inworst case, such as there is no unique malicious subgraph inmalware.

C. Graph Abstraction

So far, we have constructed call graphs based on the APIcall to detect malware variants through comparing the graphs.Unfortunately, the call graphs have thousands of nodes andeach nodes represents each calls, and the graph isomorphismis as commonly known as a NP-complete problem. We needto simplify the call graphs to comparable forms with preserv-ing semantic characteristics. We call this step as the graphabstraction.

To this end, we substitute the system calls into 128 nodesbased on their semantics. More precisely, we classify eachcalls by 32 objects of a system call, where the objects areprocess, register, memory, socket, and so on. And then, thecalls are classified again by four behaviors of the relatedobject, where the behaviors are open, close, read and write.For instance, CreateProcess() is a member of the process-open group. Note that the 32 objects are referenced from theMicrosoft Developer Network (MSDN) in order to implementBinGraph on Windows system.

After the classification, the nodes within the same groupsare merged and edges are relocated based on the call relation-ships of the nodes.

After subgraph extraction from a call graph, every subgraphsare transformed abstracted graphs. We use an adjacent matrixas a data structure to store the graph information. An adjacentmatrix is the most proper data structure for graph, since thegraph is a directed graph and the number of nodes is only128. Finally, we can successfully abstract and transform thecall graphs to 128×128 matrices.

Fig. 3 is a simple example of the graph abstraction. Asubgraph (a) presents malicious behavior that collects systeminformation and sends to third party through the network. Notethat malware usually collects the information from infectedmachines, and this behavior graph is captured from one ofthe malware samples. The malware invokes the system callsto collect Windows version and current memory status, andsends them to attackers through a network connection. Thesubgraph is abstracted as forms of a semantic graph (b) with

GlobalMemoryStatus

GetVersionExA

Send Recv

Connect

Socket

CreateThread

CloseSocket

Memory.Read Thread.Open

Socket.Open

Socket.ReadSocket.Write

Socket.Close

SystemInfo.Read

(a) An example of API call graph that

presents system information leakage.

(b) A semantic graph that is abstracted and

transformed from the information leakage

graph.

(c) Amatrix data structure that contains

the semantic graph in fixed size.

Fig. 3. Semantic abstraction with a real example of malicious behavior.

128 nodes, finally, transformed into the matrix data structure(c).

This graph abstraction offers next three advantages:1) The abstracted graphs can express semantics of the

executables,2) The abstracted graphs can be represented as a small data

structure irrespective of their sizes or hierarchy, and3) Computation time is extremely decreased, since the size

of a graph has only 128×128, thus it can be completedwithin finite time at any condition.

D. Graph Mining

In this step, we conduct the frequent subgraph miningfor extracting semantic signatures. A frequent subgraphmeans that it appears simultaneously in a fraction k of allbinaries. If k is greater than two and it appeared in onlymalware samples, it is regarded as a shared module acrossthe malware variants. Consequently, it can be useful as asemantic signature to detect malware variants.

Mal

war

e E

xec

uta

ble

Ben

ign

E

xec

uta

ble

Mal

icio

us

Su

bg

raph

Min

ing

Malware M1

Malware M2

Semantic

Signature

(Frequency 2)

Benign binaries Bn F

req

uen

t S

ub

gra

ph

Min

ing

Graph Mining Step

Fig. 4. A overview of graph mining process.

To mine graphs only observed in the malware, we performthe comparison between two graph sets came from malwareand benign binaries respectively. To this end, the three analysis

steps, behavior graph construction, subgraph extraction andgraph abstraction, are equally performed to malware andbenign binaries. Next, we compare the each graph to determinewhether or not it appears in benign binaries. If a graph appearsin a benign graph set, it is removed immediately. But, if agraph only appears in a malware graph set, it is added to theset of a candidate of semantic signatures.

Even though a graph is not appeared in the benign graphs,we do not consider the graph as a semantic signature directly.In some malware cases, there are tens of or hundreds of suchgraphs. If the number of samples to be analyzed is small,that is no big deal. However, analyzing millions of samplesis a big deal since the time to process increases exponentiallyin accordance with the number of signatures, and millions ofbinaries are submitted to AV vendors lately. Therefore, weneed to minimize the number of signatures with maximizingthe coverage over malware.

The greedy strategy [35] is selected for efficient signaturemining. At first, we select a graph that has the highestfrequency, and exclude malware samples that can be detectedby the graph from training data pool. After that, we selectthe most frequent graph among the data pool again. Thesignature mining with the greedy strategy is performed untilthe data pool becomes empty. The number of signature israrely similar with the number of malware sample, and mostcases, very small number of signatures is selected.

E. Semantic Signature Matching

The previous analysis adopts graph mining with sets ofknown malicious and benign graphs to extract semantic sig-natures. The semantic signature represents a behavior thatonly appears in particular malware or variants. This step isfor detecting unknown malware by matching the semanticsignature.

In order to discover unknown malware, we first extractsemantic graphs to be compared with the semantic signatures.As we can see in Fig. 1, BinGraph applies the behavior

TABLE I

STATISTICS OF SEMANTIC SIGNATURES FOR EACH MALWARE FAMILY

Malware Name # of Sample# of Total # of Unique # of Malicious # of SemanticSubgraph Subgraph Subgraph Signature

Conficker 8 845 422 13 3Killav 20 69 24 5 1

Koobface 8 173 126 41 3Nebuler 20 3246 605 413 1

OnlineGameHack18 20 271 43 21 1Palavo3 20 2011 1632 1345 7Qhost 20 400 20 8 1

Rustock 10 270 193 0 4a

TDSS3 20 69 24 10 10Userinit 20 1319 144 41 1

aThe semantic signatures of Rustock were composed in function layer.

graph construction, the subgraph extraction, and the graphabstraction to unknown binary. After the pre-analysis steps,matching with the semantic signature is performed.

As we mentioned in section 3.C, the graph isomorphismproblem is NP-complete, thus we utilize the matrix datastructure for semantic graph matching. Every semantic graphsand signatures are represented by 128×128 matrices. Morein details, each matrix is stored as a 2KB bit-stream, andsimple XOR operation is executed to determine whether ornot the matrices of semantic signatures are matched with thematrices of semantic graphs. Whenever a match is found,the result of XOR operation is equal to zero, we define thecorresponding graph as a malicious behavior, and we classifythe corresponding binary as a malware.

IV. EVALUATION

We performed several experiments to evaluate accuracy andefficiency of BinGraph. This section describes the evaluationresults and interesting features in detail.

A. Environments

BinGraph has been implemented and evaluated with atotal of 827 real-world malware variants. The samples wereprovided from an AV company. The company obtained thesamples through the public submission Web site and classifiedthe samples upon their own analysis processes. The malwaresamples were provided with names of 10 malware families,and they all had distinct binary patterns.

To ensure that whether the classification of the malwarefamilies is trustworthy, we leveraged the samples with twodifferent AV engines (e.g., Kaspersky and BitDefender). Un-fortunately, the AV engines showed a difficulty to detect all themalware samples. Besides, the labels of the malware conflictdepending on the AV engines. One of the binaries labeled

as Generic.Malware.SP!g.E654213E, thus we could not de-termine its malware family. The signature database might notbe up to date at that time. Consequently, our evaluation wasperformed based on the initial classification. The familieswere Conficker (40), Killav (100), Koobface (37), Nebuler(100), OnlineGameHack18 (100), Palavo3 (100), Qhost (100),Rustock (50), TDSS3 (100) and Userinit (100).

Total 1,202 benign binaries, Windows system programbinaries on pure environment and freeware binaries, wereutilized for our evaluation. We verified the benign sampleswith the three AV engines (Kaspersky, BitDefender and thecompany’s AV engine) and confirmed that all the sampleswere not classified as malicious. The size of the samplesranges from few KB to tens of MB, and a total of 161,328unique subgraphs were produced. The subgraphs cover thevarious benign behaviors, and we utilized them to eliminatebenign behaviors from semantic signatures.

B. Semantic Signature Generation

Lately, detection rates of less than 20% are commonly seenon newly discovered malware [36]. It means AV vendors haveonly 20% of samples for new malware. Hence, we need todiscover all the malware variants using information obtainedfrom the 20% of samples. Among the provided 827 malwaresamples, 20% of each family (166 samples) were randomlyselected for generating the semantic signatures.

Table I explains the analysis results for each malwarefamily. A total of 8,673 subgraphs were extracted and only3,233 subgraphs show uniqueness. Finally, we analyzed 1,897subgraphs as malicious by comparing with benign graphs.However, all of the 1,897 malicious graphs could not be thesemantic signature, since comparing thousands of signaturesis not efficient as we mentioned in Section 3.D. Therefore, weneeded to figure out most efficient semantic signatures.

The efficiency in malware detection means how manymalware can be caught by the signature. In other words, how

Frequency 20

32 (78%)

Frequency 18

1 (3%)

Frequency 14

5 (12%)

Frequency 6

3 (7%)

Frequency 4

2

(20%)

Frequency 2

8

(80%)

Frequency 4

4 (31%)

Frequency 2

3 (23%)

Frequency 1

6 (46%)

Frequency 20

1 (20%)

Frequency 15

1 (20%)

Frequency 9

1 (20%)

Frequency 5

1 (20%)

Frequency 3

1 (20%)

Frequency 2

3 (7%)

Frequency 1

38 (93%)

Frequency 20

6 (28%)

Frequency 14

9 (43%)

Frequency 3

6 (29%)

Frequency 6

1 (0%)

Frequency 5

1 (0%)Frequency 4

5 (0%)

Frequency 3

3 (0%)

Frequency 2

60 (5%)

Frequency 1

1275

(95%)

Frequency 20

8 (100%)

Frequency 20

31 (8%)Frequency 17

1 (0%)Frequency 16

2 (1%)Frequency 14

4 (1%)Frequency 13

1 (0%)Frequency 7

1 (0%)Frequency 6

3 (1%)Frequency 4

6 (1%)Frequency 3

3 (1%)Frequency 2

18 (4%)

Frequency 1

343 (83%)

Conficker Killav

Koobface

Nebuler

Onlinegamehack18 Palavo3

Qhost TDSS3 Userinit

Fig. 5. Distributions of subgraph’s frequency for various malware families.

often the signature is appeared across the malware variants. Wedefine this as “Frequency”. If the frequency is equal to k, themalicious graph appears in k distinct malware variants. Fig. 5illustrates the frequency of the malicious graphs. For instance,5 malicious graphs are discovered in 20 Killav variants andeach of the malicious graphs had frequency 20, 15, 9, 5 and3 respectively. That is, one of the malicious graph appearsin every Killav variants, thus it bocomes the most efficientsemantic signature.

Half of the malware samples in Fig. 5, Killav, Nebuler,Onlinegamehack18, Qhost and Userinit, show at least oneor more subgraphs that rank maximum frequency 20. Othermalware families also have frequent subgraphs ranked morethan two. Even though the frequent subgraphs are not able tocover all variants in a malware family, they are undoubtedlyuseful since a frequent subgraph can discover more than twomalware variants.

We performed the semantic signature selection by applyinga greedy algorithm. As a result, only 32 semantic signatureswere constructed for 166 malware variants. The number ofsemantic signatures decreased to only 19.28% of the numberof syntactic signatures.

C. Detection Accuracy

We assessed whether the semantic signatures can accuratelydetect malware variants without false alarm. To this end, weperformed two experiments. First, we evaluated the semanticsignatures with the remain 80% of real malware samples. Andthen, we applied the semantic signatures to benign binaries toevaluate false positives.

To validate the detection accuracy, we performed the fouranalysis steps (viz., Fig. 1) against the 661 malware variant(80% of samples). First, a call graph was extracted fromeach malware, and second, subgraphs were extracted basedon the basic blocks. Third, the subgraphs were transformedinto 128×128 matrices through the graph abstraction step.Finally, the graphs were matched with semantic signaturesto determine whether or not the graphs attempt maliciousbehavior.

Table II exhibits the detection results. Among the 661malware samples, 649 were matched to the 32 semanticsignatures. Average detection rate was 98.18%. Furthermore,we also performed matching with 200 benign binaries. 6,379subgraphs were extracted from the benign samples and therewas no matched results with our semantic signatures. As a

TABLE II

DETECTION RESULTS FOR REAL MALWARE SAMPLE

Malware Name# of Testing # of Matched Detection

Sample Sample RateConficker 32 32 100%

Killav 80 80 100%Koobface 29 25 86.20%Nebuler 80 80 100%

OnlineGameHack18 80 80 100%Palavo3 80 73 91.25%Qhost 80 80 100%

Rustock 40 39 97.50%TDSS3 80 80 100%userinit 80 80 100%

result, BinGraph effectively discovered metamorphic malwarewithout false positives.

V. CONCLUSION

In this paper, we present BinGraph, a novel approach toanalyze and classify metamorphic malware. The subgraphanalysis can provide not only metamorphic malwareclassification, but also behavior analysis of modules usedby malware variants. Such information can be useful to AVvenders and make the malware authors harder to developmetamorphic malware. For the future work, we will extractsemantic signatures presenting common behavior acrossvarious kinds of malware families, and analyze specificbehavioral features of metamorphic malware.

VI. ACKNOWLEDGMENTS

This research was supported by the KCC(KoreaCommunications Commission), Korea, under the R&Dprogram supervised by the KCA(Korea CommunicationsAgency)(KCA-2012-12-911-01-111). Additionally, this workwas partially supported by Seoul City R&BD programWR080951.

REFERENCES

[1] M. Christodorescu, S. Jha, and C. Kruegel, “Mining specifications ofmalicious behavior,” in European Software Engineering Conference andACM SIGSOFT Symposium on the Foundations of Software Engineering,2007.

[2] Symantec, “Global internet security threat report,” Apr. 2012.http://www.symantec.com/threatreport/.

[3] McAfee, “Threats report: First quarter 2012,” 2012.http://www.mcafee.com/us/resources/reports/rp-quarterly-threat-q1-2012.pdf.

[4] M. Driller, “Metamorphism in practice.” 29A Magazine, 2002.[5] L. Julus, “Metamorphism.” 29A Magazine, 2000.[6] Rajaat, “Polymorphism.” 29A Magazine, 1999.[7] K. Rieck, T. Holz, C. Willems, P. Dussel, and P. Laskov, “Learning and

classification of malware behavior,” in Fifth Conference on Detectionof Intrusions and Malware & Vulnerability Assessment (DIMVA 08),pp. 108–125, June 2008.

[8] D. Mohanty, “Anti-virus evasion techniquesand countermeasures.” Whitepaper, Aug. 2005.http://www.hackingspirits.com/ethhac/papers/whitrepapers.asp.

[9] G. Taha, “Counterattacking the packers.” McAfee, 2007.

[10] G. Jacob, H. Debar, and E. Filiol, “Behavioral detection of malware:from a survey towards an established taxonomy,” Journal in ComputerVirology, 2008.

[11] P. Ferguson, “Observations on emerging thrests,” in USENIX Workshopon Large-Scale Exploits and Emergent Threats (LEET), Apr. 2012.

[12] K. Thomas and D. Nicol, “The koobface botnet and the rise ofsocial malware,” in IEEE Int. Conf. Malicious and Unwanted Software(Malware 10), pp. 63–70, Oct. 2010.

[13] E. Kirda, C. Kruegel, G. Banks, G. Vigna, and R. Kemmerer, “Behaviorbased spyware detection,” in 15th Usenix Security Symposium, 2006.

[14] M. Egele, C. Kruegel, E. Kirda, H. Yin, and D. Song, “Dynamic spywareanalysis,” in Usenix Annual Technical Conference, 2007.

[15] L. Martignoni, E. Stinson, M. Fredrikson, S. Jha, and J. Mitchell, “Alayered architecture for detecting malicious behaviors,” in SymposiumonRecent Advances in Intrusion Detection (RAID), 2008.

[16] C. Kolbitsch, P. Comparetti, C. Kruegel, E. Kirda, X. Zhou, andX. Wang, “Effective and efficient malware detection at the end host,” inProceedings of the 18th USENIX Security Symposium, Aug. 2009.

[17] U. Bayer, P. M. Comparetti, C. Hlauschek, C. Kruegel, and E. Kirda,“Scalable, behavior-based malware clustering,” in Symposium on Net-work and Distributed System Security (NDSS), 2009.

[18] U. Bayer, I. Habibi, D. Balzarotti, E. Kirda, and C. Kruegel, “A viewon current malware behaviors,” in USENIX Workshop on Large-ScaleExploits and Emergent Threats (LEET), 2009.

[19] C. Willems, T. Holz, and F. Freiling, “Toward automated dynamicmalware analysis using cwsandbox,” in IEEE Security & Privacy,pp. 32–39, 2007.

[20] U. Bayer and E. Kirda., “Ttanalyze: A tool for analyzing malware,”in 15th Ann. Conf. of European Inst. for Computer Antivirus Research(EICAR), pp. 180–192, 2010.

[21] F. Cohen, “Computer viruses: Theory and experiments,” Computers &Security, pp. 22–35, Feb. 1987.

[22] D. Chess and S. White., “An undetectable computer virus,” in VirusBulletin Conference, Sept. 2000.

[23] A. Moser, C. Krugel, and E. Kirda, “Exploring multiple execution pathsfor malware analysis,” in IEEE Security & Privacy, pp. 231–245, 2007.

[24] A. Moser, C. Kruegel, and E. Kirda, “Limits of static analysis formalware detection,” in 23th Annual Computer Security ApplicationsConferecne (ACSAC 2007), pp. 421–430, Dec. 2007.

[25] D. Bruschi, L. Martignoni, and M. Monga, “Code normalization forself-mutating malware,” in IEEE Security & Privacy, pp. 46–54, 2007.

[26] M. Christodorescu, S. Jha, S. Seshia, D. Song, and R. Bryant,“Semantics-aware malware detection,” in IEEE Security & Privacy,pp. 32–46, 2005.

[27] M. Preda, M. Christodorescu, S. Jha, and S. Debray, “A semantics-basedapproach to malware detection,” ACM Transactions on ProgrammingLanguages and Systems (TOPLAS), Aug. 2008.

[28] V. Sathyanarayan, P. Kohli, and B. Bruhadeshwar, “Signature generationand detection of malware families,” in the 13th Australasian Conferenceon Information Security and Privacy (ACISP 2008), pp. 336–349, July2008.

[29] L. Bai, J. Pang, Y. Zhang, W. Fu, and J. Zhu, “Detecting maliciousbehavior using critical api calling graph matching,” in 1st InternationalConference on Information Science and Engineering, pp. 1716–1719,Dec. 2009.

[30] M. Eskandari and S. Hashemi, “Metamorphic malware detection usingcontrol flow graph mining,” International Journal of Computer Scienceand Network Security, Dec. 2011.

[31] J. Lee, G. Jeong, and H. Lee, “Detecting metamorphic malwares usingcode graphs,” in ACM Int’l Symp. on Applied Computing(ACM SAC2010), Mar. 2010.

[32] K. Jeong and H. Lee, “Code graph for malware detection,” in Int’l Conf.on Information Networking (ICOIN), Jan. 2008.

[33] WIKIPEDIA, “Basic block,” 2012.http://en.wikipedia.org/wiki/Basic block.

[34] K. Han, I. Kim, and E. Im, “Detection methods for malware variantusing api call related graphs,” in In proceedings of the InternationalConference on IT Convergence and Security, Dec. 2011.

[35] P. Black, “Greedy algorithm,” in Dictionary of Algorithms and DataStructures, Feb. 2005.

[36] A. Fried, “Whose internet is it, anyway,” in Blackhat Conference, Feb.2010.

BinGraph: Discovering Mutant Malware using Hierarchical … · 2014. 9. 2. · BinGraph:...

Documents

Transcript of BinGraph: Discovering Mutant Malware using Hierarchical … · 2014. 9. 2. · BinGraph:...