Behavior Grouping of Android Malware Family



Shun-Wen Hsiao
Institute of Information Science, Academia Sinica, Taiwan
Email: [email protected]

Yeali S. Sun
Department of Information Management, National Taiwan University, Taiwan
Email: [email protected]

Meng Chang Chen
Institute of Information Science, Academia Sinica, Taiwan
Email: [email protected]

Abstract: Malicious apps may install unwanted programs or gather sensitive information from mobile devices. We notice that Android apps intrinsically fork several threads to accomplish a complex task, and so does Android malware, which makes it difficult for security experts to analyze malware without knowing its structure. In this paper, we propose an analysis scheme to group and analyze Android malware based on dynamic behavior, and to identify the behaviors of a malware family. In addition, we apply phylogenetic trees, significant principal components, and dot matrices to different malware families to demonstrate their behavioral correlations. The proposed methods can automatically discover similar behaviors of different malware groups, extract the characteristics of each malware group, and provide visualized information based on runtime behaviors. We anticipate that the grouping results and the structure of malware families will be important for further research on malware behavior analysis.

I. INTRODUCTION

With the rapid growth of Android devices, many rich Android apps have been developed to make daily life more convenient. However, unwanted apps, such as bots, backdoors, and adware, are developed intentionally to gather sensitive information or to illegally gain control of a mobile device. Understanding their adverse actions is essential to malware research, and an adequate analysis tool is essential for security analysts to identify a peculiar behavior or behaviors commonly shared by malware apps. Based on our observation of more than 1,000 Android malware samples, we find that Android malware is quite complex: a sample can fork more than 10 threads and invoke 10k to 100k API calls in a short period of time (300 seconds) to accomplish a malicious activity, which makes manual analysis by security experts impractical. Fig. 1 shows the statistics of 40 malware families. This motivates us to develop an automatic analysis and grouping method that simplifies the early stage of malware behavior research before further investigation.

In this paper, we develop analysis methods and tools to profile Android malware, and techniques to group and analyze malware behavior by performing similarity analysis, behavior grouping, and characteristic extraction. We adopt dynamic analysis to profile Android malware in a virtualized environment and record the invoked Java API calls as execution traces. These traces are then used to perform sequence and similarity analysis.

Identifying the important threads and significant behaviors of a malicious app from its trace is not an easy task. However, depending on the implementation strategy chosen by the malware programmer, an app might reuse previous code, interchange code snippets, or merge components from other apps. Since the proposed tool records detailed runtime traces, we view the trace as the malware's DNA and adopt bioinformatics algorithms for grouping and alignment, including phylogenetic trees, principal components, and the dot matrix method, to better understand the malware and its family. The proposed methods can visualize relationships, group similar behaviors, and identify significant characteristics. Our contributions include 1) building a dynamic analysis tool for Android apps, 2) developing Android app profiles based on Java API call sequences, 3) adopting three types of methods to visualize the internal structure of an Android malware family, 4) identifying significant characteristics of different malware groups, and 5) testing the proposed methods on a real data set.

The remainder of the paper is organized as follows. Section II reviews related work on Android malware. Section III describes the mechanism for performing dynamic analysis on Android malware. Section IV presents the behavior analysis methods and the results from different aspects. Section V concludes the paper and lists future work.

II. RELATED WORK

The Android Malware Genome Project [1] is one of the most popular Android malware databases for academic research. It focuses on the Android platform and aims to systematize or characterize existing Android malware. It has 1,260 malware samples collected from August 2010 to October 2011. The project emphasizes various aspects, including installation methods, activation mechanisms, and the carried malicious payloads. We use these samples to perform dynamic analysis and behavior profiling.

TaintDroid [2] adopts taint analysis and provides a system-wide dynamic taint tracking system capable of tracking multiple sources of sensitive data. It can monitor and track private information in a running Android system. DroidScope [3] takes advantage of virtualization technology to inspect an Android app in a virtual machine through virtual machine introspection. Our testbed runs on a VM as well, but the proposed profiler resides in the Android (Dalvik) VM, which gives us full-speed inspection.

Barrera et al. [4] use a self-organizing map to demonstrate the relationship between the permission usage of Android apps and their categories. Pscout [5] discusses the relationship between permissions and Java classes. VetDroid [6] is a dynamic analysis platform that can reveal how apps use permissions to access sensitive system resources, and how these acquired resources are further utilized by the app. Appsplayground [7] performs dynamic analysis in an Android emulator based on taint tracing of privacy-sensitive information, sensitive API monitoring, and kernel-level tracking to identify known exploits. Our work does not filter any Java API call because we view the invoked APIs as the DNA of the malware, in which every call is important. The proposed analysis method then extracts the significant ones later.

IEEE ICC 2016 Communication and Information Systems Security Symposium

978-1-4799-6664-6/16/$31.00 ©2016 IEEE

Fig. 1: The statistics of the generated profiles for different malware families. (Plot over the 40 malware families, ADRD through Zsone, on a logarithmic axis from 10^0 to 10^5; series: # of malware samples, # of total forked threads, and avg. # of API calls per thread.)

Droidmat [8] applies machine learning algorithms to features from an app's manifest file and its API calls to distinguish Android malware. ProfileDroid [9] generates Android app profiles using several features at multiple layers, such as resource usage, system calls, network traffic, and user behaviors. Networkprofiler [10] generates network profiles to identify apps from HTTP traffic. Hayes et al. [11] summarize several known publications on malware phylogeny systems, which adopt different clustering algorithms and features to classify malware samples. However, these works do not pay attention to the thread structure of an app, as we propose to do.

All of the above research can help researchers understand the behavior of malicious Android apps from different aspects, e.g., instructions, permissions, network, manifest file, and resource usage. In our case, we adopt dynamic analysis and take the invoked Java API calls as the basis for sequence and similarity analysis. We can further provide, automatically, the structure of an Android malware sample, the relationships within a malware family, and the behavior of the forked malware threads. Hence, we provide one more rich view for security experts analyzing Android malware.

III. DYNAMIC ANALYSIS ON ANDROID APP

Our 1,260 malware samples are classified into 49 families [1]. In our experiment, they fork 16,328 threads, but some samples are no longer runnable. We therefore generate 9,863 non-empty thread profiles (from 1,156 samples in 40 families). The dynamic analysis system is built on QEMU [12] and KVM [13]. The physical machine has an Intel i7-3770S 3.1 GHz quad-core CPU with 8 GB RAM running Ubuntu 12.04. Each virtual Android device has a single-core CPU at 3.1 GHz with 2 GB RAM and Wi-Fi, running Android-x86 [14] 4.4.

In our system, an automation script establishes a TCP connection to QEMU through the QEMU Machine Protocol (QMP) [15] to control the virtual Android device. It also connects to the Android Debug Bridge (adb) to interact with the Android system. Through QMP and adb, we can run app profiling automatically, e.g., install a malicious app, launch the app, collect the invoked Java API calls, and reboot the virtual machine.

    VERSION: 3
    Threads (22):
     1 main                      12 Thread-84
     2 GC                        13 OkHttp ConnectionPool
     3 Signal Catcher            14 Thread-85
     4 JDWP                      15 Thread-86
     5 Compiler                  16 Thread-87
     6 ReferenceQueueDaemon      17 Thread-88
     7 FinalizerDaemon           18 Thread-89
     8 FinalizerWatchdogDaemon   19 Thread-90
     9 Binder_1                  20 Thread-91
    10 Binder_2                  21 Thread-92
    11 Timer-0                   22 Thread-93
    Trace (threadID action usecs class.method):
     1 xit 0 ..dalvik/system/VMDebug.startMethodTracing
     1 xit 2 .android/os/Debug.startMethodTracing
     1 xit 3 android/app/ActivityThread$Profiler.startProfiling
     1 ent 4 android/os/ParcelFileDescriptor.close
    12 ent 0 ...net/youmi/android/g.run
    12 ent 2 ....android/view/View.getContext
    12 xit 4 ....android/view/View.getContext
    12 ent 6 ....net/youmi/android/b.<init>

Fig. 2: An example of a trace file (reorganized and simplified).

Fig. 2 is a trace file from the ADRD family, in which the app forks 22 threads at runtime. We notice that some threads are created by default by the Android system for special purposes, such as GC (garbage collection), JDWP (Java Debug Wire Protocol), and Compiler. However, we expect the threads created by the malware programmer (user threads) to be the ones that matter, e.g., main, Timer-0, and Thread-84. Hence, we keep only user threads for analysis. In Fig. 2, for example, they are threads 1 and 11 to 22.

We generate a profile for each thread. A thread profile preserves only the call sequence, i.e., the class name and method name of each entry with the ent flag (entering a call). Each thread profile has a name consisting of a 4-digit hex number and a flag, e.g., 8a15-C. This means that the thread is forked by the app indexed by 8a15 and that its thread name is mapped to a table entry indexed by C.
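As a rough illustration of the profile generation step, the following Python sketch parses trace lines laid out as in Fig. 2 (threadID, action, usecs, class.method) and keeps only the ent entries of user threads. The function name `build_thread_profiles`, the way user thread IDs are passed in, and the treatment of leading dots as call-depth markers are assumptions made for this example; they are not taken from the authors' tool.

```python
from collections import defaultdict


def build_thread_profiles(trace_lines, user_thread_ids):
    """Return {thread_id: [class.method, ...]} with only 'ent' records of user threads."""
    profiles = defaultdict(list)
    for line in trace_lines:
        parts = line.split(None, 3)  # threadID, action, usecs, class.method
        if len(parts) != 4:
            continue  # skip headers or malformed lines
        thread_id, action, _usecs, method = parts
        if not thread_id.isdigit():
            continue
        if action != "ent" or int(thread_id) not in user_thread_ids:
            continue
        # Leading dots are assumed to mark call depth only, so they are dropped.
        profiles[int(thread_id)].append(method.lstrip("."))
    return dict(profiles)


sample = [
    "1 xit 0 ..dalvik/system/VMDebug.startMethodTracing",
    "1 ent 4 android/os/ParcelFileDescriptor.close",
    "2 ent 5 java/lang/Object.wait",  # hypothetical system-thread (GC) record, not from Fig. 2
    "12 ent 0 ...net/youmi/android/g.run",
    "12 ent 2 ....android/view/View.getContext",
    "12 xit 4 ....android/view/View.getContext",
    "12 ent 6 ....net/youmi/android/b.<init>",
]
user_threads = {1} | set(range(11, 23))  # threads 1 and 11 to 22, as in Fig. 2
profiles = build_thread_profiles(sample, user_threads)
```

Here the record of thread 2 is dropped because it is not a user thread, and the xit records are dropped because only call entries are kept.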


IV. BEHAVIOR ANALYSIS AND RESULT

Although malware is a program, we can still view it as a living organism in the digital world. We adopt similarity analysis to identify similar behavior among multiple profiles and group them together to demonstrate their relations. We also view a profile as the DNA of a malware sample and adopt sequence analysis algorithms from bioinformatics. Sequence analysis is very useful for discovering functional, structural, and evolutionary information [16].

A. Dot Matrix

Dot matrix analysis is a method for comparing two sequences to look for possible alignments between them [17]. Fig. 3 shows four examples of dot matrices in the Android malware family ADRD. The visualized figures show how similar or different two thread profiles are, as well as which parts of the call sequences are similar.

One sequence, X (in our case, an API call sequence), is listed across the bottom of the figure and the other sequence, Y, is listed up the left side. Starting with the first API call in Y, we move across the first row of the matrix and place a dot in every column where the API call in X is the same. This process is repeated for each API call in Y until the matrix is filled with dots representing all possible matches of X with Y. Any similar subsequence is revealed by a diagonal row of dots.
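The construction just described can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the text rendering with row 0 at the top, and the API names in the example are all hypothetical. The helper `diagonal_runs` lists the diagonal rows of dots, which correspond to shared subsequences.

```python
def dot_matrix(x, y):
    """matrix[row][col] is True when y[row] == x[col]."""
    return [[call_y == call_x for call_x in x] for call_y in y]


def render(matrix):
    """Draw the matrix as text, one row per element of Y, '*' for a match."""
    return "\n".join("".join("*" if hit else "." for hit in row) for row in matrix)


def diagonal_runs(matrix, min_length=2):
    """Return (row, col, length) for each maximal diagonal run of dots."""
    runs = []
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    for r in range(rows):
        for c in range(cols):
            if not matrix[r][c]:
                continue
            if r > 0 and c > 0 and matrix[r - 1][c - 1]:
                continue  # this dot continues a run that started earlier
            length = 0
            while r + length < rows and c + length < cols and matrix[r + length][c + length]:
                length += 1
            if length >= min_length:
                runs.append((r, c, length))
    return runs


# Hypothetical API call sequences, used only to exercise the functions.
x = ["open", "read", "close", "open", "read"]
y = ["read", "close", "open"]
matrix = dot_matrix(x, y)
print(render(matrix))
print(diagonal_runs(matrix))
```

In this toy example the only diagonal of length two or more starts at Y position 0 and X position 1 and covers the shared snippet read, close, open.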

In Fig. 3a, two threads from two different malware samples (i.e., 09b1 and 657d) in the ADRD family show almost identical behavior (their Jaccard distance, d_J, is close to zero), while the other examples show different degrees of distance (or dissimilarity). The dot matrix also reveals the programming structure. In Fig. 3a, for example, the first 400 API calls are common to threads 09b1-B and 657d-B, and then both run the same sequence of code twice (from 400 to 700 and from 800 to 1100). In Fig. 3b, the thread 7ded-C on the Y-axis executes some code (0 to 4000) that does not exist in the comparison thread 7ded-B on the X-axis; however, these two threads are spawned from the same app (i.e., 7ded), and they still share some code, at positions 4000 to 8000 on the Y-axis and 1000 to 5000 on the X-axis.

B. Jaccard Distance and Phylogenetic Tree

To quantify the similarity of thread behaviors, we calculate the Jaccard distance, d_J (Eq. 1), between every pair of available threads. Each element of X and Y is a Java API call in the thread profile. The Jaccard distance ranges from zero, meaning that X and Y contain exactly the same set of API calls, to one, meaning that X and Y have no API call in common.

$$d_J(X, Y) = 1 - \frac{|X \cap Y|}{|X \cup Y|} \tag{1}$$
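As a small worked example of Eq. 1, the following Python sketch computes the Jaccard distance between two thread profiles treated as sets of API calls. The function name and the API names are hypothetical, and returning 0.0 for two empty profiles is an assumption made here to avoid a division by zero.

```python
def jaccard_distance(x, y):
    """Jaccard distance between two collections of API calls, treated as sets."""
    x, y = set(x), set(y)
    union = x | y
    if not union:
        return 0.0  # assumption: two empty profiles are considered identical
    return 1.0 - len(x & y) / len(union)


# Hypothetical thread profiles (API names are placeholders).
thread_a = ["Socket.connect", "OutputStream.write", "Socket.close"]
thread_b = ["OutputStream.write", "Socket.close", "Log.d"]

# Two of the four distinct calls are shared, so the distance is 1 - 2/4.
print(jaccard_distance(thread_a, thread_b))
```

Because the profiles are reduced to sets, the order and the repetition count of the calls do not affect this distance.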

To demonstrate the relationship between threads, we apply the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) algorithm [18] to the Jaccard distances to construct a phylogenetic tree. Fig. 4 shows the phylogenetic tree of the ADRD family. There are 16 malware samples in this family, and they fork 44 user threads. The y-axis of the phylogenetic tree is the distance between two threads or groups. The distance ranges from zero (i.e., two threads use the same set of Java API calls) to one (i.e., they have no Java API call in common).
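To make the tree construction concrete, here is a minimal pure-Python sketch of UPGMA (average-linkage agglomeration) over Jaccard distances. Instead of returning the full tree, it stops merging once the closest pair of clusters is farther apart than a distance threshold, which yields the groups obtained by cutting the tree at that height. The function names, the profile names, and the API names are hypothetical, and the brute-force search is meant for small examples only.

```python
from itertools import combinations


def jaccard_distance(x, y):
    x, y = set(x), set(y)
    union = x | y
    return 1.0 - len(x & y) / len(union) if union else 0.0


def upgma_groups(profiles, threshold):
    """Group thread profiles by UPGMA, cutting the tree at `threshold`.

    profiles: dict mapping a thread name to its API calls.
    Returns a sorted list of groups, each a sorted list of thread names.
    """
    names = sorted(profiles)
    leaf_dist = {
        frozenset((a, b)): jaccard_distance(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }
    clusters = [(name,) for name in names]

    def average_distance(c1, c2):
        # UPGMA: arithmetic mean of all leaf-to-leaf distances between two clusters.
        total = sum(leaf_dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        if average_distance(clusters[i], clusters[j]) > threshold:
            break  # every remaining merge would be above the cut
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Hypothetical profiles; single letters stand in for Java API calls.
profiles = {
    "t1": ["a", "b", "c"],
    "t2": ["a", "b", "c", "d"],
    "t3": ["x", "y"],
    "t4": ["x", "y", "z"],
}
print(upgma_groups(profiles, threshold=0.5))
```

With a threshold of 0.5, t1 and t2 (distance 0.25) form one group and t3 and t4 (distance 1/3) form another; lowering the threshold to 0.3 keeps only the first merge.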

Fig. 3: Examples of dot matrices in the ADRD family: (a) d_J(09b1-B, 657d-B) ≈ 0.0; (b) d_J(7ded-B, 7ded-C) ≈ 0.1; (c) d_J(f2be-D, f2be-F) ≈ 0.35; (d) d_J(8255-A, e173-A) ≈ 0.8.

Fig. 4: The phylogenetic tree of malware family ADRD. (Leaves: user threads such as e173-E, 09b1-B, and 8a15-C; y-axis: distance, 0.0 to 1.0; marked groups: G1, G2 with sub-groups G2-1 and G2-2, and G3 with sub-groups G3-1 to G3-4.)

If the Jaccard distance threshold is set to 0.5, there are 3 groups plus one orphan thread, e173-E, within the ADRD family. (How to set the threshold is beyond the scope of this paper; however, we show the effect of the distance threshold later.) For the first group, G1, 9 out of the 16 apps in this family fork a thread that falls into this group, and their Jaccard distance is close to 0.0, which indicates that the programmer reused certain code in 9 of the malware variants. The threads in Fig. 3a (09b1-B and 657d-B) fall into G1. The G2 group covers 12 apps and has two sub-groups, which are grouped at a distance of 0.3.

For each ancestor node in the phylogenetic tree, we extract the API calls shared by all of its descendants as the characteristics of the group. Then, we compare the API calls between these (sub-)groups to find the differences between them. Take G3 for example: the major difference between its two sub-groups, G3-1 and G3-2, is that G3-2 contains a large amount of extra code at the end of its execution (see also Fig. 3c and Table II). However, the execution at the beginning is similar, which is why they are grouped together.


Fig. 5: The PCA result of malware family ADRD. (Scatter plot of threads; x-axis: PC1 (0.4834); y-axis: PC2 (0.2367); labeled clusters: G1, G2-1, G2-2, G3-1, G3-2, G3-3, G3-4.)

For a group G_α with siblings G_β, G_γ, ..., let their ancestor be A, and let S(G) be the set of APIs shared by all descendants of a group G. We then define the characteristics of G_α as follows.

$$C_{G_\alpha} = S(G_\alpha) \setminus S(A) \tag{2}$$
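Eq. 2 amounts to two set intersections and one set difference, as the following Python sketch shows. The function names and API names are hypothetical, and the ancestor is modeled simply as the union of the threads of the group and of its siblings, which is an assumption about how the tree is represented.

```python
def shared_apis(threads):
    """S(G): the APIs invoked by every thread in the group."""
    sets = [set(t) for t in threads]
    return set.intersection(*sets) if sets else set()


def characteristics(group, siblings):
    """C_G = S(G) \\ S(A), where the ancestor A contains the group and all its siblings."""
    ancestor = list(group) + [thread for sibling in siblings for thread in sibling]
    return shared_apis(group) - shared_apis(ancestor)


# Hypothetical groups; each inner list is the API set of one thread.
g_alpha = [["socket", "connect", "write", "log"], ["socket", "connect", "log"]]
g_beta = [["read", "log"], ["read", "close", "log"]]

print(sorted(characteristics(g_alpha, [g_beta])))
```

In this toy case "log" is shared by every thread under the ancestor, so it is removed, and the characteristics of the first group are "connect" and "socket".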

We notice that there are two types of variants in ADRD. One type of variant forks one thread in each of G1 and G2, e.g., 09b1. The other type forks multiple threads in G3, such as 8a15, but the variants in G3 have different grouping structures: G3-1 and G3-2 are two behavior groups of one app, while G3-3 and G3-4 are two apps that each spawn multiple similar threads. We suspect that classifying G1, G2, and G3 into the same family may not be appropriate from the perspective of execution, since they do not share similar code, structure, or style.

C. Principal Component Analysis

Although similarity distance can reveal the relationship between threads, one more question remains: which APIs are significant for distinguishing them? We adopt Principal Component Analysis (PCA) to extract important API calls. Fig. 5 plots ADRD's threads by the first two principal components (i.e., PC1 and PC2) on a 2-dimensional coordinate system.

The data points (i.e., threads) marked as A at the bottom right of Fig. 5 map to the threads in G2 in Fig. 4. The light gray A at (-3.18, -0.61) and E at (3.54, 5.48) belong to malware e173, and the B at (-0.96, 2.47) is f2be-A. These three data points lie away from the others, and this relation can be seen in the phylogenetic tree as well.

Since 8a15, i.e., G3-3, has several similar threads, all of its data points are plotted around (-6.9, -2.14) in blue. There are three other sets of overlapping data points nearby: the threads of 7ded (G3-4, in yellow at (-6.10, -0.46)) and the two sub-groups of f2be (G3-1 at (-5.46, 0.15) and G3-2 at (-7.60, -1.23)). The last group is labeled B at (3.36, 8.19) and maps exactly to G1. In this way, we can easily map the data points in the PCA plot to the phylogenetic tree. For better visualization, each color in the PCA figure maps to one app name (i.e., a 4-digit hex number). Therefore, a cluster whose data points have the same color, e.g., G3-3, indicates that the malware forks multiple threads with similar behavior. On the other hand, a cluster whose points have the same symbol but different colors, e.g., G1, indicates that multiple variants in the family fork a thread with similar behavior.

TABLE I: The extracted characteristics of ADRD family

Group  Size  APIs  |C_G|  Significant Classes
G1     9     66    47     FastPrintWriter, PrintWriter, ThreadDeath, Intent, Activity, ApplicationErrorReport, PhoneWindow, ApplicationThreadNative, ComponentName, RuntimeInit, Process, UnsupportedOperationException, Process, ActivityThread, Instrumentation, Slog, Uri, StringWriter
G2     12    168   149    CloseGuard, StructLinger, StructStat
G3     20    152   133    HashMap, Charsets, AbstractCollection, Splitter, FileDescriptor, CaseMapper

TABLE II: The extracted characteristics of Android malware family ADRD

Group  Size  APIs  |C_G|  Significant Classes
G2-1   9     168   142    ∅
G2-2   3     234   207    PlainSocketImpl, OutputStream, BlockGuard, StructTimeval, PlainSocketImpl, LinkedHashMap, Collections, BlockGuard, AbstractMap
G3-1   3     192   46     IOException, InterruptedIOException, SocketTimeoutException
G3-2   4     288   136    CloseGuard, StructLinger, StructStat, AsynchronousCloseMonitor, OsConstants
G3-3   10    233   81     ICU, Buffer, CollectionUtils, CharBuffer, ByteBuffer, CoderResult, CharArrayBuffer, AsynchronousCloseMonitor, ByteArrayBuffer
G3-4   3     212   60     Collections, LoadedApk, StackTraceElement

For ADRD, a total of 604 APIs (i.e., features or dimensions) are invoked. After the PCA transformation, the first and second principal components account for 48.34% and 23.67% of the variance, respectively, which is large enough to distinguish the data points. We consider the APIs with higher weights in PC1 and PC2 to be more important for distinguishing different groups. Table I shows the extracted characteristics of the different groups. The 'Size' column is the number of threads in the group. 'APIs' is the number of Java APIs shared by all threads in the group. |C_G| is the size of the characteristics (Eq. 2) picked by the first and second principal components. We list the most significant Java classes (i.e., those whose APIs have the highest absolute weights in PC1 and PC2). They are the critical classes for differentiating the behaviors of the groups. Table II shows the same analysis using G2 and G3 as a family, so that we can further distinguish the minor differences between sub-behaviors.
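The idea of ranking APIs by their principal component weights can be illustrated with a small, dependency-free Python sketch. It is a simplification under several assumptions: each thread is encoded as a binary vector (1 if the API is invoked), only PC1 is computed (by power iteration on the sample covariance matrix), and the function and API names are hypothetical. The paper uses both PC1 and PC2 and does not state how features are encoded.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return the unit-length weight vector of PC1 for a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix of the features (d x d).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in w]
    return v


def rank_apis(api_names, rows):
    """Sort APIs by the absolute value of their PC1 weight, largest first."""
    weights = first_principal_component(rows)
    return sorted(zip(api_names, weights), key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical feature matrix: one row per thread, one column per API.
apis = ["socket", "log", "rare"]
threads = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
]
ranked = rank_apis(apis, threads)
print([name for name, _ in ranked])
```

An API that every thread invokes ("log" here) has zero variance and therefore zero weight, so it cannot help to distinguish groups, whereas "socket", which splits the threads into two halves, receives the largest weight.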

D. Cross-Family Analysis

Table III shows the grouping result of all 40 Android malware families. By setting different distance thresholds (T), we can discriminate behaviors within a malware family. For each group in each family, we pick one representative thread, namely the one with the lowest Jaccard distance to the other threads. We group these representative threads in one phylogenetic tree (Fig. 8), and the grouping results are shown in the last row of Table III.
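One way to read the selection of a representative thread is as choosing the medoid of a group, i.e., the thread with the lowest mean Jaccard distance to the other threads of the same group. The Python sketch below follows that reading; using the mean (rather than, say, the maximum) is an assumption, and the function, thread, and API names are hypothetical.

```python
def jaccard_distance(x, y):
    x, y = set(x), set(y)
    union = x | y
    return 1.0 - len(x & y) / len(union) if union else 0.0


def representative_thread(group):
    """Return the name of the thread with the lowest mean distance to the others.

    group: dict mapping a thread name to its API calls.
    """
    names = sorted(group)
    if len(names) == 1:
        return names[0]

    def mean_distance(name):
        others = [n for n in names if n != name]
        return sum(jaccard_distance(group[name], group[o]) for o in others) / len(others)

    return min(names, key=mean_distance)


# Hypothetical group of three thread profiles.
group = {
    "aaaa-A": ["a", "b", "c"],
    "bbbb-A": ["a", "b", "c", "d"],
    "cccc-A": ["a", "b"],
}
print(representative_thread(group))
```

Here "aaaa-A" is selected because its API set lies between the other two, giving it the smallest mean distance.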


TABLE III: The grouping result of different malware families

Malware             Total     Total     User      Number of Groups
Family              Samples   Threads   Threads   T=.7   T=.5   T=.3
ADRD                16        169       44        4      6      9
AnserverBot         184       2705      159       11     18     22
Asroot              7         34        2         1      1      1
BaseBridge          114       1417      427       8      14     16
BeanBot             8         61        1         1      1      1
CoinPirate          1         9         0         0      0      0
CruseWin            2         14        0         0      0      0
DogWars             1         4         0         0      0      0
DroidCoupon         1         25        6         4      4      5
DroidDeluxe         1         4         0         0      0      0
DroidDream          14        92        14        6      8      10
DroidDreamLight     41        263       116       8      13     18
DroidKungFu1        31        168       23        5      9      12
DroidKungFu2        29        150       19        3      3      3
DroidKungFu3        265       2597      198       15     22     34
DroidKungFu4        71        549       54        11     14     18
DroidKungFuSapp     3         51        9         1      1      2
DroidKungFuUpdate   1         2         0         0      0      0
Endofday            1         3         0         0      0      0
FakeNetflix         1         7         1         1      1      1
FakePlayer          6         13        0         0      0      0
Geinimi             8         82        18        4      5      6
GGTracker           1         4         0         0      0      0
GingerMaster        4         56        29        5      6      7
GoldDream           28        290       87        13     22     29
Gone60              9         76        0         0      0      0
HippoSMS            4         32        7         2      2      3
Jifake              1         3         0         0      0      0
jSMSHider           10        61        0         0      0      0
Pjapps              40        330       78        9      14     21
Plankton            11        86        1         1      1      1
RogueLemon          2         20        0         0      0      0
RogueSPPush         9         99        18        2      2      2
SndApps             10        36        0         0      0      0
Tapsnake            1         2         0         0      0      0
Walkinwat           1         5         1         1      1      1
YZHC                21        156       27        4      6      7
zHash               11        127       47        3      4      7
Zitmo               1         2         0         0      0      0
Zsone               10        59        12        4      4      4
Total above         980       9863      1398      127    182    240
As one family                                     39     90     152

* T is the threshold in the phylogenetic tree. The last row is the number of groups when we group all user threads in one big phylogenetic tree.

We notice that this number is smaller than the sum over the individual families. This is because some of the malware families are 'siblings', meaning that they share similar behaviors across families. Fig. 6 shows one example. We plot DroidKungFu2's malware threads in DroidKungFu1's phylogenetic tree, and they can be grouped together. We also find that the main difference between DroidKungFu1 and DroidKungFu2 is the sub-group indicated by the right-most arc in Fig. 6. We can infer that the malware programmers indeed wrote new code for the new variants in DroidKungFu2 but also borrowed some old code. Fig. 7 shows another example, DroidDream and DroidDreamLight. They have 11 major behavior groups among 130 threads when T is set to 0.5, and DroidDreamLight's threads are mixed into DroidDream's threads.

In Fig. 8, these representative threads form groups as well; for example, Pjapps-1, DroidDreamLight-1, GoldDream-3, DroidKungFu3-7, and DroidDream-3 are very similar, although they belong to different families. Moreover, the representative threads from the same family (for example, ADRD-1 to ADRD-3) are not always in the same group. This implies that fine-grained thread behavior should be taken into account when classifying Android malware. The current classification rules or mechanisms may not reflect the malware behavior. We suggest that, when classifying a malware sample into a family, we should first collect its user thread profiles and compare them with the representative threads in the database to determine which representative behavior it has or is similar to. Then, the representative behavior can be used as a 'genetic code' to classify similar malware samples.

Fig. 6: The mixed phylogenetic tree of malware families DroidKungFu1 and DroidKungFu2 (marked as DKF2). (y-axis: distance, 0.0 to 0.8.)

Fig. 7: The mixed phylogenetic tree of malware families DroidDream and DroidDreamLight (DroidDreamLight's threads are marked as 'L' on the x-axis). (y-axis: distance, 0.0 to 1.0.)

V. CONCLUSION AND FUTURE WORK

We design and implement analysis methods and tools, including dot matrix analysis, Principal Component Analysis, and UPGMA, to demonstrate how to profile Android malware. We analyze malicious apps and investigate their behavior grouping characteristics. We demonstrate that each thread of an Android malware sample has its own behavior and can be classified into different groups. We can extract the characteristics of each group and show their behavioral distance. We also notice that the current family classification mechanism can be improved by our analysis, because the analysis results can group 'siblings' together. This helps security experts better understand the structure of Android malware and the relationships within and across malware families. Some insights and future work are listed below.

Multiple threads. Because of the nature of the Android system, apps fork about 10 to 20 threads to accomplish a task. Our experiments show that we can divide them into groups to ease the analysis process and reveal the structure and programming style of the malware.

Fig. 8: The mixed phylogenetic tree of all representative threads of all malware families with Jaccard distance 0.7. (Leaves: the representative threads of each family, e.g., ADRD-0 to ADRD-3; y-axis: distance, 0.0 to 1.0.)

Representative behavior. Some malware families are 'siblings' that share similar behaviors. In practice, we can build a behavior database from the representative threads of each group in each family. When a new or unknown Android malware appears, we can then identify more efficiently whether it is similar to previously seen malware.
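As a minimal sketch of such a lookup, the following Python code compares the set of API calls in an unknown thread against a small database of representative threads using Jaccard distance. The database contents, the labels `FamilyA-0` and `FamilyB-0`, and the function names are hypothetical and chosen only for illustration; this is not the implementation used in our experiments.

```python
def jaccard_distance(calls_a, calls_b):
    """Jaccard distance between the sets of API calls in two traces."""
    set_a, set_b = set(calls_a), set(calls_b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty traces are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


def nearest_behavior(trace, database):
    """Return (label, distance) of the closest representative thread."""
    return min(
        ((label, jaccard_distance(trace, rep)) for label, rep in database.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical behavior database: label -> API calls of a representative thread.
database = {
    "FamilyA-0": [
        "java.net.URL.openConnection",
        "java.io.InputStream.read",
        "android.telephony.TelephonyManager.getDeviceId",
    ],
    "FamilyB-0": [
        "android.telephony.SmsManager.sendTextMessage",
        "java.lang.Thread.start",
    ],
}

# Hypothetical trace of a thread from an unknown app.
unknown = [
    "java.net.URL.openConnection",
    "java.io.InputStream.read",
    "java.lang.Thread.start",
]

label, distance = nearest_behavior(unknown, database)
print(label, distance)
```

A match could be accepted only when the distance falls below a chosen threshold. The appropriate threshold is an open design choice and would need to be tuned on real traces.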

Programming structure. We observe that there are different types of strategies for using threads (which could possibly reflect the programming style of the malware programmer). Dot matrix also shows that two threads may contain almost the same code snippets, but in an interchanged order. This could be a further research topic.
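The interchanged-snippet observation can be illustrated with a small dot matrix sketch in Python. The two call sequences and the helper names `dot_matrix` and `diagonal_runs` are hypothetical. Each diagonal run of matches corresponds to a snippet shared by both threads, and runs that start at different offsets indicate that the snippets appear in a different order.

```python
def dot_matrix(seq_a, seq_b):
    """matrix[i][j] is True when seq_a[i] and seq_b[j] are the same call."""
    return [[x == y for y in seq_b] for x in seq_a]


def diagonal_runs(matrix, min_len=2):
    """Return (start_a, start_b, length) for each maximal diagonal run."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    runs = []
    for i in range(rows):
        for j in range(cols):
            # Start a run only where no match precedes it on the diagonal.
            starts_run = matrix[i][j] and not (
                i > 0 and j > 0 and matrix[i - 1][j - 1]
            )
            if not starts_run:
                continue
            length = 0
            while (
                i + length < rows
                and j + length < cols
                and matrix[i + length][j + length]
            ):
                length += 1
            if length >= min_len:
                runs.append((i, j, length))
    return runs


# Hypothetical traces: the same two snippets, in swapped order.
thread_1 = ["open", "read", "close", "connect", "send"]
thread_2 = ["connect", "send", "open", "read", "close"]

runs = diagonal_runs(dot_matrix(thread_1, thread_2))
print(runs)  # [(0, 2, 3), (3, 0, 2)]
```

Here the run `(0, 2, 3)` is the `open, read, close` snippet and `(3, 0, 2)` is the `connect, send` snippet, found at swapped positions in the two threads.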

Future work. We will further investigate the code snippet (i.e., a sequence of API calls) as the characteristic of a family or a group, rather than using a single call as a feature. We anticipate that sequence alignment analysis, e.g., Longest Common Subsequence and the Smith-Waterman algorithm, can be useful to identify commonly shared alignments of API calls. Such an alignment can better describe a sequence of API calls as a behavior characteristic.
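As a hedged illustration of this direction, the following Python sketch computes a Longest Common Subsequence of two API call sequences with the standard dynamic programming recurrence. The two traces are hypothetical, and the code only indicates how a shared sequence of calls could be extracted; it is not an evaluated method.

```python
def longest_common_subsequence(seq_a, seq_b):
    """Return one longest common subsequence of two call sequences."""
    m, n = len(seq_a), len(seq_b)
    # table[i][j] = LCS length of seq_a[i:] and seq_b[j:]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if seq_a[i] == seq_b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start to recover the subsequence itself.
    result, i, j = [], 0, 0
    while i < m and j < n:
        if seq_a[i] == seq_b[j]:
            result.append(seq_a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical traces from two threads of related apps.
trace_a = ["open", "getDeviceId", "connect", "write", "close"]
trace_b = ["open", "connect", "log", "write", "close"]

shared = longest_common_subsequence(trace_a, trace_b)
print(shared)  # ['open', 'connect', 'write', 'close']
```

Unlike a set-based feature, the result preserves call order. Smith-Waterman would instead report the best-scoring local alignment, which may suit traces that share only a short snippet.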

We anticipate that this research and the ongoing work could help security experts investigate Android malware more thoroughly and further develop countermeasures.

ACKNOWLEDGMENT

This work was supported in part by Academia Sinica and the Ministry of Science and Technology, Taiwan, under grants MOST 103-2221-E-002-167-MY3 and 104-2923-E-002-008-MY2.

REFERENCES

[1] Y. Zhou and X. Jiang, ”Dissecting Android Malware: Characterizationand Evolution,” in IEEE Symposium on Security and Privacy, 2012.

[2] W. Enck et al., ”TaintDroid: an information-flow tracking system forrealtime privacy monitoring on smartphones,” ACM Transactions onComputer Systems (TOCS), vol. 32, no. 2, 2014.

[3] L.-K. Yan and H. Yin, ”DroidScope: Seamlessly Reconstructing the OSand Dalvik Semantic Views for Dynamic Android Malware Analysis,”in USENIX Security Symposium, 2012, pp. 569–584.

[4] D. Barrera et al., ”A Methodology for Empirical Analysis ofPermission-based Security Models and Its Application to Android,” inProc. ACM Conference on Computer and Communications Security,2010, pp. 73–84.

[5] K. W. Y. Au et al., ”Pscout: Analyzing the Android PermissionSpecification,” in ACM Conference on Computer and CommunicationsSecurity, 2012, pp. 217–228.

[6] Y. Zhang et al., ”Vetting Undesirable Behaviors in Android Appswith Permission Use Analysis,” in Proc. ACM SIGSAC Conference onComputer and Communications Security, 2013, pp. 611–622.

[7] V. Rastogi et al., ”Appsplayground: Automatic Security Analysis ofSmartphone Applications,” in Proc. ACM Conference on Data andApplication Security and Privacy 2013, pp. 209–220.

[8] D.-J. Wu et al., ”Droidmat: Android Malware Detection ThroughManifest and API Calls Tracing,” in IEEE Asia Joint Conference onInformation Security (Asia JCIS), 2012, pp. 62–69.

[9] X. Wei et al., ”ProfileDroid: Multi-layer Profiling of Android Appli-cations,” in Proc. ACM Annual International Conference on MobileComputing and Networking, 2012, pp. 137–148.

[10] S. Dai et al., ”Networkprofiler: Towards Automatic Fingerprinting ofAndroid Apps,” in Proc. IEEE INFOCOM, 2013, pp. 809–817.

[11] M. Hayes, A. Walenstein, and A. Lakhotia, ”Evaluation of malwarephylogeny modelling systems using automated variant generation,”Journal in Computer Virology, vol. 5, no. 4, pp. 335–343, 2009.

[12] F. Bellard, ”QEMU, a Fast and Portable Dynamic Translator,” in Proc.USENIX Annual Technical Conference, 2005, pp. 41–46.

[13] A. Kivity et al., ”kvm: the Linux Virtual Machine Monitor,” in Proc.Linux Symposium, 2007, vol. 1, pp. 225–230.

[14] Android-x86 - Porting Android to x86, http://www.android-x86.org[15] QEMU Machine Protocol, http://wiki.qemu.org/QMP[16] D. W. Mount, ”The Grid: Bioinfomatics: Sequence and Genome Anal-

ysis. Cold Spring Harbor Laboratory Press, New York (2001)[17] A. J. Gibbs and G. A. McIntyre, ”The Diagram, a Method for Compar-

ing Sequences. Its Use with Amino Acid and Nucleotide Sequences,”Eur. J. Biochem, vol. 16, pp. 1–11, 1970.

[18] R. Durbin et al., Biological Sequence Analysis: Probabilistic Models ofProteins and Nucleic Acids, Cambridge University Press, 2006.