DroidSF - A framework for security analysis of

mobile applications

João Miguel Martins Nunes

Thesis to obtain the Master of Science Degree in

Information Systems and Computer Engineering

Supervisor: Prof. Pedro Miguel dos Santos Alves Madeira Adão

Examination Committee

Chairperson: Prof. Francisco António Chaves Saraiva de Melo

Supervisor: Prof. Pedro Miguel dos Santos Alves Madeira Adão

Member of the Committee: Prof. Nuno Miguel Carvalho dos Santos

May 2019

Acknowledgments

I would like to thank my parents and my brother for their friendship, encouragement and caring over

all these years, for always being there for me and without whom this dissertation would not be possible.

A special acknowledgment goes to my grandparents, who unfortunately passed away earlier this year,

for teaching me to respect everyone in this world and to not take anything for granted. I will forever be

grateful to them for providing my family with all the love and support we could ask for throughout all

these years.

A special mention to Pedro Durão Lino, a very dear friend whom I miss every day, for always trying to

protect those he cared about from the unfair and terrifying life obstacles, showing nothing but endless

strength when facing adversities.

I would also like to acknowledge my thesis supervisor, Prof. Pedro Adão, for his insight, patience,

support and sharing of knowledge that have made this dissertation possible.

Last but not least, to all my friends and colleagues who helped me grow as a person and were always

there for me during the good and bad times in my life. Thank you.

To each and every one of you – Thank you.

Abstract

Mobile devices, especially smart-phones, are an increasingly valuable target for bad actors as they often

hold important personal information that can potentially be exploited against their users.

With the growing number of mobile devices connected to the internet, it’s imperative that we develop

tools and document how to perform an in-depth analysis of mobile applications. We believe this knowledge

will help software developers, and even users, to be more conscious about security and implement better

code following the recommended practices.

This thesis will cover techniques and software one can use to analyse how an Android application

was built and gain insight into what it does in the background. To be able to evaluate the security of an

Android mobile application, it’s imperative to understand how such applications are developed, assembled and how

they operate on devices at runtime. With this in mind, we will provide details about the inner-workings

of the Android platform, with special attention to its security features.

A framework was produced alongside this thesis that aggregates frequently used tools to facilitate

the security analysis process. Our framework was designed to be fully automated, easy to use and

extendable. These characteristics seek to promote a good starting point for anyone that wants to analyse

the behaviour of mobile applications and develop systematic tests to help assess their overall security

level.

We’ll focus on methodologies that allow us to inspect critical components of the application as

described by the Open Web Application Security Project (OWASP) Top 10 Security Risks.

Keywords

Security analysis; Android applications; Mobile security; Reverse engineering; Static analysis; Dynamic

analysis; Binary instrumentation

Summary

Mobile devices, especially smart-phones, are increasingly a valuable target for malicious actors, as they usually hold important information that can be used against their user.

With the growing number of mobile devices connected to the Internet, it is essential that tools and documentation exist to enable a thorough security analysis of mobile applications. We believe this knowledge can help programmers to be more aware of security risks and to implement robust code following the practices recommended by the industry.

To analyse the security of an Android mobile application, it is imperative to understand how such applications are developed and how they operate on devices at runtime. With this in mind, we provide details about the inner workings of the Android platform, with special attention to features related to security and privacy.

This dissertation seeks to identify techniques and tools capable of analysing applications designed for Android mobile devices, with the goal of obtaining information about the operations they perform in the background.

During this thesis, we implemented a platform that brings together several frequently used tools to facilitate the process of introspection and analysis of Android applications. We designed our platform to be fully automatic, easy to use and extensible.

Throughout the dissertation we focus on reverse engineering techniques that allow the inspection of the critical components of an application, as identified by the OWASP Top 10 Security Risks document.

Keywords

Security analysis; Android applications; Security in mobile devices; Reverse engineering; Static analysis; Dynamic analysis; Instrumentation of executables

Contents

1 Introduction

1.1 Android platform

1.2 Reverse Engineering

1.3 Goals

2 State of the Art

2.1 Android Platform Architecture

2.1.1 Dalvik Virtual Machine

2.1.2 Android Runtime

2.1.3 What is smali?

2.2 Application Package Kit

2.3 Android Applications

2.4 Android API

2.5 Android Permissions

2.6 Signing an APK

2.7 Static Analysis

2.7.1 Signature-based analysis

2.7.2 Taint analysis

2.7.3 Behaviour-based analysis

2.7.4 Challenges to static analysis

2.8 Dynamic Analysis

2.9 Static Analysis vs Dynamic Analysis

2.10 Anti-Tampering

2.11 Obfuscation

3 Proposed Solution

3.1 Design and Requirements

3.1.1 Design Choices

3.1.2 Requirements

3.1.3 Dependencies

3.2 Framework Workflow

3.3 Framework Configuration

3.3.1 Default settings

3.3.2 Fully automated

3.4 Static analysis

3.4.1 APKtool

3.4.2 AndroGuard

3.4.3 DroidStat-X

3.4.4 Implemented tests

3.5 Dynamic Binary Instrumentation

3.5.1 Frida

3.5.2 Included instrumentation scripts

4 Evaluating the Solution

4.1 Methodology

4.2 Comparison with other tools

4.2.1 Apkx

4.2.2 Objection

4.2.3 AppMon

4.3 Selected testing applications

4.4 OWASP Methodology

4.5 Limitations

5 Conclusion

5.1 Future Work

5.2 Conclusions

List of Figures

2.1 Android Software Stack [1]

2.2 Java vs. Dalvik [2]

2.3 Diagram of the Android Runtime [3]

2.4 APK decompilation

2.5 Android Application Development Flow [4]

2.6 Diagram of the Activity life-cycle [5]

2.7 SafetyNet Attestation Application Programming Interface (API) protocol [6]

3.1 XMind Map generated by DroidStat-X

List of Tables

1.1 Ericsson Mobility Report - November 2018 [7]

3.1 DroidSF: Configuration parameters

4.1 DroidSF basic tests - April 2019 Successful: X, Failed: X

4.2 DroidSF findings - April 2019 Detected vulnerability: X, No vulnerability found: X

Listings

2.1 Signing an APK

3.1 DroidSF framework usage

3.2 Mechanism to handle Frida’s output

Acronyms

RE Reverse Engineering

APK Application Package Kit

JAR Java Archive

JDK Java Development Kit

Dalvik-VM Dalvik Virtual Machine

SDK Software Development Kit

VM Virtual Machine

OS Operating System

IDE Integrated Development Environment

CLI Command-Line Interface

opcode operation code

API Application Programming Interface

IPC Inter-Process Communication

ASLR Address Space Layout Randomisation

ART Android Runtime

AIDL Android Interface Definition Language

JIT Just In Time

DBI Dynamic Binary Instrumentation

OWASP Open Web Application Security Project

AVD Android Virtual Device

HAL Hardware Abstraction Layer

JVM Java Virtual Machine

1 Introduction

Contents

1.1 Android platform

1.2 Reverse Engineering

1.3 Goals

The Internet has become an essential part of the daily life of many people. It has evolved from a

basic communication network to an interconnected set of data sources with market places for the sale of

products and services. Services like online banking or advertising are some of the most successful areas on

the Internet for commercial purposes [8].

In an effort to support all the modern functionalities and inter-connectivity that we have come to

expect from recent mobile devices, software is getting increasingly more complex and often includes

multiple libraries from external sources. This kind of complexity and inter-connectivity increases the

risk of security vulnerabilities which, in turn, can have severe consequences to people and systems that

interact with compromised software.

Just as in the physical world, there are people on the Internet with malevolent intent who relentlessly

search for vulnerabilities to exploit so they can enrich themselves, while taking advantage of oblivious

users.

Vulnerabilities enable a variety of attacks. The analysis of these attacks can determine the severity

of damage that can be inflicted and the likelihood that the attack can be further replicated.

Software that “deliberately fulfils the harmful intent of an attacker” is commonly referred to as

malicious software or malware [9]. Malware helps bad actors to accomplish their goals and its prevalence

in third-party application stores indicates that this threat is not going away soon. Notably, in 2017 only

0.1 percent of discovered mobile malware was found on official application stores, with 99.9 percent being

hosted on third-party sites [10].

With each passing year, not only has the sheer volume of security threats to mobile devices increased,

but the threat landscape has become more diverse. The number of new mobile malware variants increased

by 54 percent in 2017, as compared to 2016 [10]. There was also a marked increase in the number of

ransomware infections on mobile devices during 2018, up by a third when compared to 2017 [11].

Attackers keep developing new methods of infection, new means of generating revenue from devices

and hacks to remain on compromised devices as long as possible. Being able to think like an attacker,

knowing their tools and having confidence that we’ve minimized the attack surface is very important.

The task of screening and validating if an application is secure can be very time-consuming and easily

overwhelms analysts that try to perform this task manually. Due to the very substantial number of

sample applications submitted for security review every day, it is paramount that we use an automated

approach to quickly differentiate between samples that deserve an in-depth manual analysis, and those

that are a variation of already known threats [8].

Investigating current methods of analysis employed by security experts to detect malware is relevant

to the context of this thesis.

Throughout chapter 2 of this thesis, we will review state-of-the-art approaches currently used by

security researchers and implemented in some anti-malware and anti-virus software.

1.1 Android platform

Android is an Operating System (OS) for mobile devices based on the Linux OS and it includes additional

system libraries, middle-ware, and a suite of pre-installed applications. Android applications, also

commonly known as ‘apps’, are mainly written in Java by using a rich collection of Application Programming

Interfaces (APIs) provided by the Android Software Development Kit (SDK). Compiled code is packed

into an archive file, alongside data and resources required by the application. This archive file is known

as an Application Package Kit (APK) and once it is installed on an Android device, it runs by using the

Android Runtime (ART) environment.
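Since an APK is a ZIP-compatible archive, its contents can be listed with standard archive tooling. The following minimal Python sketch illustrates the idea. The helper names `list_apk_entries` and `build_sample_apk`, as well as the placeholder entries, are assumptions made for illustration and are not part of DroidSF; in a real APK the manifest is stored as binary XML rather than the plain text used here.

```python
import io
import zipfile


def list_apk_entries(apk_file):
    """Return a mapping of entry name -> uncompressed size for an APK.

    `apk_file` may be a path or a file-like object, as accepted by zipfile.
    """
    with zipfile.ZipFile(apk_file) as apk:
        return {info.filename: info.file_size for info in apk.infolist()}


def build_sample_apk():
    """Build a tiny in-memory archive that mimics the layout of an APK.

    The contents are placeholders (hypothetical), not a valid application.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"<manifest/>")
        apk.writestr("classes.dex", b"dex\n035\x00")
        apk.writestr("res/raw/data.txt", b"hello")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, size in list_apk_entries(build_sample_apk()).items():
        print(f"{size:>6} bytes  {name}")
```

Pointing `list_apk_entries` at a real `.apk` path should list the same kinds of entries: the manifest, one or more `classes*.dex` files and the packaged resources.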

We chose the Android platform because it has around 75% worldwide market share in the mobile

device space [12]. Its open-source nature was also a major factor for us, as it contributes to better

documentation, bigger developer communities and the availability of tools to interact with the Android

OS.

According to the Ericsson Mobility Report, current and forecast figures for smart-phone

subscriptions are:

Year    Smart-phone subscriptions

2017    4 350 million

2018    5 010 million

2024    7 210 million

Table 1.1: Ericsson Mobility Report - November 2018 [7]

Table 1.1 clearly shows a growing number of mobile devices connected to the Internet. Keeping each of

these devices updated and secure is a major security challenge. This task is particularly difficult

since there are many different device manufacturers and a variety of modified versions of the Android

OS.

The Android system has evolved quite a bit from its first commercial device launch in 2008, to its

latest version 9 (codename Pie) deployed in August 2018.

The upcoming major version of Android, codenamed Q, updates the Android API to level 29 and

is already in open beta testing. It promises an OS that will give users more control over privacy and

finer authority over what applications have access to. Some of the relevant new features and changes that

might affect our work are:

• Scoped storage: new permissions and APIs for accessing files in external storage.

• More user control over location permissions.

• Improved constraints on activities launching from the background.

• New restrictions on accessing device serial and IMEI.

• Permission for wireless scanning: Wi-Fi and Bluetooth will require fine location permission.

• Ability to run embedded DEX code directly from APK.

• Executable segments of system binaries and libraries are mapped into execute-only (non-readable)

memory, as a hardening technique against code-reuse attacks.

• Calls to ‘ptrace’ are unaffected, so ‘ptrace’ debugging is not impacted.

• Applications can no longer invoke ‘exec()’ on files within their home directory.

• Restrictions on application in-memory modification of executable code from files which have been opened

with ‘dlopen()’. This includes any shared object (.so) files with text relocations.

Substantial changes in security features were introduced in its 10+ years of existence, so we felt it

was important to investigate current system features, particularly ones that are relevant for Reverse

Engineering (RE).

It is important to notice that many users continue to make life easy for attackers by continuing to

use older versions of Android. Only around 23% of devices are running the newest versions of Android

(version 8.1 codename Oreo and version 9 codename Pie) [11]. The lack of security awareness from users

is still one of the main ways devices are infected by malicious software and there has been a step-up in

the use of tried-and-tested distribution schemes like SMS and email spam [13].

With the launch of Android 9 (Pie), applications targeting API levels older than Android 4.2 (API

level 17) display a warning when launched. Google Play Store, the official application store for the

Android platform, now requires all applications to target an API level released within the past year, and

will also mandate 64-bit support in 2019 [14].

Modern OS architectures, like Android, have many built-in security features (e.g., process sand-boxing,

Address Space Layout Randomisation (ASLR), permission based access, etc.) that seek to minimize the

attack surface of applications. However, as a side effect of providing flexibility to its ecosystem of

applications and programmers, there have been plenty of vulnerabilities uncovered over the years in the

Android platform.

Fixing API vulnerabilities, like fixing deployed protocols, is often hard because fixes may require

changes to the API which break backwards compatibility. It takes nearly a year (346 days) for 50% of the

Android devices using the Google Play Store to update to a new version of Android. Full deployment to

95% of devices takes a little more than 3 years (1230 days) [15]. From the time a new release is available,

which has fixed the vulnerability, to the moment when devices are updated, there is a very big window

of opportunity for attackers.

Rooted mobile devices give users special permissions and enable capabilities that break security

assumptions, e.g., read private data, circumvent permissions, instrument applications. A rooted device can

become a liability in terms of security, especially for an uninformed user. Personal Android devices have a

ratio of 1 rooted device to 23 non-rooted devices, whereas enterprise devices have a ratio of only 1 to 3890 [11].

1.2 Reverse Engineering

RE is the process of reconstructing the semantics of a compiled program’s source code.

This thesis will focus on several techniques and tools that can be leveraged to analyse the security

of Android mobile devices, in particular RE techniques.

The motivation behind our efforts to RE an application is solely focused on assessing its overall

security.

The legality of two common forms of RE in software, namely, decompilation and disassembly of binary

code, has been challenged on trade secret, copyright, and contract law theories. Although courts and

legal commentators have overwhelmingly supported the legality of RE, it remains somewhat in a grey

area [16].

Disassembly is the process of converting the different binary sequences into their original operation

codes (opcodes). It relies on identifying the hardware architecture and instruction set the binary was

compiled for.

Decompilation consists in the process of interpreting opcodes and attempting to generate equivalent

source code. Due to compiler optimizations and obfuscation techniques, the decompilation process will

almost always generate source code different from the original.
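Python's standard `dis` module offers a small, self-contained analogue of disassembly: it maps a raw byte sequence back to named opcodes. The sketch below targets CPython byte-code rather than Dalvik or native code, so it only illustrates the concept; the function `add` is a hypothetical example, and the exact opcode names vary between Python versions.

```python
import dis


def add(a, b):
    """Tiny function whose compiled byte-code we want to inspect."""
    return a + b


# The compiled form is an opaque byte sequence...
raw = add.__code__.co_code

# ...which the disassembler maps back to named opcodes and their arguments.
instructions = list(dis.get_instructions(add))
opnames = [ins.opname for ins in instructions]

if __name__ == "__main__":
    print("raw bytes:", raw.hex())
    for ins in instructions:
        print(f"{ins.offset:>4}  {ins.opname:<20} {ins.argrepr}")
```

Recovering the original `return a + b` from that listing would be the decompilation step, which requires interpreting the opcodes rather than merely naming them.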

One of the first well-known cases of RE was the Samba project. Andrew Tridgell wrote a packet

sniffer, reverse engineered the SMB protocol and implemented it on a Unix machine. Thus, he made the

Unix system appear to be a PC file server, which allowed him to mount shared file-systems from the Unix

server while concurrently running NetBIOS applications [17]. RE was instrumental to the development

of the Samba software because no public information was available about the SMB protocol.

The Samba project’s history is a prime example that, on balance, interoperability has more beneficial

than harmful economic consequences. Hence, a legal rule permitting the RE of programs to achieve

interoperability is economically sound [16].

In the context of this thesis, we are only interested in RE to test if applications operate correctly and

do not perform malicious or unintended activities within a mobile device.

The RE process can be split into two main types of analysis:

• Static analysis allows inspection of an application without actually executing it. This process often

requires the disassembly and/or decompilation of binary code to run tests and obtain human-

readable code.

• Dynamic analysis refers to techniques that execute an application and allow inspection of its state

at various points in execution. Using this approach one can analyse the behaviour of a binary

application at runtime through the injection of instrumentation code.
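As a small illustration of the static side of this split, the sketch below scans a binary blob for printable strings and flags cleartext `http://` URLs, without executing anything. The names `extract_strings` and `find_cleartext_urls`, the minimum string length and the sample bytes are assumptions made for this example. Real DEX files store strings in a modified UTF-8 encoding, so a plain ASCII scan is only an approximation.

```python
import re

# Runs of at least six printable ASCII characters (the length is an assumption).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")
# URLs that use unencrypted HTTP.
CLEARTEXT_URL = re.compile(r"http://[^\s\"'<>]+")


def extract_strings(blob: bytes):
    """Return printable ASCII runs found in a binary blob, in order."""
    return [match.group().decode("ascii") for match in PRINTABLE_RUN.finditer(blob)]


def find_cleartext_urls(blob: bytes):
    """Return every http:// URL embedded in the blob."""
    urls = []
    for text in extract_strings(blob):
        urls.extend(CLEARTEXT_URL.findall(text))
    return urls


if __name__ == "__main__":
    # Synthetic sample bytes standing in for compiled application code.
    sample = b"\x00\x01http://example.com/login\x00\x02https://example.com/ok\x00abc"
    print(find_cleartext_urls(sample))
```

A dynamic counterpart would observe the same URL at the moment the running application opens the connection, which is where instrumentation comes in.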

1.3 Goals

The main drive for this thesis is to understand how to assess the security of mobile applications. We

want to be able to determine whether the application was tampered with, check critical areas for vulnerabilities and

investigate if malicious code is present.

We will use RE techniques and explain how to implement test suites for mobile applications, closely

following the Open Web Application Security Project (OWASP) Top 10 Security Risks and the Mobile

Security Testing Guide.

Due to the nature of our work and since we will be describing how to disable basic anti-tampering

methods, we will also discuss some improvements that developers can employ to make their applications

more resilient against RE.

A major part of this thesis focuses on the development of a framework that enables developers and

researchers to analyse Android applications programmatically without requiring too much effort to con-

figure.

Our framework is not intended for piracy or other illegal uses. It was built and designed solely

to facilitate the security assessment of Android applications.

The framework was named Android Security Framework or DroidSF for short. It combines various

existing tools created and maintained by security experts, to provide a systematic and easy way to analyse

mobile applications.

DroidSF Github repository: https://github.com/neskk/droidsf

The DroidSF framework is completely open-source, works on multiple platforms (Windows, Linux,

MacOS) and was planned to be extensible and customizable, so that it can grow with the help of other

developers and adapt to the ever-changing technology found in mobile devices.

We built a platform where static and dynamic analyses co-exist and complement each other. This

key aspect is what sets our work apart from existing frameworks.

The DroidSF framework provides a convenient way to experiment with new RE techniques, and to analyse

applications in a fully automated way. Supporting all major OSes is also another important strength of

our framework, since most of the existing frameworks, such as AppMon [18] and DroidStat-X [19], do

not run natively on Windows, targeting only Linux and MacOS environments.

Some other mobile security testing frameworks attempt to support both Android and iOS at the

same time, leading to a much greater code-base which is harder to maintain and introduces unnecessary

complexity. We decided to focus our framework just on the Android platform to attempt to minimize

these problems.

We explain in greater detail the decisions and design process behind the development of

DroidSF in chapter 3.

2 State of the Art

Contents

2.1 Android Platform Architecture

2.2 Application Package Kit

2.3 Android Applications

2.4 Android API

2.5 Android Permissions

2.6 Signing an APK

2.7 Static Analysis

2.8 Dynamic Analysis

2.9 Static Analysis vs Dynamic Analysis

2.10 Anti-Tampering

2.11 Obfuscation

Reverse engineering and tampering techniques have long been associated with the realm of hackers and

malware analysts. For traditional security testers and researchers, Reverse Engineering (RE) has been more

of a complementary skill, but the tides are turning. Testing mobile applications increasingly requires

disassembling compiled applications, applying patches, and tampering with binary code or even live

processes. The fact that many mobile applications implement defences against RE makes things harder

for security analysts.

Reverse engineering a mobile application is the process of analysing the compiled software to extract

information about its source code with the intent of understanding how the application works [20].

Tampering is the process of modifying a mobile application, either by changing the compiled byte-code,

instrumenting the running process or its environment, in order to affect the behaviour of the application

being tested [20]. For example, it is common for an application to refuse to run on rooted devices, making

it impossible to run certain tests or use Android’s debugging functionalities. In such cases, we want to

alter the application’s behaviour.

Mobile security testers should have a basic understanding of RE concepts, mobile devices and operating

systems. RE is an art, and describing its every facet could easily fill a whole library. The sheer volume

of techniques and specializations can be overwhelming. One can spend years working on a very concrete

and isolated sub-problem, such as automating malware analysis or developing improved de-obfuscation

methods. Security testers have to be generalists. In order to become an effective reverse engineer, one

must filter through the vast amount of relevant information [20].

In recent years, researchers have developed a variety of tools and methodologies to conduct analysis

of Android applications. While all the respective papers aim at providing a thorough empirical

evaluation, comparability is hindered by varying or unclear evaluation targets. These limitations make it

nearly impossible to directly compare approaches and we have to accept that there will always be some

techniques more suited for some tasks than others [21]. This fact reinforces the need for security testers

to diversify their knowledge and learn the generic concepts behind RE, rather than learning a specific

tool or methodology.

There is no universal recipe for the RE process that always works. Acknowledging this fact, throughout

this chapter we will first focus on describing the current state-of-the-art of the Android platform, in

particular, information regarding its security features. Secondly, we will provide details about commonly

used RE methods and tools. Finally, some examples of tackling the most common anti-reverse defences

will also be analysed.

2.1 Android Platform Architecture

The Android Operating System (OS) software stack is composed of several different layers. Each layer defines

interfaces and offers specific services as shown in figure 2.1.

Figure 2.1: Android Software Stack [1]

The foundation of the Android platform is the Linux kernel. On top of the kernel, the Hardware

Abstraction Layer (HAL) defines a standard interface for interacting with built-in hardware components.

Several HAL implementations are packaged into shared library modules that the Android system includes

when required. This design is what enables applications to interact with the device’s hardware, e.g., it

allows a chat application to use a device’s microphone and speaker.

2.1.1 Dalvik Virtual Machine

Android applications are usually written in Java and compiled to Dalvik byte-code, which is somewhat

different from the traditional Java byte-code. Dalvik byte-code is created by first compiling the Java

code to ‘.class’ files, then converting the Java byte-code to the Dalvik Executable ‘dex’ format with the

‘dx’ tool from the Android Software Development Kit (SDK).

The Dalvik Virtual Machine (Dalvik-VM) is the original Android runtime, first deployed with Android 1.0 in 2008. Initially, it consisted of a simple application virtual machine similar to the Java Virtual Machine (JVM), optimized for mobile devices and able to execute the ‘dex’ byte-code specification.

The ‘dex’ byte-code specification limits the total number of methods that can be referenced within a single ‘.dex’ file to 65 536, including Android framework methods, library methods, and methods in the application’s own code [22]. To move past this limitation, developers can enable a configuration known as ‘multi-dex’, which allows an application to build and read multiple ‘dex’ files.
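As a rough illustration of this limit, the number of method references in a ‘dex’ file can be read directly from its header. The Python sketch below assumes the standard ‘dex’ header layout, in which the ‘method_ids_size’ field is a little-endian 32-bit integer at offset 0x58 of a 0x70-byte header; the function names are hypothetical and the code is not taken from any existing tool.

```python
import struct

DEX_METHOD_LIMIT = 65536        # maximum method references in a single dex file
DEX_HEADER_SIZE = 0x70          # assumed size of the fixed dex header
METHOD_IDS_SIZE_OFFSET = 0x58   # assumed offset of the method_ids_size field


def method_reference_count(dex_bytes: bytes) -> int:
    """Return the number of method references declared in a dex header."""
    if len(dex_bytes) < DEX_HEADER_SIZE or not dex_bytes.startswith(b"dex\n"):
        raise ValueError("input does not look like a dex file")
    # method_ids_size is stored as an unsigned little-endian 32-bit integer.
    (count,) = struct.unpack_from("<I", dex_bytes, METHOD_IDS_SIZE_OFFSET)
    return count


def remaining_method_slots(dex_bytes: bytes) -> int:
    """Return how many more method references fit before multi-dex is needed."""
    return DEX_METHOD_LIMIT - method_reference_count(dex_bytes)
```

An application whose ‘classes.dex’ reports a count close to 65 536 is a likely candidate for ‘multi-dex’, in which case further ‘dex’ files should be expected inside the APK.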

Figure 2.2: Java vs. Dalvik [2]

With time, Google felt the need to address performance concerns with the Dalvik-VM and to keep up with the hardware advances of the industry. Google added a Just In Time (JIT) compiler with the release of Android 2.2, added multi-threading capabilities, and generally tried to improve the platform piece by piece.

The JIT compiler used by Dalvik-VM is a software component which takes an application’s byte-code, analyses it, and actively translates it into an optimized form that runs faster, doing so while the application continues to run. As the user progresses through the application, additional code is compiled and cached, so that the system can reuse it while the application is running.

Because the JIT compiler only compiles a part of the code, it has a smaller memory footprint and

requires less storage space on the device.


2.1.2 Android Runtime

Android Runtime (ART) is the successor to the Dalvik-VM and it became the default runtime for devices

running Android 5.0 (API level 21) and higher. Both were originally created specifically for the Android

project [23].

ART was built to be backwards compatible, meaning it retained the ability to execute older ‘dex’

byte-code specifications.

Ahead-Of-Time (AOT) compilation was introduced, as well as other improvements over the Dalvik-VM. The key difference between ART and its predecessor is the way byte-code is executed. As the name implies, with Ahead-Of-Time compilation, applications are compiled before they are executed for the first time.

At install time, ART compiles applications using the on-device ‘dex2oat’ tool. This utility accepts ‘.dex’ files as input and generates the compiled application executable for the target device [23]. The resulting ‘.oat’ files are essentially ‘ELF’ files that are then executed natively. Instead of ‘dex’ byte-code that is interpreted by a virtual machine, we now have native machine code that can be executed directly by the processor. This pre-compiled native machine code is used for all subsequent executions and is reported to improve performance by a factor of two while reducing power consumption [2].

ART has become faster and more memory-efficient in nearly every new Android release. A notable benefit is that these improvements automatically apply to nearly all applications, since they run through ART.

Most recently, with the release of Android 9, ART developers have been working to reduce the size of the ‘dex’ files. These files are stored twice on an Android device, once in the Application Package Kit (APK) and again in an extracted form that ART keeps around to speed up application launch. They are also loaded into memory, so smaller ‘dex’ files save storage space and reduce the amount of memory an application allocates.

A new feature introduced in Android 9 (Android P) called ‘CompactDex’ aims to help reduce the size of ‘dex’ files. These files still exist in an APK, but now when an APK is installed, ART extracts and rewrites the ‘dex’ files into ‘cdex’ files. ‘CompactDex’ is a smaller format, with better layout optimization, and removes duplicated data when dealing with multiple ‘dex’ files.

As we have described, Android’s ‘dex’ byte-code files have a 65 536 method limit, so it’s not unusual

for large applications to have more than one ‘dex’ file. One of the inefficiencies of having multiple ‘dex’

files is that a lot of information is duplicated across these multiple files. As part of the ‘CompactDex’

rewriting, a new shared data section is created for the ‘multi-dex’ applications. The duplicate data across

‘dex’ files is written in the shared data section, so it exists only once. With ‘CompactDex’, the ‘dex’ files

are around 12% smaller [14].


Figure 2.3: Diagram of the Android Runtime [3]


2.1.3 What is smali?

As described in previous sections, the Dalvik Executable ‘dex’ files included in the APK contain the application’s compiled Dalvik-VM byte-code. This ‘dex’ byte-code is practically unreadable by humans, which makes it impractical for direct analysis.

Figure 2.4: APK decompilation

Because Java is a very popular programming language, there are

plenty of tools that attempt to recreate the original Java source code

from ‘dex’ files. We will talk more about this topic ahead. We can

use dex2jar [24] or enjarify [25] to convert the ‘dex’ files to Java

classes zipped inside a Java Archive (JAR) file. Afterwards, we can

use a Java decompiler, such as procyon [26] or CFR [27] to read the

class files contained in the JAR and attempt to export Java source

code.

The decompiled Java source code is easier to read and understand than ‘dex’, but the decompilation process will likely not produce working source code. Some sections of the decompiled source code may also be improperly reconstructed, making this process neither efficient nor consistent.

Smali code is an intermediate representation for ‘dex’ byte-code

and it supports the full functionality of the ‘dex’ format (e.g., annotations, debug info, line info, etc.) [28].

Its main purpose is to facilitate the interaction with application’s byte-code.

‘smali’ files are the result of disassembling a ‘dex’ file (baksmaling). The inverse process (smaling) is also supported, enabling the re-assembly of ‘smali’ into ‘dex’ byte-code.

dex ⇔ smali ⇐ Java source code

Because ‘smali’ can consistently be converted back to ‘dex’, it facilitates the repackaging of modified

existing Android applications. One can modify an application without even knowing its original Java

source code.

Smali code is readable, but it is closer to an assembly language, meaning that it does not resemble Java code [28]. It is also worth noting that converting ‘dex’ to ‘smali’ does not improve our chances of getting working Java source code.

2.2 Application Package Kit

An Android APK is a collection of components that share a common set of resources, i.e.: database,

preferences, file space and a Linux process [29].


An APK file consists of a ‘zip’ archive that contains all the files that comprise the application [30].

By default, only APKs downloaded from the official Google Play Store can be installed on Android mobile devices. Users can deactivate this security feature by enabling an option in Android’s security settings called Unknown Sources.

Figure 2.5: Android Application Development Flow [4]

The structure of the APK archive contains some folders and files, most notably:

• ‘META-INF/’: directory where signature data is stored, it’s used to ensure the integrity of the

APK.

• ‘assets/’: holds application’s assets, which the application can retrieve using an ‘AssetManager’

object.

• ‘lib/’: contains required native code libraries compiled inside a subdirectory for each processor architecture (e.g., armeabi, armeabi-v7a, arm64-v8a, x86, x86_64, and MIPS).

• ‘res/’: contains resources that aren’t compiled into ‘resources.arsc’.

• ‘AndroidManifest.xml’: mandatory file that describes the name, version, required components,

access rights, minimum required API level, referenced library files and entry point of the application

[31].

• ‘classes.dex’: application code compiled in ‘dex’ byte-code format.

• ‘resources.arsc’: includes language strings and styles, as well as paths to content that is not included

directly in this file, such as layout files and images.
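Since an APK is a ZIP archive, the layout above can be inspected with any ZIP library. The following Python sketch uses only the standard ‘zipfile’ module and groups the entries of an APK by top-level folder; the function name ‘summarize_apk’ is hypothetical.

```python
import zipfile
from collections import defaultdict


def summarize_apk(apk_file) -> dict:
    """Group the entries of an APK by top-level folder or file name.

    apk_file may be a path or a binary file-like object.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(apk_file) as apk:
        for name in apk.namelist():
            if "/" in name:
                # Entries inside a folder are grouped under "folder/".
                key = name.split("/", 1)[0] + "/"
            else:
                # Top-level files such as classes.dex are their own group.
                key = name
            groups[key].append(name)
    return dict(groups)
```

For a typical APK, the resulting dictionary is expected to contain keys such as ‘META-INF/’, ‘res/’ and ‘classes.dex’, which gives a quick overview before any deeper analysis.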


Android requires that all APKs be digitally signed with a certificate before they can be installed [32].

The signature included in each APK is very important as it is used to establish the authenticity of the

application. During the APK signing process all included files are hashed, in an attempt to detect and

prevent file tampering.
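To make the role of per-file hashes more concrete, the sketch below computes a Base64-encoded SHA-256 digest for every entry of an APK, similar in spirit to the digests that JAR signing records in the manifest under ‘META-INF/’, and reports which entries differ between two APKs. It is a simplified illustration written in Python: it does not verify any signature, and the function names are hypothetical.

```python
import base64
import hashlib
import zipfile


def entry_digests(apk_file) -> dict:
    """Map every non-signature entry of an APK to a Base64 SHA-256 digest."""
    digests = {}
    with zipfile.ZipFile(apk_file) as apk:
        for info in apk.infolist():
            # Signature data itself lives in META-INF/ and is skipped here.
            if info.is_dir() or info.filename.startswith("META-INF/"):
                continue
            digest = hashlib.sha256(apk.read(info)).digest()
            digests[info.filename] = base64.b64encode(digest).decode("ascii")
    return digests


def changed_entries(original_apk, candidate_apk) -> list:
    """List entries that were added, removed or modified between two APKs."""
    before = entry_digests(original_apk)
    after = entry_digests(candidate_apk)
    names = before.keys() | after.keys()
    return sorted(n for n in names if before.get(n) != after.get(n))
```

Comparing an original APK with a repackaged one in this way points directly at the files that were tampered with, for example a modified ‘classes.dex’.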

We should point out that unzipping the APK with the standard unzip utility leaves some files unreadable. The application’s resources are still packaged into a single archive file and ‘AndroidManifest.xml’ is encoded in a binary XML format which is not readable with a text editor. One of the most popular tools to unpack an APK is apktool [33]. It can automatically decode the manifest file to text-based XML, extract the contents of ‘resources.arsc’ and disassemble the ‘dex’ files to ‘smali’ code.
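Whether a manifest is still in its binary form can be checked before deciding which tool to use. The Python sketch below assumes that a binary XML file starts with a chunk header whose type is 0x0003 and whose header size is 8, i.e. the little-endian bytes 03 00 08 00; the function name is hypothetical and the check is a heuristic rather than a full parser.

```python
import zipfile

# Assumed first four bytes of Android binary XML:
# chunk type 0x0003 followed by header size 0x0008, both little-endian.
BINARY_XML_MAGIC = b"\x03\x00\x08\x00"


def manifest_is_binary_xml(apk_file) -> bool:
    """Return True if AndroidManifest.xml looks like Android binary XML."""
    with zipfile.ZipFile(apk_file) as apk:
        with apk.open("AndroidManifest.xml") as manifest:
            return manifest.read(4) == BINARY_XML_MAGIC
```

If the function returns True, the manifest has to be decoded (e.g., with apktool) before it can be read as text.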

2.3 Android Applications

An Android application has to define a package name, for instance, ‘com.android.chrome’ or ‘com.facebook.katana’.

This package name acts as a unique identifier, which implies that no two applications can have the same

package name, either on the Google Play Store or on the Android device.

There are three types of structures that together create an Android application: Activities, Tasks and

Processes [30].

• Activities are discrete chunks of functionality that encapsulate a specific behaviour and an execution

context.

• A Task is a collection of Activities which allows a higher abstraction model to group several be-

haviours that work together.

• Processes are standard Linux processes where Activities from the APK are executed. By default

the APK runs in one process with a single thread.

Most programming languages require that developers implement an entry point for the application, often called the ‘main’ function. Android applications have no such single entry point; instead, the Android system initiates code in an Activity instance by invoking specific callback methods that correspond to specific stages of its life-cycle [34].

Android applications are built as a combination of components:

• Activity: Represents a single screen with a user interface that acts as an interaction point with

the user. Developers can define which Activity is the main one, which is the first screen to appear

when the user launches the application. It is also possible to allow external applications to start a

specified Activity. Activities have their own life-cycle, as shown in figure 2.6.

• Fragment: Represents a behaviour or a portion of the user interface within an activity. Fragments

were introduced in Android 3.0 (API level 11).


• Service: Designed to perform an action in the background for some period of time. Services do not

provide a user interface.

• Broadcast Receiver: Application component in charge of responding to system-wide events. It has

a well-defined entry point, similar to what we find in an Activity. The system can deliver these

events even to applications that are currently not running. Example of events: reception signal

change, battery charging, received an SMS, enabled Wi-Fi, etc.

• Content Provider: Manages a shared set of application data. It includes a high-level Application

Programming Interface (API) to access data so that other applications and services can interact

with the stored data. This component type abstracts the storing mechanism so it can be modified

without many changes in the code. The storing mechanism most often employed is an SQLite

database (file-based).

Each component has its own life-cycle methods, which are called by the Android system to start, stop or resume the component.

On Android systems, applications only have direct access to their own data, and interacting with other resources requires those resources to have explicitly exposed APIs.

Android creates a unique user ID for each application and runs them in separate processes. Conse-

quently, each application can only access its own resources. This protection is often called sand-boxing

and it is enforced by the Linux kernel. It is widely used in many OSes to offer security through isolation

of the processes running the applications. It allows precise control over resources and applications. For

instance, a crashing application does not affect other applications running on the device. At the same

time, the Android Runtime controls the maximum number of system resources allocated to applications,

preventing any one application from monopolizing too many resources.

One way to enable interaction and data sharing between applications on the same device is to configure them so that they share the same user ID. This can be done by specifying the ‘android:sharedUserId’ attribute in ‘AndroidManifest.xml’.

Android also provides signature-based permissions enforcement, so that an application can expose

functionality to another application that is signed with a specified certificate. By signing multiple APKs

with the same certificate and using signature-based permissions checks, your applications can share code

and data in a secure manner [32].

The two main mechanisms available to share data between applications are Intents and Inter-Process

Communication (IPC).

Intents are not designed for long exchanges of information; instead, they allow applications to publish and subscribe to various kinds of events through which data can be shared.

Figure 2.6: Diagram of the Activity life-cycle [5]

An Intent is a messaging object developers can use to request an action from another application component. Although intents facilitate communication between components in several ways, there are three fundamental use cases [35]:

• Starting an activity

• Starting a service

• Delivering a broadcast

Please note that each component has its life-cycle. For instance, it is possible that an Intent to start

an Activity has no effect because the target Activity was already running.

There are two types of intents:

• Explicit intents specify which application will satisfy the intent, by supplying either the target

application’s package name or a fully-qualified component class name. Developers typically use an

explicit intent to start a component in their own application, because they know the class name of

the activity or service they want to start. For example, starting a new activity within the same

application in response to a user action, or starting a service to download a file in the background.

• Implicit intents do not name a specific component, but instead declare a general action to perform

and, optionally, some data, which allows a component from another application to handle the event.

For example, if developers want to show the user a location on a map, they can use an implicit

intent to request that another capable application show a specified location on a map.
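The difference between the two intent types can be sketched with a toy model. The Python code below is not the Android resolution algorithm (which also takes categories and data into account); it only shows that an explicit intent names exactly one component, whereas an implicit intent is matched against the actions declared by every installed component. All class, function and package names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Component:
    package: str
    class_name: str
    actions: frozenset = frozenset()   # actions declared in intent filters


@dataclass(frozen=True)
class Intent:
    action: Optional[str] = None                  # set for implicit intents
    component: Optional[Tuple[str, str]] = None   # (package, class) if explicit


def resolve(intent: Intent, installed: list) -> list:
    """Return the components that could receive the given intent."""
    if intent.component is not None:
        # Explicit intent: only the named component can match.
        return [c for c in installed
                if (c.package, c.class_name) == intent.component]
    # Implicit intent: any component declaring the action is a candidate.
    return [c for c in installed if intent.action in c.actions]
```

In this model an implicit intent may resolve to components from several applications, which is exactly why the receiver of an implicit intent cannot be known in advance.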

To ensure that an Android application is secure, developers should make sure to always use an explicit

intent when starting a Service and never declare intent filters for their services. Using an implicit intent

to start a service is a security hazard because we can not be certain what service will respond to the

intent, and the user is unable to see which service has started. With the release of Android 5.0 (API level

21), the system throws an exception if developers call ‘bindService()’ with an implicit intent [35].

Intent filters are a very powerful and important feature of the Android platform. They provide the

ability to launch an activity based not only on an explicit request, but also an implicit one. For instance,

an explicit intent might tell the system to “Start the Send Email activity in the Gmail app”, while an implicit intent tells the system to “Start a Send Email screen in any activity that can do such a job”.

When the Android system UI asks a user which application to use in performing a task, that is an intent

filter at work [35].

Activities that developers do not want to make available to other applications should have no intent

filters, and developers can start them in their own application using explicit intents.

Due to the application sand-boxing on Android systems, one process can not normally access the

memory of another process [36]. If developers are implementing a service that will be used by different

applications there is the possibility to use Android Interface Definition Language (AIDL) and define a


programming interface that both the client and service agree upon in order to communicate with each

other using IPC. IPC features allow applications to exchange signals and data securely. Instead of

relying on the default Linux IPC methods, Android’s IPC is based on Binder, a custom implementation

of OpenBinder. Most Android system services and all high-level IPC services depend on Binder.

Using AIDL is necessary only if we want to allow clients from different applications to access our

service for IPC [36]. This mechanism allows applications to communicate between processes very fast

and efficiently, but it requires them to be signed using the same certificate and to specify the same shared

user ID (‘android:sharedUserId’) in the ‘AndroidManifest.xml’ [37].

2.4 Android API

Android applications are built on top of the Android framework and its huge variety of APIs. The Android framework includes many APIs from the Java world, since it is an extension of the Java SDK APIs. The

majority of these services are invoked via normal Java method calls and are translated to IPC calls to

system services that are running in the background.

Some examples of system services that can be accessed through Android APIs:

• Connectivity (Wi-Fi, Bluetooth, NFC, etc.)

• Sensors (Accelerometer, Gyroscope, etc.)

• Geolocation (GPS)

• Cameras

• Microphone

The Android framework also offers common security functions, such as cryptography, integrity and

anti-tampering checks. With every new Android release, the API specification changes. Critical bug fixes

and security patches are usually applied to earlier versions as well.

Noteworthy API versions and some of their relevant security features:

• Android 4.2 Jelly Bean (API 17) in November 2012: introduction of SELinux.

• Android 4.3 Jelly Bean (API 18) in July 2013: SELinux became enabled by default.

• Android 4.4 KitKat (API 19) in October 2013: several new APIs and ART introduced.

• Android 5.0 Lollipop (API 21) in November 2014: ART used by default and many other features

added.

• Android 6.0 Marshmallow (API 23) in October 2015: users can revoke permissions at any time and grant individual permissions at runtime, rather than all or nothing during installation.


• Android 7.0 Nougat (API 24-25) in August 2016: new JIT compiler on ART and added v2 signing

scheme of APKs.

• Android 8.0 Oreo (API 26-27) in August 2017: improved security in WebView APIs and added new

permissions related to telephony.

• Android 9 Pie (API 28) in August 2018: limited access to sensors in background, privacy and

security improvements.

Google provides a massive amount of documentation about Android’s APIs online:

• https://developer.android.com/docs/

• https://developer.android.com/guide/

• https://developer.android.com/reference/

Applications must define which API-level they target. This functionality prevents application com-

patibility issues when the Android APIs are updated. API levels basically allow developers to opt-in to

new features, knowing that they have changed their applications to deal with any new changes [14].

Since Android APIs are used for interacting with every critical aspect of the mobile device, from a RE perspective it is really important that we have a good understanding of how an application employs them. In our proposed solution we explain how security researchers can monitor and tamper with calls to the Android APIs, what these calls do, and how they can become vulnerable.

2.5 Android Permissions

The permissions system is another core Android security feature that helps users understand the capabilities

of the applications they’re installing. This is a security measure that allows the user to identify when

unnecessary permissions are being requested. For instance, if a game unnecessarily requests permission

to send SMS, the user should probably avoid installing it [38].

Applications running on Android can not access user information and system components (such as

the camera and the microphone) until they request appropriate permissions. Android provides a system

with a predefined set of permissions for certain tasks that the application can request. For example, if we want our application to use a phone’s camera, we have to request the ‘android.permission.CAMERA’ permission.

Prior to Android 6.0 Marshmallow (API 23), all permissions an application needs were requested

at installation. From Android 6.0 onwards, users are allowed to individually block or grant permission

requests during application’s execution.


Each permission predefined in Android system is associated to a group ID in the Android OS. If the

permissions an application requested are granted, the corresponding group ID is added to the application’s

process. For instance, consider that the user ID of an application is 10177 and the application requested

the permission ‘android.permission.INTERNET’. When the permission is granted, the user ID 10177 will

be added to the group ID 3003 (inet) that corresponds to the permission requested.

2.6 Signing an APK

On Android, application signing is the first step to placing an application in its application sandbox. The

signed application certificate defines which user ID is associated with which application. Application

signing ensures that one application cannot access any other application except through well-defined

IPC [39].

In order to sign an APK, developers need to generate a public-key certificate, which contains the

public key of a public/private key pair as well as some other meta-data identifying the owner of the key

(e.g.: name and location). The owner of the certificate holds the corresponding private key that should

be kept private [32].

The public-key certificate used serves as a fingerprint that uniquely associates the APK to the de-

veloper that holds its corresponding private key. This provides a proof of authenticity which helps the

Android system ensure that any future updates to the application come from its original author. Com-

promising the private key would allow an attacker to deploy a malicious version of the application as an

update over an existing install.

Basically, there are two ways for developers to manage signing keys: either opt in to Google Play App Signing, which securely manages and stores the signing keys, or manage and secure their own keystore and signing keys.

Android Studio massively simplifies the process of generating and signing an APK.

In addition to Android Studio, we discovered various tools that can simplify and even automate the

APK signing process:

• Appium’s ApkSign [40] can be used to automatically sign an APK with the Android test certificate.

• AppMon [18] provides an APK builder module that, among other things, can package and sign

APKs.

• DEX2JAR [24] includes ‘d2j-apk-sign’ which has the ability to automate the APK signing process.

If we want to manually sign an APK, we need a few Command-Line Interface (CLI) tools: ‘keytool’, which is included in the Java Development Kit (JDK), and ‘zipalign’ and ‘apksigner’, which are part of the Android SDK Build Tools.


First we need to generate a keystore using ‘keytool’. Then we ‘zipalign’ the APK, ensuring that the application’s uncompressed data starts at a predictable offset inside the APK. Developers are required to ‘zipalign’ their APKs in order to publish them in Google’s Play Store. Afterwards we can sign the APK using ‘apksigner’.

Listing 2.1: Signing an APK

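# Generate a keystore holding a 2048-bit RSA key pair
keytool -genkey -v -keystore my-release-key.jks -keyalg RSA -keysize 2048 \
    -validity 10000 -alias my-alias
# Align the uncompressed data of the APK on 4-byte boundaries
zipalign -v -p 4 app-unsigned.apk app-unsigned-aligned.apk
# Sign the aligned APK with the key stored in the keystore
apksigner sign --ks my-release-key.jks --out app.apk app-unsigned-aligned.apk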

APK signing is based on the signed JAR model and has been a part of Android from the beginning. JAR

signing (v1 scheme) does not protect some parts of the APK, such as ZIP meta-data. Android’s APK

validation process needs to analyse untrusted data structures and then discard data not covered by the

signatures, which offers a sizeable attack surface.

Android 7.0 introduced a new APK signature (v2 scheme) that ensures all content in the APK is

hashed and signed. The resulting Signing Block is inserted into the APK so it can be validated later.

During validation, this scheme treats the APK file as a blob and performs signature checking across the

entire file. Any modification to the APK, including ZIP meta-data modifications, invalidates the APK

signature [39].
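Whether an APK carries such a Signing Block can be checked without any Android tooling. The Python sketch below assumes the documented layout in which the block sits immediately before the ZIP Central Directory and ends with the 16-byte magic ‘APK Sig Block 42’. It only detects the presence of the block; it does not verify any signature, and the same block may also carry data for newer signature schemes. The function name is hypothetical.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"          # ZIP End of Central Directory record
APK_SIG_BLOCK_MAGIC = b"APK Sig Block 42"


def has_apk_signing_block(apk: bytes) -> bool:
    """Return True if an APK Signing Block precedes the Central Directory."""
    # Simplification: take the last EOCD signature found in the file.
    eocd = apk.rfind(EOCD_SIGNATURE)
    if eocd < 0:
        raise ValueError("not a ZIP archive: no End of Central Directory")
    # The Central Directory offset is a 32-bit field at EOCD + 16.
    (cd_offset,) = struct.unpack_from("<I", apk, eocd + 16)
    if cd_offset < len(APK_SIG_BLOCK_MAGIC):
        return False
    magic = apk[cd_offset - len(APK_SIG_BLOCK_MAGIC):cd_offset]
    return magic == APK_SIG_BLOCK_MAGIC
```

An APK for which this returns False can at most be signed with the JAR-based v1 scheme.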

Applications are also able to declare security permissions at the signature protection level, restricting

access to only allow applications signed with the same key, while maintaining distinct user IDs and

application sandboxes. A closer relationship with a shared application sandbox is allowed via the shared

user ID feature where two or more applications signed with same developer key can declare a shared user

ID in their manifest [39].

We can conclude that the current APK signing mechanism effectively provides authenticity and in-

tegrity checks, which are critical security aspects. This means that a tampered application can be easily

distinguished from an original by comparing APK signatures.

2.7 Static Analysis

Static program analysis is the analysis of computer software performed without actually executing pro-

grams. The term is usually applied to the analysis performed by an automated tool, with human analysis

being called code review.

When compiling the source code of a program into a binary executable, information such as the size of data structures or variable names is lost. This loss of information further complicates the task of


analysing the code.

Methodologies frequently used in static analysis:

• Signature based detection

• Flow graph analysis

• Taint analysis

• Behaviour based analysis

Byte-code is an intermediate representation output by programming languages to ease interpretation

and to reduce hardware and operating system dependence by allowing the same code to run cross-platform,

on different devices. Byte-code may often be either directly executed on a virtual machine, or it may be

further compiled into machine code for better performance. Since byte-code instructions are processed

by software, they could be arbitrarily complex, but are nonetheless often akin to traditional hardware

instructions. The Android platform relies on this kind of mechanism to ensure that its applications can run on various mobile devices with different hardware.

2.7.1 Signature-based analysis

The classic static analysis techniques search over the application’s byte-code for the presence of specific sequences of instructions, known as signatures. We can apply signature-based analysis to search for vulnerabilities, but the signatures used are mostly taken from previously detected malicious software that we want to protect users from. If a signature is found, it is highly probable that the application is insecure.
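In its simplest form, this kind of search is a byte-pattern scan. The Python sketch below reports the offsets at which each known pattern occurs in a blob of byte-code. The signature names and byte sequences used with it are made up for illustration, and real tools typically use far richer signature formats; the function name is hypothetical.

```python
def scan_signatures(code: bytes, signatures: dict) -> dict:
    """Return, for every matching signature, the offsets where it occurs.

    signatures maps a signature name to the byte sequence to search for.
    """
    hits = {}
    for name, pattern in signatures.items():
        offsets = []
        start = code.find(pattern)
        while start != -1:
            offsets.append(start)
            # Continue one byte further so overlapping matches are reported.
            start = code.find(pattern, start + 1)
        if offsets:
            hits[name] = offsets
    return hits
```

An empty result only means that none of the known patterns is present; it says nothing about code that has been obfuscated or that uses previously unseen techniques.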

Static analysis tools can be used to extract useful information of a program. When used proactively

it can find vulnerabilities early in the development cycle. With an adequate test suite, static analysis

allows full exploration of possible program executions. Full call graphs give the analyst an overview of

what the logic flow might be and where these functions are in the code [8].

2.7.2 Taint analysis

Static taint analysis is a popular information flow analysis technique which tracks the flow of sensitive

information from a set of sensitive sources to sensitive sinks. In this context, sources define the information

we want to protect on a mobile device (e.g., phone number, contacts, and location) and sinks define points

of unwanted information release (e.g., methods related to internet communication and SMS transmission).

If data originated from a sensitive source reaches a sink, taint tracking identifies the path from the

source to the sink as an instance of data leakage. Taint analysis can be implemented both statically and

dynamically. Examples of tools that execute static taint analysis include: FlowDroid [41], Amandroid [42],


and DroidSafe [43]. Qiu [21] provides detailed test results and a comparison between several existing taint analysis tools.
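The core idea of taint tracking can be sketched as a reachability search over a data-flow graph. The Python code below is a toy model under strong assumptions: the graph is given explicitly as a dictionary of direct flows, whereas the tools cited above have to derive it from byte-code. It is not how any of those tools is implemented, and the function name and node names are illustrative only.

```python
from collections import deque


def find_leaks(flows: dict, sources: list, sinks: set) -> list:
    """Return one path for every sink that is reachable from a source.

    flows maps a node to the nodes that directly receive its data.
    """
    leaks = []
    for source in sources:
        queue = deque([[source]])
        visited = {source}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                # Tainted data reached a sink: report the path as a leak.
                leaks.append(path)
                continue
            for successor in flows.get(node, ()):
                if successor not in visited:
                    visited.add(successor)
                    queue.append(path + [successor])
    return leaks
```

For example, with flows from ‘getDeviceId’ through two variables into ‘sendTextMessage’, the function returns that single path as an instance of data leakage, while flows that never reach a sink are ignored.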

2.7.3 Behaviour-based analysis

Most behaviour-based analysis frameworks can recognize malware in Android applications by analysing

the number of times each system call has been issued during the execution of an action that requires user

interaction [44]. This methodology has some similarities with a signature based approach, but instead of

searching for patterns associated with malware in code, it simply monitors the behaviour of the application

with respect to system calls. The approach is based on the premise that a genuine application will differ

from its compromised version, since it issues different types and a different number of system calls.

Andromaly [45] and Crowdroid [44] are two frameworks that rely on machine-learning techniques to perform a behaviour-based analysis. They work by collecting a list of features (e.g., Android system calls, sensor data, hardware usage details) that describe both the mobile device and the user’s behaviour. Features are then fed into a machine-learning algorithm in order to train it for malware/virus detection.
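The premise that a compromised application issues a different mix of system calls can be illustrated with a much simpler stand-in for the machine-learning step. The Python sketch below counts how often each system call appears in a trace and compares two such profiles using a plain Euclidean distance and a fixed threshold. This is an assumption made for illustration and not the algorithm used by Andromaly or Crowdroid; the function names are hypothetical.

```python
import math
from collections import Counter


def syscall_profile(trace: list) -> Counter:
    """Count how many times each system call appears in a trace."""
    return Counter(trace)


def profile_distance(a: Counter, b: Counter) -> float:
    """Euclidean distance between two system-call frequency profiles."""
    names = set(a) | set(b)
    # A Counter yields 0 for system calls it has never seen.
    return math.sqrt(sum((a[n] - b[n]) ** 2 for n in names))


def looks_anomalous(baseline: Counter, observed: Counter,
                    threshold: float) -> bool:
    """Flag a trace whose profile is further than threshold from baseline."""
    return profile_distance(baseline, observed) > threshold
```

A trace of the genuine application would serve as the baseline, and a repackaged version that additionally opens network connections would show up as a larger distance. Choosing the threshold is the hard part, which is where the learning algorithms used by the cited frameworks come in.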

2.7.4 Challenges to static analysis

Analysing binaries brings along intricate challenges. Consider, for example, that most malware attacks hosts executing instructions in the IA32 instruction set (32-bit version of x86). The disassembly of such programs might produce ambiguous results if the binary employs self-modifying code techniques. Additionally, malware relying on values that cannot be statically determined (e.g., current system date, indirect jump instructions) hinders the application of static analysis techniques.

A significant disadvantage of a static analysis is that it may suffer from false positives. A great deal

of thinking and experimentation can go into the design of a static analysis abstraction, but the problem

of soundly and precisely identifying security violations is undecidable. This means that in the worst case,

false positives will still be reported no matter how precise we make our analysis technique [46].

Standard challenges that complicate the construction of static analysis systems are scaling to large

applications and maintaining precision in the analysis such that it does not report too many flows that

do not actually exist in the application. One particularly prominent issue with developing static analyses

for Android applications is the size, richness, and complexity of the Android API and runtime [43].

Because sensitive flows are often generated by complex interactions between the Android application,

API, and runtime, any static analysis must work with an accurate model of this runtime to produce

acceptably accurate results. This can be especially challenging if the analysis takes into consideration

Java’s reflection mechanism which is also present in Android.

Accuracy is critical for a static analysis seeking to calculate security properties of an application.

Imprecision in the model used to perform the analysis could lead to results that are unusable due to too


many false positives.

2.8 Dynamic Analysis

Dynamic analysis or behaviour-based detection involves running the application in a controlled and

isolated environment in order to analyse its execution traces [44].

Dynamic Binary Instrumentation (DBI) is a technique designed to inject foreign code into existing

binaries, enabling behaviour modifications and runtime information collection. This foreign code is known

as instrumentation code and it executes as part of the normal instruction stream after being injected. A

good overview of automated dynamic malware analysis techniques is provided by Egele [8].

Instrumentation is not the same thing as exploiting, since code injection does not happen via previously

discovered vulnerabilities. It is also not the same thing as debugging, since you are not attaching a debugger

to the binary, although you can do very similar things.

Using a DBI framework, security researchers can do things like:

• Access process memory.

• Overwrite functions while the application is running.

• Call functions from imported classes.

• Find object instances on the heap and use them.

• Hook, trace and intercept functions.

One of the most fundamental aspects of DBI is monitoring function calls. While the use of functions

enables easy code re-usability and simplifies maintenance, the property that makes functions interesting

for program analysis is that they are commonly used to abstract from implementation details to a

semantically richer representation. For instance, one does not need to understand how a cryptographic

encryption algorithm works to understand that a call to a certain function converts a certain input

parameter to cipher-text. Such abstractions help to understand the overall behaviour of the program [8].

One possibility to monitor what functions are called by a program is to intercept these calls. The program

is instrumented in a way that in addition to the intended function, a so-called hook function is invoked.

This hook function implements the required analysis functionality, such as recording its invocation to a

log file, or analysing its input parameters.
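
The interception idea described above can be illustrated with a minimal Python sketch. It is not how a DBI framework injects code into a binary; it only shows the principle of a hook function that records the invocation and its parameters before handing control to the intended function. The names `hook`, `call_log` and the toy `encrypt` function are hypothetical.

```python
import functools

call_log = []  # stand-in for the log file mentioned above


def hook(func):
    """Wrap func so that every call is recorded before the original runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append((func.__name__, args, kwargs))  # analysis functionality
        return func(*args, **kwargs)                    # intended function
    return wrapper


def encrypt(plaintext, key):
    """Toy XOR 'cipher', present only so that there is something to hook."""
    return bytes(b ^ key for b in plaintext)


# Installing the hook: callers keep using the same name.
encrypt = hook(encrypt)
ciphertext = encrypt(b"secret", 0x2A)
```

After the call, `call_log` holds the function name and its arguments, which is exactly the kind of semantically rich information a hook is meant to expose.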

It is arguable that one could also do all of the above using a debugger, but some applications

employ anti-debugging checks that may be cumbersome to circumvent. Using a code instrumentation

framework, security researchers can quickly start experimenting, even with black-box processes [47].

2.9 Static Analysis vs Dynamic Analysis

To achieve our objective of building a powerful and comprehensive framework for automated analysis of

Android applications we decided to create a symbiotic relationship between static analysis and dynamic

analysis, the two main categories of RE methods.

Static analysis, mostly used by anti-virus companies, is often based on the inspection of source code or

binaries, looking for suspicious patterns. We believe static analysis can be used to improve the efficiency

of dynamic analysis techniques, e.g., static analysis can remove redundant checks, generate customized

hooks and focus the scope of the analysis [48].

Scanning ‘dex’ byte-code, disassembled ‘smali’ code, or even decompiled Java code, using regular

expressions can be slow when analysing an application with several megabytes of code. This underlines

the importance of developing comprehensive tests that retrieve important and useful information

for dynamic analysis, without taking too much time or resources.
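
One way to limit the cost of regular-expression scanning is to compile all checks into a single pattern and read each file only once. The sketch below assumes two hypothetical checks written against ‘smali’ method references; the check names and patterns are illustrative and are not taken from any existing tool.

```python
import re
from pathlib import Path

# Hypothetical checks: name -> pattern matching a 'smali' method reference.
CHECKS = {
    "webview_javascript": r"Landroid/webkit/WebSettings;->setJavaScriptEnabled\(Z\)V",
    "runtime_exec": r"Ljava/lang/Runtime;->exec\(",
}

# One alternation of named groups, compiled once and reused for every line.
COMBINED = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in CHECKS.items())
)


def scan_lines(lines):
    """Yield (check name, line number) for each line that matches a check."""
    for number, line in enumerate(lines, start=1):
        match = COMBINED.search(line)
        if match:
            yield match.lastgroup, number


def scan_smali(root):
    """Apply scan_lines to every '.smali' file found under root."""
    for path in sorted(Path(root).rglob("*.smali")):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for name, number in scan_lines(handle):
                yield name, str(path), number
```

Each line is searched once regardless of how many checks are defined, at the price of reporting only the first matching check per line.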

Although static analysis is very powerful, virus authors and even companies interested in keeping their

proprietary code secret have developed various obfuscation techniques that can be especially effective

against static analysis [49].

Dynamic binary instrumentation has several strong points, most noticeably that it avoids having to

recompile or relink, and that it can discover code at runtime and analyse dynamically-generated code [47]. This

kind of dynamic analysis can be particularly useful in situations where the application’s code was heavily

obfuscated and pure static analysis may not achieve acceptable results.

Using a tool like Frida [50], or other dynamic binary instrumentation framework, it might be possible

to trick the application into decrypting important obfuscated strings for us. We may even be able to

isolate the code responsible for decrypting obfuscated strings and then apply it to the obfuscated strings

uncovered during the static analysis of the application.

Instrumenting a large set of applications to check for vulnerabilities can be tricky to execute. Even

with tools like Frida that can be programmed to automate certain instrumentation operations, dynamic

analysis usually requires considerable manual work. It is very time-consuming to install each APK, run it, and

manually test it to reproduce the vulnerability [37]. We attempt to address this concern in our framework

by automating most of the processes just described.

We can conclude that for developers to become proficient in reverse engineering they should master

both static and dynamic analysis, because both approaches can complement each other.

2.10 Anti-Tampering

Android developers can deploy various countermeasures to make it more difficult for third parties to tamper with

their applications. Most methods described in this section are meant to improve security for the end

user, but can be defeated relatively easily to enable reverse engineering.

One of the most common anti-tampering methods employed is root detection. The goal is to make

it a bit more difficult to run the application on a rooted device, which in turn obstructs some tools and

techniques reverse engineers like to use.

On Android, root detection also can include the detection of custom ROMs, i.e. verifying whether

the device is a stock Android build or a custom build. As with most other defences, root detection is

not highly effective on its own, but having some root checks throughout the application can improve the

effectiveness of the overall anti-tampering scheme [51].

Some other common root detection methods employed by application developers:

• File existence checks: checking for files typically found on rooted devices, such as binaries that are usually

installed once a device has been rooted.

• Executing ‘su’ and other commands: trying to run commands that are usually only available once a device has

been rooted.

• Search installed application packages: look for commonly used applications that can root devices.

• Checking for writable partitions and system directories.

• Testing for custom Android builds.
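
The checks listed above are normally implemented in Java, Kotlin or native code inside the application. The following Python sketch only illustrates their logic; the paths, package names and the ‘test-keys’ heuristic are commonly cited examples chosen for illustration, not an exhaustive or authoritative list.

```python
import os

# Commonly cited locations of the 'su' binary (illustrative subset).
SU_PATHS = ["/system/bin/su", "/system/xbin/su", "/sbin/su"]
# Package names of well-known root management applications (illustrative subset).
ROOT_PACKAGES = {"com.topjohnwu.magisk", "eu.chainfire.supersu"}


def root_indicators(installed_packages, build_tags, exists=os.path.exists):
    """Return the list of root indicators found on the device."""
    findings = []
    if any(exists(path) for path in SU_PATHS):
        findings.append("su binary present")
    if ROOT_PACKAGES & set(installed_packages):
        findings.append("root management app installed")
    if "test-keys" in build_tags:  # custom builds are often signed with test keys
        findings.append("custom build (test-keys)")
    return findings
```

The `exists` parameter is injectable so that the logic can be exercised without a real device.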

Verifying the application’s signing certificate at runtime is a technique widely used to obstruct RE.

It essentially consists in validating whether the APK has been signed by its author with the genuine certificate.

Assuming that the certificate remains consistent, and its private key and keystore are kept private, any

third-party modification to an application implies that a different certificate has to be used to repackage

it [52].

A possible implementation of a security check that validates the APK signature consists in hard-coding

the certificate’s public key into the application and, at runtime, validating whether the signature included in the

running APK matches the hard-coded one. If the APK and hard-coded signatures do not match, we know

that a third-party repackaged our application.
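
The comparison at the heart of such a check can be sketched as follows. This is a variation on the scheme above: instead of the public key, it assumes that a SHA-256 digest of the genuine certificate is hard-coded. The certificate bytes are a placeholder, and on Android the running certificate would be obtained through the platform API rather than passed in as an argument.

```python
import hashlib
import hmac


def certificate_digest(cert_der: bytes) -> str:
    """Hex-encoded SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(cert_der).hexdigest()


# Placeholder standing in for the developer's real certificate bytes.
GENUINE_CERT = b"placeholder bytes standing in for the DER certificate"
# In a real application this would be a hard-coded string literal.
EXPECTED_DIGEST = certificate_digest(GENUINE_CERT)


def is_repackaged(running_cert_der: bytes) -> bool:
    """True when the running APK's certificate differs from the expected one."""
    actual = certificate_digest(running_cert_der)
    return not hmac.compare_digest(actual, EXPECTED_DIGEST)
```

Because the expected value lives inside the application, an attacker who can edit the APK can also edit this constant, which is why such checks are usually combined with obfuscation.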

Another simple anti-tampering technique is to check the identifier of the application that installed the

APK. Assuming that the application is only available through Google Play Store, we can use the Android

API to query if the application’s installer matches the Google Play Store identifier (com.android.vending).

Checking if the ‘debuggable’ flag is enabled at runtime is also a straightforward anti-tampering

method, as it helps prevent a debugger from being attached to the application. This check is relevant

considering how simple it is to unpack an APK, change its ‘AndroidManifest.xml’ file to enable the ‘debuggable’

flag and repackage it.
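
From the analyst's side, the same flag can be read statically. The sketch below assumes the manifest has already been decoded to plain XML (the manifest inside an APK is stored in a binary format), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the 'android:' attribute prefix in manifests.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def is_debuggable(manifest_xml: str) -> bool:
    """Return True when <application android:debuggable="true"> is present."""
    root = ET.fromstring(manifest_xml)
    application = root.find("application")
    if application is None:
        return False
    return application.get(f"{{{ANDROID_NS}}}debuggable", "false") == "true"
```

A missing attribute is treated as ‘false’, which matches the platform default for the flag.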

Typical users will not be running the application using an emulator, so it is common for developers to

have their application check their runtime environment. Using Java’s reflection mechanism, an application

can access some hidden system properties, e.g., ‘ro.hardware’, ‘ro.kernel.qemu’ or ‘ro.product.model’, and

look for known values used by emulators. This method can be used to stop applications from executing

on emulators, which usually are more convenient to use when reverse engineering an application.
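
The property comparison can be sketched as below. The sketch takes the properties as a dictionary instead of reading them through reflection, and the emulator values it looks for (‘goldfish’, ‘ranchu’, model names containing ‘sdk’ or ‘emulator’) are commonly reported examples, assumed here for illustration.

```python
# Property name -> predicate that recognises a value typical of emulators.
EMULATOR_HINTS = {
    "ro.kernel.qemu": lambda value: value == "1",
    "ro.hardware": lambda value: value in {"goldfish", "ranchu"},
    "ro.product.model": lambda value: "sdk" in value.lower()
    or "emulator" in value.lower(),
}


def emulator_indicators(properties):
    """Return the sorted names of properties whose values look emulated."""
    return sorted(
        name
        for name, looks_emulated in EMULATOR_HINTS.items()
        if name in properties and looks_emulated(properties[name])
    )
```

An application would typically refuse to run, or change its behaviour, when this list is not empty.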

It is also possible that developers implement these anti-tampering methods in native code (.so library

files specifically compiled for a hardware architecture). All these efforts are made to throw potential

attackers off track and make it harder to circumvent these security checks.

Although most of these anti-tampering techniques are easy to understand and implement, this also

means that an attacker can learn how to circumvent them.

All these checks run within the process space of an unprivileged application. It may take some time,

but all local checks can eventually be bypassed [53].

It has become a common practice to use code obfuscators in conjunction with this kind of anti-

tampering checks, because they depend mostly on hard-coded information. Code obfuscators employ

various techniques to make it harder for a third-party to find this kind of vital information.

Google has made available a system called SafetyNet to keep the Android ecosystem in check and

gather metrics on on-going attacks. This system relies on internet access, as it partially works remotely,

and provides an alternative to the hard-coded checks we previously described.

SafetyNet offers a set of Android APIs that create a profile of the device using software and hardware

information. This profile is then sent to Google for analysis where it is compared against a list of

white-listed device models that have passed Android compatibility testing [51].

We do not know exactly how SafetyNet works because it is not well documented and its behaviour

may change at any time. When the application first calls its APIs, SafetyNet’s service downloads a binary

package containing the device validation code from Google, which is then dynamically executed using

reflection [53].

SafetyNet’s Attestation API uses collected information from the device to assess its basic integrity, and

to evaluate the genuineness of the APK that holds the calling application. This service helps developers to

determine whether or not a particular device has been rooted, tampered with, or otherwise modified [54].

In addition to the Attestation API, SafetyNet also provides the following set of services:

• SafetyNet Safe Browsing API, provides services for determining whether a URL has been marked

as a known threat by Google.

• SafetyNet reCAPTCHA API, protects the application from malicious traffic.

• SafetyNet Verify Apps API, protects devices against potentially harmful applications.

In theory, to defeat SafetyNet we have to find which pieces of collected data are important. This

represents a moving target that Google can change at will. Consequently, we would have to fake data

in meaningful ways, adding much more work and uncertainty about what information is used for the

analysis [53].

Figure 2.7: SafetyNet Attestation API protocol [6]

2.11 Obfuscation

Obfuscation is the process of transforming code and data to make it more difficult to comprehend. This

process can generate syntactically different code, but semantically equivalent to the original. It is an

integral part of every software protection scheme [51].

In the previous section, we have seen some techniques that developers can implement as anti-tampering

features, but rely on the secrecy of information hard-coded in the application. Without string obfuscation,

these hard-coded values can be easily discovered and modified using various static analysis tools. This

is why obfuscation is so important and can greatly increase the difficulty of reverse engineering an

application.

Programs can be made incomprehensible, in whole or in part, in many ways and to different degrees.

It is important to keep in mind that obfuscation techniques can also be employed by bad actors trying to

mask their viruses and malware against signature-based detection.

Below we will provide details about some of the most frequently used obfuscation tools for Android

applications.

ProGuard is an open-source Java class file shrinker, optimizer, obfuscator and pre-verifier. The

shrinking step detects and removes unused classes, fields, methods and attributes. The optimization step

analyses and optimizes the byte-code of the methods. The obfuscation step renames the remaining classes,

fields, and methods using short meaningless names. These first steps make the code base smaller, more

efficient, but also harder to reverse engineer. The final pre-verification step adds validation information

to the classes, which is required for Java Micro Edition and for Java 6 and higher [55].

Tools like Android Studio [56] already integrate ProGuard, making it easily accessible to developers

that want to use it to automatically process application code during the build process.

The obfuscation step of ProGuard essentially modifies the class, method and field names to smaller

and abstract names (class A, method c, field b, etc.). This step reduces the size of the APK and strips

semantics from the code, making it harder to reverse engineer. During this step, a mapping file is generated so that

developers can translate debugging information (with obfuscated names) to match the original names

used.
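
To illustrate how such a mapping file can be used, the sketch below recovers original class names from obfuscated ones. It assumes the usual layout in which class entries are unindented lines of the form ‘original -> obfuscated:’ and member entries are indented; member mappings are ignored, and the function names are hypothetical.

```python
def parse_class_mappings(mapping_text):
    """Return {obfuscated class name: original class name}."""
    classes = {}
    for line in mapping_text.splitlines():
        # Member entries are indented and '#' starts a comment: skip both.
        if not line or line[0].isspace() or line.startswith("#"):
            continue
        original, _, obfuscated = line.rstrip(":").partition(" -> ")
        if obfuscated:
            classes[obfuscated] = original
    return classes


def deobfuscate(name, classes):
    """Translate an obfuscated class name back, or return it unchanged."""
    return classes.get(name, name)
```

Without the mapping file, which is normally kept private by the developer, an analyst only sees the short obfuscated names.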

Special attention is required if the application code, or any library included, takes advantage of Java’s

reflection mechanism. ProGuard must be configured to skip obfuscation of entities used by the reflection

code, otherwise functionality that depends on original entity names will not work properly.

DexGuard is the commercial sibling of ProGuard for Android. It can reuse ProGuard’s configuration

and because of their similarities, developers can continue exercising their knowledge and the community’s

expertise on ProGuard [57]. DexGuard optimizes, obfuscates, converts to Dalvik-VM byte-code, packages,

signs and aligns archives in a single seamless process. This optimization streamlines and speeds up the

entire build process.

Obfuscator-LLVM is a project initiated in June 2010 by the information security group of the

University of Applied Sciences and Arts Western Switzerland of Yverdon-les-Bains (HEIG-VD). The aim

of this project is to provide an open-source fork of the LLVM compilation suite able to provide increased

software security through code obfuscation and tamper-proofing [58]. Currently the Obfuscator-LLVM

includes the following features:

• Instructions Substitution: works by replacing standard binary operators (like addition, subtraction

or boolean operators) by functionally equivalent, but more complicated sequences of instructions.

This kind of obfuscation is rather straight-forward and does not add a lot of security, as it can

easily be removed by re-optimizing the generated code.

• Bogus Control Flow: modifies a function call graph by adding a basic block before the current basic

block. This new basic block contains an opaque predicate and then makes a conditional jump to

the original basic block.

• Control Flow Flattening: completely flattens the control flow graph of a program.
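
The first two transformations can be illustrated at source level. Obfuscator-LLVM applies them to LLVM intermediate code, so the Python functions below are only an analogy: they show an addition and an exclusive-or replaced by equivalent but less obvious expressions, and a bogus branch guarded by an opaque predicate that always evaluates to true.

```python
def add_substituted(a, b):
    # a + b rewritten as a - (-b)
    return a - (-b)


def xor_substituted(a, b):
    # a ^ b rewritten as (~a & b) | (a & ~b)
    return (~a & b) | (a & ~b)


def double_with_bogus_flow(x):
    # Opaque predicate: x * (x + 1) is a product of two consecutive
    # integers, hence always even, so this branch is always taken.
    if (x * (x + 1)) % 2 == 0:
        return x * 2      # original basic block
    return x * 3 + 7      # bogus block, never executed
```

A reader has to reason about the predicate to see that the second return statement is dead code, which is the effect the obfuscator is after.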

A commercial version of Obfuscator-LLVM implementing much more advanced capabilities is available

through strong.codes [58].

Strong.codes was a company active in the domain of software protection that developed

strong.protect, an evolution of the long-running research project Obfuscator-LLVM. Strong.codes was bought

by Snap Inc. and their website has recently been offline, so we are not sure whether their product “strong.protect”

is still being commercialized. Strong.protect performs advanced code obfuscation and tamper-proofing,

built on one of the most powerful compilation frameworks of the moment, and its goal is to make software piracy

much more expensive and complicated [59].

3 Proposed Solution

Contents

3.1 Design and Requirements

3.2 Framework Workflow

3.3 Framework Configuration

3.4 Static analysis

3.5 Dynamic Binary Instrumentation

We knew from the beginning we wanted to build a flexible framework capable of helping security

researchers perform a thorough analysis of Android applications using modern techniques.

The framework would need to integrate both static and dynamic analyses in a way that they could

complement each other. It also had to be easy to configure, while being fully automated to allow batch

testing of applications.

Our work focused mostly on integrating tools already employed and well documented by the mobile

security community, into a single framework, to allow a convenient way to leverage their functionalities.

Researching all the topics we have previously covered in chapter 2 allowed us to be conscious about

which tools and techniques the framework had to support in order to achieve a powerful platform to

analyse mobile applications and assess for vulnerabilities.

We decided to name the framework Android Security Framework, or DroidSF for short.

DroidSF is completely open-source and we invite everyone to contribute through its GitHub repository:


We decided to start working on top of DroidStat-X [19], a static analysis framework also built on

Python. This framework was especially attractive for us, because it was structured following the Open

Web Application Security Project (OWASP) Mobile Top 10 categories, which we also decided to follow.

Our first big task was to port the code from DroidStat-X to support the newest version of Python 3.

We also took the opportunity to fix some of the code style and ended up reviewing all the code in this

framework. These changes were submitted as a pull request on GitHub and were merged.

3.1 Design and Requirements

In our proposal we established some broad requirements we needed to fulfil in order to create a powerful

and capable mobile security analysis framework. The first requirement was to focus exclusively on the

Android platform to make sure we had a well defined scope for our security analysis. The second re-

quirement was that the whole analysis process had to allow full automation in order to be able to cope

with the significant number of applications released every day. The last requirement established that the

framework would have to be able to identify some of the most common vulnerabilities found in Android

applications. These requirements distinguish our framework from existing ones (e.g., ApkX [60], Objection [61]

and AppMon [18] - see section 4.2), and were established to help shape its design process.

We wanted to create a useful and attractive framework for developers and security researchers to

build upon. To achieve this goal, we spent a considerable amount of time laying out the most important

design aspects we would have to follow.

3.1.1 Design Choices

Below we present the key design aspects we closely followed during the implementation of our framework:

Multi-platform: It was important for us to create a framework that could work natively on all the

major desktop platforms: Windows, Linux and MacOS. This removes the need to use virtualisation

software, which usually adds significant performance overhead and leaves fewer resources available to

perform the analysis. Frameworks with similar characteristics to ours have been developed, but

most are built to operate on Linux and have dependencies that are not available by default on

Windows. We decided to build our framework using Python because it is an interpreted, high-level,

general-purpose programming language and it is available for all the major Operating Systems (OSes).

Additionally, there are many security analysis tools built on Python, so it made sense for us to follow

this trend in order to take advantage of its vast and active community of developers.

Easy to install: Minimizing the time it takes to setup the framework was a priority for us. To

achieve this goal we tried to reduce the number of prerequisites the user has to install manually. We

also took advantage of Python’s package installer ‘pip’, which is installed by default with Python, to

automate the download and installation of required Python packages.

Batteries included: Our framework uses various third-party external tools to execute several tasks.

Since all the required tools are available for free on the internet, we decided to automate the whole

process of downloading and configuring them to run from within our framework. To respect the

multi-platform design, we made sure to select external tools that can run either directly on Python

or on the Java runtime environment, as Java is also available on all major OSes. The goal was to provide

a pleasant user experience and avoid time-consuming tasks where we could.

Configurable: We wanted users to be able to easily customize the analysis and allow the definition

of profiles to avoid manually specifying configuration options on each run. In cases where more than

one external tool can be used to perform a certain task, we decided to give users the ability to choose

which one they want to use. All the configuration options can be defined through command-line

parameters or using a configuration file. Users can specify a configuration file through the command-

line to act as a profile for performing analysis on a batch of applications.

Extendable: In order to be able to keep up with new techniques, we allow custom static analysis

checks to be performed on ‘smali’ code. Users can add regular expressions to be tested through a

configuration parameter, and the results will be output in the analysis report. Our framework also

allows running custom Frida (see section 3.5.1) injection scripts during the dynamic analysis process.

Users just have to specify the location for the custom Frida script they wish to execute through a

configuration parameter.

Instrumentation script templates: To take advantage of the information collected during static

analysis we needed a convenient way to customize the instrumentation scripts. We developed a simple

template system that essentially replaces predefined place-holders in the scripts with information

obtained during the static analysis. Another feature we implemented in the template system enables

users to build a script that includes other scripts, allowing a modular approach to scripts.

Fully automated: We knew that one of our major challenges was the automation of the dynamic

analysis, since static analysis is naturally an automated process. Taking advantage of Python’s ability

to easily interact with the native OS, we managed to automate almost every step of the binary

instrumentation we use to perform the dynamic analysis. From setting up the device, to running the

instrumentation script on the application, it can all be done without user interaction.
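
As an illustration of the template idea described under "Instrumentation script templates", the following sketch replaces placeholders and resolves includes. The ‘{{name}}’ placeholder syntax and the ‘// include file’ directive are assumptions made for this example and do not necessarily match the syntax used by DroidSF.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")  # hypothetical syntax: {{name}}
INCLUDE = re.compile(r"^//[ \t]*include[ \t]+(\S+)[ \t]*$", re.MULTILINE)  # hypothetical


def expand_includes(name, scripts, _seen=()):
    """Recursively inline every '// include <script>' line of scripts[name]."""
    if name in _seen:
        raise ValueError(f"circular include: {name}")
    return INCLUDE.sub(
        lambda match: expand_includes(match.group(1), scripts, _seen + (name,)),
        scripts[name],
    )


def render(name, scripts, values):
    """Inline includes, then fill placeholders with static analysis results."""
    source = expand_includes(name, scripts)
    return PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), source)
```

Here `scripts` maps script names to their source text and `values` holds the information collected during static analysis, for example the package name of the application under test.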

3.1.2 Requirements

As some of our design decisions may have already indicated, our framework requires users to manually

install the following software:

• Android Studio

• Android Software Development Kit (SDK)

• Java Development Kit (JDK)

• Python 3 and its package installer: ‘pip’

We provide more details about the setup and configuration steps in our GitHub repository.

Android Studio is Google’s official Integrated Development Environment (IDE) for authoring Android

applications but it also automates most of the steps required to open, decode, disassemble and decompile

an Android Package Kit (APK) [56].

The major advantage of using Android Studio is that it provides a user-friendly, powerful, all-in-one

solution to manually analyse and debug Android applications. Users just have to click on ‘Profile or

debug APK’ in the starting menu, select the target APK and Android Studio will generate an organized

project containing the application’s source code files and resources.

Android Studio has built-in support for ‘smali’ code and it automatically generates ‘smali’ files from

available ‘dex’ byte-code found in the APK. We can also set breakpoints, control execution flow, monitor

objects, and many other debugging functionalities one would expect from a modern IDE, on applications

running in a connected Android device or in an Android Virtual Device (AVD) emulator.

We require Android Studio to be installed because it includes the Android SDK and allows the

creation/management of AVD emulators.

The AVD manager found in Android Studio is our recommended method to configure an emulator

capable of performing the dynamic analysis of applications.

Android SDK includes tools that interface with the Android platform, such as ‘adb’, ‘fastboot’, and

‘systrace’. Our framework depends on ‘adb’ to be able to interact with Android devices.
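
A minimal sketch of how a Python program can discover devices through ‘adb’ is shown below. It parses the output of the ‘adb devices’ command, where each attached device is listed as a serial followed by its state; the function names are hypothetical and ‘adb’ is assumed to be on the search path.

```python
import subprocess


def parse_adb_devices(output):
    """Return the serials of devices reported in the 'device' (ready) state."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Header and daemon messages never have 'device' as second field.
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def connected_devices(adb="adb"):
    """Run 'adb devices' and return the serials that are ready to use."""
    result = subprocess.run(
        [adb, "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)
```

Devices in other states, such as ‘unauthorized’ or ‘offline’, are skipped because they cannot be used for the analysis yet.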

The Android SDK depends on a number of tools from the JDK, most notably the ‘javac’ compiler used in the

first compilation step of an Android application. Usually, the JDK will be installed during the installation

of the Android SDK, which ends up satisfying another requirement of our framework.

We implemented the framework in Python, so it is only natural that Python needs to be installed

and properly configured for users to be able to execute it. We decided to use Python version 3 because,

although still very popular, Python version 2 is getting deprecated and will stop being supported in 2020.

By default, Python’s installer also installs its package manager (‘pip’), which is required to automate the

installation of Python dependencies.

3.1.3 Dependencies

A common golden rule in Software Engineering states that developers should avoid reinventing the wheel

and use what is already available instead.

Our framework depends on many third-party tools to support all the different and complex tasks we

want to perform during the analysis of an Android application.

• APKtool [33]: Can unpack and decode an APK. Produces ‘smali’ code from the ‘dex’ byte-code

found in the APK.

• dex2jar [24]: A set of tools to work with ‘dex’ files. Can be used to convert ‘dex’ byte-code into a

Java Archive (JAR) file.

• enjarify [25]: Translates ‘dex’ byte-code to equivalent Java byte-code. Outputs Java byte-code

inside a JAR file.

• CFR [27]: Java decompiler with support for modern Java features. Takes a JAR file as input and

outputs Java source code.

• Procyon [26]: Suite of Java meta-programming tools focused on code generation, analysis, and

decompilation. Its decompiler will take a JAR file and output Java source code.

• JADX [62]: Command-Line Interface (CLI) tools that decompile Android APK files into Java

source code.

• Frida [50]: Powerful, well-documented, and very popular toolkit built for dynamic instrumentation

of binaries.

All the tools mentioned above are downloaded automatically by our framework when it executes.

Downloads are cached to avoid unnecessary time delays and extra internet traffic.
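
The caching behaviour can be sketched as follows. This is a simplified stand-in written with the standard library only: the cache file name derived from a hash of the URL, and the injectable `fetch` function, are assumptions made for the example. The `force` argument plays the role of an option that overrides previously downloaded files.

```python
import hashlib
import urllib.request
from pathlib import Path


def _fetch(url):
    """Download the resource at url and return its content as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def cached_download(url, cache_dir, force=False, fetch=_fetch):
    """Return the local path for url, downloading only when needed."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: one file per URL, named after its hash.
    target = cache_dir / hashlib.sha256(url.encode()).hexdigest()
    if force or not target.exists():
        target.write_bytes(fetch(url))
    return target
```

A second call with the same URL returns the cached file without touching the network.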

For our framework to work properly it requires the following Python packages to be installed:

• AndroGuard [63]: Framework built in Python that allows analysis and manipulation of APKs.

• Frida: Python bindings to interact with instrumented processes. It offers a convenient way to

programmatically interact with Frida.

• PyElfTools: Library for parsing and analysing ELF files. It allows our framework to inspect native

libraries commonly found in APKs.

• ConfigArgParse: Library that allows configuration parameters to be read from a file.

• Requests: Library that facilitates web requests.

A more detailed guide of how to setup our framework is available in its official repository.

3.2 Framework Workflow

To be able to achieve our goal of assessing the overall security of an Android application, we integrated

various tools into a single framework to help automate the security assessment process. In this section we

will show users how to use our framework and describe the workflow we implemented to analyse Android

applications.

Listing 3.1: DroidSF framework usage

# Minimal usage
python3 script.py -a /path/to/app.apk

# Using a config file
python3 script.py -a /path/to/app.apk -cf /path/to/config.ini

# Only static analysis
python3 script.py -a /path/to/app.apk --no-dynamic-analysis

# Complete usage
python3 script.py [-h] [-cf CONFIG] [-v] -a APK_FILE
                  [-d {disabled,standard,jadx}] [-s SCRIPT]
                  [-it INSTRUMENTATION_TIMEOUT] [--force] [--force-download]
                  [--no-static-analysis] [--no-dynamic-analysis]
                  [--cache-path CACHE_PATH] [--download-path DOWNLOAD_PATH]
                  [--log-path LOG_PATH] [--output-path OUTPUT_PATH]
                  [--arch {arm,arm64,x86,x86_64}] [--device-id DEVICE_ID]
                  [--dex-converter {dex2jar,enjarify}]
                  [--java-decompiler {cfr,procyon}]
                  [--frida-version FRIDA_VERSION]
                  [--file-exclusions FILE_EXCLUSIONS]
                  [--directory-exclusions DIRECTORY_EXCLUSIONS]
                  [--custom-checks CUSTOM_CHECKS] [--java-home JAVA_HOME]
                  [--android-sdk ANDROID_SDK] [--java-xms JAVA_XMS]
                  [--java-xmx JAVA_XMX]

To take advantage of the available dynamic analysis features, users should start an AVD emulator, or

connect a rooted Android device, before executing our framework.

Performing the dynamic analysis step without a device/emulator visible to ‘adb’ is impossible, causing

the framework to exit gracefully at this step. If users choose to use a physical device, they must ensure

that USB debugging is enabled and give root access to Frida.

Below we provide an overview about each step from the workflow implemented in DroidSF.

APK analysis: We use AndroGuard to analyse the application’s manifest and extract useful

information, e.g., package name, application version and target API level, permissions, activities and

certificates.

APK unpack: Apktool is used to unpack the APK, decode its contents, and disassemble (baksmali) the ‘dex’

files to generate ‘smali’ code.

DEX decompilation: A combination of ‘dex’ converters (dex2jar/enjarify) and Java decompilers

(cfr/procyon/JADX) can be configured to generate Java source code.

Static analysis: We fully integrated DroidStat-X checks into our framework with various modifica-

tions in order to support Windows. During this step we resort to regular expressions and AndroGuard

to identify code patterns that may indicate a vulnerability.

Export report: Create a text file containing the information collected during the static analysis.

Device/emulator setup: First we push ‘frida-server’ to the device, then we install the APK

currently being tested, and finally we start the ‘frida-server’ process.

Dynamic instrumentation: We tell Frida to spawn the application on the device, then we attach

it to the application process and inject an instrumentation script that contains various hooks to methods

and classes we want to observe.

Interacting with application: During this step we allow users to interact with the application

being instrumented. This allows users to test features in the application, which might trigger hooked

functions. Users can use external tools, like AndroidViewClient and DroidBot, to completely automate

this interaction step.

Export Results: Generate a text file containing results from the instrumentation script.
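
The device setup and dynamic instrumentation steps above can be sketched with ‘adb’ commands and Frida's Python bindings. This is an illustrative outline under several assumptions, not the code used by DroidSF: the remote path for ‘frida-server’ is a conventional choice, the way root is obtained (‘su -c’) differs between devices and emulators, and the function names are hypothetical.

```python
REMOTE_SERVER = "/data/local/tmp/frida-server"  # conventional location (assumption)


def setup_commands(device_id, server_binary, apk_path):
    """Build the adb invocations for the device/emulator setup step."""
    adb = ["adb", "-s", device_id]
    return [
        adb + ["push", server_binary, REMOTE_SERVER],
        adb + ["shell", "chmod", "755", REMOTE_SERVER],
        adb + ["install", "-r", apk_path],
        # Starting the server requires root; the exact 'su' syntax varies.
        adb + ["shell", "su", "-c", REMOTE_SERVER + " &"],
    ]


def instrument(device_id, package, script_source, on_message):
    """Spawn the package, inject the script and resume the process."""
    import frida  # imported lazily so setup_commands works without Frida

    device = frida.get_device(device_id)
    pid = device.spawn([package])          # start the app suspended
    session = device.attach(pid)           # attach to the new process
    script = session.create_script(script_source)
    script.on("message", on_message)       # receives data sent by the hooks
    script.load()
    device.resume(pid)                     # let the app run with hooks in place
    return session
```

Each command list returned by `setup_commands` could be passed to `subprocess.run`, and the `on_message` callback is where results from the instrumentation script would be collected for the final report.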

3.3 Framework Configuration

We wanted to let users configure virtually every aspect of the framework, while providing a convenient

way for them to define a configuration profile, i.e., reading configuration parameters from a text file that

can be reused. Every configuration parameter available on DroidSF is shown in table 3.1.

Parameter                         Description
-h, --help                        Show help message and exit.
-cf, --config                     Configuration file.
-a, --apk-file                    APK file to analyse.
-d, --decompiler                  Decompile APK to Java source code. Default: disabled.
                                  Choices: disabled, standard, jadx.
                                  Standard uses ‘--dex-converter’ and ‘--java-decompiler’.
-s, --script                      Instrumentation script to execute. Default: class_list.js
-it, --instrumentation-timeout    Time in seconds for Frida instrumentation.
                                  Default: 0 (indefinitely)
--force                           Overrides previously generated results.
--force-download                  Overrides previously downloaded files.
--no-static-analysis              Skip static analysis checks.
--no-dynamic-analysis             Skip dynamic analysis checks.
--cache-path                      Directory where temporary files are saved.
--download-path                   Directory where downloaded files are saved.
--log-path                        Directory where log files are saved.
--output-path                     Directory where generated files are saved.
--arch                            Android device architecture. Default: x86.
                                  Choices: arm, arm64, x86, x86_64
--device-id                       Specify target device ID.
                                  Default: none - list devices interactively.
                                  Use ‘*’ to choose the first device available.
--dex-converter                   DEX to JAR converter. Default: enjarify.
                                  Choices: dex2jar, enjarify
--java-decompiler                 JAR to Java decompiler. Default: procyon.
                                  Choices: cfr, procyon
--frida-version                   Specify which Frida version to use. Default: 12.4.4
                                  Note: must match python package version.
--file-exclusions                 Ignore these paths/files on static analysis.
--directory-exclusions            Ignore these directories on static analysis.
--custom-checks                   Additional REGEX checks for ‘smali’ code.
--java-home                       Directory that contains Java binaries.
--android-sdk                     Directory that contains Android SDK binaries.
--java-xms                        Initial RAM allocated for Java VM. Default: 128m
--java-xmx                        Maximum RAM allocated for Java VM. Default: 1024m

Table 3.1: DroidSF: Configuration parameters

Our framework sets default values for all parameters except ‘-a, --apk-file’, the APK file to analyse. Every parameter can be adjusted to customize the analysis of the application.
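As a rough illustration of how a reusable configuration profile can be combined with command-line parameters, the sketch below uses Python's standard `argparse` module. It covers only a handful of the parameters from table 3.1. The `build_parser` function and the one-argument-per-line profile format (loaded with an `@` prefix) are assumptions made for this example, and are not necessarily how DroidSF implements ‘-cf, --config’.

```python
import argparse


def build_parser():
    """Build a parser for a small, illustrative subset of table 3.1."""
    parser = argparse.ArgumentParser(
        prog="droidsf",
        # Assumption: a profile is a text file with one argument per line,
        # loaded by passing '@profile.txt' on the command line.
        fromfile_prefix_chars="@",
    )
    parser.add_argument("-a", "--apk-file", required=True,
                        help="APK file to analyse.")
    parser.add_argument("-d", "--decompiler", default="disabled",
                        choices=["disabled", "standard", "jadx"])
    parser.add_argument("-it", "--instrumentation-timeout", type=int, default=0,
                        help="Seconds of instrumentation; 0 means indefinitely.")
    parser.add_argument("--arch", default="x86",
                        choices=["arm", "arm64", "x86", "x86_64"])
    parser.add_argument("--device-id", default=None)
    parser.add_argument("--no-dynamic-analysis", action="store_true")
    return parser


# Only the APK is mandatory; every other value falls back to its default.
args = build_parser().parse_args(["-a", "app.apk", "--arch", "arm64"])
print(args.apk_file, args.arch, args.decompiler)
```

Keeping the defaults inside the parser means a profile file only needs to list the values that differ from them.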

3.3.1 Default settings

The default configuration assumes that the instrumentation will run on an x86 emulator. We recommend using the AVD Manager, included in Android Studio, to create an emulator with the following characteristics:

• Nexus 5X

• x86 images: Oreo - API level 27 - ABI x86

This setup worked consistently during our tests with the current release of Frida (12.4.4). We expect other emulators and devices to work as well, as long as Frida is able to run on them.

Using just the ‘-a, --apk-file’ parameter to indicate which APK to analyse, our framework will disassemble ‘dex’ to ‘smali’, skip decompilation to Java source code, run the DroidStat-X static analysis, ask the user which device to use, install and launch the APK on the device, and inject the instrumentation script. At this point the user can manually interact with the application for as long as they want. Users have to manually terminate the instrumentation to let DroidSF process and export all the output from Frida’s instrumentation process.

3.3.2 Fully automated

To fully automate the analysis process, users must specify the following configuration parameters:

• --device-id: Use a specific device ID, or use ‘*’ to automatically select the first device available.

• --instrumentation-timeout: Set a maximum amount of time for the instrumentation process.

In contrast with the scenario previously described in Default settings, providing these two configuration

parameters ensures that the whole analysis process does not require manual intervention.
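The role of ‘--device-id’ can be illustrated with a small sketch that parses the text printed by `adb devices` and applies the ‘*’ shortcut. The function names and the sample output are hypothetical, and DroidSF's own device handling may differ.

```python
def parse_adb_devices(output):
    """Return the serials that 'adb devices' reports in the 'device' state."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Ready devices are printed as "<serial>\tdevice"; the header line
        # and offline/unauthorized entries do not match this shape.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def select_device(serials, device_id=None):
    """Pick a device without prompting; None means a prompt is still needed."""
    if device_id == "*":
        return serials[0] if serials else None
    if device_id in serials:
        return device_id
    return None


sample = ("List of devices attached\n"
          "emulator-5554\tdevice\n"
          "0123456789ABCDEF\tunauthorized\n")
print(select_device(parse_adb_devices(sample), "*"))
```

When `select_device` returns `None`, a caller would either list the devices interactively or abort, depending on whether the run is meant to be unattended.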

As mentioned in the overview of the framework’s workflow, users can leverage external tools designed to automate the interaction with the application.

During our tests we experimented with DroidBot [64], a lightweight test input generator for Android, with good results. It systematically explores the application while interacting with it in a way similar to how a human would. We decided not to incorporate DroidBot in our framework and to leave it as an optional tool that users can choose to run during the instrumentation process.

3.4 Static analysis

In this section we describe how our framework performs the static analysis of application code in search

for vulnerabilities.

We will briefly introduce Apktool and AndroGuard before going into detail about the static analysis

checks performed by DroidStat-X.

3.4.1 APKtool

APKtool is a CLI tool for reverse engineering third-party binary Android applications. It can decode

resources to nearly original form and rebuild them after making some modifications [33]. It outputs a

project-like file structure containing APK’s contents, and automates some repetitive tasks, like rebuilding

disassembled resources back to an APK.

A relevant feature of APKtool for our framework is its ability to baksmali the ‘dex’ byte-code found inside the APK and output ‘smali’ code. This process is faster than performing a full decompilation to Java source code and offers a good intermediate code representation in which we can search for code patterns.

We can inspect and modify the ‘smali’ code. It is even possible to replace whole classes by generating

‘smali’ from new Java source code. Once all the modifications are done, one can easily package the APK

back up with APKtool again. It is worth noting that the resulting APK is not signed by APKtool.

3.4.2 AndroGuard

AndroGuard is a pure Python framework to experiment and analyse Android files [63]. It supports:

• DEX / ODEX files

• APK files

• Android’s binary XML format

• Android binary encoded resources

• Disassemble DEX/ODEX byte-code

• Decompiler for DEX/ODEX files

Developers can either use the CLI or use AndroGuard purely as a library for their own tools and scripts. Below are some of the most notable CLI tools included with AndroGuard:

• androcg.py: generates call graphs.

• androdd.py: generates control flow graphs.

• androdis.py: disassembler for ‘dex’ files.

• androlyze.py: starts an IPython shell with all modules loaded.

• androgui.py: AndroGuard graphical user interface.

One of the easiest ways to analyse an APK file is to start an interactive Python shell by using ‘androlyze.py’ [65].

To load and analyse APK or ‘dex’ files, we can use ‘AnalyzeAPK(filename)’ and ‘AnalyzeDEX(filename)’ respectively. The three objects returned by these wrapper functions are: an APK object (a), a Dalvik-VM format object (d) and an Analysis object (dx).

Inside the APK object (a) you can find all information about the application, e.g., package name,

permissions, certificates, the AndroidManifest.xml and its resources.

The Dalvik-VM format (d) corresponds to the ‘dex’ file found inside the APK file. We can fetch

classes, methods or strings from the ‘dex’ file, but when analysing ‘multi-dex’ the Analysis object (dx)

should be used instead, as it contains special classes to handle these settings.

The Analysis object (dx) allows us to follow the call flow using cross-references (XREFs), which are generated for four things: Classes, Methods, Fields and Strings.

Cross-references (XREFs) work in two directions, meaning that we can navigate to the object that called the current object (xref_from), or navigate to another object that is being called by the current one (xref_to). This is a very powerful feature, since it can be used to produce function call and control flow graphs.
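The two directions of a cross-reference can be pictured with a toy index written in plain Python. This is not AndroGuard's implementation or API: the `XrefIndex` class and its method names are invented for illustration, and only method-to-method calls are modelled.

```python
from collections import defaultdict


class XrefIndex:
    """Toy bidirectional call index: who calls a method, and whom it calls."""

    def __init__(self):
        self._xref_to = defaultdict(set)    # caller -> callees
        self._xref_from = defaultdict(set)  # callee -> callers

    def add_call(self, caller, callee):
        """Record one call edge in both directions."""
        self._xref_to[caller].add(callee)
        self._xref_from[callee].add(caller)

    def xref_to(self, method):
        """Methods called by `method`."""
        return sorted(self._xref_to[method])

    def xref_from(self, method):
        """Methods that call `method`."""
        return sorted(self._xref_from[method])

    def callers_transitive(self, method):
        """Walk xref_from edges to find every method that can reach `method`."""
        seen, stack = set(), [method]
        while stack:
            for caller in self._xref_from[stack.pop()]:
                if caller not in seen:
                    seen.add(caller)
                    stack.append(caller)
        return seen


index = XrefIndex()
index.add_call("MainActivity.onCreate", "Crypto.encrypt")
index.add_call("Crypto.encrypt", "Cipher.getInstance")
print(index.callers_transitive("Cipher.getInstance"))
```

Walking the `xref_from` direction repeatedly, as `callers_transitive` does, is the basic operation behind building a call graph that ends at a sensitive API.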

We use AndroGuard in our framework essentially to retrieve information from the APK. Some of this

information will later be used to customize the dynamic analysis process.

3.4.3 DroidStat-X

DroidStat-X [19] is a Python framework that generates an XMind map with all the information gathered

and any evidence of possible vulnerabilities identified via static analysis.

The XMind map is structured following the OWASP Mobile Top 10 2016 categories. We provide an example in figure 3.1. Each category has various topics that security testers should cover, which helps to guarantee and highlight coverage. Each topic has a URL to the respective chapter in OWASP’s Mobile Security Testing Guide explaining the vulnerability and how to confirm its existence.

We built the DroidSF framework on top of DroidStat-X because of its comprehensive testing methodology and its solid collection of checks performed on the application’s ‘smali’ code. Our first steps consisted of modifying several aspects that were not compatible with the requirements we had established for DroidSF.

The first thing we decided to do was to migrate the DroidStat-X code-base to Python 3. We also took the opportunity to define a standard code style and make sure everything was uniform.

Figure 3.1: XMind Map generated by DroidStat-X

To respect DroidSF’s multi-platform design, the second big change consisted of replacing DroidStat-X dependencies with OS-agnostic alternatives. DroidStat-X depends on common software used in many Linux systems: ‘grep’, ‘sed’, ‘readelf’ and ‘dd’.

• ‘grep’ / ‘sed’: mostly used to search for regular expressions in the ‘smali’ code output by APKtool. We built equivalent functionality using only pure Python, and measured the difference in performance to make sure we were not making things slower.

• ‘readelf’ / ‘dd’: only used to extract Xamarin DLLs from native libraries. We implemented an alternative method in pure Python that only requires the PyElfTools package. To our surprise, this alternative is much faster than the original method. We also loaded the extracted DLLs in a dotNET decompiler, and they were properly disassembled.
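A minimal sketch of how a `grep`-style search over ‘smali’ files can be written with the Python standard library alone. The function name `search_smali`, the directory in the usage comment and the example pattern are illustrative assumptions, not the checks shipped with DroidStat-X or DroidSF.

```python
import os
import re


def search_smali(root, pattern):
    """Yield (path, line_number, line) for every smali line matching pattern."""
    regex = re.compile(pattern)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".smali"):
                continue
            path = os.path.join(dirpath, name)
            # Decode leniently so one odd byte does not abort the whole scan.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip("\n")


# Hypothetical usage: flag string constants that request AES in ECB mode.
# for hit in search_smali("output/app/smali", r'const-string [vp]\d+, "AES/ECB'):
#     print(hit)
```

Because the function is a generator, a caller can stop at the first hit or collect all of them, and it behaves the same on Linux, macOS and Windows.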

The last major change we made to DroidStat-X consisted of decoupling the XMind map generation. We wanted a simple textual report as output, instead of depending on external software and an SDK.

3.4.4 Implemented tests

Below we describe the information that can be obtained from the DroidStat-X module found in DroidSF:

• Package Name

• Version Name and Code

• APK file SHA256 hash

• Minimum/Target SDK Version (API level)

• Determine if the backup option is enabled

• Determine if the package is ‘multi-dex’

• Export Permissions and permission levels

• Determine APK signing scheme and used certificates

• Identify signature files

• Check for presence of secret codes in IntentFilters

• Exported Components with respective IntentFilters and Permissions

• List files contained in the APK

Technology/Framework fingerprinting:

• OutSystems: test if certain known classes are present in the application.

• Cordova: identify usage and lists used plugins.

• Xamarin: identify usage and extract its DLLs automatically.

Vulnerability testing:

• Object Usage

– WebViews loadUrl method

– Cryptography Functions

• Improper Platform Usage: Components security related checks

– Activities vulnerable to Fragment Injection

– Lack of ‘FLAG_SECURE’ or ‘android:excludeFromRecents’ in Activities

– Path Traversal in exported ContentProviders

– SQL Injection in exported ContentProviders

• Reverse Engineering: Package security related checks

– Determine if the application is debuggable.

• Improper Platform Usage: WebViews security related checks

– Usage of AddJavascriptInterface in WebViews

(on API level ≤ 16, this might indicate a Remote Code Execution vulnerability)

– Usage of Javascript enabled WebViews

– Usage of fileAccess enabled WebViews

– Usage of UniversalAccessFromFileURLs enabled WebViews

• Insecure Communication Topic: TLS security related checks

– Vulnerable TrustManagers

– Vulnerable HostnameVerifiers

– Webviews Vulnerable onReceivedSslError Method

– Direct usage of Socket without HostnameVerifier

– Determine the usage of Certificate Pinning (okHTTP and custom implementations)

– Determine the usage of NetworkSecurityConfig file (API level ≥ 24)

∗ Check if clear-text is allowed

∗ Check if Certificate Pinning is enabled

∗ Validate Certificate Pinning expiration date

∗ Determine if User CA’s are trusted

• Insufficient Cryptography: Cryptography security related checks

– Usage of AES with ECB cryptography functions

– Usage of DES or 3DES cryptography functions

– Determine the usage of Android Keystore - no usage may indicate vulnerabilities

This information is stored inside the DroidStatX class, so it can easily be accessed to complement the dynamic analysis process. Once the static analysis finishes, all the information collected is exported to a text file for later analysis.
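To make the NetworkSecurityConfig checks listed above more concrete, the following sketch inspects a network security configuration document with the standard library. It is an illustration under assumptions: the function `check_network_security_config` and the shape of its result are hypothetical, and only the `cleartextTrafficPermitted` attribute and the `pin-set` and `certificates` elements of the Android format are considered.

```python
import datetime
import xml.etree.ElementTree as ET


def check_network_security_config(xml_text, today=None):
    """Summarise risky settings found in a network security config."""
    today = today or datetime.date.today()
    root = ET.fromstring(xml_text)
    findings = {
        "cleartext_allowed": False,
        "pinning_enabled": False,
        "expired_pin_sets": [],
        "user_cas_trusted": False,
    }
    # Simplification: every element is inspected, so settings that only
    # apply inside <debug-overrides> are reported as well.
    for element in root.iter():
        if element.get("cleartextTrafficPermitted") == "true":
            findings["cleartext_allowed"] = True
        if element.tag == "pin-set":
            findings["pinning_enabled"] = True
            expiration = element.get("expiration")  # format: yyyy-MM-dd
            if expiration and datetime.date.fromisoformat(expiration) < today:
                findings["expired_pin_sets"].append(expiration)
        if element.tag == "certificates" and element.get("src") == "user":
            findings["user_cas_trusted"] = True
    return findings
```

An expired pin set is worth reporting separately, because once the expiration date has passed the platform stops enforcing the pins for that domain.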

3.5 Dynamic Binary Instrumentation

This process was the hardest to automate as it required interaction with multiple external tools and

making sure the device/emulator is properly configured.

As we have indicated in section 3.2, we integrated Frida’s powerful instrumentation framework, in

order to perform the dynamic analysis of applications. We take information obtained from the static

analysis step to generate specialized hooks and customize the instrumentation script.

3.5.1 Frida

Frida [50] is an immensely powerful toolkit, used to build scripts for dynamic instrumentation of applications. It lets security testers inject snippets of JavaScript, or even complete libraries, into native applications running on Windows, macOS, GNU/Linux, iOS, Android, and QNX.

The core of Frida is written in C and injects Google’s V8 engine into the target processes, where the injected JavaScript snippets execute with full access to memory, and can hook functions and even call native functions inside the process. A bi-directional communication channel is also established with the instrumented process, allowing users to interact with the JavaScript code running inside the target process.

Frida also provides some simple CLI tools built on top of the Frida API. These can be used as-is, tweaked to user needs, or serve as examples of how to use Frida’s Application Programming Interface (API).

There are essentially two approaches one can take to instrument Android applications using Frida.

The first approach requires users to have a rooted Android device, and consists of running ‘frida-server’ as root (it uses ‘ptrace’ internally) on the device. There are several reasons for choosing this approach, the most important one being that it does not require repackaging the APK we want to instrument. After attaching to the target application, ‘frida-server’ injects the ‘frida-gadget’ library into the memory space of the process.

The second approach does not require a rooted Android device, but requires users to repackage the APK to include the ‘frida-gadget’ library. This approach can be useful because it avoids possible side-effects on applications that implement ptracing/debugging checks, but on the other hand it can trigger checks against repackaging [66].

Since we are interested in emulators, which are necessarily rooted, and we wanted to avoid repackaging

APKs in order to instrument them, we decided our framework would follow the first approach.

3.5.2 Included instrumentation scripts

We decided to include Frida hooks capable of monitoring the following Android APIs:

• Bluetooth

• Clipboard

• Cryptography Cipher-suites

• Cryptography Hashing functions

• Database (includes SQLiteDatabase)

• File-system Input/Output

• SharedPreferences

• Local Storage

• FlagSecure

• IPC

• Networking/Communications (HTTP/HTTPS)

• System calls to ‘libc’ native library

• WebView usage

These hooking scripts can be helpful to perform call stack traces. They can even be used to develop

a behaviour-based analysis that takes into account every system call performed to produce a profile of

the application.

Taking advantage of the template system we implemented in our framework, security testers can merge various instrumentation scripts into a single script that will be injected into the application. This feature is very important since it allows scripts to be built in a modular fashion, enabling re-usability while avoiding code duplication.
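The idea of assembling one injectable script from several modular fragments can be sketched in a few lines of Python. This is not DroidSF's template system, whose syntax is not described here: `merge_scripts` and the comment markers it writes are hypothetical and only show the general approach.

```python
import os


def merge_scripts(paths):
    """Concatenate several instrumentation scripts into a single source string."""
    parts = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            body = handle.read().strip()
        # Hypothetical convention: mark where each fragment came from, which
        # helps when reading error messages raised by the merged script.
        parts.append("// --- {} ---\n{}".format(os.path.basename(path), body))
    return "\n\n".join(parts) + "\n"
```

The returned string is what would be handed to Frida as the script source, so each fragment can be maintained and tested on its own.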

We have also included in DroidSF some basic instrumentation scripts meant to be used as guides:

• class_list.js: Outputs the name of every class loaded by the application.

• some_class.js: Shows how to manipulate classes and create instances at runtime.

• change_method.js: Shows how to modify the implementation of a class method.

• anti_re.js: Overloads the ‘java.lang.System::exit()’ method to prevent applications from exiting.

• rpc.js: Example of how to build an interactive RPC script that lets users arbitrarily call its methods during the instrumentation process.

Frida’s Javascript API offers two ways for the script to send data back to our framework: ‘send(data)’

and ‘console.log(data)’. In order to get information sent by the ‘send()’ method, we have to specify a

message handler that will be responsible for parsing the data.

Since each script will have slightly different outputs, we created a mechanism that allows users to

specify which functions are responsible for handling the output, and, at the end of the instrumentation,

export the collected data.

This mechanism is illustrated in listing 3.2. Although it is fully functional, we realized it could be improved and refactored to completely abstract the instrumentation testing suite, i.e., the script, the message handlers, the data analysis and the export of relevant results.

Listing 3.2: Mechanism to handle Frida’s output

    app_class_list = []

    def parse_class_list(message, data):
        # Collect every payload sent by the script through send()
        if message['type'] == 'send':
            app_class_list.append(message['payload'])

    def export_class_list(apk):
        filename = apk.output_name + "-class_list.txt"
        droidsf.utils.export_file(apk.output_path, filename, app_class_list)
        log.info("Exported class list: %s", filename)

    on_message_handlers = {
        "class_list.js": parse_class_list,
        # ...
    }

    on_resume_handlers = {
        "class_list.js": export_class_list,
        # ...
    }

    # ...
    # Specify the message handler
    if args.script in on_message_handlers:
        script.on('message', on_message_handlers[args.script])
    else:
        script.on('message', on_message)

    # ...
    # Specify the result parsing handler
    if args.script in on_resume_handlers:
        on_resume_handlers[args.script](apk)

Listing 3.2 describes how DroidSF handles the execution of the default instrumentation script: ‘class_list.js’.

During the instrumentation process, every message sent by the script ‘class_list.js’ is parsed by ‘parse_class_list()’, and when the process terminates ‘export_class_list(apk)’ is called to analyse and export the results.

4 Evaluating the Solution

Contents

4.1 Methodology

4.2 Comparison with other tools

4.3 Selected testing applications

4.4 OWASP Methodology

4.5 Limitations

We created a framework with the ability to automate several Reverse Engineering (RE) tasks and perform a security analysis of Android applications, through the use of several existing tools. Our focus was on building a framework that could implement most of the approaches described throughout this dissertation.

Evaluating the techniques, methodologies and tools we discussed in the present report is not a straightforward task. Having no simple way to directly compare results with other frameworks with similar goals and motivations, we focused on evaluating the performance and the basic ability to perform certain tasks.

4.1 Methodology

To evaluate our framework we devised a three-step process.

The first step in our tests was to make sure that our framework had the ability to consistently disassemble, decompile and run static analysis tests.

Taking advantage of the findings produced by static analysis, the second step assessed the ability to customize the instrumentation code to the application under analysis.

Finally, we attempted to run the application in our Android Virtual Device (AVD) environment. If we were able to execute the application, the final test consisted of injecting the instrumentation code and, after some interaction with the application, exporting the results.

4.2 Comparison with other tools

Our framework, DroidSF, actively seeks to reduce the time spent setting up a working environment for testing Android applications. It takes very little time to configure, can be easily installed on Linux, macOS or Windows, and provides static and dynamic analysis capabilities.

4.2.1 Apkx

Apkx [60] is a Python wrapper for popular free ‘dex’ converters and Java decompilers. It extracts Java source code directly from the APK and can be useful for experimenting with different converters/decompilers without having to worry about ‘classpath’ settings and configuration parameters.

Apkx is the recommended tool to accompany the Open Web Application Security Project (OWASP) Mobile Security Testing Guide, but it has not been updated in a while and some of the tools included with it are outdated.

Our framework, DroidSF, could easily replace Apkx since it offers the same decompilation functionalities. It also automatically downloads new versions of the third-party tools employed, so that they remain up-to-date with current releases.

Using DroidSF instead of Apkx also allows users to configure the amount of RAM available to the Java Virtual Machine (VM) executing the third-party tools, which can be important when analysing large applications.

4.2.2 Objection

Objection [61] is yet another framework built upon Frida (see sub-section 3.5.1). It provides an interactive framework for security testing of mobile applications, with the particularity that it uses TypeScript to generate the JavaScript scripts that Frida injects. This creates an abstraction level that may hinder maintainability and produces a steeper learning curve for security testers. It effectively makes it harder for developers to start manipulating the code base and contribute to the framework.

Another downside is that Objection was designed to be used in interactive exploratory sessions, whereas we wanted a framework able to process batches of Application Package Kit (APK) files autonomously.

4.2.3 AppMon

AppMon [18] is an automated framework for monitoring and tampering with system API calls of native macOS, iOS and Android applications. It is also based on Frida (see sub-section 3.5.1) and can automate the process of unpacking the APK, adding the ‘frida-gadget’ shared library and repackaging it.

There are several sub-components of this project, each of which provides developers with shortcuts and predefined recipes to perform useful reverse engineering tasks:

• AppMon Sniffer - Intercept API calls to figure out interesting operations performed by an applica-

tion.

• AppMon Intruder - Manipulate API call data to change the application’s original behaviour.

• AppMon Android Tracer - Automatically traces Java classes, methods, its arguments and their

data-types in APKs.

• AppMon IPA Installer - Creates and installs “inspectable” IPAs on non-jailbroken iOS devices.

• AppMon APK Builder - Creates APKs “inspectable” on non-rooted Android devices.

Although AppMon is another very powerful framework for security testing of mobile applications, it does not focus exclusively on Android, and it is geared towards injecting Frida into an APK rather than taking advantage of rooted devices.

This tool supports macOS, iOS and Android and is designed to automate monitoring and tampering of system Application Programming Interface (API) calls. Taking into account everything we researched about RE techniques, it is our opinion that the broader the scope of software we are analysing, the harder it is to be precise and keep complexity low. To avoid this issue, we focused on developing a specialized framework for Android.

4.3 Selected testing applications

Most of the applications were selected through an exhaustive search, to maximize the number of implemented checks covered by our tests.

We wanted to ensure the correctness and completeness of the checks implemented in our framework. To do so, we required applications that exposed vulnerabilities and bad security practices. We searched for sample applications and found some that were built especially for the purpose of training security testers.

Intentionally vulnerable Android applications:

• InsecureBank v.2

https://github.com/dineshshetty/Android-InsecureBankv2

• PIVAA v.1

https://github.com/HTBridge/pivaa

• DVHMA-FeatherWeight v.6.3.0

https://github.com/logicalhacking/DVHMA

• DVHMA-OpenUI v.6.3.0


• Sieve v.2.3.4

https://github.com/mwrlabs/drozer/releases/

• OWASP MSTG Challenges

Small applications built as didactic examples to accompany the OWASP: Mobile Security Testing

Guide. https://github.com/OWASP/owasp-mstg/tree/master/Crackmes

These applications allowed us to assess if some features of our analysis were working properly.

For instance, we knew that ‘DVHMA-OpenUI’ was built on Apache Cordova and we wanted to check

if our framework was able to correctly detect this.


                       Disassembly   Decompilation   Static Analysis   Dynamic Analysis
InsecureBank           X             X               X                 X
PIVAA                  X             X               X                 X
DVHMA-FeatherWeight    X             X               X                 X
DVHMA-OpenUI           X             X               X                 X
Sieve                  X             X               X                 X

Table 4.1: DroidSF basic tests - April 2019
Successful: X, Failed: X

4.4 OWASP Methodology

OWASP is a worldwide not-for-profit charitable organization focused on improving the security of software

[67].

This methodology is based on OWASP’s Top 10 mobile application vulnerabilities of 2016, which many

developers use as the standard source for information on how to test the security of mobile applications.

Important areas we need to analyse in a mobile application to evaluate its overall security level are [68]:

M1 Improper Platform Usage - This category covers misuse of a platform feature or failure to use

platform security controls. It might include Android intents, platform permissions, or some other

security control that is part of the mobile operating system.

M2 Insecure Data Storage - This covers insecure data storage and unintended data leakage.

M3 Insecure Communication - This covers poor handshaking, incorrect SSL versions, weak key negoti-

ation, clear-text communication of sensitive assets, etc.

M4 Insecure Authentication - This category captures notions of authenticating the end user or bad

session management.

M5 Insufficient Cryptography - Investigate the code that applies cryptography to a sensitive information

asset. This category is for issues where cryptography was attempted, but it wasn’t done correctly.

M6 Insecure Authorization - This is a category to capture any failures in authorization (e.g.: autho-

rization decisions in the client side, forced browsing, etc.).

M7 Client Code Quality - A catch-all category for code-level implementation problems in the mobile

application client. This would capture things like buffer overflows, format string vulnerabilities,

and various other code-level mistakes in the client.

M8 Code Tampering - Covers binary patching, local resource modification, method hooking, method

swizzling, and dynamic memory modification.

M9 Reverse Engineering - Analysis of the final binary to determine its source code, libraries, algorithms,

and other assets. This may be used to exploit other nascent vulnerabilities in the application, as

well as revealing information about back end servers, cryptography constants and ciphers, and

intellectual property.

M10 Extraneous Functionality - Developers may have included hidden backdoors or other internal de-

velopment security controls that are not intended to be released into a production environment.

We ran the selected test applications through the DroidSF framework to identify potential vulnerabilities and obtained the following results:

                       M1   M2   M3   M4   M5   M6   M7   M8   M9   M10
InsecureBank           X    X    X    X    X    X    X    X    X    X
PIVAA                  X    X    X    X    X    X    X    X    X    X
DVHMA-FeatherWeight    X    X    X    X    X    X    X    X    X    X
DVHMA-OpenUI           X    X    X    X    X    X    X    X    X    X
Sieve                  X    X    X    X    X    X    X    X    X    X

Table 4.2: DroidSF findings - April 2019
Detected vulnerability: X, No vulnerability found: X

4.5 Limitations

Around 98% of Android mobile devices use ARM CPUs [69]. Because of this, a growing number of developers choose not to include native libraries compiled for x86/amd64 in the APK, which effectively prevents their applications from being executed natively on x86/amd64 CPUs. Android will even refuse to install an application whose native libraries do not match the architecture of the system it is being installed on.
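Whether an APK can run natively on a given emulator can be estimated before installation by listing the ABI directories under `lib/` inside the archive. The sketch below does this with the standard `zipfile` module. The helpers `native_abis` and `runs_natively` are hypothetical names, and the check deliberately ignores ARM translation layers that some x86 images provide.

```python
import zipfile


def native_abis(apk_path):
    """Return the set of ABIs for which the APK bundles native libraries."""
    abis = set()
    with zipfile.ZipFile(apk_path) as apk:
        for name in apk.namelist():
            parts = name.split("/")
            # Native libraries live at lib/<abi>/<library>.so
            if len(parts) == 3 and parts[0] == "lib" and parts[2].endswith(".so"):
                abis.add(parts[1])
    return abis


def runs_natively(apk_path, device_abis):
    """True if the APK has no native code or shares an ABI with the device."""
    bundled = native_abis(apk_path)
    return not bundled or bool(bundled & set(device_abis))
```

For example, an APK that only ships `lib/armeabi-v7a/` would make `runs_natively(path, ["x86", "x86_64"])` return `False`, which signals that an ARM device or an ARM system image is needed for the dynamic analysis.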

Because emulating the ARM architecture on x86/amd64 CPUs introduces severe performance losses, we conducted our tests using an AVD emulator running an x86_64 Android ROM.

If developers do not possess a rooted ARM Android device, their only option is to configure an

AVD emulator to use an Android ARM ROM and work through the very slow emulation process over

x86/amd64, while hoping that none of the processes crash.

Another limitation of our framework has to do with interacting with the target application when running the instrumentation checks. We can set up hooks on methods and classes, but we will not necessarily see them being called, because the application’s logic flow might never execute these methods.

DroidBot [64] performs an exhaustive and systematic interaction with the application, but this still

does not guarantee that its interaction with the application will trigger the hooked functions we wanted

to inspect. Ideally one could trace APIs and system calls in an attempt to identify the call stack required

to reach the desired functions, but the process is not consistent and can be very time-consuming.

5 Conclusion

Contents

5.1 Future Work

5.2 Conclusions

5.1 Future Work

We struggled to implement in DroidSF everything we wanted to. Security testing of mobile applications involves such a large variety of topics that we had to focus on implementing the essential features of our framework.

The DroidSF framework is fully functional, but there are plenty of features and improvements we did not find the time to implement. Below we suggest some that we think would complement the current framework.

• Improve the reports generated by static analysis to categorize each test following the Open Web

Application Security Project (OWASP) Top 10 structure.

• Implement more checks for static and dynamic analyses.

• Improve efficiency of dynamic analysis using the information from static analysis.

• Migrate some of the checks performed on ‘smali’ code to use Androguard’s APIs and perform

in-memory regular expression searches.

• Implement dynamic analysis hooks to detect packers and to intercept SSL communications.

• Implement static analysis checks to detect anti-tampering measures.

• Implement checks to analyse native libraries included in APKs.

• Refactor the mechanism we use to handle output from Frida’s instrumentation.

• Integrate AndroidViewClient or DroidBot directly into the framework to streamline the dynamic

analysis and to perform an exhaustive coverage of all possible interactions.

• Automate interaction with Android Virtual Device (AVD) manager: add the possibility of creating,

starting, stopping and deleting AVD emulators directly from DroidSF.

• Develop a custom lightweight Android emulator using Docker to avoid having to install Android

Studio.

• Add a web interface, using the Flask Python package, that allows users to submit APKs and check

the analysis results.
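Two of the items above, grouping findings by OWASP category and running regular expression searches in memory, can be sketched together. This is a minimal illustration and not DroidSF's implementation: the check names, the patterns, and the mapping of each check to an OWASP Mobile Top 10 (2016) category are assumptions, and the input is a plain list of strings that, in practice, might be extracted from an APK with a tool such as Androguard.

```python
import re
from collections import defaultdict

# OWASP Mobile Top 10 (2016) labels for the categories used below.
OWASP_MOBILE_2016 = {
    "M3": "Insecure Communication",
    "M5": "Insufficient Cryptography",
}

# Hypothetical checks: (check name, assumed category id, compiled pattern).
CHECKS = [
    ("cleartext-http-url", "M3", re.compile(r"\bhttp://[^\s\"']+")),
    ("weak-cipher-ecb", "M5", re.compile(r"\bAES/ECB/\w+")),
    ("weak-hash-md5", "M5", re.compile(r"\bMD5\b")),
]


def run_checks(strings):
    """Search in-memory strings and group the matches by category id."""
    report = defaultdict(list)
    for text in strings:
        for name, category, pattern in CHECKS:
            for match in pattern.finditer(text):
                report[category].append((name, match.group(0)))
    return dict(report)


def format_report(report):
    """Render the grouped findings as plain-text lines, one per finding."""
    lines = []
    for category in sorted(report):
        lines.append(f"{category} - {OWASP_MOBILE_2016[category]}")
        for name, evidence in report[category]:
            lines.append(f"  [{name}] {evidence}")
    return "\n".join(lines)


# Sample strings standing in for constants extracted from an application.
sample = ["http://example.com/login", "AES/ECB/PKCS5Padding", "https://example.com"]
print(format_report(run_checks(sample)))
```

Keeping the category id next to each pattern means a new check only needs one extra tuple, and the report layout follows the category list without further changes.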

5.2 Conclusions

The security testing of mobile applications is a relatively new area, and it has proved to be both

interesting and challenging due to its dynamic and ever-evolving ecosystem.

We feel it is of tremendous importance to keep mobile devices secure, given the value

of the data and functionality they hold. To this end, we understand the purpose behind the thorough

scrutiny that official application stores apply to every application submitted.

While working on this thesis we learned how to leverage existing tools to perform the kind of

analysis an official application store performs on submitted applications. Of course, we do not know for sure

which checks Google, Apple and other companies employ to detect vulnerable applications, but we are

confident we took a step in the right direction.

The sheer amount of information we found on topics such as the Android platform, vulnerability detection,

malware detection, reverse engineering, etc., was enormous. One of our biggest challenges was to identify

which information was relevant to this thesis.

It became clear to us during our research that, no matter how complex certain security features are,

bad actors will always try to find new ways to circumvent them. Even though the Android documentation

presents many standard security practices for developers to follow, there is always room for human error,

which reinforces the need to use an automated testing framework that can identify problems before

the application is deployed.

We feel that we were successful in creating a flexible framework, built around robust multi-platform

software, that can be extended to perform very complex tasks. We also believe that most developers, from

students to expert security testers, will find the DroidSF framework useful for conducting multi-pronged

analysis of Android mobile applications. It still has many areas that require improvement, but we are

satisfied with what we have built so far.

Bibliography

[1] Google, “Android platform guide,” https://developer.android.com/guide/platform, accessed: 10/04/2019.

[2] OWASP, “Android platform overview,” https://mobile-security.gitbook.io/mobile-security-testing-guide/android-testing-guide/0x05a-platform-overview, accessed: 20/10/2018.

[3] A. Frumusanu, “A diagram of the android runtime architecture,” https://commons.wikimedia.org/wiki/File:ART_view.png, 2014, accessed: 09/04/2019.

[4] J. Huang, “Practice of android reverse engineering,” https://www.slideshare.net/jserv/practice-of-android-reverse-engineering, accessed: 18/05/2017.

[5] Google, “Activity lifecycle,” https://developer.android.com/guide/components/activities/activity-lifecycle.html, accessed: 26/04/2019.

[6] Google, “Safetynet attestation api,” https://developer.android.com/training/safetynet/attestation.html, accessed: 26/04/2019.

[7] Ericsson, “The ericsson mobility report,” https://www.ericsson.com/en/mobility-report/reports/november-2018/key-figures, accessed: 19/03/2019.

[8] M. Egele, T. Scholte, E. Kirda, and C. Kruegel, “A survey on automated dynamic malware-analysis techniques and tools,” ACM computing surveys (CSUR), vol. 44, no. 2, p. 6, 2012.

[9] A. Moser, C. Kruegel, and E. Kirda, “Exploring multiple execution paths for malware analysis,” in 2007 IEEE Symposium on Security and Privacy (SP’07). IEEE, 2007, pp. 231–245.

[10] Symantec, “Internet security threat report vol 23,” https://www.symantec.com/content/dam/symantec/docs/reports/istr-23-executive-summary-en.pdf, 2018, accessed: 09/01/2019.

[11] Symantec, “Internet security threat report vol 24,” https://www.symantec.com/security-center/threat-report, 2019, accessed: 10/04/2019.

[12] StatCounter, “Os worldwide market share,” http://gs.statcounter.com/os-market-share/mobile/worldwide, accessed: 09/04/2019.

[13] Kaspersky Lab, “Mobile malware evolution 2018,” https://securelist.com/mobile-malware-evolution-2018/89689/, accessed: 10/04/2019.

[14] R. Amadeo, “Android 9 pie, thoroughly reviewed,” https://arstechnica.com/gadgets/2018/09/android-9-pie-thoroughly-reviewed/, Sep. 2018, accessed: 05/04/2019.

[15] D. R. Thomas, A. R. Beresford, T. Coudray, T. Sutcliffe, and A. Taylor, “The lifetime of android api vulnerabilities: case study on the javascript-to-java interface,” in Cambridge International Workshop on Security Protocols. Springer, 2015, pp. 126–138.

[16] P. Samuelson and S. Scotchmer, “The law and economics of reverse engineering,” The Yale Law Journal, vol. 111, no. 7, pp. 1607–1663, 2002. [Online]. Available: http://www.jstor.org/stable/797533

[17] C. Hertel, “Samba: An introduction,” https://www.samba.org/samba/docs/SambaIntro.html, accessed: 06/04/2019.

[18] N. D. Patnaik, “Appmon - automated framework for monitoring and tampering mobile applications,” https://github.com/dpnishant/appmon, accessed: 28/04/2018.

[19] C. Andre, “droidstat-x - android applications security analyser,” https://github.com/clviper/droidstatx, accessed: 27/04/2018.

[20] OWASP, “Tampering and reverse engineering on android,” https://mobile-security.gitbook.io/mobile-security-testing-guide/android-testing-guide/0x05c-reverse-engineering-and-tampering, accessed: 25/09/2018.

[21] L. Qiu, Y. Wang, and J. Rubin, “Analyzing the analyzers: Flowdroid/iccta, amandroid, and droidsafe,” in Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. ACM, 2018, pp. 176–186.

[22] Google, “Enable multidex for apps with over 64k methods,” https://developer.android.com/studio/build/multidex, accessed: 26/05/2018.

[23] Google, “Art and dalvik,” https://source.android.com/devices/tech/dalvik/, accessed: 26/05/2018.

[24] B. Pan, “Dex2jar,” https://github.com/pxb1988/dex2jar, accessed: 28/04/2018.

[25] Google, “Enjarify,” https://github.com/google/enjarify, accessed: 06/10/2018.

[26] M. Strobel, “Procyon java decompiler,” https://bitbucket.org/mstrobel/procyon, accessed: 03/03/2019.

[27] [email protected], “Cfr - another java decompiler,” http://www.benf.org/other/cfr/, accessed: 03/03/2019.

[28] B. Gruver, “Smali assembler/disassembler,” https://github.com/JesusFreke/smali, accessed: 06/04/2018.

[29] D. Morrill, “Inside the android application framework,” https://sites.google.com/site/io/inside-the-android-application-framework, accessed: 30/03/2018.

[30] Google, “Understand the apk structure,” https://developer.android.com/topic/performance/reduce-apk-size#apk-structure, accessed: 30/03/2018.

[31] Google, “Application fundamentals,” https://developer.android.com/guide/components/fundamentals, accessed: 26/05/2018.

[32] Google, “Sign your app,” https://developer.android.com/studio/publish/app-signing, accessed: 15/05/2018.

[33] C. Tumbleson, “Apktool repository,” https://github.com/iBotPeaches/Apktool, accessed: 24/03/2018.

[34] Google, “Introduction to activities,” https://developer.android.com/guide/components/activities/intro-activities, accessed: 26/04/2019.

[35] Google, “Intents and intent filters,” https://developer.android.com/guide/components/intents-filters, accessed: 26/04/2019.

[36] Google, “Android interface definition language,” https://developer.android.com/guide/components/aidl, accessed: 26/05/2018.

[37] Y.-C. Lin, “Androbugs framework - an android application security vulnerability scanner,” https://www.blackhat.com/docs/eu-15/materials/eu-15-Lin-Androbugs-Framework-An-Android-Application-Security-Vulnerability-Scanner.pdf, accessed: 05/04/2018.

[38] H. Lockheimer, “Android and security,” http://googlemobile.blogspot.pt/2012/02/android-and-security.html, accessed: 17/05/2018.

[39] Google, “Application signing,” https://source.android.com/security/apksigning/, accessed: 26/05/2018.

[40] appium, “objection - runtime mobile exploration,” https://github.com/appium/sign, accessed: 28/04/2018.

[41] S. Arzt, S. Rasthofer, C. Fritz, E. Bodden, A. Bartel, J. Klein, Y. Le Traon, D. Octeau, and P. McDaniel, “Flowdroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps,” ACM SIGPLAN Notices, vol. 49, no. 6, pp. 259–269, 2014.

[42] F. Wei, S. Roy, X. Ou et al., “Amandroid: A precise and general inter-component data flow analysis framework for security vetting of android apps,” in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2014, pp. 1329–1341.

[43] M. I. Gordon, D. Kim, J. H. Perkins, L. Gilham, N. Nguyen, and M. C. Rinard, “Information flow analysis of android applications in droidsafe.” in NDSS, vol. 15, 2015, p. 110.

[44] I. Burguera, U. Zurutuza, and S. Nadjm-Tehrani, “Crowdroid: behavior-based malware detection system for android,” in Proceedings of the 1st ACM workshop on Security and privacy in smartphones and mobile devices. ACM, 2011, pp. 15–26.

[45] A. Shabtai and Y. Elovici, “Applying behavioral detection on android-based devices,” in International Conference on Mobile Wireless Middleware, Operating Systems, and Applications. Springer, 2010, pp. 235–249.

[46] B. Livshits, Improving software security with precise static and runtime analysis. Stanford University, 2006, vol. 67, no. 11.

[47] L. Zhiqiang, “Dynamic Binary Instrumentation,” 2012. [Online]. Available: https://pdfs.semanticscholar.org/presentation/17b7/9b6d7f232d02073593accd00570e124bc031.pdf

[48] M. Christodorescu and S. Jha, “Static analysis of executables to detect malicious patterns,” University of Wisconsin-Madison, Department of Computer Sciences, Tech. Rep., 2006.

[49] A. Moser, C. Kruegel, and E. Kirda, “Limits of static analysis for malware detection,” in Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007). IEEE, 2007, pp. 421–430.

[50] O. A. V. Ravnas, “Frida: A dynamic instrumentation toolkit,” https://frida.re/, accessed: 27/05/2018.

[51] OWASP, “Android anti-reversing defenses,” https://mobile-security.gitbook.io/mobile-security-testing-guide/android-testing-guide/0x05j-testing-resiliency-against-reverse-engineering, accessed: 25/05/2018.

[52] S. Alexander-Bown, “Android security: Adding tampering detection to your app,” https://www.airpair.com/android/posts/adding-tampering-detection-to-your-android-app.

[53] C. Mulliner and J. Kozyrakis, “Inside android’s safetynet attestation,” https://www.mulliner.org/collin/publications/eu-17-Mulliner-Kozyrakis-Inside-Androids-SafetyNet-Attestation.pdf, accessed: 28/05/2018.

[54] Google, “Safetynet attestation api,” https://developer.android.com/training/safetynet/attestation, accessed: 28/05/2018.

[55] E. Lafortune and GuardSquare, “Proguard manual,” https://www.guardsquare.com/en/proguard/manual/introduction, accessed: 18/05/2018.

[56] Google, “Android studio,” https://developer.android.com/studio/, accessed: 05/04/2018.

[57] E. Lafortune and GuardSquare, “Dexguard,” https://www.guardsquare.com/en/dexguard, accessed: 22/04/2018.

[58] LLVM, “Obfuscator-llvm wiki,” https://github.com/obfuscator-llvm/obfuscator/wiki.

[59] strong.codes, “strong.codes sa,” https://www.linkedin.com/company/strong-codes/, accessed: 27/05/2018.

[60] B. Mueller, “Apkx - android apk decompilation for the lazy,” https://github.com/b-mueller/apkx, accessed: 27/01/2019.

[61] sensepost, “objection - runtime mobile exploration,” https://github.com/sensepost/objection, accessed: 28/04/2018.

[62] skylot, “jadx - dex to java decompiler,” https://github.com/skylot/jadx, accessed: 27/04/2018.

[63] A. Desnos and G. Gueguen, “Androguard repository,” https://github.com/androguard/androguard, accessed: 24/03/2018.

[64] Y. Li, “droidbot - lightweight test input generator for android,” https://github.com/honeynet/droidbot, accessed: 25/03/2019.

[65] A. Desnos and G. Gueguen, “Androguard documentation,” http://androguard.readthedocs.io/en/latest/index.html, accessed: 24/03/2018.

[66] J. Kozyrakis, “Using frida on android without root,” https://koz.io/using-frida-on-android-without-root/, accessed: 28/05/2018.

[67] OWASP, “Welcome to owasp,” https://www.owasp.org/index.php/Main_Page, accessed: 06/04/2018.

[68] OWASP, “Mobile top 10 2016,” https://www.owasp.org/index.php/Mobile_Top_10_2016-Top_10, accessed: 06/04/2018.

[69] Unity3D, “Android hardware stats,” https://web.archive.org/web/20170808222202/http://hwstats.unity3d.com:80/mobile/cpu-android.html, accessed: 04/04/2019.
