DroidSF - A framework for security analysis of
mobile applications
Joao Miguel Martins Nunes
Thesis to obtain the Master of Science Degree in
Information Systems and Computer Engineering
Supervisor: Prof. Pedro Miguel dos Santos Alves Madeira Adao
Examination Committee
Chairperson: Prof. Francisco Antonio Chaves Saraiva de Melo
Supervisor: Prof. Pedro Miguel dos Santos Alves Madeira Adao
Member of the Committee: Prof. Nuno Miguel Carvalho dos Santos
May 2019
Acknowledgments
I would like to thank my parents and my brother for their friendship, encouragement and caring over
all these years, for always being there for me and without whom this dissertation would not be possible.
A special acknowledgment goes to my grandparents, who unfortunately passed away earlier this year,
for teaching me to respect everyone in this world and to not take anything for granted. I will forever be
grateful to them for providing my family with all the love and support we could ask for throughout all
these years.
A special mention to Pedro Durao Lino, a very dear friend whom I miss every day, for always trying to
protect those he cared about from the unfair and terrifying life obstacles, showing nothing but endless
strength when facing adversities.
I would also like to acknowledge my thesis supervisor Prof. Pedro Adao, for his insight, patience,
support and sharing of knowledge that has made this dissertation possible.
Last but not least, to all my friends and colleagues that helped me grow as a person and were always
there for me during the good and bad times in my life. Thank you.
To each and every one of you – Thank you.
Abstract
Mobile devices, especially smart-phones, are an increasingly valuable target for bad actors as they often
hold important personal information that can potentially be exploited against their users.
With the growing number of mobile devices connected to the internet, it’s imperative that we develop
tools and document how to perform an in-depth analysis of mobile applications. We believe this knowledge
will help software developers, and even users, to be more conscious about security and implement better
code following the recommended practices.
This thesis will cover techniques and software one can use to analyse how an Android application
was built and gain insight into what it does in the background. To be able to evaluate the security of
Android mobile applications it’s imperative to understand how they are developed, assembled and how
they operate on devices at runtime. With this in mind, we will provide details about the inner workings
of the Android platform, with special attention to its security features.
A framework was produced alongside this thesis that aggregates frequently used tools to facilitate
the security analysis process. Our framework was designed to be fully automated, easy to use and
extensible. These characteristics seek to provide a good starting point for anyone who wants to analyse
the behaviour of mobile applications and develop systematic tests to help assess their overall security
level.
We’ll focus on methodologies that allow us to inspect critical components of the application as described
by the Open Web Application Security Project (OWASP) Top 10 Security Risks.
Keywords
Security analysis; Android applications; Mobile security; Reverse engineering; Static analysis; Dynamic
analysis; Binary instrumentation
Resumo (Summary)
Mobile devices, especially smart-phones, are increasingly a valuable target for malicious actors, since
they usually hold important information that can be used against their user.
With the growing number of mobile devices connected to the Internet, it is essential to have tools and
documentation that enable a complete security analysis of mobile applications. We believe
this knowledge can help programmers to be more aware of security risks
and to implement robust code following the practices recommended by the industry.
To analyse the security of an Android mobile application it is imperative to understand how
such applications are developed and how they operate on devices at runtime. With this in mind,
we will provide details about the inner workings of the Android platform, with special attention to
features related to security and privacy.
This dissertation seeks to identify techniques and tools capable of analysing applications designed
for Android mobile devices, with the goal of obtaining information about the operations they
perform in the background.
While carrying out this thesis, we implemented a platform that brings together several
frequently used tools to facilitate the process of introspection and analysis of Android applications. We
designed our platform to be fully automatic, easy to use and extensible.
Throughout the dissertation we will focus on reverse engineering techniques that allow us to inspect
the critical components of an application, as identified by the OWASP Top 10 Security Risks document.
Palavras-Chave (Keywords)
Security analysis; Android applications; Security in mobile devices; Reverse engineering; Static
analysis; Dynamic analysis; Instrumentation of executables
Contents
1 Introduction
1.1 Android platform
1.2 Reverse Engineering
1.3 Goals
2 State of the Art
2.1 Android Platform Architecture
2.1.1 Dalvik Virtual Machine
2.1.2 Android Runtime
2.1.3 What is smali?
2.2 Application Package Kit
2.3 Android Applications
2.4 Android API
2.5 Android Permissions
2.6 Signing an APK
2.7 Static Analysis
2.7.1 Signature-based analysis
2.7.2 Taint analysis
2.7.3 Behaviour-based analysis
2.7.4 Challenges to static analysis
2.8 Dynamic Analysis
2.9 Static Analysis vs Dynamic Analysis
2.10 Anti-Tampering
2.11 Obfuscation
3 Proposed Solution
3.1 Design and Requirements
3.1.1 Design Choices
3.1.2 Requirements
3.1.3 Dependencies
3.2 Framework Workflow
3.3 Framework Configuration
3.3.1 Default settings
3.3.2 Fully automated
3.4 Static analysis
3.4.1 APKtool
3.4.2 AndroGuard
3.4.3 DroidStat-X
3.4.4 Implemented tests
3.5 Dynamic Binary Instrumentation
3.5.1 Frida
3.5.2 Included instrumentation scripts
4 Evaluating the Solution
4.1 Methodology
4.2 Comparison with other tools
4.2.1 Apkx
4.2.2 Objection
4.2.3 AppMon
4.3 Selected testing applications
4.4 OWASP Methodology
4.5 Limitations
5 Conclusion
5.1 Future Work
5.2 Conclusions
List of Figures
2.1 Android Software Stack [1]
2.2 Java vs. Dalvik [2]
2.3 Diagram of the Android Runtime [3]
2.4 APK decompilation
2.5 Android Application Development Flow [4]
2.6 Diagram of the Activity life-cycle [5]
2.7 SafetyNet Attestation Application Programming Interface (API) protocol [6]
3.1 XMind Map generated by DroidStat-X
List of Tables
1.1 Ericsson Mobility Report - November 2018 [7]
3.1 DroidSF: Configuration parameters
4.1 DroidSF basic tests - April 2019 Successful: X, Failed: X
4.2 DroidSF findings - April 2019 Detected vulnerability: X, No vulnerability found: X
Listings
2.1 Signing an APK
3.1 DroidSF framework usage
3.2 Mechanism to handle Frida’s output
Acronyms
RE Reverse Engineering
APK Application Package Kit
JAR Java Archive
JDK Java Development Kit
Dalvik-VM Dalvik Virtual Machine
SDK Software Development Kit
VM Virtual Machine
OS Operating System
IDE Integrated Development Environment
CLI Command-Line Interface
opcode operation code
API Application Programming Interface
IPC Inter-Process Communication
ASLR Address Space Layout Randomisation
ART Android Runtime
AIDL Android Interface Definition Language
JIT Just In Time
DBI Dynamic Binary Instrumentation
OWASP Open Web Application Security Project
AVD Android Virtual Device
HAL Hardware Abstraction Layer
JVM Java Virtual Machine
1 Introduction
Contents
1.1 Android platform
1.2 Reverse Engineering
1.3 Goals
The Internet has become an essential part of the daily life of many people. It has evolved from a
basic communication network to an interconnected set of data sources with market places for the sale of
products and services. Services like online banking or advertising are some of the most successful areas on
the Internet for commercial purposes [8].
In an effort to support all the modern functionalities and inter-connectivity that we have come to
expect from recent mobile devices, software is getting increasingly more complex and often includes
multiple libraries from external sources. This kind of complexity and inter-connectivity increases the
risk of security vulnerabilities which, in turn, can have severe consequences to people and systems that
interact with compromised software.
Just as in the physical world, there are people on the Internet with malevolent intent who relentlessly
search for vulnerabilities to exploit so they can enrich themselves, while taking advantage of oblivious
users.
Vulnerabilities enable a variety of attacks. The analysis of these attacks can determine the severity
of damage that can be inflicted and the likelihood that the attack can be further replicated.
Software that “deliberately fulfils the harmful intent of an attacker” is commonly referred to as
malicious software or malware [9]. Malware helps bad actors to accomplish their goals and its prevalence
in third-party application stores indicates that this threat is not going away soon. Notably, in 2017 only
0.1 percent of discovered mobile malware was found on official application stores, with 99.9 percent being
hosted on third-party sites [10].
With each passing year, not only has the sheer volume of security threats to mobile devices increased,
but the threat landscape has become more diverse. The number of new mobile malware variants increased
by 54 percent in 2017, as compared to 2016 [10]. There was also a marked increase in the number of
ransomware infections on mobile devices during 2018, up by a third when compared to 2017 [11].
Attackers keep developing new methods of infection, new means of generating revenue from devices
and hacks to remain on compromised devices as long as possible. Being able to think like an attacker,
knowing their tools and having confidence that we’ve minimized the attack surface is very important.
The task of screening and validating if an application is secure can be very time-consuming and easily
overwhelms analysts that try to perform this task manually. Due to the very substantial number of
sample applications submitted for security review every day, it is paramount that we use an automated
approach to quickly differentiate between samples that deserve an in-depth manual analysis, and those
that are a variation of already known threats [8].
Investigating current methods of analysis employed by security experts to detect malware is relevant
to the context of this thesis.
Throughout chapter 2 of this thesis, we will review state-of-the-art approaches currently used by
security researchers and implemented in some anti-malware and anti-virus software.
1.1 Android platform
Android is an Operating System (OS) for mobile devices based on the Linux OS and it includes additional
system libraries, middle-ware, and a suite of pre-installed applications. Android applications, also
commonly known as ‘apps’, are mainly written in Java using a rich collection of Application Programming
Interfaces (APIs) provided by the Android Software Development Kit (SDK). Compiled code is packed
into an archive file, alongside data and resources required by the application. This archive file is known
as an Application Package Kit (APK) and, once installed on an Android device, it runs in the
Android Runtime (ART) environment.
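Because an APK is a ZIP-compatible archive, its contents can be listed with ordinary archive tooling. The following is a minimal sketch, not part of DroidSF: it builds a small stand-in archive in memory (the entry names mirror a typical APK layout, but the byte contents are placeholders and do not form a valid application) and lists it with Python’s standard ‘zipfile’ module.

```python
import io
import zipfile


def build_fake_apk() -> bytes:
    """Build an in-memory ZIP that mimics an APK layout (placeholder contents only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("AndroidManifest.xml", b"<placeholder/>")
        archive.writestr("classes.dex", b"dex\n035\x00")
        archive.writestr("res/layout/main.xml", b"<placeholder/>")
    return buffer.getvalue()


def list_apk_entries(apk_bytes: bytes) -> list[str]:
    """Return the names of all files packed inside the archive."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as archive:
        return archive.namelist()


entries = list_apk_entries(build_fake_apk())
print(entries)
```

With a real application, the same ‘list_apk_entries’ helper could be given the bytes of an ‘.apk’ file read from disk.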
We chose the Android platform because it has around 75% worldwide market share in the mobile
device space [12]. Its open-source nature was also a major factor for us, as it contributes to better
documentation, bigger developer communities and the availability of tools to interact with the Android
OS.
According to the Ericsson Mobility Report, current and forecast figures for smart-phone subscriptions
are:
Year            2017            2018            2024
Subscriptions   4 350 million   5 010 million   7 210 million
Table 1.1: Ericsson Mobility Report - November 2018 [7]
Table 1.1 shows a growing number of mobile devices connected to the Internet. Keeping each of these
devices updated and secure is a major security concern, and the task is particularly difficult
since there are many different device manufacturers and a variety of modified versions of the Android
OS.
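The growth implied by the figures in Table 1.1 can be worked out directly. The sketch below is illustrative arithmetic only: the three subscription values come from the table, while the helper names and the choice of a compound annual rate for the 2018 to 2024 forecast are our own assumptions.

```python
# Smart-phone subscriptions in millions, as listed in Table 1.1.
subscriptions_millions = {2017: 4350, 2018: 5010, 2024: 7210}


def growth_percent(start: float, end: float) -> float:
    """Relative growth from start to end, in percent."""
    return (end - start) / start * 100.0


def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate (percent) that turns start into end after `years`."""
    return ((end / start) ** (1 / years) - 1) * 100.0


yearly_2017_2018 = growth_percent(
    subscriptions_millions[2017], subscriptions_millions[2018]
)
forecast_rate = compound_annual_growth(
    subscriptions_millions[2018], subscriptions_millions[2024], years=2024 - 2018
)

print(f"2017 to 2018 growth: {yearly_2017_2018:.1f}%")  # 15.2%
print(f"Implied yearly growth 2018 to 2024: {forecast_rate:.1f}%")
```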
The Android system has evolved quite a bit from its first commercial device launch in 2008, to its
latest version 9 (codename Pie) deployed in August 2018.
The upcoming major version of Android, codenamed Q, raises the Android API to level 29 and
is already in open beta testing. It promises an OS that gives users more control over privacy and
finer authority over what applications have access to. Some of the relevant new features and changes that
might affect our work are:
• Scoped storage: new permissions and APIs for accessing files in external storage.
• More user control over location permissions.
• Improved constraints on activities launching from the background.
• New restrictions on accessing device serial and IMEI.
• Permission for wireless scanning: Wi-Fi and Bluetooth will require fine location permission.
• Ability to run embedded DEX code directly from APK.
• Executable segments of system binaries and libraries are mapped into execute-only (non-readable)
memory, as a hardening technique against code-reuse attacks.
• Calls to ‘ptrace’ are unaffected, so ‘ptrace’ debugging is not impacted.
• Applications can no longer invoke ‘exec()’ on files within their home directory.
• Restrict application in-memory modification of executable code from files which have been opened
with ‘dlopen()’. This includes any shared object (.so) files with text relocations.
Substantial changes in security features were introduced in Android’s 10+ years of existence, so we felt it
was important to investigate current system features, particularly ones that are relevant for Reverse
Engineering (RE).
It is important to note that many users continue to make life easy for attackers by continuing to
use older versions of Android. Only around 23% of devices are running the newest versions of Android
(version 8.1 codename Oreo and version 9 codename Pie) [11]. The lack of security awareness from users
is still one of the main ways devices are infected by malicious software and there has been a step-up in
the use of tried-and-tested distribution schemes like SMS and email spam [13].
With the launch of Android 9 (Pie) applications targeting older Android API levels (beginning with
Android 4.2) display a warning when launched. Google Play Store, the official application store for the
Android platform, now requires all applications to target an API level released within the past year, and
will also mandate 64-bit support in 2019 [14].
Modern OS architectures, like Android, have many built-in security features (e.g., process sand-boxing,
Address Space Layout Randomisation (ASLR), permission based access, etc.) that seek to minimize the
attack surface of applications. However, as a side effect of providing flexibility to its ecosystem of
applications and programmers, there have been plenty of vulnerabilities uncovered over the years in the
Android platform.
Fixing API vulnerabilities, like fixing deployed protocols, is often hard because fixes may require
changes to the API which break backwards compatibility. It takes nearly a year (346 days) for 50% of the
Android devices using the Google Play Store to update to a new version of Android. Full deployment to
95% of devices takes a little more than 3 years (1230 days) [15]. From the time a new release is available,
which has fixed the vulnerability, to the moment when devices are updated, there is a very big window
of opportunity for attackers.
Rooted mobile devices give users special permissions and enable capabilities that break security as-
sumptions, e.g., read private data, circumvent permissions, instrument applications. A rooted device can
become a liability in terms of security, especially for an uninformed user. Android personal devices have
a root ratio of 1 rooted device to 23 non-rooted devices, and enterprise devices have a root ratio of only
1 to 3890 [11].
1.2 Reverse Engineering
RE is the process of reconstructing the semantics of a compiled program’s source code.
This thesis will focus on several techniques and tools that can be leveraged to analyse the security
of Android mobile devices, in particular RE techniques.
The motivation behind our efforts to RE an application is solely focused on assessing its overall
security.
The legality of two common forms of RE in software, namely, decompilation and disassembly of binary
code, has been challenged on trade secret, copyright, and contract law theories. Although courts and
legal commentators have overwhelmingly supported the legality of RE, it remains somewhat in a grey
area [16].
Disassembly is the process of converting the different binary sequences into their original operation
codes (opcodes). It relies on identifying the hardware architecture and instruction set the binary was
compiled for.
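As a rough illustration of disassembly, the sketch below maps raw bytes back to mnemonics for a tiny subset of the Dalvik instruction set. It is a toy under stated assumptions: only four opcodes are handled, all of them one 16-bit code unit long, the sample byte sequence is hand-written rather than taken from a real application, and the function names are hypothetical. A real disassembler must know the length and operand layout of every opcode.

```python
# Small subset of the Dalvik opcode table: opcode byte -> mnemonic.
# Each of these instructions occupies a single 16-bit code unit (2 bytes),
# stored little-endian, so the opcode byte comes first.
OPCODES = {
    0x00: "nop",
    0x0E: "return-void",
    0x0F: "return",
    0x12: "const/4",
}


def sign_extend_4bit(value: int) -> int:
    """Interpret a 4-bit value as a signed integer in the range -8..7."""
    return value - 16 if value & 0x8 else value


def disassemble(code: bytes) -> list[str]:
    """Translate a byte sequence into one text line per recognised instruction."""
    lines = []
    for offset in range(0, len(code) - 1, 2):
        opcode, operand = code[offset], code[offset + 1]
        mnemonic = OPCODES.get(opcode)
        if mnemonic is None:
            # Without the full opcode table the instruction length is unknown,
            # so this toy stops at the first unrecognised opcode.
            lines.append(f"{offset:04x}: .unknown 0x{opcode:02x}")
            break
        if mnemonic == "const/4":
            register = operand & 0x0F                 # low nibble: destination register
            literal = sign_extend_4bit(operand >> 4)  # high nibble: signed literal
            lines.append(f"{offset:04x}: const/4 v{register}, {literal}")
        elif mnemonic == "return":
            lines.append(f"{offset:04x}: return v{operand}")
        else:
            lines.append(f"{offset:04x}: {mnemonic}")
    return lines


# Hand-written sample: load the constant 1 into v0, then return v0.
sample = bytes([0x12, 0x10, 0x0F, 0x00])
for line in disassemble(sample):
    print(line)
```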
Decompilation is the process of interpreting opcodes and attempting to generate equivalent
source code. Due to compiler optimizations and obfuscation techniques, the decompilation process will
almost always generate source code different from the original.
One of the first well-known cases of RE was the Samba project. Andrew Tridgell wrote a packet
sniffer, reverse engineered the SMB protocol and implemented it on a Unix machine. Thus, he made the
Unix system appear to be a PC file server, which allowed him to mount shared file-systems from the Unix
server while concurrently running NetBIOS applications [17]. RE was instrumental to the development
of the Samba software because no public information was available about the SMB protocol.
The Samba project’s history is a prime example that, on balance, interoperability has more beneficial
than harmful economic consequences. Hence, a legal rule permitting the RE of programs to achieve
interoperability is economically sound [16].
In the context of this thesis, we are only interested in RE to test if applications operate correctly and
do not perform malicious or unintended activities within a mobile device.
The RE process can be split into two main types of analysis:
• Static analysis allows inspection of an application without actually executing it. This process often
requires the disassembly and/or decompilation of binary code to run tests and obtain human-
readable code.
• Dynamic analysis refers to techniques that execute an application and allow inspection of its state
at various points in execution. Using this approach one can analyse the behaviour of a binary
application at runtime through the injection of instrumentation code.
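The difference between the two approaches can be sketched on a small Python program instead of an Android binary. This is only an analogy under our own assumptions: the analysed snippet, the ‘send’ function and both helper names are hypothetical. The static pass inspects the code without running it, while the dynamic pass executes it with an instrumented ‘send’ injected in place of the real one.

```python
import ast

# Hypothetical program under analysis; `send` stands in for a sensitive API.
SOURCE = '''
def greet(name):
    return send("hello " + name)

def unused():
    return send("never executed")
'''


def static_find_calls(source: str, target: str) -> int:
    """Static analysis: count call sites of `target` without executing the code."""
    tree = ast.parse(source)
    return sum(
        1
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == target
    )


def dynamic_trace_calls(source: str) -> list[str]:
    """Dynamic analysis: run the code with an instrumented `send` that records its arguments."""
    observed = []

    def instrumented_send(message):
        observed.append(message)
        return len(message)

    namespace = {"send": instrumented_send}
    exec(source, namespace)      # load the program with the injected hook
    namespace["greet"]("world")  # exercise one execution path
    return observed


call_sites = static_find_calls(SOURCE, "send")
runtime_calls = dynamic_trace_calls(SOURCE)
print(call_sites, runtime_calls)
```

The static pass reports both call sites, including the one in code that never runs; the dynamic pass reports only the call that was actually executed, but with its concrete argument value.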
1.3 Goals
The main drive for this thesis is to understand how to assess the security of mobile applications. We
want to be able to assert if the application was tampered with, check critical areas for vulnerabilities and
investigate if malicious code is present.
We will use RE techniques and explain how to implement test suites for mobile applications, closely
following the Open Web Application Security Project (OWASP) Top 10 Security Risks and the Mobile
Security Testing Guide.
Due to the nature of our work and since we will be describing how to disable basic anti-tampering
methods, we will also discuss some improvements that developers can employ to make their applications
more resilient against RE.
A major part of this thesis focuses on the development of a framework that enables developers and
researchers to analyse Android applications programmatically without requiring too much effort to
configure.
Our framework is not intended for piracy or other illegal uses. It was built and designed solely
to facilitate the security assessment of Android applications.
The framework was named Android Security Framework or DroidSF for short. It combines various
existing tools created and maintained by security experts, to allow a systematic and easy way to analyse
mobile applications.
DroidSF Github repository: https://github.com/neskk/droidsf
The DroidSF framework is completely open-source, works on multiple platforms (Windows, Linux,
MacOS) and was planned to be extensible and customizable, so that it can grow with the help of other
developers and adapt to the ever-changing technology found in mobile devices.
We built a platform where static and dynamic analyses co-exist and complement each other. This
key aspect is what sets our work apart from existing frameworks.
The DroidSF framework provides a convenient way to experiment with new RE techniques, and to analyse
applications in a fully automated way. Supporting all major OSes is another important strength of
our framework, since most of the existing frameworks, such as AppMon [18] and DroidStat-X [19], do
not run natively on Windows, targeting only Linux and MacOS environments.
Some other mobile security testing frameworks attempt to support both Android and iOS at the
same time, leading to a much greater code-base which is harder to maintain and introduces unnecessary
complexity. We decided to focus our framework just on the Android platform to attempt to minimize
these problems.
We will explain in greater detail the decisions and design process behind the development of
DroidSF in chapter 3.
2 State of the Art
Contents
2.1 Android Platform Architecture
2.2 Application Package Kit
2.3 Android Applications
2.4 Android API
2.5 Android Permissions
2.6 Signing an APK
2.7 Static Analysis
2.8 Dynamic Analysis
2.9 Static Analysis vs Dynamic Analysis
2.10 Anti-Tampering
2.11 Obfuscation
Reverse engineering and tampering techniques have long been associated with the realm of hackers and
malware analysts. For traditional security testers and researchers, Reverse Engineering (RE) has been more
of a complementary skill, but the tides are turning. Testing mobile applications increasingly requires
disassembling compiled applications, applying patches, and tampering with binary code or even live
processes. The fact that many mobile applications implement defences against RE makes things harder
for security analysts.
Reverse engineering a mobile application is the process of analysing the compiled software to extract
information about its source code with the intent of understanding how the application works [20].
Tampering is the process of modifying a mobile application, either by changing the compiled byte-code,
instrumenting the running process or its environment, in order to affect the behaviour of the application
being tested [20]. For example, it is common for an application to refuse to run on rooted devices, making
it impossible to run certain tests or use Android’s debugging functionalities. In such cases, we want to
alter the application’s behaviour.
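The root-detection scenario can be mimicked in a few lines of Python. This is a simplified analogy, not how any particular Android application or DroidSF works: the ‘App’ class, the list of ‘su’ paths and the simulated file system are all hypothetical. It shows a check that refuses to run when a ‘su’ binary is found, and how replacing that check at runtime changes the application’s behaviour.

```python
from typing import Callable

# Illustrative locations where a `su` binary is often looked for (assumption).
SU_PATHS = ("/system/bin/su", "/system/xbin/su", "/sbin/su")


class App:
    """Hypothetical application with a naive root check."""

    def __init__(self, path_exists: Callable[[str], bool]):
        self._path_exists = path_exists

    def is_device_rooted(self) -> bool:
        """Report the device as rooted if any known `su` path exists."""
        return any(self._path_exists(path) for path in SU_PATHS)

    def start(self) -> str:
        if self.is_device_rooted():
            return "refusing to run on rooted device"
        return "running"


# Simulated file system of a rooted device.
rooted_fs = {"/system/xbin/su"}
app = App(path_exists=lambda path: path in rooted_fs)

before = app.start()

# Tampering: replace the check on the live object so that it always reports
# a non-rooted device, similar in spirit to hooking a method at runtime.
app.is_device_rooted = lambda: False

after = app.start()
print(before, "->", after)
```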
Mobile security testers should have a basic understanding of RE concepts, mobile devices and operating
systems. RE is an art, and describing its every facet could easily fill a whole library. The sheer volume
of techniques and specializations can be overwhelming. One can spend years working on a very concrete
and isolated sub-problem, such as automating malware analysis or developing improved de-obfuscation
methods. Security testers have to be generalists. In order to become an effective reverse engineer, one
must filter through the vast amount of relevant information [20].
In recent years, researchers have developed a variety of tools and methodologies to conduct analysis
of Android applications. While all the respective papers aim at providing a thorough empirical eval-
uation, comparability is hindered by varying or unclear evaluation targets. These limitations make it
nearly impossible to directly compare approaches and we have to accept that there will always be some
techniques more suited for some tasks than others [21]. This fact reinforces the need for security testers
to diversify their knowledge and learn the generic concepts behind RE, rather than learning a specific
tool or methodology.
There is no universal recipe for the RE process that always works. Acknowledging this fact, throughout
this chapter we will first focus on describing the current state-of-the-art of the Android platform, in
particular, information regarding its security features. Secondly, we will provide details about commonly
used RE methods and tools. Finally, some examples of tackling the most common anti-reverse defences
will also be analysed.
2.1 Android Platform Architecture
The Android Operating System (OS) software stack is composed of several different layers. Each layer defines
interfaces and offers specific services as shown in figure 2.1.
Figure 2.1: Android Software Stack [1]
The foundation of the Android platform is the Linux kernel. On top of the kernel, the Hardware
Abstraction Layer (HAL) defines a standard interface for interacting with built-in hardware components.
Several HAL implementations are packaged into shared library modules that the Android system includes
when required. This design is what enables applications to interact with the device’s hardware, e.g., it
allows a chat application to use a device’s microphone and speaker.
2.1.1 Dalvik Virtual Machine
Android applications are usually written in Java and compiled to Dalvik byte-code, which is somewhat
different from the traditional Java byte-code. Dalvik byte-code is created by first compiling the Java
code to ‘.class’ files, then converting the Java byte-code to the Dalvik Executable ‘dex’ format with the
‘dx’ tool from the Android Software Development Kit (SDK).
Dalvik Virtual Machine (Dalvik-VM) is the original Android runtime, first deployed on Android 1.0
around 2008. Initially, it consisted of a simple application virtual machine similar to the Java Virtual
Machine (JVM), optimized for mobile devices and able to execute the ‘dex’ byte-code specification.
The ‘dex’ byte-code specification limits the total number of methods that can be referenced within a
single ‘.dex’ file to 65 536, including Android framework methods, library methods, and methods in the
application’s own code [22]. To move past this limitation, developers can enable a configuration known as
‘multi-dex’, which allows an application to build and read multiple ‘dex’ files.
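As a rough illustration of this limit, the number of method references declared by a ‘.dex’ file can be read from its header. The sketch below assumes the standard ‘dex’ header layout, in which the 32-bit little-endian ‘method_ids_size’ field sits at offset 0x58 of a 0x70-byte header. The helper names are ours, and the header used in the example is synthetic.

```python
import struct

DEX_METHOD_LIMIT = 65536        # maximum method references per '.dex' file
DEX_HEADER_SIZE = 0x70          # assumed size of the standard dex header
METHOD_IDS_SIZE_OFFSET = 0x58   # assumed offset of the 'method_ids_size' field


def method_ref_count(dex_bytes):
    """Return the number of method references declared in a dex header."""
    if len(dex_bytes) < DEX_HEADER_SIZE or dex_bytes[:4] != b"dex\n":
        raise ValueError("not a dex file")
    (count,) = struct.unpack_from("<I", dex_bytes, METHOD_IDS_SIZE_OFFSET)
    return count


def remaining_method_refs(dex_bytes):
    """How many more method references fit before multi-dex is required."""
    return DEX_METHOD_LIMIT - method_ref_count(dex_bytes)


def synthetic_header(method_count):
    """Build a header that only fills in the fields this sketch reads."""
    header = bytearray(DEX_HEADER_SIZE)
    header[0:8] = b"dex\n035\x00"
    struct.pack_into("<I", header, METHOD_IDS_SIZE_OFFSET, method_count)
    return bytes(header)


header = synthetic_header(65000)
print(method_ref_count(header), remaining_method_refs(header))
```

On a real application the same function could be applied to each ‘classes*.dex’ entry of the APK to see how close every file is to the limit.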
Figure 2.2: Java vs. Dalvik [2]
With time, Google felt the need to address performance concerns with the Dalvik-VM and to keep up
with the hardware advances of the industry. Google added a Just In Time (JIT) compiler with the release
of Android 2.2, added multi-threading capabilities, and generally tried to improve the platform piece by
piece.
The JIT compiler used by Dalvik-VM is a software component which takes an application’s byte-code,
analyses it, and actively translates it into an optimized form that runs faster, doing so while the
application continues to run. As the user progresses through the application, additional code is compiled
and cached, so that the system can reuse it while the application is running.
Because the JIT compiler only compiles a part of the code, it has a smaller memory footprint and
requires less storage space on the device.
2.1.2 Android Runtime
Android Runtime (ART) is the successor to the Dalvik-VM and it became the default runtime for devices
running Android 5.0 (API level 21) and higher. Both were originally created specifically for the Android
project [23].
ART was built to be backwards compatible, meaning it retained the ability to execute older ‘dex’
byte-code specifications.
Ahead-Of-Time (AOT) compilation was introduced as well as other improvements over the Dalvik-VM.
The key difference between ART and its predecessor is the way byte-code is executed. As the name
implies, with Ahead-Of-Time compilation, applications are compiled before they are executed for the first
time.
At install time, ART compiles applications using the on-device ‘dex2oat’ tool. This utility accepts
‘.dex’ files as input and generates the compiled application executable for the target device [23]. The
resulting ‘.oat’ files are essentially ‘ELF’ files that are then executed natively. Instead of having ‘dex’
byte-code that is interpreted by a virtual machine, now we have native machine code that can be executed
directly by the processor. This pre-compiled native machine code is used for all subsequent executions
and improves performance by a factor of two while reducing power consumption [2].
ART has become faster and more memory-efficient with nearly every new Android release. Because
nearly all applications run through ART, these improvements apply to them automatically.
Most recently, with the release of Android 9, ART developers have been working to reduce the size
of the ‘dex’ files. These files are stored twice on an Android device, once in the Application Package
Kit (APK) and again in an extracted form that ART keeps around to speed up the application launch.
They are also loaded into memory, so smaller ‘dex’ files result in storage space savings and reduce the
amount of memory an application allocates.
A new feature introduced in Android 9 called ‘CompactDex’ aims to help reduce the size of ‘dex’
files. These files still exist in an APK, but now when an APK is installed, ART extracts and rewrites
the ‘dex’ files into ‘cdex’ files. ‘CompactDex’ is a smaller format, with better layout optimization, which
removes data duplicated across multiple ‘dex’ files.
As we have described, Android’s ‘dex’ byte-code files have a 65 536 method limit, so it is not unusual
for large applications to have more than one ‘dex’ file. One of the inefficiencies of having multiple ‘dex’
files is that a lot of information is duplicated across these multiple files. As part of the ‘CompactDex’
rewriting, a new shared data section is created for the ‘multi-dex’ applications. The duplicate data across
‘dex’ files is written in the shared data section, so it exists only once. With ‘CompactDex’, the ‘dex’ files
are around 12% smaller [14].
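The effect of such a shared data section can be illustrated with a toy model. The sketch below does not reproduce the real ‘cdex’ layout: as a deliberate simplification, it treats each ‘dex’ file as a plain list of strings and compares the space needed when every file keeps its own copy with the space needed when each distinct string is stored once. The file contents are hypothetical.

```python
def total_size(dex_files):
    """Bytes needed when every file stores its own copy of each string."""
    return sum(len(s.encode()) for strings in dex_files for s in strings)


def shared_size(dex_files):
    """Bytes needed when each distinct string is stored once in a shared section."""
    shared = set()
    for strings in dex_files:
        shared.update(strings)
    return sum(len(s.encode()) for s in shared)


# Two hypothetical 'dex' files that both reference common framework names.
classes_dex = ["Ljava/lang/String;", "Landroid/app/Activity;", "Lcom/example/Main;"]
classes2_dex = ["Ljava/lang/String;", "Landroid/app/Activity;", "Lcom/example/Util;"]

before = total_size([classes_dex, classes2_dex])
after = shared_size([classes_dex, classes2_dex])
print(before, after)
```

The saving equals the size of the entries that appear in both files, which is why applications with many ‘dex’ files that share framework references benefit the most.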
Figure 2.3: Diagram of the Android Runtime [3]
2.1.3 What is smali?
As described in the previous sections, the Dalvik Executable ‘dex’ files included in the APK contain
the application’s compiled Dalvik-VM byte-code. This ‘dex’ byte-code is a binary format that is not
practical for a human analyst to read directly.
Figure 2.4: APK decompilation
Because Java is a very popular programming language, there are
plenty of tools that attempt to recreate the original Java source code
from ‘dex’ files. We will talk more about this topic ahead. We can
use dex2jar [24] or enjarify [25] to convert the ‘dex’ files to Java
classes zipped inside a Java Archive (JAR) file. Afterwards, we can
use a Java decompiler, such as procyon [26] or CFR [27] to read the
class files contained in the JAR and attempt to export Java source
code.
The decompiled Java source code is easier to read and understand
than ‘dex’, but the decompilation process will likely not produce
working source code. Some sections of the decompiled source code
may also be improperly decompiled, which makes this process neither
efficient nor consistent.
Smali code is an intermediate representation for ‘dex’ byte-code
and it supports the full functionality of the ‘dex’ format (e.g., annotations, debug info, line info, etc.) [28].
Its main purpose is to facilitate the interaction with application’s byte-code.
‘smali’ files are the result of disassembling a ‘dex’ file (baksmaling). The inverse process (smaling) is
also supported, enabling ‘smali’ to be re-assembled into ‘dex’ byte-code.
dex ⇔ smali ⇐ Java source code
Because ‘smali’ can consistently be converted back to ‘dex’, it facilitates the repackaging of modified
existing Android applications. One can modify an application without even knowing its original Java
source code.
Smali code is readable, but it is an assembly-like language that does not resemble Java code [28].
It is also worth noting that converting ‘dex’ to ‘smali’ does not improve our chances of getting
working Java source code.
2.2 Application Package Kit
An Android APK is a collection of components that share a common set of resources, i.e.: database,
preferences, file space and a Linux process [29].
An APK file consists of a ‘zip’ archive that contains all the files that comprise the application [30].
By default, only APKs downloaded from the official Google Play Store can be installed on Android
mobile devices. Users can deactivate this security feature by enabling an option in Android’s security
settings called Unknown Sources.
Figure 2.5: Android Application Development Flow [4]
The structure of the APK archive contains some folders and files, most notably:
• ‘META-INF/’: directory where signature data is stored, it’s used to ensure the integrity of the
APK.
• ‘assets/’: holds application’s assets, which the application can retrieve using an ‘AssetManager’
object.
• ‘lib/’: contains required native code libraries compiled inside a subdirectory for each processor
architecture (e.g., armeabi, armeabi-v7a, arm64-v8a, x86, x86_64, and MIPS).
• ‘res/’: contains resources that aren’t compiled into ‘resources.arsc’.
• ‘AndroidManifest.xml’: mandatory file that describes the name, version, required components,
access rights, minimum required API level, referenced library files and entry point of the application
[31].
• ‘classes.dex’: application code compiled in ‘dex’ byte-code format.
• ‘resources.arsc’: includes language strings and styles, as well as paths to content that is not included
directly in this file, such as layout files and images.
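Since an APK is a zip archive, the structure listed above can be inspected with any zip library. The following sketch, written under the assumption that grouping entries by their top-level location is sufficient for a first overview, uses Python’s standard ‘zipfile’ module. The sample archive is built in memory with placeholder contents and a hypothetical native library name.

```python
import io
import zipfile


def summarize_apk(apk_file):
    """Group the entries of an APK (a zip archive) by top-level location."""
    summary = {"dex": [], "native_libs": [], "signature": [], "other": []}
    with zipfile.ZipFile(apk_file) as apk:
        for name in apk.namelist():
            if name.endswith(".dex") and "/" not in name:
                summary["dex"].append(name)
            elif name.startswith("lib/"):
                summary["native_libs"].append(name)
            elif name.startswith("META-INF/"):
                summary["signature"].append(name)
            else:
                summary["other"].append(name)
    return summary


def build_sample_apk():
    """Create an in-memory archive with placeholder contents for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        for name in ("AndroidManifest.xml", "classes.dex", "resources.arsc",
                     "lib/arm64-v8a/libnative.so", "META-INF/MANIFEST.MF"):
            apk.writestr(name, b"placeholder")
    buffer.seek(0)
    return buffer


print(summarize_apk(build_sample_apk()))
```

Passing the path of a real APK to ‘summarize_apk’ works the same way, because ‘zipfile.ZipFile’ accepts both paths and file-like objects.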
Android requires that all APKs be digitally signed with a certificate before they can be installed [32].
The signature included in each APK is very important as it is used to establish the authenticity of the
application. During the APK signing process all included files are hashed, in an attempt to detect and
prevent file tampering.
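The idea of hashing every file can be sketched as follows. The example computes a base64-encoded SHA-256 digest per archive entry, which is the kind of value a JAR-style manifest records. It is a simplified illustration only: it skips everything under ‘META-INF/’, does not produce or verify an actual signature, and the helper names and file contents are ours.

```python
import base64
import hashlib
import io
import zipfile


def entry_digests(apk_file):
    """Map each entry outside META-INF/ to its base64-encoded SHA-256 digest."""
    digests = {}
    with zipfile.ZipFile(apk_file) as apk:
        for name in apk.namelist():
            if name.startswith("META-INF/"):
                continue  # simplification: skip the signature directory itself
            digest = hashlib.sha256(apk.read(name)).digest()
            digests[name] = base64.b64encode(digest).decode("ascii")
    return digests


def build_archive(files):
    """Create an in-memory zip archive from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    buffer.seek(0)
    return buffer


original = entry_digests(build_archive({"classes.dex": b"original code"}))
tampered = entry_digests(build_archive({"classes.dex": b"patched code"}))
print(original["classes.dex"] != tampered["classes.dex"])
```

Any change to an entry changes its digest, so a verifier that trusts the signed list of digests can detect the modification.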
We should point out that unzipping the APK with the standard unzip utility leaves some files un-
readable. Application’s resources are still packaged into a single archive file and ‘AndroidManifest.xml’
is encoded into binary XML format which is not readable with a text editor. One of the most popular
tools to unpack an APK is apktool [33]. It can automatically decode the manifest file to text-based XML
format, extract the contents from ‘resources.arsc’ and it also disassembles the ‘dex’ files to ‘smali’ code.
2.3 Android Applications
An Android application has to define a package name, for instance, ‘com.android.chrome’ or ‘com.facebook.katana’.
This package name acts as a unique identifier, which implies that no two applications can have the same
package name, either on the Google Play Store or on the Android device.
There are three types of structures that together create an Android application: Activities, Tasks and
Processes [30].
• Activities are discrete chunks of functionality that encapsulate a specific behaviour and an execution
context.
• A Task is a collection of Activities which allows a higher abstraction model to group several be-
haviours that work together.
• Processes are standard Linux processes where Activities from the APK are executed. By default
the APK runs in one process with a single thread.
Most programming languages require developers to implement an entry point for the application,
often called the ‘main’ function. Android applications have no such single entry point. Instead, the
Android system initiates code in an Activity instance by invoking specific callback methods that
correspond to specific stages of its life-cycle [34].
Android applications are built as a combination of components:
• Activity: Represents a single screen with a user interface that acts as an interaction point with
the user. Developers can define which Activity is the main one, which is the first screen to appear
when the user launches the application. It is also possible to allow external applications to start a
specified Activity. Activities have their own life-cycle, as shown in figure 2.6.
• Fragment: Represents a behaviour or a portion of the user interface within an activity. Fragments
were introduced in Android 3.0 (API level 11).
• Service: Designed to perform an action in the background for some period of time. Services do not
provide a user interface.
• Broadcast Receiver: Application component in charge of responding to system-wide events. It has
a well-defined entry point, similar to what we find in an Activity. The system can deliver these
events even to applications that are currently not running. Example of events: reception signal
change, battery charging, received an SMS, enabled Wi-Fi, etc.
• Content Provider: Manages a shared set of application data. It includes a high-level Application
Programming Interface (API) to access data so that other applications and services can interact
with the stored data. This component type abstracts the storing mechanism so it can be modified
without many changes in the code. The storing mechanism most often employed is an SQLite
database (file-based).
Each component has its own life-cycle methods, which are called by the Android system to
start/stop/resume the component.
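To make the notion of life-cycle callbacks more concrete, the sketch below models a simplified version of the Activity life-cycle as a table of allowed transitions and checks whether an observed sequence of callbacks is consistent with it. The callback names are the standard Activity ones; the transition table is our simplification and leaves out cases such as the process being killed by the system.

```python
# Allowed transitions between Activity life-cycle callbacks (simplified).
TRANSITIONS = {
    None: {"onCreate"},
    "onCreate": {"onStart"},
    "onStart": {"onResume", "onStop"},
    "onResume": {"onPause"},
    "onPause": {"onResume", "onStop"},
    "onStop": {"onRestart", "onDestroy"},
    "onRestart": {"onStart"},
    "onDestroy": set(),
}


def is_valid_trace(callbacks):
    """Check that a sequence of callbacks follows the simplified life-cycle."""
    previous = None
    for callback in callbacks:
        if callback not in TRANSITIONS.get(previous, set()):
            return False
        previous = callback
    return True


launch_and_quit = ["onCreate", "onStart", "onResume", "onPause", "onStop", "onDestroy"]
print(is_valid_trace(launch_and_quit))
print(is_valid_trace(["onCreate", "onResume"]))  # skips onStart
```

A check of this kind can be useful when callbacks are logged during an analysis session and an unexpected ordering needs to be spotted.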
On Android systems, applications only have direct access to their own data; interacting with other
resources requires explicitly exposed APIs.
Android creates a unique user ID for each application and runs them in separate processes. Conse-
quently, each application can only access its own resources. This protection is often called sand-boxing
and it is enforced by the Linux kernel. It is widely used in many OSes to offer security through isolation
of the processes running the applications. It allows precise control over resources and applications. For
instance, a crashing application does not affect other applications running on the device. At the same
time, the Android Runtime controls the maximum number of system resources allocated to applications,
preventing any one application from monopolizing too many resources.
One way to enable interaction and data sharing between applications on the same device is to
configure applications so that they share the same user ID. This can be done by specifying the
‘android:sharedUserId’ property in ‘AndroidManifest.xml’.
Android also provides signature-based permissions enforcement, so that an application can expose
functionality to another application that is signed with a specified certificate. By signing multiple APKs
with the same certificate and using signature-based permissions checks, applications can share code
and data in a secure manner [32].
The two main mechanisms available to share data between applications are Intents and Inter-Process
Communication (IPC).
Intents are not designed for long exchanges of information. Instead, they allow applications to
publish and subscribe to various kinds of events through which data can be shared.
Figure 2.6: Diagram of the Activity life-cycle [5]
An Intent is a messaging object developers can use to request an action from another application
component. Although intents facilitate communication between components in several ways, there are
three fundamental use cases [35]:
• Starting an activity
• Starting a service
• Delivering a broadcast
Please note that each component has its life-cycle. For instance, it is possible that an Intent to start
an Activity has no effect because the target Activity was already running.
There are two types of intents:
• Explicit intents specify which application will satisfy the intent, by supplying either the target
application’s package name or a fully-qualified component class name. Developers typically use an
explicit intent to start a component in their own application, because they know the class name of
the activity or service they want to start. For example, starting a new activity within the same
application in response to a user action, or starting a service to download a file in the background.
• Implicit intents do not name a specific component, but instead declare a general action to perform
and, optionally, some data, which allows a component from another application to handle the event.
For example, if developers want to show the user a location on a map, they can use an implicit
intent to request that another capable application show a specified location on a map.
To ensure that an Android application is secure, developers should always use an explicit intent
when starting a Service and never declare intent filters for their services. Using an implicit intent
to start a service is a security hazard because we cannot be certain which service will respond to the
intent, and the user is unable to see which service has started. Since Android 5.0 (API level
21), the system throws an exception if developers call ‘bindService()’ with an implicit intent [35].
Intent filters are a powerful and important feature of the Android platform. They provide the
ability to launch an activity based not only on an explicit request, but also on an implicit one. For
instance, an explicit intent might tell the system to “Start the Send Email activity in the Gmail app”,
while an implicit intent tells the system to “Start a Send Email screen in any activity that can do the
job”. When the Android system UI asks a user which application to use to perform a task, that is an
intent filter at work [35].
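The difference between explicit and implicit resolution can be sketched with a small model. The code below is not the Android resolution algorithm, which also matches categories and data: it only assumes that an explicit intent names its target and that an implicit intent matches every component whose filter declares the requested action. The package and class names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    package: str
    name: str
    actions: frozenset = frozenset()  # actions declared in its intent filters


@dataclass(frozen=True)
class Intent:
    action: Optional[str] = None
    target: Optional[str] = None      # "package/name" when the intent is explicit


def resolve(intent, components):
    """Return the components that may receive the intent (simplified)."""
    if intent.target is not None:
        # Explicit intent: only the named component qualifies.
        return [c for c in components if f"{c.package}/{c.name}" == intent.target]
    # Implicit intent: every component whose filter declares the action qualifies.
    return [c for c in components if intent.action in c.actions]


SEND = "android.intent.action.SEND"
installed = [
    Component("com.example.mail", "ComposeActivity", frozenset({SEND})),
    Component("com.example.notes", "ShareActivity", frozenset({SEND})),
    Component("com.example.mail", "SettingsActivity"),  # no filter: explicit only
]

explicit = Intent(target="com.example.mail/ComposeActivity")
implicit = Intent(action=SEND)
print([c.name for c in resolve(explicit, installed)])
print([c.name for c in resolve(implicit, installed)])
```

In this model the implicit intent resolves to more than one candidate, which is the situation in which the system asks the user to choose, while a component without any filter is reachable only through an explicit intent.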
Activities that developers do not want to make available to other applications should have no intent
filters, and developers can start them in their own application using explicit intents.
Due to the application sand-boxing on Android systems, one process cannot normally access the
memory of another process [36]. If developers are implementing a service that will be used by different
applications, they can use the Android Interface Definition Language (AIDL) to define a
programming interface that both the client and the service agree upon in order to communicate with each
other using IPC. IPC features allow applications to exchange signals and data securely. Instead of
relying on the default Linux IPC methods, Android’s IPC is based on Binder, a custom implementation
of OpenBinder. Most Android system services and all high-level IPC services depend on Binder.
Using AIDL is necessary only if we want to allow clients from different applications to access our
service for IPC [36]. This mechanism allows applications to communicate between processes very fast
and efficiently, but it requires them to be signed using the same certificate and to specify the same shared
user ID (‘android:sharedUserId’) in the ‘AndroidManifest.xml’ [37].
2.4 Android API
Android applications are built on top of the Android framework and its large variety of APIs. The
Android framework includes many APIs from the Java world, since it extends the Java SDK APIs. The
majority of these services are invoked via normal Java method calls, which are translated to IPC calls to
system services running in the background.
Some examples of system services that can be accessed through Android APIs:
• Connectivity (Wi-Fi, Bluetooth, NFC, etc.)
• Sensors (Accelerometer, Gyroscope, etc.)
• Geolocation (GPS)
• Cameras
• Microphone
The Android framework also offers common security functions, such as cryptography, integrity and
anti-tampering checks. With every new Android release, the API specification changes. Critical bug fixes
and security patches are usually applied to earlier versions as well.
Noteworthy API versions and some of their relevant security features:
• Android 4.2 Jelly Bean (API 17) in November 2012: introduction of SELinux.
• Android 4.3 Jelly Bean (API 18) in July 2013: SELinux became enabled by default.
• Android 4.4 KitKat (API 19) in October 2013: several new APIs and ART introduced.
• Android 5.0 Lollipop (API 21) in November 2014: ART used by default and many other features
added.
• Android 6.0 Marshmallow (API 23) in October 2015: users can revoke permissions at any time, and
permissions are granted individually at runtime rather than all or nothing during installation.
• Android 7.0 Nougat (API 24-25) in August 2016: new JIT compiler on ART and added v2 signing
scheme of APKs.
• Android 8.0 Oreo (API 26-27) in August 2017: improved security in WebView APIs and added new
permissions related to telephony.
• Android 9 Pie (API 28) in August 2018: limited access to sensors in background, privacy and
security improvements.
Google provides a massive amount of documentation about Android’s APIs online:
• https://developer.android.com/docs/
• https://developer.android.com/guide/
• https://developer.android.com/reference/
Applications must define which API level they target. This prevents application compatibility
issues when the Android APIs are updated. API levels allow developers to opt in to new features,
knowing that they have adapted their applications to deal with the associated changes [14].
Since Android APIs are used for interacting with every critical aspect of the mobile device, from a
RE perspective it is important to have a good understanding of how an application employs
them. In our proposed solution we explain how security researchers can monitor and tamper with calls to
the Android APIs, what these calls do, and how they can become vulnerable.
2.5 Android Permissions
The permissions system is another core Android security feature that helps users understand the capabilities
of the applications they’re installing. This is a security measure that allows the user to identify when
unnecessary permissions are being requested. For instance, if a game unnecessarily requests permission
to send SMS, the user should probably avoid installing it [38].
Applications running on Android cannot access user information and system components (such as
the camera and the microphone) until they request the appropriate permissions. Android provides a
predefined set of permissions for certain tasks that an application can request. For example, if we
want our application to use a phone’s camera, we have to request the ‘android.permission.CAMERA’
permission.
Prior to Android 6.0 Marshmallow (API 23), all permissions an application needed were requested
at installation. From Android 6.0 onwards, users can individually grant or deny permission
requests while the application is running.
Each permission predefined in the Android system is associated with a group ID in the Android OS. If the
permissions an application requested are granted, the corresponding group ID is added to the application’s
process. For instance, consider that the user ID of an application is 10177 and the application requested
the permission ‘android.permission.INTERNET’. When the permission is granted, the user ID 10177 will
be added to the group ID 3003 (inet) that corresponds to the permission requested.
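The example above can be written down as a minimal model. Only the INTERNET to 3003 (inet) pair is taken from the text; the ‘AppProcess’ class and the lookup table are hypothetical names introduced for illustration, and permissions without an entry in the table are simply ignored.

```python
from dataclasses import dataclass, field

# Only the INTERNET -> 3003 (inet) pair is taken from the text; a real device
# defines further mappings that are not reproduced here.
PERMISSION_GIDS = {"android.permission.INTERNET": 3003}


@dataclass
class AppProcess:
    uid: int
    gids: set = field(default_factory=set)

    def grant(self, permission):
        """Add the group backing a permission, if the table lists one."""
        gid = PERMISSION_GIDS.get(permission)
        if gid is not None:
            self.gids.add(gid)


process = AppProcess(uid=10177)
process.grant("android.permission.INTERNET")
print(process.uid, sorted(process.gids))
```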
2.6 Signing an APK
On Android, application signing is the first step to placing an application in its application sandbox. The
signed application certificate defines which user ID is associated with which application. Application
signing ensures that one application cannot access any other application except through well-defined
IPC [39].
In order to sign an APK, developers need to generate a public-key certificate, which contains the
public key of a public/private key pair as well as some other meta-data identifying the owner of the key
(e.g., name and location). The owner of the certificate holds the corresponding private key, which should
be kept secret [32].
The public-key certificate used serves as a fingerprint that uniquely associates the APK to the de-
veloper that holds its corresponding private key. This provides a proof of authenticity which helps the
Android system ensure that any future updates to the application come from its original author. Com-
promising the private key would allow an attacker to deploy a malicious version of the application as an
update over an existing install.
There are two ways for developers to manage signing keys: either opt in to Google Play App Signing,
which securely manages and stores the signing keys, or manage and secure their own keystore
and signing keys.
Android Studio greatly simplifies the process of generating and signing an APK.
In addition to Android Studio, we discovered various tools that can simplify and even automate the
APK signing process:
• Appium’s ApkSign [40] can be used to automatically sign an APK with the Android test certificate.
• AppMon [18] provides an APK builder module that, among other things, can package and sign
APKs.
• DEX2JAR [24] includes ‘d2j-apk-sign’ which has the ability to automate the APK signing process.
If we want to sign an APK manually, we need a few Command-Line Interface (CLI) tools: ‘keytool’,
which is included in the Java Development Kit (JDK), and ‘zipalign’ and ‘apksigner’, which ship with
the Android SDK Build Tools.
First we need to generate a keystore using ‘keytool’. Then we ‘zipalign’ the APK, ensuring
that the application’s uncompressed data starts at a predictable offset inside the APK. Developers are
required to ‘zipalign’ their APKs in order to publish them in Google’s Play Store. Afterwards we can
sign the APK using ‘apksigner’.
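Listing 2.1: Signing an APK
# Generate a keystore holding a new RSA key pair
keytool -genkey -v -keystore my-release-key.jks -keyalg RSA -keysize 2048 \
    -validity 10000 -alias my-alias
# Align the uncompressed data inside the APK on 4-byte boundaries
zipalign -v -p 4 app-unsigned.apk app-unsigned-aligned.apk
# Sign the aligned APK with the key stored in the keystore
apksigner sign --ks my-release-key.jks --out app.apk app-unsigned-aligned.apk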
APK signing is based on the signed JAR model and has been a part of Android from the beginning. JAR
signing (v1 scheme) does not protect some parts of the APK, such as ZIP meta-data. Android’s APK
validation process needs to analyse untrusted data structures and then discard data not covered by the
signatures, which offers a sizeable attack surface.
Android 7.0 introduced a new APK signature (v2 scheme) that ensures all content in the APK is
hashed and signed. The resulting Signing Block is inserted into the APK so it can be validated later.
During validation, this scheme treats the APK file as a blob and performs signature checking across the
entire file. Any modification to the APK, including ZIP meta-data modifications, invalidates the APK
signature [39].
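Whether an APK carries such a Signing Block can be checked without any Android tooling. The sketch below relies on two assumptions about the file layout: the block is placed immediately before the ZIP Central Directory, and it ends with the 16-byte magic ‘APK Sig Block 42’. It only detects the presence of the block and does not verify any signature. The helper that inserts a placeholder block exists purely for the demonstration.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"             # ZIP End of Central Directory record
SIGNING_BLOCK_MAGIC = b"APK Sig Block 42"  # trailing magic of the signing block


def central_directory_offset(apk_bytes):
    """Read the central directory offset stored in the EOCD record."""
    eocd = apk_bytes.rfind(EOCD_SIGNATURE)
    if eocd < 0:
        raise ValueError("not a zip archive")
    (offset,) = struct.unpack_from("<I", apk_bytes, eocd + 16)
    return offset


def has_signing_block(apk_bytes):
    """True if the signing block magic sits right before the central directory."""
    cd_offset = central_directory_offset(apk_bytes)
    return apk_bytes[max(cd_offset - 16, 0):cd_offset] == SIGNING_BLOCK_MAGIC


def with_placeholder_block(apk_bytes):
    """Insert a dummy (cryptographically meaningless) block for the demo."""
    cd_offset = central_directory_offset(apk_bytes)
    block = b"\x00" * 8 + SIGNING_BLOCK_MAGIC
    patched = bytearray(apk_bytes[:cd_offset] + block + apk_bytes[cd_offset:])
    eocd = patched.rfind(EOCD_SIGNATURE)
    struct.pack_into("<I", patched, eocd + 16, cd_offset + len(block))
    return bytes(patched)


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as apk:
    apk.writestr("classes.dex", b"placeholder")
unsigned = buffer.getvalue()
print(has_signing_block(unsigned), has_signing_block(with_placeholder_block(unsigned)))
```

Because the block lives outside the ZIP entries, an archive that is merely re-zipped after modification loses it, which is one reason why repackaged applications have to be signed again.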
Applications are also able to declare security permissions at the signature protection level, restricting
access to only allow applications signed with the same key, while maintaining distinct user IDs and
application sandboxes. A closer relationship with a shared application sandbox is allowed via the shared
user ID feature where two or more applications signed with same developer key can declare a shared user
ID in their manifest [39].
We can conclude that the current APK signing mechanism effectively provides authenticity and in-
tegrity checks, which are critical security aspects. This means that a tampered application can be easily
distinguished from an original by comparing APK signatures.
2.7 Static Analysis
Static program analysis is the analysis of computer software performed without actually executing pro-
grams. The term is usually applied to the analysis performed by an automated tool, with human analysis
being called code review.
When compiling the source code of a program into a binary executable, information such as the size
of data structures or variable names is lost. This loss of information further complicates the task of
analysing the code.
Methodologies frequently used in static analysis:
• Signature based detection
• Flow graph analysis
• Taint analysis
• Behaviour based analysis
Byte-code is an intermediate representation output by programming languages to ease interpretation
and to reduce hardware and operating system dependence by allowing the same code to run cross-platform,
on different devices. Byte-code may often be either directly executed on a virtual machine, or it may be
further compiled into machine code for better performance. Since byte-code instructions are processed
by software, they could be arbitrarily complex, but are nonetheless often akin to traditional hardware
instructions. The Android platform resorts to this kind of mechanism to ensure that its applications can
run on various mobile devices with different hardware.
2.7.1 Signature-based analysis
The classic static analysis techniques search over the application’s byte-code for the presence of a specific
sequence of instructions, known as signatures. We can apply signature based analysis to search for
vulnerabilities, but signatures utilized are mostly from previously detected malicious software we want
to protect users from. If a signature is found, it is highly probable that the application is insecure.
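A minimal sketch of this idea is a scan of the byte-code for known byte sequences. The signature database below is entirely hypothetical: the names and patterns are placeholders chosen for the demonstration and do not correspond to real malware signatures.

```python
# Hypothetical signature database: name -> byte sequence to look for.
SIGNATURES = {
    "demo-opcode-sequence": bytes.fromhex("6e20deadbeef"),
    "demo-su-path": b"/system/xbin/su",
}


def scan(bytecode, signatures=SIGNATURES):
    """Return (name, offset) for every occurrence of a known signature."""
    hits = []
    for name, pattern in signatures.items():
        offset = bytecode.find(pattern)
        while offset != -1:
            hits.append((name, offset))
            offset = bytecode.find(pattern, offset + 1)
    return sorted(hits, key=lambda hit: hit[1])


sample = b"\x00" * 4 + b"/system/xbin/su" + b"\x01"
print(scan(sample))
print(scan(b"\x00" * 32))
```

Exact byte matching is easy to evade with small changes to the code, which is one reason why the techniques described in the following sections complement it.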
Static analysis tools can be used to extract useful information of a program. When used proactively
it can find vulnerabilities early in the development cycle. With an adequate test suite, static analysis
allows full exploration of possible program executions. Full call graphs give the analyst an overview of
what the logic flow might be and where these functions are in the code [8].
2.7.2 Taint analysis
Static taint analysis is a popular information flow analysis technique which tracks the flow of sensitive
information from a set of sensitive sources to sensitive sinks. In this context, sources define the information
we want to protect on a mobile device (e.g., phone number, contacts, and location) and sinks define points
of unwanted information release (e.g., methods related to internet communication and SMS transmission).
If data originated from a sensitive source reaches a sink, taint tracking identifies the path from the
source to the sink as an instance of data leakage. Taint analysis can be implemented both statically and
dynamically. Examples of tools that execute static taint analysis include: FlowDroid [41], Amandroid [42],
and DroidSafe [43]. Qiu [21] provides detailed test results and a comparison between several existing taint
analysis tools.
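The source-to-sink idea can be illustrated with a toy analysis over straight-line code. The sketch below is far simpler than the tools cited above: it assumes a program is a list of ‘(target, callee, arguments)’ statements without branches, and the source and sink names are only labels chosen for the example.

```python
SOURCES = {"getDeviceId", "getLastKnownLocation"}  # labels for sensitive sources
SINKS = {"sendTextMessage", "httpPost"}            # labels for sensitive sinks


def find_leaks(program):
    """Return (source, sink, line) for every tainted value that reaches a sink.

    Each statement is (target, callee, args); target is None for plain calls.
    """
    tainted = {}  # variable name -> source it derives from
    leaks = []
    for line, (target, callee, args) in enumerate(program, start=1):
        origins = [tainted[a] for a in args if a in tainted]
        if callee in SINKS:
            leaks.extend((origin, callee, line) for origin in origins)
        if target is not None:
            if callee in SOURCES:
                tainted[target] = callee
            elif origins:
                tainted[target] = origins[0]   # taint propagates through the call
            else:
                tainted.pop(target, None)      # overwritten with clean data
    return leaks


program = [
    ("imei", "getDeviceId", []),
    ("msg", "concat", ["prefix", "imei"]),
    (None, "sendTextMessage", ["number", "msg"]),
    ("msg", "constant", []),
    (None, "httpPost", ["url", "msg"]),
]
print(find_leaks(program))
```

Only the third statement is reported, since ‘msg’ is overwritten with clean data before it reaches the second sink. Real tools additionally have to handle branches, aliasing, callbacks and the component life-cycle.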
2.7.3 Behaviour-based analysis
Most behaviour based analysis frameworks can recognize malware in Android applications by analysing
the number of times each system call has been issued during the execution of an action that requires user
interaction [44]. This methodology has some similarities with a signature based approach, but instead of
searching for patterns associated with malware in code, it simply monitors the behaviour of the application
with respect to system calls. The approach is based on the premise that a genuine application will differ
from its compromised version, since it issues different types and a different number of system calls.
Andromaly [45] and Crowdroid [44] are two frameworks that rely on machine-learning techniques
to perform a behaviour based analysis. They work by collecting a list of features (e.g., Android system
calls, sensor data, hardware usage details) that describe both the mobile device and user behaviours.
Features are then fed into a machine-learning algorithm in order to train it for malware/virus detection.
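The premise that a compromised application issues different system calls can be sketched by comparing call-count vectors. The code below is not the algorithm used by Andromaly or Crowdroid: it simply computes the Euclidean distance between two system call profiles, and the traces are made up for the example.

```python
import math
from collections import Counter


def syscall_profile(trace):
    """Count how many times each system call appears in a trace."""
    return Counter(trace)


def distance(profile_a, profile_b):
    """Euclidean distance between two system call count vectors."""
    names = set(profile_a) | set(profile_b)
    return math.sqrt(sum((profile_a[n] - profile_b[n]) ** 2 for n in names))


# Hypothetical traces recorded while performing the same user action.
genuine = ["open", "read", "read", "close"]
repackaged = genuine + ["socket", "connect", "sendto", "sendto"]

print(distance(syscall_profile(genuine), syscall_profile(genuine)))
print(distance(syscall_profile(genuine), syscall_profile(repackaged)))
```

A learning-based system would feed many such vectors to a clustering or classification algorithm instead of comparing a single pair.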
2.7.4 Challenges to static analysis
Analysing binaries brings along intricate challenges. Consider, for example, that most malware attacks
hosts executing instructions in the IA32 instruction set (32-bit version of x86). The disassembly of
such programs might produce ambiguous results if the binary employs self-modifying code techniques.
Additionally, malware relying on values that cannot be statically determined (e.g., current system date,
indirect jump instructions) further complicates the application of static analysis techniques.
A significant disadvantage of a static analysis is that it may suffer from false positives. A great deal
of thinking and experimentation can go into the design of a static analysis abstraction, but the problem
of soundly and precisely identifying security violations is undecidable. This means that in the worst case,
false positives will still be reported no matter how precise we make our analysis technique [46].
Standard challenges that complicate the construction of static analysis systems are scaling to large
applications and maintaining precision in the analysis such that it does not report too many flows that
do not actually exist in the application. One particularly prominent issue with developing static analyses
for Android applications is the size, richness, and complexity of the Android API and runtime [43].
Because sensitive flows are often generated by complex interactions between the Android application,
API, and runtime, any static analysis must work with an accurate model of this runtime to produce
acceptably accurate results. This can be especially challenging if the analysis takes into consideration
Java’s reflection mechanism which is also present in Android.
Accuracy is critical for a static analysis seeking to calculate security properties of an application.
Imprecision in the model used to perform the analysis could lead to results that are unusable due to too
many false positives.
2.8 Dynamic Analysis
Dynamic analysis or behaviour-based detection involves running the application in a controlled and
isolated environment in order to analyse its execution traces [44].
Dynamic Binary Instrumentation (DBI) is a technique designed to inject foreign code into existing
binaries, enabling behaviour modifications and runtime information collection. This foreign code is known
as instrumentation code and it executes as part of the normal instruction stream after being injected. A
good overview of automated dynamic malware analysis techniques is provided by Egele [8].
Instrumentation is not the same thing as exploiting, since code injection does not happen via previously
discovered vulnerabilities. It is also not the same thing as debugging, since no debugger is attached
to the binary, although very similar things can be achieved.
Using a DBI framework, security researchers can do things like:
• Access process memory.
• Overwrite functions while the application is running.
• Call functions from imported classes.
• Find object instances on the heap and use them.
• Hook, trace and intercept functions.
One of the most fundamental aspects of DBI is monitoring function calls. While the use of functions
enables easy code reuse and simplifies maintenance, the property that makes functions interesting
for program analysis is that they are commonly used to abstract from implementation details to a
semantically richer representation. For instance, one does not need to understand how a cryptographic
encryption algorithm works to understand that a call to a certain function converts a given input
parameter to ciphertext. Such abstractions help to understand the overall behaviour of the program [8].
One possibility to monitor what functions are called by a program is to intercept these calls. The program
is instrumented in a way that in addition to the intended function, a so-called hook function is invoked.
This hook function implements the required analysis functionality, such as recording its invocation to a
log file or analysing its input parameters.
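The hooking idea described above can be illustrated in a few lines of Python. This is only an analogy: a DBI framework patches machine code or runtime method tables, whereas here the hook is installed by rebinding a name. The `encrypt` function, the `hook` decorator and the `call_log` list are hypothetical names introduced for this illustration.

```python
import functools

call_log = []  # hypothetical in-memory log; a real hook might write to a file


def hook(func):
    """Wrap func so that every call is recorded before the original runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append((func.__name__, args, kwargs))
        return func(*args, **kwargs)
    return wrapper


def encrypt(plaintext, key):
    """Toy stand-in for a cryptographic routine (XOR with a single byte)."""
    return bytes(b ^ key for b in plaintext)


# Install the hook by replacing the original name with the wrapped version.
encrypt = hook(encrypt)

ciphertext = encrypt(b"secret", 0x2A)
```

After the call, `call_log` holds the function name and its arguments, which is the kind of trace an analyst would inspect without needing to understand how `encrypt` works internally.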
It is arguable that one could also do all of the above using a debugger, but some applications
employ anti-debugging checks that may be cumbersome to circumvent. Using a code instrumentation
framework, security researchers can quickly start experimenting, even with black-box processes [47].
2.9 Static Analysis vs Dynamic Analysis
To achieve our objective of building a powerful and comprehensive framework for automated analysis of
Android applications we decided to create a symbiotic relationship between static analysis and dynamic
analysis, the two main categories of RE methods.
Static analysis, mostly used by anti-virus companies, is often based on the inspection of source code or
binaries, looking for suspicious patterns. We believe static analysis can be used to improve the efficiency
of dynamic analysis techniques, e.g., static analysis can remove redundant checks, generate customized
hooks and focus the scope of the analysis [48].
Scanning ‘dex’ byte-code, disassembled ‘smali’ code, or even decompiled Java code, using regular
expressions can be quite slow if we are analysing an application with several megabytes of code. This underlines
the importance of developing comprehensive tests in order to retrieve important and useful information
for dynamic analysis, without taking too much time or resources.
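A minimal sketch of such a scan is shown below. It compiles each pattern once and reads every line a single time, which helps keep the cost manageable on large code bases. The check names and the two patterns are illustrative assumptions, not the checks shipped with any particular tool.

```python
import re

# Hypothetical checks: names and patterns are illustrative only.
CHECKS = {
    "http_url": re.compile(r'const-string [vp]\d+, "http://[^"]+"'),
    "webview_js": re.compile(r"->setJavaScriptEnabled\(Z\)V"),
}


def scan_smali(lines, checks=CHECKS):
    """Return (check name, line number, stripped line) for every match."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for name, pattern in checks.items():
            if pattern.search(line):
                findings.append((name, number, line.strip()))
    return findings


sample = [
    '    const-string v0, "http://example.com/api"',
    "    invoke-virtual {v1, v2}, Landroid/webkit/WebSettings;->setJavaScriptEnabled(Z)V",
    "    return-void",
]
results = scan_smali(sample)
```

The class and method names found this way are the kind of information that can later be handed to the dynamic analysis as hook targets.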
Although static analysis is very powerful, virus authors and even companies interested in keeping their
proprietary code secret have developed various obfuscation techniques that can be especially effective
against static analysis [49].
Dynamic binary instrumentation has several strong points. Most notably, it avoids having to
recompile or relink, and it can discover code at runtime and analyse dynamically generated code [47]. This
kind of dynamic analysis can be particularly useful in situations where the application’s code is heavily
obfuscated and pure static analysis does not achieve acceptable results.
Using a tool like Frida [50], or other dynamic binary instrumentation framework, it might be possible
to trick the application into decrypting important obfuscated strings for us. We may even be able to
isolate the code responsible for decrypting obfuscated strings and then apply it to the obfuscated strings
uncovered during the static analysis of the application.
Instrumenting a large set of applications to check for vulnerabilities can be tricky to execute. Even
with tools like Frida that can be programmed to automate certain instrumentation operations, dynamic
analysis usually requires too much work. It is very time-consuming to install each APK, run it, and
manually test it to reproduce the vulnerability [37]. We attempt to address this concern in our framework
by automating most of the processes just described.
We can conclude that for developers to become proficient in reverse engineering they should master
both static and dynamic analysis, because both approaches can complement each other.
2.10 Anti-Tampering
Android developers can deploy various countermeasures to make it difficult for third parties to tamper with
their applications. Most methods described in this section are meant to improve security for the end
user, but can be defeated relatively easily to enable reverse engineering.
One of the most common anti-tampering methods employed is root detection. The goal is to make
it a bit more difficult to run the application on a rooted device, which in turn obstructs some tools and
techniques reverse engineers like to use.
On Android, root detection can also include the detection of custom ROMs, i.e. verifying whether
the device is a stock Android build or a custom build. As with most other defences, root detection is
not highly effective on its own, but having some root checks throughout the application can improve the
effectiveness of the overall anti-tampering scheme [51].
Some other common root detection methods employed by application developers:
• File existence checks: look for files typically found on rooted devices, such as binaries that are
usually installed once a device has been rooted.
• Executing ‘su’ and other commands: attempt to run binaries that are usually installed once a
device has been rooted.
• Search installed application packages: look for commonly used applications that can root devices.
• Checking for writable partitions and system directories.
• Testing for custom Android builds.
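The first check in the list above can be sketched as follows. Real applications implement it in Java, Kotlin or native code; Python is used here only to show the logic. The list of paths is a small illustrative sample, and the `exists` parameter is an assumption added so that the lookup can be simulated off-device.

```python
import os

# Paths commonly associated with rooted devices (illustrative, not exhaustive).
SU_PATHS = (
    "/system/bin/su",
    "/system/xbin/su",
    "/sbin/su",
    "/system/app/Superuser.apk",
)


def find_root_indicators(paths=SU_PATHS, exists=os.path.exists):
    """Return the subset of paths that exist according to the exists callable."""
    return [path for path in paths if exists(path)]


# Simulate a rooted device by injecting a fake filesystem lookup.
fake_rooted_fs = {"/system/xbin/su"}
indicators = find_root_indicators(exists=fake_rooted_fs.__contains__)
```

Because the check boils down to a handful of fixed strings and a file lookup, it is also easy to locate and neutralise, which is why such checks are rarely effective on their own.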
Verifying the application’s signing certificate at runtime is a technique widely used to obstruct RE.
It essentially consists of validating whether the APK has been signed by its author with the genuine certificate.
Assuming that the certificate remains consistent, and its private key and keystore are kept private, any
third-party modification to an application implies that a different certificate has to be used to repackage
it [52].
A possible implementation of a security check that validates the APK signature consists of hard-coding
the certificate’s public key into the application and, at runtime, validating whether the signature included in the
running APK matches the hard-coded one. If the APK and hard-coded signatures do not match, we know
that a third party repackaged our application.
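The comparison at the heart of this check can be sketched in Python. The sketch makes two assumptions that are not taken from the text: it stores a SHA-256 fingerprint of the certificate rather than the public key itself, and it uses placeholder bytes instead of a real DER-encoded certificate. On Android the running certificate would be obtained through the platform API.

```python
import hashlib
import hmac


def fingerprint(cert_der):
    """Hex-encoded SHA-256 digest of a certificate's raw bytes."""
    return hashlib.sha256(cert_der).hexdigest()


# Hypothetical placeholder standing in for the genuine DER certificate.
GENUINE_CERT = b"genuine-certificate-bytes"
# In practice this value would be computed at build time and hard-coded.
EXPECTED_FINGERPRINT = fingerprint(GENUINE_CERT)


def is_repackaged(running_cert_der, expected=EXPECTED_FINGERPRINT):
    """True when the running certificate does not match the hard-coded one."""
    return not hmac.compare_digest(fingerprint(running_cert_der), expected)
```

A call such as `is_repackaged(b"attacker-certificate-bytes")` returns `True`, while the genuine bytes yield `False`. The hard-coded fingerprint is exactly the kind of value an attacker will search for, which motivates the obfuscation techniques discussed later.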
Another simple anti-tampering technique is to check the identifier of the application that installed the
APK. Assuming that the application is only available through Google Play Store, we can use the Android
API to query if the application’s installer matches the Google Play Store identifier (com.android.vending).
Checking whether the ‘debuggable’ flag is enabled at runtime is also a straightforward anti-tampering
method, as it lets the application react when a debugger could be attached to it. This check is relevant
considering how simple it is to unpack an APK, change its ‘AndroidManifest.xml’ file to enable the ‘debuggable’
flag and repackage it.
Typical users will not be running the application using an emulator, so it is common for developers to
have their application check their runtime environment. Using Java’s reflection mechanism, an application
can access some hidden system properties, e.g., ‘ro.hardware’, ‘ro.kernel.qemu’ or ‘ro.product.model’, and
look for known values used by emulators. This method can be used to stop applications from executing
on emulators, which usually are more convenient to use when reverse engineering an application.
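The property comparison can be sketched as below. The function receives the properties as a dictionary, so how they are read (reflection, native code or otherwise) is left out. The listed values are ones commonly reported by the stock Android emulator and are given as an illustrative assumption, not a complete list.

```python
# Values commonly reported by the stock Android emulator (illustrative only).
EMULATOR_HINTS = {
    "ro.kernel.qemu": ("1",),
    "ro.hardware": ("goldfish", "ranchu"),
    "ro.product.model": ("sdk", "google_sdk", "Android SDK built for x86"),
}


def emulator_indicators(properties, hints=EMULATOR_HINTS):
    """Return the names of properties whose value matches a known emulator value."""
    return sorted(
        name for name, values in hints.items() if properties.get(name) in values
    )


emulator = {
    "ro.kernel.qemu": "1",
    "ro.hardware": "ranchu",
    "ro.product.model": "Android SDK built for x86",
}
physical = {"ro.hardware": "bullhead", "ro.product.model": "Nexus 5X"}
```

Here `emulator_indicators(emulator)` reports all three properties, whereas `emulator_indicators(physical)` returns an empty list. An application could refuse to run as soon as one indicator is found.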
It is also possible that developers implement these anti-tampering methods in native code (.so library
files specifically compiled for a hardware architecture). All these efforts are made to throw potential
attackers off track and make it harder to circumvent these security checks.
Although most of these anti-tampering techniques are easy to understand and implement, this also
means that an attacker can learn how to circumvent them.
All these checks run within the process space of an unprivileged application. It may take some time,
but all local checks can eventually be bypassed [53].
It has become a common practice to use code obfuscators in conjunction with this kind of anti-
tampering checks, because they depend mostly on hard-coded information. Code obfuscators employ
various techniques to make it harder for a third-party to find this kind of vital information.
Google has made available a system called SafetyNet to keep the Android ecosystem in check and
gather metrics on on-going attacks. This system relies on internet access, as it partially works remotely,
and provides an alternative to the hard-coded checks we previously described.
SafetyNet offers a set of Android APIs that create a profile of the device using software and hardware
information. This profile is then sent to Google for analysis where it is compared against a list of
white-listed device models that have passed Android compatibility testing [51].
We do not know exactly how SafetyNet works because it is not well documented and its behaviour
may change at any time. When the application first calls its APIs, SafetyNet’s service downloads a binary
package containing the device validation code from Google, which is then dynamically executed using
reflection [53].
SafetyNet’s Attestation API uses collected information from the device to assess its basic integrity, and
to evaluate the genuineness of the APK that holds the calling application. This service helps developers to
determine whether or not a particular device has been rooted, tampered with, or otherwise modified [54].
In addition to the Attestation API, SafetyNet also provides the following set of services:
• SafetyNet Safe Browsing API, provides services for determining whether a URL has been marked
as a known threat by Google.
• SafetyNet reCAPTCHA API, protects the application from malicious traffic.
• SafetyNet Verify Apps API, protects devices against potentially harmful applications.
In theory, to defeat SafetyNet we have to find which pieces of collected data are important. This
represents a moving target that Google can change at will. Consequently, we would have to fake data
in meaningful ways, adding much more work and uncertainty about what information is used for the
analysis [53].
Figure 2.7: SafetyNet Attestation API protocol [6]
2.11 Obfuscation
Obfuscation is the process of transforming code and data to make it more difficult to comprehend. This
process can generate code that is syntactically different from, but semantically equivalent to, the original. It is an
integral part of every software protection scheme [51].
In the previous section, we have seen some techniques that developers can implement as anti-tampering
features, but that rely on the secrecy of information hard-coded in the application. Without string obfuscation,
these hard-coded values can be easily discovered and modified using various static analysis tools. This
is why obfuscation is so important: it can considerably increase the difficulty of reverse engineering an
application.
Programs can be made incomprehensible, in whole or in part, in many ways and to different degrees.
It is important to keep in mind that obfuscation techniques can also be employed by bad actors trying to
hide their viruses and malware from signature-based detection.
Below we will provide details about some of the most frequently used obfuscation tools for Android
applications.
ProGuard is an open-source Java class file shrinker, optimizer, obfuscator and pre-verifier. The
shrinking step detects and removes unused classes, fields, methods and attributes. The optimization step
analyses and optimizes the byte-code of the methods. The obfuscation step renames the remaining classes,
fields, and methods using short meaningless names. These first steps make the code base smaller, more
efficient, but also harder to reverse engineer. The final pre-verification step adds validation information
to the classes, which is required for Java Micro Edition and for Java 6 and higher [55].
Tools like Android Studio [56] already integrate ProGuard, making it easily accessible to developers
that want to use it to automatically process application code during the build process.
The obfuscation step of ProGuard essentially replaces the class, method and field names with shorter,
meaningless names (class A, method c, field b, etc.). This step reduces the size of the APK and strips
semantics from the code, making it harder to reverse engineer. During this step, a mapping file is generated so that
developers can translate debugging information (with obfuscated names) to match the original names
used.
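As an illustration of how such a mapping file can be used, the sketch below parses a small sample and translates obfuscated names back to the original ones. The sample follows the general layout of ProGuard mapping files (a class line ending in a colon, followed by indented member lines), but the class and member names are hypothetical and the parser ignores details such as line-number ranges and overloaded signatures.

```python
SAMPLE_MAPPING = """\
com.example.LoginActivity -> a.a.a:
    java.lang.String password -> b
    1:5:void checkPin(int) -> c
"""


def parse_mapping(text):
    """Return {obfuscated class: (original class, {obfuscated member: [names]})}."""
    classes = {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        original, _, obfuscated = line.strip().rstrip(":").partition(" -> ")
        if not line.startswith(" "):
            # Unindented line: a class mapping.
            current = classes.setdefault(obfuscated, (original, {}))
        elif current is not None:
            # Indented line: keep only the member name, drop type and line numbers.
            name = original.split(" ")[-1].split("(")[0]
            current[1].setdefault(obfuscated, []).append(name)
    return classes


def deobfuscate(classes, obf_class, obf_member=None):
    """Translate an obfuscated class (and optionally member) name back."""
    original, members = classes[obf_class]
    if obf_member is None:
        return original
    return original + "." + "/".join(members[obf_member])


classes = parse_mapping(SAMPLE_MAPPING)
```

With this sample, `deobfuscate(classes, "a.a.a", "c")` yields `com.example.LoginActivity.checkPin`. Without the mapping file, which is normally kept private by the developer, a reverse engineer only sees the short names.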
Special attention is required if the application code, or any library included, takes advantage of Java’s
reflection mechanism. ProGuard must be configured to skip obfuscation of entities used by the reflection
code, otherwise functionality that depends on original entity names will not work properly.
DexGuard is the commercial sibling of ProGuard for Android. It can reuse ProGuard’s configuration
and because of their similarities, developers can continue exercising their knowledge and the community’s
expertise on ProGuard [57]. DexGuard optimizes, obfuscates, converts to Dalvik-VM byte-code, packages,
signs and aligns archives in a single seamless process. This optimization streamlines and speeds up the
entire build process.
Obfuscator-LLVM is a project initiated in June 2010 by the information security group of the
University of Applied Sciences and Arts Western Switzerland of Yverdon-les-Bains (HEIG-VD). The aim
of this project is to provide an open-source fork of the LLVM compilation suite able to provide increased
software security through code obfuscation and tamper-proofing [58]. Currently the Obfuscator-LLVM
includes the following features:
• Instructions Substitution: works by replacing standard binary operators (like addition, subtraction
or boolean operators) by functionally equivalent, but more complicated sequences of instructions.
This kind of obfuscation is rather straight-forward and does not add a lot of security, as it can
easily be removed by re-optimizing the generated code.
• Bogus Control Flow: modifies a function call graph by adding a basic block before the current basic
block. This new basic block contains an opaque predicate and then makes a conditional jump to
the original basic block.
• Control Flow Flattening: completely flattens the control flow graph of a program.
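To make two of these transformations more concrete, the sketch below applies them by hand to small Python functions. Obfuscator-LLVM operates on LLVM intermediate representation, not on Python, so this is only an analogy; the particular substitutions and the state numbering were chosen for illustration.

```python
def add_plain(b, c):
    return b + c


def add_substituted(b, c):
    # Same result as b + c, written as a subtraction of the negated operand.
    return b - (-c)


def xor_substituted(a, b):
    # Same result as a ^ b, expressed with AND, OR and NOT only.
    return (~a & b) | (a & ~b)


def sum_to_plain(n):
    total = 0
    i = 1
    while i <= n:
        total += i
        i += 1
    return total


def sum_to_flattened(n):
    # Every basic block becomes one case of a single dispatcher loop;
    # the 'state' variable decides which block runs next.
    state, total, i = 0, 0, 0
    while True:
        if state == 0:      # entry block
            total, i, state = 0, 1, 1
        elif state == 1:    # loop condition
            state = 2 if i <= n else 3
        elif state == 2:    # loop body
            total += i
            i += 1
            state = 1
        else:               # exit block
            return total
```

Both versions of each function return the same values, but the flattened one no longer shows the loop structure directly: a reader has to follow the `state` transitions to recover it. The substituted arithmetic, in contrast, is the kind of change an optimizer can undo easily.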
A commercial version of Obfuscator-LLVM implementing much more advanced capabilities is available
through strong.codes [58].
Strong.codes was a company active in the domain of software protection that developed
strong.protect, an evolution of the long-running research project Obfuscator-LLVM. Strong.codes was bought
by Snap Inc. and its website has recently been offline, so we are not sure if the product “strong.protect”
is still being commercialized. Strong.protect performs advanced code obfuscation and tamper-proofing
in one of the most powerful compilation frameworks of the moment, and its goal is to make software piracy
much more expensive and complicated [59].
3 Proposed Solution
Contents
3.1 Design and Requirements
3.2 Framework Workflow
3.3 Framework Configuration
3.4 Static analysis
3.5 Dynamic Binary Instrumentation
We knew from the beginning we wanted to build a flexible framework capable of helping security
researchers perform a thorough analysis of Android applications using modern techniques.
The framework would need to integrate both static and dynamic analyses in a way that they could
complement each other. It also had to be easy to configure, while being fully automated to allow batch
testing of applications.
Our work focused mostly on integrating tools already employed and well documented by the mobile
security community, into a single framework, to allow a convenient way to leverage their functionalities.
Researching all the topics we have previously covered in chapter 2 allowed us to be conscious about
which tools and techniques the framework had to support in order to achieve a powerful platform to
analyse mobile applications and assess for vulnerabilities.
We decided to name the framework Android Security Framework, or DroidSF for short.
DroidSF is completely open-source and we invite everyone to contribute through its GitHub repository:
https://github.com/neskk/droidsf
We decided to start working on top of DroidStat-X [19], a static analysis framework also built in
Python. This framework was especially attractive for us because it is structured following the Open
Web Application Security Project (OWASP) Mobile Top 10 categories, which we also decided to follow.
Our first big task was to port the code from DroidStat-X to support the newest version of Python 3.
We also took the opportunity to fix some of the code style and ended up reviewing all the code in this
framework. These changes were submitted as a pull request on GitHub and were merged.
3.1 Design and Requirements
In our proposal we established some broad requirements we needed to fulfil in order to create a powerful
and capable mobile security analysis framework. The first requirement was to focus exclusively on the
Android platform to make sure we had a well-defined scope for our security analysis. The second
requirement was that the whole analysis process had to allow full automation in order to be able to cope
with the significant number of applications released every day. The last requirement established that the
framework would have to be able to identify some of the most common vulnerabilities found in Android
applications. These requirements distinguish our framework from existing ones (e.g., ApkX [60], Objection [61]
and AppMon [18] - see section 4.2), and were established to help shape its design process.
We wanted to create a useful and attractive framework for developers and security researchers to
build upon. To achieve this goal, we spent a considerable amount of time laying out the most important
design aspects we would have to follow.
3.1.1 Design Choices
Below we present the key design aspects we closely followed during the implementation of our framework:
Multi-platform: It was important for us to create a framework that could work natively on all the
major desktop platforms: Windows, Linux and macOS. This removes the need to use virtualisation
software, which usually adds significant performance overhead and leaves fewer resources available to
perform the analysis. Frameworks with similar characteristics to ours have been developed, but
most are built to operate on Linux and have dependencies that are not available by default on
Windows. We decided to build our framework using Python because it is an interpreted, high-level,
general-purpose programming language and it is available for all the major Operating Systems (OSes).
Additionally, there are many security analysis tools built on Python, so it made sense for us to follow
this trend in order to take advantage of its vast and active community of developers.
Easy to install: Minimizing the time it takes to set up the framework was a priority for us. To
achieve this goal we tried to reduce the number of prerequisites the user has to install manually. We
also took advantage of Python’s package installer ‘pip’, which is installed by default with Python, to
automate the download and installation of required Python packages.
Batteries included: Our framework uses various third-party external tools to execute several tasks.
Since all the required tools are available for free on the internet, we decided to automate the whole
process of downloading and configuring them to run from within our framework. To respect the
multi-platform design, we made sure to select external tools that can run either directly on Python
or on Java runtime environment, as Java is also available on all major OSes. The goal was to provide
a pleasant user experience and avoid time-consuming tasks where we could.
Configurable: We wanted users to be able to easily customize the analysis and allow the definition
of profiles to avoid manually specifying configuration options on each run. In cases where more than
one external tool can be used to perform a certain task, we decided to give users the ability to choose
which one they want to use. All the configuration options can be defined through command-line
parameters or using a configuration file. Users can specify a configuration file through the command-
line to act as a profile for performing analysis on a batch of applications.
Extendable: In order to be able to keep up with new techniques, we allow custom static analysis
checks to be performed on ‘smali’ code. Users can add regular expressions to be tested through a
configuration parameter, and the results will be output in the analysis report. Our framework also
allows running custom Frida (see section 3.5.1) injection scripts during the dynamic analysis process.
Users just have to specify the location for the custom Frida script they wish to execute through a
configuration parameter.
Instrumentation script templates: To take advantage of the information collected during static
analysis we needed a convenient way to customize the instrumentation scripts. We developed a simple
template system that essentially replaces predefined place-holders in the scripts with information
obtained during the static analysis. Another feature we implemented in the template system enables
users to build a script that includes other scripts, allowing a modular approach to scripts.
Fully automated: We knew that one of our major challenges was the automation of the dynamic
analysis, since static analysis is naturally an automated process. Taking advantage of Python’s ability
to easily interact with the native OS, we managed to automate almost every step of the binary
instrumentation we use to perform the dynamic analysis. From setting up the device to running the
instrumentation script on the application, it can all be done without user interaction.
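The instrumentation script templates described above can be sketched as follows. This is not the template system implemented in DroidSF: the `{{name}}` place-holder syntax, the `//#include` directive and the `render` function are assumptions made for this example. The script bodies use Frida's `Java.perform`, `Java.use` and `send` functions.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")
INCLUDE = re.compile(r"^//#include (\S+)$", re.MULTILINE)


def render(name, templates, values, _seen=()):
    """Fill place-holders in templates[name] and expand its include lines."""
    if name in _seen:
        raise ValueError("circular include: " + name)
    # Fill this script's own place-holders first...
    source = PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), templates[name])
    # ...then splice in included scripts, rendered recursively.
    return INCLUDE.sub(
        lambda m: render(m.group(1), templates, values, _seen + (name,)), source
    )


templates = {
    "hooks.js": (
        "Java.perform(function () {\n"
        "//#include log.js\n"
        '  var cls = Java.use("{{class_name}}");\n'
        '  send(TAG + ": hooked {{class_name}}");\n'
        "});"
    ),
    "log.js": '  var TAG = "{{tag}}";',
}
# Values of this kind would come from the static analysis step.
values = {"class_name": "com.example.app.MainActivity", "tag": "droidsf"}
script = render("hooks.js", templates, values)
```

Keeping the substitution this simple means a template remains a valid script skeleton that can be read and edited on its own, while the include mechanism lets several small hook scripts be combined into one.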
3.1.2 Requirements
As some of our design decisions may have already indicated, our framework requires users to manually
install the following software:
• Android Studio
• Android Software Development Kit (SDK)
• Java Development Kit (JDK)
• Python 3 and its package installer: ‘pip’
We provide more details about the setup and configuration steps in our GitHub repository.
Android Studio is Google’s official Integrated Development Environment (IDE) for authoring Android
applications but it also automates most of the steps required to open, decode, disassemble and decompile
an Application Package Kit (APK) [56].
The major advantage of using Android Studio is that it provides a user-friendly, powerful, all-in-one
solution to manually analyse and debug Android applications. Users just have to click on ‘Profile or
debug APK’ in the starting menu, select the target APK and Android Studio will generate an organized
project containing the application’s source code files and resources.
Android Studio has built-in support for ‘smali’ code and it automatically generates ‘smali’ files from
available ‘dex’ byte-code found in the APK. We can also set breakpoints, control execution flow, monitor
objects, and many other debugging functionalities one would expect from a modern IDE, on applications
running in a connected Android device or in an Android Virtual Device (AVD) emulator.
We require Android Studio to be installed because it includes the Android SDK and allows the
creation/management of AVD emulators.
The AVD manager found in Android Studio is our recommended method to configure an emulator
capable of performing the dynamic analysis of applications.
Android SDK includes tools that interface with the Android platform, such as ‘adb’, ‘fastboot’, and
‘systrace’. Our framework depends on ‘adb’ to be able to interact with Android devices.
The Android SDK depends on a number of tools from the JDK, most notably ‘javac’, which is used in the
first compilation step of an Android application. Usually, the JDK will be installed during the installation
of the Android SDK, which ends up satisfying another requirement of our framework.
We implemented the framework in Python, so it is only natural that Python needs to be installed
and properly configured for users to be able to execute it. We decided to use Python version 3 because,
although still very popular, Python version 2 is getting deprecated and will stop being supported in 2020.
By default, Python’s installer also installs its package manager (‘pip’), which is required to automate the
installation of Python dependencies.
3.1.3 Dependencies
A common golden rule in Software Engineering states that developers should avoid reinventing the wheel
and use what is already available instead.
Our framework depends on many third-party tools to support all the different and complex tasks we
want to perform during the analysis of an Android application.
• APKtool [33]: Can unpack and decode an APK. Produces ‘smali’ code from the ‘dex’ byte-code
found in the APK.
• dex2jar [24]: A set of tools for working with ‘dex’ files. Can be used to convert ‘dex’ byte-code into a
Java Archive (JAR) file.
• enjarify [25]: Translates ‘dex’ byte-code to equivalent Java byte-code. Outputs Java byte-code
inside a JAR file.
• CFR [27]: Java decompiler with support for modern Java features. Takes a JAR file as input and
outputs Java source code.
• Procyon [26]: Suite of Java meta-programming tools focused on code generation, analysis, and
decompilation. Its decompiler will take a JAR file and output Java source code.
• JADX [62]: Command-Line Interface (CLI) tools that can produce Java source code from
Android APK files.
• Frida [50]: Powerful, well-documented, and very popular toolkit built for dynamic instrumentation
of binaries.
All the tools mentioned above are downloaded automatically by our framework when it executes.
Downloads are cached to avoid unnecessary time delays and extra internet traffic.
For our framework to work properly it requires the following Python packages to be installed:
• AndroGuard [63]: Framework built in Python that allows analysis and manipulation of APKs.
• Frida: Python bindings to interact with instrumented processes. It offers a convenient way to
programmatically interact with Frida.
• PyElfTools: Library for parsing and analysing ELF files. It allows our framework to inspect native
libraries commonly found in APKs.
• ConfigArgParse: Library that allows configuration parameters to be read from a file.
• Requests: Library that facilitates web requests.
A more detailed guide of how to setup our framework is available in its official repository.
3.2 Framework Workflow
To be able to achieve our goal of assessing the overall security of an Android application, we integrated
various tools into a single framework to help automate the security assessment process. In this section we
will show users how to use our framework and describe the workflow we implemented to analyse Android
applications.
Listing 3.1: DroidSF framework usage
# Minimal usage
python3 script.py -a /path/to/app.apk
# Using a config file
python3 script.py -a /path/to/app.apk -cf /path/to/config.ini
# Only static analysis
python3 script.py -a /path/to/app.apk --no-dynamic-analysis
# Complete usage
python3 script.py [-h] [-cf CONFIG] [-v] -a APK_FILE
                  [-d {disabled,standard,jadx}] [-s SCRIPT]
                  [-it INSTRUMENTATION_TIMEOUT] [--force] [--force-download]
                  [--no-static-analysis] [--no-dynamic-analysis]
                  [--cache-path CACHE_PATH] [--download-path DOWNLOAD_PATH]
                  [--log-path LOG_PATH] [--output-path OUTPUT_PATH]
                  [--arch {arm,arm64,x86,x86_64}] [--device-id DEVICE_ID]
                  [--dex-converter {dex2jar,enjarify}]
                  [--java-decompiler {cfr,procyon}]
                  [--frida-version FRIDA_VERSION]
                  [--file-exclusions FILE_EXCLUSIONS]
                  [--directory-exclusions DIRECTORY_EXCLUSIONS]
                  [--custom-checks CUSTOM_CHECKS] [--java-home JAVA_HOME]
                  [--android-sdk ANDROID_SDK] [--java-xms JAVA_XMS]
                  [--java-xmx JAVA_XMX]
To take advantage of the available dynamic analysis features, users should start an AVD emulator, or
connect a rooted Android device, before executing our framework.
Performing the dynamic analysis step without a device/emulator visible to ‘adb’ is impossible, causing
the framework to exit gracefully at this step. If users choose to use a physical device, they must ensure
that USB debugging is enabled and give root access to Frida.
Below we provide an overview of each step of the workflow implemented in DroidSF.
APK analysis: We use AndroGuard to analyse the application’s manifest and extract useful
information, e.g., package name, application version and target API level, permissions, activities and
certificates.
APK unpack: Apktool is used to unpack the APK, decode its contents, and disassemble the ‘dex’
files (using baksmali) to generate ‘smali’ code.
DEX decompilation: A combination of ‘dex’ converters (dex2jar/enjarify) and Java decompilers
(cfr/procyon/JADX) can be configured to generate Java source code.
Static analysis: We fully integrated DroidStat-X checks into our framework with various modifica-
tions in order to support Windows. During this step we resort to regular expressions and AndroGuard
to identify code patterns that may indicate a vulnerability.
Export report: Create a text file containing the information collected during the static analysis.
Device/emulator setup: First we push ‘frida-server’ to the device, then we install the APK
currently being tested, and finally we start the ‘frida-server’ process.
Dynamic instrumentation: We tell Frida to spawn the application on the device, then we attach
it to the application process and inject an instrumentation script that contains various hooks to methods
and classes we want to observe.
Interacting with application: During this step we allow users to interact with the application
being instrumented. This allows users to test features in the application, which might trigger hooked
functions. Users can use external tools, like AndroidViewClient and DroidBot, to completely automate
this interaction step.
Export Results: Generate a text file containing results from the instrumentation script.
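The device/emulator setup step in the workflow above can be approximated with a handful of ‘adb’ commands. The sketch below only builds the command lines and does not execute them. The function name, the remote directory, the ‘chmod’ step and the ‘adb root’ call are assumptions based on the usual way of deploying ‘frida-server’ on an emulator; they are not taken from the DroidSF source code.

```python
def device_setup_commands(apk_path, device_id=None,
                          server_binary="frida-server",
                          remote_dir="/data/local/tmp"):
    """Build the adb command lines for a typical frida-server deployment."""
    adb = ["adb"] + (["-s", device_id] if device_id else [])
    remote = remote_dir + "/frida-server"
    return [
        adb + ["root"],                           # restart adbd as root (emulators)
        adb + ["push", server_binary, remote],    # copy frida-server to the device
        adb + ["shell", "chmod", "755", remote],  # make it executable
        adb + ["install", "-r", apk_path],        # install the APK under test
        adb + ["shell", remote + " &"],           # start frida-server in background
    ]


commands = device_setup_commands("app.apk", device_id="emulator-5554")
```

Each entry is a list of arguments that could be passed to `subprocess.run`. On a rooted physical device the first and last commands would typically differ, since root access is usually granted through ‘su’ instead of ‘adb root’.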
3.3 Framework Configuration
We wanted to let users configure virtually every aspect of the framework, while providing a convenient
way for them to define a configuration profile, i.e., reading configuration parameters from a text file that
can be reused. Every configuration parameter available on DroidSF is shown in table 3.1.
Parameter                        Description
-h, --help                       Show help message and exit.
-cf, --config                    Configuration file.
-a, --apk-file                   APK file to analyse.
-d, --decompiler                 Decompile APK to Java source code. Default: disabled.
                                 Choices: disabled, standard, jadx.
                                 Standard uses ‘--dex-converter’ and ‘--java-decompiler’.
-s, --script                     Instrumentation script to execute. Default: class_list.js
-it, --instrumentation-timeout   Time in seconds for Frida instrumentation.
                                 Default: 0 (indefinitely)
--force                          Overrides previously generated results.
--force-download                 Overrides previously downloaded files.
--no-static-analysis             Skip static analysis checks.
--no-dynamic-analysis            Skip dynamic analysis checks.
--cache-path                     Directory where temporary files are saved.
--download-path                  Directory where downloaded files are saved.
--log-path                       Directory where log files are saved.
--output-path                    Directory where generated files are saved.
--arch                           Android device architecture. Default: x86.
                                 Choices: arm, arm64, x86, x86_64
--device-id                      Specify target device ID.
                                 Default: none - list devices interactively.
                                 Use ‘*’ to choose the first device available.
--dex-converter                  DEX to JAR converter. Default: enjarify.
                                 Choices: dex2jar, enjarify
--java-decompiler                JAR to Java decompiler. Default: procyon.
                                 Choices: cfr, procyon
--frida-version                  Specify which Frida version to use. Default: 12.4.4
                                 Note: must match python package version.
--file-exclusions                Ignore these paths/files on static analysis.
--directory-exclusions           Ignore these directories on static analysis.
--custom-checks                  Additional REGEX checks for ‘smali’ code.
--java-home                      Directory that contains Java binaries.
--android-sdk                    Directory that contains Android SDK binaries.
--java-xms                       Initial RAM allocated for Java VM. Default: 128m
--java-xmx                       Maximum RAM allocated for Java VM. Default: 1024m
Table 3.1: DroidSF: Configuration parameters
Our framework sets default values for all parameters, except for ‘-a, --apk-file’, the APK file. Every
parameter can be adjusted and tweaked to customize the analysis of the application.
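To illustrate how a few of the parameters in table 3.1 and their defaults fit together, the following is a minimal `argparse` sketch. It is an assumption about how such a command line could be declared, not DroidSF's actual parser; the program name `droidsf` and the function `build_parser` are hypothetical, and only a subset of the parameters is shown.

```python
import argparse


def build_parser():
    """Declare a subset of the table 3.1 parameters with their defaults."""
    parser = argparse.ArgumentParser(prog="droidsf")
    # The APK file is the only parameter without a default value.
    parser.add_argument("-a", "--apk-file", required=True,
                        help="APK file to analyse.")
    parser.add_argument("-d", "--decompiler", default="disabled",
                        choices=["disabled", "standard", "jadx"])
    parser.add_argument("-it", "--instrumentation-timeout", type=int, default=0,
                        help="Seconds of instrumentation (0 = indefinitely).")
    parser.add_argument("--arch", default="x86",
                        choices=["arm", "arm64", "x86", "x86_64"])
    parser.add_argument("--device-id", default=None,
                        help="Device ID, or '*' for the first device available.")
    parser.add_argument("--dex-converter", default="enjarify",
                        choices=["dex2jar", "enjarify"])
    parser.add_argument("--java-decompiler", default="procyon",
                        choices=["cfr", "procyon"])
    return parser
```

Parsing `["-a", "app.apk"]` with this parser yields the default profile described in the next sub-section: no decompilation, an x86 target and no instrumentation timeout.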
3.3.1 Default settings
The default configuration assumes that the instrumentation will run on an x86 emulator. We recommend
using the AVD Manager, included in Android Studio, to create an emulator with the following
characteristics:
• Nexus 5X
• x86 images: Oreo - API level 27 - ABI x86
This setup worked consistently during our tests with the current release of Frida (12.4.4). We expect
the framework to work on other emulators/devices as long as Frida is able to run on them.
Using just the ‘-a, --apk-file’ parameter to indicate which APK to analyse, our framework will
disassemble ‘dex’ to ‘smali’, skip decompilation to Java source code, run the DroidStat-X static analysis, ask the user
which device to use, install and launch the APK on the device, and inject the instrumentation script. At
this point the user can manually interact with the application for as long as they want. Users have to
manually terminate the instrumentation to let DroidSF process and export all the output from Frida’s
instrumentation process.
3.3.2 Fully automated
To fully automate the analysis process, users must specify the following configuration
parameters:
• --device-id: Use a specific device ID, or use ‘*’ to automatically select the first device available.
• --instrumentation-timeout: Set a maximum amount of time for the instrumentation process.
In contrast with the scenario described in Default settings, providing these two configuration
parameters ensures that the whole analysis process does not require manual intervention.
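The effect of these two parameters can be sketched as follows. This is an illustrative sketch under our own assumptions: `select_device` and `needs_manual_intervention` are hypothetical helpers, and the list of device IDs is assumed to come from a tool such as `adb devices`.

```python
def select_device(available_ids, requested=None, prompt=input):
    """Return the device ID to use.

    requested=None -> list the devices and ask the user (default settings)
    requested='*'  -> first device available
    otherwise      -> the requested ID, if it is connected
    """
    if not available_ids:
        raise RuntimeError("no devices available")
    if requested == "*":
        return available_ids[0]
    if requested is None:
        for index, device_id in enumerate(available_ids):
            print(f"[{index}] {device_id}")
        return available_ids[int(prompt("Device number: "))]
    if requested not in available_ids:
        raise ValueError(f"device {requested!r} is not connected")
    return requested


def needs_manual_intervention(device_id, instrumentation_timeout):
    """True when a user must pick a device or stop the instrumentation."""
    return device_id is None or instrumentation_timeout == 0
```

For example, `needs_manual_intervention("*", 60)` is false, which corresponds to the fully automated scenario.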
As mentioned in the overview of the framework’s workflow, users can leverage external tools
designed to automate the interaction with the application.
During our tests we experimented with DroidBot [64], a lightweight test input generator for Android, with
good results. It systematically explores the application while interacting with it in a way similar
to how a human would. We decided not to incorporate DroidBot in our framework and to leave it as an optional
tool that users can choose to use during the instrumentation process.
3.4 Static analysis
In this section we describe how our framework performs the static analysis of application code in search
of vulnerabilities.
41
We will briefly introduce Apktool and AndroGuard before going into detail about the static analysis
checks performed by DroidStat-X.
3.4.1 APKtool
APKtool is a CLI tool for reverse engineering third-party binary Android applications. It can decode
resources to nearly original form and rebuild them after making some modifications [33]. It outputs a
project-like file structure containing the APK’s contents, and automates some repetitive tasks, like rebuilding
disassembled resources back to an APK.
A feature of APKtool that is particularly relevant to our framework is its ability to disassemble (baksmali) the ‘dex’
bytecode found inside the APK and output ‘smali’ code. This process is faster than performing a full
decompilation to Java source code and offers a good intermediate code representation in which we can
search for code patterns.
We can inspect and modify the ‘smali’ code. It is even possible to replace whole classes by generating
‘smali’ from new Java source code. Once all the modifications are done, one can easily package the APK
back up with APKtool again. It is worth noting that the resulting APK is not signed by APKtool.
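As an illustration, the disassembly step can be driven from Python by calling the `apktool` command-line tool. This is a minimal sketch that assumes `apktool` is available on the `PATH`; the helper names `apktool_decode_command` and `decode_apk` are hypothetical and do not come from DroidSF.

```python
import shutil
import subprocess


def apktool_decode_command(apk_path, output_dir, force=False):
    """Build the 'apktool d' command that decodes an APK into smali."""
    command = ["apktool", "d"]
    if force:
        command.append("-f")              # overwrite an existing output directory
    command += ["-o", output_dir, apk_path]
    return command


def decode_apk(apk_path, output_dir, force=False):
    """Run apktool and raise if it is missing or exits with an error."""
    if shutil.which("apktool") is None:
        raise FileNotFoundError("apktool was not found on PATH")
    subprocess.run(apktool_decode_command(apk_path, output_dir, force), check=True)
```

After a successful run, the ‘smali’ files are found in sub-directories of the output directory, ready to be searched.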
3.4.2 AndroGuard
AndroGuard is a pure Python framework to experiment and analyse Android files [63]. It supports:
• DEX / ODEX files
• APK files
• Android’s binary XML format
• Android binary encoded resources
• Disassemble DEX/ODEX byte-code
• Decompiler for DEX/ODEX files
Developers can either use the CLI or use AndroGuard purely as a library for their own tools and
scripts. Below you find some of the most notable CLI tools developed by AndroGuard:
• androcg.py: generates call graphs.
• androdd.py: generates control flow graphs.
• androdis.py: disassembler for ’dex’ files.
• androlyze.py: starts an IPython shell with all modules loaded.
• androgui.py: androguard graphical user interface.
One of the easiest ways to analyse an APK file is to start an interactive Python shell using
‘androlyze.py’ [65].
To load and analyse APK or ‘dex’ files, we can use ‘AnalyzeAPK(filename)’ and ‘AnalyzeDex(filename)’
respectively. ‘AnalyzeAPK’ is a wrapper function that returns three objects: an APK object (a), a Dalvik-VM
format object (d) and an Analysis object (dx).
Inside the APK object (a) you can find all information about the application, e.g., package name,
permissions, certificates, the AndroidManifest.xml and its resources.
The Dalvik-VM format (d) corresponds to the ‘dex’ file found inside the APK file. We can fetch
classes, methods or strings from the ‘dex’ file, but when analysing ‘multi-dex’ the Analysis object (dx)
should be used instead, as it contains special classes to handle these settings.
The Analysis object (dx) allows us to follow the call flow using cross-references (XREFs), which are
generated for four things: Classes, Methods, Fields and Strings.
Cross-references (XREFs) work in two directions, meaning that we can navigate to the object that
called the current object (xref_from), or navigate to another object that is being called by the current one
(xref_to). This is a very powerful feature, since it can be used to produce function call and control flow
graphs.
We use AndroGuard in our framework essentially to retrieve information from the APK. Some of this
information will later be used to customize the dynamic analysis process.
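A minimal sketch of this kind of information retrieval is shown below. The function `summarize_apk` is hypothetical and is not part of DroidSF; it only assumes that its argument offers the getter methods of AndroGuard's APK object (a).

```python
def summarize_apk(a):
    """Collect basic facts from an AndroGuard APK object."""
    return {
        "package": a.get_package(),
        "version_name": a.get_androidversion_name(),
        "version_code": a.get_androidversion_code(),
        "min_sdk": a.get_min_sdk_version(),
        "target_sdk": a.get_target_sdk_version(),
        "permissions": sorted(a.get_permissions()),
    }


# Usage (requires the androguard package and a real APK file):
#   from androguard.misc import AnalyzeAPK
#   a, d, dx = AnalyzeAPK("app.apk")
#   print(summarize_apk(a))
```

Values such as the package name are the kind of information that can later be passed to the dynamic analysis, for instance to spawn the right application.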
3.4.3 DroidStat-X
DroidStat-X [19] is a Python framework that generates an XMind map with all the information gathered
and any evidence of possible vulnerabilities identified via static analysis.
The XMind map is structured following the OWASP Mobile Top 10 2016 categories. We provide an
example in figure 3.1. Each category has various topics that security testers should cover, to guarantee
and highlight coverage. Each topic has a URL to the respective chapter in OWASP’s Mobile Security
Testing Guide explaining the vulnerability and how to confirm its existence.
We built the DroidSF framework based on DroidStat-X because of its comprehensive testing
methodology and its solid collection of checks performed on the application’s ‘smali’ code. Our first steps consisted
of modifying several aspects that were not compatible with the requirements we had established for
DroidSF.
The first thing we decided to do was to migrate the DroidStat-X code base to Python 3. We also took the
opportunity to define a standard code style and made sure everything was uniform.
Figure 3.1: XMind Map generated by DroidStat-X
To respect DroidSF’s multi-platform design, the second major change consisted of replacing
DroidStat-X dependencies with OS-agnostic alternatives. DroidStat-X depends on common software used
in many Linux systems: ‘grep’, ‘sed’, ‘readelf’ and ‘dd’.
• ‘grep’ / ‘sed’: are mostly used to search for regular expressions in the ‘smali’ code output by APKtool.
We built functionality equivalent to these tools using only pure Python. We also measured the
difference in performance to make sure we were not making things slower.
• ‘readelf’ / ‘dd’: were only used to extract Xamarin DLLs from native libraries. We have
implemented an alternative method in pure Python to achieve this, which only requires the package
PyElfTools. Our alternative method turned out to be much faster than
the original method. We also tested the extracted DLLs with a .NET decompiler, and they
were properly decompiled.
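A pure Python replacement for the ‘grep’ searches can be sketched as follows. This is a minimal sketch under our own assumptions, not the code shipped with DroidSF: the function name `grep_smali` and its parameters are hypothetical.

```python
import os
import re


def grep_smali(root_dir, pattern, excluded_dirs=()):
    """Yield (path, line_number, line) for every smali line matching pattern."""
    regex = re.compile(pattern)
    for current_dir, dir_names, file_names in os.walk(root_dir):
        # Prune excluded directories in place so os.walk skips them.
        dir_names[:] = [name for name in dir_names if name not in excluded_dirs]
        for file_name in file_names:
            if not file_name.endswith(".smali"):
                continue
            path = os.path.join(current_dir, file_name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line_number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, line_number, line.rstrip("\n")
```

Because the search relies only on the standard library, it behaves the same on Linux, macOS and Windows, which is the property the migration was after.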
The last major change we made to DroidStat-X consisted of decoupling the XMind map
generation. We wanted to have a simple textual report output, instead of depending on external software
and an SDK.
3.4.4 Implemented tests
Below we describe the information that can be obtained from the DroidStat-X module found in DroidSF:
• Package Name
• Version Name and Code
• APK file SHA256 hash
• Minimum/Target SDK Version (API level)
• Determine if the backup option is enabled
• Determine if the package is ‘multi-dex’
• Export Permissions and permission levels
• Determine APK signing scheme and used certificates
• Identify signature files
• Check for presence of secret codes in IntentFilters
• Exported Components with respective IntentFilters and Permissions
• List files contained in the APK
Technology/Framework fingerprinting:
• OutSystems: test if certain known classes are present in the application.
• Cordova: identify usage and lists used plugins.
• Xamarin: identify usage and extract its DLLs automatically.
Vulnerability testing:
• Object Usage
– WebViews loadUrl method
– Cryptography Functions
• Improper Platform Usage: Components security related checks
– Activities vulnerable to Fragment Injection
– Lack of ‘FLAG_SECURE’ or ‘android:excludeFromRecents’ in Activities
– Path Traversal in exported ContentProviders
– SQL Injection in exported ContentProviders
• Reverse Engineering: Package security related checks
– Determine if the application is debuggable.
• Improper Platform Usage: WebViews security related checks
– Usage of addJavascriptInterface in WebViews
(on API level < 17, this might indicate a Remote Code Execution vulnerability)
– Usage of Javascript enabled WebViews
– Usage of fileAccess enabled WebViews
– Usage of UniversalAccessFromFileURLs enabled WebViews
• Insecure Communication Topic: TLS security related checks
– Vulnerable TrustManagers
– Vulnerable HostnameVerifiers
– Webviews Vulnerable onReceivedSslError Method
– Direct usage of Socket without HostnameVerifier
– Determine the usage of Certificate Pinning (okHTTP and custom implementations)
– Determine the usage of NetworkSecurityConfig file (API level ≥ 24)
∗ Check if clear-text is allowed
∗ Check if Certificate Pinning is enabled
∗ Validate Certificate Pinning expiration date
∗ Determine if User CA’s are trusted
• Insufficient Cryptography: Cryptography security related checks
– Usage of AES with ECB cryptography functions
– Usage of DES or 3DES cryptography functions
– Determine the usage of Android Keystore - no usage may indicate vulnerabilities
This information is stored inside the DroidStatX class, so it can easily be accessed to complement the
dynamic analysis process. All the information collected during the static analysis is exported to a text
file once it finishes, for later analysis.
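To give an idea of how a check from the Insufficient Cryptography category can work on ‘smali’ code, the sketch below looks for string constants that name weak cipher transformations. The regular expression and the function `find_weak_ciphers` are our own assumptions for illustration; they are not the expressions used by DroidStat-X.

```python
import re

# Matches smali string constants, e.g.  const-string v0, "AES/ECB/PKCS5Padding"
CONST_STRING = re.compile(r'const-string(?:/jumbo)?\s+\w+,\s*"([^"]+)"')


def find_weak_ciphers(smali_text):
    """Return string constants that name AES in ECB mode, DES or 3DES."""
    findings = []
    for value in CONST_STRING.findall(smali_text):
        parts = value.upper().split("/")
        algorithm = parts[0]
        mode = parts[1] if len(parts) > 1 else ""
        if algorithm in ("DES", "DESEDE") or (algorithm == "AES" and mode == "ECB"):
            findings.append(value)
    return findings
```

A match is only evidence that deserves a manual look: the heuristic does not confirm that the constant actually reaches a cipher instantiation.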
3.5 Dynamic Binary Instrumentation
This process was the hardest to automate as it required interaction with multiple external tools and
making sure the device/emulator is properly configured.
As we have indicated in section 3.2, we integrated Frida’s powerful instrumentation framework, in
order to perform the dynamic analysis of applications. We take information obtained from the static
analysis step to generate specialized hooks and customize the instrumentation script.
3.5.1 Frida
Frida [50] is an immensely powerful toolkit, used to build scripts for dynamic instrumentation of
applications. It lets security testers inject snippets of JavaScript, or even complete libraries, into native
applications running on Windows, macOS, GNU/Linux, iOS, Android, and QNX.
The core of Frida is written in C and injects Google’s V8 engine into the target processes, where
injected snippets of JavaScript get executed with full access to memory, can hook functions, and can even
call native functions inside the process. A bi-directional communication channel is also established
with the instrumented process, allowing users to interact with the JavaScript script running inside the
target process.
Frida also provides some simple CLI tools built on top of the Frida API. These can be used
as-is, tweaked to user needs, or serve as examples of how to use Frida’s Application Programming
Interface (API).
There are essentially two approaches one can take to instrument Android applications using Frida.
The first approach requires users to have a rooted Android device, and consists of running
‘frida-server’ as root (it uses ‘ptrace’ internally) on the device. There are several reasons for choosing this
approach, the most important one being that it does not require repackaging the APK we want to
instrument. After attaching to the target application, ‘frida-server’ injects the Frida agent library into
the memory space of the process.
The second approach does not require a rooted Android device, but requires users to
repackage the APK to include the ‘frida-gadget’ library. This approach can be useful because it can
avoid possible side-effects on applications that implement ptracing/debugging checks, but on the other
hand, it can trigger checks against repackaging [66].
Since we are interested in emulators, which are necessarily rooted, and we wanted to avoid repackaging
APKs in order to instrument them, we decided our framework would follow the first approach.
3.5.2 Included instrumentation scripts
We decided to include Frida hooks capable of monitoring the following Android APIs:
• Bluetooth
• Clipboard
• Cryptography Cipher-suites
• Cryptography Hashing functions
• Database (includes SQLiteDatabase)
• File-system Input/Output
• SharedPreferences
• Local Storage
• FlagSecure
• IPC
• Networking/Communications (HTTP/HTTPS)
• System calls to ‘libc’ native library
• WebView usage
These hooking scripts can be helpful to perform call stack traces. They can even be used to develop
a behaviour-based analysis that takes into account every system call performed to produce a profile of
the application.
Taking advantage of the template system we implemented in our framework, it is possible for security
testers to merge various instrumentation scripts into a single script that will be injected into the application.
This feature is very important since it allows scripts to be built in a modular fashion, enabling re-usability,
while avoiding code duplication.
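The idea of merging modular hook scripts can be sketched with Python's `string.Template`. This is only an illustration of the concept: the wrapper, the function `merge_scripts` and the way sections are labelled are hypothetical and do not describe DroidSF's actual template system. The only Frida-specific assumption is that Java hooks run inside `Java.perform()`.

```python
from string import Template

# Hypothetical wrapper: all hook snippets run inside a single Java.perform().
WRAPPER = Template("Java.perform(function () {\n$body\n});\n")


def merge_scripts(script_bodies):
    """Combine several hook snippets into one script ready for injection.

    `script_bodies` maps a script name to its JavaScript source.
    """
    sections = []
    for name, body in script_bodies.items():
        # Label each section so the merged script stays readable.
        sections.append(f"// ---- {name} ----\n{body.strip()}")
    return WRAPPER.substitute(body="\n\n".join(sections))
```

Each snippet stays in its own file and can be reused in several merged scripts, which is what keeps duplication low.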
We have also included in DroidSF some basic instrumentation scripts meant to be used as guides:
• class_list.js: Outputs the name of every class loaded by the application.
• some_class.js: Shows how to manipulate classes and create instances at runtime.
• change_method.js: Shows how to modify the implementation of a class method.
• anti_re.js: Overrides the ‘java.lang.System.exit()’ method to prevent applications from exiting.
• rpc.js: Example of how to build an interactive RPC script that lets users arbitrarily call its methods
during the instrumentation process.
Frida’s Javascript API offers two ways for the script to send data back to our framework: ‘send(data)’
and ‘console.log(data)’. In order to get information sent by the ‘send()’ method, we have to specify a
message handler that will be responsible for parsing the data.
Since each script will have slightly different outputs, we created a mechanism that allows users to
specify which functions are responsible for handling the output, and, at the end of the instrumentation,
export the collected data.
This mechanism is illustrated in listing 3.2. Although it is fully functional, we realized it
could be improved and refactored to completely abstract the instrumentation testing suite, i.e., the script,
message handlers, data analysis and export of relevant results.
Listing 3.2: Mechanism to handle Frida’s output

    app_class_list = []

    def parse_class_list(message, data):
        if message['type'] == 'send':
            app_class_list.append(message['payload'])

    def export_class_list(apk):
        filename = apk.output_name + "-class_list.txt"
        droidsf.utils.export_file(apk.output_path, filename, app_class_list)
        log.info("Exported class list: %s", filename)

    on_message_handlers = {
        "class_list.js": parse_class_list,
        # ...
    }

    on_resume_handlers = {
        "class_list.js": export_class_list,
        # ...
    }

    # ...
    # Specify the message handler
    if args.script in on_message_handlers:
        script.on('message', on_message_handlers[args.script])
    else:
        script.on('message', on_message)

    # ...
    # Specify the result parsing handler
    if args.script in on_resume_handlers:
        on_resume_handlers[args.script](apk)

Listing 3.2 describes how DroidSF is able to handle the execution of the default instrumentation
script: ‘class_list.js’.
During the instrumentation process, every message sent by the script ‘class_list.js’ will be parsed by
‘parse_class_list()’, and when the process terminates ‘export_class_list(apk)’ is called to analyse and export
the results.
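One possible shape for the refactoring mentioned above is to bundle the script name, its message handler and its export step in a single object. The following is a hypothetical sketch of that idea, not code from DroidSF; the class `InstrumentationSuite` and the registry `SUITES` are invented for illustration.

```python
class InstrumentationSuite:
    """Bundles a script with its message handler and its export step."""

    def __init__(self, script_name, output_suffix):
        self.script_name = script_name
        self.output_suffix = output_suffix
        self.collected = []

    def on_message(self, message, data):
        # Same filtering as the parse handler in listing 3.2:
        # keep only payloads sent with send().
        if message["type"] == "send":
            self.collected.append(message["payload"])

    def export(self, output_name):
        """Return (filename, text); writing to disk is left to the caller."""
        filename = output_name + self.output_suffix
        return filename, "\n".join(str(item) for item in self.collected)


# One registry replaces the two handler dictionaries of listing 3.2.
SUITES = {
    "class_list.js": InstrumentationSuite("class_list.js", "-class_list.txt"),
}
```

With this layout, the framework would register `SUITES[name].on_message` with `script.on('message', ...)` and call `export()` once the instrumentation ends.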
4 Evaluating the Solution
Contents
4.1 Methodology
4.2 Comparison with other tools
4.3 Selected testing applications
4.4 OWASP Methodology
4.5 Limitations
We created a framework with the ability to automate several Reverse Engineering (RE) tasks and perform
a security analysis of Android applications, through the use of several existing tools. Our focus was on
building a framework that could implement most of the approaches described throughout this dissertation.
Evaluating the techniques, methodologies and tools we discussed in the present report is not a
straightforward task. Having no simple way to directly compare results with other frameworks with similar goals and
motivations, we focused on evaluating the performance and basic ability to perform certain tasks.
4.1 Methodology
To evaluate our framework we devised a three-step process.
The first step in our tests was to make sure that our framework had the ability to consistently
disassemble, decompile and run static analysis tests.
Taking advantage of the findings produced by static analysis, the second step in our tests assessed
the ability to customize the instrumentation code to the application under analysis.
Afterwards, we attempted to run the application on our Android Virtual Device (AVD) environment. If
we were able to execute the application, the final test consisted of injecting the instrumentation code
and, after some interaction with the application, exporting the results.
4.2 Comparison with other tools
Our framework, DroidSF, actively seeks to reduce the time spent setting up a working test environment
for testing Android applications. It takes very little time to configure, can be easily installed on
Linux, macOS or Windows, and provides static and dynamic analysis capabilities.
4.2.1 Apkx
Apkx [60] is a Python wrapper to popular free ‘dex’ converters and Java decompilers. It extracts Java source
code directly from the APK and can be useful for experimenting with different converters/decompilers
without having to worry about ‘classpath’ settings and configuration parameters.
Apkx is the recommended tool to accompany the Open Web Application Security Project (OWASP)
- Mobile Security Testing Guide, but it has not been updated in a while and some of the tools included
with it are outdated.
Our framework, DroidSF, could easily replace Apkx since it offers the same decompilation
functionalities. It will also automatically download new versions of the third-party tools employed, guaranteeing
that they will remain up-to-date with current releases.
Using DroidSF instead of Apkx also allows users to configure the amount of RAM available to the Java
Virtual Machine (VM) executing the third-party tools, which can be important when analysing large
applications.
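The RAM limits map onto the standard JVM options `-Xms` and `-Xmx`. The sketch below shows how a command line for a Java-based third-party tool could be assembled from them; the function `java_command` is hypothetical and the JAR names in the usage note are placeholders.

```python
import re

# Accepts values such as 128m, 1024m or 2g.
_SIZE = re.compile(r"^\d+[kKmMgG]?$")


def java_command(jar_path, tool_args=(), xms="128m", xmx="1024m", java_bin="java"):
    """Build a command that runs a JAR with explicit heap limits."""
    for value in (xms, xmx):
        if not _SIZE.match(value):
            raise ValueError(f"invalid JVM memory size: {value!r}")
    return [java_bin, f"-Xms{xms}", f"-Xmx{xmx}", "-jar", jar_path, *tool_args]
```

For a large application, a call such as `java_command("decompiler.jar", ["classes.jar"], xmx="4g")` raises the maximum heap while keeping the default initial allocation.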
4.2.2 Objection
Objection [61] is yet another framework built upon Frida (see sub-section 3.5.1). It provides an interactive
framework for security testing of mobile applications, with the particularity that it utilizes TypeScript
to generate the JavaScript scripts that Frida injects. This creates an abstraction level that may hinder
maintainability and produces a steeper learning curve for security testers. It effectively makes it harder
for developers to start manipulating the code base and contribute to the framework.
Another downside is that Objection was designed and created to be used in an interactive
exploratory session, whereas we wanted a framework with the ability to process batches of
APKs autonomously.
4.2.3 AppMon
AppMon [18] is an automated framework for monitoring and tampering with system API calls of native macOS,
iOS and Android applications. It is also based on Frida (see sub-section 3.5.1) and can automate
the process of unpacking the APK, adding the ‘frida-gadget’ shared library and repackaging it.
There are several sub-components of this project, each provides developers with some short-cuts and
predefined recipes to perform some useful reverse engineering tasks:
• AppMon Sniffer - Intercept API calls to figure out interesting operations performed by an applica-
tion.
• AppMon Intruder - Manipulate API call data to change the application’s original behaviour.
• AppMon Android Tracer - Automatically traces Java classes, methods, its arguments and their
data-types in APKs.
• AppMon IPA Installer - Creates and installs “inspectable” IPAs on non-jailbroken iOS devices.
• AppMon APK Builder - Creates APKs “inspectable” on non-rooted Android devices.
Although AppMon is another very powerful framework for security testing of mobile applications,
it does not focus on Android, and it is geared towards injecting Frida into an APK rather than taking
advantage of rooted devices.
This tool supports macOS, iOS and Android and it is designed to automate monitoring and tampering
of system API calls. Taking into account everything we researched
about RE techniques, it is our opinion that the broader the scope of software we are analysing, the harder
it is to be precise and keep complexity low. To avoid this issue, we focused on developing a specialized
framework for Android.
4.3 Selected testing applications
Most of the applications were selected through an exhaustive search, to maximize the number of
implemented checks covered by our tests.
We wanted to ensure the correctness and completeness of the checks implemented in our framework.
To do so, we required applications that exposed vulnerabilities and bad security practices. We searched
through various sample applications and found some that were built especially for the purpose of training
security testers.
Intentionally vulnerable Android applications:
• InsecureBank v.2
https://github.com/dineshshetty/Android-InsecureBankv2
• PIVAA v.1
https://github.com/HTBridge/pivaa
• DVHMA-FeatherWeight v.6.3.0
https://github.com/logicalhacking/DVHMA
• DVHMA-OpenUI v.6.3.0
https://github.com/logicalhacking/DVHMA
• Sieve v.2.3.4
https://github.com/mwrlabs/drozer/releases/
• OWASP MSTG Challenges
Small applications built as didactic examples to accompany the OWASP: Mobile Security Testing
Guide. https://github.com/OWASP/owasp-mstg/tree/master/Crackmes
These applications allowed us to assess if some features of our analysis were working properly.
For instance, we knew that ‘DVHMA-OpenUI’ was built on Apache Cordova and we wanted to check
if our framework was able to correctly detect this.
                      Disassembly  Decompilation  Static Analysis  Dynamic Analysis
InsecureBank          X            X              X                X
PIVAA                 X            X              X                X
DVHMA-FeatherWeight   X            X              X                X
DVHMA-OpenUI          X            X              X                X
Sieve                 X            X              X                X
Table 4.1: DroidSF basic tests - April 2019
Successful: X, Failed: X
4.4 OWASP Methodology
OWASP is a worldwide not-for-profit charitable organization focused on improving the security of software
[67].
This methodology is based on OWASP’s Top 10 mobile application vulnerabilities of 2016, which many
developers use as the standard source for information on how to test the security of mobile applications.
Important areas we need to analyse in a mobile application to evaluate its overall security level are [68]:
M1 Improper Platform Usage - This category covers misuse of a platform feature or failure to use
platform security controls. It might include Android intents, platform permissions, or some other
security control that is part of the mobile operating system.
M2 Insecure Data Storage - This covers insecure data storage and unintended data leakage.
M3 Insecure Communication - This covers poor handshaking, incorrect SSL versions, weak key negoti-
ation, clear-text communication of sensitive assets, etc.
M4 Insecure Authentication - This category captures notions of authenticating the end user or bad
session management.
M5 Insufficient Cryptography - Investigate the code that applies cryptography to a sensitive information
asset. This category is for issues where cryptography was attempted, but it wasn’t done correctly.
M6 Insecure Authorization - This is a category to capture any failures in authorization (e.g.: autho-
rization decisions in the client side, forced browsing, etc.).
M7 Client Code Quality - A catch-all category for code-level implementation problems in the mobile
application client. This would capture things like buffer overflows, format string vulnerabilities,
and various other code-level mistakes in the client.
M8 Code Tampering - Covers binary patching, local resource modification, method hooking, method
swizzling, and dynamic memory modification.
M9 Reverse Engineering - Analysis of the final binary to determine its source code, libraries, algorithms,
and other assets. This may be used to exploit other nascent vulnerabilities in the application, as
well as revealing information about back end servers, cryptography constants and ciphers, and
intellectual property.
M10 Extraneous Functionality - Developers may have included hidden backdoors or other internal de-
velopment security controls that are not intended to be released into a production environment.
We ran the selected test applications through the DroidSF framework to identify potential
vulnerabilities and obtained the following results:
                      M1  M2  M3  M4  M5  M6  M7  M8  M9  M10
InsecureBank          X   X   X   X   X   X   X   X   X   X
PIVAA                 X   X   X   X   X   X   X   X   X   X
DVHMA-FeatherWeight   X   X   X   X   X   X   X   X   X   X
DVHMA-OpenUI          X   X   X   X   X   X   X   X   X   X
Sieve                 X   X   X   X   X   X   X   X   X   X
Table 4.2: DroidSF findings - April 2019
Detected vulnerability: X, No vulnerability found: X
4.5 Limitations
Around 98% of Android mobile devices use ARM CPUs [69]. Because of this, a growing number of
developers choose not to include native libraries compiled for x86/amd64 in the APK, which effectively
prevents their applications from being executed natively on x86/amd64 CPUs. Android will even refuse
to install the application if it does not match the architecture of the system on which it is being installed.
Because emulating the ARM architecture on x86/amd64 CPUs introduces severe performance losses,
we conducted our tests using an AVD emulator running an x86_64 Android ROM.
If developers do not possess a rooted ARM Android device, their only option is to configure an
AVD emulator to use an Android ARM ROM and work through the very slow emulation process over
x86/amd64, while hoping that none of the processes crash.
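Whether an APK is affected by this limitation can be estimated before installation by listing the ABIs for which it bundles native libraries, since these are stored under ‘lib/<abi>/’ inside the APK archive. The sketch below is an illustration under that assumption; `native_abis` and `can_run_natively` are hypothetical helpers, not DroidSF functions.

```python
import zipfile


def native_abis(apk_file):
    """Return the set of ABIs for which the APK bundles native libraries."""
    abis = set()
    with zipfile.ZipFile(apk_file) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Native libraries are stored as lib/<abi>/<name>.so
            if len(parts) == 3 and parts[0] == "lib" and parts[2].endswith(".so"):
                abis.add(parts[1])
    return abis


def can_run_natively(apk_file, device_abi):
    """Estimate whether the APK can run without architecture translation."""
    abis = native_abis(apk_file)
    # APKs without native code do not depend on the CPU architecture.
    return not abis or device_abi in abis
```

An APK that only ships ‘armeabi-v7a’ or ‘arm64-v8a’ libraries would fail this check for an ‘x86’ emulator, signalling that an ARM device or ARM emulator image is needed.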
Another limitation in our framework has to do with interacting with the target application when
running the instrumentation checks. We can set up hooks to methods and classes, but we will not
necessarily see them being called. This happens because the application’s logic flow might not execute these
methods.
DroidBot [64] performs an exhaustive and systematic interaction with the application, but this still
does not guarantee that its interaction with the application will trigger the hooked functions we wanted
to inspect. Ideally one could trace APIs and system calls in an attempt to identify the call stack required
to reach the desired functions, but the process is not consistent and can be very time-consuming.
5 Conclusion
Contents
5.1 Future Work
5.2 Conclusions
5.1 Future Work
We struggled to implement in DroidSF everything we wanted to. Security testing of mobile applications
involves such a large variety of topics that we had to focus on implementing essential features for our
framework.
The DroidSF framework is fully functional but there are plenty of features and improvements we did
not find the time to implement. Below we suggest some new features and improvements we think would
complement the current framework.
• Improve the reports generated by static analysis to categorize each test following the
OWASP Top 10 structure.
• Implement more checks for static and dynamic analyses.
• Improve efficiency of dynamic analysis using the information from static analysis.
• Migrate some of the checks performed in ‘smali’ code, to use Androguard’s APIs and perform
in-memory regular expression searches.
• Implement dynamic analysis hooks to detect packers and to intercept SSL communications.
• Implement static analysis checks to detect anti-tampering measures.
• Implement checks to analyse native libraries included in APKs.
• Refactor the mechanism we use to handle output from Frida’s instrumentation.
• Integrate AndroidViewClient or DroidBot directly into the framework to streamline the dynamic
analysis and to perform an exhaustive coverage of all possible interactions.
• Automate interaction with the AVD manager: add the possibility of creating,
starting, stopping and deleting AVD emulators directly from DroidSF.
• Develop a custom lightweight Android emulator using Docker to avoid having to install Android
Studio.
• Add a web interface using the Flask Python package to allow users to submit APKs and check the results of
analysed APKs.
5.2 Conclusions
The security testing of mobile applications is a relatively new area, and it has proved to be a very
interesting and challenging area due to its dynamic and ever-evolving ecosystem.
We feel it is of a tremendous importance to maintain mobile devices secure given the importance
of the data and functionality they hold. To this end, we understand the purpose behind the thorough
scrutiny that official application stores do on every application submitted.
While working on this thesis we understood how to leverage existing tools to perform the kind of
analysis an official application store does to submitted applications. Of course, we do not know for sure
which checks Google, Apple and other companies employ to detect vulnerable applications, but we are
confident we took a step in the right direction.
The amount of information we found on topics such as the Android platform, vulnerability detection,
malware detection and reverse engineering was enormous. One of our biggest challenges was to filter
out which information was relevant to this thesis.
It became clear to us during our research that, no matter how complex certain security features are,
bad actors will always try to find new ways to circumvent them. Even though the Android documentation
presents many standard security practices for developers to follow, there is always room for human error,
which reinforces the need for an automated testing framework that can identify problems before
the application is deployed.
We believe we were successful in creating a flexible framework, built around robust multi-platform
software, that can be extended to perform very complex tasks. We also believe that most developers, from
students to expert security testers, will find the DroidSF framework useful for conducting multi-pronged
analysis of Android mobile applications. It still has many areas that require improvement, but we are
satisfied with what we have built so far.
Bibliography
[1] Google, “Android platform guide,” https://developer.android.com/guide/platform, accessed:
10/04/2019.
[2] OWASP, “Android platform overview,” https://mobile-security.gitbook.io/
mobile-security-testing-guide/android-testing-guide/0x05a-platform-overview, accessed:
20/10/2018.
[3] A. Frumusanu, “A diagram of the android runtime architecture,” https://commons.wikimedia.org/
wiki/File:ART_view.png, 2014, accessed: 09/04/2019.
[4] J. Huang, “Practice of android reverse engineering,” https://www.slideshare.net/jserv/
practice-of-android-reverse-engineering, accessed: 18/05/2017.
[5] Google, “Activity lifecycle,” https://developer.android.com/guide/components/activities/
activity-lifecycle.html, accessed: 26/04/2019.
[6] Google, “Safetynet attestation api,” https://developer.android.com/training/safetynet/attestation.
html, accessed: 26/04/2019.
[7] Ericsson, “The ericsson mobility report,” https://www.ericsson.com/en/mobility-report/reports/
november-2018/key-figures, accessed: 19/03/2019.
[8] M. Egele, T. Scholte, E. Kirda, and C. Kruegel, “A survey on automated dynamic malware-analysis
techniques and tools,” ACM computing surveys (CSUR), vol. 44, no. 2, p. 6, 2012.
[9] A. Moser, C. Kruegel, and E. Kirda, “Exploring multiple execution paths for malware analysis,” in
2007 IEEE Symposium on Security and Privacy (SP’07). IEEE, 2007, pp. 231–245.
[10] Symantec, “Internet security threat report vol 23,” https://www.symantec.com/content/dam/
symantec/docs/reports/istr-23-executive-summary-en.pdf, 2018, accessed: 09/01/2019.
[11] Symantec, “Internet security threat report vol 24,” https://www.symantec.com/security-center/
threat-report, 2019, accessed: 10/04/2019.
[12] StatCounter, “Os worldwide market share,” http://gs.statcounter.com/os-market-share/mobile/
worldwide, accessed: 09/04/2019.
[13] K. Lab, “Mobile malware evolution 2018,” https://securelist.com/mobile-malware-evolution-2018/
89689/, accessed: 10/04/2019.
[14] R. Amadeo, “Android 9 pie, thoroughly reviewed,” https://arstechnica.com/gadgets/2018/09/
android-9-pie-thoroughly-reviewed/, Sep. 2018, accessed: 05/04/2019.
[15] D. R. Thomas, A. R. Beresford, T. Coudray, T. Sutcliffe, and A. Taylor, “The lifetime of android api
vulnerabilities: case study on the javascript-to-java interface,” in Cambridge International Workshop
on Security Protocols. Springer, 2015, pp. 126–138.
[16] P. Samuelson and S. Scotchmer, “The law and economics of reverse engineering,” The
Yale Law Journal, vol. 111, no. 7, pp. 1607–1663, 2002. [Online]. Available: http:
//www.jstor.org/stable/797533
[17] C. Hertel, “Samba: An introduction,” https://www.samba.org/samba/docs/SambaIntro.html,
accessed: 06/04/2019.
[18] N. D. Patnaik, “Appmon - automated framework for monitoring and tampering mobile applications,”
https://github.com/dpnishant/appmon, accessed: 28/04/2018.
[19] C. Andre, “droidstat-x - android applications security analyser,” https://github.com/clviper/
droidstatx, accessed: 27/04/2018.
[20] OWASP, “Tampering and reverse engineering on android,” https://mobile-security.gitbook.io/
mobile-security-testing-guide/android-testing-guide/0x05c-reverse-engineering-and-tampering,
accessed: 25/09/2018.
[21] L. Qiu, Y. Wang, and J. Rubin, “Analyzing the analyzers: Flowdroid/iccta, amandroid, and
droidsafe,” in Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and
Analysis. ACM, 2018, pp. 176–186.
[22] Google, “Enable multidex for apps with over 64k methods,” https://developer.android.com/studio/
build/multidex, accessed: 26/05/2018.
[23] Google, “Art and dalvik,” https://source.android.com/devices/tech/dalvik/, accessed: 26/05/2018.
[24] B. Pan, “Dex2jar,” https://github.com/pxb1988/dex2jar, accessed: 28/04/2018.
[25] Google, “Enjarify,” https://github.com/google/enjarify, accessed: 06/10/2018.
[26] M. Strobel, “Procyon java decompiler,” https://bitbucket.org/mstrobel/procyon, accessed:
03/03/2019.
[27] L. Benfield, “Cfr - another java decompiler,” http://www.benf.org/other/cfr/, accessed:
03/03/2019.
[28] B. Gruver, “Smali assembler/disassembler,” https://github.com/JesusFreke/smali, accessed:
06/04/2018.
[29] D. Morrill, “Inside the android application framework,” https://sites.google.com/site/io/
inside-the-android-application-framework, accessed: 30/03/2018.
[30] Google, “Understand the apk structure,” https://developer.android.com/topic/performance/
reduce-apk-size#apk-structure, accessed: 30/03/2018.
[31] Google, “Application fundamentals,” https://developer.android.com/guide/components/
fundamentals, accessed: 26/05/2018.
[32] Google, “Sign your app,” https://developer.android.com/studio/publish/app-signing, accessed:
15/05/2018.
[33] C. Tumbleson, “Apktool repository,” https://github.com/iBotPeaches/Apktool, accessed:
24/03/2018.
[34] Google, “Introduction to activities,” https://developer.android.com/guide/components/activities/
intro-activities, accessed: 26/04/2019.
[35] Google, “Intents and intent filters,” https://developer.android.com/guide/components/
intents-filters, accessed: 26/04/2019.
[36] Google, “Android interface definition language,” https://developer.android.com/guide/components/
aidl, accessed: 26/05/2018.
[37] Y.-C. Lin, “Androbugs framework - an android application security vulnerability scanner,”
https://www.blackhat.com/docs/eu-15/materials/
eu-15-Lin-Androbugs-Framework-An-Android-Application-Security-Vulnerability-Scanner.pdf,
accessed: 05/04/2018.
[38] H. Lockheimer, “Android and security,” http://googlemobile.blogspot.pt/2012/02/
android-and-security.html, accessed: 17/05/2018.
[39] Google, “Application signing,” https://source.android.com/security/apksigning/, accessed:
26/05/2018.
[40] appium, “objection - runtime mobile exploration,” https://github.com/appium/sign, accessed:
28/04/2018.
[41] S. Arzt, S. Rasthofer, C. Fritz, E. Bodden, A. Bartel, J. Klein, Y. Le Traon, D. Octeau, and
P. McDaniel, “Flowdroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis
for android apps,” ACM SIGPLAN Notices, vol. 49, no. 6, pp. 259–269, 2014.
[42] F. Wei, S. Roy, X. Ou et al., “Amandroid: A precise and general inter-component data flow analysis
framework for security vetting of android apps,” in Proceedings of the 2014 ACM SIGSAC Conference
on Computer and Communications Security. ACM, 2014, pp. 1329–1341.
[43] M. I. Gordon, D. Kim, J. H. Perkins, L. Gilham, N. Nguyen, and M. C. Rinard, “Information flow
analysis of android applications in droidsafe.” in NDSS, vol. 15, 2015, p. 110.
[44] I. Burguera, U. Zurutuza, and S. Nadjm-Tehrani, “Crowdroid: behavior-based malware detection
system for android,” in Proceedings of the 1st ACM workshop on Security and privacy in smartphones
and mobile devices. ACM, 2011, pp. 15–26.
[45] A. Shabtai and Y. Elovici, “Applying behavioral detection on android-based devices,” in
International Conference on Mobile Wireless Middleware, Operating Systems, and Applications. Springer,
2010, pp. 235–249.
[46] B. Livshits, Improving software security with precise static and runtime analysis. Stanford
University, 2006, vol. 67, no. 11.
[47] L. Zhiqiang, “Dynamic Binary Instrumentation,” 2012. [Online]. Available: https://pdfs.
semanticscholar.org/presentation/17b7/9b6d7f232d02073593accd00570e124bc031.pdf
[48] M. Christodorescu and S. Jha, “Static analysis of executables to detect malicious patterns,”
WISCONSIN UNIV-MADISON DEPT OF COMPUTER SCIENCES, Tech. Rep., 2006.
[49] A. Moser, C. Kruegel, and E. Kirda, “Limits of static analysis for malware detection,” in Twenty-
Third Annual Computer Security Applications Conference (ACSAC 2007). IEEE, 2007, pp. 421–430.
[50] O. A. V. Ravnas, “Frida: A dynamic instrumentation toolkit,” https://frida.re/, accessed:
27/05/2018.
[51] OWASP, “Android anti-reversing defenses,” https://mobile-security.gitbook.io/
mobile-security-testing-guide/android-testing-guide/0x05j-testing-resiliency-against-reverse-engineering,
accessed: 25/05/2018.
[52] S. Alexander-Bown, “Android security: Adding tampering detection to your app,” https://www.
airpair.com/android/posts/adding-tampering-detection-to-your-android-app.
[53] C. Mulliner and J. Kozyrakis, “Inside android’s safetynet attestation,” https://www.mulliner.org/
collin/publications/eu-17-Mulliner-Kozyrakis-Inside-Androids-SafetyNet-Attestation.pdf, accessed:
28/05/2018.
[54] Google, “Safetynet attestation api,” https://developer.android.com/training/safetynet/attestation,
accessed: 28/05/2018.
[55] E. Lafortune and GuardSquare, “Proguard manual,” https://www.guardsquare.com/en/proguard/
manual/introduction, accessed: 18/05/2018.
[56] Google, “Android studio,” https://developer.android.com/studio/, accessed: 05/04/2018.
[57] E. Lafortune and GuardSquare, “Dexguard,” https://www.guardsquare.com/en/dexguard, accessed:
22/04/2018.
[58] LLVM, “Obfuscator-llvm wiki,” https://github.com/obfuscator-llvm/obfuscator/wiki.
[59] strong.codes, “strong.codes sa,” https://www.linkedin.com/company/strong-codes/, accessed:
27/05/2018.
[60] B. Mueller, “Apkx - android apk decompilation for the lazy,” https://github.com/b-mueller/apkx,
accessed: 27/01/2019.
[61] sensepost, “objection - runtime mobile exploration,” https://github.com/sensepost/objection,
accessed: 28/04/2018.
[62] skylot, “jadx - dex to java decompiler,” https://github.com/skylot/jadx, accessed: 27/04/2018.
[63] A. Desnos and G. Gueguen, “Androguard repository,” https://github.com/androguard/androguard,
accessed: 24/03/2018.
[64] Y. Li, “droidbot - lightweight test input generator for android,” https://github.com/honeynet/
droidbot, accessed: 25/03/2019.
[65] A. Desnos and G. Gueguen, “Androguard documentation,” http://androguard.readthedocs.io/en/
latest/index.html, accessed: 24/03/2018.
[66] J. Kozyrakis, “Using frida on android without root,” https://koz.io/
using-frida-on-android-without-root/, accessed: 28/05/2018.
[67] OWASP, “Welcome to owasp,” https://www.owasp.org/index.php/Main_Page, accessed:
06/04/2018.
[68] OWASP, “Mobile top 10 2016,” https://www.owasp.org/index.php/Mobile_Top_10_2016-Top_10,
accessed: 06/04/2018.
[69] Unity3D, “Android hardware stats,” https://web.archive.org/web/20170808222202/http://hwstats.
unity3d.com:80/mobile/cpu-android.html, accessed: 04/04/2019.