Modeling Mobile Resource Security

Università degli Studi Roma Tre
Laurea Magistrale in Matematica
Modeling Mobile Resource Security
Supervisor: Dr. Roberto Di Pietro
Assistant Supervisor: Dr. Flavio Lombardi
Candidate: Sara Rossicone
Academic Year 2012/2013
Mobile devices are rapidly becoming the dominant computing platform. The mobility and connectivity these devices afford provide immense utility. As a result, recent years have seen growth in the number of security-sensitive applications that run on commercially available mobile devices.
In particular, most popular mobile platforms, such as Android, Symbian, iOS, BlackBerry, and WinCE, provide access to application markets that allow third-party applications to be downloaded and installed onto the device. In addition, some devices permit the installation of apps from unknown sources.
Several solutions aiming to solve this problem have been proposed, and some products are commercially available. We have chosen Android, since it is open source and allows greater experimental activity.
Android is a popular mobile operating system that is installed on millions of devices and accounted for more than 50% of all smartphone sales in the third quarter of 2011. The popularity of Android and the open nature of its application marketplace make it a prime target for attacks such as malware.
The mobile threat model includes three types of threats: malware, grayware, and personal spyware. Malware gains access to a device for the purpose of stealing data, damaging the device, or annoying the user. The attacker tricks the user into installing the malicious application or gains unauthorized remote access by taking advantage of a vulnerability in the device. Malware provides no legal notice to the affected user. This threat includes Trojans, worms, botnets, and viruses. Malware is illegal in many countries, including the United States, and its distribution may be punishable by jail time. Personal spyware collects personal information, such as location or text message history, over a period of time. It sends the victim’s information to the person who installed the application onto the victim’s device, rather than to the author of the application. Grayware spies on users, but the companies that distribute grayware do not aim to harm users.
The rapid growth of mobile malware calls for effective malware detection on mobile devices. Each application specifies which resources of the device it requires. Users can grant or deny its installation together with the permissions it needs. Even though a user can be warned about the risk of accepting suspicious permissions, the spread of real malware has demonstrated that users tend to trust any application request and install the application on their phones.
Different approaches have been proposed to contain security risks. Many researchers have identified areas of concern and proposed solutions to problems in smartphone operating systems.
As we will illustrate in more detail, several tools have been developed to identify information leaks on smartphone platforms using dynamic analysis techniques. Another line of research deals with the confused-deputy problem on Android, where inter-process communication channels can inadvertently expose privileged functionality to unprivileged callers. Other systems attempt to use or extend the Android permission system to defend against malware. Besides these defenses, some work has proposed applying common security techniques from the desktop to mobile devices. In particular, we will turn our attention to dynamic and static analysis, underlining their complementary roles in detecting the behavior of malicious apps, and combining them with the tainting technique.
The principal aim of our work is to model mobile resource security by evaluating both static and dynamic analysis techniques. Static analysis, mostly used by antivirus companies, is based on inspecting source code or binaries for suspicious patterns. Using some tools, we will show how to decompile APK files and retrieve the source code. To this end, we have analyzed two important malicious apps that have plagued Android users for several months: DroidKungFu and Mobile Zeus.
Dynamic analysis, or behavior-based detection, on the other hand, involves running the sample in a controlled and isolated environment in order to analyze its execution traces.
To perform our experimental activity we will examine DECAF/DroidScope, a fine-grained dynamic binary instrumentation tool for Android that rebuilds two levels of semantic information: operating system and Java. In particular, we will make use of the DalvikInstructionTracer plugin, which is responsible for tracking all events and printing them to a file.
Because the code of real malicious apps is obfuscated and encrypted, we have written an application that follows the same behavior but allows us to monitor it in real time. This application steals all kinds of sensitive data, e.g. the IMEI number, contacts, and GPS coordinates. By giving it numerous permissions in the Manifest File, we provide a complete example of what malicious programmers aim to gain from users. Through the tainting technique, we will show that the plugin really traces events. In doing so, we will establish which variables can be considered notable.
Furthermore, we will show how to launch an attack by flooding the system with a huge number of requests, causing a Denial of Service. We recall that a Denial of Service (DoS) attack is an effort to make one or more computer systems unavailable; it is typically targeted at web servers, but it can also be used against mail servers, name servers, and any other type of computer system. DoS attacks may be initiated from a single machine, but they typically use many computers; distributed denial of service (DDoS) attacks coordinate multiple systems in a simultaneous attack. Our attack creates hundreds of threads that steal the IMEI number and, by performing many suboptimal calculations, cause the plugin and the entire system to halt. Having performed the attack, we will then look for a solution to the flooding that blocks the requests.
The solution will be pursued with the string matching technique. String searching algorithms, sometimes called string matching algorithms, are an important class of string algorithms that try to find a place where one or several strings (also called patterns) occur within a larger string. Our anti-flooding strategy consists of monitoring the writing of the Dalvik file during the activity’s execution and stopping the malicious behavior as soon as a particular string is found.
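The string-matching idea described above can be illustrated with a minimal sketch. It is not the thesis implementation: the trace lines, the pattern, and the function names `find_pattern` and `first_flagged_line` are assumptions made for illustration, and the real output format of the tracing plugin is not given in the text.

```python
import io
from typing import Iterable, Optional


def find_pattern(text: str, pattern: str) -> int:
    """Naive string matching: first index where pattern occurs in text, or -1."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):
        if text[i:i + m] == pattern:
            return i
    return -1


def first_flagged_line(trace_lines: Iterable[str], pattern: str) -> Optional[int]:
    """Scan trace lines in order; return the 0-based number of the first match."""
    for lineno, line in enumerate(trace_lines):
        if find_pattern(line, pattern) != -1:
            # A real monitor would stop the offending activity at this point.
            return lineno
    return None


# Hypothetical trace content, used only to exercise the functions.
sample_trace = io.StringIO(
    "invoke-virtual Lcom/example/App;->onCreate\n"
    "invoke-virtual Landroid/telephony/TelephonyManager;->getDeviceId\n"
    "return-void\n"
)
flagged = first_flagged_line(sample_trace, "getDeviceId")
```

Scanning line by line means the monitor can react while the file is still being written, instead of waiting for the whole trace.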
Android Security
Android security [12] is based on several mechanisms.
It protects applications and data through a combination of two enforcement mechanisms, one at the system level and the other at the Inter-Component Communication (ICC) level.
ICC mediation defines the core security framework, but it builds on the guarantees provided by the underlying Linux system. In the general case, each application runs under a unique user identity, which enables Android to limit the potential damage of programming flaws.
ICC isn’t limited by user and process boundaries. Because the Binder device file through which ICC passes must be world-readable and writable for proper operation, the Linux system has no way of mediating ICC. Although user separation is straightforward and easily understood, controlling ICC is much more subtle and warrants careful consideration.
As the central point of security enforcement, the Android middleware mediates all ICC establishment by reasoning about labels assigned to applications and components. In its simplest form, access to each component is restricted by assigning it an access permission label; this text string need not be unique. When a component initiates ICC, the reference monitor looks at the permission labels assigned to its containing application, and if the target component’s access permission label is in that collection, it allows ICC establishment to proceed. If the label isn’t in the collection, establishment is denied even if the components are in the same application. The developer assigns permission labels via the XML Manifest File that accompanies every application package. In doing so, the developer defines the application’s security policy: assigning permission labels to an application specifies its protection domain, whereas assigning permissions to the components in an application specifies an access policy to protect its resources. Because applications often contain components that other applications should never access, another way to strengthen Android security is to declare a component private in the Manifest File. If a component is private, the only components that can interact with it are those from the same app or from another app that runs with the same user ID (UID). By making a component private, the developer doesn’t need to worry about which permission label to assign it, or how another application might acquire that label.
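The reference-monitor check described above can be sketched as follows. This is a simplified model, not Android code: the classes `App` and `Component`, the function `allow_icc`, and the rule that a component without an access label is reachable by anyone are assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class App:
    name: str
    uid: int
    permission_labels: FrozenSet[str] = field(default_factory=frozenset)


@dataclass(frozen=True)
class Component:
    name: str
    app: App
    access_label: Optional[str] = None  # None: no access permission label assigned
    private: bool = False


def allow_icc(caller: Component, target: Component) -> bool:
    """Decide whether ICC establishment from caller to target may proceed."""
    if target.private:
        # Private components are reachable only from apps running with the same UID.
        return caller.app.uid == target.app.uid
    if target.access_label is None:
        return True  # assumption of this sketch: unlabeled components are open
    # The target's label must be among the labels of the caller's application.
    return target.access_label in caller.app.permission_labels


mail = App("mail", 10001, frozenset({"READ_CONTACTS"}))
game = App("game", 10002)
contacts = App("contacts", 10003)
provider = Component("ContactsProvider", contacts, access_label="READ_CONTACTS")
internal = Component("InternalSync", contacts, private=True)
```

Note that `contacts` itself holds no labels, so a component of that app is denied access to `provider`, matching the remark that establishment is denied even within the same application.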
Components aren’t the only resource that requires protection. In fact, unprotected intent broadcasts can unintentionally leak information to explicitly listening attackers. To counter this, the Android Application Programming Interfaces (APIs) for broadcasting intents optionally allow the developer to specify a permission label to restrict access to the intent object. The access permission label assigned to a broadcast intent restricts the set of applications that can receive it. The Manifest File therefore doesn’t give the entire picture of the application’s security. Let’s now introduce the concept of a “pending intent”.
A developer can define an intent object as normally done to perform an action (to start an activity, for example) but, instead of performing the action, pass the intent to a special method that creates a Pending Intent object corresponding to the desired action.
The Pending Intent object is simply a reference pointer; it allows applications included with the framework to integrate better with third-party applications. Pending intents let applications direct the broadcast to a specific private broadcast receiver. This prevents forging without the need to coordinate permissions with system applications. Used correctly, they can improve an application’s security. In fact, several Android APIs require pending intents, such as the Location Manager, which has a “proximity update” feature that notifies an application via intent broadcast when a geographic area is entered or exited.
Not all system resources are accessed through components; for some, Android provides direct API access. In fact, the services that provide indirect access to hardware often use APIs available to third-party applications. Android protects sensitive APIs with additional permission label checks: an application must declare a corresponding permission label in its Manifest File to use them. By protecting sensitive APIs with permissions, Android forces an application developer to declare the desire to interface with the system in a specific way. Consequently, vulnerable applications can’t gain unknown access if exploited. The most commonly encountered protected API is the one for network connections.
Early versions of the Android Software Development Kit (SDK) let developers mark a permission as “application” or “system”. The current model extends this into four protection levels for permission labels:
• Normal permissions, like the old application permissions, are granted to any application that requests them in its Manifest;
• Dangerous permissions are granted only after user confirmation;
• Signature permissions are granted only to applications signed by the same developer key as the package defining the permission;
• Signature-or-system permissions act like signature permissions but exist for legacy compatibility with the older system permission type; for example, only Google applications can directly interface with the telephony API.
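The four protection levels listed above can be read as a small decision procedure. The sketch below is illustrative only: `grant_permission` and its keyword arguments are hypothetical names, and treating "signature or system" as "same key, or shipped in the system image" is an assumption that the text does not spell out.

```python
from enum import Enum


class ProtectionLevel(Enum):
    NORMAL = "normal"
    DANGEROUS = "dangerous"
    SIGNATURE = "signature"
    SIGNATURE_OR_SYSTEM = "signatureOrSystem"


def grant_permission(
    level: ProtectionLevel,
    *,
    user_confirmed: bool = False,
    same_signing_key: bool = False,
    in_system_image: bool = False,
) -> bool:
    """Return True if a requested permission of the given level is granted."""
    if level is ProtectionLevel.NORMAL:
        return True  # granted to any application that requests it
    if level is ProtectionLevel.DANGEROUS:
        return user_confirmed  # needs explicit user confirmation
    if level is ProtectionLevel.SIGNATURE:
        return same_signing_key  # same developer key as the defining package
    if level is ProtectionLevel.SIGNATURE_OR_SYSTEM:
        # Assumption: either condition is sufficient.
        return same_signing_key or in_system_image
    raise ValueError(f"unknown protection level: {level!r}")
```

For example, `grant_permission(ProtectionLevel.DANGEROUS)` is refused until the call is repeated with `user_confirmed=True`.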
The standard permission system described so far is often not sufficient when used with content providers. A content provider may want to protect itself with read and write permissions, while its direct clients also need to hand specific Uniform Resource Identifiers (URIs) to other applications for them to operate on. Recall that Android uses a special content URI to address content providers, optionally specifying a record within a table. The developer can pass such a URI in an intent’s data field; for example, an intent can specify the VIEW action and a content URI identifying an image file. If used to start an activity, the system will choose a component in a different application to view the image. If the target application doesn’t have read permission for the content provider containing the image file, the developer can use a URI permission instead. In this case, the developer sets a read flag in the intent that grants the target application access to the specific record identified by the intent. URI permissions are essentially capabilities for database records. Although they provide least-privilege access to content providers, the addition of new delegation mechanisms further diverges from the original Mandatory Access Control (MAC) model.
Android Architecture
Android is an open source software stack for mobile devices. Its architecture is organized in layers, where each layer provides services to the one above it.
As shown in Figure 1, these layers are:
• the operating system (OS);
• libraries with the Dalvik Virtual Machine (DVM);
• the Application Framework;
• applications.
Figure 1: Android architecture. From bottom to top: Linux kernel (display, Wifi, audio and Binder (IPC) drivers; power, process and memory management); libraries (SQLite, WebKit, OpenGL ES, FreeType, Surface Manager, Media Framework, SSL, SGL, libc) alongside the Android Runtime (core libraries, Dalvik Virtual Machine); Application Framework (activity, window, notification, package and resource management; content providers; view system); applications (native Android apps, third-party apps).
From the bottom, the Linux kernel provides basic services such as memory management, process scheduling and the file system. At a higher level are the native libraries, developed in C and C++. Together, these constitute the core of Android.
At the third layer we have the Android Application Framework, consisting of a series of components and APIs. At the top of the software stack lies the application layer, which contains a set of built-in core applications and third-party applications installed by users.
Android defines four component types:
• Activity components define an application’s user interface. Typically, an application developer defines one activity per “screen”. Activities start each other, possibly passing and returning values. Only one activity on the system has keyboard and processing focus at a time; all others are suspended.
• Service components perform background processing. When an activity needs to perform an operation that must continue after the user interface has disappeared (such as downloading a file or playing music), it commonly starts a service specifically designed for that action. Services often define an interface for Remote Procedure Call (RPC) that other system components can use to send commands and retrieve data, as well as register callbacks.
• Content provider components store and share data using a relational database interface. Each content provider has an associated “authority” describing the content it contains.
• Broadcast receiver components act as mailboxes for messages from other applications. Broadcast receivers subscribe to message destinations to receive the messages sent to them. Application code can also address a broadcast receiver explicitly by including the namespace assigned to its containing application.
Android provides several means for applications to communicate. The primary mechanism for component interaction is an Intent, which is simply a message object containing a destination component address and data. The APIs define methods that accept intents and use this information to start each of the four component types just described. As these methods are invoked, the Android framework begins executing code in the target application. This process of ICC is known as an action. ICC is analogous to inter-process communication (IPC) in Unix-based systems.
Every Android application runs in its own process, with its own instance of the Dalvik virtual machine. Dalvik has been written so that a device can run multiple VMs efficiently. The Dalvik VM executes files in the Dalvik Executable (.dex) format, which is optimized for a minimal memory footprint. The VM is register-based, and runs classes compiled by a Java language compiler that have been transformed into the .dex format by the included “dx” tool. The Android application package (APK) is the file format used to distribute and install application software and middleware onto Google’s Android operating system. To make an APK file, a program for Android is first compiled, and then all of its parts are packaged into one file. This file holds all of the program’s code (such as .dex files), resources, assets, certificates, and the Manifest File.
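As background not stated in the text, an APK is a ZIP archive, so the packaging just described can be inspected with ordinary archive tools. The sketch below builds a toy archive in memory and lists its entries; the entry contents are placeholders, and `build_toy_apk` and `list_apk_entries` are hypothetical helper names rather than part of any Android tool.

```python
import io
import zipfile
from typing import List


def build_toy_apk() -> bytes:
    """Create an in-memory ZIP that mimics the layout of an APK (placeholder data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"<manifest/>")  # real manifests are binary XML
        apk.writestr("classes.dex", b"dex\n035\x00")          # only the magic bytes, no real code
        apk.writestr("res/layout/main.xml", b"<LinearLayout/>")
        apk.writestr("META-INF/CERT.RSA", b"placeholder")
    return buffer.getvalue()


def list_apk_entries(apk_bytes: bytes) -> List[str]:
    """Return the sorted entry names stored in an APK given as raw bytes."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return sorted(apk.namelist())


entries = list_apk_entries(build_toy_apk())
dex_files = [name for name in entries if name.endswith(".dex")]
```

Locating the `.dex` entries this way is typically the first step before handing them to a decompiler for static analysis.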
Android Malware Detection
Rootkits are a class of malware that infects the code and data of the OS kernel. By infecting the kernel itself, they gain control over the layer that is traditionally considered the Trusted Computing Base (TCB) on most systems.
Bickford et al. [3] have focused on security versus energy tradeoffs for host-based rootkit detection. Some emerging proposals for malware detection have examined how to sidestep the energy constraints using offloaded architectures, in which the malware detector itself executes on a well-provisioned server and monitors mobile devices. Unfortunately, offloading malware detection either incurs significant power expenditure, due to data upload, or has limited effectiveness, because it is best suited to traditional signature-based scanning. Such signature scanning is easily defeated with encryption, polymorphism and other stealth techniques. For this reason, there is growing consensus that signature-based scanning must be supplemented with powerful host-based agents that, for example, employ behavior-based detection algorithms. Bickford et al. have presented a framework to quantify the degree of security being traded off when prolonging battery life, and ways in which such tradeoffs can be implemented. Specifically, they have studied security tradeoffs along two axes: the attack surface that the malware detector covers, and the frequency with which the malware detector is invoked. They consider two detection techniques: the first, based on Patagonix, detects rootkits by monitoring code integrity; the second, based on Gibraltar, monitors kernel data integrity.
Rastogi et al. [24] have developed a systematic framework called DroidChameleon with several common transformation techniques that may be used to transform Android applications automatically. Some of these transformations are specific to the Android platform. Using the framework, they have passed known malware samples (from different families) through these transformations to generate new variants of malware, which are verified to retain the originals’ malicious functionality.
In this scenario, Burguera et al. [6] have proposed an approach to analyze the behavior of Android applications, providing a framework to distinguish between applications that, having the same name and version, behave differently. The main feature of their work is the use of a crowdsourced system to obtain traces of applications’ behavior, which helps researchers collect different samples of application execution traces. The whole analysis process is performed on a dedicated remote server, used exclusively to collect information and detect malicious and suspicious applications on the Android platform. A lightweight client, called Crowdroid, is in charge of monitoring Linux kernel system calls and sending them, preprocessed, to the centralized server.
One way to extend the control of users and trusted third parties over smartphones is to use context-related policies. Conti et al. [8] have presented CRePE, a system that is able both to enforce policies at run-time and to allow trusted third parties to define them. With its elaborate architecture, CRePE is able to define contexts and rules over them without reducing Android security.
Static vs. Dynamic Analysis
So far, two approaches have been proposed for the analysis and detection of malware: static analysis and dynamic analysis. Static analysis, mostly used by antivirus companies, is based on inspecting source code or binaries for suspicious patterns. Dynamic analysis, or behavior-based detection, on the other hand, involves running the sample in a controlled and isolated environment in order to analyze its execution traces. Static analysis has also been proposed for malware detection on individual smartphones. Antivirus companies have adapted their signature-based detection systems to smartphones, but considering the level of resources needed by antivirus techniques and the power and memory constraints of mobile devices, in-phone analysis is not a preferred solution for smartphones. Static analysis is known to be vulnerable to code obfuscation techniques, which are commonplace for desktop malware and are expected for Android malware. In fact, the Android SDK includes a tool named ProGuard for obfuscating apps. Researchers have also demonstrated that bytecode randomization techniques can be used to completely hide the internal logic of a Dalvik bytecode program. Static analysis also falls short for exploit diagnosis, because a vulnerable runtime execution environment is needed to observe and analyze an exploit attack and pinpoint the vulnerability.
Dynamic analysis is immune to code obfuscation and is able to see the malicious behavior on an actual execution path. Its downside is a lack of code coverage, although this can be ameliorated by exploring multiple execution paths.
Sohr et al. [27] have employed the Java Modeling Language (JML) to specify security requirements for Java 2 Micro Edition (J2ME) APIs, and to check at compile time whether the implementation satisfies the requirements.
Another approach is presented in RiskRanker [18], a tool able to assess risks from existing (untrusted) apps for zero-day malware detection. Grace et al. [18] have analyzed factory stock apps to identify permission leakage, a threat that also spurred studies on its runtime mitigations.
Although RiskRanker is effective in achieving its own goals, it targets vulnerabilities that represent only a subset of component hijacking (i.e. hijacks seeking to access non-permission-protected sensitive resources are not covered). In addition, it does not intend to provide any in-depth detection method suited for scalable app vetting.
The work of Lu et al. [22] therefore aims to bridge this gap. Their tool, CHEX, follows a static program analysis approach, featuring a novel data-flow analyzer specially designed to accommodate Android’s special app programming paradigms. Static analysis makes sense for vetting benign apps in that the anti-analysis techniques commonly used in adversarial scenarios are out of scope, and the advantages of static analysis, such as its completeness and bounded time complexity, are well suited to addressing the vulnerability discovery problem. To test CHEX, they have built a generic Android app analysis framework named Dalysis, which stands for Dalvik bytecode analysis. As suggested by its name, Dalysis works directly on off-the-shelf app packages (or Dalvik bytecode) without requiring source code access or any decompilation assistance.
Virtualization and Android
It is widely accepted that dynamic analysis is indispensable, because malware is often heavily obfuscated to thwart static analysis. Furthermore, runtime information is often needed for exploit diagnosis. Virtualization-based analysis has proven effective against evasion, because all of the analysis components are out of the box and are more privileged than the runtime environment being analyzed, including the malware. Based on dynamic binary translation and hardware virtualization techniques, several analysis platforms have been built for analyzing desktop malware. These platforms are able to bridge the semantic gap between the hardware-level view from the Virtual Machine Monitor (VMM) and the OS-level view within the virtual machine using virtual machine introspection techniques.
The advantages of virtualization-based analysis approaches are two-fold:
1. As the analysis runs underneath the entire virtual machine, it is able to analyze even the most privileged attacks in the kernel;
2. As the analysis is performed externally, it becomes very difficult for an attack within the virtual machine to disrupt the analysis.
The downside, however, is the loss of semantic contextual information when the analysis component is moved out of the box. To reconstruct the semantic knowledge, Virtual Machine Introspection (VMI), a family of techniques that rebuilds a guest context from the VMM, is needed to intercept certain kernel events and parse kernel data structures. The security benefits of virtualization have been rigorously and repeatedly established. Traditionally, the use of virtualization as a tool for building secure systems has been the purview of desktop and server environments. While the recent interest in mobile virtualization is promising, it is still unclear how best to architect secure systems with this technology.
The system design presented by Gudeth et al. [19] is based on bare-metal virtualization, a design choice specifically selected to minimize the TCB. They recommend the use of a bare-metal hypervisor, which typically consists of orders of magnitude fewer lines of code than a full OS. A bare-metal hypervisor runs directly on the hardware, with all guest OSs, and optionally individual applications, running in their own virtual machines. Hence, an attack exploiting vulnerabilities in a guest OS or its drivers is thwarted by the bare-metal hypervisor.
Although Android is based on Linux, it is not straightforward to take the same desktop analysis approach for Android malware. Yan et al., the authors of DroidScope [34], have therefore aimed to reconstruct semantic knowledge at two levels in a unified analysis platform.
These two levels are:
1. OS-level semantics (information about processes, threads, memory mappings and system calls, rebuilt at runtime), which capture the activities of the malware process and its native components;
2. Java-level semantics, which capture the behavior of the Java components.
Yan et al. [33] have recorded malware execution using hardware virtualization for transparency, and then replayed and analyzed the malware’s execution using dynamic binary translation for flexibility and efficiency of in-depth analysis. Their platform, V2E, needs to work in a malicious context: the emulator must exactly replay the execution recorded from the hardware virtualization platform, even though the malware tries to detect every possible difference between the two systems.
Android Domain Isolation
Although virtualization provides strong isolation, it duplicates the entire Android software stack, which makes such approaches quite heavyweight given the scarce battery life of smartphones. A possible way to mitigate this problem would be the automatic hibernation of VMs not currently displayed to users, although currently available mobile virtualization technology does not provide this feature. Default Android, in fact, has no means of grouping applications and data into domains, where a domain comprises a set of applications and data belonging to one trust level.
Bugiel et al. [4] have presented the design and implementation of XManDroid (eXtended Monitoring on Android), a security framework that extends the monitoring mechanism of Android to detect and prevent application-level privilege escalation attacks at runtime, based on a system-centric policy. Their implementation dynamically analyzes applications’ transitive permission usage while inducing a minimal performance overhead, unnoticeable to users. In contrast to existing solutions, Bugiel et al. [5] have presented TrustDroid, a lightweight solution that doesn’t require duplication of Android’s middleware and kernel. It enables isolation at different layers of the Android software stack:
• at the middleware layer, to prevent inter-domain application communication and data access;
• at the kernel layer, to enforce mandatory access control on the file system and on Inter-Process Communication (IPC) channels;
• at the network layer, to mediate network traffic.
In particular, TrustDroid relies on coloring separate and distinguishable components. The assignment of colors to applications and user data is based on a certification scheme that can be easily integrated into Android. Based on the applications’ colors, TrustDroid organizes applications, along with their data, into logical domains. At runtime, TrustDroid monitors all application communication, access to common shared databases, as well as file-system and network access. It denies any data exchange or application communication between different domains.
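The coloring rule described above amounts to a simple equality check on domains. The following sketch is only a model of that rule, not TrustDroid code: the package names, the color values, and the choice to deny communication for uncolored applications are assumptions made for illustration.

```python
from typing import Mapping, Optional

# Hypothetical color assignment; in the text, colors come from a certification scheme.
APP_COLORS = {
    "com.example.corpmail": "corporate",
    "com.example.corpcalendar": "corporate",
    "com.example.game": "private",
}


def domain_of(app: str, colors: Mapping[str, str]) -> Optional[str]:
    """Return the logical domain (color) of an application, if it has one."""
    return colors.get(app)


def allow_communication(sender: str, receiver: str, colors: Mapping[str, str]) -> bool:
    """Allow data exchange only between applications of the same domain."""
    sender_domain = domain_of(sender, colors)
    receiver_domain = domain_of(receiver, colors)
    if sender_domain is None or receiver_domain is None:
        return False  # assumption of this sketch: uncolored apps are isolated
    return sender_domain == receiver_domain
```

With this table, the two corporate apps may exchange data, while the game is cut off from both of them.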
Data Tainting
Definition 1 (Flow). An operation, or series of operations, that uses the value of some object, say x, to derive a value for another, say y, causes a flow from x to y.
Two types of flows are defined: explicit flows, such as y = x, where we observe an explicit transfer of a value from x to y, and implicit flows (control flows), where there is no direct transfer of a value from x to y but, when the code is executed, y nevertheless obtains the value of x.
Definition 2 (Tainted). If the source of the value of an object X is untrustworthy, we say that X is tainted.
Definition 3 (To taint). To “taint” user data is to attach some kind of tag or label to each object of the user data. The tag allows the influence of the tainted object to be tracked along the execution of the program.
Definition 4 (Taint propagation). If an operation uses the value of some tainted object, say X, to derive a value for another, say Y, then object Y becomes tainted.
Object X taints object Y through the taint operator t: X → t(Y).
The taint operator is transitive:
X → t(Y), Y → t(Z) ⇒ X → t(Z)
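The propagation rule and its transitivity can be made concrete with a small sketch. It is a toy model rather than any existing taint tracker: the `Tainted` class, the helpers `source` and `clean`, and the tag name are hypothetical, and only addition is instrumented.

```python
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class Tainted:
    """An integer value carrying a set of taint tags."""
    value: int
    tags: FrozenSet[str] = field(default_factory=frozenset)

    def __add__(self, other: "Tainted") -> "Tainted":
        # Taint propagation: the result carries the tags of both operands.
        return Tainted(self.value + other.value, self.tags | other.tags)


def source(value: int, tag: str) -> Tainted:
    """Introduce a value from an untrustworthy or sensitive source."""
    return Tainted(value, frozenset({tag}))


def clean(value: int) -> Tainted:
    """Wrap an untainted constant."""
    return Tainted(value)


x = source(42, "IMEI")
y = x + clean(1)    # X taints Y
z = y + clean(10)   # Y taints Z, hence X taints Z
```

Here `z` never touches `x` directly, yet it still carries the `IMEI` tag, which is exactly the transitive case.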
Two of the most commonly employed dynamic analysis techniques in security research are dynamic taint analysis and forward symbolic execution. Dynamic taint analysis runs a program and observes which computations are affected by predefined taint sources such as user input. Dynamic forward symbolic execution automatically builds a logical formula describing a program execution path, which reduces the problem of reasoning about the execution to the domain of logic. The two analyses can be used in conjunction to build formulas representing only the parts of an execution that depend upon tainted values.
The principle of dynamic taint analysis is to taint some of the data in a system and then propagate the taint to track the information flow in the program. The mechanism is used primarily for vulnerability detection and for the protection of sensitive data. To detect the exploitation of vulnerabilities, sensitive transactions must be monitored to ensure that they are not tainted by outside data. However, this technique does not detect control flows, which can cause an under-tainting problem, i.e. some values that should be marked as tainted are not. An attacker can take advantage of an indirect control dependency to exploit a vulnerability.
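The under-tainting problem can be seen in a few lines. The sketch below contrasts an explicit copy with a copy performed through a branch; it models a tracker that propagates tags only along data flows, and all names (`TaintedInt`, `explicit_copy`, `implicit_copy`, the `SECRET` tag) are assumptions made for illustration.

```python
from typing import FrozenSet, Tuple

# A value paired with its taint tags.
TaintedInt = Tuple[int, FrozenSet[str]]


def explicit_copy(secret: TaintedInt) -> TaintedInt:
    """Explicit flow: the value is copied, so a data-flow tracker copies the tags."""
    value, tags = secret
    return value, tags


def implicit_copy(secret: TaintedInt) -> TaintedInt:
    """Implicit flow: the branch depends on the secret, but only constants are assigned."""
    value, _tags = secret
    result, result_tags = 0, frozenset()  # untainted constant
    if value == 1:                        # control dependency on the secret
        result = 1                        # constant assignment: no tag is propagated
    return result, result_tags


secret_bit: TaintedInt = (1, frozenset({"SECRET"}))
explicit_result = explicit_copy(secret_bit)
implicit_result = implicit_copy(secret_bit)
```

Both results hold the secret bit, but only the explicit one is still tagged; the implicit copy has escaped the tracker, which is the gap an attacker can exploit.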
Enck et al. have presented TaintDroid [11], a sophisticated framework that detects unauthorized leakage of sensitive data. TaintDroid uses dynamic taint analysis to label private data with a taint mark, to track tainted data as it propagates through the system, and to alert users if tainted data is about to leave the system. TaintDroid mainly addresses data flows, whereas privilege escalation attacks also involve control flows.
A precise definition of dynamic taint analysis or forward symbolic execution must target a specific
language. Schwartz et al. [26] have used SimpIL, a Simple Intermediate Language. Although the language
is simple, it is powerful enough to express typical languages as varied as Java and assembly code. Indeed,
the language is representative of internal representations used by compilers for a variety of programming
languages. A program in SimpIL consists of a sequence of numbered statements.
In recent years, symbolic execution has advanced considerably. As mentioned above, it is usually combined with dynamic
taint analysis and theorem proving, and is becoming a powerful technique in the security analysis of software
programs. In particular, symbolic execution has been shown to be useful in discovering trigger-based
code (malicious in many cases, although not necessarily) and finding the corresponding trigger condition.
Wang et al. [29] have challenged the requirement of using cryptographic functions in obfuscation to make
symbolic execution difficult, and proposed a novel automatic obfuscation technique that makes use of
linear unsolved conjectures. There are a few advantages of using only linear operations in the obfuscation,
without any cryptographic ones. First, the obfuscated code becomes less suspicious in malware detection.
The obfuscated code produced by their technique only adds a simple loop to the code, making the
resulting obfuscated code similar to legitimate programs, e.g., simple number sorting algorithms. Second, such
simple obfuscated code makes it possible for their technique to be combined with other obfuscation and
polymorphism techniques to achieve stronger protection. Third, the obfuscated code is less
than one hundred bytes longer than the original program. Many unsolved conjectures, e.g. the Collatz
conjecture, involve some simple linear operations on integers that loop an unknown number of times.
Such operations are usually fast and commonly used in basic algorithms in computer science. They are
perfect candidates to be used in obfuscations to make symbolic execution difficult.
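\[
a_i =
\begin{cases}
n & \text{for } i = 0 \\
f(a_{i-1}) & \text{for } i > 0
\end{cases}
\qquad \text{where} \qquad
f(n) =
\begin{cases}
n/2 & \text{if } n \equiv 0 \pmod{2} \\
3n + 1 & \text{if } n \equiv 1 \pmod{2}
\end{cases}
\]
Figure 2: Collatz conjecture: $a_i$ will eventually reach 1 regardless of the value of $n$.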
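The recurrence of Figure 2 translates directly into a few lines of Python. The function names are chosen here for illustration; the loop terminates for every input for which the (unproven) conjecture holds.

```python
def collatz_step(n):
    """f(n) from Figure 2: halve even numbers, map odd n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def collatz_sequence(n):
    """Return a_0, a_1, ... up to and including the first 1."""
    if n < 1:
        raise ValueError("the sequence is defined for positive integers")
    seq = [n]
    while seq[-1] != 1:
        seq.append(collatz_step(seq[-1]))
    return seq
```

For n = 6 the sequence is 6, 3, 10, 5, 16, 8, 4, 2, 1. The number of iterations varies irregularly with n, which is the property the obfuscation relies on.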
Another advantage of using these unsolved conjectures is that they can be used to obfuscate inequality
conditions, a case the previous proposal is unable to handle. Although some inequality conditions could
be transformed into (a set of) equality conditions, this might become impractical when the inequality range
is large. Wang et al. have proposed and implemented an automatic obfuscator to incorporate unsolved
conjectures into trigger conditions in program source code. Extensive evaluations show that symbolic
execution would take hundreds of hours to figure out the trigger condition.
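The idea of guarding an inequality trigger with a Collatz loop can be sketched as follows. This is a simplified illustration and not the transformation implemented in [29]: the threshold 30, the seed abs(x) + 1 and both function names are assumptions.

```python
def plain_trigger(x, threshold=30):
    """Original inequality condition."""
    return x > threshold


def obfuscated_trigger(x, threshold=30):
    """Same condition, evaluated only after a Collatz loop seeded by the input."""
    y = abs(x) + 1                     # positive seed derived from the input
    while y > 1:
        # The iteration count depends on x in a way no known closed form
        # describes, so a symbolic executor has to unroll the loop.
        y = y // 2 if y % 2 == 0 else 3 * y + 1
    # The loop can only exit with y == 1, so this conjunct never changes the
    # outcome; the conjecture is what promises that the exit is reached.
    return x > threshold and y == 1
```

Both functions agree on every input for which the loop terminates, so the observable behaviour is unchanged while the path to the trigger passes through a loop of input-dependent length.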
Haldar et al., in [20], have presented a technique and an implementation for dynamically tracing tainted
user input in the Java Virtual Machine. Their technique tracks the taintedness of untrusted input
throughout the lifetime of the application. Taintedness is propagated in the obvious way: strings derived from
tainted strings are also considered tainted. The technique is completely transparent and the application
is completely unaware of it. It can be applied to an existing Java class file and does not need source code.
Static analysis is the analysis of computer software that is performed without actually executing
programs.
The goal of static analysis is, given a program and a set of initial states, to compute the set of states that
arise during the execution of the program.
A program is specified by its control flow graph.
Definition 5 (Control flow graph). A control flow graph (CFG) is a pair G = (V, E),
where:
• V is a set of program locations;
• E ⊆ V × V is a set of edges that represent the flow of control.
The graph is examined to identify the branches of the control flow and to check for
anomalies such as unreachable code.
Definition 6 (Path). Let start, end: E → V be two functions that associate a start node and an
end node, respectively, with each edge. A path d is then a finite sequence of edges $e_1, e_2, \ldots, e_k$ such that
$\mathrm{end}(e_i) = \mathrm{start}(e_{i+1})$ for all $i = 1, \ldots, k-1$.
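Definitions 5 and 6 can be exercised on a toy graph. The five locations and four edges below are hypothetical, as are the function names; edges are stored as (start, end) pairs so that start(e) and end(e) are simply the two components.

```python
from collections import deque

# Hypothetical CFG: V = program locations, E = control-flow edges (start, end).
V = {1, 2, 3, 4, 5}
E = [(1, 2), (1, 3), (2, 4), (3, 4)]   # location 5 has no incoming edge


def reachable(entry, edges):
    """Breadth-first search over the flow of control, starting at entry."""
    successors = {}
    for start, end in edges:
        successors.setdefault(start, []).append(end)
    seen, queue = {entry}, deque([entry])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_path(edge_sequence):
    """Definition 6: end(e_i) == start(e_{i+1}) for consecutive edges."""
    return all(a[1] == b[0] for a, b in zip(edge_sequence, edge_sequence[1:]))


unreachable = V - reachable(1, E)
```

With entry location 1, the only unreachable location is 5, which is the kind of anomaly the examination of the graph is meant to report.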
Definition 7 (Path Condition). A Path Condition (PC), for a given statement, indicates the conditions
that the input must satisfy for an execution to cover a path along which the statement is executed.

Dynamic                                         | Static
Looks at a single path                          | Looks at multiple paths
Determines exact taint values for run           | Must either over- or under-approximate taint at confluence of paths
Must be run on each execution to detect attacks | Can be used to add monitoring code for only vulnerable paths
Table 1: Differences between static and dynamic analysis
We call a path “executable” if there exists a set of input data that satisfies its path
condition.
The executions of a program can be represented by an execution tree.
Definition 8 (Execution Tree). An execution tree has a node for each statement executed (labeled
with the statement number) and, for each transition between statements, a directed arc connecting the
associated nodes. For each forking IF statement execution, the associated node has two arcs leaving it,
labeled “T” and “F” for the true (THEN) and false (ELSE) parts, respectively.
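Path conditions and execution trees can be related with a short sketch. The five-statement program, the tuple encoding of the tree and the function names are assumptions made for this illustration; conditions are kept as strings and evaluated with eval, which is acceptable only because the strings are fixed in the example.

```python
# Hypothetical program used only for illustration:
#   1: if x > 0:
#   2:     if y > x:
#   3:         a()
#          else:
#   4:         b()
#      else:
#   5:     c()
# A node is (statement number, condition or None, T-subtree, F-subtree).
TREE = (1, "x > 0",
        (2, "y > x", (3, None, None, None), (4, None, None, None)),
        (5, None, None, None))


def path_conditions(node, pc=()):
    """Return {leaf statement: path condition} by following every T and F arc."""
    stmt, cond, t_branch, f_branch = node
    if cond is None:                       # leaf: no fork, the path ends here
        return {stmt: " and ".join(pc) or "True"}
    result = {}
    result.update(path_conditions(t_branch, pc + (cond,)))
    result.update(path_conditions(f_branch, pc + (f"not ({cond})",)))
    return result


def satisfies(stmt, inputs, tree=TREE):
    """Check whether the given inputs satisfy the path condition of stmt."""
    return eval(path_conditions(tree)[stmt], {}, dict(inputs))
```

Statement 3, for instance, gets the path condition "x > 0 and y > x"; the input x = 1, y = 2 satisfies it, so the corresponding path is executable.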
Gibler et al. in [13] have presented AndroidLeaks, a static analysis framework for automatically
finding potential leaks of sensitive information in Android applications on a massive scale. AndroidLeaks
drastically reduces the number of applications and the number of traces that a security auditor has to
verify manually. Leveraging WALA [30], a program analysis framework for Java source and byte code,
they have created a call graph of an application’s code and then performed a reachability analysis to
determine whether sensitive information may be sent over the network. If there is a potential path, they have
used dataflow analysis to determine whether private data reaches a network sink.
An interesting tool which provides static analysis is Androguard [10]. Androguard is a tool written
in Python to work with Dex/Odex files (Dalvik virtual machine, .dex; disassembly and decompilation), APK files
(Android applications, .apk), Android’s binary XML (.xml) and Android resources (.arsc).
Among its most important features, it can map the DEX/ODEX/APK/AXML/ARSC
formats into full Python objects and manipulate them; disassemble, decompile and modify DEX/ODEX/APK
files; and decompile with the first native Dalvik decompiler (DAD), which goes directly from Dalvik bytecode
to Java source code. Androguard has also been used to perform static analysis by other research projects.
One example is Androwarn [9], a tool whose main aim is to detect potentially malicious behaviors
and warn the user about them. The detection is performed through static analysis of the application’s Dalvik
bytecode, represented as Smali. This analysis leads to the generation of a report, according to a technical
detail level chosen by the user.
APKInspector [7] likewise aims to help analysts and reverse engineers visualize compiled Android
packages and their corresponding DEX code. APKInspector provides both analysis functions and graphic
features for users to gain deep insight into malicious apps.
Finally, Andrubis [21] is a tool which analyzes unknown apps for the Android platform (APKs), just
as Anubis does for Windows executables. The report provided by Andrubis gives the human analyst
insight into various behavioral aspects and properties of a submitted app. To achieve comprehensive
results, Andrubis employs both static and dynamic analysis approaches.
Static analysis is useful at the time of application development, when potential vulnerabilities found
by the analysis can be fixed by the programmer in source code. Some human intervention is also needed
because static approaches, in order to be conservative, typically also report a number of false positives.
The programmer must then manually examine the reported errors to determine which are actual
vulnerabilities and which are not. There are two problems that need to be dealt with. Firstly, the problem must
be specified correctly. This means getting all the rules and corner cases for validating user input right.
Secondly, this specification must be implemented faithfully. Static approaches can catch implementation
errors, but not specification bugs. If a dynamic approach also independently performs its own checks,
it may be able to catch more errors than static checking alone. However, static approaches do provide
more accurate reports than runtime approaches, allow vulnerabilities to be fixed before an application is
deployed, and have no runtime performance overhead.
Because of the serious limitations of TaintDroid discussed above, Graa et al. in [17] have proposed
a hybrid approach which combines the advantages of static and dynamic
analysis. To solve the under-tainting problem in the Android system, their approach
improves the functionality of TaintDroid by integrating the concepts introduced by Trishul. Trishul
is an information flow control system. It is implemented in a Java virtual machine to secure the execution
of Java applications by tracking data flow within the environment. It does not require a change to the
operating system kernel because it analyzes the bytecode of the application being executed. Trishul is
based on a hybrid approach that correctly handles implicit flows using the compiled program, rather than
the source code, at load time.
Discussing Static Analysis: DroidKungFu and Zeus
DroidKungFu
This malware is included in repackaged apps made available through a number of alternative app
markets and forums targeting Chinese-speaking users. The malware adds a new
service and a new receiver to the infected app. The receiver is notified when the system finishes booting so that it can
automatically launch the service without user interaction. Once the service is started, DroidKungFu
collects a variety of information on the infected mobile phone, including the IMEI number, the phone
model and the Android OS version. With the collected information, the malware phones home by
making an HTTP POST to a hard-coded remote server.
Specifically, instead of including plaintext remote server URLs, the malware encrypts them and has
three C&C servers for additional redundancy or robustness. Inside the infected app, there is an
(encrypted) embedded apk that the malware will attempt to install after getting the root privilege.
The embedded apk, once decrypted, appears to be a fake Google Update app. If installed,
this embedded apk does not show any icon on the home screen. Our analysis shows that this app is
actually a backdoor, which connects back to a remote server for instructions. In essence, it effectively
converts the compromised phone into a bot.
Its onCreate() method attempts to get root access on the phone using two separate exploits. One
of them, which is related to an embedded file named “ratc” (the acronym of “RageAgainstTheCage”), is
encrypted but is decrypted at runtime (with the copyAssets method) and then executed to exploit
the adb resource exhaustion bug, which affects Android 2.2 and below. If successful, the malware can
elevate its privilege to root. Recent Android versions (2.3+) have patched this bug, so this exploit will
not be successful on them. In this case, the malware attempts to detect whether the phone has already been
rooted and, if so, requests the root privilege. In either case, the malware still phones home with the
collected phone information (e.g., IMEI and phone model). After obtaining the root privilege, the
DroidKungFu malware can essentially access arbitrary files on the phone and is able to install
or remove any package. One built-in payload of DroidKungFu is to install a hidden app named legacy
after getting the root privilege. The app is embedded as part of the infected host app and pretends to
be the legitimate Google Search app, bearing the same icon. It turns out that the fake app is a
backdoor.
Within a short two-month period from June to August 2011, three major
versions of the DroidKungFu malware were identified. Clearly, while the anti-virus companies diligently
push out signatures to detect malware in the wild, the malware authors are also working hard to evolve
their malware at a rapid pace to avoid detection. DroidKungFu now comes in different flavors (5 so
far), discovered by Prof. Xuxian Jiang (and his research team) and Lookout. A brief presentation of their
differences can be obtained with Androguard’s androsim.py tool.
sara@Sara-Compaq-8510w-KU288ES-ABZ:~/Scrivania/androguard$ ./androsim.py -i /home/sara/Scrivania/Nuovo/droidKungFu.apk /home/sara/Scrivania/Nuovo/droidkungfu2.apk -d
Elements:
    IDENTICAL: 45
    SIMILAR: 34
    NEW: 356
    DELETED: 209
    SKIPPED: 0
--> methods: 13.900310 % of similarities
Listing 1: androsim.py tool.
As we can see from [28], the package name of the malware is “com.tutusw.phonespeedup”:
<manifest android:versionCode="14" android:versionName="1.3.1" android:installLocation="auto"
    package="com.tutusw.phonespeedup">
<uses-sdk android:minSdkVersion="3" android:targetSdkVersion="8"/>
<uses-permission android:name="android.permission.RECEIVE_BOOT_COMPLETED"/>
<uses-permission android:name="android.permission.WAKE_LOCK"/>
<uses-permission android:name="android.permission.VIBRATE"/>
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE"/> <!-- Check connectivity to remote server -->
<uses-permission android:name="android.permission.ACCESS_WIFI_STATE"/> <!-- Uses wifi for connectivity -->
<uses-permission android:name="android.permission.CHANGE_WIFI_STATE"/>
<uses-permission android:name="android.permission.INTERNET"/> <!-- Communicate with remote server -->
<uses-permission android:name="android.permission.READ_PHONE_STATE"/> <!-- Get information from phone -->
Listing 2: Android Manifest.
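Once the manifest has been decoded to plain XML, the requested permissions can be listed with the Python standard library alone. The sketch below is not part of Androguard: the embedded three-permission excerpt, the WATCHLIST set and the function name are assumptions made for illustration, and a real APK stores the manifest in Android's binary XML, which must be decoded first.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical decoded excerpt, shortened from Listing 2.
MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.tutusw.phonespeedup">
  <uses-permission android:name="android.permission.RECEIVE_BOOT_COMPLETED"/>
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.READ_PHONE_STATE"/>
</manifest>"""

# Permissions singled out in this sketch; the selection is an assumption,
# not a standard list.
WATCHLIST = {"RECEIVE_BOOT_COMPLETED", "INTERNET", "READ_PHONE_STATE"}


def requested_permissions(manifest_xml):
    """Return the short names of all <uses-permission> entries, in document order."""
    root = ET.fromstring(manifest_xml)
    names = [el.get(f"{{{ANDROID_NS}}}name") for el in root.iter("uses-permission")]
    return [name.rsplit(".", 1)[-1] for name in names if name]


perms = requested_permissions(MANIFEST)
flagged = sorted(WATCHLIST & set(perms))
```

The combination of boot notification, network access and phone state matches the behaviour described for DroidKungFu: start at boot, read device identifiers, contact a remote server.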
The Android system requires that all installed applications be digitally signed with a certificate
whose private key is held by the application’s developer. The Android system uses the certificate as a
means of identifying the author of an application and establishing trust relationships between applications.
The certificate is not used to control which applications the user can install. The certificate does not need
to be signed by a certificate authority: it is perfectly allowable, and typical, for Android applications to
use self-signed certificates. The system will not install an application on an emulator or a device if it is
not signed; thus, all applications must be signed. To test and debug an application, the build tools sign
it with a special debug key that is created by the Android SDK build tools. The Android
system will not install or run an application that is not signed appropriately. Through Androguard [10]
we have been able to retrieve the signature of the malware from its malware database:
[
  {"SAMPLE": "apks/malwares/kungfu/droidkungfu.apk"},
  {"BASE": "AndroidOS", "NAME": "DroidKungfu",
   "SIGNATURE": [
     {"TYPE": "METHSIM", "CN": "Lcom/google/ssearch/SearchService;", "MN": "getPermission1", "D": "()Z"},
     {"TYPE": "METHSIM", "CN": "Lcom/google/ssearch/SearchService;", "MN": "getPermission2", "D": "()V"},
     {"TYPE": "METHSIM", "CN": "Lcom/google/ssearch/SearchService;", "MN": "getPermission3", "D": "()V"}
   ],
   "BF": "a && b && c"
  }
]
Listing 3: droidkungfu.sign from Androguard’s malware database.
When installed, the application checks whether the malicious service named
“com.google.ssearch.SearchService” is already running.
If the service is not found among the running services, the application starts it. We can see from the activities
declared in the Android Manifest file that, inside the
“com.google.ssearch.GoogleSsearch” activity, the malware starts its own service and then launches the
application’s primary activity. In the Android Manifest file we can also see that a new receiver is declared for the
malware. The receiver is notified when the system has completed the boot process, so that it can
start the service declared for the malware automatically, without user interaction. At first, the malware
checks the shared preferences and then checks the connectivity using the network information of the
device. Later it collects information on the device, for example the IMEI, the operating system type, the model
and more. The malware then tries to connect to a remote server. To get information about it, we have used
Wireshark [32], which is a free and open-source packet analyzer. It is used for network troubleshooting,
analysis, software and communications protocol development, and education. Wireshark is cross-platform,
using the GTK+ widget toolkit to implement its user interface and pcap to capture packets; it runs
on various Unix-like operating systems including Linux, OS X, BSD and Solaris, and on Microsoft Windows.
There is also a terminal-based (non-GUI) version called TShark. Wireshark, and the other programs
distributed with it such as TShark, are free software, released under the terms of the GNU General
Public License. Wireshark is similar to tcpdump, but has a graphical front-end plus some integrated sorting
and filtering options. It allows the user to put network interface controllers that support promiscuous
mode into that mode, in order to see all traffic visible on that interface, not just traffic addressed to one
of the interface’s configured addresses and broadcast/multicast traffic. However, when capturing with
a packet analyzer in promiscuous mode on a port of a network switch, not all of the traffic traveling
through the switch will necessarily be sent to the port on which the capture is being done, so capturing
in promiscuous mode will not necessarily be sufficient to see all traffic on the network. Port mirroring
or various network taps extend capture to any point on the network; simple passive taps are extremely resistant
to malware tampering.
This malware encrypts two well-known exploits, “exploid” (the udev exploit)
and “rage against the cage”. When the malware runs, it decrypts those two exploits and tries to
gain root access on the device. In the assets folder we can see 3 files, which are binary files encrypted with
the AES algorithm. The malware tries to get permissions using various methods: first, it checks the
permissions; second, it checks the version and tries to get permissions. If the malware cannot get
root, it asks the user to grant it. The exploit needs USB debugging (adb) in order to run
successfully. If USB debugging is not enabled, the malware has to get it enabled, which can be achieved with the victim’s
approval.
Mobile Zeus
The Zeus malware (also known as Zbot) first appeared in 2006, when a security firm released a full reverse
engineering analysis of an unknown trojan named PRG. Since then, it has been modified and customized
to suit specific needs and released in different variants, each one offering innovative features to steal
sensitive information. Like that of most banking Trojans, Zeus’s goal is to steal sensitive information that
could allow the attacker to carry out financial fraud against the victim. The Zeus environment is usually
composed of three different entities: the bot, i.e. the machine that has been infected; the Command and
Control server (C&C or dropzone), i.e. the main server where the control panel is hosted and where the bots
send the stolen information; and the configuration server, i.e. the server where the configuration file is
hosted, ready to be downloaded by the bots.
Mobile ZeuS [23], or Trojan-Spy.*.Zitmo, was designed for one sole purpose: to quickly steal mobile
Transaction Authentication Numbers (mTAN codes) without mobile users noticing. The first important
thing to point out is that ZitMo works in close collaboration with the regular ZeuS Trojan, a modification
of the Trojan that targets the Win32 platform. To defend users from this malware, Riccardi et al. in [25]
have proposed a technique to extract the keystream used by Zeus to encipher its payload. In 2010,
malicious users added a new function to the PC-based ZeuS. The way it worked remained more or
less the same, only now a modified authentication page would also ask users to enter data about their
mobile device (the make, model, and telephone number) in addition to their username and password.
Discussing Dynamic Analysis: DroidScope and DECAF
DroidScope [34] is an Android analysis platform for virtualization-based malware analysis. DroidScope
reconstructs both the OS-level and Java-level semantics simultaneously and seamlessly.
Figure 3 illustrates the architecture of the Android system from the perspective of a system
programmer.
Figure 3: Overview of Android System in [34]. (Diagram components: Linux kernel; Zygote; system services;
native components; system libraries; Java components; Java libraries; DVM; JNI; the DroidScope
instrumentation interface with its OS-level and Java-level views; API tracer, native instruction tracer,
Dalvik instruction tracer and taint tracer.)
To demonstrate the capabilities of DroidScope, Yan et al. have developed several analysis
tools on it. The API tracer monitors the malware’s activities at the API level to reason about how the
malware interacts with the Android runtime environment. This tool monitors how the malware’s Java
components communicate with the Android Java framework, how the native components interact with
the Linux system, and how Java components and native components communicate through the JNI
interface. The native instruction tracer and the Dalvik instruction tracer look into how a malicious app behaves
internally by recording detailed instruction traces. The Dalvik instruction tracer records Dalvik bytecode
instructions for the malware’s Java components and the native instruction tracer records machine-level
instructions for the native components (if they exist). The taint tracker observes how the malware
obtains and leaks sensitive information (e.g., GPS location, IMEI and IMSI) by leveraging the taint analysis
component in DroidScope. Dynamic taint analysis has been proposed as a key technique for analyzing
desktop malware, particularly with respect to information leakage behavior.
To reconstruct the OS-level
view for DroidScope, the authors employed techniques similar to those used for x86 platforms, generally known as virtual
machine introspection. The OS-level view is essential for analyzing native components. It also
serves as a basis for obtaining the Java-level view for analyzing Java components. With basic
instrumentation support, they extract the following OS-level semantic knowledge: system calls, running processes
(including threads) and the memory map. To obtain the system call information, the special instructions
used to issue system calls are instrumented, i.e. additional TCG instructions are inserted. In doing so, a callback function is invoked
which is responsible for retrieving additional information from memory. For important system calls (e.g.
open, close, read, write, connect, etc.), the system call parameters and return values are retrieved as well.
As a result, it is possible to understand how a user-level process accesses the file system and the network,
communicates with another process, and so on.
With the OS-level view and knowledge of how the DVM
operates internally, it is possible to reconstruct the Java or Dalvik view, including Dalvik instructions,
the current machine state, and Java objects. The DVM executes Dalvik bytecode in two ways: interpretation
and Just-In-Time (JIT) compilation. The interpreter, named mterp, uses an offset-addressing method to
map Dalvik opcodes to machine code blocks. The Just-In-Time compiler was introduced to improve
performance by compiling heavily used, or hot, Dalvik instruction traces (consisting of multiple code blocks)
directly into native machine code. Overall, JIT provides an excellent performance boost for programs
that contain many hot code regions, although it makes fine-grained instrumentation more difficult. This
is because JIT performs optimization on one or more Dalvik code blocks and thus blurs the Dalvik
instruction boundaries. Since completely disabling JIT at build time may incur a heavy performance penalty,
the authors have chosen to selectively disable JIT at runtime.
Java objects are described using two data
structures. Firstly, ClassObject describes a class type and contains important information about that
class: the class name, where it is defined in a dex file, the size of the object, the methods, and the location
of the member fields within the object instances. To standardize class representations, Dalvik creates
a ClassObject for each defined class type and implicit class type, e.g. arrays. Secondly, as an abstract
type, Object describes a runtime object instance, i.e. its member fields. Each Object has a pointer to the
ClassObject that it is an instance of, plus a tail accumulator array for storing all member fields.
Symbols
(such as function names, class names, field names, etc.) provide valuable information for human analysts
to understand program execution. Thus, DroidScope seeks to make the symbols readily available by
maintaining a symbol database. For portability, one database of offsets to symbols per module is
used. At runtime, finding a symbol by a virtual address requires first identifying the containing module
using the shadow memory map, and then calculating the offset to search the database. Native library
symbols are retrieved statically through objdump and are usually limited to Android libraries, since
malware libraries are often stripped of all symbol information. On the other hand, Dalvik or Java symbols
are retrieved dynamically, and static symbol information obtained through dexdump is used as a fallback. This
has the advantage of ensuring the best symbol coverage for optimized dex files and even dynamically
generated Dalvik bytecode.
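The two-step symbol lookup described above (containing module first, then offset within the module) can be sketched as follows. This is not DroidScope code: the module names, addresses, symbol names and function names are all hypothetical, and both tables are assumed to be kept sorted.

```python
import bisect

# Hypothetical shadow memory map: (start address, end address, module name),
# sorted by start address.
MEMORY_MAP = [
    (0x40000000, 0x40010000, "libexample.so"),
    (0x40100000, 0x40180000, "libother.so"),
]

# Hypothetical per-module databases: (offset, symbol) pairs sorted by offset.
SYMBOLS = {
    "libexample.so": [(0x100, "example_init"), (0x400, "example_run")],
    "libother.so": [(0x1000, "other_a"), (0x2000, "other_b")],
}


def find_module(addr):
    """Step 1: identify the module whose mapping contains the virtual address."""
    starts = [entry[0] for entry in MEMORY_MAP]
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0 and addr < MEMORY_MAP[i][1]:
        return MEMORY_MAP[i]
    return None


def lookup_symbol(addr):
    """Step 2: turn the address into a module offset and search that module's database."""
    module = find_module(addr)
    if module is None:
        return None
    start, _end, name = module
    offset = addr - start
    table = SYMBOLS.get(name, [])
    i = bisect.bisect_right([off for off, _ in table], offset) - 1
    if i < 0:
        return None
    sym_offset, symbol = table[i]
    return name, symbol, offset - sym_offset
```

Storing offsets rather than absolute addresses is what makes one database per module portable: the same table stays valid wherever the module happens to be mapped.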
DECAF is a multi-target binary analysis platform. The core idea is to abstract away the details
of different targets so that the analyst can focus on the analysis itself. For example, the program counter
register on x86 (EIP) contains the virtual address of the instruction currently being executed, while the
corresponding register on ARM (PC) points to the next instruction; DECAF offers a single function,
DECAF_getCurPC, that returns the address of the instruction being executed, together with target-specific functions
to obtain the register values. In this way, all the analyst has to do is register for different events,
such as “block begin”, “instruction begin” or “system call”. Following a similar philosophy, DECAF
also provides multiple virtual machine introspection facilities, so that no matter whether the guest machine is
Windows or Linux, the analyst will still be able to readily obtain a shadow process list, among other
things. The current version of DroidScope is built for Android Gingerbread. Since the authors mainly
dealt with the 32-bit ARM architecture and included some files from the Android source code as part of
DroidScope, it requires the host machine to be a 32-bit machine. DECAF does not have this
limitation. Since the original paper, the authors of DroidScope have been porting it to run on top of the
DECAF [35] binary analysis platform. The immediate advantages are:
• Seamless ARM and x86 Native API support;
• Dynamic loading of plugins;
• More refined NativeAPI: more callbacks, and some bug fixes;
• Better Virtual Machine Introspection support.
Let us now introduce the core of our research: the DalvikInstructionTracer plugin. Recalling that the
DalvikInstructionTracer records Dalvik bytecode instructions for the malware’s Java components, let us
show the most important pieces of its source code:
/* Terminal commands exported by the plugin. */
static mon_cmd_t DIT_term_cmds[] = {
#include "plugin_cmds.h"
    {NULL, NULL, },
};

/* Stop tracing and unregister the Dalvik instruction callback. */
void DIT_cleanup()
{
    if (gTracingPID != -1)
    {
        mterp_clear(gTracingPID);
        gTracingPID = -1;
    }

    if (DIT_handle != DECAF_NULL_HANDLE)
    {
        DS_Dalvik_unregister_callback(DS_DALVIK_INSN_BEGIN_CB, DIT_handle);
        DIT_handle = DECAF_NULL_HANDLE;
    }
}

plugin_interface_t DIT_interface;

/* Plugin entry point: fill in the interface and open the output file. */
plugin_interface_t *init_plugin(void)
{
    DIT_interface.mon_cmds = DIT_term_cmds;
    DIT_interface.plugin_cleanup = &DIT_cleanup;

    // initialize the plugin (output file opened in append mode)
    miofile = fopen("dalvikfile.txt", "a");

    DIT_init();
    return (&DIT_interface);
}
Listing 4: The DalvikInstructionTracer plugin source code.
Our Solution: Enhanced Dynamic Analysis
Due to the limitations of using emulators, we could not perform tainting activities, so in this chapter
we illustrate a simple malicious application we have built to carry out our research.
Thief Application
Following our study of the Android malware DroidKungFu, our application aims to steal sensitive
information, i.e. IMEI, contacts and accounts, and to perform background actions without the users’ consent.
Once the application is installed, users will see the screen shown in Figure 4.
Figure 4: Main Screen of Thief Activity.
The core idea of the application is to send sensitive data stolen from the device to a listening server.
Each time the user presses one of the buttons, a connection is set up and data are transmitted. To establish
a connection, we have created a Client and a Server, and the stolen data are encapsulated into a Message.
Before explaining which kind of service is related to each button, let us have a look at the Android
Manifest XML file to see the permissions.
<uses-permission android:name="android.permission.ACCESS_WIFI_STATE"/>
<uses-permission android:name="android.permission.CHANGE_WIFI_STATE"/>
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE"/>
<uses-permission android:name="android.permission.ACCESS_FINE_LOCATION"/>
<uses-permission android:name="android.permission.ACCESS_COARSE_LOCATION"/>
<uses-permission android:name="android.permission.INTERNET"/>
<uses-permission android:name="android.permission.READ_PHONE_STATE"/>
<uses-permission android:name="android.permission.ACCESS_MOCK_LOCATION"/>
<uses-permission android:name="android.permission.READ_INTERNAL_STORAGE"/>
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
<uses-permission android:name="android.permission.GET_ACCOUNTS"/>
<uses-permission android:name="android.permission.AUTHENTICATE_ACCOUNTS"/>
Listing 5: Android Manifest of Thief Application.
As already said, the IMEI number is a fundamental device identifier.
Consumer IMEIs have value to black market phone vendors. When a phone is reported stolen, its
IMEI is blacklisted, which prevents it from connecting to cellular networks. This is supposed to render
stolen phones useless.
In practice, thieves can alter phone IMEIs to replace blacklisted IMEIs with valid ones. This
motivates a market for valid consumer IMEIs.
By default, an emulator’s IMEI is “000-000-000-000-000”, but to make the activity as credible as
possible, we have changed it in the emulator-arm file.
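As background on the identifier format: a 15-digit IMEI ends with a check digit computed with the Luhn algorithm, so a replacement value can be checked for well-formedness before it is used. The sketch below is an illustration added here, not part of the Thief application; the function name is an assumption and the second sample number is a commonly quoted documentation example.

```python
def luhn_valid(imei):
    """Return True if the string holds 15 digits that pass the Luhn check."""
    digits = [int(c) for c in imei if c.isdigit()]   # ignore separators such as '-'
    if len(digits) != 15:
        return False
    total = 0
    for pos, d in enumerate(reversed(digits)):
        if pos % 2 == 1:          # every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9            # same as adding the two digits of the product
        total += d
    return total % 10 == 0
```

The all-zero default of the emulator passes the check trivially, since every term of the sum is zero.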
A user’s contact list includes contacts’ names,phone numbers,and e-mail addresses.This contact
information could be sold to scammers,spammers,or phishers.
As for the IMEI,an emulated device can’t provide a contacts list,so we have written a vcf file,where
we have stored faked names and telephone numbers.This file has been pushed into the simulated SD
card and then it has been imported into the emulator.
Once the contacts list have been stolen,we were able to send e-mails in background to each one of
them.
Another well-known malicious service we implemented is retrieving information about the user’s geographical position.
Since our experiments were performed on an emulator, we could not get real GPS coordinates. However, as we did for the IMEI number, we were able to emulate a pair of geographical coordinates through adb and through the DDMS perspective in Eclipse.
The last service we implemented floods the system with hundreds of IMEI requests.
This causes a Denial of Service (DoS), which is an attempt to make a machine or network resource unavailable to its intended users.
Generally speaking, the motives for, and targets of, a DoS attack may vary; such attacks generally consist of efforts to temporarily or indefinitely interrupt or suspend the services of a host connected to the Internet.
One common method of attack involves saturating the target machine with external communication requests, so much so that it cannot respond to legitimate traffic, or responds so slowly as to be rendered essentially unavailable. Such attacks usually lead to a server overload. In general terms, DoS attacks are implemented by forcing the targeted computer to reset, by consuming its resources so that it can no longer provide its intended service, or by obstructing the communication media between the intended users and the victim so that they can no longer communicate adequately.
A denial-of-service attack is characterized by an explicit attempt by attackers to prevent legitimate users of a service from using that service. There are two general forms of DoS attacks: those that crash services and those that flood services.
A DoS attack can be perpetrated in a number of ways. The five basic types of attack are:
1. Consumption of computational resources, such as bandwidth, disk space, or processor time.
2. Disruption of configuration information, such as routing information.
3. Disruption of state information, such as unsolicited resetting of TCP sessions.
4. Disruption of physical network components.
5. Obstructing the communication media between the intended users and the victim so that they can no longer communicate adequately.
In most cases DoS attacks involve forging IP sender addresses (IP address spoofing) so that the location of the attacking machines cannot easily be identified.
Our aim is to cause a Denial of Service by making the underlying DroidScope plugin unavailable. We allocated a large number of threads, each of which tries to read the IMEI number. Once created, these threads do not run immediately: they wait for a kind of green light, granted by the Semaforo object. The Semaforo is an object shared by all of them; it is initialized in the “red” state, and as soon as it switches to the “green” state, all the threads try to read the IMEI number simultaneously. Once launched, the flooding attack effectively causes a Denial of Service, halting not only the application but the entire emulator.
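The gating mechanism described above can be sketched in plain Java. This is only an illustration under stated assumptions: the Semaforo class below is a hypothetical reconstruction built on wait/notifyAll, the names awaitGreen, turnGreen and runGated are invented for the example, and the worker body is a harmless counter increment standing in for the actual request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class SemaforoSketch {

    /** Hypothetical two-state gate: threads block while it is "red". */
    static final class Semaforo {
        private boolean green = false; // the gate starts in the "red" state

        synchronized void awaitGreen() throws InterruptedException {
            while (!green) { // loop guards against spurious wake-ups
                wait();
            }
        }

        synchronized void turnGreen() {
            green = true;
            notifyAll(); // release every waiting thread at once
        }
    }

    /** Starts the given number of gated threads, opens the gate, and returns how many ran. */
    public static int runGated(int workers) {
        Semaforo gate = new Semaforo();
        AtomicInteger completed = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    gate.awaitGreen();
                    completed.incrementAndGet(); // placeholder for the real request
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        gate.turnGreen(); // switch from "red" to "green"
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runGated(8) + " threads released");
    }
}
```

Because every thread is already created and parked before the gate opens, all requests are issued in a burst rather than spread over the time it takes to create the threads.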
Running Thief Activity under DECAF’s control
Since the plugin is merely a printer of Dalvik instructions, our first task was to verify that different inputs would produce different outputs. Simply stated, the first step was to “taint” noteworthy variables. For example, by giving the emulator two different IMEI numbers, we were able to compare the corresponding logs.
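A comparison of this kind can be sketched as a line-by-line diff of two traces. The class and method names are hypothetical, and the sample lines in main are made up for illustration; they do not reproduce the real DalvikInstructionTracer output format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LogDiffSketch {

    /** Returns the 0-based indices of the lines at which the two traces differ. */
    public static List<Integer> differingLines(List<String> first, List<String> second) {
        List<Integer> diffs = new ArrayList<>();
        int max = Math.max(first.size(), second.size());
        for (int i = 0; i < max; i++) {
            // A missing line (one trace is shorter) counts as a difference.
            String left = i < first.size() ? first.get(i) : null;
            String right = i < second.size() ? second.get(i) : null;
            if (left == null || !left.equals(right)) {
                diffs.add(i);
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        // Made-up trace lines: only the value carried by the second line changes.
        List<String> runA = Arrays.asList("call getDeviceId", "value 111", "return");
        List<String> runB = Arrays.asList("call getDeviceId", "value 222", "return");
        System.out.println("differing line indices: " + differingLines(runA, runB));
    }
}
```

If the traces differ only at the lines that carry the changed input, the plugin is reflecting that input in its output, which is the property being checked.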
A DoS Attack on the Monitoring System
As already stated, our aim was to study the Thief Activity’s behavior under DECAF’s control. The application was so intrusive that it halted the entire system, and DECAF fared no better. As we expected, the attack succeeded: it hung the emulator, and the DalvikInstructionTracer could not trace all events, always leaving the last lines incomplete.
As a next step, we turned our attention to solving the flooding problem, trying to detect this kind of attack through the log file and to protect the system from the flood of calls.
Pattern Matching for suspicious Activity
Pattern matching [31] is the act of checking a given sequence of tokens for the presence of the constituents of some pattern. In contrast to pattern recognition, the match usually has to be exact. The patterns generally have the form of either sequences or tree structures. Uses of pattern matching include outputting the locations (if any) of a pattern within a token sequence, outputting some component of the matched pattern, and substituting the matching pattern with some other token sequence (i.e., search and replace).
Sequence patterns (e.g., a text string) are often described using regular expressions and matched using techniques such as backtracking.
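The three uses listed above (locating matches, extracting a component of a match, and search and replace) can be illustrated with the standard java.util.regex classes. The class name, method names and sample strings below are chosen only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternUsesSketch {

    /** Use 1: the start offsets of every match of the pattern in the text. */
    public static List<Integer> locations(String text, String regex) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    /** Use 2: the first capturing group of the first match, or null if there is none. */
    public static String firstGroup(String text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    /** Use 3: search and replace over all matches. */
    public static String replaceAll(String text, String regex, String replacement) {
        return Pattern.compile(regex).matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String text = "id=42; id=7;"; // made-up sample input
        System.out.println(locations(text, "id="));
        System.out.println(firstGroup(text, "id=(\\d+)"));
        System.out.println(replaceAll(text, "\\d+", "N"));
    }
}
```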
Tree patterns are used in some programming languages as a general tool to process data based on its structure; e.g., Haskell [14], ML [16] and the symbolic mathematics language Mathematica [15] have special syntax for expressing tree patterns and a language construct for conditional execution and value retrieval based on them. For simplicity and efficiency reasons, these tree patterns lack some features that are available in regular expressions.
Searching for a string in Java is straightforward: it can be done with the Pattern and Matcher classes, using the Matcher’s find method.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// pathofthefile, fstream and br are fields of the enclosing class.
private boolean leggiFile() {
    try {
        File directory = new File(pathofthefile);
        fstream = new FileInputStream(directory);
        br = new BufferedReader(new InputStreamReader(fstream));
        Pattern p = Pattern.compile("getColor");
        String stringApp;
        // Read line by line; readLine() returns null at end of file.
        while ((stringApp = br.readLine()) != null) {
            Matcher m = p.matcher(stringApp);
            if (m.find()) {
                return true; // first line containing the pattern
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return false;
}
Listing 6: Matching the string “getColor”.
The Anti-flooding Button
In this section we describe our idea for safeguarding the system from the attack described above.
Leaving the Flooding Button untouched, we duplicated it as the Anti-Flooding Button, adding code that could limit the damage. The main idea was to implement a service that, while the activity and the DalvikInstructionTracer are running, reads the log file as it is being written and, as soon as it finds suspicious entries, kills the current activity. The first obstacle is that the plugin writes its file outside the emulator’s SD card, which is the only location from which the activity can read files. In other words, the activity cannot be made aware of the Dalvik log file produced by the DalvikInstructionTracer.
To get around this problem, we implemented a kind of server, named “reader”.
public Reader(int port) {
    try {
        letSocket = new ServerSocket(port);
        readData();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Listing 7: The server Reader.
Like a real server, the reader establishes a connection with the application running on the emulator and, as the name suggests, reads the Dalvik log file that the activity does not have access to. While reading, it searches for the method “getColor”, which unmistakably reveals that the attack is about to start. When the pattern is matched, the method returns a flag to the activity. If the flag is true, the string has been found, and the receiving activity immediately shows a dialog. This dialog has a numeric field in which the user can enter the PID of the application to stop, thus interrupting the flooding.
Figure 5: Returning a true flag and showing the dialog to enter the PID and stop the activity.
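The exchange between the reader and the activity can be sketched as follows. This is a minimal sketch under assumptions, not the thesis implementation: the class ReaderSketch, the methods containsMarker and serveOnce, and the one-boolean wire format are all hypothetical, the trace is an in-memory string with made-up lines, and a plain substring test stands in for the regular-expression search.

```java
import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ReaderSketch {

    /** Returns true as soon as one line of the trace contains the marker. */
    public static boolean containsMarker(Reader trace, String marker) {
        try (BufferedReader br = new BufferedReader(trace)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.contains(marker)) {
                    return true;
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return false;
    }

    /** Accepts one client and sends it the flag as a single boolean (assumed format). */
    public static void serveOnce(ServerSocket server, Reader trace, String marker)
            throws IOException {
        try (Socket client = server.accept();
             DataOutputStream out = new DataOutputStream(client.getOutputStream())) {
            out.writeBoolean(containsMarker(trace, marker));
        }
    }

    public static void main(String[] args) throws Exception {
        String fakeTrace = "invoke foo\ninvoke getColor\n"; // made-up trace lines
        try (ServerSocket server = new ServerSocket(0)) {   // port 0: any free port
            Thread serverThread = new Thread(() -> {
                try {
                    serveOnce(server, new StringReader(fakeTrace), "getColor");
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            serverThread.start();
            // The client side plays the role of the activity receiving the flag.
            try (Socket socket = new Socket(InetAddress.getLoopbackAddress(),
                                            server.getLocalPort());
                 DataInputStream in = new DataInputStream(socket.getInputStream())) {
                System.out.println("attack suspected: " + in.readBoolean());
            }
            serverThread.join();
        }
    }
}
```

Keeping the file access on the server side is what lets the activity learn about a log it cannot read itself: only the resulting flag crosses the socket.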
Discussion and Conclusions
Today’s smartphone operating systems frequently fail to provide users with adequate control over, and visibility into, how third-party applications use their private data.
The popularity of these platforms also encourages malware authors to penetrate various mobile marketplaces with malicious applications (or apps).
These malicious apps hide among the sheer number of normal apps, which makes their detection challenging. Unofficial repositories also exist, where developers can upload applications, including cracked applications or trojan horses. This has allowed attackers to upload malware to the Google Market and also to spread malware through unofficial repositories.
Existing mobile anti-virus software is inadequate because of its reactive nature: it relies on known malware samples for signature extraction.
Contributions
The most important contribution of this work is the mechanism we propose for obtaining and analyzing real traces of application behavior.
Using several tools [2,1,21,7,10] (see §), we were able to study the code of real malware that had affected Android users in the past.
Furthermore, with the DroidScope/DECAF analysis platform of Yan et al. [34], we focused on detecting anomalous applications at runtime (§). In particular, we tested their DalvikInstructionTracer plugin. By deploying our own app, we demonstrated its effectiveness in tainting data.
This analysis technique has been widely used in the literature. We have seen that there are many different approaches to detecting malware. We consider monitoring Dalvik system calls to be one of the most accurate techniques for determining the behavior of Android applications, since they provide detailed low-level information.
We also considered the benefits provided by a virtualization-based analysis platform: it can analyze even the most privileged attacks in the kernel, and the analysis is performed entirely externally.
The next step was to launch a flooding attack that could block the plugin’s activity and also the entire system. Once we had launched the attack, we set about finding a solution.
We proposed a solution based on string matching. During the activity’s execution, we could observe its behavior, and we could even alert users when the attack was about to start.
Approach Limitations
First of all, because our approach is based on DroidScope/DECAF, we have to consider its limitations, described in more detail in [34], i.e. limited code coverage and the detection and evasion of DroidScope.
In fact, emulation-resistant malware detects whether it is running within an emulated environment and evades analysis by staying dormant or simply crashing itself.
Furthermore, our anti-flooding system is penalized by performance overhead. In fact, our experiments are run in a virtual machine, which adds a further virtualization layer and slows down the entire system.
Future Work
We could improve our approach along two directions.
First, the DalvikInstructionTracer plugin can be improved in terms of running time: it prints a large number of strings at a time, considerably slowing down the whole experimental environment. For example, logs could be filtered according to fixed parameters, improving performance.
Second, the plugins available so far only for the x86 platform, i.e. the APITracer, the NativeInstructionTracer and the TaintTracker, can be ported to ARM.
Bibliography
[1] Ip2location, bringing location to the internet. [Online; accessed 20-July-2013].
[2] Proguard, 2013. [Online; accessed 9-August-2013].
[3] Jeffrey Bickford, H. Andrés Lagar-Cavilla, Alexander Varshavsky, Vinod Ganapathy, and Liviu Iftode. Security versus energy tradeoffs in host-based mobile malware detection. In Proceedings of the 9th international conference on Mobile systems, applications, and services, MobiSys ’11, pages 225–238, New York, NY, USA, 2011. ACM.
[4] Sven Bugiel, Lucas Davi, Alexandra Dmitrienko, Thomas Fischer, and Ahmad-Reza Sadeghi. Xmandroid: A new android evolution to mitigate privilege escalation attacks. Technical Report TR-2011-04, Technische Universität Darmstadt, Apr 2011.
[5] Sven Bugiel, Lucas Davi, Alexandra Dmitrienko, Stephan Heuser, Ahmad-Reza Sadeghi, and Bhargava Shastry. Practical and lightweight domain isolation on android. In Proceedings of the 1st ACM workshop on Security and privacy in smartphones and mobile devices, SPSM ’11, pages 51–62, New York, NY, USA, 2011. ACM.
[6] Iker Burguera, Urko Zurutuza, and Simin Nadjm-Tehrani. Crowdroid: behavior-based malware detection system for android. In Proceedings of the 1st ACM workshop on Security and privacy in smartphones and mobile devices, SPSM ’11, pages 15–26, New York, NY, USA, 2011. ACM.
[7] Yuan Tian Cong Zheng, Ryan W. Smith. Apkinspector, 2012. [Online; accessed 19-May-2013].
[8] Mauro Conti, Vu Thien Nga Nguyen, and Bruno Crispo. Crepe: context-related policy enforcement for android. In Proceedings of the 13th international conference on Information security, ISC’10, pages 331–345, Berlin, Heidelberg, 2011. Springer-Verlag.
[9] Thomas D. Androwarn, yet another static code analyzer for malicious android applications, 2012. [Online; accessed 19-May-2013].
[10] Anthony Desnos. Androguard, reverse engineering, malware and goodware analysis of android applications... and more (ninja!), 2012. [Online; accessed 19-May-2013].
[11] William Enck, Peter Gilbert, Byung-Gon Chun, Landon P. Cox, Jaeyeon Jung, Patrick McDaniel, and Anmol N. Sheth. Taintdroid: an information-flow tracking system for realtime privacy monitoring on smartphones. In Proceedings of the 9th USENIX conference on Operating systems design and implementation, OSDI’10, pages 1–6, Berkeley, CA, USA, 2010. USENIX Association.
[12] William Enck, Machigar Ongtang, and Patrick McDaniel. Understanding android security. IEEE Security and Privacy, 7(1):50–57, Jan 2009.
[13] Clint Gibler, Jonathan Crussell, Jeremy Erickson, and Hao Chen. Androidleaks: automatically detecting potential privacy leaks in android applications on a large scale. In Proceedings of the 5th international conference on Trust and Trustworthy Computing, TRUST’12, pages 291–307, Berlin, Heidelberg, 2012. Springer-Verlag.
[14] Google. Haskell, 2013. [Online; accessed 06-August-2013].
[15] Google. Mathematica, 2013. [Online; accessed 06-August-2013].
[16] Google. Ml, 2013. [Online; accessed 06-August-2013].
[17] Mariem Graa, Nora Cuppens-Boulahia, Frédéric Cuppens, and Ana Cavalli. Detecting control flow in smarphones: combining static and dynamic analyses. In Proceedings of the 4th international conference on Cyberspace Safety and Security, CSS’12, pages 33–47, Berlin, Heidelberg, 2012. Springer-Verlag.
[18] Michael Grace, Yajin Zhou, Qiang Zhang, Shihong Zou, and Xuxian Jiang. Riskranker: scalable and accurate zero-day android malware detection. In Proceedings of the 10th international conference on Mobile systems, applications, and services, MobiSys ’12, pages 281–294, New York, NY, USA, 2012. ACM.
[19] Kevin Gudeth, Matthew Pirretti, Katrin Hoeper, and Ron Buskey. Delivering secure applications on commercial mobile devices: the case for bare metal hypervisors. In Proceedings of the 1st ACM workshop on Security and privacy in smartphones and mobile devices, SPSM ’11, pages 33–38, New York, NY, USA, 2011. ACM.
[20] Vivek Haldar, Deepak Chandra, and Michael Franz. Dynamic taint propagation for java. In Proceedings of the 21st Annual Computer Security Applications Conference, ACSAC ’05, pages 303–311, Washington, DC, USA, 2005. IEEE Computer Society.
[21] iSecLab. Andrubis: A tool for analyzing unknown android applications, 2012. [Online; accessed 19-May-2013].
[22] Long Lu, Zhichun Li, Zhenyu Wu, Wenke Lee, and Guofei Jiang. Chex: statically vetting android apps for component hijacking vulnerabilities. In Proceedings of the 2012 ACM conference on Computer and communications security, CCS ’12, pages 229–240, New York, NY, USA, 2012. ACM.
[23] Denis Maslennikov. Update: Security alert: Hacked websites serve suspicious android apps (notcompatible), October 6, 2012. [Online; accessed 29-November-2012].
[24] Vaibhav Rastogi, Yan Chen, and Xuxian Jiang. Droidchameleon: evaluating android anti-malware against transformation attacks. In Proceedings of the 8th ACM SIGSAC symposium on Information, computer and communications security, ASIA CCS ’13, pages 329–334, New York, NY, USA, 2013. ACM.
[25] Marco Riccardi, Roberto Di Pietro, Marta Palanques, and Jorge Aguilí Vila. Titans’ revenge: Detecting zeus via its own flaws. Comput. Netw., 57(2):422–435, Feb 2013.
[26] Edward J. Schwartz, Thanassis Avgerinos, and David Brumley. All you ever wanted to know about dynamic taint analysis and forward symbolic execution (but might have been afraid to ask). In Proceedings of the 2010 IEEE Symposium on Security and Privacy, SP ’10, pages 317–331, Washington, DC, USA, 2010. IEEE Computer Society.
[27] Karsten Sohr, Tanveer Mustafa, and Adrian Nowak. Software security aspects of java-based mobile phones. In Proceedings of the 2011 ACM Symposium on Applied Computing, SAC ’11, pages 1494–1501, New York, NY, USA, 2011. ACM.
[28] 2012 AVG Technologies. [Online; accessed 14-April-2013].
[29] Zhi Wang, Jiang Ming, Chunfu Jia, and Debin Gao. Linear obfuscation to combat symbolic execution. In Proceedings of the 16th European conference on Research in computer security, ESORICS’11, pages 210–226, Berlin, Heidelberg, 2011. Springer-Verlag.
[30] T.J. Watson. Welcome to the t.j. watson libraries for analysis (wala), 2006. [Online; accessed 19-July-2013].
[31] Wikipedia. Pattern matching — Wikipedia, the free encyclopedia, 2013. [Online; accessed 6-August-2013].
[32] Wikipedia. Wireshark — Wikipedia, the free encyclopedia, 2013. [Online; accessed 21-April-2013].
[33] Lok-Kwong Yan, Manjukumar Jayachandra, Mu Zhang, and Heng Yin. V2e: combining hardware virtualization and software emulation for transparent and extensible malware analysis. SIGPLAN Not., 47(7):227–238, March 2012.
[34] Lok Kwong Yan and Heng Yin. Droidscope: seamlessly reconstructing the os and dalvik semantic views for dynamic android malware analysis. In Proceedings of the 21st USENIX conference on Security symposium, Security’12, pages 29–29, Berkeley, CA, USA, 2012. USENIX Association.
[35] Lok Kwong Yan and Heng Yin, 2013. [Online; accessed 12-January-2013].