Design and Implementation of a Policy Enforcement Scheme for iOS

Tim Werthmann
Diploma Thesis – March 13, 2012.
Chair for System Security.
1st Supervisor: Prof. Dr. rer. nat. Thorsten Holz
2nd Supervisor: Prof. Dr.-Ing. Ahmad-Reza Sadeghi
Advisors: Dipl.-Inf. Ralf Hund, M.Sc. Lucas Davi
Abstract
Today, smartphone platforms are widespread and used every day to store a wide variety
of information. However, smartphone operating systems frequently fail to adequately
protect a user’s privacy from third-party applications, either because these applications
perform malicious tasks or because they contain vulnerabilities that an adversary can
exploit. We address this issue with a novel enforcement framework, presented in this
thesis, that enables a policy maker to restrict any application’s access to any desired
operating system resource, including privacy-sensitive information. Our solution focuses
on Apple’s iOS and overcomes several obstacles caused by the extensive use of Objective-C
and the underlying ARM architecture by means of a new in-depth analysis of the Mach-O
file format and the Objective-C runtime system. Our reference implementation enables a
policy maker to detect and react to a policy violation by either terminating the process
or substituting the enforced call or the sensitive data. Furthermore, our performance
evaluation demonstrates that our prototype induces only little overhead at load-time and
negligible overhead at runtime.
Declaration
I hereby declare that this submission is my own work and that, to the best of my
knowledge and belief, it contains no material previously published or written by another
person nor material which to a substantial extent has been accepted for the award of any
other degree or diploma of the university or other institute of higher learning, except
where due acknowledgment has been made in the text.
Erklärung (Declaration, translated from German)
I hereby affirm that I wrote this thesis independently and used no sources or aids other
than those indicated, that all passages of the thesis taken verbatim or in substance from
other sources are marked as such, and that this thesis has not been submitted in the same
or a similar form to any other examination authority.
Date author
Dedicated posthumously to my grandfather,
Heinz Werthmann
Contents
Acronyms
1 Introduction
1.1 Motivation
1.2 Contribution
1.3 Organization of this Thesis
2 Background
2.1 ARM Architecture
2.1.1 ARM Processor Modes and Core Registers
2.1.2 The Program Status Registers and Execution State Registers
2.1.3 Procedure Call Standard for the ARM Architecture
2.2 iOS
2.2.1 iOS Frameworks
2.2.2 Model-View-Controller Design Pattern
2.2.3 The Application Launch Cycle
2.2.4 Security Features
2.2.5 Function-Calling Conventions Used in the iOS ABI
2.3 Objective-C
2.3.1 Classes and Objects
2.3.2 Messaging
2.3.3 Declared Properties
2.3.4 Protocols
2.3.5 Class Clusters
2.3.6 Runtime System
2.3.7 Memory Management
2.4 Mach-O File Format
2.5 Control-Flow Integrity
3 Design of the Enforcement Framework
3.1 Preprocessing
3.2 Binary Analysis
3.3 Load-Time Module: Binary Rewriting
3.4 Runtime Module: Control-Flow Integrity Enforcement
3.5 Runtime Module: Policy Enforcement
4 Implementation Details
4.1 Static Analysis
4.1.1 Patchfile Generation
4.1.2 Control-Flow Graph Determination
4.2 Load-Time Module
4.2.1 Shadow Stack Initialization
4.2.2 Exception Handling Initialization
4.2.3 Binary Rewrite Initialization
4.2.4 Control-Flow Graph Initialization
4.2.5 Policy Enforcement Initialization
4.2.6 Binary Rewrite
4.3 Runtime Module
4.3.1 Control-Flow Enforcement
4.3.2 Policy Enforcement
5 Security Considerations and Discussion
5.1 Integration and Discussion of the Policy Enforcement Framework
5.2 Optimizations
5.3 Possible Attacks and Countermeasures
6 Evaluation
6.1 Privacy Policy Enforcement
6.2 Performance
7 Related Work
7.1 PiOS
7.2 Vx32
7.3 TaintDroid
7.4 AppFence
8 Conclusion and Future Work
8.1 Summary
8.2 Future Work
Bibliography
A ARM Condition Codes
B Mach-O Header Data Structure
C Patch File Generation Workflow Charts
D Objective-C Structures
E Configuration Options for Argument Rules
F Used Trampolines
G Control-Flow Integrity Workflow Charts
List of Figures
2.1 The format of the CPSR and SPSRs
2.2 Format of the APSR
2.3 Framework layers in iOS
2.4 The Model-View-Controller design pattern
2.5 The application launch cycle
2.6 Objects in Objective-C
2.7 Exemplary class hierarchy
2.8 Objective-C object linkage in the class hierarchy
2.9 Objective-C messaging
2.10 Example of a class cluster
2.11 The Mach-O file structure
2.12 Exemplary control-flow between basic blocks
2.13 Control-flow enforcement using inline validation
2.14 Control-flow enforcement using a validation function
3.1 Framework architecture
3.2 Policy enforcement architecture
4.1 Patch Generator Workflow
4.2 CFG Generation Workflow
4.3 Workflow of the load-time module
4.4 General trampoline for internal calls
4.5 Runtime module architecture
4.6 Workflow of the policy enforcement
6.1 Load-time comparison chart
6.2 Runtime comparison chart
C.1 Workflow of the function call processing
C.2 Workflow of the function return by BX processing
C.3 Workflow of the BX jump processing
C.4 Workflow of the return by POP processing
C.5 Workflow of the PC jump processing
C.6 Workflow of the table jump processing
C.7 Workflow of the meta information processing
F.1 General trampoline for external calls
F.2 Extended general trampoline for external calls
F.3 Custom ARM trampoline for non PC relative branches
F.4 Custom ARM trampoline for PC relative branches
F.5 Custom Thumb trampoline for non PC relative branches
F.6 Custom Thumb trampoline for PC relative branches
G.1 Workflow of the validation function for direct internal calls
G.2 Workflow of the validation function for direct external calls
G.3 Workflow of the validation function for Objective-C calls
G.4 Workflow of the validation function for indirect calls
G.5 Workflow of the validation function for indirect jumps
G.6 Workflow of the validation function for register based returns
G.7 Workflow of the validation function for table branches
G.8 Workflow of the validation function for stack based returns
List of Tables
2.1 ARM processor modes
2.2 Organization of general-purpose registers and program status registers
2.3 The derived execution states of the instruction set and the endianness
2.4 If-Then execution state
2.5 Register roles in the AAPCS
6.1 Property lists used to harvest sensitive data
6.2 Policy rules covering the files presented in Table 6.1
A.1 ARM condition codes
E.1 Type identifiers for policy rules
E.2 Comparator identifiers for policy rules
List of Algorithms
4.1 Offset decoding
List of Listings
2.1 Thumb return on C and C++ interworking
2.2 Example of an iOS function prolog
2.3 Example of an iOS function epilog
2.4 Objective-C message syntax
2.5 Objective-C method name derivation
2.6 A derived Objective-C method name
2.7 Objective-C message example
2.8 Objective-C message example continued
2.9 Objective-C runtime method call
2.10 Example Objective-C method call
2.11 Example Objective-C runtime class method call
2.12 Example Objective-C runtime instance method call
2.13 Concrete exemplary Objective-C method call
2.14 Concrete exemplary Objective-C method call continued
4.1 Patch file entry example
4.2 Example of the binding information linkage
4.3 Example of the selector linkage
4.4 Example of the API call linkage
4.5 Example of a compiled embedded class_t structure
4.6 Example of a compiled embedded class_ro_t structure
4.7 The class_t structure used on iOS
4.8 Example of a compiled embedded method_list_t structure
4.9 Example of a compiled embedded protocol_list_t structure
4.10 Example of a compiled embedded protocol_t structure
4.11 Example of a compiled embedded ivar_list_t structure
4.12 Example of a compiled embedded objc_property_list structure
4.13 Objective-C runtime system call to retrieve the address of a compiled selector
4.14 Objective-C runtime system call to retrieve the address of a class
4.15 Objective-C policy rule example
5.1 Objective-C message to retrieve the current device information
5.2 Objective-C runtime system call to retrieve a method
5.3 Objective-C runtime system call to retrieve a method continued
5.4 Objective-C runtime system call to exchange two method implementations
6.1 Accessing the email account configuration
6.2 Accessing the device’s UUID and the phone number
6.3 Accessing personal photos
6.4 Accessing the address book
6.5 Accessing the keyboard cache
B.1 Mach-O header
B.2 Mach-O CPU types
B.3 Mach-O ARM sub CPU types
B.4 Mach-O load command
B.5 nlist structure
D.1 The id structure
D.2 The class_t structure
D.3 The class_ro structure
D.4 The method_list_t structure
D.5 The protocol_list_t structure
D.6 The method_t structure
D.7 The protocol_t structure
D.8 The ivar_list_t structure
D.9 The ivar_t structure
D.10 The objc_property_list structure
D.11 The objc_property structure
Acronyms
ARM Advanced RISC Machine
RISC Reduced Instruction Set Computer or Computing
CFI Control-Flow Integrity
MoCFI Mobile Control-Flow Integrity
API Application Programming Interface
ThumbEE Thumb Execution Environment
JIT Just-In-Time
DAC Dynamic Adaptive Compilation
AOT Ahead-Of-Time
SP Stack Pointer
LR Link Register
PC Program Counter
CPSR Current Program Status Register
SPSR Saved Program Status Register
APSR Application Program Status Register
ITSTATE If-Then execution state
GE Greater than or Equal
RAZ Read-As-Zero
SBZP Should-Be-Zero-or-Preserved
SVC Supervisor Call
ENDIANSTATE Endianness execution state
AAPCS Procedure Call Standard for the ARM Architecture
PSR Program Status Register
MVC Model-View-Controller
SSP Stack-Smashing Protector
ASLR Address Space Layout Randomization
PIE Position-Independent Executable
CSE Code Signing Enforcement
XML Extensible Markup Language
MAC Mandatory Access Control
MCS Mandatory Code Signing
ABI Application Binary Interface
Mach-O Mach Object
BBL Basic Block
CFG Control-Flow Graph
UUID Universally Unique Identifier
IMSI International Mobile Subscriber Identity
ICCID Integrated Circuit Card Identifier
IPC Inter-Process Communication
1 Introduction
Programs from untrusted third parties often contain vulnerable or malicious code.
Vulnerable code poses a high security risk to any system, as it can allow an adversary to
subvert the program’s control-flow and, in the worst case, to inject malicious code. Such
malicious code can be used, e.g., to steal private data, take control of the machine it is
running on, or spy on a user’s behavior. The impact of such malicious behavior can be
reduced by running untrusted code in virtual environments, so-called sandboxes. Virtual
machines, such as VMware, Bochs, and Virtual PC, abstract the host system’s hardware
and map these resources to the virtual guest machine. As this technique virtualizes a
complete system, it is very complex, and in most cases its full feature set is not needed
to contain malicious behavior. In such cases, it is sufficient to monitor the control-flow
of a program and ensure that a certain policy is met, or enforced if violated. Such access
control systems intercept system calls, a technique called hooking, to realize a sandbox.
Today, smartphone platforms store sensitive data such as address books, notes, SMS,
calendars, e-mails, and often passwords or payment information. The need for proper
protection against, e.g., data theft or data alteration is intensified by the fact that in
2011 alone total smartphone sales reached 472 million units (up 58 percent from 2010) and
are estimated to grow by 39 percent in 2012 [1]. Full virtualization on such mobile
devices would have a significant impact on both performance and battery life, whereas a
sandboxing scheme based on policy enforcement is less costly in both computation time
and power consumption, as it lacks the hardware abstraction.
The downside of stand-alone policy enforcement is that it does not prevent control-flow
subversion caused by vulnerabilities. Since control-flow attacks have been among the most
prominent and successful software attacks for over 20 years [2], they pose a serious
threat. A subversion modifies the program state and, as a consequence, induces abnormal
program behavior that will most likely subvert the policy enforcement. A subversion of
the control-flow must therefore be prevented in order to ensure the policy enforcement.
One approach to prevent control-flow attacks is the enforcement of control-flow integrity
(CFI) [3], which asserts that only predetermined, legitimate paths of a program’s
control-flow are followed. Recently, CFI has been applied to mobile platforms: the authors
of mobile CFI (MoCFI) [3] have successfully implemented a shared library to assert a
program’s control-flow graph on Apple’s iOS.
This thesis presents the design and implementation of an enforcement framework based
on MoCFI. The implementation focuses on Apple’s operating system iOS running on the
ARM architecture, as used on iPhone, iPad, and iPod touch. With a market share of 23.8
percent, or 35.4 million units in the year 2011, iOS is the second most used operating
system on smartphone platforms behind Google’s Android [1]. The iOS operating system
already provides application sandboxing, but the built-in mechanisms are insufficient, as
all third-party applications share the same sandboxing profile. Furthermore, the profile
grants access to files that store sensitive data and to any public framework API. In
Apple’s terminology, a public framework is a directory that contains a dynamic shared
library and the resources needed to support that library, and that can be used by any
developer (in contrast to private frameworks, which Apple deems not to be of public
interest and which thus must not be used by anyone but Apple).
One obstacle to overcome is the fact that iOS applications are written in the highly
dynamic programming language Objective-C, which uses object messaging to interact with
an underlying runtime system. Objective-C is extensively used throughout all third-party
applications in parallel with standard API calls.
The presented enforcement framework solves these problems and drawbacks and enables
a separate policy rule set for each application, active verification of all API calls
and Objective-C messages, and configurable actions on policy violations. Hence, the
presented framework is able to create application compartments that can be used to
control each application’s access to the underlying operating system using fine-grained,
configurable rules. Furthermore, this thesis provides an exemplary rule set that prevents
an application from accessing privacy-sensitive data which, without such rules, any
application is allowed to access, process, and even transfer to any other party.
1.1 Motivation
With more than 200 million users worldwide [4], Apple’s operating system iOS is among
the most used platforms on mobile phones. Applications, so-called apps, are primarily
written in Objective-C or Objective-C++ and are distributed through the App Store, where
each distributed application is reviewed by Apple. The review process is unknown [2, 5],
but at most three distributed applications were reported, and even pulled out of the App
Store, due to data harvesting [5]. This is possible because every application that
utilizes Objective-C or Objective-C++ (iOS also supports C and C++) can freely use the
complete public framework. Apple provides application sandboxing, which limits the impact
of an application’s behavior. However, iOS sandboxing is based on system call hooking,
which limits access to the system resources but not to the public framework [6, 7, 8].
The proof-of-concept application SpyPhone showed that a significant quantity of personal
data can be harvested on iPhones using only public framework functions. SpyPhone was
developed to demonstrate the importance of alertness in the context of distributed
applications, as it proved that any distributed application is able to access sensitive
data without the user’s explicit permission. Although iOS enforces application
sandboxing, SpyPhone is able to access the 20 most recent Safari searches, the YouTube
history, the email account parameters, the phone information (e.g., the phone number,
the universally unique identifier (UUID), the integrated circuit card identifier (ICCID),
and the international mobile subscriber identity (IMSI)), the keyboard cache, and private
photos. The IMSI reveals the country and the carrier, and since on iOS each photo is
tagged with the current GPS location, an adversary can trace the user’s location by
inspecting the private photos. Moreover, SpyPhone can access the user’s previous
location by accessing the Map application, the address book, the timezone, the data from
the weather application, and the Wi-Fi logs [5].
This raises the question whether it is possible to build a policy enforcement that limits
access to the public framework. It is clear that function calls could be declared as
part of the private framework, but this would restrict applications with legitimate
access as well. Further, it could be possible to access the private framework at runtime
using obfuscation techniques, e.g., by using runtime-generated messages to the
Objective-C runtime system.
Our approach builds on the fact that all messages of an application to the Objective-C
runtime and all API calls are known before execution. Embedding this assertion into
a control-flow check effectively stops even sophisticated illegal interactions and can be
used to verify all calls and messages against a policy rule set.
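The idea of verifying every intercepted call or message against a rule set can be illustrated with a small sketch. This is not the framework's rule format or implementation (those are described in Chapter 4); the `Rule` type, the `enforce` function, the action names, and the class and selector names are hypothetical placeholders chosen only to show the two reactions mentioned in the abstract, termination and substitution.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence

# Hypothetical action names; the thesis only states that a violation can lead
# to process termination or to substitution of the call or data.
TERMINATE = "terminate"
SUBSTITUTE = "substitute"


@dataclass(frozen=True)
class Rule:
    """Illustrative rule: matches one message by class name and selector."""
    cls: str
    selector: str
    action: str
    substitute: Optional[Any] = None  # value handed back when substituting


class PolicyViolation(Exception):
    """Stands in for terminating the process in this sketch."""


def enforce(rules: Sequence[Rule], cls: str, selector: str,
            original_call: Callable[[], Any]) -> Any:
    """Check one intercepted message against the rules before it runs."""
    for rule in rules:
        if rule.cls == cls and rule.selector == selector:
            if rule.action == TERMINATE:
                raise PolicyViolation(f"[{cls} {selector}] denied by policy")
            return rule.substitute      # return harmless data instead
    return original_call()              # no rule matched: call proceeds


# Placeholder rule set for one application (names are made up).
RULES = [
    Rule("DeviceInfo", "uniqueIdentifier", SUBSTITUTE, "0000"),
    Rule("AddressBook", "allContacts", TERMINATE),
]
```

In this sketch a matching rule wins over the original call, so the sensitive value is never computed when a substitution rule applies; whether the actual framework behaves the same way is not implied here.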
1.2 Contribution
We make the following two contributions. First, to the best of our knowledge, we provide
the first policy enforcement framework on iOS running on the ARM architecture. Second,
we designed an in-depth Mach-O file analysis that specifically retrieves the Objective-C
constructs used in an iOS application and resolves API calls compiled in a
position-independent executable.
Similar solutions already exist for the Android platform to reduce sensitive data
exposure. In contrast, our solution supports the peculiarities of the Objective-C runtime
system and runs without changes to the underlying operating system or system kernel.
Besides these differences, our enforcement framework can operate in the unprivileged
user mode (this does not completely apply to our introduced exception handler, as an
exception triggers a mode change), requires no source code of the application it is
applied to, and is completely transparent to an end user. Furthermore, our design aims at
supporting any policy rule a policy maker wishes to apply. Hence, while the main target
is to tackle privacy violations, we can also enforce fine-grained access control rules
that can be used to isolate an application. We also demonstrate the effectiveness of our
novel iOS policy enforcement framework: the proof-of-concept application SpyPhone, which
is able to collect a significant amount of data, can be completely prevented from
accessing any sensitive data without any side effects.
1.3 Organization of this Thesis
The rest of this thesis is organized as follows: in Chapter 2, we recall the ARM
architecture, discuss features of the iOS operating system, introduce the Objective-C
programming language, outline the Mach-O file format, and finally explain the general
concepts of control-flow integrity. Chapter 3 outlines the design of the policy
enforcement and its static analysis, load-time, and runtime modules, while the
implementation details of the enforcement framework are described in Chapter 4.
Chapter 5 deals with the security considerations and discusses possible drawbacks and
attack vectors together with their solutions. The evaluation of our prototype is
described in Chapter 6, which also contains exemplary rules to enforce a privacy policy.
In Chapter 7, we elaborate on related work and also explain why the formerly used
PiOS [9] is not sufficient to protect the user’s privacy. Finally, Chapter 8 summarizes
the findings and results of this thesis and outlines future work.
2 Background
This chapter provides the background information necessary to understand the remainder
of this thesis. Section 2.1 takes an in-depth look at the ARM architecture (the
underlying architecture of Apple’s iOS devices). Section 2.2 describes Apple’s iOS
operating system and its important features and outlines the currently implemented
security features. Section 2.3 discusses Objective-C, the prevalent programming language
on iOS, and shows important design aspects and peculiarities in terms of the object
orientation, dynamism, and polymorphism realized in the Objective-C runtime system.
Section 2.4 describes the Mach-O file format, its structure, and important details
regarding the typical program segmentation. Finally, the basics of control-flow
integrity are presented in Section 2.5.
2.1 ARM Architecture
Since this thesis focuses on iOS, we recall the basics of the ARM architecture, which is
the standard processor architecture for all Apple mobile devices. Currently, iOS supports
ARMv6 and ARMv7. Since the ARMv7 architecture is used in the latest Apple devices, we
focus our description on this platform. The following description is mainly derived
from [10]; other references are given where used.
The ARM architecture is a 32-bit Reduced Instruction Set Computer (RISC) architecture.
It features sixteen general-purpose registers, R0 to R15, and a program status register
(the current program status register, CPSR) that reflects the current system state.
Moreover, the ARM architecture contains a floating-point coprocessor and supports two
instruction sets, ARM and Thumb. The ARM instruction set consists of 32-bit instructions
aligned at 32-bit boundaries and is designed for performance. The Thumb instruction set
consists of 16-bit instructions aligned at 16-bit boundaries and is designed for improved
code density.
As of ARMv6T2, the Thumb-2 technology is part of the ARM architecture. This technology
adds 32-bit (wide) Thumb instructions to achieve performance similar to ARM code
together with improved code density; both instruction sets provide almost identical
functionality, and code can switch freely between them at runtime (called interworking).
The ARM architecture is also able to execute Java byte-code directly through the Jazelle
extension. For a similar purpose, the ThumbEE extension is part of the newer versions of
the ARM architecture. It is a variant of the Thumb instruction set designed for
dynamically generated code; it cannot interwork freely with the ARM and Thumb
instruction sets.
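The alignment rules and the notion of interworking can be made concrete with a short sketch. The alignment check follows directly from the text above. The branch-target rule (bit 0 of the target address selects Thumb state for interworking branches such as BX) is the ARM architecture's convention and is stated here as background, not taken from this section; the function names are illustrative.

```python
def is_aligned(address: int, thumb: bool) -> bool:
    """ARM instructions sit on 4-byte boundaries, Thumb on 2-byte boundaries."""
    return address % (2 if thumb else 4) == 0


def interworking_target(value: int):
    """Resolve an interworking branch target into (instruction set, address).

    Assumed convention (ARM architecture, BX-style branches): if bit 0 is set,
    execution continues in Thumb state at the address with bit 0 cleared;
    otherwise it continues in ARM state and the address must be word-aligned.
    """
    if value & 1:
        return ("Thumb", value & ~1)
    if value & 2:
        raise ValueError("ARM-state target must be 4-byte aligned")
    return ("ARM", value)
```

For example, a target of 0x2001 denotes Thumb code at 0x2000, while 0x3000 denotes ARM code at 0x3000.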
To summarize, the most important features of the ARMv7 architecture are as follows:
• dedicated load and store instructions
• data-processing operations operate on register contents only
• enforced aligned memory access
• instructions that combine a shift with an arithmetic or logic operation
• auto-increment and auto-decrement addressing modes
• load and store multiple instructions
• conditional execution of almost all instructions
• fixed instruction length (32-bit for ARM and wide Thumb instructions, 16-bit for Thumb)
• direct execution of Java byte-code
• native support for big-endian and little-endian format
• support for Just-In-Time (JIT), Dynamic Adaptive Compilation (DAC), and Ahead-Of-Time (AOT) compilers
• three-stage pipeline: load, decode, and execute [11]
• eight different processor modes (see Section 2.1.1).
2.1.1 ARM Processor Modes and Core Registers
The ARM architecture defines eight processor modes, shown in Table 2.1. For
applications, the standard mode of operation is the user mode. This mode is unprivileged
and enables the operating system to restrict the use of system resources. Programs
executed in this mode can neither access protected system resources nor change the mode
except by causing an exception.
Processor mode  Description
User            Suitable for application code and other unprivileged processes.
FIQ             Entered as a result of a fast interrupt.
IRQ             Entered as a result of a normal interrupt.
Supervisor      Suitable for running most kernel code. Entered on Reset, and on execution of a Supervisor Call (SVC) instruction.
Monitor         A Secure mode that enables change between Secure and Non-secure states, and can also be used to handle any of FIQs, IRQs and external aborts. Entered on execution of a Secure Monitor Call (SMC) instruction.
Abort           Entered as a result of a Data Abort exception or Prefetch Abort exception.
Undefined       Entered as a result of an instruction-related error.
System          Suitable for processes that require privileged access to system resources, and for privileged access to User mode registers.
Table 2.1: ARM processor modes.
Even though each register is a general-purpose register, the Stack Pointer (SP) is located
in R13, the Link Register (LR) in R14, and the Program Counter (PC) in R15. Some
processor modes create their own mapping of a subset of the core registers, also referred
to as banking (the duplicated copies are referred to as banked registers).
Table 2.2 gives an overview of the complete banking of the registers, including the views
of the Application Program Status Register (APSR), the Current Program Status Register
(CPSR), and the banked Saved Program Status Register (SPSR). The application level
names always refer to the registers of the current mode; the system level names
(e.g., R0_usr) are only used when necessary, e.g., when a mode has to refer to the
banked registers of another mode (i.e., while R8_fiq is called R8 in FIQ mode, any other
mode must use R8_fiq to refer to the FIQ mode’s banked version of R8). The system
level view shows the representation of all banked registers, including the unprivileged
user level, the privileged system level, and the privileged exception levels.
Application   User      System    Supervisor  Monitor   Abort     Undefined  IRQ       FIQ
level view    Mode      Mode      Mode        Mode      Mode      Mode       Mode      Mode
R0            R0_usr
R1            R1_usr
R2            R2_usr
R3            R3_usr
R4            R4_usr
R5            R5_usr
R6            R6_usr
R7            R7_usr
R8            R8_usr                                                                   R8_fiq
R9            R9_usr                                                                   R9_fiq
R10           R10_usr                                                                  R10_fiq
R11           R11_usr                                                                  R11_fiq
R12           R12_usr                                                                  R12_fiq
SP            SP_usr              SP_svc      SP_mon    SP_abt    SP_und     SP_irq    SP_fiq
LR            LR_usr              LR_svc      LR_mon    LR_abt    LR_und     LR_irq    LR_fiq
PC            PC
APSR          CPSR
                                  SPSR_svc    SPSR_mon  SPSR_abt  SPSR_und   SPSR_irq  SPSR_fiq
System level views: User Mode to FIQ Mode. Privileged modes: System Mode to FIQ Mode. Exception modes: Supervisor Mode to FIQ Mode.
Table 2.2: Organization of general-purpose registers and program status registers.
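The banking scheme of Table 2.2 can be restated as a small lookup, which may help when reading system level names later on. This is only a sketch of the table's content: the function names are illustrative, and an empty cell in the table is interpreted as "the mode uses the User mode register", which is how the ARM architecture defines it.

```python
# Suffix of the banked SP/LR copy per mode; System mode shares User's copies.
MODE_SUFFIX = {
    "User": "usr", "System": "usr", "Supervisor": "svc", "Monitor": "mon",
    "Abort": "abt", "Undefined": "und", "IRQ": "irq", "FIQ": "fiq",
}


def system_level_name(register: str, mode: str) -> str:
    """Map an application level register name to its system level name."""
    suffix = MODE_SUFFIX[mode]          # raises KeyError for unknown modes
    if register == "PC":
        return "PC"                     # the PC is never banked
    if register in ("SP", "LR"):
        return f"{register}_{suffix}"   # banked in every exception mode
    number = int(register[1:])          # "R8" -> 8
    if mode == "FIQ" and 8 <= number <= 12:
        return f"{register}_fiq"        # only FIQ banks R8 to R12
    return f"{register}_usr"


def spsr_name(mode: str):
    """Return the SPSR of an exception mode, or None for User and System."""
    suffix = MODE_SUFFIX[mode]
    return None if suffix == "usr" else f"SPSR_{suffix}"
```

For instance, R8 resolves to R8_fiq only in FIQ mode and to R8_usr everywhere else, which matches the example given in the text.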
2.1.2 The Program Status Registers and Execution State Registers
In order to reflect the system status, the CPSR stores the processor status and control
information. It consists of eleven one-bit flags, the If-Then execution state (ITSTATE),
the Greater than or Equal (GE) flags, and the mode field. The CPSR and SPSR format
is shown in Figure 2.1.
Bits   31  30  29  28  27  26:25    24  23:20     19:16    15:10    9  8  7  6  5  4:0
Field  N   Z   C   V   Q   IT[1:0]  J   Reserved  GE[3:0]  IT[7:2]  E  A  I  F  T  M[4:0]
Figure 2.1: The format of the CPSR and SPSRs.
The Program Status Register (PSR) reflects the complete system state in one 32-bit
register. This is realized by subdividing the 32 bits into sets of one to seven bits,
which are used to indicate, e.g., the condition of a conditional execution, the
execution state registers, and the used processor mode. The complete format is given as
follows:
• Bits [31:28]: Condition code flags, can be read or written in any mode
  – Bit [31]: Negative condition code flag
  – Bit [30]: Zero condition code flag
  – Bit [29]: Carry condition code flag
  – Bit [28]: Overflow condition code flag
• Bit [27]: Cumulative saturation flag, can be read or written in any mode
• Bits [15:10, 26:25]: ITSTATE[7:0]
• Bit [24]: Jazelle bit
• Bits [23:20]: Reserved. RAZ/SBZP
• Bits [19:16]: GE[3:0], can be read or written in any mode
• Bit [9]: Endianness execution state bit (ENDIANSTATE)
• Bits [8:6]: Mask bits:
  – Bit [8]: Asynchronous abort disable bit
  – Bit [7]: Interrupt disable bit
  – Bit [6]: Fast interrupt disable bit
• Bit [5]: Thumb execution state bit
• Bits [4:0]: Processor mode field [4:0].
The derivation of the execution state registers is shown in Tables 2.3a, 2.3b, and 2.4.
J  T  Instruction set state
0  0  ARM
0  1  Thumb
1  0  Jazelle
1  1  ThumbEE
(a) Instruction set state.
State bit  Endian mapping
0          Little-endian
1          Big-endian
(b) Endianness execution state.
Table 2.3: The derived execution states of the instruction set and the endianness.
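The bit layout and the derived execution states can be illustrated with a small decoder. The following is a minimal Python sketch, not part of the thesis prototype; the names `bits`, `decode_cpsr`, and the dictionary keys are chosen here for illustration only.

```python
# Illustrative decoder for a 32-bit CPSR value; names are hypothetical.

# Bit positions of the eleven one-bit flags.
SINGLE_BIT_FLAGS = {"N": 31, "Z": 30, "C": 29, "V": 28, "Q": 27,
                    "J": 24, "E": 9, "A": 8, "I": 7, "F": 6, "T": 5}

# Instruction set state derived from the (J, T) bit pair.
INSTRUCTION_SET = {(0, 0): "ARM", (0, 1): "Thumb",
                   (1, 0): "Jazelle", (1, 1): "ThumbEE"}


def bits(value, high, low):
    """Return the inclusive bit range value[high:low] as an integer."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def decode_cpsr(cpsr):
    """Split a CPSR value into its fields."""
    fields = {name: bits(cpsr, pos, pos) for name, pos in SINGLE_BIT_FLAGS.items()}
    fields["GE"] = bits(cpsr, 19, 16)
    # ITSTATE[7:2] is stored in bits [15:10], ITSTATE[1:0] in bits [26:25].
    fields["ITSTATE"] = (bits(cpsr, 15, 10) << 2) | bits(cpsr, 26, 25)
    fields["mode"] = bits(cpsr, 4, 0)
    fields["instruction_set"] = INSTRUCTION_SET[(fields["J"], fields["T"])]
    fields["endianness"] = "big-endian" if fields["E"] else "little-endian"
    return fields
```

For example, `decode_cpsr(0x60000030)` reports the Z and C flags as set and the Thumb instruction set state, because bits 30, 29, and 5 are set in that value.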
Bits [7:5]  Bit [4]  Bit [3]  Bit [2]  Bit [1]  Bit [0]  Note
Condition¹  P1       P2       P3       P4       1        4-instruction IT block
Condition¹  P1       P2       P3       1        0        3-instruction IT block
Condition¹  P1       P2       1        0        0        2-instruction IT block
Condition¹  P1       1        0        0        0        1-instruction IT block
Table 2.4: If-Then execution state.
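The encoding in Table 2.4 can be read mechanically: the position of the lowest set bit in ITSTATE[3:0] determines how many instructions the IT block covers. The sketch below is an illustrative Python model under that reading; the function name and the returned tuple are assumptions made here. The case ITSTATE[3:0] = 0000 is not listed in the table and is treated as "no IT block active".

```python
def decode_itstate(itstate):
    """Decode an 8-bit ITSTATE value following Table 2.4.

    Returns (condition_base, block_length, p_bits), or None when
    ITSTATE[3:0] is zero (treated here as: no IT block active).
    """
    low = itstate & 0b1111
    if low == 0:
        return None
    # Index of the lowest set bit in ITSTATE[3:0]: 0 means four
    # instructions, 3 means one instruction.
    lowest_set = (low & -low).bit_length() - 1
    block_length = 4 - lowest_set
    condition_base = (itstate >> 5) & 0b111
    # P1 is bit 4, P2 is bit 3, and so on, for as many as the block holds.
    p_bits = [(itstate >> (4 - i)) & 1 for i in range(block_length)]
    return condition_base, block_length, p_bits
```

With this model, the value `0b00110101` decodes to condition base 1 and a 4-instruction block with P1..P4 = 1, 0, 1, 0.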
The SPSR is a banked version of the pre-exception value of the CPSR. When taking an
exception, the processor copies the CPSR to the SPSR of the exception mode it is about
to enter.
¹ The ARM condition codes can be found in Table A.1. The ITSTATE condition bits refer to the three
leftmost bits of the ARM condition codes.
The APSR is an application-level alias for the CPSR. It is essentially the CPSR, but it
lacks access to the system-level information; Figure 2.2 shows the application-level
view of the CPSR.
Bits   31  30  29  28  27  26:24     23:20     19:16    15:0
Field  N   Z   C   V   Q   RAZ/SBZP  Reserved  GE[3:0]  Reserved
Figure 2.2: Format of the APSR.
2.1.3 Procedure Call Standard for the ARM Architecture
The following description is mainly derived from [12]; other references are given where
used.
The Procedure Call Standard for the ARM Architecture (AAPCS) describes a contract
between a calling routine and a called routine. Using the AAPCS, subroutines can be
separately written, separately compiled, and separately assembled to work together. For
this purpose, the AAPCS defines that the first four parameters are passed in R0 to R3 and
that all other arguments must be passed on the stack; return values, if any, are passed in R0.
Moreover, it defines that the registers R4 to R8, R10, R11, SP, and the floating-point
coprocessor registers D8 to D15² must be preserved in a subroutine. The return address
is stored in LR, and R0 to R3 are considered scratch registers. R9 must be preserved
if it is used as platform register, and R12 can be used as scratch register between a routine
and any subroutine it calls. Table 2.5 gives an overview of the register roles. If values
larger than 32 bits are passed, they are treated as a sequence of consecutive 32-bit values.
Registers  Role
R0-R3      Arguments/Scratch registers, R0 also serves as return register.
R4-R8      Variable registers, must be preserved when calling a subroutine.
R9         Scratch register or platform register (must be preserved).
R10, R11   Variable registers, must be preserved when calling a subroutine.
R12        The Intra-Procedure-call scratch register.
SP         The Stack Pointer, must be preserved.
LR         The Link Register, holds the return address.
PC         The Program Counter.
D8-D15     Variable coprocessor registers, must be preserved when calling a subroutine.
Table 2.5: Register roles in the AAPCS.
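The parameter passing rule can be sketched as a small Python model. It is deliberately simplified: every argument is assumed to be a single 32-bit word, so the alignment rules for larger values are ignored, and the name `place_arguments` is hypothetical.

```python
ARG_REGISTERS = ["R0", "R1", "R2", "R3"]


def place_arguments(words):
    """Assign 32-bit argument words to R0-R3 and to stack slots.

    Returns (registers, stack): `registers` maps a register name to the
    word it carries, and `stack` lists the remaining words in order,
    with stack[0] assumed to be the word at the lowest stack address.
    """
    registers = dict(zip(ARG_REGISTERS, words))
    stack = list(words[len(ARG_REGISTERS):])
    return registers, stack
```

A call with six one-word arguments therefore places the first four in R0 to R3 and the last two on the stack, while a call with a single argument uses only R0.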
² Double precision registers, overlapping with single precision registers S16 to S31 and quad word
registers Q4 to Q7.
Although the ARM architecture supports full and empty types of ascending and descending
stacks, the AAPCS dictates a full descending stack. As the SP must be preserved, the
caller eventually has to clean up the stack after a subroutine call, which resembles the
cdecl³ calling convention. Since the ARM architecture supports two distinct instruction sets,
the AAPCS also defines how they interoperate. All instructions are aligned at even addresses,
so the least significant bit (bit 0) of an instruction address is always cleared.
This fact is used for indicating an instruction set exchange:
• Branching (either through a call or jump) to Thumb sets bit 0 of the target address
• Branching to ARM clears bit 0 of the target address.
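The use of bit 0 as a state selector can be modelled in a few lines. This is an illustrative Python sketch, not an emulator; the function name `branch_exchange` is chosen here and only the bit 0 rule described above is modelled.

```python
def branch_exchange(target):
    """Model an interworking branch to `target`.

    Bit 0 of the target selects the instruction set state; the address
    that is actually executed has bit 0 cleared.
    """
    state = "Thumb" if target & 1 else "ARM"
    pc = target & ~1
    return state, pc
```

Branching to `0x1001` thus continues execution at `0x1000` in Thumb state, whereas branching to `0x2000` continues at the same address in ARM state.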
A special type of return is applied when C and C++ interwork on ARM in Thumb state.
In so-called non-leaf functions, which are functions that call subroutines themselves, a
stack-based return⁴ has to be replaced by the sequence [13]:
    POP {R4-R7}    //restore registers
    POP {R3}       //restore LR to R3
    BX  R3         //branch and exchange to ARM
Listing 2.1: Thumb return on C and C++ interworking.
2.2 iOS
The iOS operating system runs on iPhone, iPod touch, and iPad devices and is based on
Apple’s Mac OS X, which in turn is based on the open-source operating system Darwin.
The operating system’s design contains a hybrid kernel called XNU (X is Not Unix) that
combines the features of monolithic kernels and microkernels. The microkernel part is based
on the Mach kernel and implements the basic mechanisms to run an operating system,
i.e., the low-level functionality (e.g., virtual memory management and inter-process
communication), while the monolithic kernel part is based on FreeBSD⁵ and implements
further operating system concepts, e.g., BSD system calls, the UNIX process model, and
the FreeBSD network stack. [14] Thus, iOS applications run in a UNIX-based system,
whereby each application has its own virtual address space. Furthermore, iOS does not
support paging to disk, constraining the amount of usable virtual memory to the amount
of physical memory available. Thus, whenever the virtual memory gets full, the virtual
memory system releases read-only memory pages, such as code pages. [15, 16]
2.2.1 iOS Frameworks
³ The cdecl calling convention defines that the caller is responsible for storing and removing all
call parameters on the stack, whereas the callee must preserve the stack as it is when the callee is
called, meaning that the callee must remove all additional stack variables (local variables) it uses
before returning to the caller.
⁴ Popping the return address from the stack directly to PC instead of returning by branching to the
address in LR.
⁵ A UNIX-like operating system.
A framework on iOS is a directory that contains a dynamic shared library and the
resources needed to support that library. iOS implements four categories of frameworks,
which can be viewed as a set of layers, as shown in Figure 2.3.
Figure 2.3: Framework layers in iOS, from lowest to highest: Core OS, Core Services, Media,
Cocoa Touch. [15]
Each layer is dedicated to a specialized purpose and is made up of several frameworks [15]:
• The Cocoa Touch layer defines the basic application infrastructure and support
  for high-level system services, including, but not limited to, the Address Book UI
  Framework and the UIKit Framework.
• The Media layer is designed for accessing graphics, audio, and video technologies,
  including, but not limited to, the Core Graphics Framework and the Quartz Core
  Framework.
• The Core Services layer defines the fundamental system services that all applications
  use, including, but not limited to, the Accounts Framework and the Core
  Foundation Framework.
• The Core OS layer contains the low-level features that most other technologies
  are built upon, including, but not limited to, the Accelerate Framework and the
  Security Framework.
Furthermore, some frameworks are declared as public and some as private. Public
frameworks can be freely used by any developer to write iOS applications;
their usage is not restricted. Private frameworks, however, must
not be used by third-party developers, as they are reserved for Apple’s internal use.
2.2.2 Model-View-Controller Design Pattern
The following description is mainly derived from [16]; other references are given where
used.
The Model-View-Controller (MVC) design pattern governs the overall structure of every
iOS application and consists of a controller object, a data model, and a view domain
(Figure 2.4).
The controller object of every application is the UIApplication. It manages the application’s
event loop and coordinates the high-level application behaviors. The custom logic,
most notably the state transition logic, resides in the custom application delegate object.
It is created at application launch time and works in tandem with the UIApplication
object.
The application’s presentation is managed by view controller objects. The application’s
core infrastructure is built from objects in the UIKit framework, which provides standard
views for presenting content. Each view controller manages a single view and its
collection of subviews. The presentation of one or more views is coordinated
by a UIWindow object, which works together with the UIApplication object to deliver
events to views and view controllers. [15, 16]
The data model objects realize the data storage and are application specific.
Every iOS application has at least one UIWindow object and one UIView object.
Figure 2.4: The Model-View-Controller design pattern. [16]
(Model: data model objects and documents. View: UIWindow, views and UI objects.
Controller: UIApplication with its event loop, the application delegate, and view controllers.
The legend distinguishes custom objects, system objects, and objects that are either.)
2.2.3 The Application Launch Cycle
All third-party applications on iOS have a specific launch cycle that follows an event-driven
execution model.
The entry point of every iOS application is the main function. The main function is used
to transfer control to the UIKit framework by calling the UIApplicationMain function.
Beforehand, an autorelease pool, used for memory management within the application,
is created. [16] Hence, the normal launch cycle is performed as follows (see Figure 2.5):
• the user starts the application
• the main function is called by the operating system’s loader
• the autorelease pool is created
• the UIApplicationMain is called
• the UIApplicationMain loads and initializes the application (e.g., resources and the
  configuration)
• after the load phase, the application’s logic is event driven, i.e., it reacts to received
  tasks and gestures
Figure 2.5: The application launch cycle. [16]
(Launch time: the user taps the app icon; main and UIApplicationMain run; the main UI file is
loaded and the app is initialized via application:didFinishLaunchingWithOptions:. Foreground:
the app is activated via applicationDidBecomeActive: and the event loop handles events until
the user switches to a different app.)
2.2.4 Security Features
iOS implements several security mechanisms to counter illegal actions such as control-flow
attacks or the introduction of new code at runtime.
Since iOS v2.0, Apple enforces the writable xor executable security model (W⊕X) on the
stack and the heap. This countermeasure effectively prevents code injection attacks,
such as the conventional stack overflow attack or the heap overflow attack, as injected
code is located in non-executable areas. However, this does not prevent an adversary
from injecting data to corrupt a stack frame or the heap; the injected data is just not
executable. [3, 6, 7, 17]
Moreover, iOS implements the stack-based countermeasure Stack-Smashing Protector
(SSP) to detect stack-smashing attacks. SSP features guard values (canaries) to detect
illegal stack modifications, as well as bounds-checking for selected critical functions; SSP
is applied to the stack only and does not protect the heap. [3]
As of iOS v4.3, Address Space Layout Randomization (ASLR) is available in two levels
of completeness. If an application has been compiled with support
for position-independent executables (PIE), all memory regions are randomized;
otherwise, the main executable binary, the dynamic linker, and the main thread’s stack
always begin at the same location in memory. All built-in applications are compatible
with full ASLR. [3, 7]
In order to verify the authenticity of all executable code, iOS enforces Mandatory Code
Signing (MCS). The MCS system requires that all native code is signed by a known and
trusted certificate and also forms the basis of the code signing security model in iOS. [7]
The important components of the code signing security model are:
• Developer Certificates:
  In order to run custom applications, a developer must be granted a Developer
  Certificate by Apple.
• Provisioning Profiles:
  A Provisioning Profile is an XML plist file signed by Apple that is used to install a
  developer certificate permitting the execution of custom code.
• Signed Applications:
  All iOS executable binaries and applications must be signed by a trusted certificate
  (Apple signs every application distributed via the App Store).
• Entitlements:
  Entitlements grant an application further privileges (only entitlements granted to
  the developer certificate can be granted to an application).
The code signing security model is completed by the Code Signing Enforcement (CSE)
security protection, which prevents the introduction of new executable code at runtime.
This actively prevents loading unsigned code or using self-modifying code (e.g., code that has
been compressed or encrypted, as often used in malware). [3, 6, 7]
The dynamic-codesigning entitlement constitutes an exception by loosening the CSE to
allow runtime-generated code, in order to support native JIT compiling. In all other
cases, CSE enforces a strict policy on invalid signatures: the entire process is invalidated
if a single memory page is invalid. [3, 7]
The security protections related to code signing are implemented by the AppleMobileFileIntegrity
kernel extension, which uses the TrustedBSD mandatory access control (MAC)
framework to install MAC policy hook functions that check the signatures on
executed binaries. [6, 7]
The application-based security model on iOS dictates that each application is isolated
from other applications. Thus, each application is placed in a sandbox at install time,
enforcing the separation and protecting the underlying operating system by limiting the
access to the file system, preferences, network resources, hardware, and other resources.
As part of the sandboxing process, each application is assigned a unique Application
Home Directory or container on the device file system, benefiting from the Unix-based
security model. But since all applications run with the same user access rights (mobile),
the normal Unix-based security model is not able to provide sufficient application isolation
and system protection. [6, 7, 8]
Hence, the core of the sandboxing is achieved by a policy module for the TrustedBSD
MAC framework, further referred to as the sandbox. [8]
The sandbox is installed by system call (syscall) hooking and kernel object
tagging using the MAC framework. The kernel module sets up a table of system calls and
kernel structure life cycle management functions to hook, defined by a sandbox profile,
and then calls the TrustedBSD API to install policy checking. After the initialization,
all hooked function calls pass through the MAC policy enforcement, where every operation
that has been defined in the sandbox profile for that event is evaluated. [6, 7, 8]
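The hook-and-evaluate flow described above can be modelled conceptually. The following Python sketch is not the TrustedBSD API and not the iOS sandbox implementation; `SandboxProfile`, `hooked_call`, the operation name `file-read`, and the container path are hypothetical names introduced purely for illustration.

```python
class SandboxProfile:
    """Hypothetical profile: maps an operation name to a predicate."""

    def __init__(self, rules, default_allow=False):
        self.rules = rules
        self.default_allow = default_allow

    def evaluate(self, operation, argument):
        """Return True if the operation on `argument` is permitted."""
        rule = self.rules.get(operation)
        if rule is None:
            return self.default_allow
        return rule(argument)


def hooked_call(profile, operation, argument, syscall):
    """Pass a hooked call through the policy check before executing it."""
    if not profile.evaluate(operation, argument):
        raise PermissionError(f"{operation} denied by sandbox profile")
    return syscall(argument)


# Hypothetical container path, used only for this example.
CONTAINER = "/var/mobile/Applications/APP/"
container_profile = SandboxProfile(
    {"file-read": lambda path: path.startswith(CONTAINER)}
)
```

In this model, a `file-read` on a path inside the container reaches the underlying call, a read outside of it raises `PermissionError`, and operations without a rule fall back to the default decision of the profile.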
The container sandbox profile is assigned to all third-party applications. This profile
restricts file access to the application’s container, the user’s photo library, the iTunes
library, the user’s address book, and necessary system files. Additionally, the profile allows
all outbound network connections (except for connecting to launchd’s Unix domain sockets),
the creation of sockets to receive kernel events and the system routing table, actions related
to POSIX semaphores, shared memory, file IOCTLs, Mach bootstrap servers, network
socket binding and accepting inbound connections, certain classes of privileged actions,
and reading kernel state information through the kernel sysctl interface. [7, 8]
2.2.5 Function-Calling Conventions Used in the iOS ABI
The following description is mainly derived from [18]; other references are given where
used.
The function-calling conventions used in the iOS ABI are largely based on the Procedure
Call Standard for the ARM Architecture (see 2.1.3).
In contrast to the AAPCS described in Section 2.1.3, the function-calling conventions
define that R7 is used as the frame pointer register, and that the whole iOS environment
uses the little-endian byte ordering scheme. Furthermore, all subroutine calls and return
sequences must support interworking between ARM and Thumb states by using the
appropriate BLX and BX instructions for all calls to function pointers.
A compliant function prolog:
1. pushes the value of LR onto the stack
2. pushes the value of R7 onto the stack
3. stores the address of the saved R7 (current SP) into R7
4. pushes the values of the registers that must be preserved onto the stack⁶
5. allocates space in the stack frame for local storage⁷.
⁶ Only registers that are altered during the function must be preserved.
Listing 2.2 shows an example prolog:
    PUSH {R4, R5, R7, LR}    //save used registers, frame pointer and return address
    ADD  R7, SP, #8          //adjust frame pointer to current frame -> points to saved R7
    SUB  SP, SP, #8          //allocate 8 bytes of space
Listing 2.2: Example of an iOS function prolog.
In addition, every function epilog has to [18]:
1. deallocate the previously allocated space used for local storage in the stack
2. restore the preserved registers saved in the prolog
3. restore the value of R7
4. return by loading the saved LR into the PC⁸.
Listing 2.3 shows an example epilog:
    ADD SP, SP, #8           //deallocate space
    POP {R4, R5, R7, PC}     //restore saved registers
Listing 2.3: Example of an iOS function epilog.
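The effect of the example prolog and epilog on the stack can be traced with a small model. This is an illustrative Python sketch, not an ARM emulator: registers are a dictionary, memory is a dictionary keyed by address, and the helper names `push`, `pop`, `prolog`, and `epilog` are chosen here. It assumes a full descending stack with 4-byte slots, where a multi-register push stores the lowest-numbered register at the lowest address.

```python
def push(regs, mem, names):
    """PUSH {names}: full descending stack, lowest register at lowest address."""
    for name in reversed(names):
        regs["SP"] -= 4
        mem[regs["SP"]] = regs[name]


def pop(regs, mem, names):
    """POP {names}: load registers from ascending addresses."""
    for name in names:
        regs[name] = mem.pop(regs["SP"])
        regs["SP"] += 4


def prolog(regs, mem):
    push(regs, mem, ["R4", "R5", "R7", "LR"])  # PUSH {R4, R5, R7, LR}
    regs["R7"] = regs["SP"] + 8                # ADD R7, SP, #8
    regs["SP"] -= 8                            # SUB SP, SP, #8


def epilog(regs, mem):
    regs["SP"] += 8                            # ADD SP, SP, #8
    pop(regs, mem, ["R4", "R5", "R7", "PC"])   # POP {R4, R5, R7, PC}
```

After `prolog`, R7 holds the address of the slot that contains the caller's R7, and the slot directly above it contains the saved LR. After `epilog`, SP, R4, R5, and R7 have their original values again and PC holds the saved return address.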
2.3 Objective-C
The Objective-C programming language extends standard ANSI C and supports the
same basic syntax. Its additions to C are mostly based on Smalltalk [19]. Objective-C
provides a syntax for defining classes and methods, as well as other constructs that
promote dynamic extension of classes.
Since many decisions are deferred from compile-time to runtime, a runtime system is
required in order to use Objective-C. [20, 21, 22, 23]
All iOS applications use the Objective-C 2.0 runtime version. [22]
2.3.1 Classes and Objects
Since Objective-C is an object-oriented, dynamically typed programming language, it
mainly operates on objects that are instances of classes. For this purpose, Objective-C
provides three basic constructs for encapsulating data with the actions (referred to as
methods) that operate on that data [20, 21]:
• metaclasses
• classes
• objects
⁷ This can be done explicitly by subtracting from the SP, or implicitly using write-back (storing onto
the stack with automatic SP modification), or this step is skipped when using the frame pointer
instead.
⁸ LR can also be loaded into any other register; the return is then achieved by branching.
Objects are instances of classes⁹, whereby classes are instantiated from their metaclasses
(class objects), and metaclasses are instantiated from their root’s metaclass (metaclass
object). This referential integrity shows that each construct is in fact an object. [20,
21, 24] The classes serve as definition of class instances including [21]:
• the class name and its superclass
• a template describing the instance variables
• the method declarations and their return and parameter types
• the method implementations.
Moreover, class definitions are additive and thus every class inherits all methods and
instance variable declarations from its superclasses up to its root class. While a subclass
can override inherited methods, it cannot override inherited instance variables. [21]
Figure 2.6:Objects in Objective-C.
In this context, Figure 2.6 outlines the relationship between the three constructs: a
metaclass describes the class object and the metaclass object includes all of the class
object’s methods (class methods). The class, however, describes the class instances and
the instance variables (also referred to as ivars or member variables), whereas the class
object includes all instance methods. [21, 24]
Both the class object and the metaclass object lack instance variables, and the compiler
builds only one instance of a
metaclass for each metaclass, and only one instance of a class for each class. Furthermore,
a class cannot invoke instance methods and the metaclass object is used only internally
by the runtime system. [21]
The instances of a class, in contrast, share the same instance methods from their class
object and contain no methods themselves. But each instance has its own set of instance
variables as declared in the class object. [21]
⁹ Also referred to as instances or class instances.
Every class can have an arbitrary number of subclasses. Furthermore, every class (except
for a root class) has a superclass from which it inherits. This inheritance results in a
hierarchical tree linking all classes together with a single class as its root, the root class
(see Figure 2.7 and Figure 2.8). Objective-C defines two root classes in the Foundation
framework, NSObject and NSProxy. The former is the one most often used by iOS applications,
while the latter is rarely encountered, mainly because it
is an abstract superclass for objects acting as stand-ins (i.e., as substitutes) for other
objects or for objects that do not exist yet. [21, 23, 25]
Figure 2.7:Exemplary class hierarchy.
The root class supplies a definition of the fundamental behavior and an interface common
to all objects by defining:
• allocation and initialization
• duplication
• object retention and disposal
• introspection¹⁰
• comparison
• object encoding and decoding
• message forwarding
• message dispatching
for every inheriting object. [25]
¹⁰ The ability for an object to know its own type at run-time.
Figure 2.8:Objective-C object linkage in the class hierarchy.
In Objective-C, all objects (i.e., instances, class objects, and metaclass objects) are of
type id, a fact that is used for weak typing. Moreover, each object contains an isa
pointer which connects the object to its class. The class objects and metaclass objects
further contain a pointer to their superclass, a method dispatch table, a list of protocols
(see 2.3.4), and a list of all instance variables. [20, 21, 23]
All objects are dynamically allocated at runtime, where the allocation is usually inherited
from the root class and allocates memory for the new instance. The allocation properly
sets all pointers to create the class connections and sets the instance variables to 0.
However, the initialization is always overridden to perform at least a successive call to
the superclass’s initialization in order to initialize all inherited classes before the instance
can be safely used, ending in a call to the root class initialization. A similar behavior is
needed on deallocation, as all inherited classes must be properly deallocated. [21, 23]
2.3.2 Messaging
In contrast to C, which calls functions directly, Objective-C uses dynamic message
dispatch to call object methods. These messages are bound to the method implementations
at runtime; hence, a method is identified by a method identifier instead of types.
Nevertheless, an Objective-C method declaration is simply a virtual C function, which
is the reason why methods cannot be overloaded. Moreover, all methods have in common
that two additional parameters are prepended to the argument list: the receiving object
(receiver) and the selector. [20, 21, 22, 23]
The selector is defined in two ways: it is used to describe a unique method identifier and
it refers to the first part of the method name (in C this would be the function name). [21]
The complete method name is thereby derived as concatenation of the method signature
keywords [20]:
    [object selector]
Listing 2.4: Objective-C message syntax.
The object messaging syntax encloses the receiver and the selector, as well as the method
parameters, with brackets [20]. The first parameter is separated from the selector by using
a colon, whereby other parameters are indicated by a parameter name that is followed by
a colon and the parameter value (argument); parameters are separated from each other
by spaces and optional parameters are separated by commas:
    [object selector:argument1]

    [object selector:argument1 parameter_name1:argument2 ... parameter_nameN-1:argumentN]

    [object selector:argument1, argument2, ..., argumentN]
Listing 2.5: Objective-C method name derivation.
The selector and the names of all required parameters, including all colons, build
the selector signature that represents the method name (the colon is omitted only if the
method takes no parameters) [21, 23]:
    selector:parameter_name1:...:parameter_nameN-1:
Listing 2.6: A derived Objective-C method name.
A concrete example is the initialization of an NSString¹¹ object using an array of char:
    [NSString initWithCharacters:array length:array_length]
Listing 2.7: Objective-C message example.
This programmatic example takes two arguments: the pointer to an array of char, which
will be stored as string in the object, and the array’s length. The selector signature,
and thus the method name, is derived by concatenating initWithCharacters: and
length:.
    [initWithCharacters:length:]
Listing 2.8: Objective-C message example continued.
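The derivation of the method name from a message expression can be sketched as a small parser. This Python sketch is a simplification made here for illustration: it assumes that arguments are plain identifiers, that no whitespace follows a colon, and that messages are not nested. The function name `parse_message` is hypothetical.

```python
def parse_message(expression):
    """Split '[receiver kw1:arg1 kw2:arg2]' into its parts.

    Returns (receiver, method_name, arguments), where method_name is
    the concatenation of all keywords including their colons.
    """
    text = expression.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("not a message expression")
    receiver, *parts = text[1:-1].split()
    if not parts:
        raise ValueError("message has no selector")
    # A message without arguments consists of one keyword and no colon.
    if len(parts) == 1 and ":" not in parts[0]:
        return receiver, parts[0], []
    keywords, arguments = [], []
    for part in parts:
        keyword, colon, argument = part.partition(":")
        if not colon or not argument:
            raise ValueError(f"unsupported message part: {part!r}")
        keywords.append(keyword + ":")
        arguments.append(argument)
    return receiver, "".join(keywords), arguments
```

Applied to the message of Listing 2.7, the sketch yields the receiver `NSString`, the method name `initWithCharacters:length:`, and the two arguments `array` and `array_length`.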
This method name is unique to the runtime system and is thus used to derive a unique
identifier of type SEL (a compiled selector) that is registered by the runtime system.
The uniqueness of the identifier is assured by the runtime system, so that all methods
with the same name share the same selector. This also means that the equality of
two selectors can be determined by checking their compiled form, instead of performing
an inefficient string matching on the method name. [21, 22, 23]
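The uniqueness guarantee amounts to interning method names. The following Python sketch models that idea only; it is not the runtime's data structure, and the classes `Selector` and `SelectorTable` are hypothetical. Registering a name loosely corresponds to what the runtime offers through its selector registration function.

```python
class Selector:
    """Stand-in for a compiled selector; it only carries the method name."""

    __slots__ = ("name",)

    def __init__(self, name):
        self.name = name


class SelectorTable:
    """Registers method names so that each name maps to one selector."""

    def __init__(self):
        self._by_name = {}

    def register(self, name):
        """Return the unique selector for `name`, creating it if needed."""
        selector = self._by_name.get(name)
        if selector is None:
            selector = self._by_name[name] = Selector(name)
        return selector
```

Because every name is registered once, two selectors can be compared by identity (`a is b` in this model) instead of comparing the method names character by character.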
It is important to mention that a compiled selector identifies a method name, not a
method implementation. [21] For this purpose, an identifier contains a reference to the
associated method name, thus creating a connection between the two.
Because each object can implement its own version of a method, the same selector can
refer to many implementations. Thus, the exact method can be determined only at
runtime, and is therefore looked up by the runtime system using the class of the receiver
and the supplied selector. [20, 21, 22]
The runtime system then calls the determined method, passing any arguments, and passes
the method’s return value to the caller as its own. This technique is referred to as
dynamic binding and is realized by translating Objective-C messages into a call to a
runtime system messaging function, most notably objc_msgSend. [21, 22]
This function is an imported symbol in every application and is realized as a
standard API call. The first two arguments are the receiver and the compiled selector,
followed by the method’s parameters [22]:
    objc_msgSend(receiver, selector, arg1, arg2, ..., argn)
Listing 2.9: Objective-C runtime method call.
¹¹ NSString serves as interface to store text strings that are defined at creation and subsequently cannot
be changed.
When a message like:
    [[class alloc] init]
Listing 2.10: Example Objective-C method call.
is sent, the inner message is dispatched to the class and will return an allocated class
instance. In this case, the messaging function follows the class object’s isa pointer to
the metaclass structure, where it looks up the alloc method selector in the dispatch
table. If it cannot find the selector, it follows the pointer to the superclass and tries
to find the selector in its dispatch table; this is repeated until either the root class is
reached or the message is forwarded to another class (Listing 2.11 shows the runtime call
transformation of the inner message). [22]
    instance = objc_msgSend(class, alloc)
Listing 2.11: Example Objective-C runtime class method call.
After a successful allocation, the init message is dispatched to the newly created
instance. Now the messaging function follows the instance object’s isa pointer to the class
structure, where it looks up the init method selector in the dispatch table. Again, if
the selector cannot be found, the messaging function traverses the hierarchy towards
the root class. As already pointed out in the previous section, the initialization must
successively call the superclass initialization; this is done before any other initialization
is performed. The whole process is finished when the initial init call returns the
completely initialized object. Listing 2.12 shows the runtime call transformation of the outer
message and Figure 2.9 outlines the complete procedure. [22]
    objc_msgSend(instance, init)
Listing 2.12: Example Objective-C runtime instance method call.
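The lookup described above can be modelled in Python. This sketch is a conceptual model, not the Objective-C runtime: the classes `ObjcObject` and `ObjcClass`, the helper `make_class`, and the example classes `Root` and `Widget` are hypothetical, message forwarding is left out, and the isa pointer of a metaclass is not modelled.

```python
class ObjcObject:
    def __init__(self, isa):
        self.isa = isa        # connects the object to its class
        self.ivars = {}


class ObjcClass(ObjcObject):
    def __init__(self, name, superclass=None, isa=None):
        super().__init__(isa)
        self.name = name
        self.superclass = superclass
        self.methods = {}     # dispatch table: selector -> implementation


def make_class(name, superclass=None):
    """Create a class object together with its metaclass."""
    meta_super = superclass.isa if superclass else None
    metaclass = ObjcClass("meta " + name, meta_super)
    return ObjcClass(name, superclass, isa=metaclass)


def msg_send(receiver, selector, *args):
    """Follow isa, then walk the superclass chain until the selector is found."""
    cls = receiver.isa
    while cls is not None:
        implementation = cls.methods.get(selector)
        if implementation is not None:
            return implementation(receiver, selector, *args)
        cls = cls.superclass
    raise AttributeError(f"unrecognized selector {selector!r}")


root = make_class("Root")
root.isa.methods["alloc"] = lambda cls, sel: ObjcObject(isa=cls)  # class method
root.methods["init"] = lambda self, sel: self                     # instance method

widget = make_class("Widget", root)


def widget_init(self, sel):
    self = root.methods["init"](self, sel)  # stand-in for a message to super
    self.ivars["ready"] = True
    return self


widget.methods["init"] = widget_init
```

In this model, `msg_send(msg_send(widget, "alloc"), "init")` first finds `alloc` in the metaclass of the root class, because the metaclass of `Widget` has no entry for it, and then finds `init` directly in the dispatch table of `Widget`.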
In terms of the example shown in Listings 2.7 and 2.8, the NSString object is first
allocated and initialized by:
    [[NSString alloc] initWithCharacters:array length:array_length]
Listing 2.13: Concrete exemplary Objective-C method call.
These two messages are then compiled as:
    instance = objc_msgSend(NSString, alloc)
    objc_msgSend(instance, initWithCharacters:length:, array, array_length)
Listing 2.14: Concrete exemplary Objective-C method call continued.
This also emphasizes that the first message is sent to the class, while the second message
is sent to the newly created instance of the class.
Figure 2.9: Objective-C messaging.
2.3.3 Declared Properties
Declared properties are used to declare and optionally implement accessor methods
(so-called getters and setters) of an object. The compiler generates the descriptive metadata
of all properties and optionally generates two methods for each property: one to get and
one to set the state of the object. [20, 21, 22, 23]
2.3.4 Protocols
A protocol is a class-independent list of method declarations that can be implemented
by any class. Thus, a protocol can be used as an interface to anonymous objects, such
as objects that are not yet defined or that have a concealed class identity. Furthermore,
a protocol yields an alternative to subclassing, as a protocol captures class similarities
that are not hierarchically related. [20, 21, 23, 25]
Objective-C defines two kinds of protocols: formal protocols and informal protocols.
While the former declare a list of methods including declared properties, the latter are
realized by categories, a technique that adds methods to a class without subclassing.
Informal protocols are typically categories of a root class like NSObject, thus making the
category methods available in any part of the inheritance hierarchy. Formal
protocols, however, are represented at runtime as instances of the Protocol class. [21, 25]
2.3.5 Class Clusters
Class clusters are based on the Abstract Factory design pattern, which provides an interface
for creating groups of related or dependent objects under a public abstract superclass.
Figure 2.10 outlines a possible class cluster. While a number is an abstract type (it would
have to be mathematically specified), an integer or float is a concrete type of number.
The class cluster realizes the interface to all concrete subtypes by declaring that
the concrete types are also of the same abstract type, i.e., a float is a number; the same
holds for double, integer, etc. As a result, a client is decoupled from any of
the specifics of the private concrete subclasses, as the abstract superclass in a class cluster
must declare methods for creating instances of its private subclasses. Furthermore,
the superclass allocates and initializes the proper subclass object, based on the invoked
creation method, and dispenses it to the client. This also means that the client is not
responsible for releasing the received object, as the client does not own it. A client cannot
choose the class of the dispensed instance. [25]
Figure 2.10:Example of a class cluster.
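The number example can be expressed as a small Abstract Factory. The Python sketch below only illustrates the pattern; the class names `Number`, `_IntNumber`, and `_FloatNumber` and the creation methods are hypothetical and do not reproduce any Foundation class.

```python
class Number:
    """Public abstract superclass of the cluster."""

    @staticmethod
    def number_with_int(value):
        return _IntNumber(int(value))

    @staticmethod
    def number_with_float(value):
        return _FloatNumber(float(value))

    def value(self):
        raise NotImplementedError


class _IntNumber(Number):
    """Private concrete subclass; clients never name it directly."""

    def __init__(self, value):
        self._value = value

    def value(self):
        return self._value


class _FloatNumber(Number):
    """Private concrete subclass for floating-point values."""

    def __init__(self, value):
        self._value = value

    def value(self):
        return self._value
```

A client only calls the creation methods of `Number` and works with the returned object through the abstract interface; which private subclass is instantiated depends solely on the creation method that was invoked.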
2.3.6 Runtime System
Objective-C is highly dynamic: most operations, including the creation of objects and the binding of method implementations, are performed at runtime. Hence, the runtime system is designed to maintain all loaded objects, load and link new classes at runtime, track methods and their implementations, and even identify classes and selectors by name. [21, 22, 23]
For this purpose, the public part of the runtime system provides a set of interface functions, most notably [26]:
• functions to work with classes, comprising the retrieval of
– the class name and the class’s superclass
– the information on whether a class is a metaclass
– the complete list of all registered classes
– classes and metaclasses by name
– the class’s methods and the class instance’s methods and their implementations
• functions to work with instances, including the retrieval
– and modification of an instance’s class and variables
– of the instance’s class name
• functions to work with methods, especially the
– retrieval of the name and implementation
– substitution of method implementations
– exchange of two implementations
• functions to work with selectors, particularly the
– retrieval of the selector name (method name)
– registration of selectors by name.
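To make the listed function groups more tangible, the following Python sketch models a toy runtime with a class table, selector registration, method lookup along the superclass chain, and the substitution and exchange of implementations. It is only a loose analogue: the class ToyRuntime and all of its method names are hypothetical and do not reproduce the real Objective-C runtime API (which offers, for example, class_getSuperclass, method_setImplementation, method_exchangeImplementations, and sel_registerName).

```python
class ToyRuntime:
    """Hypothetical model of a dynamic runtime; not the Objective-C API."""

    def __init__(self):
        # class name -> {"super": superclass name or None, "methods": {selector: impl}}
        self.classes = {}
        self.selectors = set()

    def register_selector(self, name):
        self.selectors.add(name)
        return name

    def register_class(self, name, superclass=None):
        self.classes[name] = {"super": superclass, "methods": {}}

    def add_method(self, cls, selector, impl):
        self.register_selector(selector)
        self.classes[cls]["methods"][selector] = impl

    def lookup(self, cls, selector):
        # Walk the superclass chain until an implementation is found.
        while cls is not None:
            entry = self.classes[cls]
            if selector in entry["methods"]:
                return entry["methods"][selector]
            cls = entry["super"]
        raise AttributeError(selector)

    def send(self, cls, selector, *args):
        return self.lookup(cls, selector)(*args)

    def set_implementation(self, cls, selector, impl):
        # Substitute an implementation and hand back the previous one.
        methods = self.classes[cls]["methods"]
        old = methods[selector]
        methods[selector] = impl
        return old

    def exchange_implementations(self, cls, sel_a, sel_b):
        methods = self.classes[cls]["methods"]
        methods[sel_a], methods[sel_b] = methods[sel_b], methods[sel_a]


rt = ToyRuntime()
rt.register_class("NSObject")
rt.register_class("Greeter", superclass="NSObject")
rt.add_method("NSObject", "description", lambda: "<object>")
rt.add_method("Greeter", "hello", lambda: "hello")
rt.add_method("Greeter", "goodbye", lambda: "goodbye")
rt.exchange_implementations("Greeter", "hello", "goodbye")
```

After the exchange, sending "hello" to the toy class runs the implementation that was originally registered for "goodbye", which is the kind of late rebinding the runtime functions make possible.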
2.3.7 Memory Management
Objective-C on iOS uses a reference counting memory management mechanism. In contrast to automatic garbage collection, the programmer is responsible for releasing all objects that are not needed anymore. The reference counter of a newly created object is always set to 1; it is increased by one each time the object is retained and decreased by one each time it is released, and the object is deallocated once the counter reaches zero. Thus, every allocation must be undone by a deallocation, and every retain must be undone by a release or autorelease, where autorelease registers the object in an autorelease pool that sends the pending release when the pool itself is destroyed (ultimately at program termination). [21, 23]
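The counting rules can be sketched in a few lines of Python. This is a toy model under simplifying assumptions: the classes RefCounted and AutoreleasePool and their methods are hypothetical names, and the pool is drained explicitly instead of being tied to a real program lifecycle.

```python
class RefCounted:
    """Toy object with a manual reference counter (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.retain_count = 1      # a newly created object starts at 1
        self.deallocated = False

    def retain(self):
        self.retain_count += 1
        return self

    def release(self):
        self.retain_count -= 1
        if self.retain_count == 0:
            self.deallocated = True   # stands in for the deallocation

    def autorelease(self, pool):
        # Defer the release until the pool is drained.
        pool.add(self)
        return self


class AutoreleasePool:
    """Collects objects and releases each of them once when drained."""

    def __init__(self):
        self._pending = []

    def add(self, obj):
        self._pending.append(obj)

    def drain(self):
        for obj in self._pending:
            obj.release()
        self._pending.clear()


pool = AutoreleasePool()
obj = RefCounted("example")   # count is 1
obj.retain()                  # count is 2
obj.release()                 # count is 1, balances the retain
obj.autorelease(pool)         # release is deferred
still_alive = not obj.deallocated
pool.drain()                  # count is 0, object is deallocated
```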
2.4 Mach-O File Format
The Mach object (Mach-O) file format is the standard used to store applications and libraries on disk in the iOS application binary interface (ABI). [27]
Every Mach-O file consists of three parts (see Figure 2.11) [27]:
1. Header structure (see B.1 for details)
• identifies the file as a Mach-O file
• indicates the target architecture
• specifies the number and size of the load commands
• contains flags specifying interpretation options.
2. Load commands
• are variable sized
• specify the file’s layout and linkage characteristics
• specify the location of the symbol table
• specify the main thread’s context
• specify shared libraries, frameworks, and imported symbols.
3. One or more segments
• a segment defines a range of bytes in a Mach-O file and its memory protection attributes
• each segment contains zero or more sections
• each section contains code or data.
Figure 2.11: The Mach-O file structure. [27] (Diagram: the header is followed by the load commands, including segment commands 1 and 2, and by the data area, in which segment 1 and segment 2 hold the data of sections 1 to n.)
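As an illustration of the first two parts, the following Python sketch reads the header of a 32-bit little-endian Mach-O image and walks its load commands. The field order and the constants follow the public Mach-O definitions; the helper names parse_header and iter_load_commands are assumptions made for this example, and fat (multi-architecture) files as well as 64-bit images are deliberately not handled.

```python
import struct

MH_MAGIC = 0xFEEDFACE        # magic number of a 32-bit Mach-O file
CPU_TYPE_ARM = 12
CPU_SUBTYPE_ARM_V6 = 6
CPU_SUBTYPE_ARM_V7 = 9

# magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags
HEADER = struct.Struct("<IiiIIII")


def parse_header(data):
    """Decode the 28-byte header at the start of `data`."""
    magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags = \
        HEADER.unpack_from(data, 0)
    if magic != MH_MAGIC:
        raise ValueError("not a 32-bit little-endian Mach-O image")
    return {
        "cputype": cputype,
        "cpusubtype": cpusubtype,
        "filetype": filetype,
        "ncmds": ncmds,
        "sizeofcmds": sizeofcmds,
        "flags": flags,
    }


def iter_load_commands(data, header):
    """Yield (offset, cmd, cmdsize) for every load command."""
    offset = HEADER.size
    for _ in range(header["ncmds"]):
        # Every load command starts with its type and its total size.
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        yield offset, cmd, cmdsize
        offset += cmdsize
```

Because each load command carries its own size, the commands can be traversed without knowing the layout of every command type.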
Every iOS application identifies itself as cputype = ARM and as either cpusubtype = CPU_SUBTYPE_ARM_V6 or cpusubtype = CPU_SUBTYPE_ARM_V7 (see B.2 and B.3). Furthermore, a typical iOS application contains at least 20 segments and sections, including:
• __TEXT segment
– __text section
→ contains the application’s executable code
• __DATA segment
– __objc_classrefs section
→ contains the class references used in the application’s executable code
– __objc_selrefs section
→ contains the selector references used in the application’s executable code
– __lazy_symbol section
→ contains the imported symbol references used in the application’s executable code
– __objc_data section
→ contains the embedded class definitions
– __objc_const section
→ contains the instance variables, methods, protocols, and properties of the embedded classes
• __LINKEDIT segment
– contains linking and binding information, among other information
The aforementioned segments and sections are of further interest, as they are used in this thesis for analysis purposes.
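The nesting of sections inside a segment can be made concrete with a short Python sketch that decodes one 32-bit segment load command and the section headers that follow it. The struct layouts follow the public Mach-O definitions; the helper names parse_segment and _name are assumptions made for this example, and the function expects the raw bytes of a single LC_SEGMENT command.

```python
import struct

LC_SEGMENT = 0x1

# cmd, cmdsize, segname, vmaddr, vmsize, fileoff, filesize,
# maxprot, initprot, nsects, flags
SEGMENT = struct.Struct("<II16sIIIIiiII")
# sectname, segname, addr, size, offset, align, reloff, nreloc,
# flags, reserved1, reserved2
SECTION = struct.Struct("<16s16sIIIIIIIII")


def _name(raw):
    # Names are NUL-padded to 16 bytes; a 16-character name has no NUL.
    return raw.split(b"\0", 1)[0].decode("ascii")


def parse_segment(data, offset=0):
    """Return (segment name, list of section records) for one LC_SEGMENT."""
    fields = SEGMENT.unpack_from(data, offset)
    cmd, segname, nsects = fields[0], fields[2], fields[9]
    if cmd != LC_SEGMENT:
        raise ValueError("not an LC_SEGMENT load command")
    sections = []
    pos = offset + SEGMENT.size
    for _ in range(nsects):
        sect = SECTION.unpack_from(data, pos)
        sections.append({
            "section": _name(sect[0]),
            "addr": sect[2],
            "size": sect[3],
        })
        pos += SECTION.size
    return _name(segname), sections
```

An analysis tool could use such records to find the address range of, for instance, the __objc_selrefs section before inspecting its contents.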
2.5 Control-Flow Integrity
Control-flow integrity (CFI) asserts that only predetermined legitimate paths of a program’s control-flow are followed. More precisely, the CFI policy dictates that an application’s execution must follow a path of a control-flow graph (CFG) determined ahead of time. A CFG is created by means of static analysis, runtime profiling, or additional compiler output and serves as a specification of allowed control transfers in an application. [3, 28, 29]
Thus, CFI consists of two major steps:
1. CFG determination and
2. runtime CFG enforcement.
The general CFG determination is based on the following facts [28, 29]:
• each program can be separated into instruction sequences with a single entry and a single exit instruction (basic blocks, BBLs)
• every instruction w_i is followed by an instruction w_j
• each instruction w_i targets either a constant destination or a computed destination
• each successive instruction w_j is either part of the current BBL or part of another BBL
• under the assumption that code is not writable, each constant destination can be seen as legitimate, as long as no BBL transition occurs
• it is assumed that data is not executable.
Therefore, the CFG describes a graph covering every BBL transition and every computed destination, whereas consecutive instructions within the same BBL need not be covered. [3, 28, 29]
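The separation into basic blocks and the resulting graph can be sketched as follows. The sketch works on a toy program of (address, kind, target) tuples rather than on real ARM code; the instruction kinds, the function names split_basic_blocks and build_cfg, and the example program are all assumptions made for illustration, and computed destinations are simply left without successors.

```python
BRANCHES = {"jump", "cond", "call", "ret"}


def split_basic_blocks(program):
    """Split a list of (address, kind, target) tuples into basic blocks."""
    addresses = [addr for addr, _, _ in program]
    leaders = {addresses[0]}                 # the entry starts a block
    for i, (_, kind, target) in enumerate(program):
        if kind in BRANCHES:
            if target is not None:
                leaders.add(target)          # a branch target starts a block
            if i + 1 < len(program):
                leaders.add(addresses[i + 1])  # so does the next instruction
    blocks, current = [], []
    for ins in program:
        if ins[0] in leaders and current:
            blocks.append(current)
            current = []
        current.append(ins)
    blocks.append(current)
    return blocks


def build_cfg(blocks):
    """Map each block start address to the set of successor block starts."""
    cfg = {}
    for i, block in enumerate(blocks):
        _, kind, target = block[-1]
        nxt = blocks[i + 1][0][0] if i + 1 < len(blocks) else None
        successors = set()
        if kind in ("jump", "call") and target is not None:
            successors.add(target)
        elif kind == "cond":
            successors.add(target)
            if nxt is not None:
                successors.add(nxt)          # fall-through edge
        elif kind not in BRANCHES and nxt is not None:
            successors.add(nxt)              # plain fall-through
        # "ret" and computed jumps: destination unknown in this toy model
        cfg[block[0][0]] = successors
    return cfg


PROGRAM = [
    (0, "mov", None),
    (4, "cond", 16),
    (8, "mov", None),
    (12, "jump", 20),
    (16, "mov", None),
    (20, "ret", None),
]
BLOCKS = split_basic_blocks(PROGRAM)
CFG = build_cfg(BLOCKS)
```

In a real analysis, the computed destinations are exactly the part that needs heuristics or additional information, since they cannot be read off the instruction itself.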
On ARM, the PC is a general purpose register. This means that not only every branch, but also every instruction that modifies the PC, must be covered by the CFG [3, 10]:
• branch with link and exchange (BLX) and branch with link (BL) are considered BBL transitions, as these instructions are used to call functions
• branch and exchange (BX) at the end of a BBL is considered a BBL transition, since it is used to return from a function
• pop multiple registers (POP) or load multiple (LDM) that includes the PC is considered a return, and thus a BBL transition
• branch and exchange (BX) that is not at the end of a BBL, and any other instruction that modifies the PC, is considered a calculated jump
• table branch byte (TBB) and table branch halfword (TBH) are special cases of PC modifying instructions
• branch and exchange Jazelle (BXJ) is not covered, as it transfers the execution to Java byte-code.
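The classification rules above can be written down as a small decision function. This Python sketch is a simplification under stated assumptions: the function name classify and its return labels are hypothetical, instructions are given as a mnemonic plus a list of operand names, and condition-code suffixes (such as BNE) as well as encoding details are ignored.

```python
def classify(mnemonic, operands=(), last_in_block=False):
    """Classify one instruction according to the rules listed above."""
    m = mnemonic.upper()
    ops = [op.upper() for op in operands]
    if m in ("BL", "BLX"):
        return "call"                      # function call, BBL transition
    if m == "BXJ":
        return "not covered"               # transfers to Java byte-code
    if m == "BX":
        return "return" if last_in_block else "calculated jump"
    if m in ("POP", "LDM") and "PC" in ops:
        return "return"                    # PC is loaded from memory
    if m in ("TBB", "TBH"):
        return "table branch"
    if m == "B":
        return "branch"
    if ops and ops[0] == "PC":
        return "calculated jump"           # any other write to the PC
    return "no transition"
```

For example, classify("MOV", ["PC", "LR"]) is reported as a calculated jump because the destination register is the PC, whereas classify("POP", ["R4", "R5"]) is no transition at all.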
Figure 2.12: Exemplary control-flow between basic blocks. (Diagram labels: FUNCTION, CALL FUNCTION, CALL EXTERNAL FUNC., RETURN, External function, BBL_t, BBL_j, BBL_k, CALL BBL_x, possible BBL transitions.)
Figure 2.12 shows an exemplary control-flow starting from one arbitrarily chosen BBL. The control transfers to BBL_x by performing a call, where BBL_x also transfers control to further BBLs. The schematic also exemplifies that these BBLs could transfer control to other BBLs themselves.
The runtime enforcement then uses the CFG to insert runtime checks that detect and prevent control-flow deviations. This rewriting process can be achieved in two ways: static binary rewriting and dynamic binary rewriting. While the latter is applied at load-time using memory patching, the former is applied to the binary itself and thus breaks signatures; both techniques create the same memory representation at runtime. [3, 28, 29]
Moreover, the rewriting can be applied in two forms: by inserting the validation routine before the critical instruction (inline validation, Figure 2.13) or by overwriting the critical instruction with a branch into a validation routine (Figure 2.14). [3, 28, 29]
Both techniques have advantages and disadvantages. The former requires no extra branching, but it requires extra space within the code. While the extra space could easily be allocated, the code is of a fixed size, linked, and aligned. Thus, the insertion would break the existing code if the rest of the code is not properly adjusted. [3]
The latter, however, overwrites only existing instructions, which does not break any code (although it will likely break the program’s logic if the overwritten instruction is not compensated for), but induces at least one extra branch.
Figure 2.13: Control-flow enforcement using inline validation. (Diagram: a validity check is placed directly before CALL FUNCTION, CALL EXTERNAL FUNC., and RETURN; depending on its result, each check either continues the execution or exits.)
Figure 2.14: Control-flow enforcement using a validation function. (Diagram: the critical instructions are replaced by CALL VALIDATE; the validation function checks for a valid control-flow and, depending on the result, either continues to the function, the external function, or the return, or exits.)
The verification is then either done on the basis of a table that holds all valid destination addresses paired with the corresponding source addresses, or the rewriting algorithm places an ID just before every destination that must be checked. Consequently, the validation routine can check the validity by using either the table or the ID, depending on the applied approach, to distinguish between legal and illegal control-flows. [3, 28, 29]
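Both verification variants can be contrasted in a few lines of Python. The sketch is purely illustrative: the function names, the addresses, the ID value, and the modelling of memory as a dictionary from address to word are assumptions, not details of any concrete CFI implementation.

```python
def is_valid_by_table(valid_edges, source, destination):
    """Table variant: the (source, destination) pair must be listed."""
    return (source, destination) in valid_edges


def is_valid_by_id(memory, destination, expected_id, id_size=4):
    """ID variant: the word just before the destination must carry the ID."""
    return memory.get(destination - id_size) == expected_id


# Hypothetical data for both variants.
VALID_EDGES = {(0x1000, 0x2000), (0x1010, 0x3000)}
MEMORY = {0x1FFC: 0xC0DE0001}   # ID placed directly before 0x2000

table_ok = is_valid_by_table(VALID_EDGES, 0x1000, 0x2000)
table_bad = is_valid_by_table(VALID_EDGES, 0x1000, 0x3000)
id_ok = is_valid_by_id(MEMORY, 0x2000, 0xC0DE0001)
id_bad = is_valid_by_id(MEMORY, 0x3000, 0xC0DE0001)
```

The table variant binds a destination to a specific source, while the ID variant accepts any source whose expected ID matches the one stored at the destination.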
3 Design of the Enforcement Framework
The enforcement framework is based on MoCFI [3] and can be subdivided into two phases: static analysis and runtime enforcement (the general architecture is depicted in Figure 3.1). The static analysis consists of a preprocessor to decrypt and disassemble the application and the derivation of the control-flow graph (CFG). The preprocessor is adopted from MoCFI, whereas the CFG derivation has been significantly improved to maximize the coverage of branches and to resolve certain issues, such as the If-Then execution state (ITSTATE), instructions that are calculated relative to the PC, and other instructions that were not covered yet. Furthermore, the analysis has been changed in order to support an in-depth check of the Objective-C objects and selectors as well as the imported API calls. This information is used as part of the CFG and to enforce a supplied policy rule set.
Note that the static analysis has to be performed only once for each deployed application. It is thus reasonable to perform as many tasks as possible during the analysis. The derived patch file and the CFG files can then be deployed along with the application; the deployment would also ensure the files’ integrity, as they would be signed. This does not apply to the rule set in general, as any editing of a deployed rule set would break the signature. This issue can be solved by excluding the rule set from the signature and further protecting it from unauthorized editing, e.g., using the mandatory access control (MAC) framework implemented in iOS.
The runtime enforcement consists of a rewriting process to redirect branches to a validation routine and the validation process. Whereas the rewriting is applied only once per run at load-time after the application has been completely loaded, the validation is applied whenever a branch is executed. The validation routine then checks if the execution flow is legal, thus asserting the CFG, and finally validates any external branches with the enforcement rules.
The remainder of this chapter describes the aforementioned system components in detail. The preprocessing is shown in Section 3.1. Afterwards, the binary analysis is presented in Section 3.2. Section 3.3 outlines the load-time module. Finally, the runtime module is described in Sections 3.4 and 3.5.
3.1 Preprocessing
The first step of the analysis is the acquisition of an unencrypted version of the binary. This step is performed exactly as in MoCFI using process dumping and disassembling. As this thesis does not cover the details of the general control-flow integrity (CFI) approach, all evaluations are performed on applications we developed ourselves and on the SpyPhone application, whereby the compilation is done with Xcode without applied encryption.
Figure 3.1: Framework architecture. (Diagram: in the static analysis phase, the preprocessor decrypts and disassembles the iOS binary; the patch file generator, the control-flow graph generator, and the Objective-C and imports analysis derive the patch file, the control-flow graph, the Objective-C objects and selectors, and the imports from the decrypted binary. In the runtime enforcement phase, the load-time module performs initialization and rewriting, and the runtime module performs CFI enforcement and policy enforcement based on the supplied policy.)
3.2 Binary Analysis
The unencrypted application is analyzed to detect all performed branches that have to be verified at runtime and to determine the CFG. The original CFI for the x86 architecture uses the binary instrumentation framework Vulcan [30] to derive the CFG and to statically rewrite the application binary. This could be done similarly by using the BitRaker Anvil binary instrumentation framework [31], as it supports the ARM architecture, but unfortunately it is not publicly available.
For this reason, the analysis approach of MoCFI is used as the basis for the analysis implemented in this thesis. Besides an improved branch detection that creates a patch file for the rewriting process, the analysis borrows the CFG file generator, which uses heuristics to calculate possible target addresses for each indirect branch ahead of time. The MoCFI approach further used PiOS [9] to determine the CFG parts for the Objective-C calls. This has been completely discarded, because the static analysis performed by PiOS is not able to retrieve all used Objective-C structures, performs a partly wrong object identification (it identifies non-trackable objects as instances of NSObject; but since NSObject is a root class, this matches all possible instances and thus does not suffice for a runtime validation), and does not take the dynamics of the runtime system into account. Hence, we replaced PiOS by a new analysis tool that provides a much more fine-grained precision for identifying all imported and implemented (embedded) classes, all used selectors, all implemented protocols, and all implemented properties. Finally, the analysis tracks all imported API calls and resolves their position-independent code call location, which is used in the main executable code. All information is stored in configuration files that are loaded by the runtime module as part of the CFG and as part of the enforcement.
3.3 Load-Time Module: Binary Rewriting
Whenever an application is loaded, its memory representation contains no CFI checks (since the binary has not been statically rewritten); thus, the code has to be instrumented to perform the needed validation. As the insertion of the needed checks would require further memory adjustments, MoCFI opted for a binary rewriting technique to redirect the execution to the runtime module. This approach overwrites all relevant branches (i.e., those from the patch file) with a single instruction. This technique is borrowed and adjusted to support the newly introduced patches derived from the improved patch file generator.
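The idea of overwriting a branch with a single redirecting instruction can be illustrated with the encoding of an unconditional ARM-state branch (B). The following Python sketch is a simplified assumption and not the rewriting scheme of MoCFI or of this thesis: it only covers 32-bit ARM-state code (no Thumb), the function names encode_arm_branch and apply_patch are hypothetical, and the code image is modelled as a bytearray.

```python
import struct


def encode_arm_branch(address, target):
    """Encode an unconditional ARM-state B from `address` to `target`."""
    # In ARM state the PC reads as the instruction address plus 8.
    offset = target - (address + 8)
    if offset % 4:
        raise ValueError("target must be word aligned relative to the PC")
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("target is outside the range of a B instruction")
    # Condition AL (0xE), opcode 101, L = 0, then the signed 24-bit word offset.
    return 0xEA000000 | ((offset >> 2) & 0x00FFFFFF)


def apply_patch(code, base, address, target):
    """Overwrite the instruction at `address` in a mutable code image."""
    word = encode_arm_branch(address, target)
    struct.pack_into("<I", code, address - base, word)
    return word


# Hypothetical example: redirect the instruction at 0x1004 to 0x2000.
code = bytearray(8)
patched = apply_patch(code, 0x1000, 0x1004, 0x2000)
```

Since the patch has the same size as the instruction it replaces, the surrounding code keeps its layout; the overwritten instruction itself, however, has to be handled by the routine that receives control.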
3.4 Runtime Module: Control-Flow Integrity Enforcement
The MoCFI architecture serves as the basis for all control-flow checks, whereby some validation checks are completely rewritten to support the policy enforcement. Furthermore, a deeper analysis of MoCFI has shown that certain execution paths were never reached and that some internals could be optimized to achieve a better performance. All these aspects led to a partial redesign of the CFI enforcement, but the general task remains the same: every basic block (BBL) transition must be validated to assert the valid CFG.
3.5 Runtime Module: Policy Enforcement
The policy enforcement uses the information retrieved in the analysis to audit external calls (i.e., calls to external libraries) using a supplied policy rule set as outlined in Figure 3.2.
Therefore, the policy enforcement core is implemented inside the runtime checks of external calls, whereby each call is verified to enforce both a valid control-flow and policy compliance. Furthermore, the policy enforcement enables three types of execution for each rule:
• Log: the policy violation will be written to the system’s log file.
• Exit: the policy violation results in a process termination.
• Return of safe values: the violation is tolerated, but the return value will be replaced by a safe value.
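The three execution types can be sketched as a small dispatch function. Everything in this Python example is an assumption made for illustration: the rule format, the call names read_address_book and read_device_name, the function enforce, and in particular the choice to skip the original call when a safe value is returned; the sketch does not reproduce the rule syntax or the behaviour of the prototype.

```python
import logging

LOG, EXIT, SAFE_VALUE = "log", "exit", "return_safe_value"

# Hypothetical rule set: external call name -> reaction on a violation.
RULES = {
    "read_address_book": {"action": SAFE_VALUE, "safe_value": []},
    "read_device_name": {"action": LOG},
    "send_sms": {"action": EXIT},
}


def enforce(rules, call_name, original_call):
    """Audit one external call and react according to the matching rule."""
    rule = rules.get(call_name)
    if rule is None:
        return original_call()            # no rule, no violation
    action = rule["action"]
    if action == LOG:
        logging.warning("policy violation: %s", call_name)
        return original_call()            # execution continues
    if action == EXIT:
        # Stands in for terminating the process.
        raise SystemExit("policy violation: " + call_name)
    if action == SAFE_VALUE:
        return rule["safe_value"]         # assumption: original call is skipped
    raise ValueError("unknown action: " + action)


contacts = enforce(RULES, "read_address_book", lambda: ["Alice", "Bob"])
name = enforce(RULES, "read_device_name", lambda: "my phone")
unrelated = enforce(RULES, "draw_view", lambda: 42)
```

With these hypothetical rules, the address book request yields an empty list instead of the real entries, the device name is returned but logged, and calls without a rule pass through unchanged.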
Each execution type can be used to achieve different objectives. The logging option ensures that all policy violations are noticed by the system; the application can continue its execution, and the system’s log file can then be used to audit these violations. The exit option is the strictest type, as it ensures that the process is immediately terminated
4 Implementation Details
The prototype implementation is developed as a dynamic library in Objective-C++ using Xcode 4 and runs on iOS 4.3.3. Furthermore, the static analysis tools are developed for use with the well-known disassembler IDA Pro 6.x, whereby the branch detector is written in the IDA scripting language IDC, while the other tools are written in Python using the corresponding IDA extensions.
The prototype implementation protects only the application’s main executable code, namely the __text section. But since no new obstacles have to be overcome, the approach can be extended to protect other sections as well.
In this chapter, the implementation is described in detail, i.e., which branches are detected, the control-flow graph (CFG) derivation, the modifications to the control-flow integrity (CFI) architecture, and finally the policy enforcement implementation.
The remainder of this chapter is structured as follows. First, we outline the static analysis in Section 4.1, including the patch file generation and the CFG determination. Afterwards, we describe the load-time module that is used to rewrite the binary and that initializes the data structures for the runtime module in Section 4.2. Finally, Section 4.3.1 presents the runtime module that consists of the CFI enforcement and the policy enforcement.
4.1 Static Analysis
The static analysis is applied to iOS binaries to generate the CFG, identify all relevant branches, and track all imported API calls and Objective-C information. Since the validation routines are not a native part of any deployed application, the branch detection is vital for a successful binary rewrite, which in turn is essential for the control-flow redirection to the runtime module that finally performs all CFG validations.
All performed analyses are automated by using the IDA Pro 6.x disassembler, which supports the ARM and Thumb instruction sets, and the tools we developed ourselves.
4.1.1 Patchfile Generation
The information required by the load-time module is stored within a patch file in the form of metainformation about the relevant branches of an application. This metainformation is generated by the branch detector.
For this purpose, the first step in the branch detector is to locate the __text section of the binary. After it has been found, each instruction is evaluated and checked if it is used to perform a branch. In this context the following assumptions are made: