Reverse Engineering Made Easy - Codegate.org


13 Dec 2013


Reverse Engineering Made Easy

Alex Radocea

감사합니다


Thank you

it is nice to meet you


Sophomore at Rensselaer Polytechnic
University studying CS + Engineering


Member of Beistlab, rpisec, “lollerskaters dropping from rofl copters”

Presentation Goals


Share some insight on understanding
code without source


Encourage reverse engineering projects


Teach how to reverse the easy way


Discuss strategies to build these tools

Typical Approach


Static disassembly for big picture


string references, flow graphs, symbols


annotate blocks of instructions as higher
level constructs


objdump, ida, custom+graphviz


Debugging for little picture


windbg, olly/immdbg, gdb


Reverse Engineering well is all about
asking questions

Part 1


Asking questions with Hit-Tracing

Audience Poll


Pai Mei’s Process Stalker ?

http://code.google.com/p/paimei/


OllyDbg hit-trace plugins, HBGary, etc.

PaiMei


Written and released by Pedram Amini


Provides a strong Python interface to the Windows
debugging facilities


Great support for function hooks (entry
and exit) with callbacks


very useful for automatic crash analysis

Hit Tracing


Trace code execution using breakpoints


Filter uninteresting execution traces


Focus on the code that really matters

How to implement one?


Record all instructions, code blocks, or
functions being executed


Speed vs Utility tradeoffs


record all vs unique hits


how much context per hit?
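
The trade-offs above can be sketched in a few lines. This is only an analogy, assuming Python's own `sys.settrace` hook in place of debugger breakpoints on a native binary; `trace_hits` and `move` are hypothetical names, not part of PaiMei or any tool in this talk. The `unique` flag shows the "record all vs unique hits" choice.

```python
import sys


def trace_hits(func, *args, unique=True):
    """Run func(*args) and record every (function name, line number) executed.

    unique=True keeps one entry per location (small, fast to compare);
    unique=False keeps every hit in execution order (more context, more data).
    """
    hits = set() if unique else []

    def tracer(frame, event, arg):
        if event == "line":
            hit = (frame.f_code.co_name, frame.f_lineno)
            if unique:
                hits.add(hit)
            else:
                hits.append(hit)
        return tracer  # keep receiving line events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return hits


def move(blocked):
    # Hypothetical target: two different paths depending on the input.
    if blocked:
        return "bump"
    return "step"


wall_hits = trace_hits(move, True)
open_hits = trace_hits(move, False)
```

The two traces share the `if` line and differ in the line that follows it, which is the kind of difference a hit tracer is meant to expose.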

How to make it useful


Provide filters and other mechanisms
for relating and comparing different
execution traces


Provide GUI to see results


Store the right information
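
One possible shape for such a filter, as a minimal sketch: treat each trace as a set of hit addresses and subtract every "boring" trace from the trace of interest. The function name and all addresses below are made up for illustration.

```python
def apply_filters(trace, filters):
    """Return the hits in `trace` that appear in none of the filter traces."""
    noise = set().union(*filters)
    return sorted(hit for hit in set(trace) if hit not in noise)


# Hypothetical traces: addresses hit while idling, while in a menu,
# and while performing the action we care about.
idle = {0x0150, 0x0153, 0x0200}
menu = {0x0150, 0x0300, 0x0304}
action = {0x0150, 0x0153, 0x0200, 0x4A10, 0x4A13}

interesting = apply_filters(action, [idle, menu])
```

Only the two addresses unique to the action survive, so the analyst starts reading at those instead of at the entry point.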

Using PaiMei


Quick demo

Reversing without Reversing


GameShark, Game Trainers


track values in memory across time to
isolate items of interest (HP, gold
coins, refire rate)


Why not track code?
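
For comparison, the value-tracking approach used by trainers can be sketched as repeated scans over memory snapshots, keeping only the addresses whose values change the way the player expects. The snapshots and helper names here are hypothetical.

```python
def first_scan(memory, lo, hi):
    """Return the addresses whose current value lies in [lo, hi]."""
    return {addr for addr, value in enumerate(memory) if lo <= value <= hi}


def rescan(candidates, memory, predicate):
    """Keep only the candidates whose new value satisfies the predicate."""
    return {addr for addr in candidates if predicate(memory[addr])}


# Hypothetical snapshots of a tiny memory: HP shows 99, then 98 after a hit.
before = bytes([10, 99, 20, 99, 7])
after = bytes([10, 98, 20, 99, 7])

candidates = first_scan(before, 99, 99)                  # two addresses hold 99
candidates = rescan(candidates, after, lambda v: v == 98)  # one of them dropped to 98
```

Hit tracing applies the same narrowing idea to executed code addresses instead of data values.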

BioGB-ht Demonstration


1. walk into a wall


2. walk into an empty space


Execution will fork


3. Patch branch to walkhack

BioGB-ht Implementation


Patch to BioGB by Ruben Daniel
Gutierrez (C++)


http://rpisec.net/projects/show/biogb-ht


Added debugging facilities and an
interface to sqlite


Every new instruction is recorded once
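
A minimal sketch of "recorded once" on top of sqlite, assuming a made-up schema (the actual BioGB-ht tables are not shown in these slides): a primary key on (bank, addr) plus `INSERT OR IGNORE` lets the database discard repeat hits.

```python
import sqlite3


def open_trace_db(path=":memory:"):
    """Open a trace database; the `hits` schema here is an assumption."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "bank INTEGER, addr INTEGER, PRIMARY KEY (bank, addr))"
    )
    return db


def record_hit(db, bank, addr):
    """Store a hit; a repeated (bank, addr) pair is silently ignored."""
    db.execute("INSERT OR IGNORE INTO hits VALUES (?, ?)", (bank, addr))


db = open_trace_db()
for bank, addr in [(1, 0x4000), (1, 0x4003), (1, 0x4000), (2, 0x4000)]:
    record_hit(db, bank, addr)

unique_hits = db.execute("SELECT COUNT(*) FROM hits").fetchone()[0]
```

Keeping the ROM bank in the key matters on a banked system, since the same address can hold different code in different banks.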

BioGB commands



[hit tab] to enter/leave debug mode

si - single step (F7)
sb - step till branch (F6)
bp - set breakpoint
dbp - remove breakpoint
aw rombank addr - add watchpoint
dw rombank addr - delete watchpoint
lw - list all watchpoints
x rombank addr - dump data/code
sflag - set flags
wv rombank addr value - write value
sc min max - scan values
dsc - delete scan list
ssc - show scan list (terminal)
trace - begin hit tracer
tstop - stop tracing
db filename - select database
st tablename - save trace
dt - delete current trace
ft tablename - use trace as filter
uft tablename - unset filter
sf - show filters available
allf - use all filters from db
sactf - show active filters

Other Applications


Tie the hit-tracer into a fuzzer to get test-case code coverage


Code profiling without rebuilding


Fast reverse engineering is all about
having the right tools

Part 2


Static Analysis

Binary Static Analysis Goals


Automate the easy tasks


Translate code to a higher-level
representation


flow graphs


highly digestible representations

Example


John has a Tattoo Removal Machine


John wants to do laser etching with it


15+ year old machine, no help from
vendor, no documentation on the
hardware.


Solution: reverse it

Difficulties


80C32 architecture not exactly
supported by professional reversing
tools


EPROM holds 30k instructions, ~300
unique functions


80C32 is a really ugly CISC; 3 different
types of memory access


No dynamic analysis easily possible

The process


look for string references


follow code from interrupt vectors


follow entry points


identify function prologues

= confusion

The Tip


Over the serial console, the machine
echoes back lowercase input as
uppercase letters


Lucky numbers: 0x61, 0x7b, 0xdf
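
Why these three numbers help, as a hedged sketch: in ASCII, 0x61 is 'a', 0x7b is one past 'z', and ANDing with 0xdf clears the bit that separates lowercase from uppercase. The firmware's actual routine is not reproduced here; `to_upper` shows one plausible form of it, and `find_constant_cluster` is a hypothetical way to hunt for the spot in a ROM image where all three constants sit close together.

```python
def to_upper(ch):
    """Uppercase one ASCII byte using the three constants."""
    if 0x61 <= ch < 0x7B:   # 'a' <= ch <= 'z'
        return ch & 0xDF    # clear bit 5: 0x61 ('a') becomes 0x41 ('A')
    return ch


def find_constant_cluster(rom, constants=(0x61, 0x7B, 0xDF), window=16):
    """Yield offsets of constants[0] where every constant occurs within `window` bytes."""
    for off in range(len(rom)):
        chunk = rom[off:off + window]
        if rom[off] == constants[0] and all(c in chunk for c in constants):
            yield off


# Hypothetical ROM fragment; the filler bytes are arbitrary.
rom = bytes([0x00, 0x00, 0x61, 0x05, 0x7B, 0x09, 0xDF, 0x00])
candidates = list(find_constant_cluster(rom))
```

A byte search like this produces false positives on real firmware, so each candidate still has to be confirmed by reading the disassembly around it.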

suboptimal code

Wishes


All possible control flow graphs leading
to this function


All possible data flow graphs interacting
with this function


Why doesn’t this generically exist?

Decompilers


Stack-based, VM-interpreted bytecode
languages have almost perfect
decompilers


Flash, .NET, Java


Partial solutions for C/C++: HexRays
plugin for Ida, Reverse Engineering
Compiler, Boomerang

Helping John


Create programs to generate flow
graphs to and from target functions


Produce higher-level language
disassembly
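
The "to and from" graphs can be sketched as reachability over a call graph: walk the graph forward to see what a function can reach, and walk the inverted graph to see everything that can lead to it. The call graph and function names below are invented for illustration, not recovered from the actual firmware.

```python
from collections import deque


def reachable(graph, start):
    """Breadth-first search: every node reachable from `start`, including it."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def invert(graph):
    """Flip every edge so callers can be looked up from callees."""
    inverted = {}
    for src, dsts in graph.items():
        for dst in dsts:
            inverted.setdefault(dst, set()).add(src)
    return inverted


# Hypothetical call graph: caller -> set of callees.
calls = {
    "reset": {"init", "main_loop"},
    "main_loop": {"serial_rx", "fire_laser"},
    "serial_rx": {"to_upper"},
    "timer_isr": {"fire_laser"},
}

paths_to_target = reachable(invert(calls), "to_upper")
paths_from_main = reachable(calls, "main_loop")
```

Restricting the rendered graph to these node sets keeps it small enough to read, even with a few hundred functions in the image.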

General Implementation


Split a program into functions and code
blocks


Use heuristics to identify higher-level
looping constructs from flow graphs


Transform function calls and variable
usage into a more explicit form


annotate with inputs, outputs
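
The first step, splitting code into blocks, can be sketched with the textbook "leaders" rule: a block starts at the first instruction, at every branch target, and right after every branch. The tuple-based instruction format and the mnemonics are assumptions made for this example, not the format of any specific disassembler.

```python
BRANCHES = {"jmp", "jz", "jnz", "ret"}


def split_blocks(instrs):
    """Split (addr, mnemonic, target_or_None) tuples, sorted by addr, into basic blocks."""
    leaders = {instrs[0][0]}
    for i, (addr, op, target) in enumerate(instrs):
        if op in BRANCHES:
            if target is not None:
                leaders.add(target)            # a branch target starts a block
            if i + 1 < len(instrs):
                leaders.add(instrs[i + 1][0])  # so does the instruction after a branch
    blocks, current = [], []
    for ins in instrs:
        if ins[0] in leaders and current:
            blocks.append(current)
            current = []
        current.append(ins)
    blocks.append(current)
    return blocks


# Hypothetical five-instruction loop.
code = [
    (0, "cmp", None),
    (1, "jz", 4),
    (2, "add", None),
    (3, "jmp", 0),
    (4, "ret", None),
]
block_addrs = [[ins[0] for ins in block] for block in split_blocks(code)]
```

The backward `jmp` to address 0 is the kind of edge a loop heuristic would then look for in the resulting flow graph.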

rcos-binstat


Rensselaer Center for Open Source,
funded with a grant from Sean O’Sullivan


Attempting to build a generic framework
for this type of work

rcos-binstat inner workings


Translate x86, MIPS to a very simple
Forth-like IR


Generically detect and build flow graphs
of code blocks and functions


Transform library call mechanisms into
one instruction


basic register propagation for value and
string resolution in disassembly
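
Register propagation for value and string resolution can be illustrated on a toy instruction list. This is a sketch under loud assumptions: the MIPS-flavoured tuples below are invented for the example and are not the rcos-binstat IR, and the propagation only handles constants within one block.

```python
def propagate(block, arg_reg="a0"):
    """Track constant register values through a block; report each call's argument."""
    regs, resolved = {}, []
    for op, *args in block:
        if op == "li":                      # li dst, imm
            regs[args[0]] = args[1]
        elif op == "move":                  # move dst, src
            dst, src = args
            if src in regs:
                regs[dst] = regs[src]
            else:
                regs.pop(dst, None)
        elif op == "addi":                  # addi dst, src, imm
            dst, src, imm = args
            if src in regs:
                regs[dst] = regs[src] + imm
            else:
                regs.pop(dst, None)
        elif op == "call":                  # call name
            resolved.append((args[0], regs.get(arg_reg)))
            regs.clear()                    # assume the callee clobbers everything
        else:
            regs.clear()                    # unknown instruction: forget all values
    return resolved


def resolve_string(data, base, addr):
    """Read the NUL-terminated ASCII string at `addr` from data loaded at `base`."""
    start = addr - base
    return data[start:data.index(b"\0", start)].decode("ascii")


# Hypothetical data section and code block.
data, base = b"PATH\0HOME\0", 0x1000
block = [("li", "t0", 0x1000), ("addi", "a0", "t0", 5), ("call", "getenv")]
annotated = [(name, resolve_string(data, base, addr)) for name, addr in propagate(block)]
```

With the argument resolved, the disassembly line can be shown as a call to `getenv` with a readable string instead of a raw address.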

Arguing for Static Analysis


By Rice’s theorem, every non-trivial
semantic property of a program is
undecidable: deciding it would also
decide the halting problem


Halting Problem misconception:


people can solve undecidable
problems that computers cannot

Undecidability


Shortest known unsolved instance of
Post’s correspondence problem

Symbolic Execution


Inputs and outputs are defined as
ranges of values


symbolically analyze input throughout
program execution


People* have been using satisfiability
solvers to trigger bugs
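
A very small sketch of the "ranges of values" idea, assuming a plain interval domain rather than a real satisfiability solver: an input byte starts as the range 0..255, arithmetic shifts the range, and a guard narrows it. The `Range` class and the table-lookup scenario are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    lo: int
    hi: int

    def shift(self, k):
        """Range of x + k for every x in this range."""
        return Range(self.lo + k, self.hi + k)

    def narrow(self, lo, hi):
        """Values that pass a guard `lo <= x <= hi`, or None if none do."""
        new_lo, new_hi = max(self.lo, lo), min(self.hi, hi)
        return Range(new_lo, new_hi) if new_lo <= new_hi else None

    def escapes(self, lo, hi):
        """True if some value in this range falls outside [lo, hi]."""
        return self.lo < lo or self.hi > hi


TABLE_SIZE = 10                 # hypothetical 10-entry lookup table
byte = Range(0, 255)            # attacker-controlled input byte

unchecked = byte.shift(-0x30)                   # index = byte - '0', no guard
checked = byte.narrow(0x30, 0x39).shift(-0x30)  # same, after an isdigit-style guard
```

The unchecked index can leave the table on both sides, while the guarded one cannot. A solver-backed tool would go one step further and produce a concrete input that triggers the out-of-bounds case.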

Double Free Detection


SSA - static single
assignment


Free should never be
called with the same
symbol twice in the same
code path
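
The rule above can be sketched on a single code path in SSA-like form. Because every symbol is assigned exactly once, an alias can be traced back to the allocation it names, and a second free of that allocation is reported. The operation names and tuple format are assumptions for this example.

```python
def find_double_frees(path):
    """Return (index, symbol) for every free of an already-freed allocation."""
    origin = {}     # symbol -> allocation symbol it ultimately refers to
    freed = set()   # allocations already freed on this path
    reports = []
    for i, (op, *args) in enumerate(path):
        if op == "alloc":                   # alloc dst
            origin[args[0]] = args[0]
        elif op == "copy":                  # copy dst, src
            dst, src = args
            origin[dst] = origin.get(src, src)
        elif op == "free":                  # free sym
            root = origin.get(args[0], args[0])
            if root in freed:
                reports.append((i, args[0]))
            freed.add(root)
    return reports


# Hypothetical path: p2 is an alias of p1, so the second free is a bug.
buggy = [("alloc", "p1"), ("copy", "p2", "p1"), ("free", "p1"), ("free", "p2")]
clean = [("alloc", "p1"), ("alloc", "p2"), ("free", "p1"), ("free", "p2")]

bug_reports = find_double_frees(buggy)
clean_reports = find_double_frees(clean)
```

A real checker would run this over every path through the flow graph and would also need to model pointers stored in memory, which this sketch ignores.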

Overflow Detection


Follow input to insecure functions


Determine stack buffer boundaries
based on use and neighboring memory
access heuristics


Use annotations for library functions to
infer write destinations and conditions

Irix Case Study


Irix 6.5.5 on a machine at school; 30+
setuid binaries


Feels kind of like traveling back to the
year 2000


Quickly found heap overflows and buffer
overflows using an improved rcos-binstat

Irix Results


startmidi stack buffer overflow


libimdUtil.so sprintf(..., getenv()) heap
overflow

Next steps


plug in satisfiability and theorem solvers
to do symbolic execution


Provide richer annotations per block to
begin true emulation for accurate code
reachability


Develop loop heuristics for converting
IR into higher level code

Q&A


Thank you


Links


http://rpisec.net/projects/show/rcosbinstat



Pai Mei - http://code.google.com/p/paimei/


FreeBSD/Linux ptrace interface - http://rpisec.net/projects/show/todoht

BioGB - http://rpisec.net/projects/show/biogb-ht

http://www.unprotectedhex.com/psv/