Computer Security Fall 2002 - University of New Orleans ...

Talk @ N*A, 2010

Next Generation Tools for Digital Forensics Investigations

Golden G. Richard III



golden@cs.uno.edu

golden@digitalforensicssolutions.com



http://www.cs.uno.edu/~golden

http://digdeeply.com



Who is This Guy?


Professor of Computer Science @ University of New Orleans

Director, Greater New Orleans Center for Information Assurance (GNOCIA) @ University of New Orleans

Co-founder, Digital Forensics Solutions, LLC in New Orleans

GIAC-certified Digital Forensics Investigator



United States Secret Service Cybercrime Taskforce

Basic Concepts (Very Briefly)

Digital Forensics

Definition: “Tools and techniques to recover, preserve, and examine digital evidence on or transmitted by digital devices.”

Computers, PDAs, cellular phones, videogame consoles, digital cameras, copy machines, printers, digital voice recorders…



What That Really Means

Data. “You only think it’s gone.”

Sensitive data tenaciously clings to life.

Vast majority of users have no idea what’s stored on their digital devices…

…and no ability to properly “clean up” even if they do suspect what’s there

Why Should You Care?


Privacy is good.


Knowing what’s stored and how to control access and securely destroy data is important

99% of users only think they know

Prosecuting bad people is good.

Discovering what bad people have stored is extremely important

Prosecuting good people is bad.

Defending innocent people against charges of storing “bad” stuff is good


Why Else?


Lots of interesting problems


Lots of research and hacking to do


…algorithms…


…filesystem research…


…deep OS internals…


…reverse engineering…

…data mining…

…machine learning…

…parallel/distributed computing…

…GPU-based computation…

Examples of Digital Evidence


Documents


Threatening emails


Suicide notes


Bomb-making diagrams


Malicious Software


Viruses


Worms


Keystroke loggers


Child pornography (contraband images/videos)


Evidence that network connections were made between machines


Cell phone SMS messages


Deleted voice messages on digital voice recorder


Deleted copy jobs on laser printer


Anything that can be stored on digital devices


DF Enabler: Data is Hard to Kill


Privacy and fine-grained control over data are not primary design concerns in modern operating systems

OS acts like an adversarial spy

What devices have been plugged in

Which URLs you’ve typed vs. clicked

Lots more

Formatting disks doesn’t delete much data

Quick layout of filesystem structures


Previous data remains


Completely uninstalling applications is very difficult


Digital residue remains in registry


Data Clings to Life (2)


“Volatile” data hangs around for a long time


Kernel and application data survives long after use


Even rebooting may not erase volatile data!


Using encryption properly is extremely difficult


Temporary files, print spool files, …


Users expect backdoors


Much anti-forensics (“privacy-enhancing”) software is broken (see [Geiger2005])

“Broken” hard drives can often be revived

Digital forensics preys on the fact that users have an impossibly difficult data management situation

Visualization of 256MB USB Thumb Drive

(Figures, one per slide: FAT32 format, NTFS format, ext3 format)

When is the Data Really Gone?

Military-grade degausser

or secure file deletion software (but make sure it works!)

or hard drive confetti

or thermite

Digital Forensics Process


Legal: Balance need to investigate vs. privacy rights


Identification of potential digital evidence


Where might the evidence be?


Which devices did the suspect use?


Preservation and copying of evidence


On the scene…


First, stabilize evidence…prevent loss and contamination


If possible, make identical copies of evidence for examination


Copies can be made on the spot, or more usually, in the lab


Careful examination of evidence


File recovery / File carving


Keyword searches


Generation of timelines


Examination of the registry





Presentation of results
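The preservation step above ("make identical copies of evidence") is usually backed by a cryptographic hash, so that the working copy can be shown to match the original. The following is a minimal sketch of that idea, not a description of any particular imaging tool: the file names, the 1 MiB chunk size, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile

CHUNK = 1024 * 1024  # assumed chunk size: read 1 MiB at a time so large images never sit in RAM


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst chunk by chunk, then check that both hash to the same value."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for block in iter(lambda: fin.read(CHUNK), b""):
            fout.write(block)
    src_hash = sha256_of(src)
    dst_hash = sha256_of(dst)
    return src_hash == dst_hash, src_hash


# Demo on a throwaway file standing in for a drive image (hypothetical names).
workdir = tempfile.mkdtemp()
evidence = os.path.join(workdir, "evidence.dd")
working_copy = os.path.join(workdir, "working_copy.dd")
with open(evidence, "wb") as f:
    f.write(os.urandom(3 * CHUNK + 17))

identical, image_hash = copy_and_verify(evidence, working_copy)
print(identical, image_hash)
```

Recording the hash at acquisition time lets any later copy be re-checked against it before examination begins.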


On the Scene Preservation: Challenges

(Figure labels: Living room; Basement/closet; wireless connection; on-screen note: “Dear Susan, It’s not your fault…”)

Just pull the plug?

Move the mouse for a quick peek?

Do a live analysis?

Tripwires: tick…tick…tick…

Volatile computing

Traditional: Where’s the Evidence?


Undeleted files


Deleted files


Windows registry


e.g., USB device histories


e.g., recently accessed files, URLs


Print spool files


Hibernation files


Temp files (all those .TMP files!)


Slack space


Swap files


Browser caches


Alternate partitions


On a variety of removable media

“Dead” (traditional) digital forensics comprises a set of techniques and tools to deeply investigate these sources of evidence
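Slack space, one of the sources listed above, is the unused tail of the last cluster allocated to a file, where bytes from an earlier file may survive. The arithmetic can be sketched as follows. The 4096-byte cluster size is an assumption, and the model ignores filesystem details such as small files stored inside metadata records.

```python
def slack_bytes(file_size, cluster_size=4096):
    """Bytes left unused in the last cluster allocated to a file of file_size bytes."""
    if file_size == 0:
        return 0  # in this simple model an empty file has no allocated cluster
    # Distance from file_size up to the next multiple of cluster_size.
    return (-file_size) % cluster_size


for size in (1, 4096, 5000, 10_000):
    print(size, slack_bytes(size))
```

A 5000-byte file occupies two 4096-byte clusters, so 3192 bytes of the second cluster are slack and may still hold old data.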



Next Generation

Current Generation DF: Too Slow

(Figure: drives labeled 750GB, 750GB, 300GB, 300GB, 1TB, 1TB, 2TB, 2TB, 2TB)

All the Solutions are Hard(er)


Goal: Faster, smarter tools


Faster: Apply more computational resources


Better ideas and better designs


“No hour glass” GUIs


Automatically target “good stuff”


Evidence correlation


Visualization


Better, portable live forensics tools


(Figure: progress dialog reading “9 days remaining”)

Solutions

# memdump

# ps > /media…


File Carving: High Level

(Diagram: a run of disk blocks, with “one cluster” and “one sector” marked; unrelated disk blocks surround the interesting file)

header, e.g., 0x474946383761 (GIF)

footer, e.g., 0x003B (GIF)

“milestones” or “anti-milestones”

Can be time-consuming and require a lot of manual intervention for large drives (false positives!)

See: [Richard2005]
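The header/footer idea above can be sketched in a few lines. This is a simplified illustration and not Scalpel's implementation: the function name `carve_gifs` and the 5 MiB size cap are assumptions, and the signatures used are the standard GIF ones ("GIF87a"/"GIF89a" for the header, a zero byte followed by 0x3B for the footer).

```python
GIF_HEADERS = (b"GIF87a", b"GIF89a")  # standard GIF signatures
GIF_FOOTER = b"\x00\x3b"              # block terminator followed by the GIF trailer
MAX_CARVE = 5 * 1024 * 1024           # assumed cap on the size of a carved file


def carve_gifs(image, max_size=MAX_CARVE):
    """Return (start, end) byte ranges in image that look like GIF files."""
    found = []
    pos = 0
    while True:
        starts = [i for i in (image.find(h, pos) for h in GIF_HEADERS) if i != -1]
        if not starts:
            break
        start = min(starts)
        # Only look for the footer inside the allowed size window.
        window_end = min(len(image), start + max_size)
        foot = image.find(GIF_FOOTER, start + 6, window_end)
        if foot != -1:
            end = foot + len(GIF_FOOTER)
            found.append((start, end))
            pos = end
        else:
            pos = start + 1  # header without a footer in range: skip it
    return found


# Synthetic "disk image": filler, one fake GIF, filler, then a header with no footer.
filler = b"\xaa" * 512
fake_gif = b"GIF89a" + b"\x01" * 100 + b"\x00\x3b"
image = filler + fake_gif + filler + b"GIF87a" + b"\x02" * 10

print(carve_gifs(image))
```

Because the two footer bytes can also occur inside real image data, a carver this naive will truncate some files and emit others that are not GIFs at all, which is where the false positives and manual intervention come from.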


In-Place File Carving

(Architecture diagram labels: client applications, scalpel_fs, FUSE, preview database, Scalpel, nbd client, nbd server, network, local drive, remote drive)

See: [Richard2007]


100GB local/network


Too Slow: Symptoms


Machines tied up for days doing preprocessing


Painful to “think outside the box” (i.e., outside the index) during investigation


Getting an answer to even a simple question


“Does this credit card number appear anywhere?”


“Did Joe Smith send an email to Cassandra Wilson?”


…takes a long time


If a regular expression search takes hours to complete, what do you do in the meantime?
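A question such as "does this number appear anywhere?" amounts to a byte search over the whole image. A minimal single-threaded sketch is shown below, assuming the image is readable as a binary stream; the function name `find_all`, the chunk size, and the sample number are illustrative only. The one subtlety is keeping an overlap between chunks so that a match straddling a chunk boundary is not missed.

```python
import io


def find_all(stream, needle, chunk_size=1 << 20):
    """Yield absolute offsets of needle in a binary stream, reading it chunk by chunk."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1
    tail = b""
    base = 0  # absolute offset of the first byte of tail + chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        i = buf.find(needle)
        while i != -1:
            yield base + i
            i = buf.find(needle, i + 1)
        # Carry the last len(needle) - 1 bytes forward; a full match cannot fit in them,
        # so nothing is reported twice.
        keep = buf[-overlap:] if overlap else b""
        base += len(buf) - len(keep)
        tail = keep


# Made-up 16-digit number placed so that it straddles several tiny chunks.
number = b"4111111111111111"
data = b"x" * 10 + number + b"y" * 7 + number
print(list(find_all(io.BytesIO(data), number, chunk_size=8)))
```

Every query of this kind re-reads the entire image, which is why a simple question takes so long on a single machine and why the rest of the talk looks at spreading the work out.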



Faster: Distributed Digital Forensics


Experimental Setup

File Server: CPU 2 x 1.4GHz Xeon, RAM 2GB, SCSI RAID 504GB

Switch: 96-port, 10/100/1000 Mb, 24 Gb backplane

Node: CPU 2.4 GHz Pentium 4, RAM 1 GB

1Gb (link label in diagram)

A Few Preliminary Results


Target:


Dell Optiplex GX1 w/ 6.4GB IDE drive


NTFS, ~110,000 files in ~7,800 directories


Imaged using dd w/ a Linux boot disk


Machine used for “traditional” investigation:


3GHz P4, 2GB RAM, 2 x 73GB 15Krpm Ultra320 SCSI


FTK v1.43a

Initial Operation    Time (hh:mm:ss)
FTK Add Evidence     1:38:00
CACHE                0:09:36
8-node LOAD          0:03:58
1-node LOAD          0:05:19

more nodes better at loading the fileserver

Results (2)


Live string search:


typical first/last name



Regular expression search:


v[a-z]*i[a-z]*a[a-z]*g[a-z]*r[a-z]*a

                 Search time:          Search time:
                 String Expression     Regular Expression
                 (mm:ss)               (mm:ss)
FTK              08:27                 41:50
8-node System    00:27                 00:28
See [Roussev2004]
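To make the regular expression above concrete, here is a small sketch that runs the same pattern over raw bytes with Python's `re` module. The sample buffer is invented for illustration, and this says nothing about how FTK or the distributed system evaluate patterns internally.

```python
import re

# The slide's pattern, compiled as a bytes pattern so it can run over raw image data.
PATTERN = re.compile(rb"v[a-z]*i[a-z]*a[a-z]*g[a-z]*r[a-z]*a")

# Invented sample data: one real hit plus two near misses.
sample = b"subject: cheap viagra here\x00\x00... also v1agra and visa-grade paper"

hits = [(m.start(), m.group().decode("ascii")) for m in PATTERN.finditer(sample)]
print(hits)
```

The `[a-z]*` gaps let the pattern tolerate extra lowercase letters between the key characters, but a digit or hyphen breaks the match, which is why only the first occurrence is reported.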


A Different Experiment


Stego detection using Stegdetect 0.5 under Red Hat Linux on the cluster

Traditional:

6GB image mounted using loopback device

find /mnt/loop -exec ./stegdetect '{}' \;


790 seconds == 13:10 minutes


Using the distributed framework


Stegdetect 0.5 code incorporated into framework


Detection against cached files


“STEGO” command (after IMAGE/CACHE)


82 seconds == 1:22 minutes


9.6X faster with 8 machines
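The 9.6X figure follows directly from the two timings, and dividing by the number of machines gives the parallel efficiency. A short worked calculation, with helper names chosen here for illustration:

```python
def speedup(baseline_s, parallel_s):
    """How many times faster the parallel run is than the baseline."""
    return baseline_s / parallel_s


def efficiency(baseline_s, parallel_s, workers):
    """Speedup per worker; 1.0 would be perfectly linear scaling."""
    return speedup(baseline_s, parallel_s) / workers


s = speedup(790, 82)
e = efficiency(790, 82, 8)
print(f"speedup {s:.1f}x, efficiency {e:.2f}")
```

An efficiency above 1.0 means the speedup is better than linear in the number of machines. One plausible reading, given that detection ran against cached files, is that the distributed run avoids disk I/O that the loopback-mounted baseline pays for, though the slide itself does not break the time down.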


Multicore CPUs


Gone: ever-increasing clock rates

Dual-core / Quad-core

6-core 7400 Xeon® processor (1.9 billion transistors) for server applications

What's next?

10s to 100s of cores in a single processor

Finding More Processing Power

(Diagram: increasing speed from a single CPU, to multicore CPUs, to clusters)

Filling this gap? Graphics Processing Units (GPUs)?

Too Slow: Another Symptom










Hint: It requires a $500 video card


Modern GPUs

2007: G80 GPU, 768MB device memory, 128 compute cores, >1GHz each

2009: G200 GPU, 1+GB device memory, 200+ compute cores, >1GHz each

Hardware thread management, can schedule millions of threads

Can use multiple GPUs per workstation

Thanks, gamers!!

GPU Horsepower

pixelsnort

GPU Scalpel (2007)

gnort

Challenges


Most programmers not familiar with massively threaded software designs or distributed computing

Potentially complicated synchronization issues

Programmer doesn't care → software will be too slow

GPU programming harder

Requires application to be broken into distinct host / GPU components

GPU component is SIMT (SIMD)

Complicated memory hierarchy

Components must bulk copy data between host and GPU

Portability issues

All the Solutions are Hard


Programming multicore CPUs / GPUs


10s to 100s of cores

Programming is hard, portability issues

On top of that…

Poor language support for massively threaded designs

Python? …unless you want to spawn multiple interpreters?

Ruby? C implementation uses non-thread-safe libraries

Java?

C? Does this make you happy?

Erlang? (!)

The Headache Does Pay Off

RELEASE                            RUN TIME   BW
Scalpel v1.60                      448s       45MB/s
Scalpel v1.91MT-multicore          178s       111MB/s
Scalpel v1.91MT-multicore-async    146s       140MB/s
Scalpel v1.91MT-gpu-async          77s        265MB/s

20GB disk image, 25 file types targeted for carving. Quad-core Dell XPS 720 w/ 4GB RAM, 8 x 15K SCSI disk array (max bandwidth ~600MB/sec), G200 GPU w/ 896MB device RAM, 192 compute cores.

Custom binary string search + async I/O + massively threaded design with GPU / multicore overlap

* haven’t yet touched “no transfer” CPU / GPU communication *

Live Forensics: RAM Carving


Can construct patterns and apply file carving techniques to discover fragments of application data hours or days old in memory dumps

Process dump of MSN Messenger yields chat message fragments:

Content-Type: text/plain; charset=UTF-8

X-MMS-IM-Format: FN=MS%20Shell%20Dlg; EF=; CO=0; CS=0; PF=0

Are you coming down for Mardi Gras this year? I’m dressing up as Peter Frampton. Do you feeeeel…
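A pattern for fragments like the one above can be built around the `X-MMS-IM-Format:` header. The sketch below is illustrative only: it assumes the headers end with CRLF line breaks and a blank line before the message body, that the body is printable ASCII, and that 400 bytes is a sensible cap. The function name `carve_messages` and the synthetic dump are made up for the example.

```python
import re

# Anchor on the header seen in the fragment, skip the rest of that line and the
# assumed blank line, then capture a run of printable ASCII as the message body.
MSG = re.compile(rb"X-MMS-IM-Format:[^\r\n]*\r\n\r\n([\x20-\x7e]{1,400})")


def carve_messages(dump):
    """Return (offset, text) pairs for chat-message fragments found in a memory dump."""
    return [(m.start(), m.group(1).decode("ascii")) for m in MSG.finditer(dump)]


# Synthetic "memory dump": zero padding, one message fragment, then binary junk.
fragment = (
    b"Content-Type: text/plain; charset=UTF-8\r\n"
    b"X-MMS-IM-Format: FN=MS%20Shell%20Dlg; EF=; CO=0; CS=0; PF=0\r\n"
    b"\r\n"
    b"Are you coming down for Mardi Gras this year?"
)
dump = b"\x00" * 64 + fragment + b"\x00\x07\xff" + b"\x00" * 32

for offset, text in carve_messages(dump):
    print(offset, text)
```

The capture stops at the first non-printable byte, so a message containing non-ASCII UTF-8 text would be cut short; a more careful pattern would decode UTF-8 sequences as well.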


Expanding Scope: Live Forensics


Running processes


open DLLs


registry


file handles



Open files


Network connections


Memory


Regular disk files


Images of entire disk


Live disk imaging


Deleted files


Live file carving

e.g., detect keystroke loggers

e.g., discover unauthorized programs

e.g., find “shreds” of volatile evidence, analyze kernel structures

Do “traditional” forensics on a live machine

Andreas Schuster: PTFinder: Windows Memory Carving

Use known or predictable values to carve kernel objects out of memory dump

Avoids problems with list-carving approaches, e.g., with DKOM
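The following toy model illustrates why scanning a dump for known values can reveal objects that following the kernel's own links would miss once DKOM (direct kernel object manipulation) has unlinked them. Everything in it is invented for illustration: the 28-byte record layout, the `PROC` signature, and the offsets bear no relation to real Windows structures or to PTFinder's actual signatures.

```python
import struct

REC = struct.Struct("<4sII16s")  # toy layout: magic, pid, offset of next record, name
MAGIC = b"PROC"                  # made-up signature standing in for "known values"
END = 0xFFFFFFFF                 # made-up end-of-list marker


def write_proc(mem, off, pid, nxt, name):
    """Place one toy process record into the memory buffer."""
    REC.pack_into(mem, off, MAGIC, pid, nxt, name.encode("ascii"))


def walk_list(mem, head):
    """Enumerate processes by following next-pointers, as a live OS query would."""
    out, off = [], head
    while off != END:
        _, pid, nxt, name = REC.unpack_from(mem, off)
        out.append((pid, name.rstrip(b"\x00").decode("ascii")))
        off = nxt
    return out


def scan(mem):
    """Enumerate processes by carving: find the signature, then sanity-check the fields."""
    out, i = [], mem.find(MAGIC)
    while i != -1:
        if i + REC.size <= len(mem):
            _, pid, _nxt, name = REC.unpack_from(mem, i)
            text = name.rstrip(b"\x00")
            if 0 < pid < 65536 and text and all(32 <= b < 127 for b in text):
                out.append((pid, text.decode("ascii")))
        i = mem.find(MAGIC, i + 1)
    return out


mem = bytearray(4096)
write_proc(mem, 0x100, 1, 0x400, "init")
write_proc(mem, 0x400, 666, 0x800, "rootkit")
write_proc(mem, 0x800, 2548, END, "ftp")

# DKOM-style hiding: repoint the first record's next field past the middle record.
struct.pack_into("<I", mem, 0x100 + 8, 0x800)

print(walk_list(mem, 0x100))
print(scan(mem))
```

The unlinked record is still physically present in memory, so the scan reports it even though the link traversal no longer reaches it.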

Schuster’s PTFinder

Carve and analyze Windows kernel structures from a physical memory dump to identify process activity

FACE: Linux Memory Analysis


“FACE: Automated Digital Evidence Discovery and Correlation”

First paper appeared at 2008 Digital Forensics Research Workshop (DFRWS); see [Case2008]

Deep analysis of kernel structures in physical memory dump (for Linux)

Correlation between network traces, filesystem, log files, and physical memory dumps

FACE: ps on Memory Dump

# ./ramparser debian.vmem -x

PID   UID  GID  NAME
1     0    0    init
2     0    0    migration/0
3     0    0    ksoftirqd/0
4     0    0    watchdog/0
5     0    0    migration/1
6     0    0    ksoftirqd/1
…
2425  0    43   xterm
2426  0    0    bash
2515  0    0    pdflush
2521  0    0    firefox-bin
2548  0    0    ftp

FACE: netstat on Memory Dump

# ./ramparser debian.vmem -N | sort

Proto  Local Address         Foreign Address    State        PID   Program name
TCP    192.168.20.128:45351  192.168.20.129:20  ESTABLISHED  2548  ftp
TCP    192.168.20.128:55071  192.168.20.129:80  ESTABLISHED  2521  firefox-bin
TCP    192.168.20.128:59447  192.168.20.129:21  ESTABLISHED  2548  ftp
TCP    192.168.20.128:59447  192.168.20.129:21  ESTABLISHED  2548  ftp
TCP    192.168.20.128:59447  192.168.20.129:21  ESTABLISHED  2548  ftp
UDP    0.0.0.0:111           0.0.0.0:0                       1959  portmap
UDP    0.0.0.0:32768         0.0.0.0:0                       2332  rpc.statd
UDP    0.0.0.0:68            0.0.0.0:0                       2301  dhclient3
UDP    0.0.0.0:812           0.0.0.0:0                       2332  rpc.statd
…
UNIX   /tmp/.X11-unix/X0                                     2417  Xorg
UNIX   /tmp/.X11-unix/X0                                     2417  Xorg
UNIX   /tmp/.X11-unix/X0                                     2417  Xorg

FACE: Deep Process Analysis (1)

FACE: Deep Process Analysis (2)

FACE: Socket Buffer Dumps

# ./ramparser debian.vmem -k 2548

# ls socks

socket0   socket15  socket21  socket28
socket34  socket40  socket47  socket53
socket1   socket16  socket22  socket29
socket54  socket60  socket67  socket73

# head socks/socket28

aplacental
aplanatic
aplanospore
aplasia
aplastic
aplenty
aplite
aplomb
apnoea

/usr/dict/words was being transmitted
Beyond FACE


Portable live forensics


Most live forensics tools are very brittle


Work for very specific kernel versions


Tools automatically adapt to different kernel versions


See, e.g., [Case2010b]


Deeper live forensics


Look at lower-level kernel structures to help build the big picture in an investigation

See, e.g., [Case2010a]

Analysis of kernel structures to detect malware

Introspection-based live forensics for VM

NSF funding pending

Deeper: Example: kmem_cache Analysis

(Diagram: Linux kernel, kmem_cache in kernel memory allocator)

Previously opened files

Previous network connections

NAT table entries

Memory mappings

Live Forensics via VM Introspection

Some Challenges for Live Forensics


A potential minefield


Memory covering attacks


What you get isn’t what’s really there


Shadow Walker


Split TLB de-synchronization attack

Joanna’s hardware poisoning stuff

Disrupt both software and hardware-based approaches to memory acquisition

SMM attacks

Other malware that pollutes the kernel

Most tools simply assume none of this stuff is happening

Biggest problem with these things is that they weaken your set of basic assumptions

References


[Adel2006] F. Adelstein, “Live Forensics: Diagnosing Your System Without Killing It First,” Communications of the ACM 49, 2 (Feb. 2006), pp. 63-66.

[Carrier2006] B. Carrier, “Risks of Live Digital Forensic Analysis,” Communications of the ACM 49, 2 (Feb. 2006), pp. 56-61.

[Carrier2004] B. Carrier, J. Grand, “A Hardware-based Memory Acquisition Procedure for Digital Investigations,” Digital Investigation (2004):1.

[Case2010a] A. Case, L. Marziale, C. Neckar, G. G. Richard III, “Treasure and Tragedy in kmem_cache Mining for Live Forensics Investigation,” Proceedings of the 10th Annual Digital Forensics Research Workshop (DFRWS 2010), Portland, OR, 2010.

[Case2010b] A. Case, L. Marziale, G. G. Richard III, “Dynamic Recreation of Kernel Data Structures for Live Forensics,” Proceedings of the 10th Annual Digital Forensics Research Workshop (DFRWS 2010), Portland, OR, 2010.

[Case2008] A. Case, A. Cristina, L. Marziale, G. G. Richard III, V. Roussev, “FACE: Automated Digital Evidence Discovery and Correlation,” Proceedings of the 8th Annual Digital Forensics Research Workshop (DFRWS 2008), Baltimore, MD, 2008.

[Casey2002] E. Casey, “Practical Approaches to Recovering Encrypted Digital Evidence,” International Journal of Digital Evidence, (2002) 1:3.

[Chow2004] J. Chow, B. Pfaff, T. Garfinkel, K. Christopher, and M. Rosenblum, “Understanding Data Lifetime via Whole System Simulation,” Proc. of the 13th USENIX Security Symposium, August 2004.

[Chow2005] J. Chow, B. Pfaff, T. Garfinkel, and M. Rosenblum, “Shredding Your Garbage: Reducing Data Lifetime Through Secure Deallocation,” Proceedings of the 14th USENIX Security Symposium, 2005.

[Dornseif] M. Dornseif, “FireWire - all your memory are belong to us,” http://md.hudora.de/presentations/.

[Duflot2006] Loïc Duflot, Daniel Etiemble, Olivier Grumelard, “Using CPU System Management Mode to Circumvent Operating System Security Functions,” Proceedings of CanSecWest, 2006.

References (2)

[Eckstein2005] K. Eckstein, M. Jahnke, “Data Hiding in Journaling File Systems,” Proceedings of the 5th Annual Digital Forensic Research Workshop (DFRWS 2005), New Orleans, 2005.

[Garfinkel2006a] S. Garfinkel, “Disk Imaging with the Advanced Forensics Format, Library and Tools,” Proceedings of the Second Annual IFIP WG 11.9 International Conference on Digital Forensics, Orlando, FL, Jan 2006.

[Garfinkel2006b] S. Garfinkel, “Forensic Feature Extraction and Cross-Drive Analysis,” Proceedings of the 6th Annual Digital Forensic Research Workshop (DFRWS 2006), West Lafayette, IN, 2006.

[Geiger2005] M. Geiger, “Evaluating Commercial Counter-Forensic Tools,” Proceedings of the 5th Annual Digital Forensic Research Workshop (DFRWS 2005), New Orleans, 2005.

[Marziale2007] L. Marziale, G. G. Richard III, V. Roussev, “Massive Threading: Using GPUs to Increase the Performance of Digital Forensics Tools,” Proceedings of the 7th Annual Digital Forensics Research Workshop (DFRWS 2007), Boston, MA, 2007.

[Marziale2009] L. Marziale, S. Movva, G. G. Richard III, V. Roussev, L. Schwiebert, “Massively Threaded Digital Forensics Tools,” in Chang-Tzun Lu (ed.), Handbook of Research on Computational Forensics, Digital Crime and Investigation: Methods and Solutions, IGI Global, 2009.

[Rutkowska2007] J. Rutkowska, “Beyond The CPU: Defeating Hardware Based RAM Acquisition Tools (Part I: AMD case),” Black Hat DC 2007.

[Petroni2006] N. Petroni, A. Walters, T. Fraser, and W. Arbaugh, “FATKit: A Framework for the Extraction and Analysis of Digital Forensic Data from Volatile System Memory,” Digital Investigation, 3(4):197-210, December 2006.

[Petroni2004] N. Petroni, T. Fraser, J. Molina, and W. Arbaugh, “Copilot - a Coprocessor-based Kernel Runtime Integrity Monitor,” Proc. of the 13th USENIX Security Symposium, August 2004.

References (3)


[Richard2006a] G. G. Richard III, V. Roussev, “Toward Secure, Audited Processing of Digital Evidence: Filesystem Support for Digital Evidence Bags,” Research Advances in Digital Forensics II, Springer, 2006.

[Richard2006b] G. G. Richard III, V. Roussev, “Next Generation Digital Forensics,” Communications of the ACM, February 2006.

[Richard2005] G. G. Richard III, V. Roussev, “Scalpel: A Frugal, High Performance File Carver,” Proceedings of the 2005 Digital Forensics Research Workshop (DFRWS 2005), New Orleans, LA.

[Richard2007] G. G. Richard III, V. Roussev, L. Marziale, “In-place File Carving,” Research Advances in Digital Forensics III, Springer, 2007.

[Roussev2004] V. Roussev, G. G. Richard III, “Breaking the Performance Wall: The Case for Distributed Digital Forensics,” Proceedings of the 2004 Digital Forensics Research Workshop (DFRWS 2004), Baltimore, MD.

[Schuster2006] A. Schuster, “Searching for Processes and Threads in Microsoft Windows Memory Dumps,” Proceedings of the 6th Annual Digital Forensic Research Workshop (DFRWS 2006), West Lafayette, IN, 2006.

[Sparks] S. Sparks, J. Butler, “Raising the Bar for Windows Rootkit Detection,” Phrack Issue # 63.

[Turner2007] P. Turner, “Applying a Forensic Approach to Incident Response, Network Investigation and System Administration using Digital Evidence Bags,” Digital Investigation, 4(2007), pp. 30-35.

[Wood2008] C. Wood, http://www.xs4all.nl/~carlo17/howto/undelete_ext3.html.

References: Web Sites


http://www.dfrws.org


Lots of references related to digital forensics, including a link to an interesting e-journal…

http://www.ijde.org/

International Journal of Digital Evidence

http://www.tucofs.com/tucofs/tucofs.asp?mode=mainmenu

Collection of forensics-related software

http://www.sleuthkit.org

Home of Sleuthkit and Autopsy tools

http://www.digitalforensicssolutions.com

Home of Scalpel (file carving software)

http://www.linux-ntfs.org/

The Linux NTFS Project

http://www.nongnu.org/gfzip

The Generic Forensic Zip Project

http://pyFlag.sourceforge.net/

PyFLAG: Forensics and Log Analysis GUI

http://www.invisiblethings.org

Joanna Rutkowska’s website

References: Software


Commercial and open-source digital forensics software


Sleuthkit / Autopsy


Scalpel


Foremost


EnCase


FTK (Forensic Toolkit)



ILook (law enforcement only)


WinHex


… lots more …


Open source digital forensics software project


http://www.opensourceforensics.org/



Thanks, Discussion?


golden@cs.uno.edu

golden@digdeeply.com