Research in Next-Generation Digital Forensics
Golden G. Richard III, Ph.D.
Associate Professor
Dept. of Computer Science
golden@cs.uno.edu
http://www.cs.uno.edu/~golden
Digital Forensics Research Group
• Fall 2006:
  – Thursdays @ 1pm in NSSAL (Math 322)
• Primary Collaborators:
  – Vassil Roussev [UNO CS]
  – Vico Marziale [UNO Ph.D. student]
  – Frank Adelstein [ATC-NY]
Digital Forensics
Definition: “Tools and techniques to recover,
preserve, and examine digital evidence on or
transmitted by digital devices.”
Devices include computers, PDAs, cellular phones,
videogame consoles, copy machines, printers, …
Examples of Digital Evidence
• Threatening emails
• Documents (e.g., in places they shouldn’t be)
• Suicide notes
• Bomb-making diagrams
• Malicious Software
  – Viruses
  – Worms
  – …
• Child pornography (contraband)
• Evidence that network connections were made between machines
• Cell phone SMS messages
Facts (or: Why Digital Forensics?)
• Deleted files aren’t securely deleted
  – Recover deleted file + when it was deleted!
• Renaming files to avoid detection is pointless
• Formatting disks doesn’t delete much data
• Web-based email can be (partially) recovered directly from a computer
• Files transferred over a network can be reassembled and used as evidence
Facts (2)
• Uninstalling applications is much more difficult than it might appear…
• “Volatile” data hangs around for a long time (even across reboots)
• Remnants from previously executed applications
• Using encryption properly is difficult, because data isn’t useful unless decrypted
• Anti-forensics (privacy-enhancing) software is mostly broken
• “Big” magnets (generally) don’t work
• Media mutilation (except in the extreme) doesn’t work
• Basic enabler: Data is very hard to kill
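Privacy Through Media Mutilation
[Figure: pictured alternatives joined by “or”, including a degausser and forensically-secure file deletion software (but make sure it works!); the other pictures are not recoverable from the text]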
Digital Forensics Process
• Identification of potential digital evidence
  – Where might the evidence be?
  – Which devices did the suspect use?
• Preservation and copying of evidence
  – On the crime scene…
  – First, stabilize evidence…prevent loss and contamination
  – If possible, make identical copies of evidence for examination
• Careful examination of evidence
• Presentation
  – “The FAT was fubared, but using a hex editor I changed the first byte of directory entry 13 from 0xEF to 0x08 to restore ‘HITLIST.DOC’…”
  – “The suspect attempted to hide the Microsoft Word document ‘HITLIST.DOC’ but I was able to recover it without tampering with the file contents.”
• Legal: Balance of need to investigate vs. privacy
“Traditional” Digital Forensics
• Pull the plug
• “Image” (make bit-perfect copies) of hard drives, floppies, USB keys, etc.
• Use forensics software to analyze copies of drives
• Investigator typically uses a single computer to perform investigation in the lab
• Present results to client, to officer-in-charge, court
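The imaging step above can be illustrated with a minimal Python sketch. It is not the tooling used in the talk: the function names `image_device` and `file_sha256` are hypothetical, and the choice of SHA-256 is an assumption. The idea is to copy the source byte for byte while hashing what was read, then re-hash the copy to confirm the two match.

```python
import hashlib


def image_device(src_path, dst_path, chunk_size=1 << 20):
    """Copy src to dst byte for byte; return the SHA-256 of the data read."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def file_sha256(path, chunk_size=1 << 20):
    """Hash an existing file, e.g. to verify a finished image."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A caller would compare the value returned by `image_device` with `file_sha256` of the copy; a mismatch means the copy is not bit-perfect. Real acquisitions also need a write blocker and handling for read errors, which this sketch leaves out.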
Traditional: Where’s the evidence?
• Undeleted files (expect some names to be incorrect)
• Deleted files
• Windows registry
• Print spool files
• Hibernation files
• Temp files (all those .TMP files!)
• Slack space
• Swap files
• Browser caches
• Alternate or “hidden” partitions
• On a variety of removable media (floppies, ZIP, Jaz, tapes, …)
But Evidence is Also…
• In RAM
• “In” the network
• On machine-critical machines
  – Can’t turn off without severe disruption
  – Can’t turn them ALL off just to see!
• On huge storage devices
  – 1TB server: image entire machine and drag it back to the lab to see if it’s interesting?
  – 10TB?
Next Generation: Needs
• Broad:
  – Better design, better software
    • Yes, some of it is engineering (and hacking)
    • Someone has to do it
  – Better vision, application of ‘real’ CS to problems
• More specific:
  – Need for speed
  – Machine correlation
  – Machine profiling
  – Better auditing of investigative process
  – On-the-spot forensics: Triage
  – Live forensics
  – Network forensics
  – Specific tools for detection and remediation of malware
  – Phishing investigation
  – …
Next Generation: UNO
• Better file carving
• Forensic-aware OS components
• In-place file carving
• Forensic accountability
• On-the-spot forensics
• Distributed digital forensics
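File Carving: Basic Idea
[Figure: a run of disk blocks, with the scale marked as one cluster and one sector. Unrelated disk blocks surround an interesting file. The file begins with a header, e.g., 0x474946383761 (“GIF87a”), and ends with a footer, e.g., 0x003B (GIF); “milestones” or “anti-milestones” mark points in between.]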
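As a rough illustration of the header/footer idea, the following Python sketch scans a byte buffer for GIF signatures and pairs each with the next footer. It is a simplification, not Scalpel's implementation: `carve_gifs` and the `max_size` limit are hypothetical, and it ignores fragmentation and “milestones” entirely.

```python
GIF_HEADERS = (b"GIF87a", b"GIF89a")   # 0x474946383761 and 0x474946383961
GIF_FOOTER = b"\x00\x3b"               # block terminator followed by trailer


def carve_gifs(data, max_size=5 * 1024 * 1024):
    """Return (start, end) offsets of candidate GIF files inside `data`."""
    results = []
    pos = 0
    while True:
        # Find the earliest header of either version at or after `pos`.
        starts = [i for i in (data.find(h, pos) for h in GIF_HEADERS) if i != -1]
        if not starts:
            break
        start = min(starts)
        # Only accept a footer that lies within max_size bytes of the header.
        window_end = min(len(data), start + max_size)
        foot = data.find(GIF_FOOTER, start + 6, window_end)
        if foot == -1:
            pos = start + 1        # header with no usable footer: move on
            continue
        end = foot + len(GIF_FOOTER)
        results.append((start, end))
        pos = end
    return results
```

Because the two footer bytes can also occur inside image data, the first footer found may truncate a real file; each candidate still needs validation, which is where the better carving techniques in this talk come in.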
File Carving: Fragmentation
[Figure: a fragmented file on disk. Labels: header, e.g., 0x474946383761 (“GIF87a”); footer, e.g., 0x003B (GIF); “milestones” or “anti-milestones”.]
File Carving: Fragmentation
[Figure: a fragmented file on disk, continued. Labels: header, e.g., 0x474946383761 (“GIF87a”); footer, e.g., 0x003B (GIF).]
File Carving: Damaged Files
[Figure: a damaged file on disk. Labels: header, e.g., 0x474946383761 (“GIF87a”); “milestones” or “anti-milestones”; no footer.]
File Carving: Doing a Better Job
• Better design
• Faster
• Distributed implementation
• More flexible description of file types
• Automatic generation of type descriptions
  – Patterns
  – Rule sets
• Multiple-pass carving
  – Carve, “remove” validated files from block list, re-carve, hope that some fragmented files coalesce
• Block-sniffing
File Carving: Block Sniffing
[Figure: candidate disk blocks following a header, e.g., 0x474946383761 (“GIF87a”).]
Do these blocks “smell” right?
• N-gram analysis
• entropy tests
• parsing
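One of the tests listed above, the entropy test, can be sketched in a few lines of Python. The threshold of 7.0 bits per byte and the function names are assumptions chosen for illustration, not values from the talk.

```python
import math
from collections import Counter


def shannon_entropy(block):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def smells_compressed(block, threshold=7.0):
    """Heuristic: compressed or encrypted data sits close to 8 bits per byte."""
    return shannon_entropy(block) >= threshold
```

A block of plain text or zero padding scores low, so finding one in the middle of what should be compressed image data hints that the block belongs to a different file. Small blocks give noisy estimates, so a real carver would likely combine this with the other tests.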
Better Software: File Carving: Scalpel
• Two-pass design
• Minimizes:
  – Reads
  – Seeks
  – Writes
  – Data copying
  – Memory usage
• Doesn’t yet incorporate all of the carving wizardry we have in mind
G. G. Richard III, V. Roussev, “Scalpel: A Frugal, High Performance File Carver,” Proceedings of the 2005 Digital Forensics Research Workshop (DFRWS 2005), New Orleans, LA.
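The slide says only that the design is two-pass. The Python sketch below shows one plausible way to split the work, offered as an assumption and not as Scalpel's actual code: a first pass reads the image once in chunks and records header and footer offsets, and a second pass extracts only the planned byte ranges. All function names are hypothetical.

```python
import io
from bisect import bisect_right


def index_pass(stream, header, footer, chunk_size=4096):
    """Pass 1: read the image once, recording header and footer offsets."""
    overlap = max(len(header), len(footer)) - 1
    headers, footers = [], []
    offset = 0            # absolute offset of the next chunk
    tail = b""            # end of the previous buffer, to catch straddles
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        base = offset - len(tail)
        for pattern, hits in ((header, headers), (footer, footers)):
            i = buf.find(pattern)
            while i != -1:
                pos = base + i
                if not hits or pos > hits[-1]:   # skip repeats from the tail
                    hits.append(pos)
                i = buf.find(pattern, i + 1)
        tail = buf[-overlap:] if overlap else b""
        offset += len(chunk)
    return headers, footers


def plan_carves(headers, footers, footer_len, max_size):
    """Pair each header with the first footer after it, within max_size."""
    jobs = []
    for h in headers:
        k = bisect_right(footers, h)
        if k < len(footers):
            end = footers[k] + footer_len
            if end - h <= max_size:
                jobs.append((h, end))
    return jobs


def extract_pass(stream, jobs):
    """Pass 2: visit planned ranges in offset order, reading each once."""
    carved = []
    for start, end in sorted(jobs):
        stream.seek(start)
        carved.append(stream.read(end - start))
    return carved
```

Separating indexing from extraction means no file data is copied until the full set of candidates is known, which fits the slide's goals of fewer reads, seeks, and writes.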
Some Scalpel Results (1)
Big targets, large carve sizes, huge improvement (over 5 hours faster)
[Results table not recovered; surviving annotation: T_read + 238,270,750,000 bytes]
Some Scalpel Results (2)
Big targets, large carve sizes, huge improvement (over 7 hours faster)
[Results table not recovered; surviving annotation: T_read + 117,622,357,936 bytes]
OS Support for Digital Forensics
• Export raw disk devices across network for processing
  – Others: network block device (NBD)
  – Us: optimization
• “In-place” file carving
  – Us: Export results from file carving as a filesystem, w/ minimal extra storage
• Better auditing of investigative process
  – Us: “digital evidence bag”-aware filesystems
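FUSE (Filesystem in User Space)
[Figure: FUSE architecture. In user space, the command `dd if=/evidence/DEC/img.dd of=copy.dd` calls read() through the C library. In kernel space, the Linux Virtual File System Interface (VFS) dispatches to ext3, reiserFS, or FUSE. FUSE hands the request back to user space, where the filesystem implementation runs on top of the FUSE library and the C library.]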
In-Place File Carving
[Figure: architecture. Labels: client applications, FUSE, scalpel_fs, preview database, Scalpel, local drive, and a remote drive reached over the network through an nbd client and nbd server.]
G. G. Richard III, V. Roussev, V. Marziale, “In-Place File Carving,” submitted to the Third Annual IFIP WG 11.9 International Conference on Digital Forensics, 2007.
Better Auditing
Want: Digital Evidence Bags
See: P. Turner, “Unification of Digital Evidence from Disparate Sources (Digital Evidence Bags),” DFRWS 2005.
See: Common Digital Evidence Storage Format (CDESF) working group, http://www.dfrws.org/CDESF/.
Better Auditing (2)
[Figure: audited access to evidence. Applications in user space (dd, scalpel, FTK, TSK, …) go through the VFS interface. In the operating system kernel, FDAM offers filesystem data access and, through an FDAM block device, block-level data access. The evidence sits in a Digital Evidence Container (DEC; e.g., DEB, AFF, Gfzip, …) that holds the evidence data and an audit log, with import/export.]
G. G. Richard III, V. Roussev, “Toward Secure, Audited Processing of Digital Evidence: Filesystem Support for Digital Evidence Bags,” Research Advances in Digital Forensics, Springer, 2006.
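To make the audit-log idea concrete, here is a small Python sketch of a reader that records every access it serves. It is purely illustrative: `AuditedReader`, its log record fields, and the in-memory list used as the log are assumptions, and the real system described above works at the filesystem and block-device level, not as an application wrapper.

```python
import io
import time


class AuditedReader:
    """Serve reads from an evidence stream and log each one."""

    def __init__(self, stream, tool_name, log, clock=time.time):
        self._stream = stream
        self._tool = tool_name
        self._log = log          # any list-like object with append()
        self._clock = clock      # injectable so tests can fix the time

    def read_at(self, offset, length):
        self._stream.seek(offset)
        data = self._stream.read(length)
        # Record who read what, where, and when.
        self._log.append({
            "time": self._clock(),
            "tool": self._tool,
            "op": "read",
            "offset": offset,
            "length": len(data),
        })
        return data
```

For example, wrapping `io.BytesIO(b"0123456789")` and calling `read_at(2, 3)` returns `b"234"` and appends one record to the log. Putting the logging beneath the VFS interface, as in the figure, has the advantage that unmodified tools are audited without their cooperation.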
Bluepipe: On the Spot Digital Forensics
Y. Gao, G. G. Richard III, V. Roussev, “Bluepipe: An Architecture for On-the-Spot Digital Forensics,” International Journal of Digital Evidence (IJDE), 3(1), 2004.
Bluepipe Patterns
<BLUEPIPE NAME="findcacti">
  <!-- find illegal cacti pics using MD5 hash dictionary -->
  <DIR TARGET="/pics/" />
  <FINDFILE
      USEHASHES=TRUE
      LOCALDIR="cactus"
      RECURSIVE=TRUE
      RETRIEVE=TRUE
      MSG="Found cactus %s with hash %h ">
    <FILE ID=3d1e79d11443498df78a1981652be454/>
    <FILE ID=6f5cd6182125fc4b9445aad18f412128/>
    <FILE ID=7de79a1ed753ac2980ee2f8e7afa5005/>
    <FILE ID=ab348734f7347a8a054aa2c774f7aae6/>
    <FILE ID=b57af575deef030baa709f5bf32ac1ed/>
    <FILE ID=7074c76fada0b4b419287ee28d705787/>
    <FILE ID=9de757840cc33d807307e1278f901d3a/>
    <FILE ID=b12fcf4144dc88cdb2927e91617842b0/>
    <FILE ID=e7183e5eec7d186f7b5d0ce38e7eaaad/>
    <FILE ID=808bac4a404911bf2facaa911651e051/>
    <FILE ID=fffbf594bbae2b3dd6af84e1af4be79c/>
    <FILE ID=b9776d04e384a10aef6d1c8258fdf054/>
  </FINDFILE>
</BLUEPIPE>
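What a pattern like this asks for can be approximated in Python: walk a target directory, compute each file's MD5, and report files whose hash appears in a dictionary of known hashes. This is a sketch of the general technique only; it does not parse Bluepipe patterns, and `find_by_hash` and `md5_of_file` are hypothetical names.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 16):
    """Return the hex MD5 of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_by_hash(target_dir, known_hashes, recursive=True):
    """Return (path, md5) pairs for files whose MD5 is in known_hashes."""
    known = {h.lower() for h in known_hashes}
    matches = []
    for root, _dirs, files in os.walk(target_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            digest = md5_of_file(path)
            if digest in known:
                matches.append((path, digest))
        if not recursive:
            break              # os.walk yields the top directory first
    return matches
```

Hash matching finds only exact copies of known files; a single changed byte produces a different MD5, which is one reason the talk also lists document similarity measures as an open area.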
Distributed Digital Forensics
[Figure: four items labeled 750GB, 750GB, 300GB, and 300GB]
V. Roussev, G. G. Richard III, “Breaking the Performance Wall: The Case for Distributed Digital Forensics,” Proceedings of the 2004 Digital Forensics Research Workshop (DFRWS 2004), Baltimore, MD.
Distributed Digital Forensics
• Scalable
  – Want to support at least IMAGE_SIZE / RAM_PER_NODE nodes
• Platform independent
  – Want to be able to incorporate any (reasonable) machine that’s available
• Lightweight
  – Horsepower is for forensics, not the framework: less fat
• Highly interactive
• Extensible
  – Allow incorporation of existing sequential tools
  – e.g., stegdetect, image thumbnailing, file classification, hashing, …
• Robust
  – Must handle failed nodes smoothly
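The scalability target, IMAGE_SIZE / RAM_PER_NODE nodes, is the number of machines needed to hold the whole image in memory. A small Python sketch of that arithmetic follows; the function name is hypothetical, and the calculation optimistically assumes each node can devote all of its RAM to caching the image.

```python
def min_nodes(image_size_bytes, ram_per_node_bytes):
    """Smallest node count whose combined RAM can hold the whole image."""
    if ram_per_node_bytes <= 0:
        raise ValueError("ram_per_node_bytes must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-image_size_bytes // ram_per_node_bytes)


GB = 1024 ** 3
print(min_nodes(6 * GB, 1 * GB))      # a 6 GB image on 1 GB nodes
print(min_nodes(750 * GB, 1 * GB))    # a 750 GB drive on 1 GB nodes
```

In practice the operating system and the framework itself take a share of each node's memory, so the real count would be somewhat higher than this lower bound.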
Distributed Digital Forensics (2)
[Figure not recovered]
Distributed Digital Forensics (3)
[Figure: cluster hardware]
• File Server: CPU 2x1.4GHz Xeon, RAM 2GB, SCSI RAID 504GB
• Switch: 96-port, 10/100/1000 Mb, 24 Gb backplane
• Node: CPU 2.4 GHz Pentium 4, RAM 1 GB
• Link label: 1Gb
Beowulf [RIP], Slayer of Computer Criminals…
DDF: Results (1)
• Live string search: “Vassil Roussev”
• Regular expression search: v[a-z]*i[a-z]*a[a-z]*g[a-z]*r[a-z]*a
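Both kinds of query can be tried on a single in-memory block with Python's `re` module. This sketch shows only the matching step, not the distributed framework; the function names and the idea of passing a `base_offset` for the block are assumptions for illustration.

```python
import re

# The regular expression from the slide, compiled for raw bytes.
PATTERN = re.compile(rb"v[a-z]*i[a-z]*a[a-z]*g[a-z]*r[a-z]*a")


def regex_search(block, base_offset=0):
    """Return (absolute offset, matched bytes) for each regex hit in a block."""
    return [(base_offset + m.start(), m.group()) for m in PATTERN.finditer(block)]


def string_search(block, needle, base_offset=0):
    """Return the absolute offset of every occurrence of a literal string."""
    hits = []
    i = block.find(needle)
    while i != -1:
        hits.append(base_offset + i)
        i = block.find(needle, i + 1)
    return hits
```

In a distributed setting each node could run these over the part of the image it holds in RAM and report offsets relative to the whole image, which is what `base_offset` is meant to stand in for; matches that straddle two nodes' ranges would need extra handling.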
DDF: Results (2)
• Stego detection using Stegdetect 0.5 under RH9 Linux on the cluster
• Traditional:
  – 6GB image mounted using loopback device
  – find /mnt/loop -exec ./stegdetect '{}' \;
  – 790 seconds == 13:10 minutes
• Using the distributed framework
  – Stegdetect 0.5 code incorporated into framework
  – Detection against cached files
  – “STEGO” command (after IMAGE/CACHE)
  – 82 seconds == 1:22 minutes
• 9.6X faster with 8 machines
• CPU-bound operation
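The 9.6X figure follows directly from the two timings above. The short Python sketch below recomputes it along with the per-machine efficiency; the helper names are hypothetical.

```python
def speedup(t_sequential, t_parallel):
    """How many times faster the parallel run is."""
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, machines):
    """Speedup divided by machine count; 1.0 would be perfectly linear."""
    return speedup(t_sequential, t_parallel) / machines


s = speedup(790, 82)
e = efficiency(790, 82, 8)
print(f"speedup {s:.1f}x, efficiency {e:.2f}")
```

An efficiency above 1.0 means the speedup is better than linear in the number of machines. One plausible reading, consistent with the slide's note that detection ran against cached files, is that the distributed run avoids disk I/O that the traditional run pays for.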
DDF: To Do List
• User interface! (unless you love Putty)
DDF: To Do (2)
• Case persistence
• Secure support for overlapping cases
• Better fault tolerance
• Intelligent caching schemes to support larger images
• Collaboration with colleagues (you?) working in:
  – Image analysis/classification
  – Speech recognition
  – More stego
  – Other CPU horsepower-intensive, forensics-applicable stuff
  – We provide cycles…you provide…
Current: Live Forensics
• Physical memory dumps
  – Hard to do when adversarial OS is present
  – Via USB hacking?
  – Firewire proof of concept developed by Maximillian Dornseif
• Defeating process hiding techniques, e.g., FU “rootkit”
  – Check OS components from many angles
• Remnants of applications (executed) past…
  – e.g., instant messenger fragments
  – e.g., recent invocations of process hiding
  – e.g., fingerprints of recently executed (or executing) malware
Conclusion: Lots of Work To Do
• Benevolent hacking (engineering) meets science
• Desperately need methods for pipelining investigative process
• Live forensics critically important
  – volatile computing
  – whole disk encryption
  – hardware-based whole disk encryption!
  – nasty malware
Conclusion (2)
• Arguably, almost any field in CS can collaborate
  – All media handling needs work
  – Algorithms for dealing with huge, partially-organized datasets
  – Attribution
  – Correlation
  – Profiling
  – Document similarity measures
  – Databases
  – High-performance computing
  – OS Internals
Random Bedside Reading…
• http://www.dfrws.org (Digital Forensics Research Workshop)
• http://www.ijde.org/ (International Journal of Digital Evidence)
• F. Adelstein, “Live Forensics: Diagnosing Your System Without Killing it First,” Communications of the ACM, February 2006.
• M. A. Caloyannides, Privacy Protection and Computer Forensics, Second Edition, 2004.
• B. Carrier, File System Forensic Analysis, Addison-Wesley, 2005.
• B. Carrier, “Risks of Live Digital Forensics Analysis,” Communications of the ACM, February 2006.
• E. Casey, Digital Evidence and Computer Crime, Academic Press, 2004.
• J. Chow, B. Pfaff, T. Garfinkel, M. Rosenblum, “Shredding Your Garbage: Reducing Data Lifetime Through Secure Deallocation,” 14th USENIX Security Symposium, 2005.
• M. Geiger, “Evaluating Commercial Counter-Forensic Tools,” 5th Annual Digital Forensic Research Workshop (DFRWS 2005), New Orleans, 2005.
• G. G. Richard III, V. Roussev, “Next Generation Digital Forensics,” Communications of the ACM, February 2006.
• G. G. Richard III, V. Roussev, “Digital Forensics Tools: The Next Generation,” invited chapter in Digital Crime and Forensic Science in Cyberspace, IDEA Group Publishing, 2005.
• A. Schuster, “Searching for Processes and Threads in Microsoft Windows Memory Dumps,” 6th Annual Digital Forensic Research Workshop (DFRWS 2006), West Lafayette, IN, 2006.
• S. Sparks, J. Butler, “Raising the Bar for Windows Rootkit Detection,” Phrack Issue # 63.
• G. Hoglund, J. Butler, Rootkits: Subverting the Windows Kernel, Addison-Wesley, 2005.
Presentation available:
http://www.cs.uno.edu/~golden/teach.html
golden@cs.uno.edu
Security Lab (NSSAL): Math 322
?