similar

Parts of a Workflow

Entities/Stages: where something happens (e.g. data are transformed, someone makes a decision, data are captured)
Input(s): control and/or information that flows into an entity/stage
Output(s): control and/or information that flows out of an entity/stage
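The three parts above can be modeled as a small data structure. The sketch below is illustrative only: the names `Stage`, `unmet_inputs`, and the example stage names are assumptions made for this example, not part of BitCurator or any workflow tool. It checks whether every input of a stage is supplied by an earlier stage or by an external source.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """An entity/stage: a place in the workflow where something happens."""
    name: str
    inputs: list[str] = field(default_factory=list)   # control/information flowing in
    outputs: list[str] = field(default_factory=list)  # control/information flowing out


def unmet_inputs(stages: list[Stage], external: set[str]) -> dict[str, list[str]]:
    """Return, per stage, the inputs that no earlier stage or external source supplies."""
    available = set(external)
    missing: dict[str, list[str]] = {}
    for stage in stages:
        gaps = [item for item in stage.inputs if item not in available]
        if gaps:
            missing[stage.name] = gaps
        # Outputs of this stage become available to every later stage.
        available.update(stage.outputs)
    return missing


# Hypothetical three-stage workflow, named in verb-noun style.
workflow = [
    Stage("acquire media", inputs=["storage medium"], outputs=["disk image"]),
    Stage("generate report", inputs=["disk image"], outputs=["report"]),
    Stage("redact data", inputs=["disk image", "redaction policy"], outputs=["redacted image"]),
]
gaps = unmet_inputs(workflow, external={"storage medium"})
print(gaps)
```

Here the check flags that "redact data" needs a redaction policy that nothing in the workflow provides, which is the kind of gap a workflow diagram is meant to expose.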

Digital Resources - Levels of Representation

Level | Label | Explanation
8 | Aggregation of objects | Set of objects that form an aggregation that is meaningful when encountered as an entity
7 | Object or package | Object composed of multiple files, each of which could also be encountered as individual files
6 | In-application rendering | As rendered and encountered within a specific application
5 | File through filesystem | Files encountered as discrete set of items with associated paths and file names
4 | File as “raw” bitstream | Bitstream encountered as a continuous series of binary values
3 | Sub-file data structure | Discrete “chunk” of data that is part of a larger file
2 | Bitstream through I/O equipment | Series of 1s and 0s as accessed from the storage media using input/output hardware and software (e.g. controllers, drivers, ports, connectors)
1 | Raw signal stream through I/O equipment | Stream of magnetic flux transitions or other analog electronic output read from the drive without yet interpreting the signal stream as a set of discrete values (i.e. not treated as a digital bitstream that can be directly read by the host computer)
0 | Bitstream on physical medium | Physical properties of the storage medium that are interpreted as bitstreams at Level 1

Levels where digital forensics methods and tools can provide a lot of assistance

Storage Media Acquisition and Handling Profile for Digital Repositories*

*Woods, Kam, Christopher A. Lee, and Simson Garfinkel. "Extending Digital Repository Architectures to Support Disk Image Preservation and Access." In JCDL '11: Proceeding of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, 57-66. New York, NY: ACM Press, 2011.

BitCurator-Supported Workflow

See: http://bitcurator.net

Acquisition
Reporting
Redaction
Metadata Export

Two Sources of Workflow Examples

Martin J. Gengenbach, “‘The Way We Do it Here’: Mapping Digital Forensics Workflows in Collecting Institutions,” A Master’s Paper for the M.S. in L.S. degree. August 2012.
http://digitalcurationexchange.org/system/files/gengenbach-forensic-workflows-2012.pdf

AIMS Work Group, “AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship,” January 2012.
http://www2.lib.virginia.edu/aims/whitepaper/AIMS_final.pdf

Identifying a Process*

Name it
  Verb-noun, e.g. generate AIP, harvest web site
  Verb-qualifier-noun, e.g. generate descriptive information, develop preservation strategy
  Verb-noun-noun, e.g. assign file permissions, verify object integrity
Ensure there is a clearly intended result
  Test: noun-is-verbed form (e.g. AIP is generated, web site is harvested, object integrity is verified)

*Sharp, Alec, and Patrick McDermott. Workflow Modeling: Tools for Process Improvement and Applications Development. 2nd ed. Boston, MA: Artech House, 2009. p.40

Criteria for Identified Result*

1. Discrete and identifiable: “you can differentiate individual instances of the result, and it makes sense to talk about ‘one of them’”
2. Countable: “you can count how many of that result you've produced in an hour, a day, or a week”
3. Essential: “fundamentally necessary to the operation of the enterprise, not just a consequence of the current implementation,” i.e. “must focus on ‘what, not who or how’”

*Sharp, Alec, and Patrick McDermott. Workflow Modeling: Tools for Process Improvement and Applications Development. 2nd ed. Boston, MA: Artech House, 2009. p.40-41

For Further Consideration After Defining the Basic Workflow

The “Three R’s”
  Roles (who are the actors who complete steps in the process?)
  Responsibilities (what are the individual steps that each actor performs?)
  Routes (what are the flows and decisions that connect the steps and define the path?)

*Sharp, Alec, and Patrick McDermott. Workflow Modeling: Tools for Process Improvement and Applications Development. 2nd ed. Boston, MA: Artech House, 2009. p.203


Should I Preserve Disk Images?

Factors to consider:
  Available storage capacity
  Size and number of acquired media (terabyte hard drive vs. 1.44 megabyte floppy)
  Reasons for acquiring materials and intended use cases
  Your institution’s ability to deal with disk images appropriately in the future
  Understood commitments to donors and other stakeholders
Strong arguments for at least keeping the image within a “staging area”, even if later discarded
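The storage-capacity factor is easy to put into numbers. The arithmetic below is a rough sketch under stated assumptions: a "1.44 MB" floppy image is taken as 1,474,560 bytes (1440 × 1024), a terabyte drive as 10^12 bytes, and the 20% headroom default is an arbitrary illustrative figure, not a recommendation. The function names are hypothetical.

```python
FLOPPY_BYTES = 1440 * 1024   # nominal "1.44 MB" floppy image size (assumption)
DRIVE_BYTES = 10**12         # decimal terabyte, as drive capacities are usually quoted


def images_that_fit(capacity_bytes: int, image_bytes: int) -> int:
    """Number of whole images of a given size that fit in the capacity."""
    return capacity_bytes // image_bytes


def staging_bytes_needed(image_sizes: list[int], headroom_percent: int = 20) -> int:
    """Total bytes for a staging area holding all images plus headroom (rounded up)."""
    total = sum(image_sizes)
    # Integer ceiling division avoids floating-point rounding surprises.
    return (total * (100 + headroom_percent) + 99) // 100


floppies_per_drive = images_that_fit(DRIVE_BYTES, FLOPPY_BYTES)
print(f"Floppy images per 1 TB: {floppies_per_drive:,}")
print(f"Staging for 500 floppies: {staging_bytes_needed([FLOPPY_BYTES] * 500):,} bytes")
```

On these assumptions, keeping floppy images costs almost nothing in storage, while a single imaged hard drive can fill the same space; the decision to retain images therefore depends heavily on the mix of media acquired.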

Nautilus Scripts

In addition to the specialized forensics tools in the BitCurator environment, there are a variety of scripts that can be run using the GNOME file manager called Nautilus (Linux analog to Windows Explorer or Mac OS X Finder)
Can be used in the BitCurator environment or your own Linux environment

File Details for Word Documents in a Directory (Nautilus Script)

MD5 Hashes of Files (Nautilus Script)
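The Nautilus script itself is not reproduced in these slides. As a rough stand-in, the following sketch shows one way to compute MD5 hashes for every file under a directory using only the Python standard library. The function names `md5_of_file` and `md5_manifest` are hypothetical and are not taken from the BitCurator scripts.

```python
import hashlib
import sys
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_manifest(directory: Path) -> dict[str, str]:
    """Map each file's path (relative to the directory) to its MD5 hex digest."""
    return {
        str(path.relative_to(directory)): md5_of_file(path)
        for path in sorted(directory.rglob("*"))
        if path.is_file()
    }


if __name__ == "__main__":
    # Usage: python md5_manifest.py [directory]; defaults to the current directory.
    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for name, checksum in md5_manifest(target).items():
        print(f"{checksum}  {name}")
```

MD5 is used here because it is the hash named on the slide; it is adequate for detecting accidental change and for matching identical files, but it is not collision-resistant against deliberate tampering.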

Packaging, processing and providing access to data from forensic disk images (including redaction considerations)

Distribution Paths*

*Woods, Kam, Christopher A. Lee, and Simson Garfinkel. "Extending Digital Repository Architectures to Support Disk Image Preservation and Access." In JCDL '11: Proceeding of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, 57-66. New York, NY: ACM Press, 2011.


Access to Data from Disk Images

Provide entire disk image to user as a single file
Extract portions of image for later user access [1]
Allow access to portions of image by generating them on-request [2]

[1] Garfinkel, Simson L. "Automating Disk Forensic Processing with Sleuthkit, XML and Python." Paper presented at the Fourth International IEEE Workshop on Systematic Approaches to Digital Forensic Engineering, Oakland, CA, USA, May 21, 2009.

[2] Woods, Kam, and Geoffrey Brown. "From Imaging to Access - Effective Preservation of Legacy Removable Media." In Archiving 2009: Preservation Strategies and Imaging Technologies for Cultural Heritage Institutions and Memory Organizations: Final Program and Proceedings, 213-18. Springfield, VA: Society for Imaging Science and Technology, 2009.

Woods, Kam, and Geoffrey Brown. "Migration Performance for Legacy Data Access." International Journal of Digital Curation 3, no. 2 (2008): 74-88.
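Extracting portions of an image for later access usually starts from a file-level inventory of the image. The Garfinkel paper cited above describes producing such an inventory as XML and processing it with Python. The sketch below is a simplified illustration of that idea, not the paper's code: the sample XML is hand-written, omits the XML namespace that real Digital Forensics XML output carries, and the element names (`fileobject`, `filename`, `filesize`, `hashdigest`) should be checked against the output of whatever tool is actually used. The function name `list_files` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample; not output from a real tool.
SAMPLE = """<dfxml>
  <volume>
    <fileobject>
      <filename>notes/hello.txt</filename>
      <filesize>5</filesize>
      <hashdigest type="md5">5d41402abc4b2a76b9719d911017c592</hashdigest>
    </fileobject>
    <fileobject>
      <filename>notes/empty.txt</filename>
      <filesize>0</filesize>
      <hashdigest type="md5">d41d8cd98f00b204e9800998ecf8427e</hashdigest>
    </fileobject>
  </volume>
</dfxml>"""


def list_files(xml_text: str) -> list[dict]:
    """Return one record (name, size, md5) per fileobject element."""
    root = ET.fromstring(xml_text)
    records = []
    for obj in root.iter("fileobject"):
        md5 = next(
            (h.text for h in obj.findall("hashdigest")
             if h.get("type", "").lower() == "md5"),
            None,  # no MD5 recorded for this file
        )
        records.append({
            "name": obj.findtext("filename"),
            "size": int(obj.findtext("filesize", default="0")),
            "md5": md5,
        })
    return records


records = list_files(SAMPLE)
for record in records:
    print(record["size"], record["md5"], record["name"])
```

An inventory like this can drive any of the three access options: it can accompany a whole image, select files to extract in advance, or serve as the index for generating files on request.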

Example of System for Dynamically Walking the Trees of Disk Images

http://svp.soic.indiana.edu/svp/4909070

http://svp.soic.indiana.edu/svp/4909070/FID1/

Example of Serving out Entire Disk Images (Using BitTorrent)

http://terasaur.org/item/show/computer-forensics-2009-m57-scenario/187

Identification, Flagging and Potential Redaction of Sensitive/Hidden Information

Example of Revealing Overlaid Text in PDF (After Unsuccessful Redaction in MS Word)

"Analysis of Diversity in the Attorney Workforce." KPMG Consulting. June 14, 2002. p.ES-2.

Slide from: Garfinkel, Simson L. "Computer Forensics and Media Exploitation: Technology, Policy and Countermeasures." Large Installation System Administration (LISA) Conference. San Diego, CA. Nov. 9, 2008.


E-Discovery and Forensics Impact on Computer Industry

Changes to Microsoft Office, e.g.
  Rise and fall of the embedded PID GUID (introduced in Office 97, abandoned in Office 2000)
  Document Inspector (Office 2007)
  Comments and tracked changes shown by default when opening a document (Office 2007)
  “Fast save” turned off by default in Word 2000 and disabled in Word 2003
Some other Windows changes
  No more accidental dumping of RAM slack (now writes zeros)
  In Index.dat, deleted entries are reportedly zeroed out (since IE7 and Vista)
Macintosh: encryption and safe delete
Large market for software designed specifically for managing e-discovery, e.g. EnCase & Neutrino (Guidance Software), Discovery Partner (Electronic Evidence Discovery), iScrub (Esquire Innovations)

Legal and Ethical Considerations

When and how to “really” delete data
Responsibilities of curators of “hidden” data
Commitments to donors and other stakeholders

Professional and Ethical Considerations

Unlike conventional analog data, such as the shade of grey or the subjective recollection of a witness, whose believability and validity is scrutinized in depth, digital data which takes one of two very unambiguous values (zero or one) is misperceived by the average person as being endowed with intrinsic and unassailable truth.

In fact, quite the opposite is true.

This “dirty little secret” about digital “evidence” is conveniently soft-pedaled by the computer forensics industry and by the prosecution, both of which focus on those other aspects of the process of collecting, preserving and presenting digital data evidence which can indeed be unassailable if done properly, such as the “chain of custody” portion of handling digital evidence.*

*Caloyannides, Michael A. "Digital 'Evidence' Is Often Evidence of Nothing." In Digital Crime and Forensic Science in Cyberspace, edited by Panagiotis Kanellis, 334-39. Hershey, PA: Idea Group Pub., 2006.

"If a forensic examiner has complete confidence in his/her conclusions, this is usually an indication that he/she is missing something - there is always uncertainty and all assertions should be qualified accordingly..."

Casey, Eoghan. "Error, Uncertainty, and Loss in Digital Evidence." International Journal of Digital Evidence 1, no. 2 (2002).


Investigators cannot, in general, directly observe digital data and instead they can only observe the data displayed on a monitor or other output device, which is driven by various types of hardware and software. Because the observation of the data is indirect, a hypothesis must be formulated that the actual data is equal to the observed data. Testing this hypothesis requires that the hardware and software being used are accurate and reliable. Hypotheses also need to be formulated about the data abstractions that exist and the previous states and events that occurred.*

*Carrier, Brian D. "A Hypothesis-Based Approach to Digital Forensic Investigations." Doctoral Dissertation, Purdue University, 2006. p.11 (emphasis added).


Examples of Potentially Useful Inferences (that could be Wrong)

Name embedded in an MS Word file is the document’s author (can be wrong, e.g. when a template or document has been reused)
Given IP address identifies an individual
Presence of email addresses on different hard drives indicates correspondence patterns between individuals
Many common MD5 values across storage locations indicate sharing of files across those locations (context-based filtering can help to address this)
Last modified date indicates when a document was finalized
Parts of a page available through the Wayback Machine for a given date represent the parts of the page as available on that date
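The MD5 inference above can be illustrated with a short sketch. It assumes that "context-based filtering" means discarding hashes of files known to be widely distributed (for example operating system or application files) before treating a shared hash as evidence of sharing; that reading, the function name `shared_hashes`, and the placeholder hash strings are all assumptions made for this example.

```python
from collections import defaultdict


def shared_hashes(
    manifests: dict[str, dict[str, str]],
    known_common: set[str],
) -> dict[str, set[str]]:
    """Map each hash found at two or more locations to those locations.

    manifests maps a storage location to its {file path: md5} listing.
    Hashes listed in known_common are ignored, since their presence at
    several locations says nothing about files being passed between them.
    """
    locations_by_hash: dict[str, set[str]] = defaultdict(set)
    for location, files in manifests.items():
        for md5 in files.values():
            locations_by_hash[md5].add(location)
    return {
        md5: locations
        for md5, locations in locations_by_hash.items()
        if len(locations) > 1 and md5 not in known_common
    }


# Placeholder hash strings stand in for real MD5 values.
manifests = {
    "drive_a": {"letter.doc": "hash-letter", "notepad.exe": "hash-os-file"},
    "drive_b": {"copy of letter.doc": "hash-letter", "notepad.exe": "hash-os-file"},
    "drive_c": {"budget.xls": "hash-budget"},
}
result = shared_hashes(manifests, known_common={"hash-os-file"})
print(result)
```

Even after filtering, a shared hash shows only that identical content exists in both places; it does not show who copied it, when, or in which direction.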

The “Keyboard Dilemma”

Even if a document can be traced to a particular computer and/or IP address, how can we identify who was actually at the keyboard composing the document? It is a particular problem in environments where multiple users may have access to the same computer or when users do not have to authenticate themselves to access a particular account.

Chaski, Carole. "The Keyboard Dilemma and Authorship Identification." In Advances in Digital Forensics III: IFIP International Conference on Digital Forensics, National Center for Forensic Science, Orlando, Florida, January 28-January 31, 2007, edited by Philip Craiger and Sujeet Shenoi. New York, NY: Springer, 2007. p.133.


Shared Computer Use in the Home*

*Frohlich, David, and Robert Kraut. "The Social Context of Home Computing." In Inside the Smart Home, edited by Richard Harper, 127-62. London: Springer, 2003.

When Delete Really Does Mean Delete

Excerpted from: Kissel, Richard, Matthew Scholl, Steven Skolochenko, and Xing Li. Special Publication 800-88: Guidelines for Media Sanitization. Gaithersburg, MD: National Institute of Standards and Technology, 2006.

Sanitization Techniques

Method: Clear
  Use software or hardware products to overwrite storage space on the media with non-sensitive data
  May include overwriting not only the logical storage location of a file(s) (e.g., file allocation table) but also may include all addressable locations
  Goal is to replace written data with random data
  Cannot be used for media that are damaged or not rewriteable

Method: Purge
  Degaussing or executing firmware Secure Erase command
  Degaussing = exposing the magnetic media to a strong magnetic field in order to disrupt the recorded magnetic domains...using either a strong permanent magnet or an electromagnetic coil

Method: Destroy
  Many different types, techniques, and procedures
  Disintegration, Pulverization, Melting, and Incineration - designed to completely destroy the media
  Shredding - Paper shredders can be used to destroy flexible media such as diskettes once the media are physically removed from their outer containers
  Optical mass storage media must be destroyed by pulverizing, crosscut shredding or burning
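To make the "Clear" idea concrete, the sketch below overwrites the contents of a single file in place with random bytes. It is a simplified illustration, not a sanitization procedure: it works only at the file level, so it does not reach slack space, filesystem journals, earlier copies of the file, or blocks remapped by flash storage, and it should not be treated as meeting the cited guidelines. The function name `overwrite_file` is hypothetical.

```python
import os


def overwrite_file(path: str, chunk_size: int = 1 << 20) -> int:
    """Overwrite a file's current contents in place with random bytes.

    Returns the number of bytes overwritten. The file keeps its name and size.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:  # r+b: write in place without truncating
        remaining = size
        while remaining > 0:
            count = min(chunk_size, remaining)
            handle.write(os.urandom(count))
            remaining -= count
        handle.flush()
        os.fsync(handle.fileno())  # ask the OS to push the new bytes to the device
    return size
```

Clearing a whole medium, as the table describes, means addressing every addressable location on the device and not just one file, which is why dedicated software or hardware products are used for it.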

8-301. Clearing and Sanitization. Instructions on clearing, sanitization and release of IS [Information System] media shall be issued by the accrediting CSA [Cognizant Security Agency].

a. Clearing. Clearing is the process of eradicating the data on media before reusing the media in an environment that provides an acceptable level of protection for the data that was on the media before clearing. All internal memory, buffer, or other reusable memory shall be cleared to effectively deny access to previously stored information.

b. Sanitization. Sanitization is the process of removing the data from media before reusing the media in an environment that does not provide an acceptable level of protection for the data that was in the media before sanitizing. IS resources shall be sanitized before they are released from classified information controls or released for use at a lower classification level.

Source: National Industrial Security Program Operating Manual (DoD 5220.22-M)


For most sensitive data, can’t just overwrite, but must instead degauss (use powerful magnet) or destroy medium

Defense Security Service provides a Clearing and Sanitization Matrix (C&SM)

Conclusions: Implied Changes within the Archival Profession

Professional vocabulary evolving to include terms such as disk image, hex viewer, cryptographic hash, and filesystem
Gaining access to new professional communities and guidance
Use of tools designed to treat data at a low level (as raw bitstreams off media) rather than at the file level
Potential to shift “center of gravity” about electronic records from design of institutional recordkeeping systems toward acquisition and management of records from a more diverse and unpredictable set of sources

Implications for your institution?


For Further Guidance

See supplemental materials
Digital forensics vendors offer workshops (e.g. AccessData, Digital Intelligence, Guidance Software)
http://www.forensicswiki.org/

BitCurator Resources

http://wiki.bitcurator.net/
  Get the software
  Documentation and technical specifications
  Screencasts
  Google Group

http://www.bitcurator.net/
  People
  Project overview
  Publications
  News

Twitter: @bitcurator

Thank you!