Shotton ... - ImageWeb server


Twenty Questions for Research Data Management


A Web entry form that permits creation of a data management plan using these questions is now available at http://www.miidi.org/dmp/.

[Notes: These questions were revised on 22 March 2012 and again on 11 June 2012. Further changes to improve the clarity of the questions were made on 9 May 2013 (see Footnote).

This document is also available as a Word file from http://imageweb.zoo.ox.ac.uk/pub/2012/publications/Shotton-Twenty_Questions_for_Research_Data_Management.docx.]

These twenty questions are designed to prompt and assist your thinking, as a research student, a postdoc or an academic researcher at the beginning of a research project, and to form the basis of a workable research data management plan that can both guide your on-going data management activities and inform others about the nature and availability of your research data.

They will help you determine how best to safeguard your data from loss, how to describe your datasets in ways that assist both yourself when returning to them in the future and others in their subsequent interpretation, and how to publish your data in ways that maximize their usefulness to others and bring maximum academic scholarly credit to yourself, to reward your efforts in acquiring, analysing, describing, interpreting and publishing them in the first place.

You may not have immediate answers to all these questions. But, by seeking advice from your research supervisor, colleagues and others in your institution with responsibilities for data management, you should endeavour to discover them. Then, once in a while, you should revisit these questions and see whether your data management practices can be improved, updating your answers.

More detailed data management planning questions are available online, and a comparison of those with these Twenty Questions will be the subject of a subsequent blog post.

The nature of your data

1 What is the general subject discipline (domain, field) to which your research data relates?

Possible responses:



Quantum physics.



Cell biology.



Ornithology.

2 What is the exact nature (range, scope) of your research data?

Possible responses:



Long-distance quantum communication using entangled photons.



Protein chemistry and electron microscopy of cell membrane proteins.



Video field recordings of avian behaviour, and their quantitative analysis.

3 Who will own the data arising from your research, and the intellectual property rights relating to them?

Possible responses:



Myself alone.



Myself and my research group leader.



My university.

4 In what format(s) will you store your data in the short term after acquisition (if you know at this stage)?

Possible responses:



Questionnaire response data will be stored on my laptop in a Microsoft Office Access 2007 database.



Raw video recordings on digital video tapes on the shelf above my desk, edited videos in .mov format on my laptop.



Numerical analyses in a spreadsheet (Microsoft Office Excel 2007 format) on my laptop.



On my research group’s cloud-based secure DataStage research data file store, in Zeiss confocal 3D image format.

Data descriptions, so that someone else can understand what the data are about (i.e. metadata, “data about data”)

5 When and where will you describe each of your research datasets?

Possible responses:



The only description will be the filenames on my hard drive.



I will describe the data using handwritten notes in my lab notebook if and when I have time, after the experiments have been completed; hopefully I’ll be able to remember all the details.



I will describe the data using the column and row labels in my spreadsheets after the data have been analysed.



I will create descriptive metadata for each dataset as I create/acquire it, and will save these descriptions with my datasets on my hard drive.

6 How will descriptive metadata be created or captured?

Possible responses:



Instrument metadata are automatically included in each data file.



I will create a title and short textual description for each dataset using the supplied submission interface when I submit the dataset to my university’s data repository.



My data descriptions will be saved in spreadsheets or word processor documents.



Rich metadata conforming to a Minimal Information Standard appropriate to my research field will be recorded at the time of data acquisition, using a metadata entry form to ensure I don’t miss any essential information. This metadata file will be saved locally with my dataset, and eventually will be deposited with the dataset when it is submitted to a data repository.
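As an illustration of the last response, the idea of a metadata entry form that refuses incomplete records can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the field names in REQUIRED_FIELDS, the function names and the ".metadata.json" file suffix are all hypothetical, and a real Minimal Information Standard defines its own required fields and file format.

```python
import json
from pathlib import Path

# Hypothetical required fields, for illustration only. A real Minimal
# Information Standard specifies its own list of essential items.
REQUIRED_FIELDS = ("title", "creator", "date_acquired", "instrument", "description")


def missing_fields(metadata):
    """Return the required fields that are absent or blank in the record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def save_metadata(dataset_path, metadata):
    """Save the record next to the dataset, refusing incomplete records."""
    missing = missing_fields(metadata)
    if missing:
        raise ValueError("Missing metadata fields: " + ", ".join(missing))
    dataset_path = Path(dataset_path)
    # The sidecar file sits beside the data file so the two travel together.
    sidecar = dataset_path.with_name(dataset_path.name + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar
```

Keeping the description in a file beside the data means that whatever is later packaged for a repository already carries its metadata with it.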

Data sharing and publication

7 With whom will you share your research data in the short term, before publication of any papers arising from their interpretation?

Possible responses:



My research supervisor only.



Members of my research group and trusted external collaborators.



Anyone who asks for them.



Everyone, by publishing the data online, since our research community is committed to the rapid sharing of research results.

8 For how long will you embargo your research data before it is published for others to see and use?

Possible responses:



We will allow immediate public access to the data.



For one year, to permit us to exploit our hard-won research results.



Until the journal article describing our results has been published.

9 Why is public access to your research data to be restricted (if indeed it is)?

Possible responses:



We intend to make a patent application, and must avoid prior disclosure.



Don’t want to make locations of members of endangered species available to poachers.



The research data are confidential because of the arrangement my research group has made with the commercial
partner sponsoring our research.



My data form part of a long-term study upon which my research group is entirely reliant for its on-going research publications and academic reputation. We only share this with trusted colleagues.



Confidential human patient data.



Questionnaire data collected in confidence from individuals; anonymized averaged data will be published.

10 Under what data-sharing license will you publish your research data?

Possible responses:



What is a data-sharing license?



Under a Creative Commons Open Data CC Zero public domain dedication and waiver, since my research data are not covered by copyright.



Using a Creative Commons Attribution License, since my image data are covered by copyright.

11 What persistent identifier will be used to permit correct citation of your datasets?

Possible responses:



This URL: http://****.



A Digital Object Identifier (DOI).



The accession number for the dataset issued by the European Bioinformatics Institute database to which the dataset is
submitted.
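To make the DOI response concrete: a DOI can be turned into a resolvable Web link by prefixing it with the https://doi.org/ resolver address. The short Python sketch below does this for the DOI cited in Reference [1]; the function name doi_to_url is hypothetical and chosen only for this illustration.

```python
from urllib.parse import quote


def doi_to_url(doi):
    """Turn a DOI such as 'doi:10.1038/nbt0808-889' into a resolver URL."""
    doi = doi.strip()
    # Accept the common 'doi:' prefix used in reference lists.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode any unusual characters, but keep the '/' separator.
    return "https://doi.org/" + quote(doi, safe="/")


print(doi_to_url("doi:10.1038/nbt0808-889"))
```

Unlike an ordinary URL pointing at a particular server, the resolver link is intended to keep working if the dataset moves to a new location, provided the DOI record is kept up to date.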

12 What metadata will be published with the data to make them interpretable and reusable?

Possible responses:



I will expect users to be able to interpret the column and row labels in my spreadsheets.



The dataset will be described in the journal article we will publish, but will have no other metadata beyond those required by the repository for data citation: Author, Date, Title, Source, Identifier.



An XML metadata file created in conformance with a Minimal Information standard will be submitted to the repository as part of the data package, along with the data files.
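As a minimal sketch of what such an XML metadata file might contain, the Python below writes the five data citation fields named above (Author, Date, Title, Source, Identifier) as XML elements. The element names, the "dataset" root element and the sample values are assumptions made for illustration; they do not follow any particular Minimal Information standard or repository schema.

```python
import xml.etree.ElementTree as ET

# The five data citation fields named in the responses above.
CITATION_FIELDS = ("Author", "Date", "Title", "Source", "Identifier")


def build_metadata_xml(record):
    """Return an XML string holding the citation fields of one dataset.

    Raises KeyError if the record lacks any of the five fields.
    """
    root = ET.Element("dataset")  # hypothetical root element name
    for field in CITATION_FIELDS:
        child = ET.SubElement(root, field.lower())
        child.text = record[field]
    return ET.tostring(root, encoding="unicode")


# Placeholder values, not a real dataset.
example = {
    "Author": "A. Researcher",
    "Date": "2013-05-09",
    "Title": "Example dataset",
    "Source": "Example data repository",
    "Identifier": "http://example.org/dataset/1",
}
print(build_metadata_xml(example))
```

A file like this could be saved alongside the data files in the submission package, so that the minimum citation information travels with the dataset.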

Data storage, backup and archiving

13 Where will you store your data in the short term, after acquisition?

Possible responses:



On my laptop.



On the computer connected to the microscope.



On my research group’s DataStage filestore.

14 Who is responsible for the immediate day-to-day management, storage and backup of the data arising from your research?

Possible responses:



Myself alone.



My research group’s data manager.



Our departmental IT staff, who manage our research group’s DataStage research data management system.

15 How frequently will your research data be backed up for short-term data security?

Possible responses:



Whenever I remember to do so.



Nightly, using our research group’s DataStage research data management system connected to the University’s automated backup service.
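Whatever the backup frequency, it is worth checking occasionally that the backup copies really match the working files. The Python sketch below compares SHA-256 checksums of every file in a working folder with its copy in a backup folder. It is an illustration only: the function names are hypothetical, and it assumes the backup is an ordinary folder mirroring the working folder, which may not be true of a managed backup service.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unverified_files(source_dir, backup_dir):
    """List files that are missing from, or differ in, the backup folder."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        relative = src.relative_to(source_dir)
        copy = backup_dir / relative
        # A file counts as backed up only if the copy exists and is identical.
        if not copy.is_file() or sha256_of(src) != sha256_of(copy):
            problems.append(relative.as_posix())
    return problems
```

An empty list means every working file has an identical copy in the backup folder; any names returned need attention before the next experiment overwrites them.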

16 Where will your research data be archived for long-term preservation?

Possible responses:



Selected data will be included in the figures and tables of research papers published by my research group, but we have no plans to archive and publish the full datasets.



As supplementary files attached to my journal articles on the publisher’s web site.



In the University’s DataBank data repository, run by the library service.



In appropriate genomics databases run by the European Bioinformatics Institute.

17 When will your research data be moved to a secure archive for long-term preservation and publication?

Possible responses:



Our research data are already securely stored in an institutional data server.



Nightly.



Upon completion of each set of experiments.



When my research group leader decides it is appropriate.



Immediately after publication of my thesis.



Upon submission of our Nature paper, so that the data are available for reviewers.

18 Who will decide which of your research data are worth preserving?

Possible responses:



Myself alone.



Myself, in consultation with my research supervisor.



My research supervisor alone.

19 How (i.e. by what physical or electronic method) will you transfer your research datasets to their long-term archive, under the curatorial care of a separate third-party, e.g. a data repository?

Possible responses:



On physical hard drives that I will bring back from my field site by air.



By e-mailing files to our librarian.



By completion of the selected data repository’s Web-based submission form and uploading of the data files over the Internet.



By use of a local data management system such as DataStage that can automatically package and submit data files to the selected repository.

20 Who will be responsible for your data, once you have left your present research group?

Possible responses:



At this stage, I have no idea.



I’ll take my data with me and maintain responsibility.



My supervisor will make appropriate arrangements.



I hope the journal will maintain access to the supplementary information files associated with my article.



My University will assume long-term responsibility for the data I have chosen to preserve in its data archive.

- - -

Notes

Creative Commons: Creative Commons is a non-profit organization that has developed a legal and technical infrastructure for the licensing of copyright material and data in a standardised and machine-readable manner, thereby facilitating open publication, sharing and innovation in the digital age.

DataStage and DataBank: DataStage is a simple research data filestore and repository data submission system, designed for deployment at the research group level. DataBank is a data repository for archiving and publishing research data, designed for deployment at the institutional level. Both are open-source services for local or cloud deployment developed together at Oxford University within the JISC University Modernization Fund DataFlow Project, and both are now available for third-party installation and use.




European Bioinformatics Institute: The European Bioinformatics Institute (EBI) houses Europe’s primary databases for molecular sequence data, genomics and bioinformatics, and shares data daily with similar institutions in the United States and Japan.

Minimal Information Standards for life science research specify minimal metadata requirements for certain types of research data, are integrated by the MIBBI Project (Minimum Information for Biological and Biomedical Investigations), and are described in Reference [1].

Reference

[1] Taylor et al. (2008). Promoting coherent minimum reporting guidelines for biological and biomedical investigations: the MIBBI project. Nature Biotechnology 26(8): 889-896. doi:10.1038/nbt0808-889.

- - -

Footnote:

These questions were revised on 22 March 2012, two weeks after they were first published, to simplify the wording, to remove some redundancy between questions, and to split compound questions into single questions. To keep the total number of questions to 20, two questions about when data would be collected and analysed have been removed. The remaining twenty questions have been slightly re-ordered.

Following suggestions by Sally Rumsey of the Bodleian Library, minor revisions were then made on 11 June to the text of questions 5, 6 and 18, and to the possible responses for questions 14 and 18, in order to add clarity and remove ambiguities. Question 20 was also moved to position 15, and the subsequent questions re-numbered (so that Question 18 is now Question 19, etc.). The Notes were also edited to update the information on DataFlow and to delete the description of SWORDv2, considered to be too specialized.

On 9th May 2013, some questions were slightly changed in wording, and others swapped in position and renumbered to make the flow of questions more logical, to match changes to the online data entry form at http://www.miidi.org/dmp/. Question 3 was swapped with question 4, and questions 8-12 were swapped with questions 13-20. Some of the exemplar responses were also revised to make them more useful.

A list of the original questions follows.

Original Twenty Questions published on 7 March 2012

1 What is the subject discipline (domain, field) to which your research data relates?

2 What is the exact nature (range, scope) of your research data?

3 When will your research data be collected?

4 When will your research data be processed and analysed?

5 Who owns the data arising from your research, and the intellectual property rights relating to them?

6 How will your research datasets be described, i.e. with what metadata or accompanying interpretive information will they be accompanied, and how will these metadata be created?

7 Where, and in what format(s), will you store your data in the short term after acquisition?

8 Who is responsible for the immediate day-to-day management, storage and backup of the data arising from your research?

9 How frequently and where will your research data be backed up for short-term data security?

10 With whom will you share your research data in the short term, before publication of any papers arising from their interpretation?

11 Why is access to your research data to be restricted in the short term (if indeed it is)?

12 To whom will you provide access to your research data in the long term, with what limitations as to re-use, and under what license arrangements.

13 Why is access to your research data to be restricted in the long term (if indeed it is)?

14 How (i.e. by what physical or electronic method) are your research datasets to be transferred from short-term storage under the local care of yourself or your research group to their long-term archival and Web publication destination under the curatorial care of a separate third-party, e.g. a data repository?

15 Where will your research data be archived for long-term preservation?

16 When will your research data be moved from your own local storage to a secure archive for long-term preservation (e.g. your institutional library’s data repository)?

17 Who has authority to decide which of your research data are NOT worth preserving and will be deleted?

18 Where will your research data be published for others to see?

19 When will your research data be published in this manner?

20 To whom will responsibility for the long-term preservation of your research data devolve, once you have left your present research group?

This document is licensed under a Creative Commons Attribution 3.0 Unported License.