Slides - In-Silico Analysis of Proteins


Feb 20, 2013


Making research findings visible

The future of the scientific paper


Matthew Cockerill

Publisher, BioMed Central



"There is nothing more amusing than
watching business interests work
themselves up into a righteous frenzy over
a threat to their monopoly profits from a
new technology or some upstart with a
different business model. Invariably, the
monopolists… try to present themselves as
champions of the consumer, or defenders of
a level playing field, as if they hadn't
become ridiculously rich by sticking it to
consumers and enjoying years in which the
playing field was tilted to their advantage."


Steven Pearlstein in the Washington Post, July 19 2006



Status of open access publishing

Momentum for transition to OA


We are seeing action (not just words) from
funding agencies and governments


Wellcome and several UK research councils now require OA deposit as a condition of grants


Federal Research Public Access Act may do the same in US


OA journals continue to grow rapidly


Impressive impact factors demonstrate OA
and quality are absolutely compatible


Move to OA basically unstoppable


Growth of OA

[Chart: rolling 28-day count of submissions to BioMed Central journals, Jul-00 to Jul-06; vertical axis: submissions, 0 to 1400]

Impact factors



Genome Biology: IF 9.71

BMC Bioinformatics: IF 4.96

BMC Genomics: IF 4.09


Genome Biology is:

10th of 124 in GENETICS & HEREDITY

4th of 139 in BIOTECHNOLOGY & APPLIED MICROBIOLOGY

What does this mean for the
future of the scientific article?

Why did we start BioMed Central
as an open access publisher?


Limited access to research articles makes
further research needlessly inefficient


Barriers to access obstruct interdisciplinary cross-fertilization



It is in the interest of researchers for their
research to be read and cited as widely as
possible


Traditional scientific publishing is not an
effective market, and so high serials prices
mean a poor deal for the scientific community

The main reason we started
BioMed Central


Publications and data are a continuum


Publications include data


Publications are data


To make sense of data and publications
delivered by post-genomic science, we need:


The best possible tools


The widest possible collection of raw material


Open access stimulates the creation of tools
by providing access to the raw material



The future of the scientific
article


Computers will be at least as
important as human readers

Text mining


Open access facilitates text mining


The BioMed Central XML corpus of full-text
articles is freely downloadable


The more semantics that are
captured in the XML, the richer the
possibilities for mining
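
As a rough illustration of why richer XML semantics help mining, the sketch below parses a small article fragment and collects the entities that have been tagged in it. The element and attribute names (entity, type, id) and the identifiers are hypothetical; they are not BioMed Central's actual XML schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical article fragment; tag names, attributes and ids are illustrative only.
ARTICLE_XML = """
<article>
  <title>Example article</title>
  <body>
    <p>Mutations in <entity type="gene" id="GENE:0001">BRCA1</entity> are linked to
    <entity type="disease" id="DIS:0042">breast cancer</entity>.</p>
    <p><entity type="gene" id="GENE:0001">BRCA1</entity> interacts with
    <entity type="gene" id="GENE:0002">BARD1</entity>.</p>
  </body>
</article>
"""

def collect_entities(xml_text):
    """Group the distinct tagged entities in an article by their type."""
    root = ET.fromstring(xml_text)
    found = defaultdict(set)
    for node in root.iter("entity"):
        # Each entity is recorded once as an (identifier, surface text) pair.
        found[node.get("type")].add((node.get("id"), node.text))
    return {kind: sorted(items) for kind, items in found.items()}

entities = collect_entities(ARTICLE_XML)
```

With markup like this, a mining tool can read identifiers directly instead of guessing them from free text.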



Existing examples of automated
sifting of published research

Postgenomic

CiteULike

This is just bibliographic information, but it's a start

Semantic enrichment


Ensure that the rest of the knowledge
represented in scientific articles is
structured to be computer-readable


Ideally capture semantics
unambiguously at time of publication


Mining of free text is a stopgap/fall-back


It is not just articles that need semantic
enrichment, but data sets too


Appropriate standards are now emerging


RDF


Useful common technical standard
for expressing semantics


Subject-predicate-object triples


BioMed Central already exposes
bibliographic RDF for all articles


Tools like PiggyBank can
capture RDF and then store it in
triple-stores (local or networked)
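
To make the subject-predicate-object idea concrete, here is a minimal sketch of an in-memory triple store in plain Python. It is not an RDF library and not BioMed Central's actual RDF; every identifier in it (article:123, person:jsmith and so on) is a made-up example.

```python
# Each fact is a (subject, predicate, object) tuple.
# All identifiers are hypothetical examples, not real bibliographic RDF.
TRIPLES = {
    ("article:123", "dc:title", "An example article"),
    ("article:123", "dc:creator", "person:jsmith"),
    ("article:456", "dc:creator", "person:jsmith"),
    ("person:jsmith", "foaf:name", "J. Smith"),
}

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )

# Which articles list person:jsmith as a creator?
articles_by_smith = [
    s for s, _, _ in match(TRIPLES, predicate="dc:creator", obj="person:jsmith")
]
```

Because every fact has the same three-part shape, facts captured from different sources can be pooled in one store and queried with the same pattern matching.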

Semantic Laundry List


Scientific stuff


Genes


Proteins


Anatomy


Taxonomy


Small molecules/drugs


Macromolecules


Diseases


Experimental methodologies


Experimental data types


General stuff


People, Places, Organizations, Relationships



NCBO

Example of enriched research



Neurocommons.org


A ScienceCommons project


Working with open access articles
from BioMed Central and PLoS


Attempting to define best
practices/gold standard for
semantic enrichment of articles


Text mining and enhanced
authoring tools both have a role



The role of wikis


The challenge: ontologies, to be useful,
must stay up-to-date and receive
ongoing maintenance and curation


Scope of problem is enormous: every
entity and relationship of relevance to
science


Wikis provide a promising approach,
perhaps the only viable approach


e.g. AuthorIDs

Projects at BioMed Central to
capture structured info


Case reports


Clinical trials


Biological processes


Chemical structures


Taxonomic descriptions


Publishing research articles in a more structured form
allows the results to be treated as a database
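
A minimal sketch of what "treated as a database" could mean in practice, assuming case reports are captured with a few structured fields. The table layout, field names and sample rows are hypothetical placeholders, not an actual BioMed Central schema or real data.

```python
import sqlite3

# In-memory database standing in for a store of structured case reports.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE case_report ("
    "id INTEGER PRIMARY KEY, condition TEXT, drug TEXT, outcome TEXT)"
)
# Placeholder rows; a real system would fill these from structured submissions.
conn.executemany(
    "INSERT INTO case_report (condition, drug, outcome) VALUES (?, ?, ?)",
    [
        ("condition A", "drug X", "recovered"),
        ("condition A", "drug Y", "no change"),
        ("condition B", "drug X", "recovered"),
    ],
)

def reports_for(condition):
    """Return (drug, outcome) pairs for every report on the given condition."""
    rows = conn.execute(
        "SELECT drug, outcome FROM case_report WHERE condition = ? ORDER BY id",
        (condition,),
    )
    return rows.fetchall()

matches = reports_for("condition A")
```

Once the fields are structured, a question such as "which reports concern this condition?" becomes a one-line query rather than a literature search.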


Structured authoring

Publicon: an experiment in
structured authoring

Benefits of structure

Live maths in articles


Problem: adding structure is a hassle

Incentivize authors


Ideally, create structured
authoring tools that remove work
rather than add it (e.g. EndNote)


If you do create extra work for
authors, find a way to provide the
author with an immediate return
on investment

Reduce work: smart authoring

e.g. auto-suggest


Standard way to disambiguate contacts


Why not chemicals, genes, species too?


Unambiguously capture semantics


Increase accuracy, save time, encourage uptake
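
The auto-suggest idea can be sketched as a prefix lookup against a controlled vocabulary, so that the author picks an unambiguous identifier while typing. The vocabulary and its identifiers below are hypothetical; a real tool would draw on a maintained registry or ontology.

```python
# Hypothetical controlled vocabulary: display name -> identifier.
VOCABULARY = {
    "Homo sapiens": "taxon:0001",
    "Homo neanderthalensis": "taxon:0002",
    "Mus musculus": "taxon:0003",
}

def suggest(prefix, vocabulary=VOCABULARY, limit=5):
    """Return up to `limit` (name, identifier) pairs whose name starts with the prefix."""
    needle = prefix.casefold()
    hits = [
        (name, ident)
        for name, ident in vocabulary.items()
        if name.casefold().startswith(needle)
    ]
    return sorted(hits)[:limit]

# The author types "homo" and chooses one of the suggestions;
# the manuscript then stores the identifier, not just the text.
options = suggest("homo")
```

The author does less typing, and the stored identifier removes the ambiguity that free text would leave for later mining.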


Return on investment


Automatic update of meta-analysis
based on clinical trial data


Automatic list of closely-related
case reports from database


Automatic deposit of taxonomic
information in registry (Zoobank)
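
As one possible reading of "automatic update of meta-analysis", the sketch below recomputes a pooled estimate whenever a new structured trial result arrives. It assumes fixed-effect inverse-variance pooling, which is one common method and is not specified in the slides; the function name and the trial figures are invented for illustration.

```python
import math

def pooled_effect(trials):
    """Fixed-effect inverse-variance pooling of (effect, standard_error) pairs.

    Returns (pooled estimate, standard error of the pooled estimate).
    """
    weights = [1.0 / se ** 2 for _, se in trials]
    total = sum(weights)
    estimate = sum(w * eff for w, (eff, _) in zip(weights, trials)) / total
    return estimate, math.sqrt(1.0 / total)

# Invented example results: (effect size, standard error) per trial.
trials = [(0.30, 0.10), (0.10, 0.20)]
before = pooled_effect(trials)

# A newly published trial is deposited in structured form;
# the pooled estimate can be refreshed without manual re-extraction.
trials.append((0.20, 0.10))
after = pooled_effect(trials)
```

The point is the workflow rather than the statistics: if trial results are published as structured data, the update step needs no human to re-read the paper.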

Q & A