Bioinformatics: Bringing it all together

Vol. 419, No. 6908 (17 October 2002)

Forget test tubes, petri dishes and pipettes. One of the few pieces of equipment that can be honestly labelled ubiquitous in biology today is the computer. Bioinformatics, the development and application of computational tools to acquire, store, organize, archive, analyse and visualize biological data, is one of biology's fastest-growing technologies.

Marina Chicurel is a science writer based in Santa Cruz.



Bioinformatics: Bringing it all together
technology feature
MARINA CHICUREL
doi:10.1038/419751a

Genome analysis at your fingertips
MARINA CHICUREL
doi:10.1038/419751b

Putting a name on it
MARINA CHICUREL
doi:10.1038/419755a

table of suppliers
doi:10.1038/419759a

17 October 2002

Nature 419, 751-757 (2002); doi:10.1038/419751a




Bioinformatics: Bringing it all together
technology feature

MARINA CHICUREL



Biologists at the bench studying small networks of genes want user-friendly tools to analyse their results and help them to plan experiments. They need accessible interfaces that allow them to search databases, and compare their data with those of others (see 'Genome analysis at your fingertips').

At the other end of the spectrum, researchers analysing whole genomes, and drug-discovery companies mining the genome for drug targets, want high-throughput analysis tools to accelerate genome annotation and extract information from databases in more efficient and sophisticated ways.

And all of those involved want more integration: integration of data across the hundreds, if not thousands, of different databases, and visual integration of data to aid interpretation. "The key to bioinformatics is integration, integration, integration," says bioinformatics expert Jim Golden at Curagen spin-off 454 Corporation in Branford, Connecticut.

"To answer most interesting biological problems, you need to combine data from many data sources," agrees Russ Altman, a biomedical informatics expert at Stanford University. "However, creating seamless access to multiple data sources is extremely difficult."

Standard currencies

One of the most insidious problems is the lack of standard file formats and data-access methods. But attempts to standardize them are gaining momentum. One success is the distributed annotation system (DAS), a standard protocol developed by Lincoln Stein at Cold Spring Harbor Laboratory in New York and his colleagues. "It's a simple solution to a simple but obvious problem," says Stein. "There was no standard way of exchanging sequence annotations."

DAS allows one computer to contact multiple servers to retrieve and integrate dispersed genomic annotations associated with a particular sequence, such as predicted introns and exons from one server and corresponding single-nucleotide polymorphisms (SNPs) from another. It handles the annotations as elements associated with a particular stretch of genomic sequence and so enables users to obtain a picture of that genome segment with all of its associated annotations. Many providers of genome data, including WormBase, FlyBase, the Ensembl server run by the European Bioinformatics Institute (EBI) and the Sanger Institute near Cambridge, UK, and the genome browser at the University of California, Santa Cruz, are currently running DAS servers.
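
The integration step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the DAS protocol itself: the two "servers" are in-memory lists rather than remote machines, and the names `Annotation`, `fetch` and `integrate`, as well as all coordinates, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    source: str  # which server supplied the feature
    kind: str    # e.g. "exon" or "SNP"
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive


# Hypothetical in-memory "servers"; a real client would contact each one remotely.
GENE_SERVER = [Annotation("genes", "exon", 100, 250), Annotation("genes", "exon", 400, 520)]
SNP_SERVER = [Annotation("snps", "SNP", 180, 180), Annotation("snps", "SNP", 900, 900)]


def fetch(server, start, end):
    """Return the annotations on one server that overlap [start, end]."""
    return [a for a in server if a.start <= end and a.end >= start]


def integrate(servers, start, end):
    """Collect overlapping annotations from every server, ordered by position."""
    merged = [a for server in servers for a in fetch(server, start, end)]
    return sorted(merged, key=lambda a: (a.start, a.end, a.kind))


segment = integrate([GENE_SERVER, SNP_SERVER], 150, 450)
for a in segment:
    print(f"{a.start:>4}-{a.end:<4} {a.kind:<5} from {a.source}")
```

Keying every annotation to a stretch of sequence is what lets features from unrelated providers be laid side by side without the providers agreeing on anything else.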

Reckoning that data providers will never agree on a universal standard for representing data, building database interfaces or writing access scripts, Stein thinks that web services such as DAS are the best route to interoperability. Data providers only have to agree on a small set of standards that define how their data and tools are presented to the outside world.

And a 'registry' can keep track of which data sources implement which services. Scripts for retrieving a particular type of data or operation consult the registry, as they would an address book, to determine which data sources to query. A project of this type is BioMOBY, led by Mark Wilkinson at the National Research Council in Saskatoon, Canada. BioMOBY will be a powerful exploration tool, he says, because apart from answering database queries, it will discover cross-references to other relevant data and applications. Betting on BioMOBY's potential, several groups are encouraging its development. "At the moment, we have the support of almost all of the model organism databases," says Wilkinson.
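
The address-book idea can be illustrated with a toy registry in Python. This is a sketch of the general pattern only and makes no claim about how BioMOBY is implemented; the class `Registry`, the service names and the sources `source_a` and `source_b` are all hypothetical.

```python
from typing import Callable, Dict, List


class Registry:
    """Toy address book mapping each service to the sources that offer it."""

    def __init__(self) -> None:
        self._services: Dict[str, Dict[str, Callable[[str], str]]] = {}

    def register(self, service: str, source: str, handler: Callable[[str], str]) -> None:
        """Record that `source` implements `service` through `handler`."""
        self._services.setdefault(service, {})[source] = handler

    def lookup(self, service: str) -> List[str]:
        """Return the names of the sources that implement `service`."""
        return sorted(self._services.get(service, {}))

    def query(self, service: str, term: str) -> Dict[str, str]:
        """Ask every registered source for `term` and collect the answers."""
        handlers = self._services.get(service, {})
        return {source: handlers[source](term) for source in sorted(handlers)}


registry = Registry()
# Hypothetical sources and handlers; real ones would call remote services.
registry.register("gene_description", "source_a", lambda g: f"{g}: description from A")
registry.register("gene_description", "source_b", lambda g: f"{g}: description from B")
registry.register("sequence", "source_a", lambda g: f"{g}: sequence from A")

print(registry.lookup("gene_description"))
print(registry.query("sequence", "geneX"))
```

A script written against the registry never hard-codes a data source, so new providers can be added without changing the scripts that consume them.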

Another indicator of the widespread desire for interoperability is the incorporation in February 2002 of the Interoperable Informatics Infrastructure Consortium (I3C). With 14 member organizations (including Sun Microsystems of Santa Clara, California; IBM of White Plains, New York; Millennium Pharmaceuticals and the Whitehead Institute for Biomedical Research, both in Cambridge, Massachusetts), I3C is not a standards body, but aims to develop and promote the adoption of common protocols.

To integrate the current set of non-standardized databases, researchers are relying on two main strategies: warehousing and federation. A warehouse is a central database where data from many different sources are brought together on one physical site. Entrez, the widely used search-and-retrieval system developed by the US National Center for Biotechnology Information in Bethesda, Maryland, is an example.

Access all areas

A popular tool is SRS produced by LION Bioscience of Heidelberg, Germany, which facilitates access to a wide range of biological databases using a warehouse-like strategy. SRS is used in the online genome portals maintained by Celera Genomics in Rockville, Maryland, and Incyte Genomics in Palo Alto, California, and is the core technology of tools sold by LION.

Federation, on the other hand, links different databases so that they appear to be unified to the end-user but are not physically integrated at a common site. A query engine takes a complicated question requiring access to multiple databases and divides it into subqueries that are sent to the individual databases. The answers are then reassembled and presented to the user. Aventis Pharmaceuticals in Strasbourg, France, for example, has adopted IBM's DiscoveryLink federating software to aid collaboration between its biologists and chemists in drug development.
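
The split-and-reassemble behaviour of a federated query can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a description of DiscoveryLink: the two dictionaries stand in for separate biology and chemistry databases, and every name and value in them is invented.

```python
# Two separate "databases" that are never merged (hypothetical contents).
BIOLOGY_DB = {"geneA": "proteinA", "geneB": "proteinB"}  # gene -> protein
CHEMISTRY_DB = {"proteinA": ["compound1", "compound2"], "proteinB": []}  # protein -> binders


def protein_for(gene):
    """Subquery 1: sent to the biology database."""
    return BIOLOGY_DB.get(gene)


def compounds_for(protein):
    """Subquery 2: sent to the chemistry database."""
    return CHEMISTRY_DB.get(protein, [])


def federated_query(gene):
    """Split one question into two subqueries and reassemble the answers."""
    protein = protein_for(gene)
    compounds = compounds_for(protein) if protein is not None else []
    return {"gene": gene, "protein": protein, "compounds": compounds}


print(federated_query("geneA"))
```

The user asks a single question ("which compounds bind the product of this gene?") and sees a single answer, while each database is consulted in place.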

Which approach to use and when is much debated. "Updating and maintaining local copies of external data collections in a warehouse is a major task," says bioinformatician Rolf Apweiler at the EBI's lab in Hinxton, UK. Federation avoids this because the data are accessed directly from the original source. But the bioinformatics databases you want to query must be accessible for programmatic queries over the Internet, and most are not, says Peter Karp, director of the bioinformatics research group at the non-profit research institute SRI International in Menlo Park, California. "It's like installing a state-of-the-art telephone exchange in a village without telephones."

Several projects combine the two approaches. On the industry side, IBM has set up a partnership with LION to integrate DiscoveryLink with SRS. Particularly ambitious is the public-domain Integr8 project led by Apweiler. His team aims to bring together some 25 major databases spanning a broad range of molecular data, from nucleotide sequences to protein function. "We're trying to make an integrative layer on top of it all so that you can easily zoom in on the sequence data linked to the gene, and then go to the genomic data, to the transcriptional data and to the protein sequences. You'll have a sort of magnifying glass," says Apweiler.

Knowledge is power

Structure prediction: modelling a sequence homolog in LION's SRS 3D. (Credit: LION Bioscience)

Smart systems that can answer complicated questions about different sorts of data are also on the move. "A knowledge base is a fancy word for a database that allows you to do really sophisticated queries," says bioinformatician Mark Yandell at the University of California, Berkeley. Such databases often rely on vocabularies known as 'ontologies' (see 'Putting a name on it') combined with frame-based systems, a way of representing data in computers as objects within a hierarchy. One frame, for example, could be called 'protein', with slots describing its relationships to other concepts, such as 'gene name', or 'post-translational modifications'. So when a user asks a question about a protein, frames make it easy to retrieve the name of the corresponding gene and the modifications the protein can undergo. If the user asks for literature references, ontologies make it possible to retrieve not only articles that include the protein name but also those about related genes or processes.
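
A frame with slots, placed in a hierarchy, can be sketched in Python as below. This is a minimal illustration of the frame-based idea rather than the design of any particular knowledge base; the class `Frame`, the frame 'kinase X' and its slot values are hypothetical, and only the slot names follow the examples in the text.

```python
class Frame:
    """A named object with slots; missing slots are inherited from the parent."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        """Look up a slot here first, then walk up the hierarchy."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(f"{self.name!r} has no slot {slot!r}")


# Hypothetical frames; the slot names mirror the examples in the text.
protein = Frame("protein", molecule_type="polypeptide")
kinase_x = Frame(
    "kinase X",
    parent=protein,
    gene_name="geneX",
    post_translational_modifications=["phosphorylation"],
)

print(kinase_x.get("gene_name"))
print(kinase_x.get("molecule_type"))
```

Because slots are inherited, a question about a specific protein can be answered partly from its own frame and partly from the more general frames above it.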

The Genome Knowledgebase, a collaborative project between Cold Spring Harbor Laboratory, the EBI and the Gene Ontology Consortium, will have, among other capabilities, the ability to make connections between disparate genomic data from different species. "We store things specific to a species but allow a patchwork of evidence from different species to weave together," says Ewan Birney, a bioinformatician at the EBI. So when users pose questions about a biological process, they will get answers that incorporate knowledge collected from various model organisms.

Knowledge bases are being developed for a wide variety of topics, but some researchers are sceptical about their future. Information scientist Bruce Schatz of the University of Illinois at Urbana-Champaign, for example, thinks that ontologies require too much expert effort to generate and maintain. "All ontologies are eventually doomed," he says. Instead, he favours a purely automated process of knowledge generation, such as concept-switching, which relies on analysing the contextual relationships between phrases to identify underlying concepts. Concept-switching algorithms, for example, allow users to start with a general topic, such as mechanosensation, and explore its 'concept space', zeroing in on specific terms such as the mechanosensory genes of a particular species.

Visualizing the genome

An essential component of bioinformatics is the ability to visualize retrieved data, especially complex data, in ways that aid their interpretation. "Integration and visualization are actually very closely related, because after you integrate information, the first thing you want to do is display it," says Altman. "They're both parts of the issue of taking information that's perfectly happy in a computer and turning it into information that a user is happy digesting cognitively."

Genome browsers are particularly powerful, as they provide a bounded framework, the genome sequence, onto which many different types of data can be mapped. The University of California, Santa Cruz, for example, maintains a browser where users can simultaneously view the locations of SNPs, predicted genes and mRNA sequences along a chosen genome stretch. "It's all about linking," says principal investigator David Haussler. "It's about having it all at your fingertips."

David Haussler: putting the picture together. (Credit: R.R. Jones)

Tools that compare genomes from different species are also proving their worth. The VISTA project, developed and maintained by the Lawrence Berkeley National Laboratory in Berkeley, California, allows biologists to align and compare large stretches of sequence from two or more species. "It gives you a graphical output where you see peaks of conservation and valleys of lack of conservation," says Edward Rubin, one of VISTA's developers.
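
The "peaks and valleys" picture can be approximated with a sliding-window identity score over two sequences that have already been aligned. The Python sketch below is a generic illustration and is not VISTA's algorithm; the function name `window_identity`, the window size and the toy sequences are assumptions made for the example.

```python
def window_identity(seq_a, seq_b, window):
    """Fraction of identical, non-gap positions in each sliding window.

    Both sequences must already be aligned to the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if window <= 0 or window > len(seq_a):
        raise ValueError("window must be between 1 and the alignment length")
    scores = []
    for i in range(len(seq_a) - window + 1):
        pairs = zip(seq_a[i:i + window], seq_b[i:i + window])
        matches = sum(1 for x, y in pairs if x == y and x != "-")
        scores.append(matches / window)
    return scores


# Toy aligned sequences, invented for the example.
profile = window_identity("ACGTACGT", "ACGTTTTT", window=4)
print(profile)
```

Plotting such a profile along the sequence gives high values where the two species agree and low values where they have diverged.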

Spotfire of Somerville, Massachusetts, sells software that can transform all sorts of data into images. Using Spotfire's DecisionSite, researchers at Monsanto in St Louis, Missouri, represented as a 'heat map' the results of complex experiments that tracked changes in the expression of thousands of genes and the concentrations of numerous metabolites during maize development. It helped them to link the expression of certain genes to the presence or absence of particular amino acids. "A lot of times it's through comparisons and comparisons and comparisons that researchers see an interesting trend," says David Butler, vice-president of product strategy at Spotfire.

Biologists are moving closer to their dream of data integration. But open issues remain. Schatz worries that if public support doesn't increase, industry may come to dominate the field, providing suboptimal solutions for scientists. "If a Celera-like company starts doing this kind of activity and they get bought by Microsoft, which is an entirely possible activity in the world at large, then it will be too late. And then scientists will get whatever the major customers of Microsoft want," he says.

But Celera's director of scientific content and analysis, Richard Mural, advocates a centralized, industry-based solution to integration and genome annotation. He notes that there are few rewards for academic researchers for working on such problems, and their focused interests can be hard to reconcile with a global approach. "To really get it done quickly and well, I think the commercial may be a stronger model," he says.

However these issues are resolved, the road ahead looks bright. "Ninety-nine percent of bioinformatics is new stuff," says Haussler. "It's an enormous frontier."

Edward Rubin takes a graphical view. (Credit: Roy Kaltschmidt/LBL)

Distributed annotation system
http://biodas.org

Interoperable Informatics Infrastructure Consortium
http://www.i3c.org

University of California, Santa Cruz, genome browser
http://genome.ucsc.edu

Genome Knowledgebase
http://www.genomeknowledge.org

Entrez system
http://www.ncbi.nlm.nih.gov/Entrez

Ensembl genome browser
http://www.ensembl.org

VISTA
http://www-gsd.lbl.gov/vista



17 October 2002

Nature 419, 751-752 (2002); doi:10.1038/419751b

Genome analysis at your fingertips

MARINA CHICUREL




The working biologist now has an enormous number of options when it comes to bioinformatics tools. On one hand, there is a lot of free high-quality software in the public domain. On the other, researchers can buy commercial products offering added features, such as programs to streamline sequential tasks, to access proprietary databases and to enhance data security. And because software producers realize that users' needs change and their products will rarely be used in isolation, flexibility and modularity are on the rise.

An important trend has been the increasing integration and sophistication of tools available to non-experts. A wide range of user-friendly packages incorporating tools for nucleotide and protein sequence analysis are available from companies such as MiraiBio, a Hitachi Software Engineering subsidiary based in Alameda, California; DNASTAR in Madison, Wisconsin; InforMax in Bethesda, Maryland; and Accelrys in San Diego, California. On the non-commercial side, the Biology WorkBench maintained by the Supercomputer Center at the University of California, San Diego, is particularly popular, offering more than 80 bioinformatics tools to more than 10,000 registered users. "It's a one-stop-shop for doing a lot of things," says lead developer Shankar Subramaniam. "You can be sitting in front of any type of computer; as long as you have a web browser, you can access it."

InforMax's BioAnnotator uses locally stored databases to find protein motifs. (Credit: InforMax)

Software has also become more user-friendly. Back in the early 1990s, users of the GCG Wisconsin package, the grandfather of molecular-biology packages (now sold by Accelrys), had to work with UNIX-based systems. Although these systems are still preferred by some, users can now point-and-click their way through a wide range of tasks on ordinary desktop computers.

Another trend is the increased integration of data analysis with experimental design. The needs of bench scientists don't always coincide with those of professional bioinformaticians producing tools for whole-genome analyses. Genome projects require programs that can efficiently, if not very accurately, process huge amounts of sequence data, but the biologist in the lab is often interested in studying small sets of genes and their products with very high precision. Last month, for example, InforMax released GenomBench, a tool that allows users to predict the structure of genes and their splice variants, progressively refine these predictions, and then design experiments to validate them. "It's an interactive tool that can work with researchers not just to analyse the data they have, but to design the right experiment to resolve ambiguities in the data," says Steve Lincoln, senior vice-president of life-science informatics at the company.

Others are hooking up their software to catalogues of reagents. As just one example, the genome browser run by the University of California, Santa Cruz, is being used in a collaboration with the National Cancer Institute in Bethesda, Maryland, to identify new genes to expand, and ultimately complete, the Mammalian Gene Collection, a set of cDNA clones of expressed genes for human and mouse. The browser will be linked to the collection's website, so that users can go straight from analysing an electronic representation of a gene to ordering a clone.

A key trend in the development of commercial products is the emergence of workflows, automated chains of operations that can dramatically increase analysis throughput. For example, software producer geneticXchange of Menlo Park, California, recently demonstrated a workflow that sorts gene-expression data generated by microarrays, looks up the accession numbers that identify the selected genes, collects sequence information from the US National Center for Biotechnology Information's UniGene database, gathers annotation information from the LocusLink website, and goes to Medline to assemble a list of relevant references. "You just hit a button and it does what might take a biologist 600 hours to do, in about five hours," says Mark Haselup, chief technical officer for the company.
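
A workflow of this kind is, at heart, a chain of functions in which each step consumes the output of the one before. The Python sketch below illustrates that pattern only; it is not geneticXchange's software, and the lookup tables, probe names, accession numbers and threshold are all invented stand-ins for the external databases mentioned above.

```python
# Hypothetical lookup tables standing in for external databases.
ACCESSIONS = {"probe1": "ACC001", "probe2": "ACC002", "probe3": "ACC003"}
REFERENCES = {"ACC001": ["ref A", "ref B"], "ACC003": ["ref C"]}


def select_changed(expression, min_fold=2.0):
    """Step 1: keep probes whose fold change meets the threshold."""
    return sorted(p for p, fold in expression.items() if fold >= min_fold)


def to_accessions(probes):
    """Step 2: map each selected probe to its accession number."""
    return [ACCESSIONS[p] for p in probes if p in ACCESSIONS]


def collect_references(accessions):
    """Step 3: gather the reference list for each accession."""
    return {acc: REFERENCES.get(acc, []) for acc in accessions}


def run_workflow(data, steps):
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step(data)
    return data


expression = {"probe1": 3.1, "probe2": 1.2, "probe3": 2.0}
report = run_workflow(expression, [select_changed, to_accessions, collect_references])
print(report)
```

Because the steps are ordinary functions held in a list, a step can be swapped or inserted without touching the others, which is the modularity such products aim for.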

Some commercial products are valuable because they're linked to otherwise unavailable proprietary data. One of the main selling points of the Celera Discovery System, for example, is the access it provides to the biotech firm's high-quality human and mouse genome annotations. Unlike many other collections of annotations, a high proportion of Celera's have been generated by manual curation (see 'Putting a name on it').

Commercial products often provide greater security for those who don't wish to manipulate their unpublished or unpatented results openly over the Internet. Although some public sites offer a degree of security, commercial packages usually have more protection options and can be operated behind a firewall.

But the recurrent theme in the design of bioinformatics tools is the trend towards increased integration. The Discovery Studio Gene package recently launched by Accelrys is a case in point. "Results are put into a project database that has the ability to be accessed by a set of applications that span both chemistry and biology," says Scott Kahn, senior vice-president of life science at Accelrys. "We set up the ability to collaborate between domains."

Biology WorkBench
http://workbench.sdsc.edu


17 October 2002

Nature 419, 755 (2002); doi:10.1038/419755a

Putting a name on it

MARINA CHICUREL




A chasm separates sequence data from the biology of organisms, and genome annotation will be the bridge, says Lincoln Stein, a bioinformatics expert at Cold Spring Harbor Laboratory in New York. Spanning three main categories (nucleotide sequence, protein sequence and biological process), annotation is the task of adding layers of analysis and interpretation to the raw sequences. The layers can be generated automatically by algorithms or meticulously built up by experts in the hands-on process of manual curation.

Because manual curation is time-consuming and genome projects are generating data, and even changing data, at an extraordinary pace, there is a strong motive to shift as much of the burden as possible to automated procedures. A major task in the annotation of genomes, especially large ones, is finding the genes. There are numerous gene-prediction algorithms that combine statistical information about gene features, such as splice sites, or compare stretches of genome sequence to previously identified coding sequences, or combine both approaches. A new type of algorithm, called a dual-genome predictor, uses data from two genomes to locate genes by identifying regions of high similarity.

Each algorithm has its strengths and limitations, working better with certain genes and genomes than with others. The GENSCAN gene-predicting algorithm, developed by Chris Burge at the Massachusetts Institute of Technology, has become a workhorse for vertebrate annotation and was one of the algorithms used in the landmark publications of the draft human genome sequence. FGENESH, produced by software firm Softberry of Mount Kisco, New York, proved particularly useful for the Syngenta-led annotation of the rice genome sequence.

Good data preparation is also important. "A lot of the magic happens in the environment, not the algorithm," says Ewan Birney, a bioinformatician at the European Bioinformatics Institute (EBI) in Hinxton, near Cambridge, UK. "People often focus on the whizzy technology to the detriment of the real smarts, which happen in the sanitization of data to present them to a hard-core algorithm." Data sanitization includes steps such as masking repetitive sequences, which can interfere with an algorithm's performance.
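
Masking can be as simple as overwriting known repeat regions with a placeholder character before the sequence is passed to a prediction program. The Python sketch below assumes the repeat coordinates are already known and shows only the masking step; the function name `mask_intervals`, the use of "N" as the mask character and the toy sequence are assumptions for the example, not the procedure of any specific annotation pipeline.

```python
def mask_intervals(sequence, intervals, mask_char="N"):
    """Replace each 0-based, half-open interval [start, end) with mask_char."""
    chars = list(sequence)
    for start, end in intervals:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)


# Toy sequence with a dinucleotide repeat at positions 4-11 (invented data).
masked = mask_intervals("ACGTACACACACGGTA", [(4, 12)])
print(masked)
```

The masked sequence keeps its original length, so any coordinates a downstream algorithm reports still refer to the unmasked genome.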

Lincoln Stein: bridging the gap. (Credit: Bill Geddes)

Automated annotation: Ewan Birney and Ensembl. (Credit: Heikki Lehvaslaiho)

All current large-scale efforts involve a combination of automatic and manual approaches. "For me it's quite clear that they can only be complementary," says Rolf Apweiler at the EBI, who leads annotation for the major protein databases SWISS-PROT and TrEMBL. "You can't automate anything without having manual reference sets that you can rely on."

While Apweiler is tackling large-scale annotation, others are concentrating on finding genes and proteins linked to a particular process, such as a disease. The bioinformatics and drug-discovery company Inpharmatica in London, for example, provides annotation databases and tools to identify potential drug targets.

Because of the plethora of different names given to the same genes and proteins in different organisms, a growing trend is the use of 'ontologies': controlled vocabularies in which descriptive terms (such as gene and protein names) and the relationships between them are consistently defined. One ontology that is now widely adopted is the Gene Ontology (GO), but it doesn't cover all biology, and others have developed their own, often complementary, ontologies. BioWisdom in Cambridge, UK, for example, sells information-retrieval and analysis tools for drug discovery based on proprietary ontologies in fields such as oncology and neuroscience.
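
The two ingredients of a controlled vocabulary, preferred names and defined relationships, can be sketched in Python as follows. The vocabulary here is purely illustrative and is not copied from GO or any other ontology; the synonym table, the `IS_A` links and the function names are assumptions made for the example.

```python
# Illustrative vocabulary; the names and links are not copied from GO.
SYNONYMS = {"genex": "geneX", "gx1": "geneX", "protein-x": "geneX"}
IS_A = {
    "protein kinase": "kinase",
    "kinase": "enzyme",
}


def canonical(name):
    """Map any known synonym (case-insensitive) to its preferred term."""
    return SYNONYMS.get(name.lower(), name)


def ancestors(term):
    """Follow is_a links upwards and return the broader terms in order."""
    result = []
    while term in IS_A:
        term = IS_A[term]
        result.append(term)
    return result


print(canonical("GX1"))
print(ancestors("protein kinase"))
```

Resolving synonyms first and then walking the relationships is what allows a search on one name to return records filed under another name or under a broader category.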

Working as part of the Alliance for Cellular Signaling, a team led by Shankar Subramaniam is developing an ontology that captures the different states of a protein, such as phosphorylation state. This will serve as a foundation for the Molecule Pages, a literature-derived database of signalling molecules and their interactions.

GO coordinator Midori Harris at the EBI and her colleagues are encouraging developers of new ontologies to make them publicly available through GO's website. They hope this will not only drive standardization, but will help to expand GO's capabilities by allowing the creation of combinatorial terms derived from different ontologies.

But most researchers agree that tools are only part of the solution. "The passion for biology often gets missed out here," says Birney. "People think it is all about finding technical solutions that magically solve problems, but frankly, far more important is really wanting to see the data hang together."

Gene Ontology Consortium
http://www.geneontology.org

European Bioinformatics Institute
http://www.ebi.ac.uk

Alliance for Cellular Signaling
http://www.afcs.org




17 October 2002

Nature 419, 759-761 (2002); doi:10.1038/419759a

table of suppliers



Company | Products/activity | Location | URL

Sequence, genome and gene-expression analysis

Accelrys | GCG Wisconsin package for sequence and genome analysis; Discovery Studio for database mining, genomics and proteomics | San Diego, California | http://www.accelrys.com
Affibody | Software for genomics data analysis and management | Bromma, Sweden | http://www.affibody.com
Aneda | Desktop bioinformatics tools for genomics and proteomics | Roslin, UK | http://www.anedabio.com
ApoCom Genomics | Desktop bioinformatics tools for gene prediction and gene-expression analysis | Knoxville, Tennessee | http://www.apocom.com
Array Genetics | Protein information database; tools for genomics and proteomics | Newtown, Connecticut | http://www.arraygenetics.com
BIOBASE | TRANSFAC family of databases; analysis tools for gene expression, promoters and signalling pathways; contract bioinformatics | Wolfenbüttel, Germany | http://www.biobase.de
Biocomputing | Data-management systems for genotyping and phenotype data | Espoo, Finland | http://www.biocomputing.fi
Bioinformatics Solutions | Desktop bioinformatics tools for sequence analysis and structure prediction | Waterloo, Canada | http://www.bioinformaticssolutions.com
BioTools | Analysis software for gene and protein sequences and chromatograms | Edmonton, Canada | http://www.biotools.com
Cognia | Bioinformatics software, including BIOBASE software and databases | New York, New York | http://www.cognia.com
Curagen | GeneScape portal for genome analysis tools | Branford, Connecticut | http://www.curagen.com
Digital Gene Technologies | TOGA gene-expression analysis software | La Jolla, California | http://www.dgt.com
DNASTAR | Desktop sequence-analysis and genome visualization software | Madison, Wisconsin | http://www.dnastar.com
Entigen | BioNavigator platform for sequence and genome analysis | Sunnyvale, California | http://www.entigen.com
GATC Biotech | Accelrys, DNASTAR and other bioinformatics software; DNA sequencing | Constance, Germany | http://www.gatc-biotech.com
Gene Codes | Sequencher sequence assembly and analysis software | Ann Arbor, Michigan | http://www.genecodes.com
Gene-IT | Universal software for database management and genomics | Le Chesnay, France | http://www.gene-it.com
Genomatix | Genome and sequence analysis tools; portals to mouse and human genomes | Munich, Germany | http://www.genomatix.de
Genomic Solutions | Proteomics bioinformatics tools | Ann Arbor, Michigan | http://www.genomicsolutions.com
Geospiza | Servers and tools for sequence assembly and analysis | Seattle, Washington | http://www.geospiza.com
Hitachi Software Engineering | DNASIS desktop bioinformatics software for DNA sequence assembly and analysis, and analysis of microarray data | Yokohama, Japan | http://www.hitachi-sk.co.jp/English/index.html
Inpharmatica | Biopendium and CeleraEdition Biopendium proteome annotation resources; PharmaCarta large-scale discovery informatics platform | London, UK | http://www.inpharmatica.com
InforMax | Vector bioinformatics software for sequence, genome and microarray data; Vector NTI for Macintosh; LabShare for data storage and management | Bethesda, Maryland | http://www.informaxinc.com
Iobion Informatics | GeneTraffic microarray data-management and analysis software | La Jolla, California | http://www.iobion.com
iSenseIt | Microarray data analysis and storage software; oligonucleotide computation | Bremen, Germany | http://www.isenseit.com
LabBook | eLabBook web-enabled electronic notebooks; annotated human genome database and data-mining tools | McLean, Virginia | http://www.labbook.com
LabVelocity | Jellyfish desktop bioinformatics software; information services | San Francisco, California | http://www.labvelocity.com
LION Bioscience | Bioinformatics software, database development and management; DiscoveryCenter platform for data integration; contract bioinformatics | Heidelberg, Germany | http://www.lionbioscience.com
MiraiBio | DNASIS desktop software for DNA sequence assembly and analysis, protein sequence analysis, and analysis of microarray data | Alameda, California | http://www.miraibio.com
Molecular Biology Insights | Oligonucleotide identification software | Cascade, Colorado | http://www.oligo.net
Paracel | Software for sequence assembly, analysis and sequence-based genotyping | Pasadena, California | http://www.paracel.com
Premier Biosoft | Desktop bioinformatics packages for sequence analysis, primer design, and two-hybrid protein interactions | Palo Alto, California | http://www.premierbiosoft.com
PubGene | PubGene public access and commercial gene databases and analysis software | Oslo, Norway | http://www.pubgene.com
Redasoft | Genetic mapping and sequence analysis software and REBASE restriction enzyme database | Toronto, Ontario | http://www.redasoft.com
Rosetta BioSoftware | Rosetta Resolver gene-expression data analysis system | Kirkland, Washington | http://www.rii.com
Silicon Genetics | MetaMine, GeNet and GeneSpring microarray analysis software | Redwood City, California | http://www.sigenetics.com
science factory | BRENDA enzymology database; überTOOL bioinformatics platform for sequence, expression and structural data | Cologne, Germany | http://www.science-factory.com
Softberry | Software for sequence and genome analysis and database searching | Mount Kisco, New York | http://www.softberry.com
Southwest Parallel Software | Bioinformatics software packages | Albuquerque, New Mexico | http://www.spsoft.com
Textco | Desktop bioinformatics packages and electronic lab notebook | West Lebanon, New Hampshire | http://www.textco.com
X-MINE | Bioinformatics platform for storage and analysis of genomics data | Brisbane, California | http://www.XMine.com

Databases




Beilstein
Information

Chemical databases

San Leandro,
California

http://www.beilstein.com

Biomax Informatics

Annotated human
genome database;
customized data
management

Martinsried,
Germany

http://www.biomax.de

BioWisdom

Text search and
pharmacology and
oncology informa
tion
databases

Cambridge,
UK

http://www.biowisdom.com

Celera Genomics

Web
-
based tools for
accessing the Celera
annotated genomes
databases; bioinformatics
services

Rockville,
Maryland

http://www.celera.com

Compugen

GenCarta annotated
human genome,
transcriptome and
proteome database

Tel
-
Aviv,
Israel

http://www.cgen.com

DECODON

Software for 2D
-
gel
analysis and in
formation
storage

Greifswald,
Germany

http://www.decodon.de


GeneLogic

Gene
-
expression
databases and software
for drug discovery

Gaithersburg,
Maryland

http:
//www.GeneLogic.com

Iconix

DrugMatrix databases
and software platform for
chemogenomics research

Mountain
View,
California

http://www.iconixpharm.com

Incyte Genomics

Annotated gene and
expressed sequence tag
databases; Proteome
BioKnowledge Library
protein information
databases; bioinformatics
software

Palo Alto,
California

http://www.incyte.com

Lexicon Genetics

Gene knockout and gene
function databases
and
bioinformatics for drug
discovery

The
Woodlands,
Texas

http://www.lexgen.com

LifeSpan
BioSciences

Gene-expression and
protein-localization
databases and data-mining tools

Seattle,
Washington

http://www.lsbio.com

MDL

Biological and chemical
information databases;
data-management
software

San Leandro,
California

http://www.mdli.com

Structural
Bioinformatics

Protein and protein-structure
structure databases;
computational proteomics

San Diego,
California

http://www.strubix.com

Computer systems, middleware and laboratory information management systems (LIMS)




Amersham
Biosciences

Scierra Laboratory
Workflow System for
microarray and
sequencing data

Piscataway,
New Jersey

http://www.amershambiosciences.com


CLONDIAG

PARTISAN microarray
LIMS

Jena, Germany

http://www.clondiag.com

geneticXchange

K1 System middleware
platform for biological
data integration

Menlo Park,
California

http://www.geneticxchange.com

HeliXense

Software and system
infrastructure supporting
large-scale distributed
computing and biological
data management

Singapore

http://www.helixense.com

IBM

DiscoveryLink platform
for database integration;
data-management systems

White Plains,
New York

http://www.ibm.com/solutions/lifesciences

Mitsui Knowledge
Industry

LIMS; software for
membrane protein
secondary-structure
prediction, data
management and analysis
of gene-expression and
SNP data

Tokyo, Japan

http://bio.mki.co.jp

NEC

Computer systems and
networks

Tokyo, Japan

http://www.nec-global.com

Protedyne

LIMS middleware for
integration of network-enabled
laboratory
software

Martinsried,
Germany

http://www.protedyne.com

Silicon Graphics

SGI servers for high-throughput
computing, visualization and
data management

San Francisco,
California

http://www.sgi.com

Sun Microsystems

Servers and workstations
for high-throughput
computing; universal
software platforms for
networks

Santa Clara,
California

http://www.sun.com

TimeLogic

DeCypher system for
accelerated
bioinformatics

Crystal Bay,
Nevada

http://www.timelogic.com

TurboWorx

Open computational
platforms for biological
research data including
bioinformatics

New Haven,
Connecticut

http://www.turbogenomics.com

Services




Aber Genomic
Computing

Design of data-mining
and predictive modelling
software

Aberystwyth,
UK

http://www.abergc.com

AGOWA

Genome and expressed
sequence tag analysis;
automated sequence
annotation; customized
bioinformatics services

Berlin,
Germany

http://www.agowa.de

BioInformatics
Services

Computational biology;
bioinformatics services

Rockville,
Maryland

http://www.bioinformaticsservices.com

Chemical
Computing Group

Bioinformatics software,
services and computer-aided
molecular design

Montreal,
Quebec,
Canada

http://www.chemcomp.com

Cyberell

Bioinformatics software
and services

Helsinki,
Finland

http://www.cyberell.com

ePitope Informatics

Epitope prediction over
the web

Durham, UK

http://www.epitope-informatics.com

GeneData

Bioinformatics systems
and services; database
development and
management

Basel,
Switzerland

http://www.genedata.com

Genometrix

Genotyping, gene
expression and
bioinformatics services

The
Woodlands,
Texas

http://www.genometrix.com

Keygene

DNA fingerprint analysis
software; contract
genomics and
bioinformatics services

Wageningen,
The
Netherlands

http://www.keygene.com

NuGenesis

Scientific data
management services

Westborough,
Massachusetts

http://www.nugenesis.com

Sagitus Solutions

Bioinformatics software
development

Manchester,
UK

http://www.sagitussolutions.co.uk

SRI International

Contract informatics
services

Menlo Park,
California

http://www.sri.com

Tripos

Chemical libraries;
molecular modelling,
pharmacophore
perception and virtual
screening software;
contract informatics

St Louis,
Missouri

http://www.tripos.com

General




ALMA
Bioinformatica

Bioinformatics software,
consultancy and training

Madrid, Spain

http://www.almabioinfo.com

Applied Maths

Gel fingerprint analysis
and bioinformatics
software; contract
bioinformatics

Kortrijk,
Belgium

http://www.applied-maths.com

Bio-Rad

WorksBase
bioinformatics software
for proteomics

Hercules,
California

http://www.discover.bio-rad.com

BioSolveIt

Software for molecular
modelling, small-molecule
docking, protein
threading; bioinformatics
services and training
St Augustin,
Germany

http://www.biosolveit.de

Dalicon

Bioinformatics software
for large-scale data
management and analysis

Nijmegen, The
Netherlands

http://www.dalicon.com

MegaMetrics

Data-mining software for
microarray, proteomics
and SNP databases

Wyndmoor,
Pennsylvania

http://www.megametrics.com

Molecular Mining

Data-mining software

Kingston,
Ontario,
Canada

http://www.molecularmining.com

Partek

Pattern recognition and
interactive visualization
software; consulting
services

St Charles,
Missouri

http://www.partek.com

Spotfire

DecisionSite analytical
and statistical data-management
software

Somerville,
Massachusetts

http://www.spotfire.com

SPSS

Clementine statistical and
data-mining software;
Clementine microarray
application template

Chicago,
Illinois

http://www.spss.com

Zeptosens

SensiChip microarray
systems

Witterswil,
Switzerland

http://www.zeptosens.com