Materials and Methods - IGBMC


ImAnno is a web-based integrated system for annotating, managing, visualising and querying gene expression information from in situ hybridisation images.

Annotation

A first interactive tool provided by the web site allows the user to annotate genes. To do so, the user first defines a list of tissues of interest. This tissue list is specific to each project: 20 tissues for the Eye annotation done by Pascal Dollé (e.g. retina outer layer, lens anterior epithelium, eyelid mesenchyme, …), 36 for the Teeth annotation done by Agnès Bloch-Zupan (e.g. oral epithelium, molar mesenchymal compartment dental sac, incisor gubernaculum, …), and 25 for the EarAndSensorySytem annotation done by Raymond Romand (see tissue list table TisLis). The annotation consists of assigning to each tissue an expression value among Negative, Weak, Medium, Strong or NotAvailable (according to http://www.genepaint.org; Visel et al., 2004). The annotation form (see Fig-AnnotForm) also accepts a free-text input for each tissue as well as a general free text for the whole annotation. An additional "tissue", 'General expression', is also proposed.

Note that one annotation form is associated with one gene. The aim of a gene annotation is first to define the gene, then to choose a set of in situ hybridization images which show the expression of this gene within the tissues, and finally to estimate and check the expression level for each tissue. A gene can be defined by its gene name, but a better and more powerful method is to copy-paste the gene metadata provided by the GenePaint web site (http://www.genepaint.org), which facilitates the detection of and the link to the corresponding images. Several images from different sections with different magnifications are often necessary to clearly show the expression level. The images are either simple http links to the visualization facility provided by GenePaint, Eurexpress or any other web site, or locally hosted images uploaded by the user.

For our analysis we used the following protocol:

The probes were obtained from GenePaint, where the sequence of the template used for in vitro transcription of the RNA probe can be obtained. The automated procedure for non-radioactive ISH on E14.5 mouse cryosections has already been described in detail (Carson et al., 2002; Visel et al., 2004, 2007). Gene expression patterns were digitally photographed at IGBMC with a DMBL Leica microscope equipped with a Photometrics camera and the CoolSNAP software (v. 1.2). The images were deposited in the ImAnno database and are publicly available. Expression patterns for all tissues were manually annotated by the first author and checked for validation one year after the initial observation.


For each form submission, the new annotation is integrated into a relational database that stores the values, ownership and history of the annotation act. The same user or any other authorized annotator can re-annotate any tissue as long as the annotation of the gene is not marked as 'Approved'.
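
The re-annotation rule described above can be illustrated with a minimal in-memory sketch. It is not the ImAnno implementation: the class name GeneAnnotation, its fields and the gene and user names are hypothetical, and a real deployment would store these records in the relational database rather than in Python objects.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Expression values listed in the text.
LEVELS = {"Negative", "Weak", "Medium", "Strong", "NotAvailable"}


@dataclass
class GeneAnnotation:
    """Hypothetical record for one gene: current values plus full history."""
    gene: str
    approved: bool = False
    current: dict = field(default_factory=dict)   # tissue -> level
    history: list = field(default_factory=list)   # (time, user, tissue, level)

    def annotate(self, user, tissue, level):
        # Re-annotation is refused once the gene annotation is approved.
        if self.approved:
            raise PermissionError(f"annotation of {self.gene} is approved")
        if level not in LEVELS:
            raise ValueError(f"unknown expression level: {level}")
        self.current[tissue] = level
        # Keep ownership and history of every annotation act.
        self.history.append((datetime.now(timezone.utc), user, tissue, level))


record = GeneAnnotation("GeneA")                      # hypothetical gene
record.annotate("annotator1", "oral epithelium", "Weak")
record.annotate("annotator2", "oral epithelium", "Medium")  # re-annotation
record.approved = True                                # further edits now fail
```

After the second call the current value is Medium, while both acts remain in the history with their authors.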

Data management

Information about the genes, their corresponding images, the per-tissue expression level annotation as well as the free-text comments are stored in a PostgreSQL relational database and can be queried through the ImAnno website.


Visualization and querying

Any authorized user, according to his or her access rights, can query and visualize all or only some parts of the information from the database. Looking for a gene, the user gets access to the images and to the expression annotations. ImAnno offers an explicit visual display highlighting the expression on a synthetic picture showing the tissues colored according to their level (grey: NotAvailable, blue: Negative, yellow: Weak, orange: Medium, red: Strong).

ImAnno also provides a powerful querying system. Searches can be done by gene name, of course, and more interestingly by the per-tissue expression level. For example, it is possible to search for all genes for which there is a strong expression in tissue T1 and a medium or weak expression in tissue T2. These kinds of selection patterns can be easily combined in what we call "sieves" (Fig-Sieve). A sieve is a boolean combination (with "and" and "or") of such atomic patterns. ImAnno provides an easy tool to create and save these sieves, allowing any user to construct powerful pattern searches, from scratch or by modifying an existing one. Furthermore, the lists of genes obtained by several sieves can be combined by logical union, intersection and/or complement operations using a dynamic HTML form which can be modified and extended by the user. Finally, the web site offers many integrated tools which can be applied to these lists of genes, such as Gene Ontology, interactomics (using the STRING database and Cytoscape), known phenotypes, relation to diseases, comparison with external lists, etc. It also allows calculating and displaying correlations between tissues, sieves and any combination of them.
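
As an illustration of the sieve idea, the following sketch builds atomic per-tissue patterns, combines them with "and" and "or", and then combines the resulting gene lists with set operations. The function names (pattern, all_of, any_of, apply_sieve) and the toy annotations are hypothetical and do not reflect the ImAnno code or data.

```python
def pattern(tissue, *levels):
    """Atomic pattern: the tissue's level is one of the given levels."""
    allowed = set(levels)
    return lambda ann: ann.get(tissue, "NotAvailable") in allowed


def all_of(*sieves):
    """Boolean 'and' of several sieves."""
    return lambda ann: all(s(ann) for s in sieves)


def any_of(*sieves):
    """Boolean 'or' of several sieves."""
    return lambda ann: any(s(ann) for s in sieves)


def apply_sieve(sieve, annotations):
    """Return the set of genes whose annotation passes the sieve."""
    return {gene for gene, ann in annotations.items() if sieve(ann)}


# Toy data: gene -> {tissue: level}; gene and tissue names are made up.
annotations = {
    "geneA": {"T1": "Strong", "T2": "Weak"},
    "geneB": {"T1": "Strong", "T2": "Strong"},
    "geneC": {"T1": "Medium", "T2": "Medium"},
}

# Strong in T1 and (medium or weak) in T2, as in the example from the text.
sieve1 = all_of(pattern("T1", "Strong"), pattern("T2", "Medium", "Weak"))
sieve2 = pattern("T2", "Strong", "Medium")

hits1 = apply_sieve(sieve1, annotations)
hits2 = apply_sieve(sieve2, annotations)

# Gene lists from several sieves combined by set operations.
union = hits1 | hits2
intersection = hits1 & hits2
complement = set(annotations) - hits1
```

Representing each sieve as a predicate keeps atomic patterns and combined sieves interchangeable, so a saved sieve can be reused as a building block of a larger one.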

Dendrogram of tissue correlation

To obtain the dendrogram, we first compute a distance matrix between all tissues, taking into account the expression values of the 2000 annotated genes, using Spearman's rank correlation coefficient with the following numeric values: 0: negative, 2: weak, 3: medium and 4: strong (R function cor(.., method="spearman")). This distance matrix is used by the program FastME (ref Gascuel O.) to construct the phylogenetic tree.
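
The computation can be sketched as follows in plain Python. Two points are assumptions not stated in the text: the correlation rho is converted to a distance as 1 - rho, and NotAvailable values are taken to have been excluded beforehand. The tissue names and profiles are toy data, and the sketch does not guard against a constant profile (zero variance).

```python
from math import sqrt

# Numeric coding given in the text.
SCORE = {"Negative": 0, "Weak": 2, "Medium": 3, "Strong": 4}


def average_ranks(values):
    """Rank values from 1, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def distance_matrix(profiles):
    """Pairwise tissue distances, assuming distance = 1 - rho."""
    names = sorted(profiles)
    matrix = [[1.0 - spearman(profiles[a], profiles[b]) for b in names]
              for a in names]
    return names, matrix


# Toy profiles: one score per gene, in the same gene order for every tissue.
profiles = {
    "T1": [SCORE[v] for v in ("Negative", "Weak", "Medium", "Strong")],
    "T2": [SCORE[v] for v in ("Negative", "Weak", "Medium", "Strong")],
    "T3": [SCORE[v] for v in ("Strong", "Medium", "Weak", "Negative")],
}
names, matrix = distance_matrix(profiles)
```

Because Spearman's coefficient depends only on ranks, the exact numeric codes matter only through their order and their ties.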


Interactomics

Protein-protein interactions were obtained from the STRING database, which contains known and predicted physical and functional protein-protein interactions. STRING was used in protein mode, and only interactions with high confidence levels (>0.7) were kept. Interactomic networks were visualized with the Cytoscape software to facilitate their analysis, as they may be very large. Three different "zooming in levels" were created: the first level deals only with initial genes with at least one interaction (primary network); the second level describes the initial genes and only the genes from the STRING database exhibiting interactions with two initial genes; the third level displays the genes from the initial list (initial genes) and all the genes found in the STRING database exhibiting at least one interaction with any initial gene.
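
The three levels can be sketched on a toy edge list as shown below. This is one possible reading of the description, under stated assumptions: level 1 keeps initial genes that interact with another initial gene, and level 2 adds external genes linked to at least two initial genes. The function build_levels, the gene names and the scores are hypothetical; no STRING or Cytoscape API is used.

```python
def build_levels(initial, interactions, min_score=0.7):
    """Return the gene sets of the three zoom levels.

    interactions: iterable of (gene_a, gene_b, confidence score).
    Only interactions with score > min_score are kept.
    """
    initial = set(initial)
    partners = {}
    for a, b, score in interactions:
        if score > min_score:
            partners.setdefault(a, set()).add(b)
            partners.setdefault(b, set()).add(a)

    # Level 1 (assumed reading): initial genes linked to another initial gene.
    level1 = {g for g in initial
              if partners.get(g, set()) & (initial - {g})}

    # Count, for each external gene, how many initial genes it interacts with.
    external = set(partners) - initial
    counts = {g: len(partners[g] & initial) for g in external}

    # Level 2 (assumed reading): external genes with at least two such links.
    level2 = initial | {g for g, n in counts.items() if n >= 2}
    # Level 3: external genes with at least one such link.
    level3 = initial | {g for g, n in counts.items() if n >= 1}
    return level1, level2, level3


# Toy data: A, B, C are initial genes; X, Y, Z are external genes.
initial_genes = {"A", "B", "C"}
edges = [
    ("A", "B", 0.90),
    ("A", "X", 0.80),
    ("B", "X", 0.95),
    ("C", "Y", 0.75),
    ("C", "Z", 0.40),   # below the confidence threshold, discarded
]
level1, level2, level3 = build_levels(initial_genes, edges)
```

Each level is a superset of the external genes shown at the previous one, which is what makes the levels usable as successive zoom steps on a large network.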


Gene Ontology

The gene ontology database from http://www.geneontology.org




Carson JP, Thaller C, Eichele G. 2002. A transcriptome atlas of the mouse brain at cellular resolution. Curr Opin Neurobiol., 12:562-265.

Visel A, Thaller C, Eichele G. 2004. GenePaint.org: an atlas of gene expression patterns in the mouse embryo. Nucleic Acids Res., 32:D552-556.

Visel A, Carson J, Oldekamp J, Warnecke M, Jakubcakova V, Zhou X, Shaw CA, Alvarez-Bolado G, Eichele G. 2007. Regulatory pathway analysis by high-throughput in situ hybridization. PLoS Genet., 3:1867-1883.