Project objective



PhD Brain Project

Vertical Data Integration


Document Information

Author: Andrea Calabria

First Draft: November 2008

Latest Update: November 2008

Version: 0.1


Introduction

This document describes the purpose and design of the project, defines its activities and phases, and presents the state of the art in the vertical integration of biological data.

The problem

Topics: domain problem, description of data, what is missing, the needs (vertical integration).

The project belongs to the genetic and biological field related to neuroscience. Much effort in this research area has gone into discovering models and features of the brain, and we now have a basic knowledge of the whole nervous system, especially of the brain. The high complexity of the nervous system is the biggest obstacle for traditional techniques such as statistics and mathematics, and it is also very hard for computer science methods such as learning theory.

In recent years a huge amount of biological data has been produced in different bioinformatics areas, the omics fields. Large projects and efforts have been promoted for the integration and fusion of these data, and recent funding went to biobank-related projects whose objective is to integrate data from different, widely distributed sources, heterogeneous or homogeneous, in order to improve statistical analysis and diagnostic discovery. The most important challenge now is the integration of biological knowledge: based on this integration, we could use biological data, for example genotypes, to infer or discover other knowledge (disease genes, for example).

This new kind of integration is exactly perpendicular to the first one and for this reason is sometimes called “vertical integration”. In other words, what is missing in the bioinformatics scenario is a comprehensive system for ontological integration, starting from genes (or other measures) and combining gene networks, pathways, systems biology models, and so forth.
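As a rough illustration of what "starting from genes and combining networks, pathways and models" could mean in practice, the following minimal sketch propagates a set of genes upward through two layers of annotation. All identifiers (`gene_A`, `pathway_1`, `system_X`, the two mapping tables and the function name) are hypothetical placeholders and do not come from any real ontology or from the project data.

```python
from collections import defaultdict

# Hypothetical annotation layers; the identifiers are placeholders only.
GENE_TO_PATHWAY = {
    "gene_A": {"pathway_1"},
    "gene_B": {"pathway_1", "pathway_2"},
    "gene_C": {"pathway_3"},
}
PATHWAY_TO_SYSTEM = {
    "pathway_1": {"system_X"},
    "pathway_2": {"system_X", "system_Y"},
    "pathway_3": {"system_Y"},
}


def integrate_vertically(genes):
    """Map each higher-level system to the input genes that reach it."""
    support = defaultdict(set)
    for gene in genes:
        # Genes without annotation simply contribute nothing.
        for pathway in GENE_TO_PATHWAY.get(gene, ()):
            for system in PATHWAY_TO_SYSTEM.get(pathway, ()):
                support[system].add(gene)
    return dict(support)


if __name__ == "__main__":
    print(integrate_vertically(["gene_A", "gene_C"]))
```

A real system would replace the two dictionaries with curated ontologies and would attach a measure of evidence to each link; the sketch only shows the direction of the traversal, from low-level measures to higher-level knowledge.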

Once this system has been realized, researchers may insert biological data into the system, analyze it from a semantic perspective, and thus infer new knowledge. These biological data can also be acquired and integrated from different sources, such as those that biobanks will provide.


DOUBT ABOUT THE PART ABOVE: citing biobanks may not be appropriate, because that is data integration rather than semantic integration; here vertical/horizontal refers to semantics (VERIFY THIS ASSERTION)

The available data are related to brain diseases; there are different kinds of data: genotype information, phenotype data and clinical records.

DESCRIBE THE DATA BRIEFLY HERE

Project objective

The objective of the project is to infer the phenotype of a person or a group of persons from their genotype, exploiting current knowledge of the nervous system domain across all bioinformatics areas, such as genomics, proteomics, systems biology, pharmacology, etc. This must be based on an ontological perspective and allow dynamic, incremental knowledge integration based on statistical measures (alpha error, confidence intervals and empirical risk).
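To make the statistical measures named above more concrete, here is a minimal sketch that computes the empirical risk of a phenotype predictor under 0-1 loss, together with a confidence interval at a chosen alpha. The normal-approximation (Wald) interval, the function names and the toy labels are assumptions made for illustration; the document does not specify which loss or interval the project would use.

```python
import math
from statistics import NormalDist


def empirical_risk(y_true, y_pred):
    """Fraction of misclassified samples (empirical risk under 0-1 loss)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label sequences must be non-empty and of equal length")
    errors = sum(t != p for t, p in zip(y_true, y_pred))
    return errors / len(y_true)


def risk_confidence_interval(risk, n, alpha=0.05):
    """Two-sided normal-approximation interval for the true risk.

    alpha is the accepted type I error; the interval has nominal
    coverage 1 - alpha and is clipped to [0, 1].
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(risk * (1 - risk) / n)
    return max(0.0, risk - half_width), min(1.0, risk + half_width)


if __name__ == "__main__":
    # Toy labels, purely illustrative.
    observed = ["affected", "healthy", "affected", "affected"]
    predicted = ["affected", "affected", "affected", "healthy"]
    risk = empirical_risk(observed, predicted)
    print(risk, risk_confidence_interval(risk, len(observed)))
```

With very few samples the normal approximation is poor, so an exact or Wilson interval may be preferable; the sketch is only meant to show how the three measures relate to one another.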

The computer science domains related to the objective are data mining, machine learning, data integration and fusion, and data quality. Recent literature has named this process onto-integration or vertical integration.

The proposed solutions

Description of the two approaches and of the different converging perspectives

Graphical Approach (Top-Down)

Description of the final end point

High-level design tailored to the goal of an international call

Statistical Approach (Bottom-Up)

Drafting of the project phases

Framing of the phases against the literature (article \ phase table)


Article \ Project Phase