R. P. Deolankar


16 Nov 2013


Half knowledge is always dangerous

Wet lab



A laboratory allowing for hands-on scientific research
and equipped with



Appropriate plumbing



Ventilation



Equipment

High-throughput technology


Technology that handles a high volume of data or
material


Large-scale methods to purify, identify, and
characterize DNA, RNA, proteins and other molecules.
These methods are usually automated, allowing rapid
analysis of very large numbers of samples.


Microarray


A tool used to sift through and analyze the
information contained within a genome. A microarray
consists of different nucleic acid probes that are
chemically attached to a substrate, which can be a
microchip, a glass slide or a microsphere-sized bead.


DNA microarray


A microarray of immobilized single-stranded DNA
fragments of known nucleotide sequence that is used
especially in the identification and sequencing of DNA
samples and in the analysis of gene expression (as in a
cell or tissue)


Protein microarray


Protein microarray is a piece of glass on which
different molecules of protein have been affixed at
separate locations in an ordered manner thus forming
a microscopic array.


Mass spectrometry


An instrumental method for identifying the chemical
constitution of a substance by means of the separation
of gaseous ions according to their differing mass and
charge; also called mass spectroscopy


Mass spectrometry: A method used to determine the
masses of atoms or molecules in which an electrical
charge is placed on the molecule and the resulting ions
are separated by their mass-to-charge ratio

Tandem mass spectrometry


Multiple steps of mass spectrometry selection, with
some form of fragmentation occurring in between the
stages


Immunofluorescence and immunocytochemistry,
ELISA, immunoblotting


Dry lab



A laboratory for making computer simulations or for
data analysis especially by computers (as in
bioinformatics)

Also called a dry laboratory

Gene prioritization


The results of experimental or computational analyses
in the post-genomic era (e.g., those from microarrays,
proteomics, ChIP-chip, genome-wide in silico
searches, genetic linkages, etc.) often consist of long
lists of candidate genes. Some methods assign a score
to each gene and rank the genes accordingly. This process
is known as gene prioritization.
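
As a rough illustration of scoring and ranking, the sketch below averages per-source evidence scores for each candidate gene and sorts the genes by that average. The gene names, evidence sources, scores and the use of a plain mean are all hypothetical assumptions made for the example; published prioritization tools use more elaborate statistics.

```python
from statistics import mean

def prioritize(evidence):
    """Rank candidate genes by their mean evidence score (highest first).

    evidence maps gene name -> {source name: score in [0, 1]}.
    The plain mean is an assumption made for illustration only.
    """
    scored = {gene: mean(scores.values()) for gene, scores in evidence.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

# Hypothetical candidate genes with made-up scores from three sources.
candidates = {
    "GENE_A": {"microarray": 0.9, "linkage": 0.7, "literature": 0.8},
    "GENE_B": {"microarray": 0.4, "linkage": 0.2, "literature": 0.3},
    "GENE_C": {"microarray": 0.6, "linkage": 0.9, "literature": 0.3},
}

ranking = prioritize(candidates)
for rank, (gene, score) in enumerate(ranking, start=1):
    print(rank, gene, round(score, 2))
```

The output is a list ordered from the strongest to the weakest candidate, the form in which a prioritized gene list is usually read.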


PhenoGO


PhenoGO is a multiorganism database that provides
phenotypic context, such as the cell type, disease, and
tissue and organ to existing associations between gene
products and Gene Ontology (GO) terms as specified
in the Gene Ontology Annotations (GOA).

BioMedLEE


One existing Natural Language Processing (NLP)
system, known as BioMedLEE, automatically extracts
biological information consisting of biomolecular
substances and phenotypic data.


MeSH


Medical Subject Headings


MeSH is the National Library of Medicine's controlled
vocabulary thesaurus. It consists of sets of terms
naming descriptors in a hierarchical structure that
permits searching at various levels of specificity.



PhenOS


Phenotype Organizer System, PhenOS is a system
under development by the Lussier research group with
the purpose of bridging the gap between heterogeneous
biomedical terminologies.


Inparanoid algorithm


The protein interaction networks of two species are
aligned by assigning proteins to sequence homology
clusters using the Inparanoid algorithm


POCUS


Prioritization of candidate genes using statistics


Reference: Turner FS, Clutterbuck DR, Semple CA.
POCUS: mining genomic sequence annotation to
predict disease genes. Genome Biol. 2003;4(11):R75.

OMIM


Online Mendelian Inheritance in Man


Online Mendelian Inheritance in Man is a catalog
of human genes and genetic disorders authored and
edited by Dr. Victor A. McKusick and his colleagues at
Johns Hopkins and elsewhere, and provided through
NCBI. The database contains information on disease
phenotypes and genes, including extensive
descriptions, gene names, inheritance patterns, map
locations and gene polymorphisms.

TOM


A web-based integrated approach for identification of
candidate disease genes, Transcriptomics of OMIM


Reference: Rossi S, Masotti D, Nardini C, Bonora E,
Romeo G, Macii E, Benini L, Volinia S. TOM: a web-based
integrated approach for identification of
candidate disease genes. Nucleic Acids Res. 2006 Jul
1;34


Data mining


Data mining (sometimes called data or knowledge
discovery) is the process of analyzing data from
different perspectives and summarizing it into useful
information

Online Predicted Human Interaction Database (OPHID)


Designed to be both a resource for the laboratory
scientist to explore known and predicted protein-protein
interactions, and to facilitate bioinformatics
initiatives exploring protein interaction networks.


Single nucleotide polymorphisms (SNPs)


A single nucleotide polymorphism (SNP, pronounced
snip) is a DNA sequence variation occurring when a
single nucleotide (A, T, C, or G) in the genome (or
other shared sequence) differs between members of a
species (or between paired chromosomes in an
individual).


Synonymous and nonsynonymous substitutions


Substitutions that result in amino acid replacements
are said to be nonsynonymous, while substitutions that
do not cause an amino acid replacement (such as a
GGG to GGC change; both codons still encode
glycine) are said to be synonymous substitutions.
Because of the difference in their effects on the
physiology of the organism, synonymous and
nonsynonymous substitutions can have quite different
dynamics. For example, synonymous substitutions
usually occur at a much faster rate than do
nonsynonymous substitutions. Hence, for coding
sequence it is often desirable to separate these two.
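
The distinction can be checked mechanically by translating both codons and comparing the amino acids. The sketch below assumes the standard genetic code (NCBI translation table 1) and single-codon comparisons; the function name `classify_substitution` is made up for this example.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G at each position; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_substitution(codon_before, codon_after):
    """Return 'synonymous' if both codons encode the same amino acid."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    return "synonymous" if before == after else "nonsynonymous"

print(classify_substitution("GGG", "GGC"))  # glycine -> glycine
print(classify_substitution("GAG", "GTG"))  # glutamate -> valine
```

The first call reproduces the GGG to GGC example from the text; the second shows a change that replaces the amino acid.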


Ka/Ks values


In genetics, the Ka/Ks ratio or dN/dS ratio is the ratio
of the rate of nonsynonymous substitutions (Ka) to
the rate of synonymous substitutions (Ks), which can
be used as an indication of selection on a protein-coding
gene.
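
A minimal sketch of how the ratio is formed and commonly read: values well below 1 are usually taken to suggest purifying selection, values near 1 neutral evolution, and values above 1 positive selection. The rates and the tolerance band around 1 are hypothetical choices for illustration, and estimating Ka and Ks from real alignments requires dedicated methods not shown here.

```python
def ka_ks_ratio(ka, ks):
    """Return Ka/Ks given the two substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    return ka / ks

def interpret(ratio, tolerance=0.1):
    """Map a Ka/Ks ratio to the usual qualitative reading.

    The tolerance around 1 is an arbitrary choice for this sketch.
    """
    if ratio > 1 + tolerance:
        return "positive (diversifying) selection"
    if ratio < 1 - tolerance:
        return "purifying (negative) selection"
    return "approximately neutral"

ratio = ka_ks_ratio(ka=0.02, ks=0.10)  # hypothetical rates
print(round(ratio, 2), interpret(ratio))
```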


dbSNP


Database of Single Nucleotide Polymorphisms


A public-domain archive for a broad collection of
Single Nucleotide Polymorphisms (SNPs), hosted at
the National Center for Biotechnology
Information.


OrthoDisease


OrthoDisease is a comprehensive database of model
organism genes that are orthologous to human disease
genes


OrthoDisease is constructed primarily using
Inparanoid analysis. Inparanoid is a program that
automatically detects orthologs (or groups of
orthologs) from two species


Field Biology



Biology of organisms living in their natural
environments


Applications in Ecology and Evolutionary Biology

Epidemiology



Epidemiology is the study of how often diseases occur
in different groups of people and why


Planning and evaluating strategies to prevent illness


Guide to the management of patients in whom disease
has already developed


Reference: Epidemiology for the uninitiated by
Coggon, Rose and Barker

Population at risk



The population at risk is the group of people, healthy
or sick, who would be counted as cases if they had the
disease being studied


It defines the denominator for the calculation of
incidence and prevalence rates


It is the number of persons potentially capable of
experiencing the event or outcome of interest

Floating numerator



Numerator floating without its denominator


Common error occurring in field investigations


The error occurs when the number of cases is not
related to the “at risk” population


Epidemiological conclusions (on risk) cannot be
drawn from purely clinical data (on the number of sick
people seen)


Target population



It is the population about which the conclusions are to
be drawn


Sometimes measurements can be made on the full
target population; otherwise study samples are used


Study population and study sample


The group of individuals in a study


In a clinical trial, the participants make up the study
population


The study sample is chosen from the study population

Aetiology



The study of the factors that predispose to or
precipitate the disease


An external agent, a susceptible host, and an environment
that brings the host and agent together form the disease
aetiology triad

Surveillance



Watching over a population and recording data likely
to have epidemiological significance, usually with the
aim of early detection of disease. Essentially an
interventionist exercise compared with monitoring,
which is passive.


Case



Disease in populations exists as a continuum of
severity rather than as an all or none phenomenon


The real question in population studies is not “has the
person got the disease?” but “How much of the disease
has he or she got?”


Diagnostic continuum is dichotomized into “cases”
and “non-cases” on the basis of statistical, clinical,
prognostic or operational options


Hence case definition should be precise and
unambiguous.


Epidemiological case definitions are narrower and
more rigid than clinical ones

Incidence



It is the rate at which new cases occur in a population
during a specified period


Incidence = (number of new cases) / [(population at risk) * (time
during which cases were ascertained)]
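
A small worked sketch of the incidence calculation, with the population at risk multiplied by the observation time in the denominator. The counts are hypothetical, and the example assumes the population at risk stays roughly constant over the period.

```python
def incidence_rate(new_cases, population_at_risk, years):
    """New cases per person-year: cases / (population at risk * time)."""
    if population_at_risk <= 0 or years <= 0:
        raise ValueError("population at risk and time must be positive")
    return new_cases / (population_at_risk * years)

# Hypothetical figures: 30 new cases among 5,000 people followed for 2 years.
rate = incidence_rate(new_cases=30, population_at_risk=5000, years=2)
print(f"{rate * 1000:.1f} new cases per 1,000 person-years")
```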


Prevalence


Point prevalence


The proportion of a population that are cases at a point
in time

Period prevalence


The proportion of a population that are cases at any
time within a stated period
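
The two prevalence measures can be sketched as simple proportions. The figures are hypothetical, and the period prevalence here assumes a fixed population in which cases present at the start and cases arising during the period are counted once each.

```python
def point_prevalence(cases_at_time, population):
    """Proportion of the population who are cases at one point in time."""
    return cases_at_time / population

def period_prevalence(cases_at_start, new_cases_in_period, population):
    """Proportion who are cases at any time within the stated period.

    Counts people who were already cases when the period began plus
    those who became cases during it.
    """
    return (cases_at_start + new_cases_in_period) / population

# Hypothetical population of 2,000 people.
print(point_prevalence(cases_at_time=80, population=2000))
print(period_prevalence(cases_at_start=80, new_cases_in_period=40,
                        population=2000))
```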





Attributable risk and relative risk



Attributable risk is the difference between the disease rate
in exposed persons and that in people who are unexposed


Relative risk is the ratio of the disease rate in exposed
persons to that in people who are unexposed


Attributable risk = rate of disease in unexposed
persons * (relative risk - 1)
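
The sketch below works through both measures for a hypothetical cohort, taking attributable risk as the difference between the exposed and unexposed rates and relative risk as their ratio. It also checks that the difference equals the unexposed rate multiplied by (relative risk - 1). All counts are invented for illustration.

```python
def disease_rate(cases, persons):
    """Disease rate as cases per person in a group."""
    return cases / persons

def relative_risk(rate_exposed, rate_unexposed):
    """Ratio of the rate in exposed to the rate in unexposed persons."""
    return rate_exposed / rate_unexposed

def attributable_risk(rate_exposed, rate_unexposed):
    """Difference between the rate in exposed and unexposed persons."""
    return rate_exposed - rate_unexposed

# Hypothetical cohort: 30 cases in 1,000 exposed, 10 cases in 1,000 unexposed.
exposed = disease_rate(30, 1000)
unexposed = disease_rate(10, 1000)
rr = relative_risk(exposed, unexposed)
ar = attributable_risk(exposed, unexposed)

# The two forms of attributable risk agree up to rounding:
# unexposed rate * (RR - 1) equals exposed rate - unexposed rate.
print(rr, ar, unexposed * (rr - 1))
```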

Confounding



Causing confusion about causation due to 2 or more
variables associated with the disease


Confounding may give rise to spurious associations
when in fact there is no causal relation, or at the other
extreme, it may obscure the effects of a true cause


Bias



Bias is the deviation of inferences from the truth


Selection bias is the biased selection of individuals
into the study


Information bias is the biased collection or biased
analysis of the data


Motto of the epidemiologist could well be “dirty hands
but a clean mind” (manus sordidae, mens pura)



Chance



A measure of how likely it is that some event will occur


Random, unpredictable influences on events


The association between the exposure and disease is
considered to be “statistically significant” if the
probability of a test statistic at least as extreme arising
by chance alone (the P value) is less than 0.05
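
One way to make the 0.05 threshold concrete is to test a 2x2 table of exposure against disease. The source does not name a test, so the Pearson chi-square test without continuity correction used below is an assumption, as are the counts. With one degree of freedom the upper-tail probability can be computed from the standard library alone.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be non-zero")
    statistic = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom the upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts.
statistic, p_value = chi_square_2x2(a=30, b=70, c=10, d=90)
print(round(statistic, 2), p_value < 0.05)
```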


Sensitivity



The proportion of persons with the disease who are
correctly identified by defined criteria


The proportion of persons with the disease who are
correctly identified by a screening test


The ability of a system to detect epidemics and other
changes in disease occurrence


A sensitive test detects a high proportion of the true
cases


Specificity



The proportion of persons without a disease who are
correctly identified by a test


The number of true negative results divided by the
total number of all those without the disease
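
Sensitivity and specificity, as defined above, can be sketched from the four outcome counts of a screening test. The counts are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of persons with the disease whom the test identifies."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """True negatives divided by all persons without the disease."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical screening results: 100 diseased and 900 disease-free persons.
print(sensitivity(true_positives=90, false_negatives=10))
print(specificity(true_negatives=855, false_positives=45))
```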

Randomization


Randomization is used to obtain a similar allocation of
individuals to each group; the groups are followed at
the same time


Purpose of randomization: To obtain unbiased
estimates of differences among treatment responses
(means or effects) and to obtain an unbiased estimate
of the random error variation in the experiment
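
A minimal sketch of simple randomization: participants are shuffled and then dealt out to the groups in turn, so allocation depends on chance alone and the groups end up near-equal in size. The participant identifiers, group names and seed are hypothetical; real trials typically use more controlled schemes.

```python
import random

def randomize(participants, groups=("treatment", "control"), seed=None):
    """Randomly allocate participants to groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    allocation = {name: [] for name in groups}
    # Deal the shuffled participants out to the groups in turn.
    for index, person in enumerate(shuffled):
        allocation[groups[index % len(groups)]].append(person)
    return allocation

# Hypothetical participant identifiers; the seed only makes the run repeatable.
allocation = randomize([f"P{i:02d}" for i in range(1, 11)], seed=42)
for name, members in allocation.items():
    print(name, sorted(members))
```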


Replication and local control



Replication is the repetition of an experiment in order
to test the validity of its conclusion


Local control is blocking or grouping to eliminate or to
control the various sources of variation (error)


Replication and local control are necessary to achieve a
reduction in the random variation among treatment
effects in the experiment




Observational (non-experimental) studies


Person-level units of observation

1. Longitudinal measurements
   a. Cohort samples
   b. Case-control samples
2. Cross-sectional measurements

Aggregate-level units of observation (ecological
studies)


Reference: Epidemiology Kept Simple: An
Introduction to Traditional and Modern
Epidemiology; by B. Burt Gerstman


Personal-level vs. aggregate-level


A personal-level study on smoking might collect
information on each person’s smoking habits, age and
disease status


An aggregate-level study on smoking might collect
information on each region’s per capita cigarette
consumption, age distribution and disease rate


Longitudinal studies


Longitudinal studies are studies in which the sequence
of events in individuals can be delineated over time


In cohort studies the incidence of disease in exposed
and non-exposed groups is compared


In case-control studies people with disease (cases) and
people without disease (controls) are sampled from
the source population and exposure histories of cases
and controls are compared

Longitudinal vs. cross-sectional studies


Longitudinal measurements relate exposures and
diseases in individuals at various time references


Cross-sectional measurements are not definitively
time sequenced in individuals


In cross-sectional studies the data are
gathered from samples at one point in time. Since both
the outcome and the exposure variables are measured at the
same time, these studies are not strong at showing
cause-effect relationships.


Experimental studies



In experimental studies, the investigator introduces or
removes an exposure in order to observe its influence
on a health outcome. Such allocations may be based
on a chance mechanism (randomized trials) or on other
deliberate mechanisms built into the study’s protocol
(non-randomized trials)

Other disease informatics lectures:

Supercourse: Epidemiology, the Internet and Global Health


Lecture numbers 31981, 30331, 28921, 25381, 25371, and 34011