CMBI - What If

Oct 2, 2013

The CMBI: Bioinformatics

Content

  Bioinformatics
  Bioinformatics @CMBI
  Bioinformatics tools & databases

Hanka Venselaar
CMBI, UMC Radboud
February 2009
h.venselaar@cmbi.ru.nl

©CMBI 2009

What is bioinformatics?



Bioinformatics is the use of computers to solve information problems in the life sciences.



You are "doing bioinformatics" when you use computers to store,
retrieve, analyze or predict the sequence, function and/or structure of
biomolecules.



Human genome, great expectations

Data -> knowledge, insight!


Why do we need Bioinformatics?

Flood of biological data:
  DNA sequences (genomes)
  protein sequences and structures
  gene expression profiles (transcriptomics)
  cellular protein profiles (proteomics)
  cellular metabolite profiles (metabolomics)

We want to:
  collect and store the data
  integrate, analyze, compare and mine the data
  predict genes, protein function and protein structure
  predict physiology (models, mechanisms, pathways)
  understand how a whole cell works


A large fraction of the human genes has an unknown function

(Science, 2001)


What is protein function?





Homology

Genomic context


How can we predict the function of proteins?

“New, unknown protein” -> compare with database of proteins (BLAST) -> “similar sequence with known function”, e.g. protein kinase -> extrapolate the function

The importance of sequence similarity and sequence alignment


Similar sequences have:


A similar evolutionary origin


A similar function


A similar 3D structure
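
The "compare with database" step can be illustrated with a small local-alignment score. This is a minimal sketch and not BLAST itself: BLAST uses a faster heuristic and substitution matrices, whereas the code below computes a plain Smith-Waterman score with assumed match, mismatch and gap values. The function names and the toy database entries are invented for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman score of the best local alignment between a and b.

    The scoring values are illustrative assumptions, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def best_hit(query, database):
    """Return (name, score) of the highest-scoring database sequence."""
    scores = {name: local_alignment_score(query, seq)
              for name, seq in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy database with invented sequences and annotations.
toy_database = {
    "protein kinase (known function)": "MKVLAAGIVG",
    "unrelated protein": "PPPPPPPPPP",
}
print(best_hit("KVLAAGI", toy_database))
```

The annotation of the best hit is what would be extrapolated to the query; how high a score must be before that is justified is not addressed by this sketch.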




CMBI - Centre for Molecular and Biomolecular Informatics

Dutch national centre for computational molecular sciences research

Research groups
  Comparative Genomics (Huynen)
  Bacterial Genomics (Siezen)
  Computational Drug Design (De Vlieg)
  Bioinformatics of Macromolecular Structures (Vriend)

Training & Education
  MSc, PhD and PostDoc programmes
  International workshops
  Hotel Bioinformatica
  High school courses

Computational facilities, databases, and software packages via (inter)national service platforms (NBIC, EBI, etc.)

NBIC: National BioInformatics Centre.


Computational Drug Discovery (CDD) Group


Head: Prof. Jacob de Vlieg



Key goal

Develop molecular modeling and computer-based simulation techniques for structure-based drug design, translational medicine and protein family based approaches to design and identify drug-like compounds



Key Research Fields


Structural bioinformatics for drug design


Bioinformatics for genomics (microarray analysis, text mining, etc)


Translational medicine informatics






[Diagram: CDD bridging academic research and applied genomics. Academic research: new scientific approaches, training & education. Applications: exciting real-life problems, ‘wet’ validation.]



Examples of CDD Projects



Exploiting Structural Genomics Information To Incorporate Protein Flexibility In Drug Design

Protein knowledge building through comparative genomics and data integration

In silico studies on p63 as a new drug-target protein














International Computational Drug Discovery Course





Course covers the entire research pipeline
from genomics and proteomics in target
discovery to Structure Based Drug Design
and QSAR in drug optimization.



Lectures and practicals



2-week course



June/July 2009



www.cmbi.ru.nl/ICDD2008



Bacterial Genomics Group


Head: Prof. Roland Siezen

Research interest: Biological questions in the interest of the Dutch food industry

How can we improve:
  fermentation
  safety
  health

Micro-organisms studied: Gram-positive food bacteria:
  lactic acid bacteria (Lactococcus, Lactobacillus)
  spoilage bacteria (Listeria, Clostridium, Bacillus cereus)

[Images: Listeria, Lactococcus]


Bacterial Genomics: from sequence to predicted function

Key research fields:


Genome sequencing and interpretation


Network reconstruction and analysis


Systems biology, dynamic modelling

Raw sequence data:

2 to 5 million nucleotides


AAACACTTAGACAATCAATATAAAGATGAA
GTGAACGCTCTTAAAGAGAAGTTGGAAAAC
TTGCAGGAACAAATCAAAGATCAAAAAAGG
ATAGAAGAACAAGAAAAACCACAAACACTT
AGACAATCAATATAAAGATGAAGTGAACGC
TCTTAAAGAGAAGTTGGAAAACTTGCAGGA
ACAAATCAAAGATCAAAAAAGGATAGAAGA
ACAAGAAAAACCACAAACACTTAGACAATC
AATATAAAGATGAAGTGAACGCTCTTAAAG
AGAAGTTGGAAAACTTGCAGGAACAAATCA
AAGATCAAAAAAGGATAGAAGAACAAGAAA
AACCACAAACACTTAGACAATCAATATAAA
GATGAAGTGAACGCTCTTAAAGAGAAGTTG
GAAAACTTGCAGGAACAAATCAAAGATCAA
AAAAGGATAGAAGAACAAGAAAAACCACAA
ACACTTAGACAATCAATATAAAGATGAAGT
GAACGCTCTTAAAGAGAAGTTGGAAAACTT
GCAGGAA
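
One elementary first step from raw sequence towards predicted function is locating open reading frames (ORFs). The sketch below is only an assumption about how such a scan could look: it checks the three forward reading frames for an ATG start followed in-frame by a stop codon, and ignores the reverse strand, alternative start codons and everything a real gene finder does. The function name, the `min_codons` parameter and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of forward-strand ORFs.

    start is the 0-based index of the ATG, end is one past the stop codon.
    min_codons counts the codons before the stop, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


toy_sequence = "CCATGAAATAGGG"  # invented example, not from the slides
for start, end in find_orfs(toy_sequence):
    print(start, end, toy_sequence[start:end])
```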

A virtual cell: overview of predicted pathways


Bacterial Genomics: Example

Differential NF-κB pathways induction by Lactobacillus plantarum in the duodenum of healthy humans correlating with immune tolerance

Peter van Baarlen et al., PNAS, Feb 3, 2009


Comparative Genomics Group


Head: Prof. Martijn Huynen



Research Focus:
  How do the proteins encoded in genomes interact with each other to produce cells and phenotypes?
  Predict functional interactions between proteins, such as those in metabolic pathways, signalling pathways or protein complexes

A genome is more than the sum of its genes -> use “genomic context” for function prediction

Types of genomic context:
  Gene fusion/fission
  Chromosomal location
  Gene order/neighbourhood
  Co-evolution
  Co-expression
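
Co-evolution, one of the context types listed above, is often measured with phylogenetic profiles: for each gene, record in which genomes it is present (1) or absent (0), then look for genes with matching patterns. The sketch below assumes that approach and a very simple similarity measure; the gene names and profiles are invented, and real analyses use many more genomes and correct for phylogeny.

```python
def profile_similarity(p, q):
    """Fraction of genomes in which two genes are both present or both absent."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genomes")
    return sum(x == y for x, y in zip(p, q)) / len(p)


def most_similar(gene, profiles):
    """Name of the gene whose presence/absence profile best matches `gene`."""
    others = {name: profile_similarity(profiles[gene], profile)
              for name, profile in profiles.items() if name != gene}
    return max(others, key=others.get)


# Invented presence/absence profiles over six hypothetical genomes.
toy_profiles = {
    "gene_A": (1, 1, 0, 1, 0, 0),
    "gene_B": (1, 1, 0, 1, 0, 0),
    "gene_C": (1, 0, 1, 0, 1, 1),
}
print(most_similar("gene_A", toy_profiles))
```

Genes that are gained and lost together across genomes are candidates for a functional link, which is then a testable prediction rather than a conclusion.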


Turning data into knowledge

Research topics:
  Develop computational genomics techniques that exploit the information in sequenced genomes and functional genomics data
  Make testable predictions about pathways and the functions of proteins therein
  Study the evolution of the eukaryotic cell and the origin and evolution of organelles such as mitochondria and peroxisomes

Education:
  Comparative Genomics Course, 3 EC, April 2009

Comparative genomics -> prediction of protein function, pathways


Frataxin

Example

Frataxin is a well-known disease gene (Friedreich's ataxia) whose function has remained elusive despite more than six years of intensive experimental research.

Using computational genomics we have shown that frataxin has co-evolved with hscA and hscB and is likely involved in iron-sulfur cluster assembly in conjunction with the co-chaperone HscB/JAC1.

[Figure panels: Prediction, Confirmation]


Bioinformatics of macromolecular structures


Head: Prof. Gert Vriend



Research Focus: Understanding proteins (and their environment)



Proteins are the core of life, they do all the work, and they give you
feelings, contact with the outside world, etc.



Proteins, therefore, are the most important molecules on earth.



We want to understand life; why are we what we are, why do we do what
we do, how come you can think what you think?




Bioinformatics of macromolecular structures


Research topics Vriend group



Homology modeling technology and applications


Application of bioinformatics in medical research (Hanka Venselaar)


Structure validation and structure determination improvement


Molecular class specific information systems (e.g. GPCRDB &
NucleaRDB)


Data mining


WHAT IF molecular modelling and visualization software



Homology Modeling

Homology modeling: prediction of 3D structure based upon a highly similar structure

Hearing loss

DFNB63: unknown structure

MGTPWRKRKGIAGPGLPDLSCALVLQPRAQVGTMSPAI
ALAFLPLVVTLLVRYRHYFRLLVRTVLLRSLRDCLSGLRI
EERAFSYVLTHALPGDPGHILTTLDHWSSRCEYLSHMG
PVKGQILMRLVEEKAPACVLELGTYCGYSTLLIARALPP
GGRLLTVERDPRTAAVAEKLIRLAGFDEHMVELIVGSSE
DVIPCLRTQYQLSRADLVLLAHRPRCYLRDLQLLEAHAL
LPAGATVLADHVLFPGAPRFLQYAKSCGRYRCRLHHTG
LPDFPAIKDGIAQLTYAGPG


Homology Modeling

Prediction of 3D structure based upon a highly similar structure

Unknown structure: NSDSECPLSHDG
Known structure:   NSYPGCPSSYDG

Alignment of model and template sequence:

NSDSECPLSHDG
||   || | ||
NSYPGCPSSYDG

Copy backbone and conserved residues (backbone copied)
Add sidechains, Molecular Dynamics simulation on model
Model!
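
The alignment between the model sequence and the template sequence on this slide can be checked position by position. The sketch below assumes the two sequences are already aligned without gaps, as they are on the slide; the function names are made up for illustration.

```python
def match_line(a, b):
    """Return a line with '|' where two aligned sequences are identical."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return "".join("|" if x == y else " " for x, y in zip(a, b))


def percent_identity(a, b):
    """Percentage of aligned positions carrying the same residue."""
    line = match_line(a, b)
    return 100.0 * line.count("|") / len(line)


model = "NSDSECPLSHDG"     # sequence of the unknown structure (from the slide)
template = "NSYPGCPSSYDG"  # sequence of the known structure (from the slide)

print(model)
print(match_line(model, template))
print(template)
print(f"{percent_identity(model, template):.1f}% identical")
```

The marked positions are the conserved residues that can be copied from the template together with the backbone; the remaining side chains have to be built.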


Homology Modeling

Hearing loss

DFNB63: structure! (homology model of the DFNB63 sequence shown earlier)


Homology Modeling

Mutations:
  Arginine 81 -> Glutamic acid
  Glutamic acid 110 -> Lysine

The salt bridge between Arginine and Glutamic acid is lost in both cases


Homology Modeling

Mutation:
  Tryptophan 105 -> Arginine

Hydrophobic contacts from the Tryptophan are lost; a hydrophilic and charged residue is introduced


Homology Modeling

The three mutated residues are all important for the correct positioning of Tyrosine 111

Tyrosine 111 is important for substrate binding

Ahmed et al., Mutations of LRTOMT, a fusion gene with alternative reading frames, cause nonsyndromic deafness in humans. Nat Genet. 2008 Nov;40(11):1335-40.

Interested? Contact Hanka Venselaar (h.venselaar@cmbi.ru.nl)
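
The three mutations from the preceding slides can be applied to the DFNB63 sequence as printed earlier, which also confirms that the residue numbers match the sequence. This is a minimal sketch: the charge table is a simplified assumption (side-chain charge at neutral pH, histidine ignored), and the function and variable names are invented.

```python
# Sequence as printed on the DFNB63 slide, joined from its line fragments.
DFNB63_FRAGMENTS = (
    "MGTPWRKRKGIAGPGLPDLSCALVLQPRAQVGTMSPAI",
    "ALAFLPLVVTLLVRYRHYFRLLVRTVLLRSLRDCLSGLRI",
    "EE", "R", "AFSYVLTHALPGDPGHILTTLDH", "W", "SSRC", "E", "YLSHMG",
    "PVKGQILMRLVEEKAPACVLELGTYCGYSTLLIARALPP",
    "GGRLLTVERDPRTAAVAEKLIRLAGFDEHMVELIVGSSE",
    "DVIPCLRTQYQLSRADLVLLAHRPRCYLRDLQLLEAHAL",
    "LPAGATVLADHVLFPGAPRFLQYAKSCGRYRCRLHHTG",
    "LPDFPAIKDGIAQLTYAGPG",
)
SEQUENCE = "".join(DFNB63_FRAGMENTS)

# Simplified side-chain charges (an assumption for illustration only).
CHARGE = {"R": +1, "K": +1, "D": -1, "E": -1}


def mutate(sequence, wild_type, position, mutant):
    """Apply a point mutation; position is 1-based, as on the slides."""
    found = sequence[position - 1]
    if found != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {found}")
    return sequence[:position - 1] + mutant + sequence[position:]


# Mutations named on the slides: R81E, E110K and W105R.
MUTATIONS = [("R", 81, "E"), ("E", 110, "K"), ("W", 105, "R")]
for wild_type, position, mutant in MUTATIONS:
    mutated = mutate(SEQUENCE, wild_type, position, mutant)
    change = CHARGE.get(mutant, 0) - CHARGE.get(wild_type, 0)
    print(f"{wild_type}{position}{mutant}: charge change {change:+d}")
```

A sequence-level check like this only shows that a charge is reversed or introduced; that the salt bridge and the hydrophobic contacts are lost follows from the 3D model, not from the sequence.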



Hotel Bioinformatica

Hotel functions



Temporary housing, teaching and
supervision of experimentalists for
data analysis at the CMBI



Centralization of UMC-wide bioinformaticians



Shared (weekly) seminars of CMBI
with ‘inhouse bioinformaticians’



Collaboration/advice in acquiring
grants with a Bioinformatics aspect


Interested? Contact Martijn Huynen (m.huynen@cmbi.ru.nl)


Bioinformatics data types



mRNA expression profiles


MS data

Large amount of data

Growing very very fast

Heterogeneous data types


Biological Databases


Information is the core of bioinformatics


Literally thousands of databases exist that are relevant for biology,
medicine, and/or chemistry



Important records in SwissProt/UniProt (1)


Important records in SwissProt/UniProt (2)

Cross references
  Direct hyperlinks to: EMBL, PDB, OMIM, InterPro, etc.

Features
  post-translational modifications
  signal peptides
  binding sites
  enzyme active sites
  domains
  disulfide bridges
  etc.


Protein Databank & Structure Visualization



PDB structures have a unique identifier, the PDB code: 4 characters (often 1 digit & 3 letters, e.g. 1CRN).

Download PDB structures and give them the correct file extension: 1CRN.pdb
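
The naming convention can be captured in a few lines. The sketch below assumes the classic four-character code, a digit followed by three letters or digits; the function names are made up, and nothing is downloaded.

```python
import re

# Classic PDB code: one digit followed by three letters or digits (assumed rule).
PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")


def is_pdb_code(code):
    """True if `code` looks like a four-character PDB code."""
    return PDB_CODE.fullmatch(code) is not None


def pdb_filename(code):
    """File name with the .pdb extension for a given PDB code."""
    if not is_pdb_code(code):
        raise ValueError(f"not a PDB code: {code!r}")
    return code.upper() + ".pdb"


print(pdb_filename("1crn"))
```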



Structures from PDB can directly be visualized with:

1. Yasara (www.yasara.org)
2. SwissPDBViewer (http://spdbv.vital-it.ch/)
3. Protein Explorer (http://www.umass.edu/microbio/rasmol/)
4. Cn3D (http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml)




OMIM Database

OMIM - Online Mendelian Inheritance in Man
  a large, searchable, current database of human genes, genetic traits, and hereditary disorders
  contains information on all known Mendelian disorders and over 12,000 genes
  focuses on the relationship between phenotype and genotype




Browsing genomes




UCSC

http://genome.ucsc.edu/

Only eukaryotic genomes


NCBI

Ensembl

http://www.ensembl.org/



Sequence Retrieval with MRS (1)

Google = the best generic search and retrieval system
MRS = Maarten’s Retrieval System (http://mrs.cmbi.ru.nl)

MRS is the Google of the biological database world

Search engine (like Google)
  Input/Query = word(s)
  Output = entry/entries from database

Searching is very intuitive:
  Select database(s) of choice
  Formulate your query
  Hit “Search”
  The result is a “query set” or “hitlist”
  Analyze the results
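
The idea of "words in, entries out" can be sketched with a tiny keyword index. This is not how MRS is implemented and does not use any MRS interface; it is an assumed, minimal inverted index over invented entries, meant only to show why a query of several words returns a hitlist of the entries that contain all of them.

```python
from collections import defaultdict


def build_index(entries):
    """Map every lower-cased word to the set of entry IDs containing it."""
    index = defaultdict(set)
    for entry_id, text in entries.items():
        for word in text.lower().split():
            index[word].add(entry_id)
    return index


def search(index, query):
    """Return the sorted hitlist of entries containing all query words."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


# Invented entries standing in for records of a selected database.
toy_entries = {
    "E1": "lysozyme C human",
    "E2": "lysozyme chicken",
    "E3": "crambin",
}
index = build_index(toy_entries)
print(search(index, "lysozyme"))
print(search(index, "lysozyme human"))
```

Adding a word narrows the hitlist, which is why it pays to think about the query before searching.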




Sequence Retrieval with MRS (2)

[Screenshots: select database; formulate query (but think about your query first!); MRS hitlist]



BLAST and CLUSTAL with MRS

Blast: brings you to the MRS page from which you can do Blast searches.
Blast results: brings you to the page where MRS stores your Blast results of the current session.
Clustal: brings you to the MRS page from which you can do Clustal sequence alignments.



Your Exercise Today






The practicum: FAMILIAL VISCERAL AMYLOIDOSIS

Today for PhD students
Friday (13:00) for MMD students

CMBI, Course room, ground floor NCMLS

You will study Lysozyme:
  Protein
  Gene
  Mutations causing familial visceral amyloidosis
  3D structure

HAVE FUN!!


The Practicum

You can find the practicum at http://swift.cmbi.ru.nl/teach/lyso/

Work with MRS
Work with Yasara
Read the text carefully

User login = c(your PC number), e.g. c07
User password = t0psp0rt (with zeros)

The program Yasara is on your desktop