Advanced Training Workshop in Bioinformatics of Membrane Proteins

Membrane Protein Structure Prediction

David Jones and Tim Nugent

23rd February 2008


1. MEMSAT 3 via the Fpred Server


First, we are going to predict some topologies using MEMSAT 3 via the Fpred
server. Fpred performs feature-based function prediction based on PSI-BLAST
profiles. These profiles are also used to run MEMSAT 3, and the results are
displayed graphically. You can search by either:

i) IPI accession number

ii) FASTA sequence

You can access the Fpred server here:

http://bioinf.cs.ucl.ac.uk/fpred/

A good protein to start with is Bacteriorhodopsin. Obtain the FASTA sequence
from http://www.expasy.org/uniprot/ by searching for the accession number
P02945.

http://www.expasy.org/uniprot/P02945.fas
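
If you would rather fetch the sequences from a script than from the browser, the sketch below shows one way to do it in Python. It assumes the ".fas" URL pattern used in this handout is still served at that address, which may no longer be the case; the function names are illustrative and not part of any server's interface.

```python
from urllib.request import urlopen

# URL pattern copied from this handout; the service may have moved since.
FASTA_URL_TEMPLATE = "http://www.expasy.org/uniprot/{accession}.fas"


def fasta_url(accession):
    """Build the FASTA download URL for a UniProt accession."""
    return FASTA_URL_TEMPLATE.format(accession=accession.strip().upper())


def parse_fasta(text):
    """Split single-record FASTA text into (header, sequence)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not FASTA: first non-empty line must start with '>'")
    header = lines[0][1:]            # drop the leading '>'
    sequence = "".join(lines[1:])    # join the wrapped sequence lines
    return header, sequence


def fetch_fasta(accession, timeout=30):
    """Download and parse one record (requires network access)."""
    with urlopen(fasta_url(accession), timeout=timeout) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

Calling fetch_fasta("P02945") would then return the header and the sequence ready to paste into Fpred, provided the URL still resolves.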


Paste the sequence into Fpred and wait for the results. If the sequence has
been processed recently, you may be able to click on a link taking you directly
to the results. Otherwise click here:

http://bioinf.cs.ucl.ac.uk/cgi-bin/fpred/buildFeat.pl?MD5=n0o7g2v0f3q36ju8


A good site for investigating the topologies of membrane proteins that have
crystal structures is OPM (Orientation of Proteins in Membranes).

You can access OPM here:

http://opm.phar.umich.edu


Search for Bacteriorhodopsin using the PDB code 1m0l, and compare the
MEMSAT 3 results with OPM's topology, which is derived from the crystal
structure. Check the position of the N-terminus, the number of predicted
transmembrane helices and their boundaries. In a good prediction, each TM helix
should overlap the known position by at least 5 residues.
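
The three checks above (N-terminus side, helix count, and at least 5 residues of overlap per helix) can also be done programmatically. The following is a minimal sketch, assuming you have typed the helix boundaries from the MEMSAT 3 output and from OPM into lists of (start, end) residue numbers yourself; the function names are hypothetical and the coordinates in the usage note are invented for illustration.

```python
def overlap(a, b):
    """Number of residues shared by two inclusive (start, end) segments."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def compare_topologies(pred_helices, ref_helices, pred_nterm, ref_nterm,
                       min_overlap=5):
    """Compare a predicted topology with a reference topology.

    Helices are lists of inclusive (start, end) residue ranges ordered from
    N- to C-terminus. The N-terminus side is given as "in" or "out".
    """
    same_count = len(pred_helices) == len(ref_helices)
    # Pair helices by order; zip stops at the shorter list.
    overlaps = [overlap(p, r) for p, r in zip(pred_helices, ref_helices)]
    helices_ok = same_count and all(n >= min_overlap for n in overlaps)
    nterm_ok = pred_nterm == ref_nterm
    return {
        "same_count": same_count,
        "overlaps": overlaps,
        "helices_ok": helices_ok,
        "nterm_ok": nterm_ok,
        "correct": helices_ok and nterm_ok,
    }
```

For example, compare_topologies([(10, 30), (45, 65)], [(12, 32), (40, 62)], "out", "out") reports overlaps of 19 and 18 residues and a correct topology for these made-up boundaries.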

Next, take the FASTA sequence for Formate dehydrogenase (P0AEK7) and run it
through Fpred. Again, compare the results to the topology at OPM using the
PDB code 1kqf.

http://www.expasy.org/uniprot/P0AEK7.fas

http://bioinf.cs.ucl.ac.uk/cgi-bin/fpred/buildFeat.pl?MD5=vvh92pfujfs6b8hq


2. Transmembrane proteins with signal peptides


Get the FASTA sequence for the Acetylcholine receptor using accession
P02711. Compare the MEMSAT 3 topology prediction from Fpred with that at
OPM using PDB ID 2bg9.

http://www.expasy.org/uniprot/P02711.fas

http://bioinf.cs.ucl.ac.uk/cgi-bin/fpred/buildFeat.pl?MD5=h9duag9loko0h39b

Is the prediction correct? If not, why not? Look at the feature annotations on
the Uniprot page to see if you can find a reason.

http://www.expasy.org/uniprot/P02711

Like transmembrane helices, signal peptides have a high proportion of
hydrophobic residues and are frequently predicted as TM helices. A
mis-prediction such as this will heavily disrupt the predicted topology, with
inside and outside loop regions often being reversed. It is often useful to run
a dedicated signal peptide predictor such as SignalP before making a prediction.
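
One way to act on a signal peptide prediction is sketched below: flag any predicted helix that falls inside the signal peptide, and remove the signal peptide before re-running the topology prediction. This is only an illustration. It assumes you already know the last residue of the signal peptide (for example from the Uniprot annotation or from a predictor's output); the function names and the numbers in the usage note are hypothetical, and nothing here calls SignalP itself.

```python
def strip_signal_peptide(sequence, cleavage_after):
    """Return the mature sequence with residues 1..cleavage_after removed.

    cleavage_after is the 1-based position of the last signal peptide residue.
    """
    if not 0 <= cleavage_after < len(sequence):
        raise ValueError("cleavage position must lie inside the sequence")
    return sequence[cleavage_after:]


def helices_in_signal_peptide(helices, cleavage_after):
    """Return predicted helices that start inside the signal peptide.

    helices is a list of inclusive, 1-based (start, end) residue ranges.
    """
    return [helix for helix in helices if helix[0] <= cleavage_after]
```

For instance, with a signal peptide ending at residue 24, helices_in_signal_peptide([(3, 22), (240, 260)], 24) returns [(3, 22)], which suggests the first "helix" is probably the signal peptide. Remember that residue numbers in a prediction made on the stripped sequence are shifted by the length of the removed region.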


3. Transmembrane proteins with re-entrant helices

Another substructure often mistaken for a transmembrane helix is the re-entrant
helix. These are helices that enter and exit the membrane from the same side,
i.e. they do not span the bilayer. Like signal peptides, re-entrant helices
often have a hydrophobic profile, so they are commonly predicted as
transmembrane helices.

Paste the FASTA sequence for the Chloride channel (P37019) into Fpred.
Compare the results with the topology at OPM using PDB ID 1ots.

http://www.expasy.org/uniprot/P37019.fas

http://bioinf.cs.ucl.ac.uk/cgi-bin/fpred/buildFeat.pl?MD5=j3t35vo5qebkcpsf

Does the prediction agree with OPM? The OPM topology lists regions within
the plane of the membrane; it does not discriminate between transmembrane
helices and re-entrant helices.

The Chloride channel actually contains the following 3 re-entrant helices
(residue ranges):

A. 127-141, B. 179-190, C. 389-402

Download the PDB structure from OPM and open it using RasMol. You can use
the following commands to highlight these re-entrant helices. Compare them to
the true transmembrane helices and note how they do not fully span the
membrane.

RasMol> select 127-141:A,179-190:A,389-402:A

RasMol> colour red
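
If you want to highlight other segments in the same way, the selection string can be generated rather than typed by hand. The sketch below simply reproduces the command format used above from a list of residue ranges; the function names are illustrative only.

```python
def rasmol_select(ranges, chain):
    """Build a RasMol 'select' command for inclusive residue ranges on a chain."""
    parts = ["{}-{}:{}".format(start, end, chain) for start, end in ranges]
    return "select " + ",".join(parts)


def rasmol_highlight(ranges, chain, colour="red"):
    """Return the command lines that select and colour the given ranges."""
    return [rasmol_select(ranges, chain), "colour " + colour]


# Re-entrant helices of the Chloride channel listed in this section.
REENTRANT_HELICES = [(127, 141), (179, 190), (389, 402)]

for command in rasmol_highlight(REENTRANT_HELICES, "A"):
    print(command)
```

The printed lines can be pasted at the RasMol prompt one at a time.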


4. The PONGO Server


PONGO is a webserver that runs a number of the top-performing transmembrane
topology prediction methods on a given sequence. The results are displayed
graphically, allowing easy comparison. A consensus approach is particularly
useful with topologies that are difficult to predict. Currently, PONGO uses the
following methods: TMHMM 2.0, MEMSAT 3, ENSEMBLE, PRODIV TMHMM 0.91,
TMHMMdomfix and ENSEMBLE 2.0.

You can access the PONGO server here:

http://pongo.biocomp.unibo.it/pongo/

A protein whose transmembrane topology has caused significant controversy is
the Batten disease protein, CLN3. Download the FASTA sequence from Uniprot
using the accession number Q13286. Paste the sequence into the PONGO server
and wait for the results. This time, there is no crystal structure to compare
the results to!

http://www.expasy.org/uniprot/Q13286.fas

How well do the different methods agree with each other?

What do you think the true topology of the protein is?
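
To put a number on how well the methods agree, one simple option is a per-residue majority vote. The sketch below assumes each method's prediction has been written out by hand as a string with one label per residue ("i" for inside, "o" for outside, "M" for membrane); this encoding and the function name are assumptions made for illustration and are not a PONGO output format. Note that a per-residue vote is not guaranteed to yield a valid topology, so treat the result as a guide to where the methods disagree.

```python
from collections import Counter


def consensus_topology(predictions):
    """Majority vote per residue over equal-length label strings.

    predictions maps a method name to its label string. Returns the
    consensus string and, for each residue, the fraction of methods that
    agree with the consensus label. Ties go to the label seen first.
    """
    labels = list(predictions.values())
    if not labels:
        raise ValueError("no predictions given")
    length = len(labels[0])
    if any(len(s) != length for s in labels):
        raise ValueError("all predictions must have the same length")

    consensus = []
    agreement = []
    for column in zip(*labels):
        label, votes = Counter(column).most_common(1)[0]
        consensus.append(label)
        agreement.append(votes / len(column))
    return "".join(consensus), agreement
```

Regions where the agreement fraction drops are the ones worth inspecting by eye in the PONGO output.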






5. Advanced MEMSAT 3 usage


You can access MEMSAT 3 via the PSIPRED server here:

http://bioinf.cs.ucl.ac.uk/psipred/

You can also download the source code and run the program locally for the
complete output:

http://bioinf.cs.ucl.ac.uk/downloads/memsat/

MEMSAT 3 uses a dynamic programming algorithm to score all possible
topologies before returning the highest scoring one. A high-confidence
prediction is indicated by a large difference in score between the first and
second ranked predictions. If the top-ranked topology seems to be incorrect,
consider the alternatives.
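
The score-gap idea can be illustrated with a few lines of Python. The sketch assumes you have copied the alternative topologies and their scores out of the local MEMSAT 3 output into (description, score) pairs; it does not parse the program's output, and the scores in the usage note are invented.

```python
def rank_topologies(scored):
    """Sort (description, score) pairs from highest to lowest score."""
    return sorted(scored, key=lambda item: item[1], reverse=True)


def score_gap(scored):
    """Difference in score between the best and second-best topology."""
    ranked = rank_topologies(scored)
    if len(ranked) < 2:
        raise ValueError("need at least two topologies to compute a gap")
    return ranked[0][1] - ranked[1][1]
```

For example, score_gap([("6 helices, N-term in", 39.0), ("7 helices, N-term out", 42.5)]) gives 3.5. What counts as a "large" gap is not defined here; compare the gap across proteins you have already checked against a crystal structure to get a feel for it.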

As MEMSAT 3 makes predictions based on PSI-BLAST alignments, it is also
possible to use it in conjunction with a custom database. For example, you
could create a database of mammalian CLN3 sequences and then use MEMSAT 3 to
create a PSI-BLAST profile based solely on those sequences. Taking this
approach may increase prediction accuracy.
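
As a rough illustration of the custom database idea, the sketch below assembles the commands for building a protein database and running PSI-BLAST against it with the NCBI BLAST+ tools (makeblastdb and psiblast). Treat it as an assumption-laden outline: the file names are hypothetical, the MEMSAT 3 distribution may expect the older NCBI tools instead, and the step of handing the resulting profile to MEMSAT 3 is not shown.

```python
import subprocess


def makeblastdb_command(fasta_path, db_name):
    """Command to format a FASTA file as a protein BLAST database."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "prot",
            "-out", db_name]


def psiblast_command(query_path, db_name, pssm_path, iterations=3):
    """Command to run PSI-BLAST and save the profile as an ASCII PSSM."""
    return ["psiblast", "-query", query_path, "-db", db_name,
            "-num_iterations", str(iterations),
            "-out_ascii_pssm", pssm_path]


def run(command):
    """Run one command, raising an error if it fails (needs BLAST+ installed)."""
    subprocess.run(command, check=True)


# Hypothetical file names, for illustration only.
COMMANDS = [
    makeblastdb_command("cln3_mammalian.fasta", "cln3_mammalian"),
    psiblast_command("Q13286.fasta", "cln3_mammalian", "Q13286.pssm"),
]
```

Calling run() on each entry of COMMANDS in order would build the database and then produce the profile, assuming both programs are on your PATH.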