

Advanced Training Workshop in Bioinformatics of Membrane Proteins

Membrane Protein Structure Prediction

David Jones and Tim Nugent


February 2008

1. MEMSAT 3 via the Fpred Server

First, we are going to predict some topologies using MEMSAT 3 via the Fpred
server. Fpred performs feature-based function prediction based on BLAST
profiles. These profiles are also used to run MEMSAT 3, and the results are
displayed graphically. You can search by either:

i) IPI accession number

ii) FASTA sequence

You can access the Fpred server here:

A good protein to start with is Bacteriorhodopsin. Obtain the FASTA sequence
by searching for its accession number.

Paste the sequence into Fpred and wait for the results. If the sequence has been
processed recently, you may be able to click on a link taking you directly to the
results. Otherwise click here:

A good site for investigating the topologies of membrane proteins which have
crystal structures is OPM (Orientation of Proteins in Membranes):

You can access OPM here:

Search for Bacteriorhodopsin using the PDB code 1m0l, and compare the
MEMSAT 3 results with OPM's topology, which is derived from the crystal
structure. Check the position of the N-terminus, the number of predicted
transmembrane helices and their boundaries. Each TM helix should overlap the
known position by at least 5 residues in a good prediction.
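The overlap criterion can be checked by hand, but a minimal Python sketch may make it concrete. It assumes helices are given as inclusive (start, end) residue pairs listed in sequence order. The function names and the coordinates are made up for illustration and are not real Bacteriorhodopsin data.

```python
def overlap(a, b):
    """Number of residues shared by two inclusive (start, end) segments."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def helices_match(predicted, known, min_overlap=5):
    """True if the helix counts agree and each pair overlaps by >= min_overlap."""
    if len(predicted) != len(known):
        return False
    return all(overlap(p, k) >= min_overlap for p, k in zip(predicted, known))


# Hypothetical coordinates for illustration only.
known = [(10, 30), (45, 65), (80, 100)]
good = [(12, 31), (44, 62), (83, 104)]
shifted = [(28, 48), (45, 65), (80, 100)]

print(helices_match(good, known))     # True
print(helices_match(shifted, known))  # False: first helix overlaps by only 3 residues
```

This sketch only compares helix positions. The position of the N-terminus (inside or outside) still has to be compared separately.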

Next, take the FASTA sequence for Formate dehydrogenase (P0AEK7) and run
it through Fpred. Again, compare the results to the topology at OPM using the
PDB code 1kqf.

2. Transmembrane proteins with signal peptides

Get the FASTA sequence for the Acetylcholine receptor using accession
P02711. Compare the MEMSAT 3 topology prediction from Fpred with that at
OPM using PDB ID 2bg

Is the prediction correct? If not, why not? Look at the feature annotations on
the Uniprot page to see if you can find a reason.

Like transmembrane helices, signal peptides have a high proportion of
hydrophobic residues and are frequently predicted as TM helices. A
misprediction such as this will heavily disrupt the predicted topology, with
inside and outside loop regions often being reversed. It is often useful to run a
dedicated signal peptide predictor such as SignalP before making a prediction.
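To see why a signal peptide can look like a TM helix, it may help to plot or scan a simple hydropathy profile. The sketch below uses the Kyte-Doolittle scale with a sliding window. This is not how MEMSAT 3 or SignalP work, and the window size, threshold and toy sequence are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean Kyte-Doolittle score of every full window along the sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(seq, window=9, threshold=1.6):
    """0-based start positions of windows whose mean score exceeds the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i for i, score in enumerate(profile) if score > threshold]


# Made-up sequence: charged start, hydrophobic stretch, then polar residues,
# loosely imitating the layout of an N-terminal signal peptide.
toy = "MKK" + "LLVLALLAVAA" + "SQDEKRNSTDE"
print(hydrophobic_windows(toy))
```

A hydrophobic stretch close to the N-terminus is a hint that a signal peptide may be present, which is when a dedicated predictor is worth running.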

3. Transmembrane proteins with re-entrant helices

Another substructure often found alongside transmembrane helices is the
re-entrant helix. These are helices that enter and exit the membrane from the
same side, i.e. they do not span the bilayer. Like signal peptides, re-entrant
helices often have a hydrophobic profile, so are commonly predicted as
transmembrane helices.

Paste the FASTA sequence for the Chloride channel (P37019) into Fpred.
Compare the results with the topology at OPM using PDB ID 1ots.

Does the prediction agree with OPM? The OPM topology lists regions within
the plane of the membrane; it does not discriminate between transmembrane
helices and re-entrant helices.

The Chloride channel actually contains the following 3 re-entrant helices:

A. 127-141, B. 179-190, C. 389-402

Download the PDB structure from OPM and open it using RasMol. You can
use the following commands to highlight these re-entrant helices. Compare
them to the true transmembrane helices and note how they do not fully span the
membrane.
RasMol> select 127-141

RasMol> colour red
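Typing the commands for each helix in turn works fine, but the same commands can also be generated for all three helices at once. The sketch below assumes each residue pair listed above is the start and end of a helix. The function name and colour choice are illustrative, and the selection matches those residue numbers in every chain of the file.

```python
# Residue ranges of the three re-entrant helices, read as (start, end) pairs.
REENTRANT = [(127, 141), (179, 190), (389, 402)]


def rasmol_script(ranges, colour="red"):
    """Build RasMol select/colour commands for the given residue ranges."""
    lines = []
    for start, end in ranges:
        lines.append(f"select {start}-{end}")
        lines.append(f"colour {colour}")
    return "\n".join(lines) + "\n"


script = rasmol_script(REENTRANT)
print(script)
```

The printed commands can be pasted at the RasMol prompt one line at a time, or saved to a text file and loaded from within RasMol.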

4. The PONGO Server

PONGO is a webserver that runs a number of the top transmembrane topology
prediction methods on a given sequence. The results are displayed graphically,
allowing easy comparison. A consensus approach is particularly useful with
topologies that are difficult to predict. Currently, the methods PONGO uses
include TMHMM 2.0 and MEMSAT 3.

You can access the PONGO server here:

A protein whose transmembrane topology has caused significant controversy is
the Batten disease protein, CLN3. Download the FASTA sequence from
Uniprot using the accession number Q13286. Paste the sequence into the
PONGO server and wait for the results. This time, there is no crystal structure
to compare the results to!

How well do the different methods agree with each other?

What do you think the true topology of the protein is?
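One way to put a number on how well the methods agree is a per-residue vote. The sketch below assumes each method's output has been transcribed by hand into a string of equal length, using i (inside), M (membrane) and o (outside). This encoding, the function names and the example strings are assumptions for illustration and are not PONGO's output format.

```python
from collections import Counter


def consensus(predictions):
    """Per-residue majority vote over equal-length topology strings.

    Ties are resolved in favour of the label seen first in that column.
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same sequence")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*predictions)
    )


def agreement(predictions):
    """Fraction of positions where every method gives the same label."""
    same = sum(1 for column in zip(*predictions) if len(set(column)) == 1)
    return same / len(predictions[0])


# Made-up outputs from three hypothetical methods for a 14-residue toy sequence.
preds = [
    "iiiiMMMMMMoooo",
    "iiiMMMMMMMoooo",
    "iiiiMMMMMooooo",
]
print(consensus(preds))            # iiiiMMMMMMoooo
print(round(agreement(preds), 2))  # 0.86
```

A per-residue vote is only a rough guide: it can produce helices that are too short or a topology that no single method proposed, so the result still needs to be inspected by eye.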

5. Advanced MEMSAT 3 usage

You can access MEMSAT 3 via the PSIPRED server here:

You can also download the source code and run the program locally for the
complete output:

MEMSAT 3 uses a dynamic programming algorithm to produce all possible
topologies before returning the highest scoring one. A high-confidence
prediction is indicated by a large difference in score between the first and
second predictions. If the top-ranked topology seems to be incorrect,
consider the alternatives.
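The score comparison described above can be sketched in a few lines of Python. The candidate descriptions and scores below are made up, and the code assumes the alternative topologies have already been copied out of the MEMSAT 3 output by hand. It does not parse the real output format.

```python
def rank_topologies(candidates):
    """Sort (description, score) pairs from highest to lowest score."""
    return sorted(candidates, key=lambda c: c[1], reverse=True)


def score_margin(candidates):
    """Score difference between the best and second-best topology."""
    ranked = rank_topologies(candidates)
    if len(ranked) < 2:
        raise ValueError("need at least two candidate topologies")
    return ranked[0][1] - ranked[1][1]


# Made-up scores; real MEMSAT 3 values and output layout will differ.
candidates = [
    ("6 helices, N-terminus in", 39.8),
    ("7 helices, N-terminus out", 41.2),
    ("7 helices, N-terminus in", 25.0),
]
print(rank_topologies(candidates)[0][0])   # 7 helices, N-terminus out
print(round(score_margin(candidates), 1))  # 1.4
```

In this toy case the margin is small compared with the scores themselves, so the second-ranked topology would deserve a closer look. What counts as a large margin is a judgment call that depends on the protein.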

As MEMSAT 3 makes predictions based on PSI-BLAST alignments, it is also
possible to use it in conjunction with a custom database. For example, you
could create a database of mammalian CLN3 sequences and then use
MEMSAT 3 to create a PSI-BLAST profile based solely on those sequences.
Taking this approach may increase prediction accuracy.
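As a first step towards such a custom database, the sequences of interest have to be collected into a single FASTA file. The sketch below filters FASTA records by keywords in the header line. The headers, sequences and species keywords are made up, and the resulting file would still need to be formatted as a BLAST database with whatever tool your local MEMSAT 3 installation expects.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def filter_records(records, keywords):
    """Keep records whose header contains any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [(h, s) for h, s in records if any(k in h.lower() for k in wanted)]


def to_fasta(records, width=60):
    """Write records back out as FASTA text with wrapped sequence lines."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Made-up records; real entries would come from a sequence database search.
example = """>seq1 CLN3 Homo sapiens
MGGCAGSRRR
>seq2 CLN3 Mus musculus
MGSSAGSWRR
>seq3 CLN3 Saccharomyces cerevisiae
MSDKSHQIYC
"""
mammals = filter_records(read_fasta(example), ["Homo sapiens", "Mus musculus"])
print(to_fasta(mammals))
```

Filtering on header text is crude. Selecting sequences by taxonomy in the source database is more reliable when that option is available.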