Protein Bioinformatics PH260

__________________

Name (please print)





Protein Bioinformatics PH260.655


Final Exam


=> take-home questions

=> open-note

=> please use either

   * the Word file to type your answers

   or

   * print out the PDF and hand-write your answers

=> due: May 20th, 2010, 3:30 pm








GOOD LUCK!


In HW#5, you learned about the importance of the Pentose Phosphate Pathway and one of its key
enzymes for Agrobacterium tumefaciens. The other key enzyme of this pathway, which was detected
as highly abundant in the pH 7 mass spectrometry data set, is Transketolase.

You become curious about the nature of this protein and start to explore it a bit more, e.g. in
order to have enough structure-function information for subsequent cloning / mutagenesis /
purification experiments.

Please retrieve the protein sequence of Transketolase (TktA) from A. tumefaciens C58 in FASTA
format from BioCyc.




1) While looking up the protein sequence in BioCyc, you become aware of the gene reaction
schematic, and you see that Agrobacterium actually harbors three genes associated with a
transketolase function: tkt, tktB and tktA.

They are thought to have originated from gene duplication, but while tkt and tktB are located on
the main, circular chromosome, tktA is located on a second, linear chromosome.

Please circle the correct expression below:

1a) tkt and tktB are      orthologs      paralogs ?

1b) tktB and tktA are     orthologs      paralogs ?



2) Now find out about some of TktA's physical properties, choosing an appropriate proteomics
tool, e.g. from the ExPASy server.

2a) Molecular weight:

2b) pI:

2c) Average hydropathy:

2d) Please briefly rank the value you found in 2c by marking one of the words below:

    hydrophobic        amphiphilic        hydrophilic
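
For orientation on 2c and 2d: the average hydropathy (GRAVY) reported by tools such as ProtParam is the mean Kyte-Doolittle hydropathy over all residues, with positive values pointing towards a more hydrophobic protein and negative values towards a more hydrophilic one. A minimal Python sketch, assuming the Kyte-Doolittle scale and a made-up toy sequence (not TktA); the function name `gravy` is chosen here for illustration:

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean hydropathy of a protein sequence; non-standard letters are skipped."""
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


# Toy sequence, invented for illustration only.
toy_sequence = "MKTAYIAKQR"
print(round(gravy(toy_sequence), 3))
```

Pasting the FASTA sequence (without its header line) into `gravy` should give a value comparable to the one reported by the web tool.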


3) Of course, you need to know whether a crystal structure is known for your protein or at
least for a homolog:

Search for structures using the PDB's advanced search option for FASTA / BLAST sequences.

3a) Please briefly explain the significance of e-values.

3b) Which e-value cut-off do you have to choose in order to obtain fewer than 30 results?
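
As background for 3a and 3b: in BLAST-type searches the e-value estimates how many hits with at least a given score would be expected by chance in a database of that size, so smaller values mean more significant hits. A small Python sketch, assuming the standard relation E = m * n * 2^(-S') between bit score S', query length m and database length n; the function names and the hit list are invented for illustration:

```python
def e_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits scoring at least `bit_score`."""
    return query_len * db_len * 2.0 ** (-bit_score)


def hits_below_cutoff(hits, cutoff):
    """Keep the names of hits whose e-value does not exceed the cut-off."""
    return [name for name, e in hits if e <= cutoff]


# Hypothetical (name, e-value) pairs, not real search results.
toy_hits = [("hit_a", 1e-120), ("hit_b", 3e-40), ("hit_c", 0.5)]
print(hits_below_cutoff(toy_hits, 1e-10))
```

Tightening the cut-off shortens the list, which is the effect 3b asks you to explore on the PDB search page.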


4) Alas, there is no Transketolase structure from Agrobacterium in the database. Instead, you
decide to have a closer look at one of the E. coli structures with a good e-value, 2R5N, because
it contains several ligands.

=> Please download the FASTA sequence associated with 2R5N and the PDB file of 2R5N.

4a) 1 polymer and 2 chains are reported in the 2R5N structure. What does that mean for TktA's
oligomerization state and the number of assigned domains (see "Derived Data" section)?


4b) Look at the "Derived Data" section and summarize in general terms or with the help of a
schematic what the SCOP, CATH and Pfam classifications tell you about the enzyme's overall
folding architecture and functional domains.


=> Please state in simple terms which domains are predicted to be involved in binding the
   cofactor TPP (= Thiamine pyrophosphate = Thiamine diphosphate = ThDP).















4c) Derive the annotated domain boundaries from the "Sequence" section (use the SCOP
annotation) and list the respective residue numbers and names below:





[Figure labels: thiazolium ring, amino-pyrimidine moiety]

5) You find one page of an old publication about Transketolase function, showing the multiple
sequence alignment below. You vaguely recall that several entirely conserved Histidine residues
are important for catalysis, but you cannot look the paper up because the library server is down
for maintenance until the end of the week.

CLUSTAL 2.0.12 multiple sequence alignment

2R5N_Transketolase  DAVQKAKSGHPGAPMGMADIAEVLWRDFLKHNPQNPSWADRDRFVLSNGHGSMLIYSLLH 76
Y. pestis           DAVQKAKSGHPGAPMGMADIAEVLWRDYLNHNPTNPHWADRDRFVLSNGHGSMLIYSLLH 76
V. cholerae         [row missing in source]
P. aeruginosa       DAVQKANSGHPGAPMGMADIAEVLWRDYMQHNPSNPQWANRDRFVLSNGHGSMLIYSLLH 76
A. tumefaciens      DAVEKANSGHPGLPMGAADVATVLFTRYLKFDPKAPLWADRDRFVLSAGHGSMLLYSLLY 79
M. tuberculosis     DAVQKVGNGHPGTAMSLAPLAYTLFQRTMRHDPSDTHWLGRDRFVLSAGHSSLTLYIQLY 120
                    *.*:*. .**** .*. * :* .*:   :..:*  . * .******* **.*: :*  *:

2R5N_Transketolase  LTGYD-LPMEELKNFRQLHSKTPGHPEVGYTAGVETTTGPLGQGIANAVGMAIAEKTLAA 135
Y. pestis           LTGYD-LPMEELKNFRQLHSKTPGHPEYGYTAGVETTTGPLGQGIANAVGFAIAERTLGA 135
V. cholerae         LSGYE-LSIDDLKNFRQLHSKTPGHPEYGYAPGIETTTGPLGQGITNAVGMAIAEKALAA 164
P. aeruginosa       LTGYD-LGIEDLKNFRQLNSRTPGHPEYGYTAGVETTTGPLGQGIANAVGMALAEKVLAA 135
A. tumefaciens      LTGYEDMTIDEIKRFRQFGSKTAGHPEYGHATGIETTTGPLGQGIANAVGMAIAERKLEE 139
M. tuberculosis     LGGFG-LELSDIESLRTWGSKTPGHPEFRHTPGVEITTGPLGQGLASAVGMAMASRYERG 179
                    * *:  : :.::: :*   *:*.****  ::.*:* ********::.***:*:*.:

2R5N_Transketolase  QFN---RPGHDIVDHYTYAFMGDGCMMEGISHEVCSLAGTLKLGKLIAFYDDNGISIDGH 192
Y. pestis           QFN---RPGHDIVDHHTYAFMGDGCMMEGISHEVCSLAGTMKLGKLTAFYDDNGISIDGH 192
V. cholerae         QFN---KPGHDIVDHFTYVFMGDGCLMEGISHEACSLAGTLGLGKLIAFWDDNGISIDGH 221
P. aeruginosa       QFN---RDGHAVVDHYTYAFLGDGCMMEGISHEVASLAGTLRLNKLIAFYDDNGISIDGE 192
A. tumefaciens      EF------GSDLQSHFTYVLCGDGCLMEGISHEAIALAGHLKLNKLVLFWDDNNITIDGE 193
M. tuberculosis     LFDPDAEPGASPFDHYIYVIASDGDIEEGVTSEASSLAAVQQLGNLIVFYDRNQISIEDD 239
                     *      *    .*. *.: .** : **:: *. :**.   *.:*  *:* * *:*:..

2R5N_Transketolase  VEGWFTDDTAMRFEAYGWHVIRDIDGHDAASIKRAVEEARAVTDKPSLLMCKTIIGFGSP 252
Y. pestis           VEGWFTDDTAARFEAYGWHVVRGVDGHNADSIKAAIEEAHKVTDKPSLLMCKTIIGFGSP 252
V. cholerae         VEGWFSDDTPKRFEAYGWHVIPAVDGHDADAINAAIEAAKAETSRPTLICTKTIIGFGSP 281
P. aeruginosa       VHGWFTDDTPKRFEAYGWQVIRNVDGHDADEIKTAIDTARKS-DQPTLICCKTVIGFGSP 251
A. tumefaciens      VGLSDSTDQIARFQAVHWNTIR-VDGHDPDAIAAAIEAAQKS-DRPTFIACKTVIGFGAP 251
M. tuberculosis     TNIALCEDTAARYRAYGWHVQEVEGGENVVGIEEAIANAQAVTDRPSFIALRTVIGYPAP 299
                    .      *   *:.*  *:.    .*.:   *  *:  *:   .:*:::  :*:**: :*

2R5N_Transketolase  NKAGTHDSHGAPLGDAEIALTREQLGWKY-APFEIPSEIYAQWDAKEAGQAK-ESAWNEK 310
Y. pestis           NKAGTHDSHGAPLGEAEVAATREALGWKY-PAFEIPQDIYAAWDAKEAGKAK-EAAWNEK 310
V. cholerae         NKAGSHDCHGAPLGNDEIKAAREFLGWEH-APFEIPADIYAAWDAKQAGASK-EAAWNEK 339
P. aeruginosa       NKQGKEECHGAPLGADEIAATRAALGWEH-APFEIPAQIYAEWDAKETGAAQ-EAEWNKR 309
A. tumefaciens      NKQGTHKVHGNPLGAEEIAAARKSLNWEA-EAFVIPEDVLDAWRLAGLRSTKTRQDWEAR 310
M. tuberculosis     NLMDTGKAHGAALGDDEVAAVKKIVGFDPDKTFQVREDVLTHTRGLVARGKQAHERWQLE 359
                    *  .. . ** .**  *:  .:  :.:.   .* :  ::            : .  *: .

2R5N_Transketolase  FAAYAKAYPQEAAEFTRRMKGEMPSDFDAKAKEFIAKLQANPAKIASRKASQNAIEAFGP 370
Y. pestis           FAAYAKAYPELAAEFKRRVSGELPANWAVESKKFIEQLQANPANIASRKASQNALEAFGK 370
V. cholerae         FAAYAKAYPAEAAEYKRRVAGELPANWEAATSEIIANLQANPANIASRKASQNALEAFGK 399
P. aeruginosa       FAAYQAAHPELAAELLRRLKGELPADFAEKAAAYVADVANKGETIASRKASQNALNAFGP 369
A. tumefaciens      LEATETAK---KAEFKRRFAGDLPGNFDSSIDAFKKKIIENNPTVATRKASEDSLEVING 367
M. tuberculosis     FDAWARREPERKALLDRLLAQKLPDGWDADLP----HWEPGSKALATRAASGAVLSALGP 415
                    : *         *   * .  .:* .:         .       :*:* **   :..:.

2R5N_Transketolase  LLPEFLGGSADLAPSNLTLWSGSKAIN----------EDAAGNYIHYGVREFGMTAIANG 420
Y. pestis           VLPEFLGGSADLAPSNLTIWSGSKSLS----------DDLAGNYIHYGVREFGMSAIMNG 420
V. cholerae         LLPEFMGGSADLAPSNLTMWSGSKSLTA---------EDASGNYIHYGVREFGMTAIING 450
P. aeruginosa       LLPELLGGSADLAGSNLTLWKGCKGVSA---------DDAAGNYVFYGVREFGMSAIMNG 420
A. tumefaciens      ILPEMVGGSADLTPSNNTKTSQMKSITP---------TDFSGRYLHYGIREHGMAAAMNG 418
M. tuberculosis     KLPELWGGSADLAGSNNTTIKGADSFGPPSISTKEYTAHWYGRTLHFGVREHAMGAILSG 475
                     ***: ******: ** *  .  ...            .  *. :.:*:**..* *  .*

2R5N_Transketolase  ISLHGGFLPYTSTFLMFVEYARNAVRMAALMKQRQVMVYTHDSIGLGEDGPTHQPVEQVA 480
Y. pestis           IALHGGFIPYGATFLMFVEYARNAVRMAALMKIRSVFVYTHDSIGLGEDGPTHQPVEQMA 480
V. cholerae         IALHGGFVPYGATFLMFMEYARNAMRMAALMKVQNIQVYTHDSIGLGEDGPTHQPVEQIA 510
P. aeruginosa       VALHGGFIPYGATFLIFMEYARNAVRMSALMKQRVLYVFTHDSIGLGEDGPTHQPIEQLA 480
A. tumefaciens      IALHGGLIPYAGGFLIFSDYCRPSIRLAALMGIRVVHVLTHDSIGVGEDGPTHQPVEQIA 478
M. tuberculosis     IVLHGPTRAYGGTFLQFSDYMRPAVRLAALMDIDTIYVWTHDSIGLGEDGPTHQPIEHLS 535
                    : ***   .* . ** * :* * ::*::***    : * ******:*********:*:::

5a) Explain the symbols underneath the alignment:

.

*

:


5b) List all completely conserved Histidines. Please include their residue numbers corresponding
to the 2R5N sequence.
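
One way to approach 5b systematically is to scan the alignment column by column and keep the columns where every row carries a histidine, counting residues along the 2R5N row to obtain its numbering. The sketch below is a minimal Python illustration on invented toy rows, not on the exam alignment; `conserved_positions` and `ref_start` are hypothetical names, and `ref_start` is the residue number of the first reference residue in the pasted alignment (the running totals at the end of each alignment row can be used to work it out).

```python
def conserved_positions(rows, reference, residue, ref_start=1):
    """Return reference residue numbers of columns where all rows show `residue`.

    rows      -- dict mapping row name to aligned sequence ('-' marks a gap)
    reference -- name of the row whose numbering is reported
    ref_start -- residue number of the first non-gap reference character
    """
    if len({len(seq) for seq in rows.values()}) != 1:
        raise ValueError("all aligned rows must have the same length")

    hits = []
    resno = ref_start - 1
    for col, ref_aa in enumerate(rows[reference]):
        if ref_aa != "-":
            resno += 1  # gaps do not consume a residue number
        if ref_aa == residue and all(seq[col] == residue for seq in rows.values()):
            hits.append(resno)
    return hits


# Toy alignment, invented for illustration only.
toy_rows = {
    "ref":   "AH-GH",
    "seq_b": "AHKGH",
    "seq_c": "AHKGY",
}
print(conserved_positions(toy_rows, "ref", "H", ref_start=10))
```

Concatenating the blocks of each row into one long string per organism before calling the function keeps the residue count continuous across blocks.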




Now you would like to visualize your findings in 3D. Please open 2R5N in Jmol.

=> HINT: It might be helpful to use the "select" commands in the Jmol script command line,
   as shown in the Extra-homework set posted under May 18th.


6) From your answer in 4c, color the functional domains according to the assignment by SCOP.
It is okay to color both chains.

Please draw a crude schematic of the overall protein architecture (comprising both chains) using
oval- or egg-like shapes for each domain.

Please indicate inter-domain contacts.













7) Examine the structure more closely and identify the different types of ligands by mousing
over them and comparing them with the Ligand Component section on the PDB structure
summary page.

=> Select all ethanediol molecules and hide everything else.

   (Hint: Ligands are considered "Hetero" atoms in Jmol ...)

=> How many ethanediol molecules do you see? Where could they have originated from?
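
As a cross-check outside Jmol, ligand copies can also be counted straight from the HETATM records of the downloaded PDB file. The sketch below is a hedged Python illustration on invented toy records; it assumes the standard fixed-column PDB layout (residue name in columns 18-20, chain in column 22, residue number in columns 23-26) and that ethanediol is stored under the residue name EDO, which should be confirmed in the Ligand Component section. `make_hetatm` and `count_ligand_copies` are hypothetical helper names.

```python
def make_hetatm(serial, atom, resname, chain, resseq, x, y, z):
    """Build a toy fixed-width HETATM line (illustration only)."""
    return "{:<6}{:>5} {:<4} {:>3} {:1}{:>4}    {:>8.3f}{:>8.3f}{:>8.3f}".format(
        "HETATM", serial, atom, resname, chain, resseq, x, y, z)


def count_ligand_copies(pdb_text, resname):
    """Count distinct (chain, residue number, insertion code) entries for a ligand."""
    copies = set()
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and line[17:20].strip() == resname:
            copies.add((line[21], line[22:26].strip(), line[26:27]))
    return len(copies)


# Toy records, invented for illustration: two ethanediol copies and one calcium.
toy_pdb = "\n".join([
    make_hetatm(1, "C1", "EDO", "A", 701, 1.0, 2.0, 3.0),
    make_hetatm(2, "O1", "EDO", "A", 701, 1.5, 2.5, 3.5),
    make_hetatm(3, "C1", "EDO", "B", 702, 9.0, 8.0, 7.0),
    make_hetatm(4, "CA", "CA", "A", 800, 0.0, 0.0, 0.0),
])
print(count_ligand_copies(toy_pdb, "EDO"))
```

Reading the real file with `open("2R5N.pdb").read()` (file name assumed) and passing the text to `count_ligand_copies` gives a number to compare with what you see in Jmol.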





[Example schematic for 6): domain a, domain b, domain c, inter-domain contact]

You might want to change their color to "translucent" and select all atoms again.

The important ligands should now be better visible.

Using the center and zoom tools in Jmol, navigate the structure to a position from where you
have a good overview of the structural environment of a sugar and a TPP ligand in one of the
active sites.


8) Now, you want to distinguish between conserved Histidines in the active site and elsewhere in
the structure:

=> Choose a clear representation to display all conserved His residues found in 5b
   and note their residue numbers.

=> Hint: A good strategy could again be to use the command line selection tools.


8a) Which Histidines are NOT in the active site?




8b) List those Histidines which ARE in the active site and also indicate which domain they are
located on.





8c) Focus on the Histidine residue that interacts closely with the phosphate moiety of the sugar
and note the His residue number.




8d) Measure the shortest distance between an atom of this His and the sugar. What is the
distance?
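
The distance measured in Jmol for 8d can be double-checked from the coordinates in the PDB file. The following Python sketch is only an illustration on invented toy atoms: it assumes the standard fixed-column PDB layout (x, y, z in columns 31-54), and the residue name `LIG`, the residue numbers and the helper names `make_line`, `atoms_of` and `shortest_contact` are hypothetical placeholders, not values from 2R5N.

```python
import math


def make_line(record, serial, atom, resname, chain, resseq, x, y, z):
    """Build a toy fixed-width ATOM/HETATM line (illustration only)."""
    return "{:<6}{:>5} {:<4} {:>3} {:1}{:>4}    {:>8.3f}{:>8.3f}{:>8.3f}".format(
        record, serial, atom, resname, chain, resseq, x, y, z)


def atoms_of(pdb_text, resname, chain, resseq):
    """Return (atom name, (x, y, z)) for every atom of one residue or ligand."""
    found = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if (line[17:20].strip() == resname and line[21] == chain
                and int(line[22:26]) == resseq):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            found.append((line[12:16].strip(), xyz))
    return found


def shortest_contact(group_a, group_b):
    """Return (distance, atom in a, atom in b) for the closest atom pair."""
    return min((math.dist(pa, pb), na, nb)
               for na, pa in group_a for nb, pb in group_b)


# Toy coordinates, invented for illustration only.
toy_pdb = "\n".join([
    make_line("ATOM", 1, "NE2", "HIS", "A", 100, 0.0, 0.0, 0.0),
    make_line("ATOM", 2, "CD2", "HIS", "A", 100, 1.0, 0.0, 0.0),
    make_line("HETATM", 3, "O1", "LIG", "A", 900, 0.0, 3.0, 0.0),
    make_line("HETATM", 4, "P", "LIG", "A", 900, 0.0, 5.0, 0.0),
])
his = atoms_of(toy_pdb, "HIS", "A", 100)
lig = atoms_of(toy_pdb, "LIG", "A", 900)
print(shortest_contact(his, lig))
```

The atom names returned alongside the distance indicate which pair is closest, which is useful when judging the type of interaction.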




8e) What kind of bond is formed between the sugar and this Histidine residue?




9) There is a calcium atom present in the active site.

9a) What atoms are bound to the calcium that are not part of the protein?




9b) Which molecule do these atoms belong to?





===================> PLEASE TURN OVER <====================


10) For Extra credit: Homology Modelling of TktA from Agrobacterium tumefaciens using Swiss Model.


Having dealt in so much detail with a homologous structure from E. coli, you would now like to
get an impression of what a 3D model of the Agrobacterium enzyme might look like.


http://swissmodel.expasy.org/

=> Automated Mode


Submit a modelling request using your email address and A. tumefaciens' TktA protein sequence.
You will be notified by email when the process is finished (typically takes 30-60 min.). To
retrieve the result, you will need to enter a user code which will be sent to your email address.


=> What structure did Swiss Model choose as a homology modeling template? Please list the
   PDB code.



=> Did you encounter this structure code already somewhere in the course of the exam?