Structural Databases the PDBe - European Bioinformatics Institute


Structure Databases:

The Protein Data Bank

Swanand Gore & Gerard Kleywegt

PDBe, EBI

May 7th 2010, 9-10 am

Macromolecular Crystallography Course

Outline


Structural Biology and Bioinformatics



Databases in Structural Bioinformatics



Protein Data Bank



PDBe

Promise of Structural Biology


Basic research


Insights in biophysics of folding


Insights into Evolution


Insights into enzymatic catalysis



Applications


Design of drug / antibody / epitope / pesticide / enzymes


Design of new materials


Understanding disease



Structural bioinformatics


Big computational and informatics toolbox


Full of techniques to translate insights to application


Databases are a vital aspect


Sequence-Structure-Function

Sequence

Function

Prediction

Modelling

Determination

Archival / Retrieval

Classification

Structure

Searching

Mining

Comparison

Alignment

Design

Engineering

A rich toolbox

Structural Bio-info-computing

Structure
Refinement

Databases

Annotation

Classification

Comparison

Analysis
Mining

Prediction

Databases are central to structural
bioinformatics pipeline

Primary Structural Databases

Determine

Annotate

Align

Compare

Mine

Classify

Model

Predict

Secondary Structural Databases

Databases help in Structure
Determination


Dihedral preferences


Ramachandran contours

Sidechain rotamer libraries

RNA backbone and puckers

Likely ring conformations

Small-molecules (CCDC)

Molecular replacement

Choice of probe using homology

Fragment-based MR

Validation

Electron density server and PrEDS


Dunbrack, R.L., Jr. Rotamer libraries in the 21st century. Curr. Opin. Struct. Biol. 12:431-440, 2002.

Jane S. Richardson et al. (2008) "RNA Backbone: Consensus All-angle Conformers and Modular String Nomenclature (an RNA Ontology Consortium contribution)" RNA 14:465-481

The Cambridge Structural Database: a quarter of a million crystal structures and rising, F. H. Allen, Acta Cryst., B58, 380-388, 2002

S.C. Lovell et al. (2003) "Structure Validation by Cα Geometry: φ,ψ and Cβ Deviation." Proteins: Structure, Function and Genetics 50, 437-450.

Claude et al. CaspR: a web server for automated molecular replacement using homology modelling. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W606-9.

McCoy, A.J., Grosse-Kunstleve, R.W., Adams, P.D., Winn, M.D., Storoni, L.C. and Read, R.J. (2007). Phaser crystallographic software. J. Appl. Cryst. 40: 658-674.

Gubbi et al. (2007) Solving Protein Structures Using Molecular Replacement Via Protein Fragments, Lecture Notes In Artificial Intelligence; Vol. 4578. 627.

GJ Kleywegt et al. (2004) "The Uppsala Electron-Density Server", Acta Crystallographica, D60, 2240-2249

Databases are vital to archiving structures!

Structures represent invaluable scientific insights

But it is costly to solve a structure

Time, effort, money

Organize and safe-keep painstakingly determined data

Formal mechanisms of arranging, searching, backing up

Wide-ranging access to an invaluable repository without compromising data integrity

Very low cost of maintenance in comparison with the cost of content!

Databases are vital to archiving structures


“Database is a structured collection of data held in computer storage, often incorporating software to make it accessible in various ways”

Databases

Provide accessibility with safety and persistence

Provide context for your data against other data

Facilitate comparisons and data-mining

Primary structural databases

Experimental data and model coordinates

NDB, wwPDB, BMRB, CSD, EMDB

Secondary structural databases

Classification, function annotation

SCOP, EC2PDB, PALI, and many many more!

Databases / Archival / Retrieval


Formats of databases


Flat files (csv, tsv, columnar), supporting scripts

Relational (MySQL, Oracle): professional, indexed

Access

Modes: read, write, edit, delete (PDB provides entry deposition mechanisms)

Means: Download (wwPDB ftp), Command-line or GUI (SQL queries, Oracle desktop client), Web-based interfaces (PDBeDatabase service)

Access frequency

Schema design

Tables, primary keys, foreign keys, views…

Normal forms: avoid data repetition, inconsistencies
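The schema-design points above (tables, primary keys, foreign keys, views, normalisation) can be sketched with Python's built-in sqlite3 module. This is only a toy illustration: the table names, column names and stored values are hypothetical and are not the schema of the PDBe or any other real database.

```python
import sqlite3

# Hypothetical, heavily simplified schema: one row per entry, one row per chain.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE entry (
    entry_id   TEXT PRIMARY KEY,          -- primary key: one row per entry
    method     TEXT NOT NULL,
    resolution REAL
);
CREATE TABLE chain (
    entry_id   TEXT NOT NULL REFERENCES entry(entry_id),  -- foreign key
    chain_id   TEXT NOT NULL,
    n_residues INTEGER NOT NULL,
    PRIMARY KEY (entry_id, chain_id)
);
-- The method is stored once per entry (no repetition per chain);
-- a view joins it back in when a per-chain listing is needed.
CREATE VIEW chain_with_method AS
    SELECT c.entry_id, c.chain_id, c.n_residues, e.method
    FROM chain AS c JOIN entry AS e ON e.entry_id = c.entry_id;
""")

# Illustrative values only.
conn.execute("INSERT INTO entry VALUES ('1CBS', 'X-RAY DIFFRACTION', 1.8)")
conn.execute("INSERT INTO chain VALUES ('1CBS', 'A', 137)")

rows = conn.execute("SELECT * FROM chain_with_method").fetchall()
print(rows)
```

Because the method lives only in the entry table, correcting it in one place keeps every chain row consistent, which is the point of the normal forms mentioned above. With foreign keys switched on, inserting a chain for an entry that does not exist is rejected.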

Databases for Classification


Structural hierarchy


CATH


Class, Architecture, Topology, Homology


SCOP


Class, Fold, Superfamily, Family

Enzyme hierarchy

EC-PDB

Oxidoreductase, ligase, lyase, isomerase, hydrolase, transferase.

Functional ontology

GOA

Gene Ontology: Cellular component, Biological process, Molecular Function

Linked to structures via SIFTS




Christos A. Ouzounis et al. (2005) Classification schemes for protein structure and function. Nature Reviews Genetics 4, 508-519.

Andreeva et al. (2007) Data growth and its impact on the SCOP database: new developments. Nucleic Acids Res. 36:D419

Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium (2000) Nature Genet. 25: 25-29

Barrell D. et al. (2009) The GOA database in 2009: an integrated Gene Ontology Annotation resource. Nucleic Acids Research 2009 37: D396-D403.

Databases for Comparison


Structural and structure-sequence alignments










Phylogeny


Evolutionary trace


Evolutionarily important residues


Mapping onto structure


Mizuguchi K, Deane CM, Blundell TL, Overington JP. (1998) HOMSTRAD: a database of protein structure alignments for homologous families. Protein Science 7:2469-2471.

SISYPHUS - structural alignments for proteins with non-trivial relationships. Andreeva et al., Nucleic Acids Research Database Issue 2007, 35, D253-D259

Gowri, V. S. et al. (2003). Integration of related sequences with protein three-dimensional structural families in an updated version of PALI database. Nucleic Acids Res. 2003 31: 486-488.

Bhaduri A, Pugalenthi G, Sowdhamini R. PASS2: an automated database of protein alignments organised as structural superfamilies. BMC Bioinformatics. 2004, 5:35

DBAli tools: mining the protein structure space. Marc A. Marti-Renom et al. Nucleic Acids Research, doi:10.1093/nar/gkm236

Whelan, S., P.I.W. de Bakker, & N. Goldman. (2003). Pandit: a database of protein and associated nucleotide domains with inferred trees. Bioinformatics 19:1556-1563

The Pfam protein families database. R.D. Finn et al., Nucleic Acids Research (2010) Database Issue 38:D211-222

Morgan, D.H., D.M. Kristensen, D. Mittleman, and O. Lichtarge. ET Viewer: An Application for Predicting and Visualizing Functional Sites in Protein Structures. Bioinformatics. 2006 Aug 15;22(16):2049-50

Databases for Annotation


SNPs


Domains

Active / allosteric sites

Servant F. et al. (2002) ProDom: Automated clustering of homologous domains. Briefings in Bioinformatics. vol 3, no 3:246-251

Marchler-Bauer A, et al. CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res. 2009 Jan;37(Database issue):D205-10

Hulo N., Bairoch A., Bulliard V., Cerutti L., Cuche B., De Castro E., Lachaize C., Langendijk-Genevaux P.S., Sigrist C.J.A. The 20 years of PROSITE. Nucleic Acids Res. 2007

SitesBase: a database for structure-based protein-ligand binding site comparisons. Nicola D. Gold and Richard M. Jackson, Nucleic Acids Research, 2006, Vol. 34, Database issue D231-D234

sc-PDB: an Annotated Database of Druggable Binding Sites from the Protein Data Bank. Esther Kellenberger et al., J. Chem. Inf. Model., 2006, 46 (2), pp 717-727

Binding MOAD, a high-quality protein-ligand database. Mark L. Benson et al., Nucleic Acids Research 2008 36(Database issue):D674-D678

SNPeffect v2.0: a new step in investigating the molecular phenotypic effects of human non-synonymous SNPs. Joke Reumers et al., Bioinformatics 2006 22(17):2183-2185

Databases for Annotation


Binding partners

Small molecule: TIMBAL, CREDO

Protein, DNA

PiBase

JAIL, BIPA

Residues critical to enzyme mechanism

Surface properties, cavities: Voronoia

CREDO: A Protein-Ligand Interaction Database for Drug Discovery. Adrian Schreyer, Tom Blundell. Chemical Biology & Drug Design, Vol. 73, No. 2. (February 2009), pp. 157-167

BIPA: a database for protein-nucleic acid interaction in 3D structures. Semin Lee and Tom L Blundell, Bioinformatics 2009 25(12):1559-1560

PIBASE: a comprehensive database of structurally defined protein interfaces. Davis FP and Sali A, Bioinformatics. 2005 May 1;21(9):1901-7.

JAIL: a structure-based interface library for macromolecules. Stefan Günther et al. Nucleic Acids Res. 2009 January; 37(Database issue): D338-D341

Elke Michalsky et al., SuperLigands - a database of ligand structures derived from the Protein Data Bank, BMC Bioinformatics 2005, 6:122

Voronoia: analyzing packing in protein structures. Rother K et al. Nucleic Acids Res. 2009 Jan;37(Database issue):D393-5.

CASTp: Computed Atlas of Surface Topography of proteins. Binkowski et al. Nucleic Acids Res. 2003 Jul 1;31(13):3352-5.

The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data. Craig T. Porter, Gail J. Bartlett, and Janet M. Thornton (2004) Nucl. Acids. Res. 32: D129-D133.

Databases of Analysis / Mining


Secondary structure: SSEP


Active sites


Protein-peptide interactions

Loop databases

Protein Coil Library

Protein Loop Classification

Loops in Proteins

Protein Topology Graph Library

Frequent structural motifs

Oliva et al (1997) An automated classification of the structure of protein loops. J Mol Biol 266 (4): 814-830.

SSEP: secondary structural elements of proteins. V. Shanthi, P. Selvarani, Ch. Kiran Kumar, C. S. Mohire and K. Sekar, Nucleic Acids Research, 2003, Vol. 31, No. 13 3404-3405

PepX: a structural database of non-redundant protein-peptide complexes. Vanhee F et al., Nucleic Acids Res. 2010 Jan;38(Database issue):D545-51.

Baeten L, et al. (2008) Reconstruction of Protein Backbones from the BriX Collection of Canonical Protein Fragments. PLoS Comput Biol 4(5): e1000083. doi:10.1371/journal.pcbi.1000083

Bystroff C & Baker D. (1998). Prediction of local structure in proteins using a library of sequence-structure motifs. J Mol Biol 281, 565-77.

LigBase: a database of families of aligned ligand binding sites in known protein sequences and structures. Stuart AC et al., Bioinformatics. 2002 Jan;18(1):200-1.

PTGL - a web-based database application for protein topologies. Patrick May et al. Bioinformatics 2004 20(17):3277-3279; doi:10.1093/bioinformatics/bth367

Fitzkee, N. C., Fleming, P. J, Rose G. D. (2005) The Protein Coil Library: a structural database of nonhelix, nonstrand fragments derived from the PDB. Proteins. 58 (4): 852-4.





Databases in Prediction


Oligomeric state

PISA at PDBe

3D coordinates

ab-initio folding

homology models

Possible binding partners and binding modes

small-molecule (PRECISE)

protein-protein (ADAN)

Dynamics, conformational changes

MolMovDB

Cellular location

LOC3D: annotate sub-cellular localization for protein structures. Nair R, Rost B., Nucleic Acids Res. 2003 Jul 1;31(13):3337-40.

MolMovDB: analysis and visualization of conformational change and structural flexibility. Echols N et al., Nucleic Acids Res. 2003 Jan 1;31(1):478-82.

ADAN: a database for prediction of protein-protein interaction of modular domains mediated by linear motifs. Encinar JA et al., Bioinformatics. 2009 Sep 15;25(18):2418-24. Epub 2009 Jul 14.

PRECISE: a Database of Predicted and Consensus Interaction Sites in Enzymes. Shu-Hsien Sheu et al., Nucleic Acids Research, 2005, Vol. 33, Database issue D206-D211

MODBASE, a database of annotated comparative protein structure models and associated resources. Ursula Pieper et al., Nucleic Acids Research 37, D347-D354, 2009.

Krissinel E, Henrick K. Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. (2007) 372:774-797.

S. M. Larson. Folding@Home and Genome@Home: Using distributed computing to tackle previously intractable problems in computational biology. Mod Meth Comp Biol, R. Grant, ed, Horizon Press (2003)

Specialized databases with structures


MCSIS (GPCRs, Prions etc)

Carbohydrates

KEGG Glycans

Antibodies (Abysis)

Lysozymes

Abysis: http://www.bioinf.org.uk/abysis/

Horn F., Vriend G., Cohen FE. Collecting and harvesting biological data: the GPCRDB and NucleaRDB information systems. Nucleic Acids Res. 29:346-349 (2001)

LySDB - Lysozyme Structural DataBase. Mohan KS et al., Acta Crystallogr D Biol Crystallogr. 2004 Mar;60(Pt 3):597-600.

The Protein Data Bank


Unique primary database


Single archive of experimentally determined macromolecular (biopolymer) structures

~ 65000 entries

Distributed online

Updated weekly

Numerous databases derived and enriched with PDB data

Many frontends - RCSB, PDBe, PDBsum, OCA, MMDB, Jena, SIB

“The PDB” is a flat-file archive

PDB formatted coordinate files

any experimental data when submitted


The Protein Data Bank


International Effort


Curated by RCSB, PDBe, PDBj, BMRB

ftp archive currently operated by RCSB

FTP traffic at PDB sites

RCSB PDB: 200 million data downloads

PDBe: 37 million data downloads

PDBj: 14 million data downloads

The Protein Data Bank


When is a biopolymer PDB-worthy?

Polypeptides

Gene products

Non-ribosomal

Synthetic peptides > 23 residues

Unless clearly biologically significant

Polynucleotides

> 3 residues

Sugars

> 3 sugar residues

Fibers

Only repeating unit deposited

Annual Growth of PDB

Primary databases differ by orders of magnitude in size.

UniprotKB: 10^7 protein sequences

GenBank: 10^11 base pairs, 10^8 gene sequences

PDB: < 10^5 structures

http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=total&seqid=100

http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html

http://www.ebi.ac.uk/uniprot/TrEMBLstats/

Annual Growth of PDB

Dominated by X-ray!

EM rising…

Redundancy in PDB (as of Nov ’08)

Entries > 54,000

Chains > 120,000

Copies of a chain in same entry

Homo-oligomers

Same chains in different entries

Determined by multiple labs

Determined under different conditions

Complexed with different partners

Mutants

Chains < 8700 at seq.id < 30%

Orthologs, paralogs are very similar

Using non-redundant chains from PDB

PISCES server

WHATIF, CATH, SCOP, DALI sets

G. Wang and R. L. Dunbrack, Jr. PISCES: a protein sequence culling server. Bioinformatics, 19:1589-1591, 2003.
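The idea of a non-redundant chain set can be sketched as a greedy cull: keep a chain only if its sequence identity to every chain already kept is below a cutoff. The sketch below is an assumption-laden toy and not the PISCES algorithm. The identity function simply counts matching positions in pre-aligned, equal-length strings, whereas a real culling server would align the sequences first and would typically also rank chains by quality. The chain names and sequences are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be pre-aligned, non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cull(chains: dict, max_identity: float = 0.30) -> dict:
    """Greedily keep chains whose identity to every kept chain is below the cutoff."""
    kept = {}
    for name, seq in chains.items():
        if all(identity(seq, other) < max_identity for other in kept.values()):
            kept[name] = seq
    return kept


# Hypothetical chains covering the sources of redundancy listed above.
toy_chains = {
    "entry1_A": "MKTAYIAKQR",
    "entry1_B": "MKTAYIAKQR",  # second copy in the same entry (homo-oligomer)
    "entry2_A": "MKTAYIAKQK",  # point mutant deposited as a different entry
    "entry3_A": "GSSHHWWPLV",  # unrelated sequence
}
representatives = cull(toy_chains)
print(list(representatives))
```

The result depends on the order in which chains are visited, so in practice the input would be sorted first, for example by resolution.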

File formats at PDB


The .pdb format

Header

Remarks

experimental setup

Refinement details

oligomeric state

deviations from expected geometry

Biochemical entities

Biopolymers, het groups

Coordinates

3D model of the entity

Multiple coordinates for the same entity can exist

MODELs, altloc identifiers

Structure factors

.cif file


File formats at PDB

XML

mmCIF

The PDB format: header

123456789+123456789+123456789+123456789+123456789+123456789+123456789+123456789+


HEADER RETINOIC-ACID TRANSPORT 28-SEP-94 1CBS 1CBS 2
COMPND CELLULAR RETINOIC-ACID-BINDING PROTEIN TYPE II COMPLEXED 1CBS 3
COMPND 2 WITH ALL-TRANS-RETINOIC ACID (THE PRESUMED PHYSIOLOGICAL 1CBS 4
COMPND 3 LIGAND) 1CBS 5
SOURCE HUMAN (HOMO SAPIENS) 1CBS 6
SOURCE 2 EXPRESSION SYSTEM: (ESCHERICHIA COLI) BL21 (DE3) 1CBS 7
SOURCE 3 PLASMID: PET-3A 1CBS 8
SOURCE 4 GENE: HUMAN CRABP-II 1CBS 9
AUTHOR G.J.KLEYWEGT,T.BERGFORS,T.A.JONES 1CBS 10
REVDAT 1 26-JAN-95 1CBS 0 1CBS 11

Column 1-6: Record type

Column 7-72: human-readable, mostly textual information
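The column rule stated above (record type in columns 1-6, text in columns 7-72) can be applied directly with string slicing. This is a minimal sketch, not a full PDB parser. The function name split_record is hypothetical, and the example line is rebuilt here by padding to 72 columns, on the assumption that the entry code and line number occupy columns 73-80 as in the excerpt.

```python
from typing import Tuple


def split_record(line: str) -> Tuple[str, str]:
    """Split a PDB header line into (record type, text) by fixed columns."""
    padded = line.ljust(80)            # short lines are treated as blank-padded
    record_type = padded[0:6].strip()  # columns 1-6
    text = padded[6:72].strip()        # columns 7-72
    return record_type, text


# Text padded to column 72, followed by the entry code and line number.
author_line = "AUTHOR    G.J.KLEYWEGT,T.BERGFORS,T.A.JONES".ljust(72) + "1CBS  10"
print(split_record(author_line))
```

Slicing by position rather than splitting on whitespace matters because the text field may itself contain spaces or be empty.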

The PDB format: coordinates

HETATM 1 C ACE A 0 4.279 14.829 14.190 1.00 19.08 C

HETATM 2 O ACE A 0 3.706 14.098 15.038 1.00 20.62 O

HETATM 3 CH3 ACE A 0 3.827 16.236 14.001 1.00 20.22 C

ATOM 4 N MET A 1 5.514 14.621 13.695 1.00 17.77 N

ATOM 5 CA MET A 1 6.269 13.401 13.959 1.00 16.51 C

ATOM 6 C MET A 1 6.702 13.319 15.400 1.00 16.41 C

ATOM 7 O MET A 1 7.036 12.248 15.870 1.00 15.38 O

ATOM 8 CB MET A 1 7.529 13.301 13.085 1.00 16.52 C

ATOM 9 CG MET A 1 7.292 12.805 11.676 1.00 16.48 C

Fields after the record type, left to right: atom nr, atom name, residue type, chain name, residue nr, X, Y, Z coordinates, occupancy, “B-factor”. The last column is the element symbol.
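Coordinate records are fixed-width, so the fields are read by column position rather than by splitting on spaces. The sketch below assumes the column ranges of the published PDB format for ATOM/HETATM records. The example line is rebuilt field by field from the CA atom of MET 1 shown above, because the original column spacing is not preserved in these notes. The function name parse_atom is hypothetical.

```python
def parse_atom(line: str) -> dict:
    """Read the main fields of an ATOM/HETATM record by column position."""
    return {
        "record": line[0:6].strip(),       # cols 1-6
        "serial": int(line[6:11]),         # cols 7-11   atom nr
        "name": line[12:16].strip(),       # cols 13-16  atom name
        "altloc": line[16].strip(),        # col 17      alternate location
        "resname": line[17:20].strip(),    # cols 18-20  residue type
        "chain": line[21],                 # col 22      chain name
        "resseq": int(line[22:26]),        # cols 23-26  residue nr
        "x": float(line[30:38]),           # cols 31-38
        "y": float(line[38:46]),           # cols 39-46
        "z": float(line[46:54]),           # cols 47-54
        "occupancy": float(line[54:60]),   # cols 55-60
        "bfactor": float(line[60:66]),     # cols 61-66
        "element": line[76:78].strip(),    # cols 77-78
    }


# The example record, assembled field by field so the widths are explicit.
example = (
    "ATOM  "                                   # record type
    f"{5:>5d} "                                # atom nr, then one blank column
    f"{' CA ':4s}"                             # atom name
    " "                                        # alternate-location indicator
    f"{'MET':>3s} "                            # residue type, then one blank column
    "A"                                        # chain name
    f"{1:>4d}"                                 # residue nr
    "    "                                     # insertion code plus three blanks
    f"{6.269:8.3f}{13.401:8.3f}{13.959:8.3f}"  # X, Y, Z
    f"{1.00:6.2f}{16.51:6.2f}"                 # occupancy, B-factor
    f"{'':10s}{'C':>2s}"                       # blanks, element symbol
)
atom = parse_atom(example)
print(atom["name"], atom["resname"], atom["x"], atom["bfactor"])
```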

Protein Data Bank in Europe


PDBe


European node of wwPDB

Started 1996 as MSD at EBI

Deposition site since 1999

Started EMDB in 2002

PDBe operations

Handle deposition and annotation of PDB and EMDB entries

Build advanced structure databases

Build services for search, browsing, analysis

Liaise with broader structural biology community

Coordinate with other databases e.g. Uniprot

Funding

PDBe: Protein Data Bank in Europe. S. Velankar et al., Nucleic Acids Research, doi:10.1093/nar/gkp916

PDBe Deposition and Annotation

Checks

Is format correct?

Are biopolymer sequences in biochemical entities consistent with 3D models?

Are hetero groups named correctly?

Where does the model deviate from expected geometry?

Record various types of information

Experiment: Method, conditions, data resolution, spacegroup, completeness etc.

Sample: source, expression system, engineered etc.

Refinement: program, target


AutoDep Deposition Tool

AutoDep provides valuable information to depositors

Validation of structure factors

EDS criteria

http://www.ebi.ac.uk/pdbe-xdep/autodep/index.jsp

Heterogen summary and validation against ideal representations of ligands

Oligomeric state - PQS

Sequence-structure alignment

Uniprot, Pfam, Interpro

Revisions, withdrawal, release

Release sequence-only immediately

Release coordinates immediately

Hold for 1 year

Release after publication

Communication with depositors

Help depositors understand and conform to PDB standards

Discussing errors

PDBe Services

PISA, SSM/PDBeFold, PDBeMotif, PDBeChem, SIFTS, PDBeStatistics, PDBeSearch, PDBeView

PDBeView - the Atlas pages

http://www.ebi.ac.uk/pdbe-srv/view/

PDBeFold (SSM): has my fold been seen before? Or is it novel!

E. Krissinel and K. Henrick, Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Cryst. (2004). D60, 2256-2268.

Why compare structures?

Reveal conformational changes

Ligands, mutations, crystal packing, pH...

Judge structural variability

NMR ensembles, structure families

Discover common structural motifs

Identify fold

Infer function

Sequence alignments do not work well for distant evolutionary relationships

Structures diverge much more slowly than sequences

Structure improves quality of alignment

Better inference of function, e.g. when active sites match well

PDBeFold (SSM)

The relation between the divergence of sequence and structure in proteins. Chothia C, Lesk AM. EMBO J. 1986 Apr;5(4):823-6.


PDBeFold (SSM) algorithm

[Figure: secondary-structure-element (SSE) graphs of two structures, one with elements H1, S1, S2, S3, S4, H2 and one with H1, H2, H3, H4, S1, H5, H6, S2, S3, S4, S5, S6, S7]

Match SSE graphs to get initial alignment

Iterative expansion of Cα-alignment

PDBeFold (SSM)

SSM can carry out genuine multiple structure alignment to reveal a motif common to a family of structures

PDBePISA

What is the likely biological assembly of a given structure?

Can I learn about it from crystal-packing of chains?

PDB file (ASU)

Biological Unit

Crystal Symmetry

ASU

PISA

Generate possible assemblies

Rank according to free energy

PDBePISA

PDB entry 1P30: A monomer?

Biological unit 1P30: Homotrimer!

PDB entry 2TBV: A trimer?

Biological Unit 2TBV: 180-mer!

PDB entry 1E94

2 Biological Units in 1E94: A dodecamer and a hexamer!

PDBeMotif

A very powerful engine to search PDB

Structure-sequence general searches

Chemical substructure

Predefined frequent motifs

Arbitrary secondary structure patterns

Φψ patterns

Protein sequences

Prosite motif, Uniprot, CSA accessions

Raw sequence

Regular expression

Interactions between ligands, protein

Seq-distance between protein motifs

PDB header searches

Specialized searches

Environment around an interaction

Motif binding

Occurrence of a motif inside another

MSDmotif: exploring protein sites and motifs. Adel Golovin and Kim Henrick. BMC Bioinformatics 2008, 9:312

PDBeMotif: which motif does my substructure bind often?

Staurosporine

Kinase inhibitor

PDBeMotif: which ligands and chemical fragments does my sequence motif bind?

Tyrosine protein kinase-specific active-site signature:

[LIVMFYC]-{A}-[HY]-x-D-[LIVMFY]-[RSTAC]-{D}-{PF}-N-[LIVMFYC](3)
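A PROSITE-style signature like the one above maps closely onto a regular expression: [..] lists allowed residues, {..} lists forbidden residues, x is any residue and (n) is a repeat count. The sketch below translates only that subset of the syntax and is not how PDBeMotif itself performs the search. The function name and the test sequences are made up for illustration.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern into a Python regular expression (subset only)."""
    pieces = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Separate an optional repeat such as "(3)" or "(2,4)" from the element.
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = match.group(1), match.group(2)
        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # any residue except those listed
        # "[...]" and single letters are already valid regex fragments
        pieces.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(pieces)


signature = "[LIVMFYC]-{A}-[HY]-x-D-[LIVMFY]-[RSTAC]-{D}-{PF}-N-[LIVMFYC](3)"
regex = prosite_to_regex(signature)
print(regex)

# Made-up sequences: the first contains a match, the second has the forbidden A.
hit = re.search(regex, "GGYVHRDLRAANILVGG")
miss = re.search(regex, "GGYAHRDLRAANILVGG")
print(hit.group(), miss)
```

Anchors and other PROSITE features such as < and > are deliberately left out of this sketch.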

Motif binding statistics

Chemical fragments

PDBeMotif: what does a sequence motif look like in 3D?

Tyrosine protein kinase-specific active-site signature:

[LIVMFYC]-{A}-[HY]-x-D-[LIVMFY]-[RSTAC]-{D}-{PF}-N-[LIVMFYC](3)

Sequence hits

3D alignment

PDBeMotif: which sequences often host a Ramachandran path?

3D fragment

φ/ψ sequence

-156/-155, -103/17, -134/161

Search

Sequence pattern

PDBeAnalysis: selections and statistics

Structure Statistics

frequency plots on 1 or 2 properties of entries

Residue Statistics

Choose residues and make frequency plots of a property

Choose residues in entry meeting certain filters, and plot their property

Atom Statistics

Choose atom-sets in entries and plot distances, angles, dihedrals between them

Structure Selection

Create a subset of entries using various filters

Database Browser

Web-based SQL query page to internal database

Geometric Validation coupled with 3D viewer

http://www.ebi.ac.uk/pdbe-as/pdbevalidate/

PDBeAnalysis: selections and statistics

[Plots: resolution vs R-factor; CA1-CA2-CA3-CA4 torsion distribution at low and high resolution]
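The torsion statistics above (for example over four consecutive CA atoms) rest on the dihedral angle defined by four points. A minimal sketch of that calculation in plain Python follows, using the standard atan2 formulation. The function names are hypothetical, and the coordinates in the usage lines are toy values rather than atoms from a PDB entry.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p1, p2, p3, p4) -> float:
    """Torsion angle in degrees about the p2-p3 axis, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy coordinates: the central bond runs along z.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # eclipsed (cis)
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))   # quarter turn
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # anti (trans)
```

Applied to every run of four consecutive CA atoms in a chain, the same function would give the values that go into a torsion distribution plot.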

PDBeAnalysis: geometric validation

Table and plot of geometric checks

Phi-psi, chi, omega, B-value, bonds, angles, chiralities

AstexViewer coordinated with plots

PDBe Community Work

X-ray

CCP4 software: MMDB, PISA, SSM, harvesting

Validation Task Force

NMR

CCP-NMR software

Validation task force

EM

Validation and standards

Ongoing software development

SIFTS - coordinating with other biodatabases

CAPRI - Provide infrastructure for submission and maintenance of entries

PiMS - Information management system for protein crystallography experiments

EuroCarbDB

Databases and bioinformatic tools in glycobiology and glycomics

BIObar

A toolbar for browsing biological data and databases, a Mozilla plugin for your browser

Outreach and training

Roadshows: invite us!

Tutorials

PDBe Services: Future Emphasis

To go from being a historic structural archive to a valuable resource for structural biomedicine

PDBeXplore

Provide relevant interesting avenues to access structural information

Ligands, Assemblies, Enzymes, GO, CATH, Sequences, Publications, Pathways

PDBe Validation Resource

Provide a comprehensive battery of validation tools during deposition and to the end-user

Migrate and enhance EDS server

Partner with CCDC to bring cutting edge ligand validation

Summary


Structural Bioinformatics and Biocomputing are essential to fulfilling the promise of structural biology

Databases are indispensable to all aspects of structural bioinformatics

PDB is the primary repository of structures and numerous databases are developed based on PDB.

PDBe provides high-quality services to depositors and end-users, and is an active member of the structure-determination community.

PDBe is open to all suggestions to make our services better and more relevant to your work.

Acknowledgements


Alejandro and organizers at IPMont

PDBe group

Sameer Velankar, Jawahar Swaminathan

Designers, developers, maintainers of various structural databases at PDBe and elsewhere