STUDIES ON COMBINING SEQUENCE AND STRUCTURE FOR PROTEIN CLASSIFICATION

Bong-Hyun Kim, Ph.D.

The University of Texas Southwestern Medical Center at Dallas, 2009

Supervising Professor:
Nick Grishin, Ph.D.


Full PDF available after 12/1/2012


Keywords: protein classification; protein evolution; fold change; homology; structural similarity;
sequence similarity; bioinformatics; computational biology


The ultimate goal of our research is to develop a better understanding of how proteins evolve
different structures and functions. Large-scale protein clustering can provide a useful platform for
identifying such principles of protein evolution. Manual classification schemes group homologous
proteins accurately, but they are slow and subjective. Automatic protein clustering methods are largely
based on sequence information and therefore often fail to capture remote homologies that can be
recognized from structural information. We hypothesized that combining evolutionary signals from
protein sequence and 3D structure would improve automated protein classification. To test this
hypothesis, we clustered proteins into evolutionary groups using both sequence and structure in a fully
automated method. We developed a stringent algorithm, the self-consistency grouping (SCG) method,
which clusters proteins if all the proteins in the group are more similar to each other than to
proteins outside the group. Comparison of SCG and other commonly used clustering methods against a
widely accepted manual classification scheme, Structural Classification of Proteins (SCOP), showed
that SCG groups better reflect the reference classification. In-depth analysis of SCG clusters
highlights new non-trivial evolutionary links between proteins. SCG clustering can be further
developed as a reference for evolutionary classification of proteins.
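
The self-consistency criterion stated above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the abstract does not say how sequence and structure scores are combined, so the sketch assumes a single precomputed pairwise similarity score, and the protein labels, score values, and function names are all hypothetical. It also assumes strict inequality, and it treats a group with no outsiders as trivially self-consistent. The enumeration relies on the observation that, under this reading, any self-consistent group is a connected component of the graph that links pairs scoring at or above the group's lowest internal similarity.

```python
from itertools import combinations


def is_self_consistent(group, proteins, sim):
    """True if every within-group score exceeds every member-to-outsider score."""
    group = set(group)
    if len(group) < 2:
        return False  # assumption: a single protein is not counted as a group
    outsiders = set(proteins) - group
    lowest_inside = min(sim[frozenset(p)] for p in combinations(sorted(group), 2))
    highest_outside = max(
        (sim[frozenset((a, b))] for a in group for b in outsiders),
        default=float("-inf"),  # no outsiders: the condition holds vacuously
    )
    return lowest_inside > highest_outside


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def self_consistent_groups(proteins, sim):
    """Collect all self-consistent groups by scanning similarity thresholds."""
    found = set()
    for threshold in sorted(set(sim.values()), reverse=True):
        # Connected components of the graph with edges scoring >= threshold.
        parent = {p: p for p in proteins}
        for pair, score in sim.items():
            if score >= threshold:
                a, b = pair
                parent[find(parent, a)] = find(parent, b)
        components = {}
        for p in proteins:
            components.setdefault(find(parent, p), set()).add(p)
        # Keep only the components that satisfy the criterion.
        for members in components.values():
            if is_self_consistent(members, proteins, sim):
                found.add(frozenset(members))
    return found


# Hypothetical toy data: five proteins with made-up similarity scores.
proteins = ["A", "B", "C", "D", "E"]
scores = {
    ("A", "B"): 0.90, ("A", "C"): 0.80, ("B", "C"): 0.85, ("D", "E"): 0.70,
    ("A", "D"): 0.20, ("A", "E"): 0.10, ("B", "D"): 0.30, ("B", "E"): 0.20,
    ("C", "D"): 0.25, ("C", "E"): 0.15,
}
sim = {frozenset(pair): score for pair, score in scores.items()}

groups = self_consistent_groups(proteins, sim)
for g in sorted(groups, key=lambda s: (len(s), sorted(s))):
    print(sorted(g))
```

In this toy data, {B, C} is rejected because B is more similar to A (0.90) than to C (0.85), which shows how strict the criterion is compared with a plain similarity cutoff. The groups that pass are nested, so a sketch like this yields a hierarchy rather than a flat partition.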