Browsing by Subject "Databases, Protein"
Classification and Differentiation of Homologs and Structural Analogs (2007-08-08)
Cheng, Hua; Grishin, Nick

It is both meaningful and useful to study protein sequence, structure, and function in the context of evolution. In divergent evolution, homologs, or proteins descended from a common ancestor, usually share sequence, structure, and functional properties, and an unknown protein's structure and function can be hypothesized from its experimentally characterized homologs. In convergent evolution, proteins from distinct evolutionary lineages converge to similar structures or functions, and these proteins are called "analogs". To classify proteins into evolutionary families, it is necessary to differentiate these two opposite scenarios. Statistically significant sequence similarity is commonly accepted as adequate evidence for homology. Yet in the absence of significant sequence similarity, discrimination between homology and analogy frequently requires manual work. This dissertation describes an effort to develop an automatic tool to differentiate remote homologs from structural analogs.

Exploring Sequence-Structure-Function Relationships in Proteins Using Classification Schemes (2005-12-19)
Cheek, Sara Anne; Grishin, Nick V.

With the rapid growth in the number of available protein sequences and structures, the necessity of interpreting these data in comprehensive and meaningful ways becomes increasingly apparent. Identifying and categorizing the functional, structural, and evolutionary relationships between proteins is a key step in understanding protein evolution. Protein classification is a useful means of organizing biological data for the purpose of exploring these sequence-structure-function relationships in proteins. In this work, two-tier classification schemes are constructed for the organization of large protein classes. One level of this hierarchy reflects structural similarity ("fold groups"), while the second level indicates an evolutionary relationship between members ("families").

Kinases are a ubiquitous group of enzymes that participate in a variety of cellular pathways. Although all kinases catalyze similar phosphoryl transfer reactions, they display remarkable diversity in structural fold and substrate specificity. All available kinase sequences and structures have been classified into fold groups and families. This classification presents the first comprehensive structural annotation of a large functional class of proteins. The question of how different structural folds accomplish the same fundamental elements of the kinase reaction is investigated.

Disulfide-rich domains are small protein domains whose global folds are stabilized predominantly by disulfide bonds. In order to understand the structural and functional diversity among available disulfide-rich proteins, a comprehensive classification of these domains has been performed. The resulting fold groups and families describe more distant structural and evolutionary relationships than previously acknowledged among disulfide-rich domains. Variations in disulfide bonding patterns of these domains are also evaluated.

Several existing classification databases have been developed for the purpose of cataloguing all available protein structures. Because such databases are often manually curated, recently solved structures are not included, and useful information regarding their relatedness to other proteins is not immediately available. To address this limitation, an algorithm has been developed to make classification assignments with evolutionary relevance for domains in newly solved structures, with the objective of reliably and automatically reproducing assignments to an existing classification scheme.