Browsing by Subject "Secondary structure prediction"
Now showing 1 - 2 of 2
Item
Improving secondary structure prediction with covariation analysis and structure-based alignment system of RNA sequences (2013-12)
Shang, Lei, active 2013; Gutell, Robin
RNA molecules form complex higher-order structures that are essential to their biological activities. Accurate prediction of RNA secondary structure and other higher-order structural constraints will significantly enhance the understanding of RNA molecules and help interpret their functions. Covariation analysis is the predominant computational method for accurately predicting the base pairs in RNA secondary structure. I developed a novel and powerful covariation method, the Phylogenetic Events Count (PEC) method, to determine positional covariation. Applying the PEC method to a bacterial 16S rRNA sequence alignment shows that it is more sensitive and accurate than other mutual-information-based methods in identifying base pairs and other structural constraints of the RNA structure. The analysis also discovers a new type of structural constraint, the neighbor effect, between sets of nucleotides that are in proximity in the three-dimensional RNA structure and show weaker but significant covariation with one another. Using these covariation methods, a proposed secondary structure model of an entire HIV-1 genome RNA is evaluated. The results reveal that the vast majority of the predicted base pairs in the proposed HIV-1 secondary structure model do not show covariation and thus lack support from comparative analysis. Generating the most accurate multiple sequence alignment is fundamental to performing high-quality comparative analysis. The rapid determination of nucleic acid sequences has dramatically increased the number of available sequences. Developing accurate and rapid alignment programs for these RNA sequences has therefore become a vital and challenging task for extracting the maximum amount of information from the data.
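As an illustration of the mutual-information baseline that the abstract compares against (not the PEC method itself, whose details are not given here), the covariation between two alignment columns could be scored roughly as follows. The function name `mutual_information`, the gap character `-`, and the toy alignment are assumptions made for this sketch.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Sequences with a gap ('-') at either position are skipped.
    This is a generic MI score, not the PEC method from the abstract.
    """
    pairs = [(seq[i], seq[j]) for seq in alignment
             if seq[i] != "-" and seq[j] != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                 # counts of (base_i, base_j)
    left = Counter(a for a, _ in pairs)    # counts of base at column i
    right = Counter(b for _, b in pairs)   # counts of base at column j
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = p_ab * n * n / (left[a] * right[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Hypothetical toy alignment: columns 0 and 2 always form a Watson-Crick
# pair but vary between sequences; columns 1 and 3 are invariant.
toy_alignment = ["GACU", "CAGU", "AAUU", "UAAU"]

covarying = mutual_information(toy_alignment, 0, 2)
invariant = mutual_information(toy_alignment, 0, 1)
```

In this toy case the covarying column pair scores higher than a pair that includes an invariant column. That contrast is the signal comparative analysis relies on, and it is also why a conserved base pair with no sequence variation cannot be supported by covariation alone.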
A template-based RNA sequence alignment system, CRWAlign-2, was developed to accurately align new sequences to an existing reference sequence alignment based on primary and secondary structural similarity. A comparison with eight widely used alternative alignment programs shows that CRWAlign-2 aligns new sequences with higher accuracy than the other programs. In addition to aligning sequences accurately, CRWAlign-2 creates a secondary structure model for each sequence to be aligned, which provides very useful information for the comparative analysis of RNA sequences and structures. The CRWAlign-2 program also opens opportunities in several areas, including the identification of chimeric 16S rRNA sequences generated in microbiome sequencing projects.
Item
Protein dynamics in sequence and conformational spaces (2016-08)
Chen, Szu-Hua; Elber, Ron; Ren, Pengyu; Johnson, Kenneth A.; Ellington, Andrew; Makarov, Dmitrii E.
Proteins are biological macromolecules that are involved in a wide range of cellular processes. The diverse functions of proteins are closely related to their dynamics and structures. Structures are frequently encoded in a complex manner in the amino acid sequences. In this dissertation I discuss the dynamics of a special class of proteins through studies of their sequences and structures. These proteins are “switches,” which are made of highly similar sequences that fold into dramatically different structures. The existence of protein switches poses a great challenge to structure prediction algorithms as well as to our understanding of the process of protein structure evolution. To identify protein switches, we developed methods that assign switch sequences to structures with high accuracy. One method uses short MD simulations to enrich structural ensembles of protein switches in the neighborhood of their initial conformations for scoring by contact maps.
The other method uses evolutionary profiles and contact maps of the wild-type proteins. Both methods were first tested against a series of experimentally engineered proteins in a switching system and then applied to examine a large number of computationally sampled protein switches for a particular pair of structures in sequence space. From the sampled switch sequences we found that a point mutation near the N- or C-terminus of a sequence is more likely to make the protein switch between structures. To study the conformational change of a protein switch with a fixed sequence between two metastable states in conformational space, we proposed a new algorithm, named “Chain Growth”, to calculate reaction pathways. Unlike commonly used methods that require an initial guess of a path and minimize the energy of the path by local quenching, our method propagates the path in small segments and optimizes the whole path globally. These features avoid the problem of generating very distorted initial structures, which other methods frequently encounter, and allow more efficient minimization of the path. We provide computational examples of using Chain Growth to calculate the minimum energy path on the Müller potential energy surface and to study conformational changes of alanine dipeptide and the folding of a tryptophan zipper.
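For readers unfamiliar with the Müller potential mentioned above, the following sketch evaluates the commonly used Müller-Brown form of that surface and the energy profile along a naive straight-line path between two of its minima. The abstract does not list parameters, so the standard published constants are assumed here. The helper `straight_line_energies` is a hypothetical baseline of the "initial guess" kind and does not implement Chain Growth.

```python
import math

# Standard Müller-Brown parameters (assumed; not stated in the abstract).
A = (-200.0, -100.0, -170.0, 15.0)
a = (-1.0, -1.0, -6.5, 0.7)
b = (0.0, 0.0, 11.0, 0.6)
c = (-10.0, -10.0, -6.5, 0.7)
X0 = (1.0, 0.0, -0.5, -1.0)
Y0 = (0.0, 0.5, 1.5, 1.0)


def muller_potential(x, y):
    """Sum of four two-dimensional Gaussian-like terms."""
    total = 0.0
    for k in range(4):
        dx, dy = x - X0[k], y - Y0[k]
        total += A[k] * math.exp(a[k] * dx * dx + b[k] * dx * dy + c[k] * dy * dy)
    return total


def straight_line_energies(start, end, n_points=11):
    """Energies at evenly spaced points on the segment from start to end.

    A linear interpolation like this is a typical crude initial path,
    not a minimum energy path.
    """
    energies = []
    for s in range(n_points):
        t = s / (n_points - 1)
        x = start[0] + t * (end[0] - start[0])
        y = start[1] + t * (end[1] - start[1])
        energies.append(muller_potential(x, y))
    return energies


# Approximate locations of two minima on this surface.
minimum_a = (-0.558, 1.442)
minimum_b = (0.623, 0.028)
profile = straight_line_energies(minimum_a, minimum_b)
barrier_on_line = max(profile)
```

The straight line climbs well above both end states, which illustrates why path optimization is needed: the minimum energy path on this surface curves through an intermediate basin rather than following the direct segment.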