Browsing by Subject "Protein evolution"
Item Investigating the behaviors and limitations of phylogenetic models of protein-coding sequence evolution (2016-05)
Spielman, Stephanie Jill; Wilke, C. (Claus); Bull, James; Barrick, Jeffrey; Hillis, David; Hofmann, Hans
Probabilistic models that infer the strength and direction of natural selection from protein-coding sequences are among the most widely used tools in comparative sequence analysis. A variety of phylogenetic models of coding-sequence evolution have been developed. However, these models have been produced independently from one another. As a consequence, it has been entirely unknown whether inferences from different models reveal similar or incompatible information about the evolutionary process. In this dissertation, I derive and study the mathematical relationship between two probabilistic models of protein-coding sequence evolution: dN/dS-based models, which estimate evolutionary rates, and mutation–selection models, which estimate site-specific amino-acid fitnesses. I demonstrate how this relationship reveals the behavioral properties, limitations, and applicabilities of different inference frameworks, which leads to concrete recommendations for how these models should best be employed in evolutionary sequence analysis. In Chapter 2, I develop flexible and extensible software, implemented as a module in the Python programming language, for simulating sequences along phylogenies according to standard evolutionary models. This software provides an independent and user-friendly platform for testing model behavior, or indeed developing novel evolutionary models, thus enabling robust comparisons of modeling frameworks. In Chapter 3, I derive a mathematical relationship between dN/dS and amino-acid fitness values, and I show that mutation–selection models fully encompass information encoded in dN/dS models, provided that sequences are evolving under purifying selection.
I further use this relationship to show that certain commonly used dN/dS-based models are strongly and systematically biased. I additionally show that standard metrics used for model selection in phylogenetics (e.g., Akaike Information Criterion) may be positively misleading and indicate strong support for incorrect models. Finally, in Chapter 4, I apply the mathematical relationship developed in Chapter 3 to study the accuracy of two competing mutation–selection inference implementations, whose relative merits have been heavily debated in the literature. My approach demonstrates that mutation–selection inference platforms that treat amino-acid fitnesses as fixed-effect variables precisely estimate site-specific evolutionary constraints. By contrast, inference platforms that treat fitnesses as random-effect variables systematically underestimate the strength of natural selection across sites. Taken together, the work presented in this dissertation yields novel insights into how these popular evolutionary models can best be applied to sequence data, how their results should be interpreted, and finally how future model development should be conducted in order to yield robust and reliable inference methods.

Item Protein design, modeling, and the evolution of proteins (2016-12)
Jackson, Eleisha Lynnette; Wilke, C. (Claus); Moran, Nancy; Hofmann, Hans; Sullivan, Christopher; Ellington, Andrew
Proteins are crucial players in the functional processes that allow for cellular life. Changes in the sequences of proteins have consequences for how these proteins function. Therefore, the study of how proteins change over time has been a central question in the field of evolutionary biology. As our understanding of how proteins function and change increases, we are not only able to test our hypotheses but we are also able to design and model new proteins, which is the ultimate test of our knowledge of how proteins function.
Using the information from our protein modeling attempts, we can learn more about how natural proteins function and change over time. In this dissertation, I used protein modeling techniques to understand protein evolution. In Chapter 2, I assessed how closely designed proteins recapitulate observed patterns in natural proteins. I found that designing proteins with a flexible-backbone protocol results in site variability that more closely mimics what is seen in natural proteins. I also found that, in designed proteins, hydrophobic residues are often underrepresented in the core of the protein. These results suggest that our scoring functions and/or backbone sampling methods could be further improved. In Chapter 3, I used protein design to predict site-wise evolutionary rates in proteins. I found that protein design is a poor predictor of evolutionary rate, explaining only approximately 7% of the variation in rate across sites in enzymes. In Chapter 4, I used protein design and homology modeling to predict tolerance to deletions in enhanced green fluorescent protein. I also compared these predictions to predictions made using other structural properties, including solvent accessibility, local packing density, and secondary structure. I found that when combining computational scores from modeled structures with these other structural properties as predictors, I was largely able to predict whether or not a given deletion would result in a functional protein product. Finally, in Chapter 5, I developed a computational pipeline to assess binding affinity in protein-protein interactions. I used this pipeline to recapitulate patterns of Machupo virus entry across various species. Taken together, the work presented in this dissertation has given us insight into which structural constraints affect protein evolution.