Browsing by Subject "Molecular evolution"
Item: Evolution of structure-function relationships in the GFP-family of proteins (2014-08)
Modi, Chintan Kishore; Matz, Mikhail V.
One of the most intriguing questions in evolutionary biology is how biochemical and structural complexity arise through small and incremental changes; however, answering this question requires an explicit set of candidate residues and an experimental system in which to test them. This dissertation aims to understand how biochemical complexity evolves and assesses the structure-function relationship in the green fluorescent protein (GFP) family using an ancestral reconstruction approach. In the second chapter, I studied the evolution of biochemical complexity in Kaede-type red fluorescent proteins (FPs) from Faviina corals. An increase in biochemical complexity is represented by the emergence of red fluorescence because it necessitates the synthesis of a tri-cyclic chromophore from a precursor bi-cyclic chromophore through an additional autocatalytic reaction step. The autocatalytic reaction is fully enabled by as many as twelve historical mutations. Here, I showed that the red fluorescent chromophore evolved from an ancestral green chromophore by perturbing the ancestral protein stability at multiple levels of protein structure. Moreover, only three historical mutations are sufficient to initiate the selection-accessible evolutionary trajectory leading to the emergence of red fluorescence. The third chapter investigates six mutations proximate to the chromophore in the Kaede-type FP that could have facilitated autocatalytic synthesis of the red chromophore by enlarging the chromophore-containing cavity and modifying its microenvironment. Two of these six mutations were found to strongly affect the protein’s stability and oligomeric tendency. Additionally, I showed that the dimeric least divergent Kaede-type FP, R1-2, evolved from the tetrameric green ancestor.
Taken together, the results of these studies indicate that the step-up in biochemical complexity in the Kaede-type FPs was achieved via disruption of the existing stable interactions at tertiary and quaternary protein structure levels. In the fourth chapter, I resurrected the common ancestor of all FPs cloned from the order Leptothecata (class Hydrozoa), which are characterized by the highest known homo-oligomeric diversity. I showed that the ancestor was a green monomeric FP with a large Stokes shift. The ancestral FP together with the extant Leptothecata FPs could serve as a model system to study the evolution of function and homo-oligomerization, and the desirable photophysical characteristics would make this ancestral FP a useful bio-marker in bio-medical research.

Item: Evolutionary and functional analyses of primate genes reveal critical host-virus interactions (2014-12)
Meyerson, Nicholas Ryan; Sawyer, Sara L.; Krug, Robert M; Dudley, Jaquelin P; Bull, James J; Ehrlich, Lauren IR
Viruses exert a tremendous evolutionary pressure on their hosts. By hijacking cellular machinery and resources, viruses have been wildly successful at infecting and propagating throughout all domains of life. In the following dissertation, the interactions between primates and some of the viruses that infect them are examined through an evolutionary lens. I begin by introducing the long-standing battle between mammals and viruses that has raged on for hundreds of millions of years. I propose a theoretical framework to understand how slowly evolving mammals are able to keep pace with rapidly evolving viruses, and how we might use this framework to monitor future virus outbreaks. The core of my analyses stems from an evolutionary concept known as the host-virus arms race. This tug-of-war for survival between hosts and viruses leaves an imprint in the DNA of each organism involved that can be detected using statistical analyses.
In Chapter 2, I describe these analyses in great detail and perform many tests to ensure that they are being used and applied appropriately. The remainder of my studies focuses on detecting novel signatures of positive selection in primate genes that are likely caused by ancient host-virus arms races. I characterize the evolutionary history of several primate genes that have been implicated in viral life cycles and provide functional evidence that viruses drove their rapid divergence. In doing so I make three important discoveries. First, I characterize a genetic variant of CD4, the cellular receptor for HIV-1, in an owl monkey species that could make them a viable HIV-1 model system. Second, I show that gorilla-specific mutations in RANBP2, a gatekeeper of the cell nucleus, can inhibit HIV-1 infection. And finally, evolutionary signatures in TRIM25, a component of the innate immune system, revealed its ability to inhibit influenza A virus replication by binding incoming viral ribonucleoproteins.

Item: High throughput directed enzyme evolution using fluorescence activated cell sorting (2003-05)
Olsen, Mark Jon; Iverson, Brent L.; Georgiou, George

Item: Investigating the behaviors and limitations of phylogenetic models of protein-coding sequence evolution (2016-05)
Spielman, Stephanie Jill; Wilke, C. (Claus); Bull, James; Barrick, Jeffrey; Hillis, David; Hofmann, Hans
Probabilistic models which infer the strength and direction of natural selection from protein-coding sequences are among the most widely-used tools in comparative sequence analysis. A variety of phylogenetic models of coding-sequence evolution have been developed. However, these models have been produced independently from one another. As a consequence, it has been entirely unknown whether inferences from different models reveal similar or incompatible information about the evolutionary process.
In this dissertation, I derive and study the mathematical relationship between two probabilistic models of protein-coding sequence evolution: dN/dS-based models, which estimate evolutionary rates, and mutation–selection models, which estimate site-specific amino-acid fitnesses. I demonstrate how this relationship reveals the behavioral properties, limitations, and applicabilities of different inference frameworks, which leads to concrete recommendations for how these models should best be employed in evolutionary sequence analysis. In Chapter 2, I develop a flexible and extendable software package, implemented as a module in the Python programming language, for simulating sequences along phylogenies according to standard evolutionary models. This software provides an independent and user-friendly platform for testing model behavior, or indeed developing novel evolutionary models, thus enabling robust comparisons of modeling frameworks. In Chapter 3, I derive a mathematical relationship between dN/dS and amino-acid fitness values, and I show that mutation–selection models fully encompass information encoded in dN/dS models, provided that sequences are evolving under purifying selection. I further use this relationship to show that certain commonly-used dN/dS-based models are strongly and systematically biased. I additionally show that standard metrics used for model selection in phylogenetics (e.g., Akaike Information Criterion) may be positively misleading and indicate strong support for incorrect models. Finally, in Chapter 4, I apply the mathematical relationship developed in Chapter 3 to study the accuracy of two competing mutation–selection inference implementations, whose relative merits have been heavily debated in the literature. My approach demonstrates that mutation–selection inference platforms that treat amino-acid fitnesses as fixed-effect variables precisely estimate site-specific evolutionary constraints.
By contrast, inference platforms that treat fitnesses as random-effect variables systematically underestimate the strength of natural selection across sites. Taken together, the work presented in this dissertation yields novel insights into how these popular evolutionary models can best be applied to sequence data, how their results should be interpreted, and finally how future model development should be conducted in order to yield robust and reliable inference methods.

Item: Ion channels and the tree of life (2014-12)
Liebeskind, Benjamin Joseph; Zakon, H. H.; Hillis, David M., 1958-; Aldrich, Richard W; Hofmann, Hans A; Matz, Mikhail V
The field of comparative neurobiology has deep roots. I will begin by giving an overview of the parts of its history that I feel are most relevant for this dissertation. Within this history lies a wealth of zoological research and penetrating theories that are underutilized by modern evolutionary biologists. The age of whole-genome sequencing provides a perfect opportunity to revisit and perhaps update this corpus to better understand the phylogenetic history of organismal behavior. The first three chapters of my dissertation will be case studies on the evolution of sodium-selective ion channels. Sodium channels are responsible for much of the electrical signaling in animal nervous systems and muscles, but their evolutionary relationships have not yet been explored with the modern tools of phylogenetics and comparative genomics. Chapter 1 will deal with the classic Nav channels which create action potentials in nerves and muscles. There I will show that this gene family pre-dates the nervous system and even animal multicellularity. Chapter two will investigate sodium leak channels, which likely create the leak conductance measured by Hodgkin and Huxley. These channels turn out to be close relatives of fungal calcium channels, a relationship which illuminates the evolution of both groups.
Chapter three is on bacterial sodium channels and their use as models for other sodium channel types. The final chapter will turn away from sodium channels in particular and discuss the evolution of animal nervous systems by means of ion channel genomics. In that chapter I will show that the genomic complements of ion channels that animals with nervous systems possess evolved independently to a large degree, and that the early evolution of nervous systems also involved periods of gene loss. I will end with a more general discussion of convergent evolution, a key theme of this dissertation, and its effect on comparative analyses in the age of genomics.

Item: Matrix and tensor decomposition methods as tools to understanding sequence-structure relationships in sequence alignments (2010-12)
Muralidhara, Chaitanya; Alter, Orly, 1964-; Gutell, Robin
We describe the use of a tensor mode-1 higher-order singular value decomposition (HOSVD) in the analyses of alignments of 16S and 23S ribosomal RNA (rRNA) sequences, each encoded in a cuboid of frequencies of nucleotides across positions and organisms. This mode-1 HOSVD separates the data cuboids into combinations of patterns of nucleotide frequency variation across the positions and organisms, i.e., "eigenorganisms" and corresponding nucleotide-specific segments of "eigenpositions," respectively, independent of a-priori knowledge of the taxonomic groups and their relationships, or the rRNA structures. We show that this mode-1 HOSVD provides a mathematical framework for modeling the sequence alignments where the mathematical variables, i.e., the significant eigenpositions and eigenorganisms, are consistent with current biological understanding of the 16S and 23S rRNAs. First, the significant eigenpositions identify multiple relations of similarity and dissimilarity among the taxonomic groups, some known and some previously unknown.
Second, the corresponding eigenorganisms identify positions of nucleotides exclusively conserved within the corresponding taxonomic groups, but not among them, that map out entire substructures inserted or deleted within one taxonomic group relative to another. These positions are also enriched in adenosines that are unpaired in the rRNA secondary structure, the majority of which participate in tertiary structure interactions, and some also map to the same substructures. This demonstrates that an organism's evolutionary pathway is correlated and possibly also causally coordinated with insertions or deletions of entire rRNA substructures and unpaired adenosines, i.e., structural motifs which are involved in rRNA folding and function. Third, this mode-1 HOSVD reveals two previously unknown subgenic relationships of convergence and divergence between the Archaea and Microsporidia, that might correspond to two evolutionary pathways, in both the 16S and 23S rRNA alignments. This demonstrates that even on the level of a single rRNA molecule, an organism's evolutionary pathway is composed of different types of changes in structure in reaction to multiple concurrent evolutionary forces.

Item: Molecular evolution in new world leaf-nosed bats of the family Phyllostomidae with comments on the superfamily Noctilionoidea (Texas Tech University, 1981-05)
Honeycutt, Rodney L
Not available

Item: Molecular phylogenetics of the genus Sigmodon based on nuclear and mitochondrial DNA sequences (Texas Tech University, 2002-08)
Carroll, Darin S
Not available

Item: Molecular systematics of the genus Sigmodon (Texas Tech University, 1998-12)
Peppers, Lottie L
The genus Sigmodon is comprised of six North American (S. alleni, S. arizonae, S. fulviventer, S. leucotis, S. mascotensis, and S. ochrognathus), three South American (S. alstoni, S. inopinatus, and S. peruanus), and one widespread species (S. hispidus) found in both North and South America.
Phylogenetic relationships among cotton rats are poorly understood due to a lack of morphological differentiation among taxa and the absence of comprehensive studies involving both North and South American forms. This study addressed phylogenetic relationships among members of the hispidus and fulviventer groups, monophyly of South American taxa, and genetic differentiation within S. hispidus by assessing nucleotide sequence variation in the mitochondrial cytochrome b gene. Results of phylogenetic analyses do not support the conventional hispidus and fulviventer groups. South American taxa do not form a monophyletic unit, and S. hispidus appears to contain several cryptic species. Percent sequence divergence values between currently recognized species ranged from 8.2 to 21.8 percent, and indicate either an earlier date of diversification than traditionally accepted, or an accelerated rate of molecular evolution.

Item: Plastid genome rearrangement, gene loss, and sequence divergence in geraniaceae, passifloraceae, and annonaceae. (2013-12)
Blazier, John Christensen; Jansen, Robert K., 1954-
Plastid genomes of flowering plants are largely identical in gene order and content, but a few lineages have been identified with many gene and intron losses, genomic rearrangements, and accelerated rates of nucleotide substitutions. These aberrant lineages present an opportunity to understand the modes of selection acting on these genomes as well as their long-term stability. My research has focused on two areas within plastid genome evolution in Geraniaceae: first, an investigation of the diversity of unusual plastid genomes in a single genus, Erodium (Geraniaceae) for chapters one and three. Chapter two focuses on the evolution of subunits of the plastid-encoded RNA polymerase (PEP). The first chapter described the loss of plastid-encoded NADPH dehydrogenase (ndh) genes from a clade of 13 Erodium species.
Divergence time estimates indicate this clade is less than 5 million years old. This recent loss of ndh genes in Erodium presents an opportunity to investigate changes in photosynthetic function through comparative biochemistry between Erodium species with and without plastid-encoded ndh genes. Second, I examined the evolution of the gene encoding the alpha subunit (rpoA) of PEP in three disparate angiosperm lineages, namely Pelargonium (Geraniaceae), Passiflora (Passifloraceae), and Annonaceae, in which this gene has diverged so greatly that it is barely recognizable. PEP is conserved in the plastid genomes of all photosynthetic angiosperms. I found multiple lines of evidence indicating that the genes remain functional despite retaining only ~30% sequence identity with rpoA genes from outgroups. The genomes containing these divergent rpoA genes have undergone significant rearrangement due to illegitimate recombination and gene conversion, and I hypothesized that these phenomena have also driven the divergence of rpoA. Third, I conducted a survey of plastid genome evolution in Erodium with the completion of 15 additional whole genomes. Except for Erodium and some legumes, all angiosperm plastid genomes share a quadripartite structure with large and small single copy regions (LSC, SSC) and two inverted repeats (IR). I discovered a species of Erodium that has re-formed a large inverted repeat. Demonstrating a precedent for loss and regain of the IR also impacts models of evolution for other highly rearranged plastid genomes.

Item: The role of structure in protein evolution (2014-12)
Meyer, Austin Garig; Wilke, C. (Claus)
Identifying sites under evolutionary pressure and predicting the effects of substitutions at those sites are among the greatest standing problems in bioinformatics and computational biology.
Moreover, the two problems have traditionally been separated by the enormous chasm that exists between molecular evolutionary biologists interested in the evolutionary process and theoretical chemists interested in free energy changes. As a result, identifying sites under selective pressure has most often left out any semblance of structural biology and biochemistry; likewise, theoretical chemistry tends to rely strictly on first principles calculations rather than thinking first about biologically simple and interpretable results. Here, I have tried to integrate these two intuitions with regard to protein function and evolution. First, I developed a model that implements structural measurements into a traditional structure-blind molecular evolutionary model. This structure-aware model performs significantly better at identifying sites under both purifying and diversifying selection than its structure-blind counterpart. Second, I go further to understand the extent to which structural features of any kind can predict the evolutionary process. By comparing site-wise evolution between human and avian influenza, I find that structural features can account for 24% to 36% of the evolutionary pressure on influenza hemagglutinin. Third, I developed a computational method based on first principles molecular dynamics simulations to predict the biological effect of substitutions in the Machupo virus–human receptor protein–protein interface.
I found that relatively simple energetic proxies offer a reasonable substitute for rigorous free energy calculations; such simple proxies could allow non-experts to naively implement first principles methods without being forced to consider all possible degrees of freedom for post hoc calculations.

Item: A snapshot of the unity and diversity of biological systems at the level of chemistry : structural and mechanistic studies of Cg10062, a homologue of cis-3-chloroacrylic acid dehalogenase, FG41 malonate semialdehyde decarboxylase and the catalytic domain of pyruvate dehydrogenase phosphatase 1 (2010-05)
Guo, Youzhong, 1974-; Hackert, Marvin L.; Whitman, Christian P.; Zhang, Zhiwen; Fast, Walter L.; Liu, Hung-wen
The tautomerase superfamily is composed of a group of proteins characterized by two key features: the N-terminal proline and a beta-alpha-beta-motif. This superfamily has been divided into five families represented by 4-oxalocrotonate tautomerase (4-OT), 5-(carboxymethyl)-2-hydroxymuconate isomerase (CHMI), cis-3-chloroacrylic acid dehalogenase (cis-CaaD), malonate semialdehyde decarboxylase (MSAD), and macrophage migration inhibitory factor (MIF). Cg10062 is a homologue of cis-CaaD, but has several distinct biochemical properties from cis-CaaD. For example, Cg10062 can be irreversibly inhibited by (R)- or (S)-oxirane-2-carboxylate, whereas cis-CaaD can only be irreversibly inhibited by (R)-oxirane-2-carboxylate. FG41MSAD is a homologue of MSAD, with comparable decarboxylase activity but missing Arg-73 known to be crucial for the MSAD activity. In order to understand the unique biochemical characteristics of Cg10062 and FG41MSAD, we have solved five crystal structures. These crystal structures have established a solid structural basis for understanding the mechanisms of their activities. The eukaryotic protein phosphatases are composed of a group of proteins that are responsible for reversible phosphorylation.
The eukaryotic protein phosphatases have been divided into three families, the phosphoprotein phosphatase (PPP) family, the protein phosphatase Mg2+- or Mn2+-dependent (PPM) family and the protein Tyr phosphatase (PTP) family. PDP1 is a member of the PPM family. PDP1 is also an important component of the large pyruvate dehydrogenase complex (PDC) which catalyzes the decarboxylation of pyruvate to yield acetyl-CoA with the accompanying reduction of NAD+. In order to understand the mechanism by which it dephosphorylates its target protein, we have solved the structure of the catalytic domain of PDP1. Analysis of these structures in the light of their evolutionary contexts enables us to appreciate the unity and diversity of the biological systems at the chemical level and helps us solve interesting problems, such as the possible physiological functions for some members within the tautomerase superfamily.