The role of structure in protein evolution

Date

2014-12

Abstract

Identifying sites under evolutionary pressure and predicting the effects of substitutions at those sites are among the greatest outstanding problems in bioinformatics and computational biology. The two problems have traditionally been kept apart by the wide gap between molecular evolutionary biologists, who are interested in the evolutionary process, and theoretical chemists, who are interested in free energy changes. As a result, methods for identifying sites under selective pressure have most often ignored structural biology and biochemistry; likewise, theoretical chemistry tends to rely strictly on first-principles calculations rather than aiming first for biologically simple and interpretable results. Here, I have tried to integrate these two perspectives on protein function and evolution. First, I developed a model that incorporates structural measurements into a traditional structure-blind molecular evolutionary model. This structure-aware model performs significantly better than its structure-blind counterpart at identifying sites under both purifying and diversifying selection. Second, I examine the extent to which structural features of any kind can predict the evolutionary process. By comparing site-wise evolution between human and avian influenza, I find that structural features can account for 24% to 36% of the evolutionary pressure on influenza hemagglutinin. Third, I developed a computational method based on first-principles molecular dynamics simulations to predict the biological effect of substitutions in the Machupo virus-human receptor protein-protein interface. I found that relatively simple energetic proxies offer a reasonable substitute for rigorous free energy calculations; such proxies could allow non-experts to naively apply first-principles methods without having to consider all possible degrees of freedom in post hoc calculations.
