Improving the accuracy and realism of Bayesian phylogenetic analyses






Central to the study of life is knowledge of both the underlying relationships among living things and the processes that have molded them into their diverse forms. Phylogenetics provides a powerful toolkit for investigating both aspects. Bayesian phylogenetics has gained much popularity because of its readily interpretable notion of probability. However, the posterior probability of a phylogeny, as well as any dependent biological inferences, is conditioned on the assumed model of evolution and its priors, necessitating care in model formulation. In Chapter 1, I outline the Bayesian perspective of phylogenetic inference and provide my view on its most outstanding questions. I then present results from three studies that aim to (i) improve the accuracy of Bayesian phylogenetic inference and (ii) assess when the model assumed in a Bayesian analysis is insufficient to produce an accurate phylogenetic estimate.

As phylogenetic data sets increase in size, the models used to analyze them must accommodate a greater diversity of underlying evolutionary processes. Partitioned models represent one way of accounting for this heterogeneity. In Chapter 2, I describe a simulation study that investigates whether support for partitioning of empirical data sets represents a real signal of heterogeneity or is merely a statistical artifact. The results suggest that empirical data are extremely heterogeneous. Incorporating heterogeneity into inferential models is important for accurate phylogenetic inference.

Bayesian phylogenetic estimates of branch lengths are often wildly unreasonable, yet branch lengths are important inputs to many other analyses. In Chapter 3, I study the occurrence of this phenomenon, identify the data sets most likely to be affected, demonstrate the causes of the bias, and suggest several solutions to avoid inaccurate inferences.

Phylogeneticists rarely assess the absolute fit between an assumed model of evolution and the data being analyzed. While an approach to assessing fit in a Bayesian framework has been proposed, it sometimes performs quite poorly in predicting a model’s phylogenetic utility. In Chapter 4, I propose and evaluate new test statistics for assessing phylogenetic model adequacy, which directly evaluate a model’s phylogenetic performance.