Southeastern Louisiana University
Modeling Heterogeneous Data Sources for Time-Scaling Phylogenetic Trees
Jeremy Brown, LSU
Full Project (May 1, 2019 - April 30, 2022)
Pilot Project (May 1, 2018 - April 30, 2019)
Inference of phylogenetic trees is of interest to researchers in many areas of biology. Phylogenetic trees are used in population-level analyses of evolution and ecology, in agricultural studies, and in medicine. Time-scaling a phylogenetic tree allows researchers to assign absolute ages to the nodes in a phylogeny. In doing so, researchers disentangle evolutionary rate from the time since lineages diverged (i.e., since an ancestral population became two independent daughter lineages). This allows them to test hypotheses about the rate at which evolutionary change accumulates, how lineages diversify, and how phenotypic and genomic features evolve.
These analyses, while powerful, are parameter-rich and complex. As researchers attempt increasingly complicated phylogenetic analyses, datasets are also growing larger and more heterogeneous. Molecular datasets often encompass many loci, if not whole genomes. Time-scaling a phylogeny also requires external data sources from which absolute time can be derived, so researchers often incorporate data that bear on the absolute age of lineages, such as geological features or occurrence records. Doing so substantially increases the complexity of the models needed to describe the data. As a result, researchers are handling ever more data while simultaneously applying increasingly parameter-rich methods.
Ants (Formicidae) are an excellent system in which to test whether recent methodological advances can be applied more widely. Large molecular, morphological, and age data resources are available for the group. Ants also display many characteristics analogous to those seen in the evolution of pathogens, such as periods of rapid lineage accumulation and strong among-lineage variation in evolutionary rate. It is therefore possible to test the predictive ability and analytical capability of new methods for modeling heterogeneous data in this system, and to make recommendations of broader applicability.