Senior Thesis 2024 Computer Science Department School of Computer Science, Carnegie Mellon University
to Study Complex Categorical Trait Evolution Ruby Redlich Senior Thesis May 2024
Comparative genomics approaches seek to associate molecular evolution with the evolution of phenotypes across a phylogeny. Many of these methods, including our evolutionary rates-based method, RERconverge, lack the ability to analyze non-ordinal, multi-categorical traits. To address this limitation, we introduce an expansion to RERconverge that associates shifts in evolutionary rates with the convergent evolution of multi-categorical traits. The categorical RERconverge expansion includes methods for performing categorical ancestral state reconstruction, statistical tests for associating relative evolutionary rates with categorical variables, and a new method for performing phylogeny-aware permutations, "permulations", on multi-categorical traits. In addition to demonstrating our new method on a three-category diet phenotype, we compare its performance to binary RERconverge analyses and to two existing methods for comparative genomic analyses of categorical traits: phylogenetic simulations and a phylogenetic signal-based method. Our results show that our new categorical method outperforms phylogenetic simulations at identifying genes and enriched pathways significantly associated with the diet phenotypes. They also show that, when binary RERconverge is run without user input on phenotype evolution, the categorical ancestral state reconstruction drives an improvement in our ability to capture diet-related enriched pathways. Through investigation of the PIEZO1 gene, we also illustrate how diet-relevant genes detected by our method can possess convergent patterns of amino acid sequence change. An additional case study using the binary pair bonding phenotype illustrates that our categorical expansion can still be applied successfully to binary traits, as indicated by our identification of relevant biological pathways related to male gametes, ovarian follicles, and behavioral response to drugs.
The categorical expansion to RERconverge will provide a strong foundation for applying the comparative method to categorical traits on larger data sets with more species and more complex trait evolution than have previously been analyzed.

31 pages
Advisor