Senior Thesis 2024
Computer Science Department
School of Computer Science, Carnegie Mellon University



RERconverge Expansion: Using Relative Evolutionary Rates
to Study Complex Categorical Trait Evolution

Ruby Redlich

Senior Thesis

May 2024



Comparative genomics approaches seek to associate molecular evolution with the evolution of phenotypes across a phylogeny. Many of these methods, including our evolutionary rates-based method, RERconverge, lack the ability to analyze non-ordinal, multi-categorical traits. To address this limitation, we introduce an expansion to RERconverge that associates shifts in evolutionary rates with the convergent evolution of multi-categorical traits. The categorical RERconverge expansion includes methods for performing categorical ancestral state reconstruction, statistical tests for associating relative evolutionary rates with categorical variables, and a new method for performing phylogeny-aware permutations, "permulations", on multi-categorical traits. In addition to demonstrating our new method on a three-category diet phenotype, we compare its performance to binary RERconverge analyses and to two existing methods for comparative genomic analyses of categorical traits: phylogenetic simulations and a phylogenetic signal-based method. Our results show that our new categorical method outperforms phylogenetic simulations at identifying genes and enriched pathways significantly associated with the diet phenotypes, and that the categorical ancestral state reconstruction drives an improvement in our ability to capture diet-related enriched pathways compared to binary RERconverge when implemented without user input on phenotype evolution. Through investigation of the PIEZO1 gene, we also illustrate how diet-relevant genes detected by our method can possess convergent patterns of amino acid sequence change. An additional case study using the binary pair bonding phenotype illustrates how our categorical expansion can still be applied successfully to binary traits, as indicated by our identification of relevant biological pathways related to male gametes, ovarian follicles, and behavioral response to drugs.
The categorical expansion to RERconverge will provide a strong foundation for applying the comparative method to categorical traits on larger data sets with more species and more complex trait evolution than have previously been analyzed.

31 pages

Advisor
Andreas Pfenning
Mentors
Amanda Kowalczyk
Heather Sestili

