CMU-CB-13-102
Lane Center for Computational Biology
School of Computer Science, Carnegie Mellon University

Computational Methods for Learning Population History
Ming-Chi Tsai
July 2013
Ph.D. Thesis
Understanding how species have arisen, dispersed, and intermixed over time is a fundamental question in population genetics with numerous implications for basic and applied research. Only by studying diversity within humans and across species can we understand what makes individuals different from one another and what differentiates us from other species. More importantly, such analysis could give us insights into applied biomedical questions, such as why some people are at greater risk for diseases and why people respond differently to pharmaceutical treatments. While a number of methods are available for the analysis of population history, most state-of-the-art algorithms examine only certain aspects of that history. For example, phylogenetic approaches typically consider only non-admixed data in a small region of a chromosome, while other alternatives examine only specific details of admixture events or their influence on the genome. We first describe a basic model for learning population history under the assumption that there was no mixing of individuals from different populations. This work presents the first model that jointly identifies population substructures and the relationships between them directly from genetic variation data. The model takes a novel approach to learning population trees from large genetic datasets: it converts the data into a set of small phylogenetic trees and learns the population features that are robust across the tree set to identify the population history.
We further develop a method to accurately infer quantitative parameters, such as the precise times of the evolutionary events in a population history, from genetic data. We first propose a basic coalescent-based MCMC model specifically for learning time and admixture parameters from two-parental and one-admixed population scenarios. As a natural extension,
160 pages