When / Where
All occurrences of this event have passed.
This listing is displayed for historical purposes.

Presented By: Department of Statistics Dissertation Defenses

On Some Approximate Inference Approaches in Population Genetics

Yifan Jin

The study of evolution has been a central focus of biology for several centuries. One of the fields concerned with the evolutionary process is population genetics, which studies the genetic composition of populations. To extract useful information from genetic data, one of the central problems is modeling genealogies, and the coalescent has become the primary tool for doing so. The coalescent was proposed by Kingman in a series of path-breaking papers. Many other authors subsequently built on Kingman’s ideas, leading to a rich understanding of many mathematical and theoretical aspects of evolution.

The coalescent can be interpreted as a probabilistic model for generating random gene trees. It was later extended to model recombination and to incorporate mutation processes along the tree. Although these models can in principle be used to compute the likelihood of a given genetic data set, doing so is not feasible in practice in many cases of interest. This is especially true when the loci have different evolutionary histories due to recombination: to evaluate the likelihood function, one must then integrate out an astronomical number of possible ancestry scenarios that could have generated the data. To lift this computational burden, various approximate models have been proposed.
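To make the "random gene tree" view concrete, here is a minimal sketch of the basic Kingman coalescent for a sample of n lineages, without recombination or mutation. It is an illustration only, not code from the dissertation; the function name `simulate_kingman_coalescent` and the event format are assumptions chosen for this example. While k lineages remain, the waiting time to the next merger is exponential with rate k(k-1)/2 in coalescent time units, and the merging pair is chosen uniformly at random.

```python
import random


def simulate_kingman_coalescent(n, rng):
    """Simulate one Kingman coalescent genealogy for n sampled lineages.

    Returns a list of (time, (child_a, child_b), parent) events, where
    leaves are labeled 0..n-1 and internal nodes n, n+1, ... in the
    order they are created. Time is in coalescent units.
    (Hypothetical helper written for illustration.)
    """
    lineages = list(range(n))
    next_label = n
    t = 0.0
    events = []
    while len(lineages) > 1:
        k = len(lineages)
        # With k lineages, each of the k*(k-1)/2 pairs merges at rate 1.
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)
        # Choose the pair that coalesces uniformly at random.
        i, j = rng.sample(range(k), 2)
        a, b = lineages[i], lineages[j]
        lineages = [x for idx, x in enumerate(lineages) if idx not in (i, j)]
        lineages.append(next_label)
        events.append((t, (a, b), next_label))
        next_label += 1
    return events


if __name__ == "__main__":
    for time, children, parent in simulate_kingman_coalescent(5, random.Random(1)):
        print(f"t={time:.3f}: {children} -> {parent}")
```

Each run yields n-1 merger events ending at the most recent common ancestor. The expected time to that ancestor is 2(1 - 1/n), which gives a simple check on the simulation.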

Two of the most important approximations are the Li-Stephens haplotype copying model and the sequentially Markov coalescent. This thesis seeks to understand the fundamental aspects of these approximate inference approaches. Chapter 2 introduces concepts and previous work to provide the context necessary for the rest of the thesis. Chapter 3 consists of joint work focused on the Bayesian posterior consistency of the sequentially Markov coalescent and the ergodicity of the sequentially Markov coalescent process. By slightly modifying the pairwise sequentially Markov coalescent in a way that does not adversely affect inference, we prove frequentist guarantees about its posterior distribution. We also analyze the ergodicity of the underlying sequentially Markov coalescent process using the theory of piecewise deterministic Markov processes. Chapter 4 first presents a new interpretation of the Li-Stephens model in terms of changepoint detection. We then derive a new, efficient algorithm for determining the complete solution surface of both the haploid and diploid variants of the Li-Stephens algorithm. Chapter 5 is devoted to estimators that combine information from the sample frequency spectrum and the pairwise sequentially Markov coalescent. Chapter 6 summarizes the results of this dissertation and discusses the drawbacks of our work and some potential directions for extending it.

Livestream Information

August 8, 2022 (Monday) 8:00am
