*Sponsored by the Department of Biological Sciences and the Center for Computation & Technology*

Our knowledge about much of biology is indirect: rather than directly observing a process, we observe some noisy result of that process. In addition, we effectively never have a complete mathematical description mapping events to observations. Given these challenges, what is the right mental framework for understanding computational biology? In this talk I will describe the use of probabilistic models to learn from biological data. I will start by drawing a close analogy to the more familiar terrain of solving equations and performing integration in mathematics, and then describe how these same concepts can be generalized to the probabilistic setting. I will illustrate how this works in practice with examples from our research into the formation of antibody-making B cells and the reconstruction of evolutionary trees, and point out the many opportunities for innovation in this area.

My group develops and applies evolutionary methods for molecular sequence data (i.e., DNA and RNA). We enjoy all facets of computational biology research, from diving deeply into biological questions to mathematical and statistical analysis, algorithm development, and efficient algorithm implementation. Our recent work has developed new methods to analyze metagenomic, viral, and immune cell sequence data, and has pursued more purely methodological questions in evolutionary tree reconstruction. We also work to improve the software environment for computational biologists, both by developing our own open-source tools and by contributing to larger projects. For more details, see http://matsen.fhcrc.org/