Regularization is the process of adding information to an estimation problem so as to avoid extreme estimates. Put differently, it safeguards against foolishness. Both Bayesian and frequentist methods can incorporate prior information, which leads to regularized estimates, but they do so in different ways. In this blog post, I illustrate these tw... (15 Apr 2019 · 18 minute read)
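As a small taste of the idea (this sketch is mine, not taken from the post): ridge regression is the classic example of regularization, where an L2 penalty — equivalently, a zero-mean Gaussian prior on the coefficients — pulls the estimates toward zero and away from extreme values.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 5
X = rng.normal(size=(n, p))
beta = np.array([2.0, -1.0, 0.0, 0.5, 3.0])
y = X @ beta + rng.normal(size=n)

def ols(X, y):
    # Ordinary least squares: no regularization
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, lam):
    # Ridge regression: the penalty lam * I shrinks the estimates toward
    # zero; equivalent to a zero-mean Gaussian prior on the coefficients
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

b_ols = ols(X, y)
b_ridge = ridge(X, y, lam=10.0)
print(np.linalg.norm(b_ridge) < np.linalg.norm(b_ols))  # prints True
```

For any positive penalty, the ridge estimate has a strictly smaller norm than the least-squares estimate — that shrinkage is the "added information" at work.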

“Which variables are important?” is a key question in science and statistics. In this blog post, I focus on linear models and discuss a Bayesian solution to this problem using spike-and-slab priors and the Gibbs sampler, a computational method to sample from a joint distribution using only conditional distributions. Variable selection is a beas... (31 Mar 2019 · 45 minute read)
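To illustrate the Gibbs sampler in its simplest form (a toy example of my own, not the spike-and-slab model from the post): for a bivariate standard normal with correlation rho, both full conditionals are univariate normals, so we can sample the joint distribution by alternating draws from them.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter=10_000, seed=0):
    """Gibbs sampler for a bivariate standard normal with correlation rho.

    Each full conditional is univariate normal:
      x | y ~ N(rho * y, 1 - rho**2), and symmetrically for y | x.
    """
    rng = np.random.default_rng(seed)
    x, y = 0.0, 0.0
    samples = np.empty((n_iter, 2))
    sd = np.sqrt(1 - rho**2)
    for i in range(n_iter):
        x = rng.normal(rho * y, sd)  # draw x from p(x | y)
        y = rng.normal(rho * x, sd)  # draw y from p(y | x)
        samples[i] = x, y
    return samples

samples = gibbs_bivariate_normal(rho=0.8)
# After discarding burn-in, the empirical correlation is roughly 0.8
print(np.corrcoef(samples[1000:].T)[0, 1])
```

The same alternating-conditionals logic drives the spike-and-slab sampler, just with conditionals for coefficients and inclusion indicators instead.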

In a previous blog post, we looked at the history of least squares, how Gauss justified it using the Gaussian distribution, and how Laplace justified the Gaussian distribution using the central limit theorem. The Gaussian distribution has a number of special properties which distinguish it from other distributions and which make it easy to wor... (28 Feb 2019 · 13 minute read)
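One such special property — closure under addition — is easy to check empirically (a quick sketch of mine, not code from the post): if X ~ N(mu1, s1²) and Y ~ N(mu2, s2²) are independent, then X + Y ~ N(mu1 + mu2, s1² + s2²).

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Independent Gaussians: X ~ N(1, 2^2), Y ~ N(-1, 1^2)
x = rng.normal(1.0, 2.0, n)
y = rng.normal(-1.0, 1.0, n)
z = x + y

# The sum is again Gaussian, with mean 1 + (-1) = 0
# and standard deviation sqrt(2^2 + 1^2) ≈ 2.236
print(z.mean())
print(z.std())
```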

Judea Pearl said that much of machine learning is just curve fitting — but it is quite impressive how far you can get with that, isn’t it? In this blog post, we will look at the mother of all curve fitting problems: fitting a straight line to a number of points. In doing so, we will engage in some statistical detective work and discover the met... (11 Jan 2019 · 19 minute read)
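For the impatient, here is a minimal sketch of that mother of all curve fitting problems (my own illustration, not code from the post), using the closed-form least-squares estimates for slope and intercept:

```python
import numpy as np

# Simulate points around the line y = 2 + 3x with Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, x.size)

# Closed-form least-squares estimates:
#   slope b = cov(x, y) / var(x),  intercept a = mean(y) - b * mean(x)
b = np.cov(x, y, ddof=0)[0, 1] / np.var(x)
a = y.mean() - b * x.mean()
print(a, b)  # close to the true intercept 2 and slope 3
```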

This blog post reviews and summarizes the book “Ten Great Ideas about Chance” by Diaconis and Skyrms; a much shorter version of this review has been published in Significance. In ten short chapters, Persi Diaconis and Brian Skyrms provide a bird’s-eye perspective on probability theory and its connection to other disciplines. The book... (11 Jan 2019 · 15 minute read)