DTSA 5727: Computational Bayesian Statistics in Data Science

  • Specialization: Bayesian Statistics for Data Science
  • Instructor: Brian Zaharatos
  • Prior knowledge needed: Statistics 

 View on Coursera    

Learning Outcomes

  • Articulate the need for computational approaches, such as Markov chain Monte Carlo (MCMC) algorithms, to Bayesian inference.
  • Implement sampling algorithms to approximate posterior distributions, including rejection sampling, Gibbs sampling, Metropolis-Hastings, and more advanced MCMC algorithms.
  • Implement Bayesian computation in the Stan computing environment.
  • Apply computational Bayesian statistical methods to real-world data science problems.

Course Content

Duration: 6h

Some Bayesian inference problems are easily solved with basic algebra and calculus. For example, with a beta prior distribution over the probability of success in a binomial process, it is easy to show that the posterior distribution over the probability of success is also a beta distribution. However, many other, more complicated problems are not as easily solved. Instead, they require computational methods for approximating posterior distributions and their summary statistics. In this module, students will learn some computational algorithms for posterior distribution summaries, including the gradient ascent algorithm for calculating the MAP (maximum a posteriori) estimator, and Monte Carlo methods for computing other summary statistics from the posterior distribution.
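
As a rough illustration of the ideas in this module (not course material), the Python sketch below works through the beta-binomial example. The Beta(2, 2) prior, the data (7 successes in 10 trials), the step size, the iteration count, and all function names are assumptions chosen for the example. It finds the MAP estimate by gradient ascent on the log posterior and estimates the posterior mean by Monte Carlo, so that both can be compared with the closed-form answers available in this conjugate case.

```python
import random

def log_posterior_gradient(theta, a, b, k, n):
    """Derivative of the log of theta^(a+k-1) * (1-theta)^(b+n-k-1)."""
    return (a + k - 1) / theta - (b + n - k - 1) / (1 - theta)

def map_by_gradient_ascent(a, b, k, n, theta=0.5, step=1e-3, iters=5000):
    """Climb the log posterior from theta; step and iters are illustrative choices."""
    for _ in range(iters):
        theta += step * log_posterior_gradient(theta, a, b, k, n)
        theta = min(max(theta, 1e-6), 1 - 1e-6)  # stay inside (0, 1)
    return theta

def monte_carlo_posterior_mean(a, b, k, n, draws=20000, seed=0):
    """Average independent draws from the Beta(a + k, b + n - k) posterior."""
    rng = random.Random(seed)
    return sum(rng.betavariate(a + k, b + n - k) for _ in range(draws)) / draws

# Hypothetical example: Beta(2, 2) prior, 7 successes in 10 trials.
a, b, k, n = 2, 2, 7, 10
theta_map = map_by_gradient_ascent(a, b, k, n)
closed_form_mode = (a + k - 1) / (a + b + n - 2)      # mode of Beta(9, 5)
posterior_mean_estimate = monte_carlo_posterior_mean(a, b, k, n)
exact_posterior_mean = (a + k) / (a + b + n)          # mean of Beta(9, 5)
```

In this conjugate setting the numerical answers are only a check on the closed forms; the same two routines become useful when no closed form exists.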

Duration: 4h

In this module, we introduce rejection sampling as a means of producing independent draws from a posterior distribution whose normalizing constant might not be known.
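
A minimal sketch of rejection sampling, under stated assumptions: the target is the unnormalized kernel theta^8 (1 - theta)^4 (a Beta(9, 5) posterior with its normalizing constant ignored), the proposal is Uniform(0, 1), and the envelope height is the kernel's value at its mode. The function names are hypothetical and chosen only for this example.

```python
import random

def unnormalized_posterior(theta):
    """Kernel of a Beta(9, 5) density; the normalizing constant is never used."""
    return theta ** 8 * (1 - theta) ** 4

def rejection_sample(n_draws, seed=0):
    """Draw from the target using a Uniform(0, 1) proposal and a flat envelope."""
    rng = random.Random(seed)
    bound = unnormalized_posterior(8 / 12)  # kernel height at its mode
    draws = []
    while len(draws) < n_draws:
        theta = rng.random()                 # propose from Uniform(0, 1)
        # Accept with probability kernel(theta) / bound.
        if rng.random() * bound <= unnormalized_posterior(theta):
            draws.append(theta)
    return draws

samples = rejection_sample(20000)
sample_mean = sum(samples) / len(samples)    # should be near 9 / 14
```

Accepted draws are independent, but a flat envelope wastes many proposals when the posterior is concentrated, which is one motivation for the MCMC methods in later modules.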

Duration: 2h

This module focuses on Gibbs sampling, a Markov chain Monte Carlo (MCMC) method for generating random draws from a posterior distribution when the conditional distribution of each model parameter given the other model parameters is known.
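
The following sketch shows one way Gibbs sampling can look in practice. Everything specific here is an assumption for illustration: the model (normal data with unknown mean mu and precision tau), the priors mu ~ Normal(m0, 1/p0) and tau ~ Gamma(a0, rate b0), the small data set, and the function name. Each iteration draws one parameter from its known conditional distribution given the current value of the other.

```python
import random

def gibbs_normal(y, m0=0.0, p0=0.01, a0=1.0, b0=1.0,
                 iters=6000, burn_in=1000, seed=0):
    """Gibbs sampler for (mu, tau) in a normal model with semi-conjugate priors."""
    rng = random.Random(seed)
    n, total = len(y), sum(y)
    mu, tau = total / n, 1.0                 # starting values
    mu_draws, tau_draws = [], []
    for i in range(iters):
        # mu | tau, y is normal with precision p0 + n * tau.
        precision = p0 + n * tau
        mean = (p0 * m0 + tau * total) / precision
        mu = rng.gauss(mean, precision ** -0.5)
        # tau | mu, y is gamma with shape a0 + n/2 and rate b0 + SSE/2.
        sse = sum((yi - mu) ** 2 for yi in y)
        rate = b0 + 0.5 * sse
        tau = rng.gammavariate(a0 + 0.5 * n, 1.0 / rate)  # second argument is a scale
        if i >= burn_in:
            mu_draws.append(mu)
            tau_draws.append(tau)
    return mu_draws, tau_draws

# Hypothetical data with sample mean 5.0.
data = [4.9, 5.1, 5.3, 4.7, 5.0, 5.2, 4.8, 5.0]
mu_draws, tau_draws = gibbs_normal(data)
posterior_mean_mu = sum(mu_draws) / len(mu_draws)
```

With the weak prior on mu assumed here, the posterior mean of mu should sit close to the sample mean of the data.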

Duration: 7h

This module introduces the Metropolis sampling algorithm, another MCMC method for generating correlated random draws that approximate a sample from a posterior distribution. The module also covers the Metropolis-Hastings extension of the Metropolis sampling algorithm and ends with a brief overview of some adaptations of the Metropolis-Hastings algorithm.
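
As a hedged sketch of the Metropolis algorithm, the code below runs a random-walk sampler on the log of the unnormalized kernel theta^8 (1 - theta)^4. The target, the proposal standard deviation of 0.2, the chain length, and the function names are all assumptions made for this example. Because the normal proposal is symmetric, the Hastings correction term cancels and only the ratio of target values appears in the acceptance step.

```python
import math
import random

def log_target(theta):
    """Log of the unnormalized Beta(9, 5) kernel; -inf outside (0, 1)."""
    if theta <= 0.0 or theta >= 1.0:
        return -math.inf
    return 8 * math.log(theta) + 4 * math.log(1 - theta)

def metropolis(iters=31000, burn_in=1000, proposal_sd=0.2, seed=0):
    """Random-walk Metropolis with a symmetric normal proposal."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_target(theta)
    draws, accepted = [], 0
    for i in range(iters):
        candidate = theta + rng.gauss(0.0, proposal_sd)
        candidate_log = log_target(candidate)
        # Accept with probability min(1, target(candidate) / target(theta)).
        if rng.random() < math.exp(min(0.0, candidate_log - current)):
            theta, current = candidate, candidate_log
            accepted += 1
        if i >= burn_in:
            draws.append(theta)              # rejected steps repeat the old value
    return draws, accepted / iters

chain, acceptance_rate = metropolis()
chain_mean = sum(chain) / len(chain)         # should be near 9 / 14
```

Successive values in the chain are correlated, so the effective sample size is smaller than the number of stored draws; tuning the proposal scale trades acceptance rate against step size.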

Duration: 9h

This module introduces Stan and demonstrates its use in R using Google Colab. Stan provides an efficient implementation of Hamiltonian Monte Carlo with the adaptive No-U-Turn Sampler (NUTS), a Metropolis-type method that overcomes some of the limitations of the basic Metropolis-Hastings algorithm.

Duration: 2h

You will complete a proctored final exam worth 35% of your final grade. You must attempt the final in order to earn a grade in the course. If you've upgraded to the for-credit version of this course, please make sure you review the additional for-credit materials in the Introductory module and anywhere else they may be found.

Note: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click the View on Coursera button above for the most up-to-date information.