DTSA 5003 Statistical Inference and Hypothesis Testing in Data Science Applications

Specialization: Data Science Foundations: Statistical Inference
Instructor: Dr. Jem Corcoran, Associate Professor in Applied Mathematics
Prior knowledge needed: Probability Theory: Foundation for Data Science and Statistical Inference for Estimation in Data Science

Learning Outcomes

Successful completion of this course demonstrates your achievement of the following learning outcomes for the MS-DS program:

Define a composite hypothesis and the level of significance for a test with a composite null hypothesis.
Define a test statistic, level of significance, and the rejection region for a hypothesis test. Give the form of a rejection region.
Perform tests concerning a true population variance.
Compute the sampling distributions for the sample mean and sample minimum of the exponential distribution.

Course Content

Module 1 | Fundamental Concepts of Hypothesis Testing

Duration: 8h

In this module, we will define a hypothesis test and develop the intuition behind designing a test. We will learn the language of hypothesis testing, which includes definitions of a null hypothesis, an alternative hypothesis, and the level of significance of a test. We will walk through a very simple test.
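As an illustration of the kind of very simple test this module describes, the following is a minimal sketch, not course material. It assumes a normal sample with known standard deviation and tests H0: mu = mu0 against H1: mu > mu0. The function name `z_test_upper`, the data, and all parameter values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_upper(sample, mu0, sigma, alpha=0.05):
    """Upper-tailed z-test of H0: mu = mu0 vs H1: mu > mu0, sigma assumed known."""
    n = len(sample)
    # Test statistic: standardized distance of the sample mean from mu0.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Rejection region has the form {z > critical}, chosen so that
    # P(reject H0 | H0 true) equals the level of significance alpha.
    critical = NormalDist().inv_cdf(1 - alpha)
    return z, critical, z > critical


# Made-up measurements, for illustration only.
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.4, 5.0, 5.7]
z, critical, reject = z_test_upper(sample, mu0=5.0, sigma=0.5)
print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

The sketch returns the test statistic, the critical value, and the decision separately so that the form of the rejection region stays visible.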

Module 2 | Composite Tests, Power Functions, and P-Values

Duration: 8h

In this module, we will expand the lessons of Module 1 to composite hypotheses for both one- and two-tailed tests. We will define the “power function” for a test and discuss its interpretation and how it can lead to the idea of a “uniformly most powerful” test. We will discuss and interpret “p-values” as an alternative approach to hypothesis testing.
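To make the power function and the p-value concrete, here is a minimal sketch under stated assumptions: an upper-tailed z-test of H0: mu <= mu0 against H1: mu > mu0 for a normal sample of size n with known standard deviation. The function names and numeric values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def power(mu, mu0, sigma, n, alpha=0.05):
    """Probability of rejecting H0 when the true mean is mu (upper-tailed z-test)."""
    critical = STD_NORMAL.inv_cdf(1 - alpha)
    # Under a true mean mu, the test statistic is normal with this mean shift.
    shift = (mu - mu0) / (sigma / sqrt(n))
    return 1 - STD_NORMAL.cdf(critical - shift)


def p_value_upper(z):
    """Upper-tail p-value: probability under H0 of a statistic at least as large as z."""
    return 1 - STD_NORMAL.cdf(z)


# The power function evaluated at a few hypothetical true means.
for mu in (5.0, 5.2, 5.5):
    print(f"power at mu = {mu}: {power(mu, mu0=5.0, sigma=0.5, n=8):.3f}")
print(f"p-value for z = 1.98: {p_value_upper(1.98):.4f}")
```

At mu = mu0 the power equals the level of significance, and it increases as the true mean moves further into the alternative.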

Module 3 | t-Tests and Two-Sample Tests

Duration: 8h

In this module, we will learn about the chi-squared and t distributions and their relationships to sampling distributions. We will learn to identify when hypothesis tests based on these distributions are appropriate. We will review the concept of sample variance and derive the “t-test”. Additionally, we will derive our first two-sample test and apply it to make some decisions about real data.

Module 4 | Beyond Normality

Duration: 4h

In this module, we will consider some problems where the assumption of an underlying normal distribution is not appropriate and will expand our ability to construct hypothesis tests for this case. We will define the concept of a “uniformly most powerful” (UMP) test, determine whether such a test exists for specific problems, and revisit some of our earlier tests from Modules 1 and 2 through the UMP lens. We will also introduce the F-distribution and its role in testing whether two population variances are equal.

Module 5 | Likelihood Ratio Tests and Chi-Squared Tests

Duration: 7h

In this module, we will develop a formal approach to hypothesis testing, based on a “likelihood ratio”, that can be applied more generally than any of the tests we have discussed so far. We will pay special attention to the large-sample properties of the likelihood ratio, especially Wilks’ Theorem, which will allow us to come up with approximate (but easy) tests when we have a large sample size. We will close the course with two chi-squared tests that can be used to test whether the distributional assumptions we have been making throughout this course are valid.
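As a minimal sketch of a large-sample likelihood ratio test, assume an exponential sample with unknown rate and the hypotheses H0: rate = rate0 against H1: rate != rate0. Maximizing the likelihood gives the estimate 1 / (sample mean), so -2 log(likelihood ratio) = 2n(r - 1 - log r) with r = rate0 * (sample mean). By Wilks’ Theorem this statistic is approximately chi-squared with 1 degree of freedom under H0 when n is large. The function name and data below are hypothetical.

```python
from math import erfc, log, sqrt


def exponential_lrt(sample, rate0):
    """Approximate likelihood ratio test of H0: rate = rate0 for exponential data.

    Returns (-2 log likelihood ratio, approximate p-value from Wilks' Theorem).
    All observations are assumed to be strictly positive.
    """
    n = len(sample)
    xbar = sum(sample) / n
    r = rate0 * xbar
    # r - 1 - log(r) is nonnegative in exact arithmetic; max() guards rounding error.
    stat = 2 * n * max(0.0, r - 1 - log(r))
    # Upper tail of a chi-squared(1) variable: P(X > s) = erfc(sqrt(s / 2)).
    p_approx = erfc(sqrt(stat / 2))
    return stat, p_approx


# Made-up waiting times, for illustration only.
sample = [0.8, 2.3, 1.9, 3.1, 0.4, 2.7, 1.5, 2.2, 3.6, 1.1]
stat, p_approx = exponential_lrt(sample, rate0=1.0)
print(f"-2 log(LR) = {stat:.3f}, approximate p-value = {p_approx:.4f}")
```

With only ten observations the chi-squared approximation is rough; the approximation is intended for large samples.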

Module 6 | Final Exam

Duration: 4.5h

You will complete a proctored exam, worth 21% of your grade, made up of multiple-choice and free-response questions. You must attempt the final in order to earn a grade in the course. If you've upgraded to the for-credit version of this course, please make sure you review the additional for-credit materials in the Introductory module and anywhere else they may be found.

Note: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click the View on Coursera button above for the most up-to-date information.
