Brief Description of Course Content
Introduces students to the tools, methods, and theory behind extracting insights from data. Covers algorithms for cleaning and munging data, probability theory and common distributions, statistical simulation, drawing inferences from data, and basic statistical modeling.
Specific Goals for the Course
- Recognize the importance of data collection, identify limitations in data collection methods and other sources of statistical bias, and determine their implications and how they affect the scope of inference.
- Use statistical software to summarize data numerically and visually, and to perform data analysis.
- Have a conceptual understanding of the unified nature of statistical inference.
- Apply estimation and testing methods to analyze single variables or the relationship between two variables in order to understand natural phenomena and make data-based decisions.
- Model numerical response variables using a single explanatory variable or multiple explanatory variables in order to investigate relationships between variables.
- Interpret results correctly, effectively, and in context without relying on statistical jargon.
- Critique data-based claims and evaluate data-based decisions.
- Data Exploration and Probability
- Conditional probability and Bayes' rule
- Discrete/continuous random variables and computing with distributions
- Joint distributions, covariance, correlation, and sums of random variables
- Using the Jupyter Python environment
- Python tools for data science – NumPy and Pandas
- Basic statistical estimation, random samples, bootstrap and resampling techniques, unbiased estimators, and confidence intervals for measured data
- Linear regression and classification
- Maximum likelihood estimation and analysis of variance
Counting theory, Probabilities, Integration