CSCA 5512: Data Mining Methods
Preview this course in the non-credit experience today!
Start working toward program admission and requirements right away. Work you complete in the non-credit experience will transfer to the for-credit experience when you upgrade and pay tuition. See How It Works for details.
Cross-listed with DTSA 5505
Course Type: Computer Science Elective
Specialization: Data Mining Foundations and Practice
Instructor: Dr. Qin (Christine) Lv, Associate Professor of Computer Science
Prior knowledge needed:
- Programming languages: Basic to intermediate experience with Python, Jupyter Notebook
- Math: Basic experience with Probability and Statistics, Linear Algebra
- Technical requirements: Windows, Mac, or Linux; Jupyter Notebook
Learning Outcomes
Identify the core functionalities of data modeling in the data mining pipeline.
Apply techniques that can be used to accomplish the core functionalities of data modeling and explain how they work.
Evaluate data modeling techniques, determine which is most suitable for a particular task, and identify potential improvements.
Course Grading Policy
Assignment | Percentage of Grade |
---|---|
Programming Assignment: Frequent Pattern Analysis | 20% |
Programming Assignment: Classification | 20% |
Programming Assignment: Clustering | 20% |
Peer Review: Outlier Analysis, Research Frontiers | 20% |
CSCA 5512 Data Mining Methods Final Exam | 20% |
Course Content
Duration: 8 hours
This week starts with an overview of this course, Data Mining Methods, then focuses on frequent pattern analysis, including the Apriori algorithm and FP-growth algorithm for frequent itemset mining, as well as association rules and correlation analysis.
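As an informal illustration of the frequent itemset mining named above, here is a minimal Apriori sketch in plain Python. It is not taken from the course materials: the function name `apriori`, the `min_support` parameter (an absolute count), and the sample baskets are all assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {
            a | b for a, b in combinations(list(current), 2) if len(a | b) == k
        }
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count support of the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical market-basket data, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(baskets, min_support=3)

# Confidence of the association rule {beer} -> {diapers}.
confidence = frequent[frozenset({"beer", "diapers"})] / frequent[frozenset({"beer"})]
```

The prune step is what distinguishes Apriori from brute-force counting: a candidate is only counted if all of its smaller subsets were already found frequent.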
Duration: 6 hours
This week introduces supervised learning, classification, and prediction, and covers several core classification methods, including decision tree induction, Bayesian classification, support vector machines, neural networks, and ensemble methods. It also discusses classification model evaluation and comparison.
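To make the model evaluation topic concrete, the sketch below computes common binary classification metrics from a confusion matrix. It is an illustrative example only, not course code: the function name `binary_metrics` and the label lists are hypothetical, and label `1` is assumed to be the positive class.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground-truth labels and model predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Comparing two classifiers on the same held-out labels with a function like this is one simple way to frame the model comparison the module mentions.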
Duration: 6 hours
This week introduces unsupervised learning and clustering, and covers several core clustering methods, including partitioning, hierarchical, grid-based, density-based, and probabilistic clustering. Advanced topics such as high-dimensional clustering, bi-clustering, graph clustering, and constraint-based clustering are also discussed.
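As one example of a partitioning method, here is a minimal k-means sketch in plain Python. It is an assumption-laden illustration rather than the course's implementation: the function name `kmeans`, the explicitly supplied starting centroids (used so the run is deterministic), and the sample points are all hypothetical.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Cluster points around the given starting centroids; return (centroids, clusters)."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: assignments are stable
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D data with two visually separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (9, 8)])
```

In practice the starting centroids are usually chosen randomly or with a seeding heuristic, and the result can depend on that choice.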
Duration: 5 hours
This week discusses three different types of outliers (global, contextual, and collective) and how different methods may be used to identify and analyze such outliers. It also covers some advanced methods for mining complex data, as well as the research frontiers of the data mining field.
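For a sense of what detecting a global outlier can look like, the sketch below flags values whose z-score exceeds a threshold. This is only one simple statistical approach and is not drawn from the course materials: the function name `global_outliers`, the threshold of 2.0, and the readings are assumptions for illustration. It does not address contextual or collective outliers.

```python
import statistics


def global_outliers(values, threshold=2.0):
    """Return values whose absolute z-score exceeds the threshold.

    Uses the population standard deviation; the threshold is an
    illustrative choice, not a universal rule.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]


# Hypothetical sensor readings with one value far from the rest.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = global_outliers(readings)
```

Because the mean and standard deviation are themselves pulled toward extreme values, z-scores can miss outliers in small or heavily contaminated samples; more robust statistics are a common alternative.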
Duration: 1.75 hours
This module contains materials for the final exam. The exam is proctored using ProctorU.
- You will need to arrange for a time to take the proctored exam.
- It is a one-hour exam.
- You may submit your answers only once.
- The exam contains only multiple-choice questions.
- There are no programming questions in the exam.
- You are not allowed to use any notes or access other websites when you take your exam.
Notes
- Cross-listed Courses: Courses that are offered under two or more programs. Considered equivalent when evaluating progress toward degree requirements. You may not earn credit for more than one version of a cross-listed course.
- Page Updates: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click the View on Coursera button above for the most up-to-date information.