CSCA 5502: Data Mining Pipeline
Preview this course in the non-credit experience today!
Start working toward program admission and requirements right away. Work you complete in the non-credit experience will transfer to the for-credit experience when you upgrade and pay tuition. See How It Works for details.
Cross-listed with DTSA 5504
Course Type: Computer Science Elective
Specialization: Data Mining Foundations and Practice
Instructor: Dr. Qin (Christine) Lv, Associate Professor of Computer Science
Prior knowledge needed:
- Programming languages: Basic to intermediate experience with Python, Jupyter Notebook
- Math: Basic experience with Probability and Statistics, Linear Algebra
- Technical requirements: Windows, Mac, or Linux; Jupyter Notebook
Learning Outcomes
Identify the key components of the data mining pipeline and describe how they're related.
Apply techniques to address challenges in each component of the data mining pipeline.
Identify particular challenges presented by each component of the data mining pipeline.
Course Grading Policy
Assignment | Percentage of Grade |
---|---|
Peer Review: Data Mining Example | 10% |
Peer Review: Data Mining Issues | 10% |
Programming Assignment: Data Understanding | 20% |
Programming Assignment: Data Preprocessing | 20% |
Programming Assignment: Data Warehousing | 20% |
CSCA 5502 Data Mining Pipeline Final Exam | 20% |
Course Content
Duration: 7 hours
This week introduces the Data Mining Specialization and this course, Data Mining Pipeline. You will be introduced to the four views of data mining and the key components of the data mining pipeline.
Duration: 5.5 hours
This week covers data understanding by identifying key data properties and applying techniques to characterize different datasets.
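As a rough illustration of what characterizing a dataset can look like, the sketch below computes a few descriptive statistics for a single numeric attribute using only the Python standard library. The `summarize` helper and the `ages` values are hypothetical, made up for this example, and are not taken from the course materials.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
    }


# Hypothetical attribute values, invented for illustration only.
ages = [23, 25, 31, 35, 35, 42, 58]
summary = summarize(ages)
print(summary)
```

Comparing the mean with the median, as this summary allows, is one simple way to notice skew in an attribute.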
Duration: 5.25 hours
This week explains why data preprocessing is needed and what techniques can be used to preprocess data.
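Two common preprocessing steps are filling in missing values and rescaling an attribute to a fixed range. The following is a minimal sketch of both, assuming missing entries are represented as `None`; the function names and sample data are hypothetical and are not drawn from the course assignments.

```python
def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw measurements with one missing entry.
raw = [10.0, None, 30.0, 20.0]
filled = fill_missing(raw)
scaled = min_max_normalize(filled)
print(filled)
print(scaled)
```

Mean imputation is only one option; depending on the data, the median or a model-based estimate may be more appropriate.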
Duration: 5 hours
This week covers the key characteristics of data warehousing and the techniques to support data warehousing.
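One operation typically associated with data warehousing is the roll-up, which aggregates a measure over fewer dimensions. The sketch below shows the idea on a tiny in-memory fact table; the `sales` rows, the `roll_up` helper, and the dimension names are all hypothetical and serve only to illustrate the concept.

```python
from collections import defaultdict

# Hypothetical fact table: each row has two dimensions and one measure.
sales = [
    {"region": "West", "quarter": "Q1", "amount": 100},
    {"region": "West", "quarter": "Q1", "amount": 50},
    {"region": "West", "quarter": "Q2", "amount": 70},
    {"region": "East", "quarter": "Q1", "amount": 40},
]


def roll_up(facts, dimensions, measure):
    """Sum the measure for each distinct combination of the given dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# Aggregate at the (region, quarter) level, then roll up to region only.
by_region_quarter = roll_up(sales, ["region", "quarter"], "amount")
by_region = roll_up(sales, ["region"], "amount")
print(by_region_quarter)
print(by_region)
```

Dropping the `quarter` dimension in the second call is what makes it a roll-up: the same facts are summarized at a coarser level.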
Duration: 1.75 hours
This module contains materials for the final exam. The exam is proctored and administered through ProctorU.
- You will need to arrange for a time to take the proctored exam.
- It is a one-hour exam.
- You may submit your answers only once.
- The exam contains only multiple-choice questions.
- There are no programming questions in the exam.
- You are not allowed to use any notes or access other websites when you take your exam.
- The exam tests conceptual understanding of the course materials. There is no need to memorize formulas.
Notes
- Cross-listed Courses: Courses that are offered under two or more programs. They are considered equivalent when evaluating progress toward degree requirements. You may not earn credit for more than one version of a cross-listed course.
- Page Updates: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click the View on Coursera button above for the most up-to-date information.