This online data science specialization is intended for data science professionals and domain experts who want to learn the fundamental concepts and core techniques of data mining for discovering patterns in large-scale data sets, with a specific focus on effectiveness and efficiency.

By completing this specialization, you will be able to:

  • Identify the core components of the data mining pipeline and describe their respective functions
  • Prepare, organize, and analyze data in order to build a data mining pipeline
  • Identify different data mining methods and apply them to appropriate problems
  • Evaluate which data mining methods work best in which scenarios and improve on those methods
  • Develop real-world solutions across the full data mining pipeline


The specialization comprises three courses:

  • Data Mining Pipeline
  • Data Mining Methods
  • Data Mining Projects

This specialization can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program.
