DTSA 5512 Introduction to Computer Vision

Same as CSCA 5222 

Specialization: Computer Vision

Instructors: Dr. Tom Yeh

Prior knowledge needed: Basic to intermediate Linear Algebra, Trigonometry, Vectors & Matrices

View on Coursera

Course Syllabus

Learning Outcomes

  • Analyze a complex computing problem and apply principles of computing and other relevant disciplines to identify solutions.
  • Design, implement, and evaluate a computing-based solution to meet a given set of computing requirements in the context of the program’s discipline.
  • Communicate effectively in a variety of professional contexts.
  • Recognize professional responsibilities and make informed judgments in computing practice based on legal and ethical principles.
  • Function effectively as a member or leader of a team engaged in activities appropriate to the program’s discipline.
  • Apply computer science theory and software development fundamentals to produce computing-based solutions. 

Course Content

Duration: 6h

Welcome to Introduction to Computer Vision, the first course in the Computer Vision specialization. In this first module, you'll be introduced to how this course operates "by Hand" and "in Excel." Then, you'll build a foundation in image matrices and arrays to explore different image types: binary, grayscale, and RGB. Next, you'll transition into using functions to perform basic image operations such as addition, negation, and masking. You'll then be introduced to the concept of image transformation through linear algebra. Finally, you'll perform translation, scaling, and rotation matrix operations.
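The course works these operations by hand and in Excel. As a rough illustration only, the same ideas can be sketched in plain Python, assuming an image is a list of rows of pixel values and that transformations act on points through 3x3 homogeneous matrices. The sample image, the mask, and all function names below are hypothetical and are not taken from the course materials.

```python
import math

# Made-up 3x3 grayscale image (values 0-255), for illustration only.
image = [
    [0, 128, 255],
    [64, 200, 32],
    [255, 0, 100],
]

# Binary mask: 1 keeps a pixel, 0 blanks it.
mask = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

def negate(img, max_value=255):
    """Negative image: each pixel p becomes max_value - p."""
    return [[max_value - p for p in row] for row in img]

def apply_mask(img, msk):
    """Element-wise product of an image with a binary mask."""
    return [[p * m for p, m in zip(row, mrow)] for row, mrow in zip(img, msk)]

def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def rotation(theta):
    # Counter-clockwise for x-right/y-up axes; with image rows growing
    # downward the same matrix appears to rotate clockwise.
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def transform_point(matrix, x, y):
    """Multiply a 3x3 homogeneous matrix by the column vector (x, y, 1)."""
    vec = (x, y, 1)
    out = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
    return out[0], out[1]

print(negate(image)[0])                          # [255, 127, 0]
print(apply_mask(image, mask)[1])                # [0, 200, 0]
print(transform_point(translation(2, 3), 1, 1))  # (3, 4)
print(transform_point(scaling(2, 0.5), 4, 4))    # (8, 2.0)
```

Writing all three transformations as 3x3 matrices means they can be chained by matrix multiplication, which is one common reason for using homogeneous coordinates.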

Duration: 3h

This module dives into feature extraction: computing quantitative measures that describe image content. Students compute features such as image mass, center, and statistical moments to describe the shape and structure of images. These are implemented both manually and in Excel. The module also explores how to compare images using distance metrics and similarity measures, offering insight into how visual data can be analyzed, categorized, and classified.
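A minimal Python sketch of these features follows, under the assumption that "mass" means the sum of all pixel values and "center" means the intensity-weighted mean position. The course may define them differently, and the function names and sample data here are hypothetical.

```python
import math

def image_mass(img):
    """Sum of all pixel values (the zeroth-order moment)."""
    return sum(sum(row) for row in img)

def image_center(img):
    """Intensity-weighted mean (x, y) position, with x as column and y as row."""
    mass = image_mass(img)
    if mass == 0:
        raise ValueError("center is undefined for an all-zero image")
    cx = sum(x * p for row in img for x, p in enumerate(row)) / mass
    cy = sum(y * p for y, row in enumerate(img) for p in row) / mass
    return cx, cy

def euclidean_distance(a, b):
    """Pixel-wise Euclidean distance between two images of equal size."""
    return math.sqrt(sum((p - q) ** 2
                         for ra, rb in zip(a, b)
                         for p, q in zip(ra, rb)))

# Made-up binary image shaped roughly like an "L".
shape = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
]

print(image_mass(shape))    # 4
print(image_center(shape))  # (1.25, 1.25)
print(euclidean_distance([[0, 1], [1, 0]], [[1, 1], [0, 0]]))  # sqrt(2)
```

A smaller distance indicates more similar images under this metric; other metrics can be swapped in by replacing `euclidean_distance`.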

Duration: 3h

Filtering techniques are central to detecting patterns in images. This module introduces learners to 1D and 2D filters, covering foundational concepts like convolution, cross-correlation, and Gaussian smoothing. Through both manual and spreadsheet-based exercises, learners apply various filters (e.g., mean, Laplacian, Sobel) and morphological operations like dilation and erosion. These filtering methods enhance image features, detect edges, and prepare data for further processing.
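To make the difference between cross-correlation and convolution concrete, here is a small pure-Python sketch. It assumes no padding, so the output shrinks by the kernel size minus one, and it uses the common 3x3 horizontal Sobel kernel. The function names and the test image are illustrative choices, not course material.

```python
def cross_correlate2d(img, kernel):
    """Slide the kernel over the image and sum element-wise products (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(img) - kh + 1
    out_w = len(img[0]) - kw + 1
    return [
        [
            sum(img[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

def convolve2d(img, kernel):
    """Convolution is cross-correlation with the kernel flipped in both axes."""
    flipped = [row[::-1] for row in kernel[::-1]]
    return cross_correlate2d(img, flipped)

# Horizontal-gradient Sobel kernel: responds to vertical edges.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# Made-up image with a dark left half and a bright right half.
edge_image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]

print(cross_correlate2d(edge_image, SOBEL_X))  # [[40, 40]]
print(convolve2d(edge_image, SOBEL_X))         # [[-40, -40]]
```

The sign flip between the two results comes from the kernel being antisymmetric. For a symmetric kernel such as a mean filter, the two operations give the same output.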

Duration: 3h

This module covers key concepts of camera models and their role in computer vision and photogrammetry. You will learn about the Extrinsic Matrix, which defines the position and orientation of a camera in 3D space. You will also study the Pinhole Camera Model, a simplified optical system that forms the basis for many computer vision applications, and the Intrinsic Matrix, which captures the internal parameters of the camera. The module then examines epipolar geometry, focusing on its significance in 3D reconstruction and stereo vision. It covers the motivation behind epipolar geometry and its basic components, and explains the Essential Matrix, which encapsulates the geometric relationship between two camera views, and the Fundamental Matrix, a core component of epipolar geometry that relates corresponding points in the two images of a stereo pair.
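As a rough illustration of how the intrinsic and extrinsic matrices fit together in the pinhole model, the sketch below projects a 3D world point to pixel coordinates using the standard relation x ~ K (R X + t). The focal length, principal point, and function name are made-up values chosen for the example, and lens distortion is ignored.

```python
def project_point(K, R, t, X):
    """Project a 3D world point X to pixel coordinates under a pinhole model.

    K is the 3x3 intrinsic matrix; R (3x3) and t (length 3) are the extrinsics.
    """
    # World coordinates -> camera coordinates: Xc = R X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Camera coordinates -> homogeneous image coordinates: (u, v, w) = K Xc
    u, v, w = (sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3))
    if w == 0:
        raise ValueError("point lies in the camera's focal plane")
    # Perspective divide gives pixel coordinates.
    return u / w, v / w

# Hypothetical intrinsics: focal length 800 px, principal point (320, 240).
K = [
    [800, 0, 320],
    [0, 800, 240],
    [0, 0, 1],
]

# Hypothetical extrinsics: camera at the world origin, axes aligned.
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [0, 0, 0]

print(project_point(K, R, t, (1, 0.5, 4)))  # (520.0, 340.0)
```

Points farther from the camera along the same ray divide by a larger depth and land on the same pixel, which is why a single view cannot recover depth and why two-view (epipolar) geometry matters.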

Duration: 2h 10m

You will complete a non-proctored exam worth 20% of your grade. You must attempt the final in order to earn a grade in the course. If you've upgraded to the for-credit version of this course, please make sure you review the additional for-credit materials in the Introductory module and anywhere else they may be found.

Note: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click the View on Coursera button above for the most up-to-date information.