Computer Vision: Introduction

Image and Camera Fundamentals

Overview

The Introduction covers the foundations for studying computer vision: the structure of digital images, mathematical camera models, and the basics of coordinate transforms.

Learning Objectives

Understand the structure of digital images and color spaces
Understand the pinhole camera model
Master homogeneous coordinates and transformation matrices
Understand projection from 3D to 2D

Table of Contents

Chapter 1 Digital Image Fundamentals
Pixels, resolution, bit depth
Chapter 2 Color Spaces
RGB, HSV, YUV, color conversion
Chapter 3 Pinhole Camera Model
Focal length, field of view, perspective projection
Chapter 4 Homogeneous Coordinates and Transformation Matrices
Unified representation of translation, rotation, and scaling
Chapter 5 Camera Matrix
Intrinsic parameters, extrinsic parameters
Chapter 6 Exercises
Summary exercises for the Introduction

Prerequisites

Basics of linear algebra (matrices, vectors)
High school mathematics (trigonometric functions)

Key Concepts

Pinhole Camera Model

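Projection from a 3D point $(X, Y, Z)$, expressed in camera coordinates, to image coordinates $(u, v)$. Here $\sim$ denotes equality up to a nonzero scale factor (in this case $Z$), and $f_x, f_y$ are the focal lengths in pixels and $(c_x, c_y)$ is the principal point: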

$$\begin{pmatrix} u \\ v \\ 1 \end{pmatrix} \sim K \begin{pmatrix} X \\ Y \\ Z \end{pmatrix}, \quad K = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}$$
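Written out, the relation gives $u = f_x X / Z + c_x$ and $v = f_y Y / Z + c_y$. A minimal sketch in plain Python follows; the function name `project_pinhole` and the intrinsic values are made up for illustration and do not come from a real camera.

```python
def project_pinhole(point_cam, fx, fy, cx, cy):
    """Project a 3D point in camera coordinates to pixel coordinates (u, v)."""
    X, Y, Z = point_cam
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    # Multiply by K to get the homogeneous image point (x, y, w).
    x = fx * X + cx * Z
    y = fy * Y + cy * Z
    w = Z
    # Divide by the last component to resolve the scale ambiguity ("~").
    return x / w, y / w


# Hypothetical intrinsics, e.g. for a 640x480 image.
u, v = project_pinhole((0.5, -0.25, 2.0), fx=800.0, fy=800.0, cx=320.0, cy=240.0)
print(u, v)  # 520.0 140.0
```

The division by $Z$ is what makes the projection perspective: doubling the depth of a point halves its offset from the principal point.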

Homogeneous Coordinates

A 2D point $(x, y)$ is represented as $(x, y, 1)$, and a 3D point $(X, Y, Z)$ as $(X, Y, Z, 1)$.

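This allows translation, which is not a linear map in ordinary Cartesian coordinates, to be expressed as a matrix multiplication, so it can be combined with rotation and scaling into a single matrix.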
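As an illustration, a 2D translation by $(t_x, t_y)$ can be written as a $3 \times 3$ matrix acting on $(x, y, 1)$. The sketch below uses plain Python lists; the helper names `matvec` and `translation_2d` are hypothetical and chosen only for this example.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def translation_2d(tx, ty):
    """3x3 homogeneous matrix that translates a 2D point by (tx, ty)."""
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty],
            [0.0, 0.0, 1.0]]


p = [2.0, 3.0, 1.0]             # the 2D point (2, 3) in homogeneous form
T = translation_2d(5.0, -1.0)   # shift by (+5, -1)
q = matvec(T, p)
print(q)  # [7.0, 2.0, 1.0]
```

The last component stays 1, so the result can be read directly as the Cartesian point $(7, 2)$.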

Camera Matrix

$$P = K[R | t]$$

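$K$: $3 \times 3$ intrinsic parameter matrix, $R$: $3 \times 3$ rotation matrix, $t$: $3 \times 1$ translation vector. $R$ and $t$ are the extrinsic parameters that map world coordinates to camera coordinates, so $P$ is a $3 \times 4$ matrix acting on homogeneous world points $(X, Y, Z, 1)$.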
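One way to see how the pieces fit together is to build $P$ explicitly and project a world point with it. This is a minimal sketch in plain Python; the helper names (`matmul`, `camera_matrix`, `project`) and all numeric values are assumptions made for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def camera_matrix(K, R, t):
    """Build the 3x4 matrix P = K [R | t]."""
    Rt = [list(R[i]) + [t[i]] for i in range(3)]  # append t as a fourth column
    return matmul(K, Rt)


def project(P, point_world):
    """Project a 3D world point to pixel coordinates (u, v) using P."""
    X, Y, Z = point_world
    x, y, w = (sum(p * c for p, c in zip(row, (X, Y, Z, 1.0))) for row in P)
    return x / w, y / w


# Hypothetical intrinsics and extrinsics: no rotation, and the world origin
# sits 2 units in front of the camera along its optical axis.
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 2.0]

P = camera_matrix(K, R, t)
u, v = project(P, (0.5, -0.25, 0.0))
print(u, v)  # 520.0 140.0
```

Here $[R | t]$ first moves the world point $(0.5, -0.25, 0)$ to camera coordinates $(0.5, -0.25, 2)$, and $K$ then maps it to pixels.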
