Introduction

This 50-hour course introduces students to the fundamental challenges and techniques of computer vision, with a focus on automatically extracting information from images. It combines theoretical foundations with practical implementations and emphasizes current techniques based on deep learning models and architectures. The teaching hours are split as follows:

  • Luis Baumela, 30h
  • Roberto Valle, 20h

Unit 1: Introduction to Computer Vision

This initial unit presents the general context and challenges of computer vision:

  • Understanding the goals of image-based perception.
  • Common applications such as object detection and scene understanding.
  • Key concepts in visual interpretation from a computational perspective.

Unit 2: Digital Image Processing

This unit covers essential preprocessing techniques for images:

  • Pixel-level operations.
  • Filtering and noise reduction.
  • Feature extraction basics (edges, textures).
  • Image transformation and enhancement methods.

As elsewhere in the course, the emphasis is on the practical use of deep learning to solve visual recognition and analysis tasks.
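
As a rough illustration of the filtering and edge-extraction topics listed for this unit, the sketch below applies a 3x3 box (mean) filter and a horizontal Sobel kernel to a toy grayscale image. The function name correlate2d, the toy image, and the choice of kernels are assumptions made for this example and are not taken from the course material; real coursework would more likely rely on a library such as OpenCV or NumPy.

```python
def correlate2d(image, kernel):
    """Valid-mode 2D cross-correlation of a grayscale image (list of rows)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0.0
            # Weighted sum of the kh x kw window whose top-left corner is (y, x).
            for i in range(kh):
                for j in range(kw):
                    acc += image[y + i][x + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical kernels chosen for illustration.
BOX_3X3 = [[1 / 9] * 3 for _ in range(3)]          # smoothing / noise reduction
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]     # horizontal intensity gradient

# Toy image: dark left half, bright right half, i.e. one vertical edge.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]

smoothed = correlate2d(image, BOX_3X3)
edges = correlate2d(image, SOBEL_X)

print(edges[0])  # [0.0, 40.0, 40.0, 0.0]: strong response only around the edge
```

The Sobel response is zero in the flat regions and large where the window straddles the intensity step, which is the basic idea behind gradient-based edge features.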

Unit 3: Introduction to Deep Models

This unit transitions into the use of deep learning for image analysis:

  • Basics of convolutional neural networks (CNNs).
  • Feature learning with deep architectures.
  • Overfitting and regularization in vision tasks.
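
To make the CNN basics above more concrete, here is a minimal sketch of the forward pass of one convolution, ReLU, and 2x2 max-pooling stage in plain Python. The kernel values are fixed by hand purely for illustration (in a real network they would be learned), and the function names conv2d, relu, and max_pool2x2 are hypothetical helpers, not part of any framework.

```python
def conv2d(x, w, bias=0.0):
    """Single-channel valid cross-correlation, as used in CNN layers."""
    kh, kw = len(w), len(w[0])
    return [
        [
            bias + sum(x[r + i][c + j] * w[i][j]
                       for i in range(kh) for j in range(kw))
            for c in range(len(x[0]) - kw + 1)
        ]
        for r in range(len(x) - kh + 1)
    ]


def relu(x):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in x]


def max_pool2x2(x):
    """Non-overlapping 2x2 max pooling (stride 2)."""
    return [
        [max(x[r][c], x[r][c + 1], x[r + 1][c], x[r + 1][c + 1])
         for c in range(0, len(x[0]) - 1, 2)]
        for r in range(0, len(x) - 1, 2)
    ]


# Toy 5x5 input and a hand-picked 2x2 kernel (assumed values).
x = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 0, 1],
    [0, 3, 1, 1, 0],
]
w = [[1, -1], [-1, 1]]

features = max_pool2x2(relu(conv2d(x, w)))
print(features)  # [[3.0, 2.0], [4.0, 1.0]]
```

The 5x5 input shrinks to 4x4 after the convolution and to 2x2 after pooling, which shows how stacked layers trade spatial resolution for more abstract features.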

Unit 4: Deep Architectures for Vision

This unit takes a more in-depth look at state-of-the-art deep learning models for computer vision:

  • Modern CNN architectures (e.g., ResNet, U-Net).
  • Transfer learning and pretrained models.
  • Implementation frameworks (e.g., PyTorch, TensorFlow).
  • Performance metrics in vision tasks.
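
One metric commonly used for detection and segmentation tasks is intersection over union (IoU). The sketch below computes it for two axis-aligned boxes; the (x_min, y_min, x_max, y_max) box format and the function name box_iou are assumptions for this example, and the course may cover other metrics as well.

```python
def box_iou(a, b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
iou = box_iou(predicted, ground_truth)  # overlap area 1, union area 7
print(round(iou, 3))  # 0.143
```

An IoU of 1 means a perfect match and 0 means no overlap; detection benchmarks typically count a prediction as correct when its IoU with the ground truth exceeds a chosen threshold.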

Unit 5: Applications

This unit explores real-world applications and integration:

  • Object recognition and scene segmentation.
  • Image registration and motion tracking.
  • Vision in robotics and autonomous systems.
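
As a very simplified view of image registration, the following sketch estimates an integer translation between two small images by exhaustively testing shifts and keeping the one with the lowest mean squared difference over the overlapping region. This brute-force search is only one possible approach, chosen here for clarity; the function names, the toy images, and the max_shift parameter are all assumptions for the example.

```python
def msd_at_shift(fixed, moving, dy, dx):
    """Mean squared difference over the overlap when `moving` is shifted by (dy, dx)."""
    h, w = len(fixed), len(fixed[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            # Pixel of `moving` that lands on (y, x) after the shift.
            my, mx = y - dy, x - dx
            if 0 <= my < h and 0 <= mx < w:
                diff = fixed[y][x] - moving[my][mx]
                total += diff * diff
                count += 1
    return total / count if count else float("inf")


def register_translation(fixed, moving, max_shift=2):
    """Return the (dy, dx) in [-max_shift, max_shift]^2 with the lowest cost."""
    candidates = [(dy, dx)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return min(candidates, key=lambda s: msd_at_shift(fixed, moving, *s))


# Toy data: the same 2x2 patch placed at different positions in two 6x6 images.
fixed = [[0.0] * 6 for _ in range(6)]
moving = [[0.0] * 6 for _ in range(6)]
patch = [[5.0, 9.0], [7.0, 3.0]]
for i in range(2):
    for j in range(2):
        moving[1 + i][1 + j] = patch[i][j]
        fixed[2 + i][3 + j] = patch[i][j]

shift = register_translation(fixed, moving)
print(shift)  # (1, 2): move `moving` down 1 row and right 2 columns
```

Practical registration and tracking methods replace the exhaustive search with feature matching or gradient-based optimization and handle richer motion models than pure translation.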

Unit 6: Advanced Models

This final unit addresses cutting-edge research and complex vision tasks:

  • Multi-camera systems and 3D reconstruction.
  • Camera calibration and geometry.
  • Correspondence matching and depth inference.
  • Limitations and open challenges in current computer vision systems.
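
The link between camera geometry, correspondence, and depth can be sketched with the pinhole model and a rectified stereo pair, where depth follows from Z = f * B / d (focal length f in pixels, baseline B, disparity d in pixels). The example below assumes two identical cameras separated only along the x axis; the intrinsics, baseline, and 3D point are made-up values, and the function names are hypothetical.

```python
def project_point(point_3d, focal_px, cx, cy):
    """Project a 3D point in camera coordinates with an ideal pinhole model."""
    X, Y, Z = point_3d
    if Z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (focal_px * X / Z + cx, focal_px * Y / Z + cy)


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth of a point from its disparity in a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Assumed setup: 800 px focal length, principal point (320, 240), 10 cm baseline.
FOCAL_PX, CX, CY, BASELINE_M = 800.0, 320.0, 240.0, 0.1

point_left = (0.5, 0.2, 4.0)  # metres, in the left camera frame
# The right camera sits BASELINE_M further along +x, so x shrinks in its frame.
point_right = (point_left[0] - BASELINE_M, point_left[1], point_left[2])

u_left, v_left = project_point(point_left, FOCAL_PX, CX, CY)
u_right, v_right = project_point(point_right, FOCAL_PX, CX, CY)

disparity = u_left - u_right  # horizontal offset between corresponding pixels
depth = depth_from_disparity(disparity, FOCAL_PX, BASELINE_M)
print(round(disparity, 6), round(depth, 6))  # 20.0 4.0
```

Recovering the original 4 m depth from the 20-pixel disparity shows why calibration (knowing f and B) and reliable correspondence matching are both prerequisites for depth inference.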