Introduction
This 50-hour course introduces students to the fundamental challenges and techniques of computer vision, with a focus on automatically extracting information from images. It combines theoretical foundations with practical implementations and emphasizes current techniques based on deep learning models and architectures.
- Luis Baumela, 30h
- Roberto Valle, 20h
Unit 1: Introduction to Computer Vision
This initial unit presents the general context and challenges of computer vision:
- Understanding the goals of image-based perception.
- Common applications such as object detection and scene understanding.
- Key concepts in visual interpretation from a computational perspective.
Unit 2: Digital Image Processing
This unit covers essential preprocessing techniques for images:
- Pixel-level operations.
- Filtering and noise reduction.
- Feature extraction basics (edges, textures).
- Image transformation and enhancement methods.
The course as a whole emphasizes the practical use of deep learning to solve visual recognition and analysis tasks.
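The pixel-level and filtering topics listed for this unit can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows with values in 0..255; the function names are hypothetical and chosen only for this illustration, and a real course exercise would more likely use an image library.

```python
def adjust_brightness(image, offset):
    """Pixel-level operation: add an offset to every pixel, clamped to 0..255."""
    return [[min(255, max(0, p + offset)) for p in row] for row in image]


def mean_filter_3x3(image):
    """Noise reduction: replace each interior pixel by the mean of its 3x3
    neighbourhood. Border pixels are copied unchanged for simplicity."""
    h, w = len(image), len(image[0])
    out = [list(row) for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            total = sum(image[y + dy][x + dx]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = total / 9.0
    return out


def horizontal_gradient(image):
    """Edge feature: central difference along x. A large magnitude marks a
    vertical edge. The first and last columns are left at 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(1, w - 1):
            out[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
    return out


# Usage: smooth an isolated noisy pixel, then locate a step edge.
noisy = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
smoothed = mean_filter_3x3(noisy)
step = [[0, 0, 100, 100] for _ in range(3)]
edges = horizontal_gradient(step)
```

The mean filter spreads the isolated bright pixel over its neighbourhood, while the gradient responds only where the step image changes intensity.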
Unit 3: Introduction to Deep Models
This unit transitions into the use of deep learning for image analysis:
- Basics of convolutional neural networks (CNNs).
- Feature learning with deep architectures.
- Overfitting and regularization in vision tasks.
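The basic building blocks of a CNN named above can be sketched as a single forward pass: convolution, a ReLU non-linearity, and max pooling. This is a plain-Python illustration under simplifying assumptions (one channel, no padding, stride 1, a hand-picked kernel); in a trained network the kernel values are learned from data, and frameworks compute the same operation far more efficiently.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation,
    which is what deep learning frameworks call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Downsample by taking the maximum of each non-overlapping 2x2 block."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[y][x], feature_map[y][x + 1],
                 feature_map[y + 1][x], feature_map[y + 1][x + 1])
             for x in range(0, w - 1, 2)]
            for y in range(0, h - 1, 2)]


# Usage: a 5x5 image with a dark-to-bright transition between columns 1 and 2.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
kernel = [[-1.0, 1.0], [-1.0, 1.0]]  # hand-picked vertical-edge detector

features = max_pool_2x2(relu(conv2d_valid(image, kernel)))
```

The pooled feature map is non-zero only in the column that covers the intensity transition, which is the sense in which a convolutional layer extracts local features.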
Unit 4: Deep Architectures for Vision
This unit takes a more in-depth look at state-of-the-art deep learning models for computer vision:
- Modern CNN architectures (e.g., ResNet, U-Net).
- Transfer learning and pretrained models.
- Implementation frameworks (e.g., PyTorch, TensorFlow).
- Performance metrics in vision tasks.
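The syllabus does not say which performance metrics are covered. As one common example, intersection over union (IoU) measures the overlap between a predicted and a ground-truth bounding box in detection tasks. The sketch below assumes boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2; that box format is an assumption made for this illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero if the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Usage: two 2x2 boxes that overlap in a 1x1 square.
score = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

An IoU of 1 means the boxes coincide and 0 means they do not overlap; detection benchmarks typically count a prediction as correct when its IoU with the ground truth exceeds a chosen threshold.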
Unit 5: Applications
This unit explores real-world applications and integration:
- Object recognition and scene segmentation.
- Image registration and motion tracking.
- Vision in robotics and autonomous systems.
Unit 6: Advanced Models
This final unit addresses cutting-edge research and complex vision tasks:
- Multi-camera systems and 3D reconstruction.
- Camera calibration and geometry.
- Correspondence matching and depth inference.
- Limitations and open challenges in current computer vision systems.
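The camera geometry and depth inference topics in this unit can be illustrated with the standard pinhole model and a rectified stereo pair. This is a minimal sketch under stated assumptions: no lens distortion, two identical cameras separated only by a horizontal baseline, and hypothetical parameter values chosen for the example.

```python
def project_point(point_3d, focal_px, cx, cy):
    """Pinhole projection of a 3D point (camera coordinates, Z forward)
    to pixel coordinates (u, v)."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth of a point seen by a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Usage with hypothetical intrinsics: focal length 800 px, principal point
# (320, 240), and a 0.1 m baseline between the left and right cameras.
focal_px, cx, cy, baseline_m = 800.0, 320.0, 240.0, 0.1
point = (0.5, 0.0, 2.0)  # metres, in the left camera frame

u_left, v_left = project_point(point, focal_px, cx, cy)
# The right camera sits baseline_m to the right, so X shifts by -baseline_m.
point_right = (point[0] - baseline_m, point[1], point[2])
u_right, v_right = project_point(point_right, focal_px, cx, cy)

disparity = u_left - u_right
recovered_depth = depth_from_disparity(disparity, focal_px, baseline_m)
```

The recovered depth matches the Z coordinate of the original point up to floating-point error, which shows why correspondence matching (finding the same point in both images) is the step that makes depth inference possible.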