Publications

Attention maps for the 'Drink.Frombottle' action on Toyota-Smarthome (CS)

Pose-guided token selection for the recognition of activities of daily living

In this paper we propose an improved token selection method that integrates semantic information from the ADL recognition task with that of human motion.

Reducing Head Pose Estimation Data Set Bias With Synthetic Data

In this paper we report the existence of data set biases in the most widely used head pose estimation benchmarks, which lead to an optimistic estimation of model performance in real-world scenarios.

Ground-truth motion heatmaps for different numbers of channels

Spatiotemporal Face Alignment for Generalizable Deepfake Detection

In this paper we propose a multi-task network that leverages spatiotemporal features extracted from video inputs to provide more robust predictions than image-only models.

Latency-accuracy comparison of mobile-based architectures tested on a Google Pixel 4 using 256×256 images as input

Efficiency Evaluation of Mobile Vision Transformers

In this paper we evaluate the efficiency of the most popular mobile vision transformer models in terms of latency and accuracy on ImageNet-1k.

Comparison of Geodesic and Opal losses, and the influence functions (gradients) obtained by both

On the representation and methodology for wide and short range head pose estimation

In this paper we analyze the methodology for short- and wide-range HPE and discuss which representations and metrics are adequate for each case.

Simultaneous head pose estimation, facial landmark localization, and landmark visibility prediction when processing a video from 300VW

Multi-task head pose estimation in-the-wild

We show that the combination of head pose estimation and landmark-based face alignment significantly improves the performance of the former task.

Cascade of encoder-decoder CNNs with learned coordinates regressor for robust facial landmarks detection

In this paper we investigate the use of a cascade of neural network regressors to increase the accuracy of the estimated facial landmarks.

The face parts of the 300W, COFW, AFLW and WFLW databases in the fine stage of our coarse-to-fine ERT

Face Alignment using a 3D Deeply-initialized Ensemble of Regression Trees

In this paper we present a robust and efficient face alignment algorithm based on a coarse-to-fine cascade of ensembles of regression trees.

Representative results using CRN in 300W private

Facial Landmarks Detection using a Cascade of Recombinator Networks

In this paper we investigate the use of a cascade of CNN regressors to make the set of estimated landmarks lie closer to a valid face shape.

Example of a monolithic ERT regressor vs our coarse-to-fine approach

A Deeply-initialized Coarse-to-fine Ensemble of Regression Trees for Face Alignment

In this paper we present a real-time facial landmark regression method based on a coarse-to-fine Ensemble of Regression Trees (ERT).

Representative results with yaw errors greater than 15 degrees for AFLW

Benchmarking Head Pose Estimation In-the-Wild

Estimation of head-pose orientation using different face image patches

Head-Pose Estimation In-the-Wild Using a Random Forest

In this paper we present a real-time algorithm that estimates the head-pose from unrestricted 2D gray-scale images.