Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties

Georg Poier1, Konstantinos Roditakis2,3, Samuel Schulter1, Damien Michel2, Horst Bischof1, Antonis A. Argyros2,3

1Institute for Computer Graphics and Vision, Graz University of Technology
2Institute of Computer Science, FORTH
3Computer Science Department, University of Crete

This page is about our BMVC 2015 paper on hand pose estimation.

(a) A learned joint regressor might fail to recover the pose of a hand due to ambiguities or a lack of training data. (b) We make use of the inherent uncertainty of a regressor by forcing it to generate multiple proposals. The crosses show the top three proposals for the proximal interphalangeal joint of the ring finger, for which the corresponding ground truth position is drawn in green. The marker size of each proposal corresponds to its degree of confidence. (c) Our subsequent model-based optimisation procedure exploits these proposals to estimate the true pose.
[Motivation figure: three depth images of a hand. (a) Overlaid with an anatomically invalid hand pose estimate. (b) Overlaid with three proposal positions for the proximal interphalangeal joint of the ring finger, where the most confident proposal lies on the wrong finger. (c) Overlaid with an anatomically valid pose closely resembling the true pose.]


Model-based approaches to 3D hand tracking have been shown to perform well in a wide range of scenarios. However, they require initialisation and cannot recover easily from tracking failures that occur due to fast hand motions. Data-driven approaches, on the other hand, can quickly deliver a solution, but the results often suffer from lower accuracy or missing anatomical validity compared to those obtained from model-based approaches. In this work we propose a hybrid approach for hand pose estimation from a single depth image. First, a learned regressor is employed to deliver multiple initial hypotheses for the 3D position of each hand joint. Subsequently, the kinematic parameters of a 3D hand model are found by deliberately exploiting the inherent uncertainty of the inferred joint proposals. This way, the method provides anatomically valid and accurate solutions without requiring manual initialisation or suffering from track losses. Quantitative results on several standard datasets demonstrate that the proposed method outperforms state-of-the-art representatives of the model-based, data-driven and hybrid paradigms.
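
As a rough illustration of why keeping several proposals per joint helps, the following toy sketch picks one proposal per joint so that the summed confidence is maximal while a bone-length constraint still holds. This is not the optimisation used in the paper, which fits the kinematic parameters of a 3D hand model. The function name `pick_consistent_proposals`, the joint names, the coordinates, the confidences and the tolerance are all assumptions made up for this example.

```python
import itertools
import math


def pick_consistent_proposals(proposals, bones, tolerance=5.0):
    """Choose one proposal per joint, maximising the summed confidence
    subject to every bone length lying within `tolerance` of its
    expected length (a toy stand-in for anatomical validity).

    proposals: {joint: [((x, y, z), confidence), ...]}
    bones:     {(joint_a, joint_b): expected_length}
    Returns {joint: (x, y, z)}, or None if no combination is valid.
    """
    joints = sorted(proposals)
    best_score, best_choice = -math.inf, None
    # Brute force over all combinations; fine for a handful of proposals.
    for combo in itertools.product(*(proposals[j] for j in joints)):
        choice = {j: pos for j, (pos, _) in zip(joints, combo)}
        valid = all(
            abs(math.dist(choice[a], choice[b]) - length) <= tolerance
            for (a, b), length in bones.items()
        )
        score = sum(conf for _, conf in combo)
        if valid and score > best_score:
            best_score, best_choice = score, choice
    return best_choice


# Hypothetical data (millimetres): the most confident proposal for the
# ring finger's PIP joint sits on the neighbouring finger.
proposals = {
    "ring_mcp": [((0.0, 0.0, 0.0), 0.9)],
    "ring_pip": [
        ((25.0, 40.0, 0.0), 0.6),   # most confident, but on the wrong finger
        ((0.0, 40.0, 0.0), 0.3),
        ((-20.0, 38.0, 0.0), 0.1),
    ],
}
bones = {("ring_mcp", "ring_pip"): 40.0}

chosen = pick_consistent_proposals(proposals, bones)
print(chosen["ring_pip"])
```

In this made-up case the top-ranked proposal violates the bone-length constraint, so the second proposal is selected instead. A single-output regressor would have had no such alternative to fall back on.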


The BMVC 2015 paper, extended abstract and slides:


Our results for the ICVL and NYU datasets: (19 MB)
(The package contains a readme and an example script describing the data.)
The results for the synthetic test sequence (TrackSeq) can be found together with the dataset (see below).

Synthetic Dataset

For the BMVC 2015 paper we used synthetically generated data to evaluate the influence of the major processing steps and to compare against the approach from FORTH. The datasets were acquired at FORTH. The packages contain our results (for TrackSeq) as well as further descriptions and example scripts illustrating the usage of the data.


  1. Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties. Georg Poier, Konstantinos Roditakis, Samuel Schulter, Damien Michel, Horst Bischof, and Antonis A. Argyros. In Proc. British Machine Vision Conference (BMVC), 2015
    (oral presentation)

