An RGB-D dataset and evaluation methodology for detection and 6D pose estimation of texture-less objects
30 industry-relevant objects with no discriminative color or texture; many are similar in shape, and some objects are parts of others.
Three synchronized sensors used to capture the training and test images: Primesense CARMINE 1.09 (a structured-light RGB-D sensor), Microsoft Kinect v2 (a time-of-flight RGB-D sensor), and Canon IXUS 950 IS (a high-resolution RGB camera).
Training images (38K from each sensor) depict individual objects against a black background.
Test images (10K from each sensor) originate from 20 test scenes. The scene complexity varies from simple scenes with several isolated objects to very challenging ones with multiple object instances and a high amount of clutter and occlusion.
Two types of 3D models for each object: a manually created CAD model and a semi-automatically reconstructed one.
A new evaluation methodology that handles the pose ambiguity caused by object symmetries and occlusions.
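One common way to make a pose error ambiguity-invariant is to score an estimated pose against the *set* of ground-truth poses that are indistinguishable under the object's symmetries, and take the minimum. The sketch below illustrates this idea for rotation error only; it is a minimal illustration of the principle, not the exact error function proposed in the paper, and `ambiguity_aware_rot_error` is a hypothetical helper name.

```python
import numpy as np

def rotation_error(R_est, R_gt):
    # Angular distance (radians) between two 3x3 rotation matrices.
    cos = (np.trace(R_est @ R_gt.T) - 1.0) / 2.0
    return np.arccos(np.clip(cos, -1.0, 1.0))

def ambiguity_aware_rot_error(R_est, equivalent_R_gts):
    # Minimum rotation error over all ground-truth rotations that are
    # equivalent due to object symmetry (illustrative sketch, not the
    # paper's actual error function).
    return min(rotation_error(R_est, R) for R in equivalent_R_gts)
```

For a rotationally symmetric object such as a cylinder, `equivalent_R_gts` would contain (a discretization of) all rotations about its symmetry axis, so an estimate that differs from the annotation only by a spin about that axis is not penalized.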
T. Hodaň, P. Haluza, Š. Obdržálek, J. Matas, M. Lourakis, X. Zabulis, T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects,
IEEE Winter Conference on Applications of Computer Vision (WACV), 2017, Santa Rosa, USA
Objects included in the dataset. Each object is captured from a systematically sampled view sphere, with a 10° step in elevation (from 85° to -85°) and a 5° step in azimuth.
Sample test images. The images are overlaid with colored 3D object models at the ground-truth poses. Each test scene is captured from a systematically sampled view hemisphere, with a 10° step in elevation (from 75° to 15°) and a 5° step in azimuth.
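The stated elevation and azimuth steps determine the number of viewpoints per grid. A small sketch, assuming the azimuth covers the full 360° (the captions do not state the range explicitly) and using hypothetical helper names `view_grid` and `view_dir`:

```python
import numpy as np

def view_grid(elev_max, elev_min, elev_step, azim_step):
    # All (elevation, azimuth) pairs in degrees for a regular sampling
    # of the view sphere between elev_max and elev_min (inclusive).
    elevs = np.arange(elev_max, elev_min - 1, -elev_step)
    azims = np.arange(0, 360, azim_step)          # assumed full circle
    return [(e, a) for e in elevs for a in azims]

def view_dir(elev_deg, azim_deg):
    # Unit vector from the object/scene center toward the camera.
    e, a = np.radians(elev_deg), np.radians(azim_deg)
    return np.array([np.cos(e) * np.cos(a),
                     np.cos(e) * np.sin(a),
                     np.sin(e)])

train_views = view_grid(85, -85, 10, 5)  # full view sphere (objects)
test_views = view_grid(75, 15, 10, 5)    # view hemisphere (test scenes)
print(len(train_views), len(test_views))  # 1296 504
```

Under these assumptions there are 18 elevations x 72 azimuths = 1296 viewpoints on the object view sphere and 7 x 72 = 504 on the test-scene hemisphere.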