9th International Workshop on Recovering 6D Object Pose (R6D)

Organized at ECCV 2024, September 29 (morning), Milan

6D object pose estimation

Recording

The workshop was held at the ECCV 2024 conference in Milan on September 29, 2024. It was streamed via Zoom and recorded. Anyone could join virtually; ECCV registration was not required.

The video was recorded and edited by Faraz Kamili.

Program

09:00  Opening: Tomas Hodan (Meta)
09:10  Talk 1: Xiaowei Zhou (Zhejiang University) - Detector-free feature matching for pose estimation
09:40  Talk 2: Benjamin Busam (TUM) - Relaxing pose estimation: Unseen objects and visual challenges
10:10  Coffee break at posters (posters of BOP winners, workshop papers and invited conference papers)
11:00  Talk 3: Stan Birchfield (NVIDIA) - Practical robotic scene awareness
11:30  Talk 4: Jakob Engel (Meta) - Towards Contextual AI: Tracking objects and our interactions with them
12:00  BOP Challenge 2024: Van Nguyen Nguyen, Tomas Hodan, Martin Sundermeyer (all authors)
13:00  End of workshop


Introduction

The R6D workshops cover topics related to vision-based estimation of 6D object pose (3D translation and 3D rotation), an important problem for application fields such as robotic manipulation, augmented reality, and autonomous driving. The 9th edition, organized at ECCV 2024, featured four invited talks by experts in the field, the presentation of the BOP Challenge 2024 awards, and oral/poster presentations of accepted workshop papers. The workshop was attended by researchers working on related topics in both academia and industry.
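For readers new to the topic, the following minimal NumPy sketch illustrates what a 6D pose is in practice: a 3x3 rotation and a 3D translation, often packed into a 4x4 rigid transform that maps points from the object (model) frame to the camera frame. The numeric values are illustrative only.

```python
import numpy as np

# A 6D object pose is a rigid transform: a 3D rotation R (3x3) plus a 3D translation t.
# Illustrative values: 90-degree rotation about the z-axis, translation along x and z.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([0.1, 0.0, 0.5])  # meters

# Homogeneous 4x4 form, as commonly used in pose-estimation pipelines.
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

# Transform 3D points from the object (model) frame into the camera frame.
points_model = np.array([[0.00, 0.0, 0.0],
                         [0.02, 0.0, 0.0]])  # e.g., two vertices of a CAD model
points_cam = (R @ points_model.T).T + t
print(points_cam)
```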

Over the past five years, we have witnessed major improvements in model-based object pose estimation: the accuracy has increased by more than 50% since 2019, competitive results have been achieved when training only on synthetic images, and the top 2023 method for unseen objects (CAD models not available at training) achieves the same level of accuracy as the best 2020 method for seen objects (BOP'23 report). While the model-based problem is relevant for warehouse or factory settings, where CAD models of the target objects are often available, its applicability is limited in open-world scenarios.

Together with the workshop, we organize the BOP Challenge 2024, which focuses on model-free 6D object pose estimation. In this setup, CAD models are not available and methods need to learn new objects during a short onboarding stage from a set of reference images. Such methods minimize the requirements for onboarding new objects and unlock new types of applications, including augmented-reality systems capable of prompt object indexing and re-identification.

Previous workshops: 1st edition (ICCV'15), 2nd edition (ECCV'16), 3rd edition (ICCV'17), 4th edition (ECCV'18), 5th edition (ICCV'19), 6th edition (ECCV'20), 7th edition (ECCV'22), 8th edition (ICCV'23).

BOP Challenge 2024

To measure progress in the field of object pose estimation, we created the BOP benchmark in 2017 and have been organizing challenges on the benchmark datasets together with the R6D workshops since then. The field has come a long way, with the accuracy of model-based 6D localization of seen objects improving by more than 50% (from 56.9 to 85.6 AR). In 2023, as accuracy on this classical task was slowly saturating, we introduced a more practical yet more challenging task: model-based 6D localization of unseen objects, where new objects need to be onboarded from their CAD models alone, in at most 5 minutes on a single GPU. Notably, the best 2023 method for unseen objects achieved the same level of accuracy as the best 2020 method for seen objects, although its run time still leaves room for improvement. Besides 6D object localization, we have also been evaluating 2D object detection and 2D object segmentation. See the BOP 2023 paper for details.
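As an illustration of what onboarding from a CAD model can involve, the sketch below samples camera viewpoints on a sphere around the object, a common first step before rendering templates of the model. This is a generic, hedged example: the sampling scheme, radius, and function names are our assumptions, not part of the challenge protocol.

```python
import numpy as np

def fibonacci_sphere(n):
    """Roughly uniform viewpoints on the unit sphere (Fibonacci lattice)."""
    i = np.arange(n)
    phi = np.pi * (3.0 - np.sqrt(5.0)) * i          # golden-angle increments
    z = 1.0 - 2.0 * (i + 0.5) / n
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def look_at(cam_pos, target=np.zeros(3), up=np.array([0.0, 0.0, 1.0])):
    """4x4 camera-to-object pose (x right, y down, z forward) looking at `target`."""
    fwd = target - cam_pos
    fwd = fwd / np.linalg.norm(fwd)
    right = np.cross(fwd, up)
    if np.linalg.norm(right) < 1e-6:               # viewpoint parallel to `up`
        right = np.array([1.0, 0.0, 0.0])
    right = right / np.linalg.norm(right)
    down = np.cross(fwd, right)
    T = np.eye(4)
    T[:3, 0], T[:3, 1], T[:3, 2], T[:3, 3] = right, down, fwd, cam_pos
    return T

# 200 camera poses on a 0.4 m sphere around the object origin; each pose could
# be used to render one onboarding template of the CAD model.
poses = [look_at(0.4 * v) for v in fibonacci_sphere(200)]
print(len(poses), poses[0].shape)
```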

In 2024, we introduce a model-free variant of all tasks, define a new 6D object detection task, and add three new publicly available datasets: HOT3D, HOPEv2, and HANDAL.

In the model-free tasks, methods can learn new objects from two types of onboarding videos (a data-loading sketch follows the two setups):

Static onboarding: The object is static (standing on a desk) and the camera moves around it, capturing all possible object views. Two videos are available: one with the object standing upright and one with the object standing upside-down. Object masks and 6D object poses are available for all video frames, which may be useful for 3D object reconstruction using methods such as NeRF or Gaussian Splatting.

Dynamic onboarding: The object is manipulated by hands and the camera is either static (on a tripod) or dynamic (on a head-mounted device). Object masks for all video frames and the 6D object pose for the first frame are available, which may be useful for 3D object reconstruction using methods such as BundleSDF or Hampali et al. The dynamic onboarding setup is more challenging but more natural for AR/VR applications.
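To make the onboarding inputs concrete, here is a minimal sketch of loading a static-onboarding video into (image, mask, pose) triplets of the kind a NeRF- or Gaussian-Splatting-style reconstruction pipeline could consume. The directory layout and file names are hypothetical; consult the official BOP toolkit and dataset documentation for the actual format.

```python
import json
from pathlib import Path

import numpy as np
from PIL import Image

# Hypothetical layout of a static-onboarding video; the real BOP'24 format may differ:
#   onboarding/rgb/000000.png ...   color frames
#   onboarding/mask/000000.png ...  binary object masks
#   onboarding/poses.json           per-frame 4x4 camera-to-object poses (flat lists)
root = Path("onboarding")
poses = json.loads((root / "poses.json").read_text())

frames = []
for rgb_path in sorted((root / "rgb").glob("*.png")):
    frame_id = rgb_path.stem
    rgb = np.asarray(Image.open(rgb_path))
    mask = np.asarray(Image.open(root / "mask" / rgb_path.name)) > 0
    T = np.asarray(poses[frame_id], dtype=np.float64).reshape(4, 4)
    # Masked frames plus per-frame poses are the inputs that NeRF- or
    # Gaussian-Splatting-style reconstruction consumes during onboarding.
    frames.append({"rgb": rgb, "mask": mask, "pose": T})

print(f"Loaded {len(frames)} onboarding frames")
```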

Three new datasets are introduced in BOP Challenge 2024: HOT3D, HOPEv2, and HANDAL.

Join the challenge

Call for papers

We invite submissions of unpublished work. Accepted papers will be published in the ECCV workshop proceedings and presented at the workshop.

Papers must be 4–14 pages long (additional pages containing only cited references are allowed), must follow the format of the main conference (with the exception of the page limit and the submission system), and must be submitted through the CMT system.

The covered topics include but are not limited to:

Dates

Paper submission deadline: July 23, 2024 (11:59PM PST; extended from July 19)
Paper acceptance notification: August 9, 2024
Paper camera-ready version: August 23, 2024 (11:59PM PST)
Submission deadline for the BOP Challenge 2024: November 29, 2024 (11:59PM UTC)
Workshop date: September 29, 2024 (morning)

Organizers

Tomáš Hodaň, Reality Labs at Meta, tomhodan@meta.com
Martin Sundermeyer, Google, msundermeyer42@gmail.com
Van Nguyen Nguyen, ENPC ParisTech, vanngn.nguyen@gmail.com
Stephen Tyree, NVIDIA
Andrew Guo, University of Toronto, NVIDIA
Médéric Fourmy, Czech Technical University in Prague
Jonathan Tremblay, NVIDIA
Yann Labbé, Reality Labs at Meta
Eric Brachmann, Niantic
Bertram Drost, MVTec
Sindi Shkodrani, Reality Labs at Meta
Lukas Ranftl, MVTec
Carsten Steger, Technical University of Munich, MVTec
Vincent Lepetit, ENPC ParisTech
Carsten Rother, Heidelberg University
Stan Birchfield, NVIDIA
Jiří Matas, Czech Technical University in Prague