Affordance and 6DOF Pose in 3D Point Clouds.pptx

mozic0104 20 views 11 slides Jun 03, 2024

About This Presentation

Affordance pose detection


Slide Content

Language-Conditioned Affordance-Pose Detection in 3D Point Clouds
Toan Nguyen, Minh Nhat Vu, Baoru Huang, Tuan Van Vo, Vy Truong, Ngan Le, Thieu Vo, Bac Le, Anh Nguyen

Introduction

We address the task of language-conditioned affordance-pose joint learning in 3D point clouds. Given the 3D point cloud of an object and an open-vocabulary affordance text, we detect the corresponding affordance region and generate appropriate 6-DoF poses for that affordance.
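To make "6-DoF pose" concrete: such a pose has three translational and three rotational degrees of freedom. The slides do not state how poses are parameterized, so the sketch below assumes one common choice, a translation vector plus a unit quaternion in (w, x, y, z) order, and converts it to a 4x4 homogeneous transform. The `Pose6DoF` class and its field names are hypothetical and used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose6DoF:
    """Hypothetical container: translation (x, y, z) plus quaternion (w, x, y, z)."""
    translation: tuple
    quaternion: tuple

    def to_matrix(self):
        """Return the equivalent 4x4 homogeneous transform as nested lists."""
        w, x, y, z = self.quaternion
        # Normalize so the rotation block is a valid rotation matrix.
        n = math.sqrt(w * w + x * x + y * y + z * z)
        w, x, y, z = w / n, x / n, y / n, z / n
        tx, ty, tz = self.translation
        return [
            [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y), tx],
            [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x), ty],
            [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y), tz],
            [0.0, 0.0, 0.0, 1.0],
        ]


# Usage: a 90-degree rotation about the z-axis combined with a small offset.
half = math.sqrt(0.5)
pose = Pose6DoF(translation=(0.1, 0.0, 0.2), quaternion=(half, 0.0, 0.0, half))
matrix = pose.to_matrix()
```

The matrix form is convenient when a pose has to be composed with other transforms, for example a camera-to-robot calibration.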

Key Contributions

3DAPNet, a novel and effective method for the task of affordance-pose joint learning.
3DAP, a high-quality dataset of 3D point cloud objects with affordance language labels and affordance-specific 6-DoF poses.
Our method is useful in several real-world robotic manipulation tasks.

3DAP Dataset

We collect affordance-annotated point clouds from 3D AffordanceNet [1]. Affordances are expressed in the form of language texts. We utilize 6-DoF GraspNet [2] to generate pose candidates and then manually select poses for different affordances.

[1] Deng et al., 3D AffordanceNet: A benchmark for visual object affordance understanding. In CVPR 2021.
[2] Mousavian et al., 6-DoF GraspNet: Variational grasp generation for object manipulation. In ICCV 2019.
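The two-stage annotation described above (automatic candidate generation, then manual selection per affordance) could be recorded roughly as follows. This is only a sketch: the actual 3DAP file format is not given in the slides, and `Sample`, `assign_selected_poses`, and every field name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """Hypothetical record for one annotated object (not the real 3DAP schema)."""
    points: list                                # (x, y, z) tuples
    point_labels: list                          # one affordance text per point
    poses: dict = field(default_factory=dict)   # affordance text -> chosen poses


def assign_selected_poses(sample, candidates, selections):
    """Keep only the candidates an annotator picked for each affordance.

    candidates: poses proposed by a grasp generator (opaque objects here).
    selections: affordance text -> indices into `candidates`.
    """
    for affordance, indices in selections.items():
        sample.poses[affordance] = [candidates[i] for i in indices]
    return sample


# Usage with placeholder pose names standing in for real 6-DoF poses.
sample = Sample(
    points=[(0.0, 0.0, 0.0), (0.0, 0.1, 0.0)],
    point_labels=["grasp", "pour"],
)
candidates = ["pose_a", "pose_b", "pose_c"]
assign_selected_poses(sample, candidates, {"grasp": [0, 2], "pour": [1]})
```

Keying the poses by affordance text keeps them affordance-specific, which matches the dataset description.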

3DAPNet

Our 3DAPNet consists of two branches:
1. Open-vocabulary affordance detection branch: detects unlimited affordances.
2. Language-conditioned pose generation branch: produces poses conditioned on the point cloud and the input text.
Our ContextNet combines the two conditions for pose generation.

[Figure: 3DAPNet architecture, with blocks labeled Pose generation, Affordance Detection, and ContextNet]
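The slides do not spell out how the open-vocabulary branch matches points to text. One common way to support an unbounded label set is to embed each affordance text and each point into a shared feature space, then assign every point the text it is most similar to. The sketch below illustrates that idea with cosine similarity on toy vectors; it is an assumption for illustration, not the authors' implementation, and `cosine` and `label_points` are hypothetical helpers.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def label_points(point_features, text_embeddings):
    """Assign each point the affordance whose text embedding is most similar.

    point_features: one feature vector per point.
    text_embeddings: affordance text -> embedding of the same dimension.
    """
    labels = []
    for feature in point_features:
        best = max(text_embeddings, key=lambda name: cosine(feature, text_embeddings[name]))
        labels.append(best)
    return labels


# Toy 2-D embeddings; a real system would use learned encoders for both sides.
text_embeddings = {"grasp": [1.0, 0.0], "pour": [0.0, 1.0]}
point_features = [[0.9, 0.1], [0.2, 0.8]]
labels = label_points(point_features, text_embeddings)
```

Because the label set is just the keys of `text_embeddings`, a new affordance is handled by adding one more text embedding rather than retraining a fixed classifier head.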

Robotic Demonstrations

Thank you very much!