Diffusion time-step curriculum for one image to 3D generation

Citation Data: Proceedings of the IEEE/CVF International Conference on Computer Vision and Pattern Recognition Conference (CVPR), Seattle, 2024 June 17-21, Page: 9948-9958

Publication Year: 2024

Conference Paper Description

Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pretrained 2D diffusion models as teachers to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats the student-teacher knowledge distillation as equal at all time-steps and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which both the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123
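
The coarse-to-fine idea in the description can be illustrated with a minimal sketch. It assumes a linearly annealed sampling window that moves from large time-steps (high noise, coarse structure) to small time-steps (low noise, fine detail) as optimization progresses. The function name, its parameters, and the linear schedule are hypothetical choices made for illustration; they are not taken from the paper or from the DTC123 code.

```python
import random


def curriculum_timestep(step, total_steps, t_max=980, t_min=20, window=100, rng=random):
    """Sample a diffusion time-step from a window that anneals from t_max to t_min.

    Hypothetical illustration only: the linear schedule and the default values
    are assumptions, not the schedule used by DTC123.
    """
    # Fraction of optimization completed, clamped to [0, 1].
    if total_steps <= 1:
        progress = 1.0
    else:
        progress = min(max(step / (total_steps - 1), 0.0), 1.0)

    # The window centre decays linearly from t_max (coarse) to t_min (fine).
    centre = t_max - progress * (t_max - t_min)

    # Clip the window to the valid time-step range.
    low = max(t_min, int(centre - window / 2))
    high = min(t_max, int(centre + window / 2))
    return rng.randint(low, high)


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 250, 500, 750, 999):
        print(step, curriculum_timestep(step, 1000, rng=rng))
```

By contrast, sampling uniformly over the whole range at every iteration would correspond to the indiscriminate treatment of time-steps that the description criticizes.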

Bibliographic Details

REPOSITORY URL: https://ink.library.smu.edu.sg/sis_research/9020

URL ID: https://ink.library.smu.edu.sg/sis_research/9020; https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=10023&context=sis_research

AUTHOR(S)

Xuanyu YI; Zike WU; Qingshan XU; Pan ZHOU; Joo Hwee LIM; Hanwang ZHANG

PUBLISHER(S)

CVPR
