lobifestival.blogg.se - Content 3d fnable

#CONTENT 3D FNABLE UPDATE#

Stable Diffusion versions 1.5, 2.0 and 2.1 are supported; 2.1 is now the default.

To use the CUDA-free Taichi backend, add `--backbone grid_taichi`:

`python3 main.py --text "a hamburger" --workspace trial -O --backbone grid_taichi`

Arguments can also be collected in a file and loaded with `--file`. You can override them by specifying arguments after `--file`. Note that quoted strings can't be loaded from the arguments file:

`python main.py --file scripts/res64.args --workspace trial_awesome_hamburger --text "a photo of an awesome hamburger"`
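The override behaviour described above can be sketched with `argparse`. This is only an illustration of one way such a mechanism could work, not the project's actual code: the helper names `load_args_file` and `parse`, and the small set of options, are assumptions made for the example.

```python
import argparse


def load_args_file(path):
    """Read whitespace-separated tokens from an arguments file (hypothetical helper)."""
    with open(path) as f:
        # Naive split: quotes are not interpreted, so a quoted multi-word
        # string would be broken into several tokens here.
        return f.read().split()


def parse(argv):
    # First pass: pick out --file and leave everything else untouched.
    pre = argparse.ArgumentParser(add_help=False)
    pre.add_argument("--file", default=None)
    known, rest = pre.parse_known_args(argv)

    # Second pass: the real options (an illustrative subset only).
    parser = argparse.ArgumentParser()
    parser.add_argument("--text", default=None)
    parser.add_argument("--workspace", default="workspace")
    parser.add_argument("--iters", type=int, default=10000)

    file_tokens = load_args_file(known.file) if known.file else []
    # File tokens come first, so a value repeated on the command line wins:
    # argparse keeps the last occurrence of an option.
    return parser.parse_args(file_tokens + rest)
```

With a file containing `--workspace from_file --iters 5000`, calling `parse(["--file", path, "--workspace", "trial"])` would keep `iters` from the file and take `workspace` from the command line.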

Instant-NGP NeRF backbone (the stable-dreamfusion setting). Advantages: faster rendering speed and less GPU memory (~16G). Drawback: the CUDA extensions need to be built (a CUDA-free Taichi backend is available).

Train with a text prompt using the default settings. `-O` equals `--cuda_ray --fp16`, and `--cuda_ray` enables Instant-NGP-like occupancy-grid-based acceleration:

`python main.py --text "a hamburger" --workspace trial -O`

To reduce Stable Diffusion memory usage, add `--vram_O`, which enables various VRAM savings:

`python main.py --text "a hamburger" --workspace trial -O --vram_O`
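To give an idea of what occupancy-grid-based acceleration means, here is a toy 2D sketch in pure Python. It is not the CUDA ray marcher used by the project. The function names, the grid resolution, and the thresholding rule are assumptions chosen for illustration. The idea is that a coarse grid records which cells contain density, and ray samples falling in empty cells skip the expensive density query.

```python
def build_occupancy(density_fn, res, threshold):
    """Mark each cell of a res x res grid over [0, 1)^2 as occupied when the
    density at the cell center exceeds the threshold (a simplification)."""
    grid = []
    for j in range(res):
        row = []
        for i in range(res):
            cx, cy = (i + 0.5) / res, (j + 0.5) / res
            row.append(density_fn(cx, cy) > threshold)
        grid.append(row)
    return grid


def march(origin, direction, grid, n_steps, density_fn):
    """Accumulate density along a ray of length 1, querying density_fn only
    for samples inside occupied cells. Returns (integral, number of queries)."""
    res = len(grid)
    dt = 1.0 / n_steps
    total, queries = 0.0, 0
    for k in range(n_steps):
        t = (k + 0.5) / n_steps
        x = origin[0] + t * direction[0]
        y = origin[1] + t * direction[1]
        if not (0.0 <= x < 1.0 and 0.0 <= y < 1.0):
            continue  # sample lies outside the scene bounds
        if not grid[int(y * res)][int(x * res)]:
            continue  # empty cell: skip the expensive query
        total += density_fn(x, y) * dt
        queries += 1
    return total, queries
```

For a scene that is mostly empty, the skipped queries are where the speed-up comes from. A real implementation also has to keep the grid up to date while the density field changes during training, which this sketch leaves out.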

We use the multi-resolution grid encoder to implement the NeRF backbone (implementation from torch-ngp), which enables much faster rendering (~10FPS at 800x800).
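As a rough picture of what a multi-resolution grid encoder does, the following pure-Python sketch stores a small feature vector at every vertex of several 2D grids and concatenates the bilinearly interpolated features from each level. It is a simplified stand-in, not the torch-ngp implementation: the class name `MultiResGrid2D`, the dense (non-hashed) 2D tables, and the chosen resolutions are all assumptions for illustration.

```python
import random


class MultiResGrid2D:
    """Dense multi-resolution feature grid over [0, 1]^2 (illustrative only)."""

    def __init__(self, resolutions=(4, 8, 16), feat_dim=2, seed=0):
        rng = random.Random(seed)
        self.feat_dim = feat_dim
        self.levels = []
        for res in resolutions:
            # (res + 1) x (res + 1) vertices, each holding a feature vector.
            # In a real model these values would be trainable parameters.
            table = [[[rng.uniform(-0.1, 0.1) for _ in range(feat_dim)]
                      for _ in range(res + 1)]
                     for _ in range(res + 1)]
            self.levels.append((res, table))

    def encode(self, x, y):
        """Return the concatenated per-level features for a point in [0, 1]^2."""
        out = []
        for res, table in self.levels:
            gx, gy = x * res, y * res
            # Index of the lower-left vertex, clamped so i0 + 1 stays valid.
            i0 = min(int(gx), res - 1)
            j0 = min(int(gy), res - 1)
            fx, fy = gx - i0, gy - j0
            for c in range(self.feat_dim):
                v00 = table[j0][i0][c]
                v10 = table[j0][i0 + 1][c]
                v01 = table[j0 + 1][i0][c]
                v11 = table[j0 + 1][i0 + 1][c]
                low = v00 * (1 - fx) + v10 * fx
                high = v01 * (1 - fx) + v11 * fx
                out.append(low * (1 - fy) + high * fy)
        return out
```

The encoded vector (here 3 levels x 2 features = 6 values) would then be fed to a small MLP. Because most of the capacity sits in the grid lookups rather than in a deep network, each query is cheap, which is what makes fast rendering possible.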

Since the Imagen model is not publicly available, we use Stable Diffusion to replace it (implementation from diffusers). Unlike Imagen, Stable Diffusion is a latent diffusion model: it diffuses in a latent space instead of the original image space. Therefore, the loss also has to propagate back through the VAE's encoder, which introduces extra time cost in training.
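The extra backward step through the encoder can be illustrated with a toy example. This sketch assumes a linear "encoder" matrix `E` in place of the real VAE encoder, a caller-supplied `predict_noise` function in place of the diffusion U-Net, and a DreamFusion-style guidance gradient of the form `weight * (eps_hat - eps)` on the latent. All names are hypothetical, and a real implementation would rely on automatic differentiation rather than the hand-written transpose used here.

```python
import math
import random


def encode(E, x):
    """Toy linear encoder: z = E x (stands in for the VAE encoder)."""
    return [sum(E[k][i] * x[i] for i in range(len(x))) for k in range(len(E))]


def encoder_vjp(E, g_z):
    """Backward pass of the linear encoder: g_x = E^T g_z."""
    return [sum(E[k][i] * g_z[k] for k in range(len(E)))
            for i in range(len(E[0]))]


def guidance_grad_wrt_image(E, x, predict_noise, alpha_bar, weight, rng):
    """Return (g_x, g_z): the guidance gradient on the image and on the latent."""
    z = encode(E, x)                                   # image -> latent
    eps = [rng.gauss(0.0, 1.0) for _ in z]             # sampled noise
    z_t = [math.sqrt(alpha_bar) * zi + math.sqrt(1.0 - alpha_bar) * ei
           for zi, ei in zip(z, eps)]                  # noised latent
    eps_hat = predict_noise(z_t)                       # stand-in for the U-Net
    # Gradient with respect to the latent (the U-Net Jacobian is skipped).
    g_z = [weight * (eh - e) for eh, e in zip(eps_hat, eps)]
    # Extra step required by latent diffusion: push the gradient back
    # through the encoder to reach the rendered image.
    g_x = encoder_vjp(E, g_z)
    return g_x, g_z
```

With a pixel-space model the gradient could be applied to the rendered image directly. Here every training step pays for an encoder forward pass and the corresponding backward pass as well.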

Notable differences from the paper

This project is a work in progress and differs from the paper in many ways. The current generation quality cannot match the results from the original paper, and many prompts still fail badly!


A PyTorch implementation of the text-to-3D model DreamFusion, powered by the Stable Diffusion text-to-2D model.

Update Logs: Support of DeepFloyd-IF as the guidance model. Enhanced Image-to-3D quality, with support for the Image + Text condition of Make-it-3D.

Demo videos: text-to-3d.mp4, image-to-3d-0123.mp4