FlexiDreamer Single Image-to-3D Generation with FlexiCubes
leveraging a flexible gradient-based extraction known as FlexiCubes
direct acquisition of the target mesh
1 minute, first extracting mesh then texturing
DreamReward Text-to-3D Generation with Human Preference
3D reward model
BrepGen A B-rep Generative Diffusion Model with Structured Latent Geometry
structured latent geometry in a hierarchical tree, representing a whole CAD solid
generating complicated geometry
ARTIC3D Learning Robust Articulated 3D Shapes from Noisy Web Image Collections
uses 2D diffusion as a prior
Sketch-A-Shape Zero-Shot Sketch-to-3D Shape Generation
converts both sketches and CADs into meshes
without requiring any paired datasets during training
Sketch2NeRF Multi-view Sketch-guided Text-to-3D Generation (sketch control to 3D generation)
3Doodle Compact Abstraction of Objects with 3D Strokes
abstract sketches containing 3D characteristic shapes of the objects
Sketch3D Style-Consistent Guidance for Sketch-to-3D Generation
generating realistic 3D Gaussians consistent with color-style described in textual description
Point-Cloud Completion with Pretrained Text-to-Image Diffusion Models
from an incomplete point cloud to a 3D model
Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives
turned into big primitive blocks; interpretable, easy to manipulate, and suited for physics-based simulations
FlexiCubes: Flexible Isosurface Extraction for Gradient-Based Mesh Optimization
hierarchically-adaptive meshes, local flexible adjustments
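For orientation, here is a minimal sketch of the plain (non-differentiable) isosurface extraction that FlexiCubes improves on, using scikit-image's marching cubes on a toy SDF grid; FlexiCubes instead makes the extraction itself differentiable via per-cell flexible parameters.

```python
# Minimal sketch: baseline isosurface extraction from a signed-distance grid.
# FlexiCubes replaces this fixed extraction with a differentiable one that has
# per-cell learnable parameters; this snippet only illustrates the starting point.
import numpy as np
from skimage import measure

# Sample a sphere SDF on a regular grid (illustrative data, not from the paper).
res = 64
xs = np.linspace(-1.0, 1.0, res)
x, y, z = np.meshgrid(xs, xs, xs, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.5  # negative inside, positive outside

# Extract the zero level set as a triangle mesh (marching cubes).
verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)
print(verts.shape, faces.shape)
```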
ComboVerse Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance
learning to combine multiple models
adjust their sizes, rotation angles, and locations to create a 3D asset that matches the given image
GetMesh A Controllable Model for High-quality Mesh Generation and Manipulation
mesh autoencoder; points are re-organized into a triplane representation by projecting them onto the triplanes
combining mesh parts across categories, adding/removing mesh parts,
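A minimal sketch of a generic triplane feature query (not GetMesh's exact implementation): a 3D point is projected onto the XY/XZ/YZ planes, and features are bilinearly sampled from each plane and aggregated.

```python
# Generic triplane query sketch: project a point onto three axis-aligned planes,
# sample each feature plane bilinearly, and sum the three features.
import torch
import torch.nn.functional as F

def query_triplane(planes, points):
    """planes: (3, C, H, W) feature planes; points: (N, 3) in [-1, 1]."""
    xy = points[:, [0, 1]]
    xz = points[:, [0, 2]]
    yz = points[:, [1, 2]]
    feats = []
    for plane, coords in zip(planes, (xy, xz, yz)):
        # grid_sample expects (N, C, H, W) input and (N, H_out, W_out, 2) grid.
        grid = coords.view(1, -1, 1, 2)
        sampled = F.grid_sample(plane.unsqueeze(0), grid, align_corners=True)
        feats.append(sampled.view(plane.shape[0], -1).t())  # (N, C)
    return sum(feats)  # aggregated per-point feature

planes = torch.randn(3, 32, 128, 128)
pts = torch.rand(1024, 3) * 2 - 1
print(query_triplane(planes, pts).shape)  # torch.Size([1024, 32])
```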
3D-LFM Lifting Foundation Model ==best==
lifting of 3D structure and camera from 2D landmarks
2L3 Lifting Imperfect Generated 2D Images into Accurate 3D
utilize multi-view 3D reconstruction to fuse generated MV images into consistent 3D objects
SceneDreamer Unbounded 3D Scene Generation from 2D Image Collections (landscapes, scenarios)
Michelangelo Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation
image text shape prior = mesh
One-2-3-45 Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization
uses SD to enforce consistency of the diffusion generation
Magic123 One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors ==best==
uses both priors in both stages; coarse geometry in the first stage, then details
DMV3D Denoising Multi-View Diffusion using 3D Large Reconstruction Model (generates nerf)
ImageDream Image-Prompt Multi-view Diffusion for 3D Generation ==best==
Customize-It-3D High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior
subject-specific and multi-modal diffusion model; gets NeRF
enhances textures from the coarse stage
HarmonyView Harmonizing Consistency and Diversity in One-Image-to-3D
ZeroShape Regression-based Zero-shot Shape Reconstruction (just the shape)
trained to directly regress the object shape; computationally efficient
FDGaussian Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model
extract 3D geometric features from the 2D input
incorporating attention to fuse images from different viewpoints
GeoWizard Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
encodes the image, depth, and normals into a latent space that is fed into a U-Net for generation
T-Pixel2Mesh Combining Global and Local Transformer for 3D Mesh Generation from a Single Image
coarse-to-fine to refine mesh details
Magic-Boost Boost 3D Generation with Multi-View Conditioned Diffusion
refines coarse generative results, high consistency
Splatter Image: Ultra-Fast Single-View 3D Reconstruction
TGS Triplane meets Gaussian Splatting ==best==
fast single-view 3d reconstruction with transformers
LRM Large Reconstruction Model for Single Image to 3D (5 seconds)
learns to directly predict a NeRF from the image
OpenLRM Open-Source Large Reconstruction Models
Image-to-3D in 10+ seconds!
Zero-1-to-3 Zero-shot One Image to 3D Object; Zero123Plus is the model
DreamCraft3D Hierarchical 3D Generation with Bootstrapped Diffusion Prior ==best==
image to guide the geometry sculpting and texture boosting
imbuing it with 3D knowledge of the scene being optimized = view-consistent guidance for the scene
Wonder3D Single Image to 3D using Cross-Domain Diffusion ==best==
geometry-aware normal fusion algorithm that merges the generated 2D representations
iFusion Inverting Diffusion for Pose-Free Reconstruction from Sparse Views
repurposing Zero123 for camera pose estimation
ViewFusion Towards Multi-View Consistency via Interpolated Denoising
auto-regressive method, previously generated views as context for the next view generation
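A schematic sketch of the autoregressive idea (hypothetical `sample`/`context` names, not ViewFusion's actual API): each new view is denoised conditioned on the input image plus all previously generated views.

```python
# Schematic only: accumulate generated views as conditioning context so that
# consistency carries over from one viewpoint to the next.
def generate_views(diffusion_model, input_image, camera_poses):
    context = [input_image]          # views available so far
    generated = []
    for pose in camera_poses:
        # `sample` is a hypothetical conditional sampler: it denoises a new
        # view conditioned on the running context and the target camera pose.
        view = diffusion_model.sample(context=context, target_pose=pose)
        generated.append(view)
        context.append(view)         # feed the new view back in as context
    return generated
```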
Isotropic3D Image-to-3D Generation Based on a Single CLIP Embedding
fine-tune a text-to-3D diffusion model by substituting its text encoder with an image encoder
a single-image CLIP embedding used to generate multi-view images
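A minimal sketch of the conditioning signal involved, assuming a Hugging Face CLIP checkpoint: a single image embedding that could stand in for the text embedding when conditioning the diffusion model.

```python
# Minimal sketch (assumes the Hugging Face transformers CLIP checkpoint below
# and a hypothetical local image "object.png"): extract one image embedding.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

image = Image.open("object.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    image_embedding = model.get_image_features(**inputs)  # (1, projection_dim)
print(image_embedding.shape)
```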
HyperDreamer Hyper-Realistic 3D Content Generation and Editing from a Single Image
interactively select regions via clicks and edit the texture with text guidance
modeling region-aware materials with high-resolution textures and enabling user-friendly editing
guidance: semantic segmentation and data-driven priors (albedo, roughness, specular properties)
EscherNet A Generative Model for Scalable View Synthesis
self-attention with M target views to ensure target consistency, more consistent across frames
capable of creating high-quality outputs in less than a second
USDZ format
awesome tripo hd using layer separation
https://twitter.com/toyxyz3/status/1803511432082051201
MVD-Fusion Single-view 3D via Depth-consistent Multi-view Generation
output to multiple 3D representations such as point clouds, textured meshes, and Gaussian splats
SHAP-EDITOR Instruction-guided Latent 3D Editing in Seconds
3D editing directly by a feed-forward network, one second per edit
Repaint123 Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting
visibility-aware adaptive repainting (view-consistent)
IMAGE SCULPTING ==best==
STABLEIDENTITY, inserting identity
threefiner interface for text-guided mesh refinement
SC-Diff 3D Shape Completion with Latent Diffusion Models
realistic shape completions at superior resolutions
Generic 3D Diffusion Adapter Using Controlled Multi-View Editing
MVEdit as 3D counterpart of SDEdit
3D Adapter lifts 2D; also for texture generation
MagicClay Sculpting Meshes With Generative Neural Fields
maintains consistency between mesh and Signed Distance Field (SDF) representations
DragTex Generative Point-Based Texture Editing on 3D Mesh
dragging the texture
diffusion model to blend locally inconsistent textures between different views
enabling locally consistent texture editing
MVDiffusion Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion, panorama
MVDiffusion++ A Dense High-resolution Multi-view Diffusion Model for Single to Sparse-view 3D Object Reconstruction
2D latent features learn 3D consistency
Zero123++ a Single Image to Consistent Multi-view Diffusion
ZeroNVS Zero-Shot 360-Degree View Synthesis from a Single Real Image
single-image novel view synthesis; multi-object scenes with complex backgrounds, indoor and outdoor
camera conditioning parameterization and normalization scheme
HOLODIFFUSION Training a 3D Diffusion Model using 2D Images
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
using video-diffusion instead, reduces the number of evaluations of the 2D generator network 10-100x
V3D Video Diffusion Models are Effective 3D Generators
given a single image
CustomNet Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models
starts from SD checkpoints
incorporates 3D novel view synthesis into the object customization process ==best==
with flexible background control
DreamComposer Controllable 3D Object Generation via Multi-View Conditions ==best==
view-aware 3D lifting module to obtain 3D representations, which are injected into the 2D model
MVDream Multi-view Diffusion for 3D Generation ==best one==
generates a NeRF without the Janus problem, starting from a normal 2D diffusion model
sd + multi-view dataset rendered from 3D assets
2D diffusion + consistency of 3D data
SPAD Spatially Aware Multiview Diffusers
utilize Plücker coordinates derived from camera rays and inject them as positional encoding
resolves the Janus issue
leverage the multi-view Score Distillation Sampling (SDS) for 3D asset generation
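A minimal sketch of per-pixel Plücker ray coordinates (direction plus moment o × d), the kind of 6-D ray encoding injected as positional information; the camera conventions here are simplifying assumptions, not the paper's exact setup.

```python
# Sketch: compute (direction, origin x direction) Plücker coordinates per pixel.
import torch

def plucker_rays(K, c2w, h, w):
    """K: (3,3) intrinsics; c2w: (4,4) camera-to-world; returns (h, w, 6)."""
    ys, xs = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                            torch.arange(w, dtype=torch.float32), indexing="ij")
    pix = torch.stack([xs + 0.5, ys + 0.5, torch.ones_like(xs)], dim=-1)  # (h, w, 3)
    dirs_cam = pix @ torch.linalg.inv(K).T          # camera-space ray directions
    dirs = dirs_cam @ c2w[:3, :3].T                 # rotate into world space
    dirs = dirs / dirs.norm(dim=-1, keepdim=True)
    origins = c2w[:3, 3].expand_as(dirs)            # ray origin = camera center
    moment = torch.cross(origins, dirs, dim=-1)     # o x d
    return torch.cat([dirs, moment], dim=-1)        # (h, w, 6) positional encoding
```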
GeNVS Generative Novel View Synthesis with 3D-Aware Diffusion Models (1 image to 3d video)
MVDiffusion Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion
generate new consistent perspective view
SyncDreamer Generating Multiview-consistent Images from a Single-view Image
multiview diffusion model that models the joint probability distribution of multiview images
3D-aware feature attention
Carve3D Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning
Multi-view Reconstruction Consistency (MRC) metric
MVSplat Efficient 3D Gaussian Splatting from Sparse Multi-View Images
via plane sweeping in 3D space; geometric cues for estimating depth; 10× fewer parameters, 2× faster
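A simplified sketch of plane-sweep depth estimation between a reference and a source view (a generic cost-volume construction, not MVSplat's exact architecture): reference pixels are unprojected at candidate depths, warped into the source view, and correlated.

```python
# Sketch: per-pixel depth via a plane-sweep cost volume between two feature maps.
import torch
import torch.nn.functional as F

def plane_sweep_depth(feat_ref, feat_src, K, ref2src, depths):
    """feat_*: (C, H, W); K: (3,3); ref2src: (4,4) relative pose; depths: (D,)."""
    C, H, W = feat_ref.shape
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], -1).reshape(-1, 3)   # (HW, 3)
    rays = pix @ torch.linalg.inv(K).T                                    # reference-camera rays
    costs = []
    for d in depths:
        pts = rays * d                                                    # 3D points at this depth
        pts_src = pts @ ref2src[:3, :3].T + ref2src[:3, 3]                # into the source camera
        uv = pts_src @ K.T
        uv = uv[:, :2] / uv[:, 2:3].clamp(min=1e-6)                       # perspective divide
        grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,                   # normalize for grid_sample
                            uv[:, 1] / (H - 1) * 2 - 1], -1).view(1, H, W, 2)
        warped = F.grid_sample(feat_src[None], grid, align_corners=True)[0]
        costs.append((warped * feat_ref).sum(0))                          # correlation cost
    cost_volume = torch.stack(costs)                                      # (D, H, W)
    weights = torch.softmax(cost_volume, dim=0)
    return (weights * depths.view(-1, 1, 1)).sum(0)                       # per-pixel depth
```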
DreamDistribution Prompt Distribution Learning for Text-to-Image Diffusion Models ==best==
finds a prompt distribution from reference images, then uses it to generate new 2D/3D instances
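A minimal sketch of the "prompt distribution" idea as a reparameterized Gaussian over soft prompt embeddings (a generic construction, not the paper's exact training objective).

```python
# Sketch: a learnable Gaussian over soft prompt token embeddings; sampling is
# reparameterized so the distribution can be fit to a set of reference images.
import torch
import torch.nn as nn

class PromptDistribution(nn.Module):
    def __init__(self, num_tokens=8, dim=768):
        super().__init__()
        self.mean = nn.Parameter(torch.zeros(num_tokens, dim))
        self.log_std = nn.Parameter(torch.zeros(num_tokens, dim))

    def sample(self):
        eps = torch.randn_like(self.mean)
        return self.mean + self.log_std.exp() * eps   # differentiable sample

prompt_dist = PromptDistribution()
soft_prompt = prompt_dist.sample()   # (8, 768), fed to the text encoder in place of word embeddings
print(soft_prompt.shape)
```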
Reference-Based 3D-Aware Image Editing with Triplane
Leveraging 3D-aware triplanes, our edits are versatile, allowing for rendering from various viewpoints
Learning Disentangled Avatars with Hybrid 3D Representations
hair, face, body and clothing can be learned then disentangled yet jointly rendered
PaintAnything 3D with Lighting-Less Texture Diffusion Models (Tencent) ==best==
lighting-less textures; image painted over input model, generate texture from text
Text2Room Extracting Textured 3D Meshes from 2D Text-to-Image Models (textures)
DreamSpace Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation (vr)
coarse-to-fine panoramic texture generation which both considers geometry and texture cues
imagines a panorama, then propagates it with inpainting
ControlMat A Controlled Generative Approach to Material Capture; diffusion
uncontrolled illumination as input; get variety of materials which could correspond to the input image
TexFusion Synthesizing 3D Textures with Text-Guided Image Diffusion Models
synthesize textures for given 3D geometries
aggregate the different denoising predictions on a shared latent texture map
Alchemist Parametric Control of Material Properties with Diffusion Models
control material attributes of objects like roughness, metallic, albedo, and transparency in real images
edit material properties in real-world images while preserving all other attributes
TextureDreamer Image-guided Texture Synthesis through Geometry-aware Diffusion
relightable textures from a small number of input images to target 3D shapes
UltrAvatar A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures
removes lighting effects so the avatar can then be rendered under various lighting conditions
Holo-Gen by Unity, generate physically-based rendering (PBR) material properties for 3D objects
FlashTex Fast Relightable Mesh Texturing with LightControlNet
texturing an input 3D mesh with prompt, high-quality and relightable textures
based on the ControlNet architecture
3DStyle-Diffusion Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models ==best==
given input mesh, use 2d diffusion to generate coherent and relightable textures exploiting depth maps
SyncTweedies A General Generative Framework Based on Synchronized Diffusions
denoising in multiple instance spaces
3D Mesh Texturing, gaussian splat texturing, depth to panorama
MatAtlas Text-driven Consistent Geometry Texturing and Material Assignment
sd as a prior to texture a 3D model
Enhancing Texture Generation with High-Fidelity Using Advanced Texture Priors
rough texture as the initial input, texture enhancement, eliminating noise and "point gaps"
NeuralHaircut: Prior-Guided Strand-Based Hair Reconstruction, realism hair modeling, personalization
CT2Hair High-Fidelity 3D Hair Modeling using Computed Tomography
real-world hair wigs as input, populates dense strands
HAAR Text-Conditioned Generative Model of 3D Strand-based Human Hairstyles
latent diffusion model that operates in a common hairstyle UV space
parent: diffusion
Chupa Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models
clothed-human mesh diffusion with text guidance
TryOnDiffusion A Tale of Two UNets (dress one with clothes from another, pose-body change)
DreamPose Fashion Image-to-Video Synthesis via Stable Diffusion
X-MDPT Cross-view Masked Diffusion Transformers for Person Image Synthesis
change the pose keep the clothes, employs masked diffusion transformers on latent patches
OOTDiffusion A highly controllable open-source tool for virtual clothing try-on ==best one==
GALA Generating Animatable Layered Assets from a Single Scan
from one scan, gets all the clothes as independent, separated, segmented parts
DressCode Autoregressively Sewing and Generating Garments from Text Guidance
generate sewing patterns with text guidance, also for editing
Garment3DGen 3D Garment Stylization and Texture Generation
synthesize 3D garment assets from a base mesh given a single input image as guidance
Design2Cloth 3D Cloth Generation from 2D Masks
synthesizes the decomposed 3D meshes from a single image
HEAD POSE: face-style swap, get head pose
HACK Learning a Parametric Head and Neck Model for High-fidelity Animation
BlendFields Few-Shot Example-Driven Facial Modeling
TeCH Text-guided Reconstruction of Lifelike Clothed Humans
diffusion model, imagines it, outputs geometry and texture
Relightable and Animatable Neural Avatar from Sparse-View Video
relightable neural avatars from monocular inputs; pose AND surface intersection, light visibility
SEEAvatar Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance
geometry is constrained to human-body prior; high-quality meshes and textures
AlteredAvatar Stylizing Dynamic 3D Avatars with Fast Style Adaptation
GAvatar Animatable 3D Gaussian Avatars with Implicit Mesh Learning
SDF-based implicit mesh learning; primitive-based 3D Gaussian representation to facilitate animation
Text2Avatar Text to 3D Human Avatar Generation with Codebook-Driven Body Controllable Attribute
intermediate codebook features
AniDress Animatable Loose-Dressed Avatar from Sparse Views Using Garment Rigging Model
generating animatable human avatars in loose clothes from sparse multi-view videos
M3Face A Unified Multi-Modal Multilingual Framework for Human Face Generation and Editing
multi-modal framework for face generation and editing
One2Avatar Generative Implicit Head Avatar For Few-shot User Adaptation ==best==
3D animatable photo-realistic head avatar as prior, personalizable with few shot (one image)
MAGICMIRROR style and subject transformations ==best==
Real3D-Portrait One-shot Realistic 3D Talking Portrait Synthesis
to generate a talking portrait video, estimating expression and head pose
Media2Face Co-speech Facial Animation Generation With Multi-Modality Guidance
annotated emotional and style labels, auto-encoder decoupling expressions and identities
Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters ==best==
automatically animate virtual human faces
EMO Emote Portrait Alive ==best==
audio to video (directly)
AniPortrait Audio-Driven Synthesis of Photorealistic Portrait Animation
driven by audio and a reference portrait image
parent: diffusion
Relightify Relightable 3D Faces from a Single Image via Diffusion Models
creates from a photo a set of textures (BRDF) to generate a realistic 3D face
SelfSwapper Self-Supervised Face Swapping via Shape Agnostic Masked AutoEncoder
mitigates identity leakage by masking facial regions and utilizing disentangled identity and non-identity features
MeshDiffusion paper: 3D Mesh Modeling
LDM3D Latent Diffusion Model for 3D; normal Stable Diffusion with depth-map generation ==best==
Text-to-3D with Classifier Score Distillation ==best==
guidance alone is enough for effective text-to-3D generation
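For context, a schematic sketch of a Score Distillation Sampling-style update that these text-to-3D methods build on; the helper names (`add_noise`, `predict_noise`) are hypothetical stand-ins for a diffusion model API, and the classifier-score variant changes which part of the noise prediction is distilled.

```python
# Schematic SDS-style step: render the 3D scene, forward-diffuse the render,
# and push the render toward what the frozen 2D diffusion model expects.
import torch

def sds_step(render, diffusion, text_emb, optimizer):
    """render: latent image rendered from the 3D scene, requires grad."""
    t = torch.randint(20, 980, (1,))                 # random diffusion timestep
    noise = torch.randn_like(render)
    noisy = diffusion.add_noise(render, noise, t)    # hypothetical helper: forward diffusion
    with torch.no_grad():
        eps_pred = diffusion.predict_noise(noisy, t, text_emb)  # hypothetical helper
    grad = eps_pred - noise                          # SDS gradient w.r.t. the render (w(t) omitted)
    loss = (grad.detach() * render).sum()            # surrogate loss with d(loss)/d(render) = grad
    optimizer.zero_grad()
    loss.backward()                                  # propagates through the renderer to 3D params
    optimizer.step()
```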
StableDreamer Taming Noisy Score Distillation Sampling for Text-to-3D ==best quality== (Gaussians)
image-space diffusion for geometric precision, latent-space diffusion for vivid color
high-fidelity(quality) 3D models
Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors ==best==
both a 3D and a 2D diffusion process (priors), bidirectional guidance (20 minutes)
UniDream Unifying Diffusion Priors for Relightable Text-to-3D Generation
albedo-normal aligned multi-view diffusion (to enable relighting)
PolyDiff Generating 3D Polygonal Meshes with Diffusion Models
operates natively on the polygonal mesh, trained to restore the original mesh structure
RichDreamer A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
trained with extra image-to-depth and image-to-normal priors; the maps are diffused together
HexaGen3D Stable Diffusion is just one step away from Fast and Diverse Text-to-3D Generation (7 seconds)
AToM Amortized Text-to-Mesh using 2D Diffusion
high-quality textured meshes, 1 second inference, 10 times reduction in training cost, unseen prompts
L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects
compose a desired object via trial-and-error within the 3D simulation environment
Instant Text-to-3D Mesh with PeRFlow-T2I + TripoSR
WildFusion Learning 3D-Aware Latent Diffusion Models in View Space
trained without direct multi-view or 3D supervision and doesn't require pose or camera distributions
the autoencoder captures the images' underlying 3D structure (unlike previous GAN approaches)
VolumeDiffusion Flexible Text-to-3D Generation with Efficient Volumetric Encoder
3D latent representation, seconds to minutes
AvatarVerse High-quality & Stable 3D Avatar Creation from Text and Pose
text descriptions and pose guidance
uses an MLP NeRF
HumanNorm Learning Normal Diffusion Model for High-quality and Realistic 3D Human Generation
Fast Registration of Photorealistic Avatars for VR Facial Animation
Synthesizing Moving People with 3D Control
Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
3D-aware Image Generation using 2D Diffusion Models
Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping
integration into existing 3D generative models, enhancing synthesis of the texture
for Latent Diffusion Model (LDM)-based image generation
CONTROLNET FOR 3D ==best==
GeoDream Disentangling 2D and Geometric Priors for High-Fidelity and Consistent 3D Generation
multi-view diffusion serves as native 3D geometric priors
disentangling 2D and 3D priors allows us to refine 3D geometric priors further
DiffusionGAN3D Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors
diffusion guides the 3D generator finetuning with informative direction
CRM Single Image to 3D Textured Mesh with Convolutional Reconstruction Model
Convolutional Reconstruction Model (CRM), feed-forward single image-to-3D generative
triplane exhibits spatial correspondence of six orthographic images
Sculpt3D Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior
modulate the output of the 2D diffusion model to the correct patterns of the template views
Compress3D a Compressed Latent Space for 3D Generation from a Single Image
encodes 3D models into a compact triplane latent space; 7 seconds diffusion
Retrieval-Augmented Score Distillation for Text-to-3D Generation ==best==
adapt the diffusion model's 2D prior toward view consistency
added controllability and negligible training cost
ViewDiff 3D-Consistent Image Generation with Text-to-Image Models
autoregressive 3D-consistent images at any viewpoint
integrates 3D volume-rendering and cross-frame-attention layers into each block of SD
could the prior instead be an animation prior, since it puts attention across the batch?
3D-consistent space
Instant3D Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model ==best, fast==
generates 4 views and then regresses a NeRF from them
Instant3D Instant Text-to-3D Generation
One-2-3-45++ Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion
multi-view image generation, then to 3D using multi-view 3D native diffusion models
MetaDreamer Efficient Text-to-3D Creation With Disentangling Geometry and Texture (2 stages, within 20 minutes)
parent: nerf
Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures; latent nerf
AutoDecoding Latent 3D Diffusion Models, view-consistent appearance and geometry
SweetDreamer converts 2D drawings from certain models into 3D by fixing geometry-related issues
SuGaR Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering
regularization term that encourages the gaussians to align well with the surface of the scene
and binds them which enables easy editing, sculpting, rigging, animating, compositing and relighting
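Illustrative only (not SuGaR's actual SDF-derived term): one simple regularizer with the same flavor, pushing each Gaussian toward a flat, surface-like shape by shrinking its smallest axis.

```python
# Sketch: penalize the smallest scale of each 3D Gaussian so splats flatten
# onto surfaces; add this to the rendering loss with a small weight.
import torch

def flatness_regularizer(log_scales):
    """log_scales: (N, 3) per-Gaussian log scale parameters."""
    scales = log_scales.exp()
    smallest = scales.min(dim=-1).values
    return smallest.mean()

log_scales = torch.randn(10000, 3, requires_grad=True)
loss = flatness_regularizer(log_scales)
loss.backward()
```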