CharNeRF: 3D Character Generation from Concept Art using Neural Radiance Field
AnandBhojan1
Aug 01, 2024
About This Presentation
3D modeling holds significant importance in the
realms of AR/VR and gaming, allowing for both artistic creativity
and practical applications. However, the process is often time-consuming
and demands a high level of skill. In this paper, we
present a novel approach to create volumetric representations
of 3D characters from consistent turnaround concept art, which
serves as the standard input in the 3D modeling industry. While
Neural Radiance Field (NeRF) has been a game-changer in
image-based 3D reconstruction, to the best of our knowledge,
there is no known research that optimizes the pipeline for concept
art. To harness the potential of concept art, with its defined
body poses and specific view angles, we propose encoding it as
priors for our model. We train the network to make use of
these priors for various 3D points through a learnable view-direction-attended
multi-head self-attention layer. Additionally,
we demonstrate that a combination of ray sampling and surface
sampling enhances the inference capabilities of our network.
Our model is able to generate high-quality 360-degree views of
characters. Subsequently, we provide a simple guideline to better
leverage our model to extract the 3D mesh. It is important to note
that our model’s inferencing capabilities are influenced by the
training data’s characteristics, primarily focusing on characters
with a single head, two arms, and two legs. Nevertheless, our
methodology remains versatile and adaptable to concept art
from diverse subject matters, without imposing any specific
assumptions on the data.
Slide Content
CharNeRF: 3D Character
Generation from 2D Concept Art
(IEEE AIxVR 2024)
Eddy Chu (1), Yiyang Chen (1), Chedy Raissi (2), Anand Bhojan (1)
(1) National University of Singapore, (2) Riot Games
3D modeling is essential in VR and AR as it enables creative and realistic
expression of ideas, but it’s a costly process!
4
Current 3D Modeling in AR/VR/Game
5
Concept Art
Image Source: https://www.behance.net/gallery/5594425/3D-Models-from-Concept-Art
Image Source: https://zechatactics.com/blog/2021/03/how-to-build-a-zecha/
Image Source: https://substance3d.adobe.com/plugins/mixamo-in-blender/
3D Models
Shading & Texturing
Rigging & Animation
Optional
Costly & Tedious
Types of Concept Art
6
Initial Concept Art
Image Source: https://www.behance.net/gallery/5594425/3D-Models-from-Concept-Art
Image Source: https://fgfactory.com/how-to-design-3d-character-from-scratch-to-final-game-model
Image Source: https://www.weasyl.com/~magnus/submissions/55047/lewis-model-sheet
Image Source: https://www.artstation.com/artwork/rRy3em
Initial Sketches
Turnaround Concept Art
3D Modeling-Ready
Input: 3-View Turnaround Concept Art
-Front, Back & Side faces
G: Goal Model
Output: 3D model (Volumetric Representation, NeRF)
-Meshes of different LoD can be generated easily
7
Problem Setting:
Industrial Values
-Speed up the process of character modelling which usually takes
hundreds of hours (source: WallaWalla Studio)
-Speed up the process of skin production
Research Values
-There is a large body of existing work on human or object reconstruction
from 2D image(s), but to the best of our knowledge, reconstruction from concept
art for virtual characters is NEW!
-Reconstruction for “virtual” characters means only RGB values are available (not
depth), which is a challenging computer vision problem by itself
-The shape variation of virtual characters is much greater than that of everyday
objects or human beings. A network that is able to capture the distribution of
virtual characters would need considerable capacity.
8
Method
9
Capture essential information from the concept art
- Source Image Encoding
10
Pixel-aligned
Features
Image Encoder
-Encode local information
-Store at each pixel location
Choose PixelNeRF (CVPR 2021) as our baseline
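To make "pixel-aligned features" concrete, here is a minimal sketch of the lookup step: a 3D point is projected into a source view and the feature map is read at that pixel by bilinear interpolation. It is an illustration under stated assumptions, not the CharNeRF or PixelNeRF code. The function names, the nested-list feature map, and the camera convention (x right, y down, z forward) are all hypothetical choices made for this example.

```python
import math


def project_point(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-space point to pixel coordinates.

    Assumed convention (not from the slides): x right, y down, z forward.
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return fx * x / z + cx, fy * y / z + cy


def bilinear_sample(feature_map, u, v):
    """Interpolate a rows x cols x channels nested-list feature map at (u, v).

    Features are assumed to sit at integer pixel coordinates; coordinates
    outside the map are clamped to the border.
    """
    height, width = len(feature_map), len(feature_map[0])
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    u0, v0 = int(math.floor(u)), int(math.floor(v))
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)
    au, av = u - u0, v - v0
    sampled = []
    for c in range(len(feature_map[0][0])):
        top = (1 - au) * feature_map[v0][u0][c] + au * feature_map[v0][u1][c]
        bottom = (1 - au) * feature_map[v1][u0][c] + au * feature_map[v1][u1][c]
        sampled.append((1 - av) * top + av * bottom)
    return sampled


# Toy 2 x 2 feature map with one channel per pixel.
toy_features = [[[0.0], [1.0]], [[2.0], [3.0]]]
u, v = project_point((0.0, 0.0, 1.0), fx=1.0, fy=1.0, cx=0.5, cy=0.5)
feature = bilinear_sample(toy_features, u, v)
```

When the feature map is smaller than the source image, the pixel coordinates would first be scaled by the resolution ratio before sampling.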
11
12
Instead of a ResNet encoder, we find that multi-level (one coarse and one
fine) feature extraction greatly improves the rendering quality.
Image Encoder
We find a two-level encoder effective at extracting local information
13
Source images are first downsized to extract
coarse information
Fine feature
Coarse feature
H/8 x W/8 x 256    H/2 x W/2 x 256
Image Encoder - Simplified Feature Combination
14
Fine feature
Coarse feature MLP
H/2 x W/2 x 256
NeRF
Regularize NeRF through
-a Mix of Sampling Methods
15
2 x Neural Radiance Field (NeRF)
16
[Figure: ray sampling from the novel view]
17
[Figure: coarse sampling along a ray from the novel view, with per-sample weights W_i]
18
[Figure: weight distribution along the ray, W_i / ΣW_j]
19
[Figure: fine sampling along the ray, guided by W_i / ΣW_j]
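The coarse-to-fine step pictured in these slides can be sketched as inverse-transform sampling: each coarse bin along the ray is chosen with probability W_i / ΣW_j, and a depth is then placed inside the chosen bin. This is a minimal sketch of the standard NeRF-style hierarchical sampling idea, not the authors' implementation. The function name, the argument layout, and the uniform fallback for an all-zero weight vector are assumptions made for illustration.

```python
import bisect
import random


def sample_fine_depths(bin_edges, weights, num_fine, rng=None):
    """Draw num_fine depths, picking bin i with probability w_i / sum_j w_j."""
    if len(bin_edges) != len(weights) + 1:
        raise ValueError("need exactly one more bin edge than weights")
    rng = rng or random.Random(0)
    total = sum(weights)
    if total <= 0.0:
        # Assumed fallback: no information from the coarse pass, sample uniformly.
        weights, total = [1.0] * len(weights), float(len(weights))
    # Cumulative distribution over bins; cdf[i] is the mass up to and including bin i.
    cdf, running = [], 0.0
    for w in weights:
        running += w / total
        cdf.append(running)
    cdf[-1] = 1.0  # guard against floating-point drift
    depths = []
    for _ in range(num_fine):
        u = rng.random()
        i = bisect.bisect_right(cdf, u)  # first bin whose cdf exceeds u
        lower = cdf[i - 1] if i > 0 else 0.0
        frac = (u - lower) / (cdf[i] - lower)  # position inside the chosen bin
        depths.append(bin_edges[i] + frac * (bin_edges[i + 1] - bin_edges[i]))
    return sorted(depths)


# All of the weight sits in the middle bin, so every fine sample lands in [1, 2).
fine = sample_fine_depths([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0], num_fine=50)
```

Bins with zero weight are never selected, which is what concentrates the fine samples where the coarse pass found density.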
20
NeRF is trained by minimizing the color difference between the
image pixel and the “composited” color along a ray. It is therefore a
common problem for a NeRF to “cheat” by distributing volume
density more evenly along a ray than it should.
Our idea is essentially to encourage NeRF to assign high volume
density near the surface area, which also allows NeRF to learn the shape
directly.
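The ambiguity described above can be illustrated with the usual volume rendering quadrature, where sample i contributes weight w_i = T_i (1 - exp(-σ_i δ_i)) and T_i is the transmittance accumulated before it. The sketch below is an illustration only: it uses a toy ray on which every sample has the same color (an assumption chosen to make the effect obvious), and the function names are hypothetical.

```python
import math


def compositing_weights(sigmas, deltas):
    """Per-sample weights w_i = T_i * (1 - exp(-sigma_i * delta_i))."""
    weights, transmittance = [], 1.0
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha  # light remaining after this sample
    return weights


def composite_color(weights, colors):
    """Weighted sum of per-sample RGB colors along the ray."""
    return tuple(sum(w * c[k] for w, c in zip(weights, colors)) for k in range(3))


deltas = [1.0, 1.0, 1.0, 1.0]
colors = [(0.2, 0.4, 0.6)] * 4          # toy assumption: same color everywhere
peaked = compositing_weights([0.0, 0.0, 30.0, 0.0], deltas)  # density at a surface
spread = compositing_weights([7.5, 7.5, 7.5, 7.5], deltas)   # density smeared out
pixel_peaked = composite_color(peaked, colors)
pixel_spread = composite_color(spread, colors)
```

Both density profiles composite to the same pixel color, so a purely photometric loss cannot tell them apart on this ray; only the peaked profile corresponds to a well-defined surface.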
21
Entropy Loss (InfoNeRF CVPR 2022)
Emptiness Loss (SJC CVPR 2023)
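As a rough illustration of what an entropy-style regularizer measures, the sketch below computes the Shannon entropy of the normalized per-sample opacities along one ray. It follows the general idea of a ray entropy term and is an assumption-laden simplification, not the exact loss from InfoNeRF or SJC and not CharNeRF's loss. The function name and the `eps` cutoff are hypothetical.

```python
import math


def ray_entropy(sigmas, deltas, eps=1e-10):
    """Shannon entropy of normalized opacities alpha_i / sum_j alpha_j on a ray."""
    alphas = [1.0 - math.exp(-s * d) for s, d in zip(sigmas, deltas)]
    total = sum(alphas)
    if total <= eps:
        return 0.0  # empty ray: nothing to regularize in this sketch
    probs = [a / total for a in alphas]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


deltas = [1.0, 1.0, 1.0, 1.0]
entropy_peaked = ray_entropy([0.0, 0.0, 30.0, 0.0], deltas)  # density at one sample
entropy_spread = ray_entropy([1.0, 1.0, 1.0, 1.0], deltas)   # density spread evenly
```

A density profile concentrated at one sample has entropy near zero, while an evenly spread profile reaches the maximum of log(N), so penalizing this quantity pushes density toward a single location per ray.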
[Figure: ray sampling and surface sampling for the novel view]
22
We first distribute the target number of surface sampling points based on the
sub-mesh area distribution, and then, for each sub-mesh, sampling is done
using the triangle point picking algorithm.
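The two steps above can be sketched as follows: split the point budget across sub-meshes in proportion to their surface area, then draw uniform points inside triangles. This is a minimal sketch under assumptions, not the authors' code. In particular, the largest-remainder rounding, the area-weighted choice of triangle inside a sub-mesh, and the square-root form of triangle point picking are choices made for this example, and all function names are hypothetical.

```python
import math
import random


def triangle_area(a, b, c):
    """Area of a 3D triangle from half the cross-product norm."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (ab[1] * ac[2] - ab[2] * ac[1],
             ab[2] * ac[0] - ab[0] * ac[2],
             ab[0] * ac[1] - ab[1] * ac[0])
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def allocate_counts(areas, total_points):
    """Split total_points proportionally to areas (largest-remainder rounding)."""
    total_area = sum(areas)
    if total_area <= 0.0:
        raise ValueError("total area must be positive")
    exact = [total_points * a / total_area for a in areas]
    counts = [int(math.floor(e)) for e in exact]
    leftover = total_points - sum(counts)
    order = sorted(range(len(areas)), key=lambda i: exact[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def sample_point_in_triangle(a, b, c, rng):
    """Uniform point in a triangle via square-root barycentric weights."""
    s, r2 = math.sqrt(rng.random()), rng.random()
    wa, wb, wc = 1.0 - s, s * (1.0 - r2), s * r2
    return tuple(wa * a[i] + wb * b[i] + wc * c[i] for i in range(3))


def sample_surface(sub_meshes, total_points, rng=None):
    """sub_meshes: list of triangle lists; each triangle is three (x, y, z) tuples."""
    rng = rng or random.Random(0)
    tri_areas = [[triangle_area(*tri) for tri in mesh] for mesh in sub_meshes]
    counts = allocate_counts([sum(areas) for areas in tri_areas], total_points)
    points = []
    for mesh, areas, count in zip(sub_meshes, tri_areas, counts):
        for tri in rng.choices(mesh, weights=areas, k=count):
            points.append(sample_point_in_triangle(*tri, rng))
    return points


small = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))]  # area 0.5, z = 0
large = [((0.0, 0.0, 1.0), (3.0, 0.0, 1.0), (0.0, 1.0, 1.0))]  # area 1.5, z = 1
surface_points = sample_surface([small, large], total_points=8)
```

With a 1:3 area ratio and a budget of 8 points, the small sub-mesh receives 2 points and the large one 6.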
[Figure: pixel-aligned features from the source views for ray-sampled and surface-sampled points of the novel view]
23
Mimic how a 3D modeler creates a 3D model
-View Direction attended Feature Combination
24
[Figure: pixel-aligned features from the source views for ray-sampled and surface-sampled points of the novel view]
25
[Figure: pixel-aligned features from the three source views pass through an early MLP to give Z'_1, Z'_2, Z'_3, for ray-sampled and surface-sampled points of the novel view]
26
[Figure: View Direction Attended Multi-Head Self Attention. Labels: shared weights across source views; two more MLP blocks]
-Feature Vector Similarity
-View Direction Similarity
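The two similarity terms named on this slide suggest attention weights that depend on both how alike two feature vectors are and how alike their view directions are. The sketch below is one hypothetical way to combine them: a single head with no learnable projections, where the attention score is a scaled dot product of features plus the cosine similarity of view directions. It is not the CharNeRF layer, whose exact formulation is not given in the slides. Every name here is an assumption.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def direction_biased_self_attention(features, directions):
    """Single-head self-attention over per-view features.

    score(i, j) = <f_i, f_j> / sqrt(dim) + cos(d_i, d_j)   (assumed form)
    Returns the combined features and the attention matrix.
    """
    dim = len(features[0])
    outputs, attention = [], []
    for f_i, d_i in zip(features, directions):
        scores = [dot(f_i, f_j) / math.sqrt(dim) + cosine(d_i, d_j)
                  for f_j, d_j in zip(features, directions)]
        weights = softmax(scores)
        attention.append(weights)
        outputs.append([sum(w * f_j[k] for w, f_j in zip(weights, features))
                        for k in range(dim)])
    return outputs, attention


# Three source views (front, back, side) with identical toy features,
# so only the view-direction term shapes the attention weights.
features = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
directions = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (1.0, 0.0, 0.0)]
combined, attention = direction_biased_self_attention(features, directions)
```

In this toy setup the front view attends most to itself, then to the side view, and least to the back view, which is the intuition of trusting views whose direction is closest.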
27
28
Loss Function
29
30
Simple Guideline for Mesh
Reconstruction from CharNeRF
31
In the original NeRF, the mesh is reconstructed by setting the view direction to the
zero vector. However, this approach nullifies our view-direction-attended feature
combination mechanism. Therefore, to fully leverage the power of CharNeRF, we
propose to supply multiple view directions and reconstruct the mesh from the
average volume density.
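The guideline above can be sketched as: query the density at each grid point under several view directions, average the results, and threshold the average into an occupancy grid that a surface extractor such as marching cubes could consume. This is a minimal sketch under assumptions. `toy_density` is a hypothetical stand-in for the trained network, and the six axis-aligned directions, grid size, and threshold are illustrative values not taken from the slides.

```python
import math

AXIS_DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def toy_density(point, view_dir):
    """Hypothetical stand-in for the network: a sphere of radius 0.5 whose
    density varies with the view direction."""
    radius = math.sqrt(sum(c * c for c in point))
    if radius >= 0.5:
        return 0.0
    if radius == 0.0:
        return 10.0
    facing = sum(p * d for p, d in zip(point, view_dir)) / radius
    return 10.0 * (1.0 + 0.2 * facing)


def average_density(density_fn, point, view_dirs):
    """Mean density at one point over all supplied view directions."""
    return sum(density_fn(point, d) for d in view_dirs) / len(view_dirs)


def occupancy_grid(density_fn, view_dirs, resolution, bound, threshold):
    """Boolean grid[ix][iy][iz] over cell centers in [-bound, bound]^3."""
    step = 2.0 * bound / resolution
    coords = [-bound + (i + 0.5) * step for i in range(resolution)]
    return [[[average_density(density_fn, (x, y, z), view_dirs) >= threshold
              for z in coords] for y in coords] for x in coords]


mean_inside = average_density(toy_density, (0.1, 0.2, 0.0), AXIS_DIRECTIONS)
grid = occupancy_grid(toy_density, AXIS_DIRECTIONS, resolution=5, bound=1.0, threshold=5.0)
```

Averaging removes the view-dependent variation in this toy example, so the thresholded grid recovers the sphere regardless of which single direction would have been chosen.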
32
Equivalent to setting view directions to 0
Data
33
We collected 95 3D characters from online resources
-75 in training dataset
-10 in validation dataset
-10 in test dataset
34
For each 3D character, we generate
-303 images (3 views as concept art, 300 cameras evenly distributed
using the Fibonacci lattice method)
-1080x1080
-White background
-Pose.txt 4x4 camera to world matrix
-Front, back, side images as concept art
36
Image Source: https://www.mixamo.com/#/?page=2&type=Character
-Intrinsics.json
-focal length: fx, fy
-measured in pixel
-principal point coordinate: cx, cy
-width & height
-aabb_scale
-Obj + Mtl + texture images
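As an illustration of the camera setup listed above, the sketch below places 300 camera positions on a sphere with a Fibonacci lattice and builds a 4x4 camera-to-world matrix that looks at the origin. It is a sketch under assumptions and not the authors' data-generation script. The sphere radius, the z-up world axis, and the camera convention (x right, y up, looking down -z) are hypothetical choices, and the function names are made up for this example.

```python
import math


def fibonacci_sphere(count, radius=1.0):
    """Roughly evenly spaced points on a sphere using the golden angle."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(count):
        z = 1.0 - 2.0 * (i + 0.5) / count        # evenly spaced heights
        ring = math.sqrt(max(0.0, 1.0 - z * z))  # radius of the circle at height z
        theta = golden_angle * i
        points.append((radius * ring * math.cos(theta),
                       radius * ring * math.sin(theta),
                       radius * z))
    return points


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def look_at_camera_to_world(eye, target=(0.0, 0.0, 0.0), up=(0.0, 0.0, 1.0)):
    """4x4 camera-to-world matrix; columns are right, up, backward, position."""
    forward = normalize(tuple(t - e for t, e in zip(target, eye)))
    side = cross(forward, up)
    if sum(c * c for c in side) < 1e-12:         # looking straight along `up`
        side = cross(forward, (0.0, 1.0, 0.0))
    right = normalize(side)
    true_up = cross(right, forward)
    return [[right[0], true_up[0], -forward[0], eye[0]],
            [right[1], true_up[1], -forward[1], eye[1]],
            [right[2], true_up[2], -forward[2], eye[2]],
            [0.0, 0.0, 0.0, 1.0]]


camera_positions = fibonacci_sphere(300, radius=2.0)
poses = [look_at_camera_to_world(eye) for eye in camera_positions]
```

Each pose could then be written out row by row, alongside intrinsics such as fx, fy, cx, and cy, in whatever file layout the training code expects.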
37
Results
38
Concept Art from 2D Artists:
39
40
Concept Art from 2D Artists:
Synthetic Concept Art:
Ablation Test
42
[Figure: ablation comparison. Panel labels: CharNeRF; PixelNeRF; with a mix of sampling methods]
43
44
Limitation & Future Extension
45
Extension 1 - Improve rendering quality
46
CharNeRF suffers from blurriness at the extremities of the characters,
especially when the viewing angle is far from the concept art views…
47
48
Later research in NeRF leverages diffusion models as priors by lifting their
strong 2D generative power to 3D…
Text to 3D through diffusion model
-Score Jacobian Chaining (CVPR 2023)
-DreamFusion (ICLR 2023)
Image to 3D through diffusion model
-3DiM (ICLR 2023)
-RealFusion 360 (CVPR 2023)
-zero-1-to-3 (ICCV 2023)
49
50
51
NeRFDiff (ICML 2023) fine-tunes NeRF using a conditional diffusion model
Given the similar architecture of CharNeRF to PixelNeRF, the fine-tuning
method used in NeRFDiff is a ready-made extension to improve CharNeRF
Extension 2 - Interactivity & Editability
52
53
The Final Character Concept Art typically comes with close-up sketches
for the occluded parts. A model that allows 2D artists to interactively
supply close-up sketches would be an important extension.
Image Source: https://www.artstation.com/artwork/rRy3em