A Style-Based Generator Architecture for Generative Adversarial Networks Walk-Through.pptx

MichaelMansour23 17 views 32 slides Jun 01, 2024

About This Presentation

A Style-Based Generator Architecture for Generative Adversarial Networks Walk-through


Slide Content

A Style-Based Generator Architecture for Generative Adversarial Networks [ Karras et al 2019] Presented by: Michael Mansour under supervision of : Dr. Dina Khattab

TABLE OF CONTENTS
01. Introduction
02. Baseline
03. Architecture
04. Some tricks
05. Evaluation & Results
06. References

01. Introduction

StyleGAN: an alternative generator architecture for generative adversarial networks.

02. Baseline

Baseline [Karras et al. (2017): Progressive Growing of GANs for Improved Quality, Stability, and Variation]: ProGAN, the Progressive GAN (traditional generator). The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses.

ProGAN architecture [Karras et al. (2017): Progressive Growing of GANs for Improved Quality, Stability, and Variation]
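The progressive-growing idea above can be sketched as a training schedule: resolution doubles from 4×4 up to the target, and a fade-in coefficient alpha blends each newly added layer in smoothly. This is a minimal illustrative sketch (the function name, step counts, and toy resolutions are assumptions, not the authors' code):

```python
def progressive_schedule(target_res=1024, steps_per_stage=4):
    """Yield (resolution, alpha) pairs; alpha ramps from 0 to 1 in each stage.

    alpha weights the newly added high-resolution layer so it is
    faded in gradually rather than switched on abruptly.
    """
    res = 4
    while res <= target_res:
        for step in range(steps_per_stage):
            alpha = (step + 1) / steps_per_stage  # fade-in weight
            yield res, alpha
        res *= 2

# Toy run up to 32x32: resolutions 4, 8, 16, 32, alpha ramping per stage.
schedule = list(progressive_schedule(target_res=32, steps_per_stage=2))
```

In the real training loop, alpha would weight the contribution of the new layer against an upsampled output of the previous stage.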

03. StyleGAN architecture

StyleGAN architecture: We first map the input latent code to an intermediate latent space W, which then controls the generator through adaptive instance normalization (AdaIN) at each convolution layer.

StyleGAN architecture: The Z latent space must follow the probability density of the training data, but the W latent space is free from that restriction and is therefore allowed to be disentangled.
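The mapping from Z to W is an 8-layer fully connected network in the paper. The toy sketch below uses tiny hypothetical dimensions (the paper uses 512-dimensional z and w) just to show the shape of the computation:

```python
import random

def leaky_relu(x, slope=0.2):
    """LeakyReLU activation, as used in StyleGAN's mapping network."""
    return x if x > 0 else slope * x

def mapping_network(z, weights):
    """Apply a stack of fully connected layers f: Z -> W.

    `weights` is one square matrix per layer (biases omitted for brevity).
    """
    h = z
    for W in weights:
        h = [leaky_relu(sum(w_ij * h_j for w_ij, h_j in zip(row, h)))
             for row in W]
    return h  # the intermediate latent w

random.seed(0)
dim, layers = 4, 8  # toy sizes; the paper uses 512 and 8
weights = [[[random.gauss(0, 0.5) for _ in range(dim)]
            for _ in range(dim)] for _ in range(layers)]
w = mapping_network([1.0, -0.5, 0.3, 0.0], weights)
```

Because W is produced by a learned nonlinear map rather than sampled directly, it need not follow the training data's density, which is what permits disentanglement.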

Batch Normalization: normalizes each feature channel using statistics computed across the entire mini-batch.

Instance Normalization: normalizes each feature map of each sample independently, using only that sample's own statistics.
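The difference between the two normalizations is only which activations get pooled into the mean and variance. A minimal sketch with toy data (the variable names and shapes are illustrative assumptions):

```python
import math

def normalize(values, eps=1e-5):
    """Zero-mean, unit-variance normalization of a flat list of activations."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return [(v - mu) / math.sqrt(var + eps) for v in values]

# x[sample][channel] is a flattened toy feature map (list of activations)
x = [[[1.0, 2.0], [10.0, 12.0]],   # sample 0: channels 0 and 1
     [[3.0, 4.0], [20.0, 22.0]]]   # sample 1

# Batch norm: pool one channel's activations across ALL samples.
bn_ch0 = normalize(x[0][0] + x[1][0])

# Instance norm: each sample's channel is normalized on its own.
in_s0_ch0 = normalize(x[0][0])
```

Instance normalization is the building block AdaIN extends: the per-sample, per-channel statistics are what the style then re-scales and re-biases.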

AdaIN (Adaptive Instance Normalization): AdaIN(x_i, y) = y_{s,i} · (x_i − μ(x_i)) / σ(x_i) + y_{b,i}, where each feature map x_i is normalized separately, and then scaled and biased using the corresponding scalar components from the style y.
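The AdaIN formula can be written out directly. This is a minimal sketch on a toy feature map, not the authors' implementation:

```python
import math

def adain(feature_map, y_scale, y_bias, eps=1e-5):
    """AdaIN(x_i, y) = y_scale * (x_i - mu(x_i)) / sigma(x_i) + y_bias.

    The feature map is normalized with its own statistics (instance
    normalization), then scaled and biased by the style's scalars.
    """
    mu = sum(feature_map) / len(feature_map)
    var = sum((v - mu) ** 2 for v in feature_map) / len(feature_map)
    sigma = math.sqrt(var + eps)
    return [y_scale * (v - mu) / sigma + y_bias for v in feature_map]

out = adain([1.0, 2.0, 3.0, 4.0], y_scale=2.0, y_bias=0.5)
```

After AdaIN the map's mean equals y_bias and its spread is set by y_scale, which is how each style vector retunes the statistics of every feature map it controls.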

StyleGAN architecture: Gaussian noise is added after each convolution and before AdaIN.

StyleGAN architecture: "A" stands for a learned affine transform.

StyleGAN architecture: "B" applies learned per-channel scaling factors to the noise input.
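The noise path described above can be sketched as follows: a single Gaussian noise image is shared by all feature maps of a layer, and the learned "B" factors scale it per channel before the addition. Toy shapes and names are assumptions, not the paper's code:

```python
import random

def add_noise(feature_maps, per_channel_scale, rng):
    """Add one shared noise image to every channel, scaled per channel ("B")."""
    width = len(feature_maps[0])
    noise = [rng.gauss(0.0, 1.0) for _ in range(width)]  # shared per layer
    return [[v + s * n for v, n in zip(channel, noise)]
            for channel, s in zip(feature_maps, per_channel_scale)]

rng = random.Random(42)
maps = [[0.0] * 4, [1.0] * 4]  # two toy flattened feature maps
out = add_noise(maps, per_channel_scale=[0.1, 0.0], rng=rng)
# a learned scale of 0.0 leaves that channel's content untouched
```

Because the scale is learned per channel, the network can decide how much stochastic detail each feature map receives, including none at all.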


04. Some tricks

FFHQ dataset: consists of 70,000 high-quality images at 1024×1024 resolution, crawled from Flickr. The dataset includes vastly more variation than CelebA-HQ (30,000 images).

Noise Inputs We can see that the noise affects only the stochastic aspects, leaving the overall composition and high-level aspects such as identity intact

Noise Inputs: We can see that the artificial omission of noise leads to a featureless "painterly" look. (a) Noise is applied to all layers. (b) No noise. (c) Noise in fine layers only (64²–1024²). (d) Noise in coarse layers only (4²–32²).

Truncation trick: allows fine control over the trade-off between sample fidelity and variety by reducing the variance of the generator's input: w′ = w̄ + ψ(w − w̄), where w′ is the truncated latent vector, w̄ is the mean of the latent vectors, w is the original latent vector, and ψ is the truncation parameter. [Brock et al., Large Scale GAN Training for High Fidelity Natural Image Synthesis]

Truncation trick: A lower ψ value results in stronger truncation, limiting the diversity of generated samples. A higher ψ value allows more diversity but might also introduce unrealistic or less visually appealing samples.
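The truncation formula is a one-line interpolation toward the mean latent. A minimal sketch with toy vectors:

```python
def truncate(w, w_bar, psi):
    """Truncation trick: w' = w_bar + psi * (w - w_bar).

    psi = 1 leaves w unchanged; psi = 0 collapses every latent onto the
    mean, trading variety for fidelity.
    """
    return [wb + psi * (wi - wb) for wi, wb in zip(w, w_bar)]

w_bar = [0.0, 0.0]   # mean latent (toy values)
w = [2.0, -4.0]      # a sampled latent (toy values)
full = truncate(w, w_bar, 1.0)   # no truncation
mean = truncate(w, w_bar, 0.0)   # full truncation
half = truncate(w, w_bar, 0.5)   # halfway to the mean
```

Seen this way, ψ simply scales each latent's distance from w̄, which is why small ψ shrinks sample diversity.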

Style mixing: To further encourage the styles to localize, we run two latent codes z₁ and z₂ through the mapping network and have the corresponding w₁ and w₂ control the styles, so that w₁ applies before the crossover point and w₂ after it.

Style mixing: • Coarse styles → pose, hair, face shape • Middle styles → facial features, eyes • Fine styles → color scheme
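The crossover mechanism can be sketched as choosing, per layer, which of the two latents supplies the style (the 18-layer count matches a 1024² synthesis network; the function and names are illustrative):

```python
import random

def mix_styles(w1, w2, num_layers, rng):
    """Pick a random crossover layer; w1 styles layers before it, w2 after."""
    crossover = rng.randrange(1, num_layers)
    return [w1 if layer < crossover else w2
            for layer in range(num_layers)]

rng = random.Random(0)
styles = mix_styles("w1", "w2", num_layers=18, rng=rng)
# early (coarse) layers take w1; later (fine) layers take w2
```

Early crossover points hand the coarse attributes (pose, face shape) to w₁ and everything finer to w₂, which is exactly the coarse/middle/fine split the slide describes.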

Evaluation & Results 05.

Fréchet Inception Distance (FID) [Heusel et al., GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium]: FID = ||μ − μ_W||² + Tr(C + C_W − 2(C·C_W)^{1/2}), where μ and μ_W are the means of the feature representations (Inception activations) of real and generated images, C and C_W are the covariance matrices of those feature representations, and Tr(·) is the trace of a matrix (the sum of its diagonal elements).

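To make the FID formula concrete, the sketch below simplifies it to diagonal covariances, so the matrix square root reduces to an elementwise square root (the real metric uses full covariance matrices of Inception activations; this simplification is an assumption for illustration):

```python
import math

def fid_diagonal(mu_r, cov_r, mu_g, cov_g):
    """FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2*(C_r C_g)^(1/2)),
    restricted to diagonal covariances given as flat lists."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    trace_term = sum(cr + cg - 2 * math.sqrt(cr * cg)
                     for cr, cg in zip(cov_r, cov_g))
    return mean_term + trace_term

# Identical distributions score 0; any gap in mean or spread adds to FID.
assert fid_diagonal([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]) == 0.0
score = fid_diagonal([0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [4.0, 4.0])
```

Lower is better: a perfect generator reproduces both the mean and the covariance of the real activations, driving both terms to zero.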

Results Fréchet inception distance (FID) for various generator designs (lower is better)

Do you have any questions?

References

Papers
[1] Karras et al. (2017): Progressive Growing of GANs for Improved Quality, Stability, and Variation
[2] Heusel et al.: GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium
[3] Brock et al.: Large Scale GAN Training for High Fidelity Natural Image Synthesis

Videos
- Progressive Growing of GANs for Improved Quality, Stability, and Variation: https://www.youtube.com/watch?v=G06dEcZ-QTg
- A Style-Based Generator Architecture for Generative Adversarial Networks: https://www.youtube.com/watch?v=kSLJriaOumA
- StyleGAN Paper Explained: https://www.youtube.com/watch?v=qZuoU23ACTo
- AI generated faces - StyleGAN explained: https://www.youtube.com/watch?v=4LNO8nLxF4Y