Overview
Description: A latent diffusion model that can run on most consumer hardware equipped with a modest GPU (at least 8 GB VRAM)
Developers: CompVis Group (Ludwig Maximilian University of Munich), Runway
Sponsors/donors: Stability AI, EleutherAI, and LAION
Series A funding: Lightspeed Venture Partners and Coatue Management
Dataset contributors: various NPOs
Training cost: $660,000
Sources: Hugging Face Spaces (Stability AI) and DreamStudio
Primary use: generating detailed images conditioned on text descriptions (prompts)
Additional uses: inpainting, outpainting, and image-to-image translation guided by a text prompt
Reference: https://techpp.com/2022/10/10/how-to-train-stable-diffusion-ai-dreambooth/
SD Architecture
- Uses the diffusion model from CompVis, LMU Munich, Germany
- 860 million parameters in the U-Net (with ResNet backbone)
- 123 million parameters in the text encoder
- Considered relatively lightweight by 2022 standards
- SD v2 uses xFormers, and XL uses additional reinforcement/knowledge
Model fine-tuning methods:
- Dreambooth
- Textual Inversion
- LoRA
- Hypernetworks
- Aesthetic Gradient
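To make the "relatively lightweight" point concrete, the parameter counts above can be turned into a rough weight-memory estimate. This is a back-of-the-envelope sketch only: it assumes 4 bytes per parameter for fp32 and 2 bytes for fp16, counts only the U-Net and text encoder listed on this slide, and ignores activations and any other components, so real VRAM use will be higher.

```python
# Parameter counts taken from the slide.
UNET_PARAMS = 860_000_000
TEXT_ENCODER_PARAMS = 123_000_000

# Assumed storage sizes per parameter (not stated on the slide).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


total_params = UNET_PARAMS + TEXT_ENCODER_PARAMS
for precision, nbytes in BYTES_PER_PARAM.items():
    gib = weight_memory_gib(total_params, nbytes)
    print(f"{precision}: about {gib:.2f} GiB for {total_params:,} parameters")
```

Under these assumptions the weights fit well inside an 8 GB card, which is consistent with the hardware claim in the overview.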
Fine-tuning method: Dreambooth
Sources: https://youtu.be/dVjMiJsuR5o; "Well-Researched Comparison of Training Techniques (Lora, Inversion, Dreambooth, Hypernetworks)", r/StableDiffusion (reddit.com)
Fine-tuning method: Textual Inversion
Sources: https://youtu.be/dVjMiJsuR5o; "Well-Researched Comparison of Training Techniques (Lora, Inversion, Dreambooth, Hypernetworks)", r/StableDiffusion (reddit.com)
Fine-tuning method: LoRA
Sources: https://youtu.be/dVjMiJsuR5o; "Well-Researched Comparison of Training Techniques (Lora, Inversion, Dreambooth, Hypernetworks)", r/StableDiffusion (reddit.com)
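The slide names LoRA without spelling out the mechanism. As commonly described, LoRA keeps a pre-trained weight matrix W (d x k) frozen and trains two small matrices B (d x r) and A (r x k), so the effective weight is W + (alpha / r) * B A. The sketch below illustrates that update in plain Python. The layer size (768 x 768) and rank (4) are hypothetical values chosen for illustration, not figures from the slides.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A); W itself is left unchanged."""
    scale = alpha / rank
    delta = matmul(b, a)  # d x k low-rank update
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_trainable_params(d, k, rank):
    """Parameters in B (d x rank) plus A (rank x k)."""
    return rank * (d + k)


# Hypothetical layer size and rank, for illustration only.
d, k, rank = 768, 768, 4
full = d * k
low_rank = lora_trainable_params(d, k, rank)
print(f"full matrix: {full:,} parameters; LoRA update: {low_rank:,} parameters")
```

The point of the comparison is the parameter count: only the small B and A matrices are trained and stored, which is why LoRA files are much smaller than a full checkpoint.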
Fine-tuning method: Hypernetworks
Sources: https://youtu.be/dVjMiJsuR5o; "Well-Researched Comparison of Training Techniques (Lora, Inversion, Dreambooth, Hypernetworks)", r/StableDiffusion (reddit.com)
Training
Create a dataset
Fine-tuning options:
- Dreambooth notebook
  - Image dataset: 512 px x 512 px
  - prior_preservation_class_prompt, auto-completed using the CLIP tokenizer
- Web UI over SD v1 and v2, and hypernetwork
  - Image dataset: 512 px x 512 px
  - Unique dataset folder name, with images named as a unique series
  - Starts from a pre-trained checkpoint file (.ckpt)
  - Auto-annotation (class prompt) text file generated using BLIP with textual inversion
- UI supports
  - Gradio interface (conventional)
  - Streamlit interface (new and in progress)
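The dataset requirements above (square 512 px images, named as a unique series) can be sketched with a couple of helpers. This is a minimal illustration using only the standard library: the function names, the `subject` prefix, and the `.png` extension are hypothetical, and the actual cropping and resizing would be done with an image library of your choice using the box computed here.

```python
TARGET_SIZE = 512  # from the slide: 512 px x 512 px training images


def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def series_names(prefix, count, ext=".png"):
    """Generate a unique, zero-padded file name series for one dataset folder."""
    return [f"{prefix}_{i:04d}{ext}" for i in range(1, count + 1)]


# Example: a 1024 x 768 photo is cropped to a centered square, which would
# then be resized to TARGET_SIZE x TARGET_SIZE before training.
box = center_crop_box(1024, 768)
names = series_names("subject", 3)
print(box, "->", (TARGET_SIZE, TARGET_SIZE))
print(names)
```

Cropping to a centered square before resizing avoids stretching the subject, and a zero-padded series keeps the files sorted in a predictable order.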