A great way to think of LoRA is that you are training it to produce art in your style of choice.
A common use of AI image generation is to recreate popular artistic styles recognized around the world, like those of Banksy, Vincent van Gogh, and Leonardo da Vinci. This also applies to cartoon styles and others.
Say you’re an artist who’d love to generate images in your own style, or you have an obscure variation of anime that can’t be found anywhere online. You can train Stable Diffusion with a LoRA model to produce images of those same concepts.
The cool thing is that you don’t need your own art style to use LoRA. You can use the styles that other people have shared in Hugging Face’s LoRA library.
The following images were created with <lora:vae-ft-ema-560000-ema-pruned:1>, which we’ll discuss later in this article.
|Dystopian City with Cameras & Drones||Dystopian City with Cameras||Peter Griffin with a Monocle|
|Vincent van Gogh||Walter White||Iron Man|
What Is a LoRA Model?
Without getting too technical, here’s a brief overview:
LoRA stands for Low-Rank Adaptation, a technique for fast fine-tuning of text-to-image diffusion models. LoRA models can capture characters, concepts, and artistic styles. They have two main advantages:
- Small in size: Files can be 8 MB or less.
- Efficient: LoRA models train faster than DreamBooth models.
In short, a LoRA is a small add-on file that AUTOMATIC1111 loads on top of a Stable Diffusion checkpoint, much like embeddings and hypernetworks, and it is lighter to train than a full DreamBooth model.
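To see why LoRA files stay so small, here is a minimal Python sketch of the low-rank idea. It assumes the usual LoRA formulation, where a full weight matrix is left untouched and only two thin matrices are trained. The layer size (768 x 768) and rank (4) are illustrative assumptions, not values taken from this article or from any specific model.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Number of values in one full weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Number of values in a low-rank update B @ A,
    where B has shape (d_out, rank) and A has shape (rank, d_in)."""
    return rank * (d_out + d_in)


# Illustrative sizes only (assumed for this example).
d_out, d_in, rank = 768, 768, 4
print(full_params(d_out, d_in))        # 589824
print(lora_params(d_out, d_in, rank))  # 6144
```

With these assumed numbers, the trainable update is roughly a hundred times smaller than the layer it modifies, which is where the small file size comes from.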
Now, let’s proceed to the pre-instructions.
Running LoRA: Pre-Instructions
AUTOMATIC1111 supports LoRA models out of the box. You don’t have to worry about adding extensions; other articles that claim this are outdated.
Before using LoRA on Stable Diffusion, you’ll need to make sure you have everything listed on the following checklist:
Here’s a breakdown:
You’ll need AUTOMATIC1111 working. If you are struggling to run it, you’ll find our Stable Diffusion guide useful.
The training images are for creating your own LoRA models, which is explained later in the article.
For using Stable Diffusion with a pre-created model, you’ll need one of LoRA’s models. They’re available online – as mentioned previously.
How to Set Up LoRA Models
Now that you have everything in the pre-instructions, it’s time to begin setting up LoRA.
Below are 11 easy-to-follow steps to help you run a LoRA model in AUTOMATIC1111.
Note: We’ll be talking about how to train your own LoRA models later in the article.
Step 1: Create the Correct File Path
You need a Lora subfolder in the models folder of your Stable Diffusion folder. If it isn’t there, create it.
When doing so, make sure you have the correct file path. Otherwise, you may run into errors later on.
The file path should be:
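Going by the description above, that is the Lora folder inside the models folder of your Stable Diffusion folder. If you would rather create it from a script than by hand, here is a minimal Python sketch. The function name `ensure_lora_dir` and the example root folder are assumptions for illustration; point it at wherever your own installation lives.

```python
from pathlib import Path


def ensure_lora_dir(sd_root: str) -> Path:
    """Create <sd_root>/models/Lora if it is missing and return its path."""
    lora_dir = Path(sd_root) / "models" / "Lora"
    # parents=True also creates "models" if needed; exist_ok avoids an
    # error when the folder is already there.
    lora_dir.mkdir(parents=True, exist_ok=True)
    return lora_dir


# Example call (hypothetical install location):
# print(ensure_lora_dir("stable-diffusion-webui"))
```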
Step 2: Get a LoRA Model (Where to Find LoRA Models)
You can find LoRA concepts in Hugging Face’s LoRA concept library. It features 22 models in a variety of styles.
For demonstration purposes, I’m using Shiba Dog Dummy.
Step 3: Clone the Model
To download the model, you need to clone it.
You do that by opening Git Bash in the folder where you want the files to go. In our case, it’s the Lora subfolder we created earlier.
Right-click the subfolder and click on more options. Then, click on Git Bash Here.
After that, the Git prompt will open.
Next, you need to initialize Git LFS. You do this by entering the following command:
git lfs install
Next, clone your model with git clone followed by its URL. In our case, it’s the Shiba Dog model. In your case, it’ll be a different model, unless you’re using this one. So, be sure to get the correct URL before pressing Enter.
Copy and paste the following, adjusting the URL if necessary.
git clone https://huggingface.co/stabilityai/sd-vae-ft-ema-original
git clone YOUR_MODEL_URL
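If you prefer to script this step, the two Git commands can be wrapped in a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes Git and Git LFS are already installed and on your PATH.

```python
import subprocess
from pathlib import Path


def clone_commands(model_url: str) -> list[list[str]]:
    """Return the two Git commands from this step as argument lists."""
    return [
        ["git", "lfs", "install"],
        ["git", "clone", model_url],
    ]


def clone_model(model_url: str, lora_dir: str) -> None:
    """Run both commands inside the Lora folder.

    check=True raises an error if either command fails.
    """
    for cmd in clone_commands(model_url):
        subprocess.run(cmd, cwd=Path(lora_dir), check=True)


# Example call (hypothetical folder, your own model URL):
# clone_model("YOUR_MODEL_URL", "stable-diffusion-webui/models/Lora")
```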
It’s worth mentioning you can have multiple models in your Models folder. So, you won’t get restricted to only one.
Below is what you should be seeing:
Step 4: Find your Model
In Windows File Explorer, go to the file path mentioned in the first step. Then, click on your model’s folder. If you see exactly what’s in the screenshot below, you have done this step correctly. It’s good to check that it’s there to avoid errors later.
Click on the folder to find the model file.
Step 5: Click on your Model
In Stable Diffusion, go to the Stable Diffusion checkpoint dropdown menu. Then, click on your model. In our case, it’s v2_1786-ema-pruned (1)/ckpt[ad2a33c361].
Step 6: Click Show Extra Networks
Under the Generate button, click on the Show Extra Networks icon. It’s a small purple icon in the middle of the others.
Step 7: Click on Checkpoints
Click the Checkpoints tab that’s second from the left.
Step 8: Click on your Model
On checkpoints, you’ll find your model. In most cases, a preview isn’t available.
Now that you’ve come across your model in Stable Diffusion, click it.
Step 9: Click on the LoRA Tab
Now, click on the LoRA tab. If nothing appears, click the refresh button.
Here, you’ll see variations of your model – along with the original.
Step 10: Click on One or More Versions of your Model
Most LoRA models come with several variations to choose from. You can either select one or multiple variations of your model.
Proceed by clicking one or more versions of your model.
Once you click on one of them, you’ll see the file name go into the text prompt. Don’t panic, this is normal.
At this point, you don’t have to worry about configuring any settings. As mentioned previously, the AUTOMATIC1111 web GUI supports LoRA models.
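The text that lands in your prompt follows the same pattern as the tag shown at the start of this article, `<lora:vae-ft-ema-560000-ema-pruned:1>`: the word lora, the file name, and a number. As a rough sketch, here is how you could build that tag yourself in Python. The helper names are hypothetical, and treating the final number as a strength value you can lower is an assumption about how the web UI reads it.

```python
def lora_tag(name: str, weight: float = 1.0) -> str:
    """Build a prompt tag of the form <lora:NAME:WEIGHT>."""
    # The :g format drops a trailing ".0", so 1.0 is written as 1.
    return f"<lora:{name}:{weight:g}>"


def with_lora(prompt: str, name: str, weight: float = 1.0) -> str:
    """Append a LoRA tag to an ordinary text prompt."""
    return f"{prompt} {lora_tag(name, weight)}"


print(with_lora("Dystopian City with Cameras", "vae-ft-ema-560000-ema-pruned"))
```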
Step 11: Generate an Image
Now, you can generate an image using LoRA.
Once your model is in the text prompt, all you have to do is generate an image as you ordinarily would. Enter a description of what you want your image to be. Then, click Generate.
Below are some examples of what I generated with the sd-vae-ft-ema model.
A Dystopian World with a (Mysterious) Figure Standing in the Middle
A Dystopian City with Cameras Everywhere
A Portrait of Vincent van Gogh In the Year 3000 With a Futuristic City in the Skyline
This didn’t turn out as I’d hoped. It’s worth noting that you won’t always get the result you’re looking for. The solution is to be more specific in your text description.
Dystopian City with Cameras and Drones
How to Fine Tune Stable Diffusion with LoRA using your own Models
In the pre-setup instructions, we recommended creating a ZIP file of 5 to 200 images.
Step 1: Obtain and Upload Images
When you begin, you need to put together a ZIP file that includes a specific face, object, or artistic style.
I started by generating 5 images of Homer Simpson in the style of Van Gogh holding a beer. I’m not an artist, so I don’t have a specific drawing style. Thus, I am using the styles of others for demonstration purposes.
The styles I used were the original Matt Groening style, along with Vincent Van Gogh.
You can start with 5-10 images, as that’s enough for training the AI, although using 20-100 can improve your results.
In terms of file format, you need to use JPGs or PNGs. It’s also worth noting DreamBooth training images can be applied to LoRA.
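If your training images are sitting in one folder, you can build the ZIP file with a short Python script instead of doing it by hand. This is a minimal sketch: the function name is made up for illustration, and it simply picks up every JPG and PNG in the folder you give it.

```python
import zipfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def zip_training_images(image_dir: str, zip_path: str) -> int:
    """Write every JPG/PNG in image_dir to zip_path and return how many were added."""
    images = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for image in images:
            # arcname keeps only the file name, so the ZIP has no nested folders.
            archive.write(image, arcname=image.name)
    return len(images)


# Example call (hypothetical folder and file names):
# count = zip_training_images("homer_images", "homer_images.zip")
```

The returned count is a quick way to confirm you have at least the 5-10 images mentioned above before uploading.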
When uploading your images to public links, you can use Google Drive, GitHub, or Mega.
I opted for Google Drive.
When using Google Drive, it’s important to make the ZIP file public. You do this by clicking on the general access drop-down. Then, click on Anyone with the Link. After, click on Done.
Also, make sure to take a copy of your share link and save it onto a Notepad file for later.
Step 2: Train Your Concept
Firstly, you need to use the LoRA training model. Below is an example of the code:
import replicate

# Zip file containing input images, hosted somewhere on the internet
zip_url = "https://my-storage/my-input.zip"

# Train the model.
training_model = replicate.models.get("replicate/lora-training")
version = training_model.versions.get("b2a308762e36ac48d16bfadc03a65493fe6e799f429f7941639a6acec5b276cc")
lora_url = version.predict(instance_data=zip_url, task="style")
Note: Before doing so, create a virtual environment first.
Step 3: Install the Dependencies
To install the dependencies, enter the following command:
pip install accelerate torchvision transformers datasets ftfy tensorboard
Step 4: Install Diffusers
Then, install Diffusers by entering the following command:
pip install diffusers
If you want to get the latest version of Diffusers, use this command:
pip install git+https://github.com/huggingface/diffusers
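Before moving on, it can help to confirm that everything installed correctly. The following Python sketch checks whether each package can be found by your current environment. The helper name `missing_packages` is made up for illustration, and the list simply mirrors the packages named in Steps 3 and 4.

```python
from importlib.util import find_spec

# Packages from Steps 3 and 4 of this guide.
REQUIRED = [
    "accelerate",
    "torchvision",
    "transformers",
    "datasets",
    "ftfy",
    "tensorboard",
    "diffusers",
]


def missing_packages(names: list[str]) -> list[str]:
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if find_spec(name) is None]


# An empty list means everything is in place.
print(missing_packages(REQUIRED))
```

Run it inside the same virtual environment you installed the packages into, otherwise it will report them as missing.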
Step 5: Configure Accelerate
Now, you need to configure accelerate. In the terminal, enter the following command:
accelerate config
Step 6: Configure Training with a Single or Multiple GPU
To train on a local machine with mixed precision using a single GPU, use the following configuration:
----------------------------------------------------------------------
In which compute environment are you running?
This machine
----------------------------------------------------------------------
Which type of machine are you using?
No distributed training
Do you want to run your training on CPU only (even if a GPU is available)? [yes/NO]:no
Do you wish to optimize your script with torch dynamo?[yes/NO]:no
Do you want to use DeepSpeed? [yes/NO]: no
What GPU(s) (by id) should be used for training on this machine as a comma-seperated list? [all]:all
----------------------------------------------------------------------
Do you wish to use FP16 or BF16 (mixed precision)?
fp16
accelerate configuration saved at /home/wfng/.cache/huggingface/accelerate/default
For a multiple GPU setup, choose multi-GPU under ‘Which type of machine are you using?’.
Which type of machine are you using? Multi-GPU
Alternatively, you can override the existing configuration by passing the following arguments when training:
- --multi_gpu: train with multiple GPUs.
- --gpu_ids: train with specific GPUs from a set of multiple GPUs.
- --num_processes: set the number of processes for parallel training.
All three can be combined in one command:
accelerate launch --multi_gpu --gpu_ids="1,2" --num_processes=2 train.py ...
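If you launch training from a script, you can assemble these arguments in Python rather than typing them out each time. This is a sketch under one stated assumption: it starts one process per GPU, which matches the example above (two GPU ids, two processes) but may not be what you want in every setup. The function name is hypothetical.

```python
from typing import List, Optional


def accelerate_launch_args(
    script: str,
    gpu_ids: List[int],
    script_args: Optional[List[str]] = None,
) -> List[str]:
    """Build an `accelerate launch` command as a list of arguments."""
    args = ["accelerate", "launch"]
    if len(gpu_ids) > 1:
        args.append("--multi_gpu")
    ids = ",".join(str(i) for i in gpu_ids)
    # Assumption: one training process per selected GPU.
    args += [f"--gpu_ids={ids}", f"--num_processes={len(gpu_ids)}", script]
    return args + list(script_args or [])


print(" ".join(accelerate_launch_args("train.py", [1, 2])))
```

The resulting list can be passed straight to `subprocess.run` if you want Python to start the training for you.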
Step 7: Improve Inference Speed with Xformers (Optional)
Xformers helps to improve inference speed.
If you have torch==1.13.1, run the following command to install Xformers:
pip install -U xformers
If you’re using Conda on Linux, this installation supports either torch==1.12.1 or torch==1.13.1.
conda install xformers
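Installing Xformers is only half the job; a Diffusers pipeline also has to be told to use it. Here is a small, cautious Python sketch that switches it on only when the package is actually installed. The helper name `try_enable_xformers` is made up for illustration, and `pipe` is assumed to be a Diffusers pipeline such as a `StableDiffusionPipeline`.

```python
from importlib.util import find_spec


def try_enable_xformers(pipe) -> bool:
    """Turn on Xformers attention if the package is installed.

    Returns True when it was enabled and False when Xformers is missing,
    so the caller can carry on with the default attention instead.
    """
    if find_spec("xformers") is None:
        return False
    pipe.enable_xformers_memory_efficient_attention()
    return True


# Example call, once you have created a pipeline:
# enabled = try_enable_xformers(pipe)
```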
Tips for Fine Tuning Stable Diffusion with LoRA
How to Prepare Datasets
The two most common ways to prepare a dataset are to use one hosted on the Hugging Face Hub or to build one locally with a metadata.jsonl file.
Hugging Face Hub datasets come with image and text keys.
Find Datasets from HuggingFace Hub
Use the following command, with the dataset_name argument, to train on a dataset from the Hugging Face Hub.
accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5 \
  --dataset_name="lambdalabs/pokemon-blip-captions" \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=100 \
  --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --seed=42 \
  --output_dir="output" \
  --validation_prompt="a drawing of a pink rabbit"
Substitute the original dataset_name argument with the name of your chosen dataset on the Hugging Face Hub.
Replace the pretrained_model_name_or_path argument with the desired Stable Diffusion model.
|resolution||Resizes all images in the train/validation datasets to this resolution. Higher resolutions need more memory during training. For example, setting it to 256 trains a model that generates 256 x 256 output.|
|train_batch_size||The batch size (per device) for the training data loader. You can reduce it to avoid running out of memory during training.|
|num_train_epochs||The number of training epochs. The default is 100.|
|checkpointing_steps||Saves a checkpoint of the training state (a point in your training) every X number of updates. These checkpoints are only suitable for resuming training. The default is 500, so you can set it to a higher value to reduce how frequently checkpoints get saved.|
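To get a feel for how checkpointing_steps affects the number of checkpoints saved, here is a small Python sketch. It assumes a checkpoint is written each time the update count reaches a multiple of checkpointing_steps; the function name and the 20,000-update run are illustrative.

```python
def checkpoint_steps(total_updates: int, checkpointing_steps: int = 500) -> list[int]:
    """List the update counts at which a checkpoint would be saved."""
    return list(range(checkpointing_steps, total_updates + 1, checkpointing_steps))


# With the default of 500, a 20,000-update run saves 40 checkpoints.
print(len(checkpoint_steps(20000)))   # 40
# Raising the value to 5000 cuts that down to four.
print(checkpoint_steps(20000, 5000))  # [5000, 10000, 15000, 20000]
```

Each checkpoint takes up disk space, so a higher value is worth considering for long runs.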
How to Add Custom Images to a Folder
Create a file in the training folder, and call it metadata.jsonl. Make sure you have the correct directory structure.
|- data (folder)
|  |- metadata.jsonl
|  |- xxx.png
|  |- xxy.png
|  |- ...
|- train_text_to_image_lora.py
The data folder includes the main metadata.jsonl file and your set of training images.
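The metadata file itself is plain text with one JSON object per line, pairing each image with its caption. Here is a minimal Python sketch for writing it. The `file_name` and `text` keys follow the Hugging Face image-folder convention and the text key mentioned earlier; treat them as an assumption and check them against the training script you are using. The function name and the captions are made up for illustration.

```python
import json
from pathlib import Path


def write_metadata(data_dir: str, captions: dict[str, str]) -> Path:
    """Write metadata.jsonl with one {"file_name", "text"} object per line."""
    metadata_path = Path(data_dir) / "metadata.jsonl"
    with metadata_path.open("w", encoding="utf-8") as f:
        for file_name, text in captions.items():
            f.write(json.dumps({"file_name": file_name, "text": text}) + "\n")
    return metadata_path


# Example call (hypothetical captions for the files in the tree above):
# write_metadata("data", {
#     "xxx.png": "Homer Simpson holding a beer",
#     "xxy.png": "Homer Simpson in the style of Van Gogh",
# })
```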
When training a conditional image generation model with LoRA on your own images, use the train_data_dir argument instead of dataset_name.
Below is an example:
accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5 \
  --train_data_dir="data" \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=100 \
  --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --seed=42 \
  --output_dir="output" \
  --validation_prompt="Homer Simpson Holding a Beer"
Note: When you run this for the first time, it will download the Stable Diffusion model. Then, it will save it to the cache folder. Following this, it’ll reuse the same cache data every time you run it subsequently.
How to Utilize the Tensorboard Package
The script will save LoRA weights only at the end of training.
|- output
|  |- checkpoint-5000
|  |- checkpoint-10000
|  |- checkpoint-15000
|  |- checkpoint-20000
|  |- logs
|- data
|- train_text_to_image_lora.py
This means it’s worthwhile to use the tensorboard package, which lets you monitor training while it runs.
To begin, open a new terminal. Then, enter the command below:
tensorboard --logdir output
After doing so, you’ll see this in the terminal:
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.12.0 at http://localhost:6006/ (Press CTRL+C to quit)
Open a new tab in your browser. Then, access the TensorBoard page by entering the following URL:
http://localhost:6006/
On the Tensorboard, you should come across the training metrics – including train_loss.
How to Resume from a Checkpoint
If you want to resume from an existing checkpoint, you need to use the resume_from_checkpoint argument. Then, set it to the checkpoint you want it to resume from.
Below is the command:
accelerate launch train_text_to_image_lora.py \
  ... --resume_from_checkpoint="output/checkpoint-20000"
To set the value to the latest checkpoint, you need to enter this command:
accelerate launch train_text_to_image_lora.py \
  ... --resume_from_checkpoint="latest"
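If you are curious which checkpoint "latest" would point to, you can check the output folder yourself. The Python sketch below picks the checkpoint folder with the highest number in its name. It is an illustration of the idea, not a copy of the training script’s own logic, and the function name is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(output_dir: str) -> Optional[str]:
    """Return the name of the checkpoint-N folder with the largest N, if any."""
    steps = []
    for path in Path(output_dir).glob("checkpoint-*"):
        suffix = path.name.split("-", 1)[1]
        if path.is_dir() and suffix.isdigit():
            steps.append(int(suffix))
    # Compare as numbers: as plain text, "checkpoint-5000" would sort
    # after "checkpoint-10000".
    return f"checkpoint-{max(steps)}" if steps else None


# Example call:
# print(latest_checkpoint("output"))
```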
After the training has been completed, it will generate a small LoRA weights file called pytorch_lora_weights.bin in the output directory.
Now, create a new file and name it inference.py. Then, add the code below to it.
from diffusers import StableDiffusionPipeline
import torch

device = "cuda"

# load model
model_path = "./output/pytorch_lora_weights.bin"
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    safety_checker=None,
    feature_extractor=None,
    requires_safety_checker=False,
)

# load lora weights
pipe.unet.load_attn_procs(model_path)
# set to use GPU for inference
pipe.to(device)

# generate image (.images is a list, so take the first entry)
prompt = "a drawing of a white rabbit"
image = pipe(prompt, num_inference_steps=30).images[0]
# save image
image.save("image.png")
After, run this command to generate images with the LoRA weights you just trained:
python inference.py
How to Resize LoRA Models
You can reduce the size of a pre-trained LoRA by running the following batch file:
It will start a series of popups that will guide you through resizing LoRA models.
A lot of people are starting to downsize their trained models to fix output issues. In general, most individuals say it helps.
But, time has progressed since those days.
Generating images with LoRA models isn’t a difficult task, but if you create your own models it can get quite complex.
Actually using a pre-made LoRA model in Stable Diffusion (With AUTOMATIC1111) is easy because you don’t have to set anything up. You merely need to clone a model from HuggingFace.