What is Stable Diffusion?
Stable Diffusion is a text-to-image model that was released by StabilityAI on August 22, 2022. Now it is used by millions of people to generate stunning art within seconds.
It's similar to other models like OpenAI’s DALL·E 2 and Midjourney, with one big difference: it was released to the public for free.
This was a very big deal.
Unlike existing models, which were closely guarded by the companies that made them, Stable Diffusion could be customised or built upon by anyone. In the few short weeks that followed the release, there was an explosion of innovation around Stable Diffusion models and tools.
Stable Diffusion’s efficiency also meant that, for the first time, a model like this could be run on consumer GPUs instead of supercomputer clusters.
What can you do with Stable Diffusion?
Text to Image (txt2img)
The bread and butter of Stable Diffusion. Enter a sentence describing an image—called a ‘prompt’—and Stable Diffusion will generate an image for you!
Image to Image (img2img)
The peanut butter and jelly of Stable Diffusion. Input an image and a prompt, and get a new image.
Edit with inpainting and outpainting
Inpainting: highlight a part of an image and generate over it
Outpainting: extend an image, specifying what you want in the extension
Integrations for Photoshop, Blender etc.
All of the above will soon be integrated into relevant software.
Modify images by giving instructions
InstructPix2Pix is a model that allows you to input an image and give instructions for how you want it to be modified. This is a huge step up from image-to-image generation because you can specify which parts of the image you want to modify with normal language.
Modify videos by giving instructions
Using InstructPix2Pix it is also possible to edit videos!
Check out these creations from Reddit users:
Zero-One-One-Zero makes the Terminator into marble sculptures:
Equal-Mix1721 does the Matrix as a Western film:
How do I use it?
How do you actually get started? I recommend trying some online tools to get a feel for what you can do. Then, install Stable Diffusion locally on your computer.
Just go to the website and you can start generating images!
StableUI by Aqualxx
A completely free tool which supports any resolution, many models, img2img, inpainting and a public gallery.
Made possible with Stable Horde, a crowd-sourcing platform that allows generous people to donate their GPU power for other people to use.
DreamStudio
By StabilityAI, the creators of Stable Diffusion. While Stable Diffusion is free, DreamStudio is a paid product with a free trial for those who want an easy-to-use website and don’t want to install SD on their own computer.
ArtBot by Dave Schumaker
Another Stable Horde based tool. Works great on mobile, and has a bunch of models and features: img2img, inpainting and a public gallery.
Let’s get this straight: most AI-art apps are terrible. They buy a lot of fake reviews and hope you’ll be tricked into buying their paid version. Avoid these. The best apps are free anyways.
Here are the apps I’ve tested that have actually impressed me.
Draw Things: AI Generation
I’m very impressed by this app. The interface is clean and intuitive. It comes with all the features you’d want. There’s also a version for the Mac and iPad.
Diffusitron AI Art
Free app that produces great results. The developers are responsive and helpful.
You’ve decided you want full customizability and access to the most advanced features as soon as they come out. Great! Let’s install Stable Diffusion on our own computer.
You can think about this in two parts: there’s a User Interface with toggles that you use to tweak your generations, and the Model you choose to actually generate your images.
“Stable Diffusion” is the name of the official model published by StabilityAI, but it is also used colloquially to refer to any text-to-image model, as in: “Oh, that’s a great stable diffusion model”.
Here are the best options for running Stable Diffusion on your computer:
AUTOMATIC1111’s Stable Diffusion WebUI
This is where I recommend people start out. AUTOMATIC1111 is the most full-featured Stable Diffusion User Interface.
Installation instructions are available for AUTOMATIC1111 with Stable Diffusion v2.1 or, if you are interested in anime, with NAI Diffusion.
Another great interface. The community is strong and they have a great unified interface.
Mochi Diffusion – made for Mac M1
I really like this one. While the previous two apps have Mac versions, Mochi Diffusion is actually built and optimized for Apple Silicon.
Running in the Cloud
What’s Google Colab? It’s a service that allows you to run Python code, but on Google’s servers.
It might seem intimidating if you’ve never coded, but it’s really just pressing a series of “run” buttons to execute different sections of code.
- TheLastBen’s Fast Stable Diffusion: the most popular Colab for running Stable Diffusion
- AnythingV3 Colab: a Colab for anime generation
- Deforum Colab: a Colab for creating animations
Models determine what the AI knows
Which model you use will determine what the AI “knows” and can generate.
Top models such as Stable Diffusion are trained on large datasets, and are regularly updated to improve their performance.
Anybody can create their own model.
Here are the most popular models:
Stable Diffusion v2.1
The latest version of the official Stable Diffusion model by StabilityAI.
Midjourney is a very popular AI generation tool, but it costs $10/mo to $60/mo to use. OpenJourney simulates the look of Midjourney, and you can run it for free!
One of the most popular anime-themed models. It was made with Stable Diffusion + millions of anime images.
For a comprehensive list, check out: https://rentry.org/sdmodels
Best place to find new models?
Remember, models are plug and play.
If you’ve already downloaded any of the User Interfaces in the previous section, you can download and run as many models as you want. Just make sure you have enough hard drive space for them!
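If you want to see how much room your models take up, a few lines of Python can total it for you. This is only a minimal sketch: the `models` folder name is hypothetical (point it at wherever your User Interface stores its models), and treating `.safetensors` as a model extension alongside `.ckpt` is an assumption.

```python
import shutil
from pathlib import Path

# File extensions treated as model files (assumption; adjust to your setup).
MODEL_EXTENSIONS = {".ckpt", ".safetensors"}


def model_sizes_gb(models_dir):
    """Return {filename: size in GB} for model files directly inside models_dir."""
    sizes = {}
    for path in Path(models_dir).iterdir():
        if path.is_file() and path.suffix.lower() in MODEL_EXTENSIONS:
            sizes[path.name] = path.stat().st_size / 1e9
    return sizes


def free_space_gb(path):
    """Return the free space, in GB, on the drive that holds path."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    models_dir = Path("models")  # hypothetical folder name
    if models_dir.is_dir():
        for name, size in sorted(model_sizes_gb(models_dir).items()):
            print(f"{name}: {size:.2f} GB")
        print(f"Free space: {free_space_gb(models_dir):.1f} GB")
```

Run it from the folder that contains your models directory; if the folder is not there, it simply prints nothing.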
Warning: Be careful when downloading models
.ckpt files and Python files can execute code. This means people can also create models with malicious code that infects your computer with viruses. AUTOMATIC1111’s WebUI should check model files for malicious code before loading them, but that’s only one line of defense.
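Why can a model file execute code at all? .ckpt files are typically saved with Python's pickle format, and a pickle can tell Python to call a function while it is being loaded. The sketch below is a toy illustration, not a real scanner: it works on raw pickle bytes rather than an actual .ckpt, the names `find_risky_opcodes` and `RunsCodeOnLoad` are made up for this example, and the "malicious" payload only calls the harmless built-in `int()`.

```python
import pickle
import pickletools

# Opcodes that make the unpickler import a name or call an object.
RISKY_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


def find_risky_opcodes(data):
    """Return the risky opcode names found in pickle bytes, without loading them."""
    return {op.name for op, _arg, _pos in pickletools.genops(data)} & RISKY_OPCODES


class RunsCodeOnLoad:
    """Toy stand-in for a booby-trapped file; it only calls the harmless int()."""

    def __reduce__(self):
        # The unpickler will call int("42") when this object is loaded.
        return (int, ("42",))


plain_data = pickle.dumps({"weights": [0.1, 0.2, 0.3]})
trapped = pickle.dumps(RunsCodeOnLoad())

if __name__ == "__main__":
    print(find_risky_opcodes(plain_data))  # plain numbers and text need no calls
    print(find_risky_opcodes(trapped))     # includes REDUCE, the "call this" opcode
    print(pickle.loads(trapped))           # loading runs int("42") and returns 42
```

Loading `trapped` does not give back a `RunsCodeOnLoad` object; it gives back whatever the function chosen by the file's author returned. Keep in mind that legitimate model files also use these opcodes to rebuild their data, so finding them is not proof of malice. Real checkers go further and compare what is being called against a list of known-safe functions.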
To get the content you want, you will need to find the right combination of words to direct the tool. These word combinations are called prompts.
You can be highly descriptive with long sentences, or use short words and phrases; both work.
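To make the two styles concrete, here is a tiny Python sketch that builds a prompt either from one long sentence or from a short subject plus a few phrases. The helper `build_prompt` is hypothetical, and joining phrases with commas is just a common convention, not a rule of any particular tool.

```python
def build_prompt(subject, modifiers=()):
    """Join a subject and optional style phrases into one comma-separated prompt."""
    parts = [subject.strip()]
    parts += [m.strip() for m in modifiers if m.strip()]  # skip empty phrases
    return ", ".join(parts)


# Style 1: one descriptive sentence.
descriptive = build_prompt("a lighthouse on a cliff at sunset, waves crashing below")

# Style 2: a short subject plus short phrases.
keywords = build_prompt("lighthouse", ["sunset", "oil painting", "dramatic lighting"])

if __name__ == "__main__":
    print(descriptive)
    print(keywords)
```

Either string can be pasted into the prompt box of whichever tool you are using. Keeping the subject and the style phrases separate makes it easy to swap one phrase at a time and see what changes.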
Here are some great resources for prompting:
Lexica.art: There are a number of sites that aggregate images and prompts, but Lexica is my favorite. Just search for what you’re looking for. Quality will vary though, so be prepared to dig around.
- List of Artists Represented in Stable Diffusion 1.4: A list of all the artists in the v1.4 Stable Diffusion model, with example generations
- Note that different models have different levels of knowledge about artists. However, many models use Stable Diffusion as a starting point, so this list is a good reference for them too
- Arthive Artist Database: 74,000 artists – not all of them are represented in Stable Diffusion, but great for inspiration.
You can also use image-to-text (img2txt) tools to attempt to reverse engineer the prompt that created an image:
- Clip-Interrogator, by Pharma. Figure out a prompt to recreate an image
- BLIP, by Salesforce