SD Prompt Basics
Since SDXL produces better results, this chapter focuses on SDXL prompts. Before starting, please download the following models and place the model files in the corresponding folders:
- SDXL 1.0 Base: Place it in the models/checkpoints folder within ComfyUI.
- SDXL 1.0 Refiner: Place it in the models/checkpoints folder within ComfyUI.
If you want to use the workflow from this chapter, you can either download and use the Comflowy local version or sign up for the Comflowy cloud version; both have this chapter's workflow built in. Additionally, if you're using the cloud version, you can use our built-in models directly without downloading anything.
After understanding all the basic nodes and operations of ComfyUI, you should be able to use ComfyUI to generate images.
However, the generated image might not look the way you want. For instance, you may want an anime-style image but get a realistic one. In that case, you need certain methods to make the AI generate images in a specific style. There are several ways to adjust the image style:
- Method one: Influence the image generation through the use of prompts. This is the simplest practice. Basically, using a default workflow should suffice. This chapter will explain how to use prompts to adjust the image style and introduce some more effective methods.
- Method two: Influence the image generation by loading different Checkpoint models. This is a straightforward practice; you can check Model Recommendations in the basic elective to learn about commonly used models in the industry.
- Method three: Influence the image generation through LoRA. This method is a bit more complex, but compared to method one, more models are available, and the files are significantly smaller. Later, I will introduce you to the usage of LoRA.
Basic Principles
Before explaining how to write prompts, let's first cover the basic principles behind them.
Principle 1: Longer prompts are not better
Many people new to prompts assume that the longer the prompt, the better. This is incorrect. What matters most is expressing your intention accurately, not piling up useless words.
Moreover, it's best to keep the prompt within 75 tokens (roughly 60 words), since that is all the text encoder reads.
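If you want to check a prompt's length programmatically, here is a minimal sketch using the Hugging Face transformers library (an assumption of this example; it is not part of ComfyUI) to run Stable Diffusion's CLIP tokenizer:

```python
# Minimal sketch: count how many CLIP tokens a prompt uses.
# Assumes the `transformers` package is installed (pip install transformers).
from transformers import CLIPTokenizer

# SD/SDXL text encoders are based on CLIP; this matches SD's first encoder.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

prompt = "a white cat, in a forest, moonlight, oil painting"
input_ids = tokenizer(prompt)["input_ids"]

# Subtract the begin/end markers the tokenizer adds automatically.
print(f"{len(input_ids) - 2} of 75 usable tokens")
```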
Principle 2: Place important words early
Many people overlook this when writing prompts. For example, if you want to generate an image of a cat, place "cat" near the beginning of the prompt rather than toward the end.
Principle 3: Make good use of symbols
- Use commas for separation: Although SDXL has greatly improved in semantic understanding, it is still recommended to use commas to separate different intentions when using prompts.
- Use parentheses to adjust weight: You can enter `(keyword:weight)` to control a keyword's weight. For example, `(high building:1.2)` increases the weight of "high building". A weight below 1 reduces the keyword's weight, so the generated image will be less related to that word.
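Putting both symbols together, an illustrative weighted prompt (the keywords here are made up for demonstration) might look like this:

```
masterpiece, (high building:1.2), sunset, (crowd:0.8)
```

Here "high building" is emphasized, "crowd" is de-emphasized, and commas keep the different intentions separate.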
Basic Template
After understanding the basic principles, let's introduce a template for writing prompts. Generally, a prompt includes the following elements (a complete example follows the list):
- Subject: Since the subject is the most important, it is generally placed first.
- Environment: The environment includes:
- The environment around the subject, such as "in a forest", "on a grassland", etc.
- Lighting, such as "lightning", "moonlight", etc.
- Weather, such as "rain", "snow", etc.
- Medium: The medium can be the shooting medium of the picture, like a "camera", or the bearing medium, like an "oil painting".
- Style: You can use the 4W mnemonic:
- When: Which era's style?
- Who: Whose style do you want? (individual or organization)
- What: What kind of art type or art movement style?
- Where: What country's style?
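Putting the template together, an illustrative prompt (the specific subject and artist here are my own example, not a fixed recipe) might read:

```
a lonely lighthouse, on a cliff by the sea, moonlight, heavy rain, oil painting, 19th-century Romantic style, in the style of J. M. W. Turner
```

Subject first, then environment (place, lighting, weather), then medium, then a style that answers the 4W questions.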
If you have also studied my Midjourney tutorial, you'll notice that SD's prompt template is missing the "composition" part. This is because I believe composition is better conveyed to the model through other means, such as images via ControlNet or img2img (introduced in the intermediate part), rather than written directly in the prompt. This way, the model's output will align more closely with expectations.
Methods for Learning Prompts
In fact, after understanding the basic principles and template, you can already write prompts and generate images. However, I believe most people (myself included) will find it difficult to generate the image they envision on the first try. The reasons are:
- Most people lack a clear sense of aesthetics: this is a big issue, because if you don't know what kind of image you want, you won't know how to write the prompt.
- Most people lack knowledge of fine arts: even if you know what beauty is, you may not know how to express it. For example, if you don't know there is a painter named "Van Gogh", you are unlikely to write "Van Gogh style" in a prompt.
So, is there a solution?
I think the best method is to "imitate before surpassing". First see how others write prompts, modify them to fit your needs, and change variables in a controlled manner, such as only one word at a time. This way, you can understand how each word affects the generated image.
I would like to recommend a few good learning websites:
- Civitai: The famous C site, currently the largest AI image community in the world. I've learned a lot there.
- PromptHero: Also a good AI image community, though it feels smaller than Civitai.
- Learning Prompt: Although the tutorial I wrote back then was for Midjourney, the scenario examples inside are still relevant for learning Stable Diffusion.
Batch Image Workflow
To generate images more efficiently, I suggest the following:
- Choose a relatively fast model for iteration: if you use SDXL, select SDXL Base, or use the newer SDXL Turbo.
- ComfyUI differs significantly from Midjourney in how it generates images: it does not generate them concurrently but one after another. So when experimenting with prompts, it's recommended to generate one image at a time; if you're not satisfied, just click Queue Prompt a few more times (or queue runs programmatically, as in the sketch after this list).
- When you are satisfied, you can generate multiple images at once.
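If you prefer scripting over clicking, here is a minimal sketch that queues the same workflow several times against a local ComfyUI server. It assumes the server runs at 127.0.0.1:8188 and that you have exported your workflow in API format as workflow_api.json (both the address and the file name are assumptions for this example):

```python
# Minimal sketch: queue one workflow several times via ComfyUI's HTTP API.
# Assumes a local ComfyUI server at 127.0.0.1:8188 and a workflow exported
# in API format (enable Dev mode in settings, then "Save (API Format)").
import json
import urllib.request

with open("workflow_api.json") as f:  # hypothetical file name
    workflow = json.load(f)

for _ in range(4):  # queue four generations; ComfyUI runs them one by one
    # Note: if your KSampler's seed is fixed in the JSON, every run will
    # produce the same image; change the seed per iteration for variety.
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```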
When introducing ComfyUI's basic nodes earlier, I mentioned that you can generate multiple images at once by adjusting a parameter on a certain node.
Can you recall how to generate multiple images?
If you have forgotten, you can go back and review the ComfyUI ① Basics article.