Diffusion models turned image generation on its head in 2021-2022. The core idea: train a model to remove noise from an image; at generation time, start from pure random noise and run the denoiser repeatedly (typically 20-50 steps), conditioning on a text prompt at each step. With every step, the image described by the prompt emerges a little further from the noise. Stable Diffusion, DALL-E 2/3, Adobe Firefly, and Imagen are all diffusion models, and Midjourney is widely reported to be one as well. The approach's strengths: photorealistic output, strong style control, and compute requirements that are manageable, especially for latent variants. Variants: latent diffusion (denoises in a compressed image space, which is much faster), conditional diffusion (extra control inputs such as edge maps or poses, as in ControlNet), and video diffusion (Sora, Veo).
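The generation loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any named product is implemented: the "images" are 4x4 arrays, the hypothetical `TOY_PROMPTS` table stands in for text conditioning, and `oracle_noise_prediction` replaces the trained neural network with a formula that already knows the target image. The update rule is a deterministic DDIM-style step, one common sampler among several; the noise schedule values are arbitrary.

```python
import numpy as np

# Hypothetical "prompts" mapped to tiny target images (stand-in for text conditioning).
TOY_PROMPTS = {
    "checkerboard": (np.indices((4, 4)).sum(axis=0) % 2).astype(float),
    "gradient": np.tile(np.linspace(0.0, 1.0, 4), (4, 1)),
}

def make_alpha_bars(num_steps):
    # Fraction of signal left at each noise level: near 1 at index 0, small at the end.
    betas = np.linspace(1e-4, 0.2, num_steps)
    return np.cumprod(1.0 - betas)

def oracle_noise_prediction(x_t, alpha_bar_t, target):
    # Assumption: a real model is a trained network; this oracle knows the target
    # and solves x_t = sqrt(a) * target + sqrt(1 - a) * noise for the noise.
    return (x_t - np.sqrt(alpha_bar_t) * target) / np.sqrt(1.0 - alpha_bar_t)

def sample(prompt, num_steps=30, seed=0):
    target = TOY_PROMPTS[prompt]
    alpha_bars = make_alpha_bars(num_steps)
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(target.shape)  # start from pure random noise

    for t in range(num_steps - 1, -1, -1):
        a_t = alpha_bars[t]
        a_prev = alpha_bars[t - 1] if t > 0 else 1.0  # final step lands on a clean image
        eps = oracle_noise_prediction(x, a_t, target)            # "which part is noise?"
        x0_pred = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)  # implied clean image
        x = np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps  # move to lower noise
    return x

if __name__ == "__main__":
    print(np.round(sample("checkerboard"), 3))
```

Because the oracle is exact, the loop recovers the target image; a trained denoiser only approximates the noise, which is why real samplers need many steps and produce varied outputs for the same prompt.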
GLOSSARY
What is a Diffusion Model?
The AI architecture behind most image generators (Stable Diffusion, DALL-E, Midjourney) — generates images by progressively denoising random noise.
RELATED TERMS
Stable Diffusion
The most influential open-source image generation model, released by Stability AI in 2022 — the foundation of much of the AI art ecosystem.
Image Generation
AI that creates new images from text descriptions, reference images, or both — covering tools like Midjourney, DALL-E, Stable Diffusion, and Adobe Firefly.
Transformer
The neural network architecture underlying modern LLMs and most image AI, introduced by Google in 2017, which quickly became the dominant approach.