Single-Stream Diffusion Transformer
Efficient Image Generation Model with 6B Parameters
🚀 Open source and publicly available
Z-Image Team, Tongyi MAI, Alibaba Group
Experience the power of Z-Image-Turbo with our interactive demo

Z-Image is an efficient 6-billion-parameter foundation model for image generation. Through systematic optimization, it proves that top-tier performance is achievable without relying on enormous model sizes, delivering strong results in photorealistic generation and bilingual text rendering.
Photography-level realism with fine control over details, lighting, and textures, plus excellent aesthetic quality in composition and overall mood.
Sub-second inference latency on enterprise-grade H800 GPUs, with only 8 steps needed for generation.
Accurate rendering of both Chinese and English text while preserving facial realism and overall aesthetic composition.
Runs smoothly on consumer-grade graphics cards with less than 16GB of VRAM, making advanced image generation accessible.
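As a rough sanity check on the VRAM figure, the sketch below estimates the memory taken by the weights of a 6B-parameter model at a few common precisions. The precisions listed and the helper name `weight_memory_gib` are assumptions for illustration. The page does not say which precision Z-Image ships in, and the estimate ignores activations, the text encoder, and the image decoder.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights alone, in GiB (hypothetical helper)."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 6e9  # 6B parameters, as stated for Z-Image

# Assumed precisions; the page does not state which one is used.
for name, nbytes in [("fp32", 4), ("bf16/fp16", 2), ("int8", 1)]:
    gib = weight_memory_gib(NUM_PARAMS, nbytes)
    print(f"{name:>9}: {gib:.1f} GiB")
```

Under these assumptions, 16-bit weights come to roughly 11.2 GiB, which is consistent with fitting on a card with less than 16GB of VRAM, while 32-bit weights would not fit.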
A glance at the powerful capabilities of the Z-Image model.

Z-Image offers specialized models for different use cases:
Z-Image-Turbo: A distilled version of Z-Image with strong capabilities in photorealistic image generation, accurate rendering of both Chinese and English text, and robust adherence to bilingual instructions. It matches or exceeds leading competitors with only 8 steps.
Z-Image-Edit: A continued-training variant of Z-Image specialized for image editing. It follows complex instructions to perform a wide range of tasks, from precise local modifications to global style transformations, while maintaining high edit consistency.
Discover the advanced capabilities of Z-Image across various domains.
Delivers strong photorealistic image generation while maintaining excellent aesthetic quality.
Accurately renders complex Chinese and English text in various scenarios.
Possesses a broad understanding of world knowledge and diverse cultural concepts.
Uses a structured reasoning chain to inject logic and common sense.
Precisely executes complex instructions for image transformations.
Demonstrates fine-grained control over image elements and transformations.
Competitive results on AI Arena with state-of-the-art performance among open-source models.
Parameters: 6B
Inference Steps: 8
VRAM Required: less than 16GB
Inference Latency: sub-second on H800 GPUs
According to the Elo-based human preference evaluation on AI Arena, Z-Image performs competitively against other leading models and achieves state-of-the-art results among open-source models.
AI Arena
Elo Rating System
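For readers unfamiliar with Elo ratings, here is a minimal sketch of the textbook Elo update applied to one pairwise human preference vote. The scale of 400 and the K-factor of 32 are conventional defaults chosen for illustration. They are assumptions, not AI Arena's actual settings, and the function names are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that model A is preferred over model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    """
    e_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - e_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two models start level; a voter prefers the image from model A.
new_a, new_b = elo_update(1000.0, 1000.0, score_a=1.0)
print(new_a, new_b)
```

A win against an equally rated opponent moves each rating by half the K-factor, while a win against a much weaker opponent moves it very little, so rankings settle as votes accumulate.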
The model code, weights, and online demo are now publicly available to encourage community exploration and use.
Community
Open Source
We aim to promote the development of generative models that are accessible, low-cost, and high-performance.
Researchers
Academic
For more information, visit our GitHub repository
Experience the power of efficient image generation with Z-Image.