🎉 Official Release

Z-Image: Efficient Image Generation

Single-Stream Diffusion Transformer
Efficient Image Generation Model with 6B Parameters

🚀 Open source and publicly available

Z-Image Team, Tongyi MAI, Alibaba Group

Try Z-Image-Turbo Online

Experience the power of Z-Image-Turbo with our interactive demo

Available on Leading AI Platforms

GitHub, ModelScope, Hugging Face
What is Z-Image?

Z-Image is an efficient 6-billion-parameter foundation model for image generation. Through systematic optimization, it proves that top-tier performance is achievable without relying on enormous model sizes, delivering strong results in photorealistic generation and bilingual text rendering.

Photorealistic Quality

Photography-level realism with fine control over details, lighting, and textures. Achieves excellent aesthetic quality in composition and overall mood.

Ultra-fast Inference

Achieves sub-second inference latency on enterprise-grade H800 GPUs. Generation needs only 8 steps.

Bilingual Text Rendering

Accurate rendering of both Chinese and English text while preserving facial realism and overall aesthetic composition.

Efficient VRAM Usage

Can run smoothly on consumer-grade graphics cards with less than 16GB of VRAM, making advanced image generation accessible.
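As a rough sanity check on the VRAM claim, the weight footprint of a 6B-parameter model can be estimated from the bytes used per parameter. The sketch below is back-of-the-envelope arithmetic only. The precisions listed are assumptions (the page does not state which one Z-Image ships in), and the estimate ignores activations, the text encoder, the VAE, and framework overhead.

```python
# Back-of-the-envelope estimate of weight memory for a 6B-parameter model.
# Assumption: weights only; activations and auxiliary models are not counted.
PARAMS = 6_000_000_000  # "6B parameters", taken from the page

# Standard storage sizes per parameter for common precisions.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed to hold the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype}: {weight_memory_gib(PARAMS, nbytes):.1f} GiB")
# float32: 22.4 GiB
# bfloat16: 11.2 GiB
# int8: 5.6 GiB
```

Under these assumptions, 16-bit weights would fit within 16GB while 32-bit weights would not, which is consistent with the claim above but does not by itself confirm the real memory usage.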

Core Features

A glance at the powerful capabilities of the Z-Image model.

Photography-level Realism

Model Variants

Z-Image offers specialized models for different use cases:

1. 🚀 Z-Image-Turbo

A distilled version of Z-Image with strong capabilities in photorealistic image generation, accurate rendering of both Chinese and English text, and robust adherence to bilingual instructions. It achieves performance comparable to or exceeding leading competitors with only 8 steps.

2. ✍️ Z-Image-Edit

A continued-training variant of Z-Image specialized for image editing. It excels at following complex instructions to perform a wide range of tasks, from precise local modifications to global style transformations, while maintaining high edit consistency.

Capabilities Showcase

Discover the advanced capabilities of Z-Image across various domains.

Photorealistic Generation

Delivers strong photorealistic image generation while maintaining excellent aesthetic quality.

Bilingual Text Rendering

Accurately renders complex Chinese and English text in various scenarios.

World Knowledge

Possesses a broad understanding of world knowledge and diverse cultural concepts.

Semantic Understanding

Uses a structured reasoning chain to inject logic and common sense.

Creative Editing

Precisely executes complex instructions for image transformations.

Instruction Following

Demonstrates fine-grained control over image elements and transformations.

Model Performance

Competitive results on AI Arena with state-of-the-art performance among open-source models.

6B Parameters

8 Inference Steps

16GB VRAM Required

<1s Inference Latency

Human Preference Evaluation

According to the Elo-based Human Preference Evaluation on AI Arena, Z-Image shows highly competitive performance against other leading models while achieving state-of-the-art results among open-source models.
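For readers unfamiliar with Elo-based ranking, here is a minimal sketch of how a textbook Elo rating update works for a single pairwise comparison. The page does not describe AI Arena's exact procedure, so the K-factor of 32, the starting rating of 1000, and the function names are illustrative assumptions rather than the evaluation's actual settings.

```python
# Textbook Elo update for one head-to-head comparison between two models.
# Assumption: K=32 and the 400-point scale are conventional defaults,
# not values confirmed for AI Arena.


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two models start at the same (hypothetical) rating; a voter prefers A.
a, b = update_elo(1000.0, 1000.0, score_a=1.0)
print(a, b)  # 1016.0 984.0
```

Repeating this update over many human votes moves each model's rating toward a level that reflects how often it is preferred, which is what makes the resulting leaderboard comparable across models.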

The model code, weights, and online demo are now publicly available to encourage community exploration and use.

We aim to promote the development of generative models that are accessible, low-cost, and high-performance.

Frequently Asked Questions

Common questions about Z-Image and its capabilities.

For more information, visit our GitHub repository

Start Using Z-Image Today

Experience the power of efficient image generation with Z-Image.
