Benchmarks
January 3, 2026
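Qwen-Image-2512 vs. Z-Image Turbo: A 5-Prompt Benchmark - Which Model Is Better?
We tested Qwen-Image-2512 and Z-Image Turbo side by side on 5 complex prompts. See how they compare on prompt adherence, text rendering, and detail richness, and decide which model suits you best.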
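Introduction
With the release of Qwen-Image-2512, many creators are asking how it compares with the widely used Z-Image Turbo. Specifically, which model is better at handling complex instructions, rendering specific text, and reproducing intricate detail?
To find out, we ran a direct A/B test at 720x1280 (portrait) resolution, using the same seed and the same prompts for both models.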
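Test procedure:
- Platform: zimage.run
- Settings:
  - Model A: Qwen-Image-2512
  - Model B: Z-Image Turbo
  - Resolution: 720x1280
- Why this tool: it hosts both models and allows free testing without login, which makes these results easy to reproduce.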
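For readers who want to keep track of their own reruns, the setup above can be written down as a small run matrix. This is only a sketch: the `Run` class, the short prompt IDs, and the seed value are assumptions made for illustration (the article does not state which seed was used), and the code does not call zimage.run or any model API.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Run:
    """One generation in the A/B comparison."""
    model: str
    prompt_id: str
    seed: int
    width: int
    height: int


MODELS = ["Qwen-Image-2512", "Z-Image Turbo"]

# Hypothetical short IDs for the five prompts in this article.
PROMPT_IDS = [
    "joker",
    "influencer_text",
    "steampunk_metropolis",
    "dorm_room",
    "art_nouveau",
]

SEED = 1234  # Placeholder: the article does not state the actual seed.
WIDTH, HEIGHT = 720, 1280  # Portrait resolution used in the article.


def build_runs(models=MODELS, prompt_ids=PROMPT_IDS, seed=SEED):
    """Pair every prompt with every model, holding seed and size fixed."""
    return [
        Run(model, prompt_id, seed, WIDTH, HEIGHT)
        for prompt_id, model in product(prompt_ids, models)
    ]


runs = build_runs()
```

Holding the seed and resolution constant across both models means that, within each prompt pair, the model is the only variable that changes.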
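5 Test Prompts and Results
Below are the exact prompts used in this comparison. You can copy them to verify the results.
(Note: left image = Qwen-Image-2512, right image = Z-Image Turbo)
1. The Joker (Texture and Lighting)
Focus: physical skin texture (cracked makeup) and dramatic lighting.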
An ultra-detailed, hyper-realistic extreme close-up portrait of The Joker. The frame is filled with his face in a tense three-quarter profile, capturing a moment of unsettling stillness. His skin is a grotesque canvas: a thick layer of caked, smeared white makeup cracks like dry earth, revealing sallow, scarred skin beneath. Crazed streaks of smudged red lipstick stretch far beyond his lips into a permanent, manic grimace. Toxic green hair, oily and unkempt, frames his face. The eyes are the focal point—hollow, dark-rimmed, and gleaming with a volatile mix of calculated madness and raw, chilling mirth. Every pore, every flake of peeling makeup, and the subtle, menacing tension in his jaw muscles are rendered in microscopic detail. Dramatic, chiaroscuro lighting from a single source casts deep shadows across his features, creating extreme contrast and amplifying the sinister, iconic atmosphere. Shot on a phantom high-speed camera, 8K resolution, with the texture and impact of a key film still from a psychological thriller.
2. Influencer with Specific Text (Text Rendering)
Focus: generating the specific text string [Qwen-Image-2512] on a neon sign.
A stunning, intimate editorial portrait focused on the charismatic face of a 21-year-old blonde social media influencer. She flashes a playful, knowing smile while confidently pointing a manicured finger directly towards the sleek, glowing neon sign bearing the text "[Qwen-Image-2512]". Soft, directional natural light from a large window washes over her, creating a high-contrast interplay of light and shadow that sculpts her flawless features, sparkling eyes, and textured blonde hair. The atmosphere is modern, vibrant, and stylish, with a shallow depth of field that renders the chic, minimalist urban loft background into a soft, creamy bokeh, ensuring all focus remains on her engaging expression and the luminous sign.
3. Steampunk Metropolis (Scene Complexity)
Focus: detail density and composition in a vertical scene.
A breathtaking cinematic masterpiece, ultra-wide panorama of a vast, multi-layered steampunk metropolis nestled within a colossal mountain canyon at sunrise. The city is a vertical labyrinth: towering Neo-Victorian spires with glowing clockwork faces, mid-level residential districts of brass and stained glass connected by buzzing aerial trams, and bustling lower streets where steam-carriages navigate cobblestone roads. The sky is dominated by a fleet of majestic brass-and-wood airships with canvas wings, some docking at skyscraper-sized clockwork towers, others departing alongside smaller personal ornithopters. Countless copper pipes and vents emit plumes of steam, catching the brilliant golden-hour light which creates long, dramatic shadows and glints off countless gears, glass domes, and polished brass. Victorian-clad citizens crowd grand plazas, market stalls, and intricate bridge networks, full of life. In the foreground, a massive, slowly-turning central gear and a cascading waterfall turned into a steam-powered generator add dynamic scale. The atmosphere is thick with hopeful industry, mist, and sunbeams, hyper-detailed, 8K, epic sense of scale and wonder.
4. Dorm Room (Atmosphere)
Focus: interior lighting and placement of specific objects.
A close-up, dynamic selfie of a 20-year-old American college student with long, flowing hair and a model's poised, athletic figure. She has a bright, confident smile and expressive eyes, capturing a moment of lively charm. She wears a casual yet stylish outfit, like a fitted university sweatshirt slipped off one shoulder. The photo is taken in a classic American dorm room: behind her, a cozy loft bed with school-branded blankets is visible, alongside a desk cluttered with textbooks, a laptop, and a poster-covered wall featuring a university flag or souvenir. Sunlight streams warmly through a nearby window, casting soft, natural light that highlights her features and the vibrant, youthful atmosphere. The image is sharp, clear, and full of life, embodying the authentic, energetic spirit of campus life.
5. Art Nouveau (Style Transfer)
Focus: adherence to the art style of Alphonse Mucha.
A graceful Art Nouveau depiction of a "Winter Goddess." Flowing, organic lines frame intricate patterns of frost-kissed pine branches, holly berries, and delicate snowflakes woven into her hair and gown. Silver leaf accents glimmer like ice against a muted wintry palette of frosted blues, deep evergreen, and soft pearl white. In the style of Alphonse Mucha, the composition is highly decorative and ornamental, evoking the serene yet majestic beauty of a snow-blanketed forest.
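Conclusion
Based on these 5 tests at 720x1280, here is how the two models compare:
- Instruction following: Qwen-Image-2512 tends to be more literal and gritty. In the Joker test, it strictly followed the "cracks like dry earth" instruction, producing a heavily textured, almost tactile result. Z-Image Turbo also followed the instruction, but applied a layer of aesthetic smoothing that gave a cleaner look.
- Text rendering: Both models understood the text request. Z-Image Turbo produced clear, coherent characters on the neon sign and a convincing visual effect. Qwen-Image-2512, however, was more precise: it spelled the specific string accurately and included the requested punctuation.
- Visual richness: In complex scenes such as the steampunk metropolis, Qwen-Image-2512 packed the vertical frame with a high density of information (textures, background gears). Z-Image Turbo prioritized a balanced composition, often simplifying background elements to keep the focus clear.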
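One way to make the text-rendering comparison repeatable is to transcribe what each neon sign shows (by hand or with OCR) and compare it with the target string from prompt 2. The sketch below assumes you already have such a transcription. `check_rendered_text` is a hypothetical helper written for this illustration, not something the article's authors used.

```python
import unicodedata

TARGET = "[Qwen-Image-2512]"  # Text requested in prompt 2, brackets included.


def normalize(text: str) -> str:
    """Fold compatibility variants (e.g. full-width brackets) and trim spaces."""
    return unicodedata.normalize("NFKC", text).strip()


def alnum_only(text: str) -> str:
    """Keep letters and digits only, lowercased."""
    return "".join(ch for ch in text if ch.isalnum()).lower()


def check_rendered_text(transcribed: str, target: str = TARGET) -> dict:
    """Compare a transcription of the rendered sign with the target string.

    exact_match:      every character matches, punctuation included.
    characters_match: letters and digits match, punctuation ignored.
    """
    got, want = normalize(transcribed), normalize(target)
    return {
        "exact_match": got == want,
        "characters_match": alnum_only(got) == alnum_only(want),
    }
```

A sign that drops the square brackets would pass `characters_match` but fail `exact_match`, which is the distinction the text-rendering bullet draws between coherent characters and full precision on the requested punctuation.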