Z-image reimagine project
A workflow for reimagining movie posters using a Python script, a vision-language model (qwen3-vl-8b), and the Z-Image generation model.
Why it matters
This workflow shows how a vision-language model and an image generation model can be chained to reimagine existing visual content: the first turns an image into a detailed text prompt, and the second renders a new interpretation from that prompt.
Key Points
1. Uses a Python script to scan a directory of movie posters and generate detailed descriptions using a vision-language model (a minimal sketch follows this list)
2. Passes the descriptions to the Z-Image model to generate reimagined versions of the posters
3. Includes tips for improving the results, such as telling the vision-language model to name each character and using a specific KSampler configuration
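As a rough illustration of the first two steps, here is a minimal sketch in Python. It assumes qwen3-vl-8b is served locally behind an OpenAI-compatible chat endpoint; the URL, model name, directory layout, and prompt wording are all placeholder assumptions, not taken from the author's actual script:

```python
import base64
from pathlib import Path

import requests

# Assumption: qwen3-vl-8b is served locally via an OpenAI-compatible
# API (e.g. LM Studio or a llama.cpp server); URL and model name are
# placeholders, not the author's actual setup.
VLM_URL = "http://localhost:1234/v1/chat/completions"
VLM_MODEL = "qwen3-vl-8b"

# Per the author's tip: asking the model to name each character
# helps avoid duplicate faces in the generated image.
PROMPT = (
    "Describe this movie poster in detail for an image generation "
    "model. Name each character in the scene individually."
)

def describe_image(path: Path) -> str:
    """Send one poster to the vision-language model and return its description."""
    b64 = base64.b64encode(path.read_bytes()).decode("ascii")
    payload = {
        "model": VLM_MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": PROMPT},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }
    resp = requests.post(VLM_URL, json=payload, timeout=300)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    for poster in sorted(Path("posters").glob("*.jpg")):
        description = describe_image(poster)
        # Save the description next to the poster for the generation step.
        poster.with_suffix(".txt").write_text(description)
        print(f"{poster.name}: {description[:80]}...")
```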
Details
The 'reimagine' workflow uses a Python script to scan a directory of movie posters (or any other images), generate a detailed description of each image with a vision-language model (qwen3-vl-8b), and pass that description to the Z-Image generation model to create a reimagined version of the poster. The author found that telling the vision-language model to name each character in the scene helps avoid duplicate faces, and that a KSampler configuration with 0.6 denoise and 2x contrast yields more variety from the image model. The author also chose not to use face detailers or upscalers, since they tend to increase skin noise in the generated images.
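For the generation step, one hedged sketch of how the saved descriptions could be handed to a ComfyUI KSampler with the 0.6 denoise the author mentions: export a Z-Image workflow in API format, then patch the prompt text and denoise before queuing it. The node IDs and filenames below are assumptions that must match your own export, and the 2x contrast setting (presumably a separate sampler or post-processing node) is not shown:

```python
import json
from pathlib import Path

import requests

# Assumption: a local ComfyUI instance; workflow.json is a Z-Image
# workflow exported via "Save (API Format)". Node IDs are placeholders.
COMFY_URL = "http://127.0.0.1:8188/prompt"
POSITIVE_NODE = "6"   # CLIPTextEncode node holding the positive prompt
SAMPLER_NODE = "3"    # KSampler node

workflow = json.loads(Path("workflow.json").read_text())

for txt in sorted(Path("posters").glob("*.txt")):
    description = txt.read_text()
    # Inject the VLM's description and the author's 0.6 denoise setting.
    workflow[POSITIVE_NODE]["inputs"]["text"] = description
    workflow[SAMPLER_NODE]["inputs"]["denoise"] = 0.6
    requests.post(COMFY_URL, json={"prompt": workflow}).raise_for_status()
    print(f"queued reimagining for {txt.stem}")
```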