
Add WanImage2ImagePipeline #12368

@vladmandic

Description


Wan-AI/Wan2.2-T2V-A14B, supported via WanPipeline, is not only one of the best video models, it is also widely popular as a text-to-image model
(essentially used to produce a single-frame video) thanks to its very photo-realistic output.

The ask is to extend support by adding a WanImage2ImagePipeline to enable image-to-image workflows.
The work should be limited to setting the timesteps and modifying the prepare_latents method to take input image and strength parameters, encode the input image, and add noise to it, the same as in pretty much any other image-to-image pipeline.
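To illustrate the requested behavior, here is a minimal, hedged sketch of how strength would select the timestep subset, mirroring the pattern used by existing image-to-image pipelines in diffusers (e.g. StableDiffusionImg2ImgPipeline.get_timesteps). The function name and signature are illustrative, not an actual WanPipeline API:

```python
# Hypothetical helper (not existing Wan API): given a full scheduler
# timestep list, keep only the final `strength` fraction of the schedule.
# strength=1.0 denoises from pure noise (ignores the input image);
# strength=0.3 starts from a lightly noised version of the input image.
def get_timesteps(timesteps, num_inference_steps, strength):
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    # Return the truncated schedule and the effective number of steps.
    return timesteps[t_start:], num_inference_steps - t_start
```

prepare_latents would then encode the input image with the VAE and call scheduler.add_noise with the first timestep of the truncated schedule, as in the other image-to-image pipelines.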

Note: this request is different from the input-image support in WanImageToVideoPipeline and WanVACEPipeline since

  • a) those pipelines work with different model weights, so the user cannot switch from text-to-image to image-to-image without a full model reload
  • b) they do not support denoising strength; the image is taken as-is, so in an image-to-image workflow the output would be identical to the input.

Bonus: also create a WanInpaintPipeline to allow image/mask combinations as input.
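For the inpainting bonus, the core addition over image-to-image would be per-step latent blending, as done in the existing inpaint pipelines: keep the (re-noised) original latents outside the mask and the model's prediction inside it. A minimal sketch with an assumed mask convention (1 = regenerate, 0 = keep):

```python
import numpy as np

# Illustrative blending step (not an existing Wan function): combine the
# denoised latents with the original image latents re-noised to the
# current timestep, using the inpaint mask.
def blend_latents(denoised, noised_original, mask):
    # Inside the mask take the model output; outside it, preserve the input.
    return mask * denoised + (1.0 - mask) * noised_original
```

This blend would run once per denoising step inside the pipeline loop, which is the same structure the image/mask pipelines for other model families use.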

cc @yiyixuxu @sayakpaul @a-r-r-o-w @DN6
