Re-CatVTON

Official model weights for "Rethinking Garment Conditioning in Diffusion-based Virtual Try-On".

📄 Paper: Re-CatVTON
💻 Code: GitHub

Available Checkpoints

| Dataset | Subfolder | Resolution |
|---|---|---|
| VITON-HD | `VITON-HD/checkpoint-16000/unet` | 512×384 |
| DressCode | `DressCode/checkpoint-32000/unet` | 512×384 |
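
If you switch between the two checkpoints often, the table can be kept as a small lookup so the subfolder string is never typed by hand. This is only a convenience sketch: the names `CHECKPOINT_SUBFOLDERS` and `unet_subfolder` are hypothetical and not part of the Re-CatVTON code base; the subfolder values are copied from the table.

```python
# Hypothetical helper: maps a dataset name to the UNet subfolder listed in the table.
CHECKPOINT_SUBFOLDERS = {
    "VITON-HD": "VITON-HD/checkpoint-16000/unet",
    "DressCode": "DressCode/checkpoint-32000/unet",
}


def unet_subfolder(dataset: str) -> str:
    """Return the UNet subfolder for a dataset, or raise a readable error."""
    try:
        return CHECKPOINT_SUBFOLDERS[dataset]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINT_SUBFOLDERS))
        raise ValueError(f"Unknown dataset {dataset!r}; expected one of: {known}") from None
```

The returned string can then be passed as the `subfolder=` argument of `UNet2DConditionModel.from_pretrained`.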

Usage

import torch
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from model.pipeline import RECATVTONPipeline
from model.attn_processor import SkipAttnProcessor
from model.utils import init_adapter

device = "cuda"
dtype = torch.bfloat16

# Load components
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").to(device, dtype)

# Choose one:
unet = UNet2DConditionModel.from_pretrained(
    "levinna/Re-CatVTON", 
    subfolder="VITON-HD/checkpoint-16000/unet"  # or "DressCode/checkpoint-32000/unet"
).to(device, dtype)

scheduler = DDPMScheduler.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-inpainting",  # or use the Re-CatVTON scheduler config
    subfolder="scheduler"
)

# Initialize attention processors (disable cross-attention)
init_adapter(unet, cross_attn_cls=SkipAttnProcessor)

# Create pipeline
pipeline = RECATVTONPipeline(vae=vae, unet=unet, scheduler=scheduler)
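
To illustrate what "disable cross-attention" means in the snippet above, here is a minimal sketch of a pass-through attention processor. It assumes that `SkipAttnProcessor` simply returns its input unchanged; that is an assumption about the repository's implementation, not a copy of it, and the class name `SkipProcessorSketch` is hypothetical. The call signature mirrors the usual diffusers attention-processor convention.

```python
class SkipProcessorSketch:
    """Hypothetical stand-in for a 'skip' attention processor.

    Assumption: the cross-attention layer is bypassed by returning the
    incoming hidden states untouched, so no text conditioning is applied.
    """

    def __call__(
        self,
        attn,
        hidden_states,
        encoder_hidden_states=None,
        attention_mask=None,
        temb=None,
        **kwargs,
    ):
        # Ignore encoder_hidden_states entirely and act as an identity layer.
        return hidden_states
```

Under this assumption the UNet needs no text encoder or prompt embeddings, which is consistent with the pipeline above being built from only a VAE, a UNet, and a scheduler.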

See the official GitHub repository for more detailed instructions.

License

This model is licensed under CC BY-NC 4.0 because it was trained on datasets restricted to non-commercial use (VITON-HD, DressCode).

Model Weights: CC-BY-NC 4.0
Code: CC-BY-NC-SA 4.0

Citation

@article{na2025rethinking,
  title={Rethinking Garment Conditioning in Diffusion-based Virtual Try-On},
  author={Na, Kihyun and Choi, Jinyoung and Kim, Injung},
  journal={arXiv preprint arXiv:2511.18775},
  year={2025}
}

Model tree for levinna/Re-CatVTON

Base model: stable-diffusion-v1-5/stable-diffusion-inpainting (this model is a fine-tune of it)