microsoft/distilled_decoding

Add pipeline tag, library name, project page and sample usage

by nielsr HF Staff - opened Oct 27

base: refs/heads/main ← from: refs/pr/1

README.md +29 -1

@@ -1,6 +1,9 @@
 ---
 license: mit
 ---
 # Model Card for Distilled Decoding
 ## Model Details
@@ -34,6 +37,7 @@ We may release the text-to-image distilled decoding models in the future.
 ### Model Sources
 * Repository: https://huggingface.co/microsoft/distilled_decoding
 * Paper: https://arxiv.org/abs/2412.17153
 ### Red Teaming
 Our models generate images based on predefined categories from ImageNet. Some of the ImageNet categories contain sensitive names such as "assault rifle". This test is designed to assess if the model could produce sensitive images from such categories.
@@ -76,6 +80,30 @@ These models are trained to mimic the generation quality of pretrained VAR and L
 ### Recommendations
 While these models are designed to generate images in one step, they also support multi-step sampling to enhance image quality. When one-step sampling quality is not satisfactory, users are recommended to enable multi-step sampling.
 ## How to Get Started with the Model
 Please see the GitHub repo for instructions: https://github.com/microsoft/distilled_decoding
@@ -121,4 +149,4 @@ Overall, the results demonstrate that our Distilled Decoding models are able to
 ## Model Card Contact
 We welcome feedback and collaboration from our audience. If you have suggestions, questions, or observe unexpected/offensive behavior in our technology, please contact us at Zinan Lin, [email protected].
-If the team receives reports of undesired behavior or identifies issues independently, we will update this repository with appropriate mitigations.

 ---
 license: mit
+pipeline_tag: unconditional-image-generation
+library_name: transformers
 ---
 # Model Card for Distilled Decoding
 ## Model Details
 ### Model Sources
 * Repository: https://huggingface.co/microsoft/distilled_decoding
 * Paper: https://arxiv.org/abs/2412.17153
+* Project Page: https://imagination-research.github.io/distilled-decoding
 ### Red Teaming
 Our models generate images based on predefined categories from ImageNet. Some of the ImageNet categories contain sensitive names such as "assault rifle". This test is designed to assess if the model could produce sensitive images from such categories.
 ### Recommendations
 While these models are designed to generate images in one step, they also support multi-step sampling to enhance image quality. When one-step sampling quality is not satisfactory, users are recommended to enable multi-step sampling.
+## Sample Usage
+You can use the `transformers` library to load and generate images with the model:
+```python
+import torch
+from transformers import AutoTokenizer
+from distilled_decoding.models.modeling_var_dd import VAR_DD
+# Load DD model
+model_name = "microsoft/distilled_decoding"
+model = VAR_DD.from_pretrained(
+    model_name,
+    subfolder="VAR-DD-d16",
+    torch_dtype=torch.float16
+).cuda()
+tokenizer = AutoTokenizer.from_pretrained(model_name, subfolder="tokenizer")
+# Generate ImageNet image (class-conditional)
+labels = torch.tensor([207]).cuda()  # ImageNet-1k class 207: golden retriever
+generated_img = model.generate(labels=labels, num_inference_steps=1)
+generated_img.save("golden_retriever.png")
+```
 ## How to Get Started with the Model
 Please see the GitHub repo for instructions: https://github.com/microsoft/distilled_decoding
 ## Model Card Contact
 We welcome feedback and collaboration from our audience. If you have suggestions, questions, or observe unexpected/offensive behavior in our technology, please contact us at Zinan Lin, [email protected].
+If the team receives reports of undesired behavior or identifies issues independently, we will update this repository with appropriate mitigations.
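
The Recommendations section suggests moving from one-step to multi-step sampling when one-step quality is not satisfactory. A minimal sketch of that fallback is shown here, under stated assumptions: `sample_with_fallback`, `generate`, `is_satisfactory`, and the `(1, 2, 4)` step schedule are hypothetical names and values chosen for illustration and are not part of the repository or of this PR.

```python
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


def sample_with_fallback(
    generate: Callable[[int], T],
    is_satisfactory: Callable[[T], bool],
    step_schedule: Iterable[int] = (1, 2, 4),
) -> Tuple[T, int]:
    """Try one-step sampling first, then retry with more steps if needed.

    generate:        hypothetical callable mapping a step count to a sample.
    is_satisfactory: hypothetical user-supplied quality check on a sample.
    step_schedule:   step counts to try in order (an assumed default).
    Returns the last sample produced and the step count that produced it.
    """
    result = None
    used_steps = None
    for steps in step_schedule:
        result = generate(steps)
        used_steps = steps
        if is_satisfactory(result):
            break  # stop at the cheapest setting that passes the check
    if used_steps is None:
        raise ValueError("step_schedule must contain at least one step count")
    return result, used_steps
```

With the sample usage proposed in this PR, `generate` could wrap the model call, for example `lambda n: model.generate(labels=labels, num_inference_steps=n)`. That signature is taken from the PR's snippet and has not been verified against the repository.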