Autoregressive Image Generation with Randomized Parallel Decoding
Paper: arXiv:2503.10568
Haopeng Li¹, Jinyue Yang², Guoqi Li²✉, Huan Wang¹✉
¹ Westlake University, ² Institute of Automation, Chinese Academy of Sciences
ARPG is a novel autoregressive image generation framework capable of performing BERT-style masked modeling with a GPT-style causal architecture.
💪 FID 1.94 · 🚀 Fast Sampling · ♻️ Low Memory Usage · 🎲 Random Order · 💡 Zero-shot Inference
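To make the idea concrete, below is a minimal, illustrative sketch of randomized parallel decoding. It is not the released implementation: the grid size, codebook size, chunking scheme, and the `toy_model` stub are assumptions for illustration only. The intuition it captures is that tokens are generated in a random order, several positions per step, while the model conditions only on the tokens decoded so far.

```python
import torch

# Assumed for illustration: a 16x16 latent grid (256 tokens), a 16384-entry
# codebook, and 64 decoding steps (4 tokens generated in parallel per step).
vocab_size, num_tokens, num_steps = 16384, 256, 64

def toy_model(decoded_tokens, decoded_positions, query_positions):
    # Stand-in for the generator: returns logits for each queried position.
    # In ARPG, a causal transformer attends to the already decoded tokens and
    # predicts the randomly chosen target positions in parallel.
    return torch.randn(len(query_positions), vocab_size)

tokens = torch.full((num_tokens,), -1, dtype=torch.long)  # -1 marks "not decoded yet"
order = torch.randperm(num_tokens)                        # random generation order
chunks = list(order.chunk(num_steps))                     # positions decoded at each step

decoded_positions = order[:0]                             # empty prefix at the start
for query_positions in chunks:
    logits = toy_model(tokens[decoded_positions], decoded_positions, query_positions)
    probs = torch.softmax(logits, dim=-1)
    tokens[query_positions] = torch.multinomial(probs, num_samples=1).squeeze(-1)
    decoded_positions = torch.cat([decoded_positions, query_positions])

# `tokens` now holds a complete grid of codebook indices for the VQ decoder.
```

Because attention stays causal over the decoded prefix, the usual KV-cache machinery of GPT-style models still applies, even though the generation order is randomized rather than raster-scan.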
You can easily load it through the Hugging Face DiffusionPipeline and optionally customize various parameters such as the model type, number of steps, and class labels.
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("hp-l33/ARPG", custom_pipeline="hp-l33/ARPG")

class_labels = [207, 360, 388, 113, 355, 980, 323, 979]

generated_image = pipeline(
    model_type="ARPG-XL",       # choose from 'ARPG-L', 'ARPG-XL', or 'ARPG-XXL'
    seed=0,                     # set a seed for reproducibility
    num_steps=64,               # number of autoregressive steps
    class_labels=class_labels,  # provide valid ImageNet class labels
    cfg_scale=4,                # classifier-free guidance scale
    output_dir="./images",      # directory to save generated images
    cfg_schedule="constant",    # choose between 'constant' (suggested) and 'linear'
    sample_schedule="arccos",   # choose between 'arccos' (suggested) and 'cosine'
)

generated_image.show()
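The `cfg_scale` and `cfg_schedule` arguments control classifier-free guidance during decoding. As a rough sketch of what such guidance typically does (the exact rule and the meaning of the 'linear' schedule inside this pipeline may differ), the model's unconditional and class-conditional logits are mixed at each step, and the schedule decides how the guidance strength varies across steps:

```python
def guided_logits(cond_logits, uncond_logits, scale):
    # Standard classifier-free guidance: push predictions away from the
    # unconditional distribution and toward the class-conditional one.
    return uncond_logits + scale * (cond_logits - uncond_logits)

def scale_at_step(step, num_steps, cfg_scale, schedule="constant"):
    # 'constant' keeps the scale fixed; 'linear' is assumed here (for
    # illustration) to ramp from 1 (no guidance) up to cfg_scale.
    if schedule == "constant":
        return cfg_scale
    return 1.0 + (cfg_scale - 1.0) * step / max(num_steps - 1, 1)
```

Larger values of `cfg_scale` generally trade sample diversity for fidelity to the class label.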
If this work is helpful for your research, please give it a star or cite it:
@article{li2025autoregressive,
  title={Autoregressive Image Generation with Randomized Parallel Decoding},
  author={Haopeng Li and Jinyue Yang and Guoqi Li and Huan Wang},
  journal={arXiv preprint arXiv:2503.10568},
  year={2025}
}
Thanks to LlamaGen for its open-source codebase, to RandAR and RAR for inspiring this work, and to ControlAR.