
Improve model card: Add tags, detailed description, usage, and citation

#1
by nielsr HF Staff - opened

This PR significantly enhances the model card for SpatialThinker by:

  • Adding the pipeline_tag: image-text-to-text to improve discoverability for multimodal vision-language tasks.
  • Adding library_name: transformers, as evidenced by the model's architecture files and the GitHub requirements. This metadata enables the automated "how to use" widget.
  • Expanding the model description with the paper's abstract.
  • Including direct links to the Hugging Face paper page, the project page, and the GitHub repository.
  • Integrating key sections from the GitHub README to provide comprehensive usage information: Requirements, Installation, Training, Merge Checkpoints, Evaluation (with shell command examples), Supported Evaluation Datasets, Citation (BibTeX), and Acknowledgements.
  • Adding the overview image from the GitHub repository.
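
For illustration, the two metadata fields named above live in the YAML front matter at the top of the model card's README.md. The following is a minimal sketch of what that block might look like, assuming only those two keys are set; the helper name `render_front_matter` is hypothetical and is not part of this PR or of any library.

```python
# Hypothetical helper: renders a YAML front matter block like the one at the
# top of a model card README.md. Only the two keys named in this PR are used;
# the real card may carry additional metadata.
def render_front_matter(metadata: dict) -> str:
    lines = ["---"]
    for key, value in metadata.items():
        # Simple "key: value" pairs are enough for flat string metadata.
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n"


card_metadata = {
    "pipeline_tag": "image-text-to-text",
    "library_name": "transformers",
}

front_matter = render_front_matter(card_metadata)
print(front_matter)
```

The Hub reads these keys to decide which task filter the model appears under and which library's usage snippet to show.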

These updates aim to provide a more complete and useful model card for the Hugging Face community.

