---
license: apache-2.0
language:
- en
pipeline_tag: image-text-to-text
tags:
- multimodal
library_name: transformers
base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
---

Check out our arXiv paper and GitHub repo for more details!

GitHub: https://github.com/SunzeY/SEAgent

arXiv: https://arxiv.org/abs/2508.04700