PixelRefer: A Unified Framework for Spatio-Temporal Object Referring with Arbitrary Granularity Paper • 2510.23603 • Published Oct 27, 2025 • 22
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published Oct 9, 2025 • 109
Geometry-Editable and Appearance-Preserving Object Composition Paper • 2505.20914 • Published May 27, 2025 • 6
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World? Paper • 2506.05287 • Published Jun 5, 2025 • 14
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published Dec 31, 2024 • 46