RELIC: Interactive Video World Model with Long-Horizon Memory Paper • 2512.04040 • Published 23 days ago • 23
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published Nov 24 • 60
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20 • 91
OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation Paper • 2511.13655 • Published Nov 17 • 9
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12 • 201
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published Nov 12 • 76
MolmoAct: Action Reasoning Models that can Reason in Space Paper • 2508.07917 • Published Aug 11 • 44
UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning Paper • 2510.20286 • Published Oct 23 • 23
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark Paper • 2510.26802 • Published Oct 30 • 33
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30 • 119
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6 • 118
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper • 2509.16197 • Published Sep 19 • 56
Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence Paper • 2506.15677 • Published Jun 18 • 23
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining Paper • 2508.10975 • Published Aug 14 • 60