DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling Paper • 2512.03000 • Published 22 days ago • 35
Monet: Reasoning in Latent Visual Space Beyond Images and Language Paper • 2511.21395 • Published 29 days ago • 15
VLA-4D: Embedding 4D Awareness into Vision-Language-Action Models for SpatioTemporally Coherent Robotic Manipulation Paper • 2511.17199 • Published Nov 21 • 7
RynnVLA-002: A Unified Vision-Language-Action and World Model Paper • 2511.17502 • Published Nov 21 • 25
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds Paper • 2508.14879 • Published Aug 20 • 68
Kinematify: Open-Vocabulary Synthesis of High-DoF Articulated Objects Paper • 2511.01294 • Published Nov 3 • 13
Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence Paper • 2510.20579 • Published Oct 23 • 55