Waypoint-1 Collection The first real-time diffusion world model designed for consumer hardware • 2 items • Updated 5 days ago • 6
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights Paper • 2512.01816 • Published Dec 1, 2025 • 92
Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems Paper • 2512.24385 • Published 25 days ago • 8
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation Paper • 2512.23705 • Published 26 days ago • 45
OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding Paper • 2512.23646 • Published 26 days ago • 15
Autoregressive Image Generation with Randomized Parallel Decoding Paper • 2503.10568 • Published Mar 13, 2025 • 9
Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View Paper • 2503.12553 • Published Mar 16, 2025 • 8
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps Paper • 2505.18675 • Published May 24, 2025 • 26
DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models Paper • 2411.15024 • Published Nov 22, 2024 • 2
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 253