Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning Paper • 2510.03259 • Published Sep 26 • 57
Temporal Alignment Guidance: On-Manifold Sampling in Diffusion Models Paper • 2510.11057 • Published Oct 13 • 30
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation Paper • 2507.10524 • Published Jul 14 • 70
Reasoning Model is Stubborn: Diagnosing Instruction Overriding in Reasoning Models Paper • 2505.17225 • Published May 22 • 64
Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness Paper • 2505.22960 • Published May 29 • 16
Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness Paper • 2505.22960 • Published May 29 • 16
Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness Paper • 2505.22960 • Published May 29 • 16 • 1
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs Paper • 2503.07067 • Published Mar 10 • 31
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters Paper • 2406.16758 • Published Jun 24, 2024 • 20
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters Paper • 2406.16758 • Published Jun 24, 2024 • 20