Test-Time Scaling in Reasoning Models Is Not Effective for Knowledge-Intensive Tasks Yet • Paper • 2509.06861 • Published Sep 8, 2025