pB09204048

pb09204048

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers 3 months ago

Debunk the Myth of SFT Generalization

Paper • 2510.00237 • Published Sep 30, 2025 • 2

Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs

Paper • 2509.25779 • Published Sep 30, 2025 • 18

upvoted a collection about 1 year ago

Tulu 3 Datasets

All datasets released with Tulu 3 -- state-of-the-art open post-training recipes. • 33 items • Updated 12 days ago • 96