Tianyu Pang

P2333

https://p2333.github.io/

AI & ML interests

Machine Learning

Recent Activity

upvoted a paper about 1 month ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

upvoted a paper 2 months ago

Diffusion Language Models are Super Data Learners

upvoted a paper 3 months ago

Imperceptible Jailbreaking against Large Language Models

Organizations

None yet

upvoted a paper about 1 month ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 96

upvoted a paper 2 months ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5, 2025 • 128

upvoted a paper 3 months ago

Imperceptible Jailbreaking against Large Language Models

Paper • 2510.05025 • Published Oct 6, 2025 • 33

commented on a paper 3 months ago

Imperceptible Jailbreaking against Large Language Models

Paper • 2510.05025 • Published Oct 6, 2025 • 33

upvoted 2 papers 3 months ago

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28, 2025 • 174

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 89

authored 7 papers 3 months ago

LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification

Paper • 2502.17421 • Published Feb 24, 2025

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1, 2025 • 76

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published Sep 26, 2025 • 70

Variational Reasoning for Language Models

Paper • 2509.22637 • Published Sep 26, 2025 • 69

upvoted a paper 3 months ago

Variational Reasoning for Language Models

Paper • 2509.22637 • Published Sep 26, 2025 • 69

commented on a paper 3 months ago

Variational Reasoning for Language Models

Paper • 2509.22637 • Published Sep 26, 2025 • 69

upvoted a paper 3 months ago

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published Sep 26, 2025 • 70

commented on a paper 3 months ago

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published Sep 26, 2025 • 70

upvoted 2 papers 4 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2, 2025 • 83

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1, 2025 • 76

upvoted a collection 5 months ago

Perception Encoder

Collection

17 items • Updated Jul 11, 2025 • 73
