Arabic LLM Checkpoints
Mingzhe Du PRO
AI & ML interests
Code Generation / Preference Alignment
Recent Activity
upvoted a paper about 17 hours ago
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning
upvoted a paper about 17 hours ago
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs
updated a dataset 4 days ago
Elfsong/Qwen3_4B_Arabic_200-responses-Syrian