ThinMQM (automated translation evaluation, MQM) model and data collection.
Runzhe Zhan
rzzhan
Recent Activity
Upvoted a paper about 1 month ago
VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning
Upvoted a paper about 1 month ago
TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models
Upvoted a paper about 1 month ago
P1: Mastering Physics Olympiads with Reinforcement Learning