Paper: A UNIVERSITY-LEVEL BENCHMARK FOR EVALUATING MATHEMATICAL SKILLS IN LLMS
Toloka
Company (verified)
AI & ML interests: Human-expert data for frontier reasoning, safety and agentic AI
Organization Card
Hey, this is Toloka!
Datasets (12)
toloka/vist • Viewer • 39.3k • 79
toloka/VOX-DUB • Viewer • 7.58k • 305 • 10
toloka/JEEM • Viewer • 2.2k • 57 • 11
toloka/beemo • Viewer • 2.19k • 290 • 18
toloka/u-math • Viewer • 1.1k • 179 • 24
toloka/mu-math • Viewer • 1.08k • 68 • 23
toloka/CLESC • Viewer • 500 • 26 • 2
toloka/VoxDIY-RusNews • 426 • 3
toloka/CrowdSpeech • 172 • 5
toloka/crowdkit-datasets • 2.89k