Model Leaderboard

Rankings are based on Elo ratings computed from head-to-head comparisons. A higher Elo rating indicates better performance.
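For readers unfamiliar with Elo, the following is a minimal sketch of a standard Elo update after one head-to-head comparison. The leaderboard does not state its parameters, so the K-factor of 32, the scale of 400, and the function names `expected_score` and `update_elo` are illustrative assumptions, not this leaderboard's actual implementation.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A beats B under the standard Elo logistic model.

    The scale of 400 is the conventional value (an assumption here).
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def update_elo(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one comparison.

    k is the K-factor, which controls how far one result moves a rating.
    The value 32 is an assumption, not taken from the leaderboard.
    """
    # The winner gains more when the win was unexpected (low expected score).
    delta = k * (1.0 - expected_score(winner, loser))
    # The update is zero-sum: the loser gives up exactly what the winner gains.
    return winner + delta, loser - delta


# Two models that both start at 1000: expected score is 0.5, so delta is 16.
new_winner, new_loser = update_elo(1000.0, 1000.0)
print(new_winner, new_loser)  # 1016.0 984.0
```

A smaller K-factor yields smaller rating changes per comparison, so ratings move slowly when only a few comparisons have been recorded.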

Global Rankings

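| Rank | Model | Elo Rating | Win Rate | W / L | Comparisons |
|------|-------|------------|----------|-------|-------------|
| 🥇 | Gemma 3 4B | 1003 | 100.0% | 1 / 0 | 1 |
| 🥈 | GPT-5.2 Low | 1003 | 100.0% | 1 / 0 | 1 |
| 🥉 | Gemini 1.5 Pro | 1000 | 0.0% | 0 / 0 | 0 |
| #4 | GPT-5 Mini | 1000 | 0.0% | 0 / 0 | 0 |
| #5 | Claude Opus | 1000 | 0.0% | 0 / 0 | 0 |
| #6 | Gemini Pro | 1000 | 0.0% | 0 / 0 | 0 |
| #7 | GPT-4o | 1000 | 0.0% | 0 / 0 | 0 |
| #8 | Claude 3.5 Sonnet | 1000 | 0.0% | 0 / 0 | 0 |
| #9 | GPT-5.2 Medium | 995 | 0.0% | 0 / 2 | 2 |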