Which AI draws the best Pokémon card from pure code? Every model gets the same prompt. No pre-built assets. Pure SVG artistry. AI-judged on a 100-point rubric.
| # | Model | Score | Δ | Effort | Duration | Tokens | Date |
|---|---|---|---|---|---|---|---|
| 1 | claude-sonnet-4-5 (max, 4×) | ★ 90 | ▼2 | 🧠 High | 67s | 5.2K | Mar 01, 2026 |
| 2 | claude-opus-4-6 (max, 4×) | ★ 90 | ▼2 | 🧠 High | 285s | 19.5K | Mar 01, 2026 |
| 3 | gpt-5.2 (68×) | ★ 88 | ▼4 | 🧠 Low | 87s | 7.2K | Apr 15, 2026 |
| 4 | glm-5 (3×) | ★ 88 | ▼2 | 🧠 Medium | 93s | 5.8K | Mar 01, 2026 |
| 5 | gemini-3.1-pro-preview (66×) | ★ 87 | ▼5 | 🧠 Low | 43s | 5.6K | Apr 15, 2026 |
| 6 | claude-sonnet-4-6 (max, 4×) | ★ 87 | ▼5 | 🧠 High | 626s | 50.6K | Mar 01, 2026 |
| 7 | kimi-k2.5 (64×) | ★ 86 | ▼6 | 🧠 Low | 459s | 8.9K | Apr 15, 2026 |
| 8 | minimax-m2.1 (60×) | ★ 86 | ▲1 | 🧠 Low | 173s | 9.8K | Apr 14, 2026 |
| 9 | minimax-m2.5 (64×) | ★ 85 | ▼5 | 🧠 Low | 263s | 5.4K | Apr 15, 2026 |
| 10 | gemini-2.5-pro-preview (64×) | ★ 84 | ▼11 | 🧠 Low | 65s | 8.1K | Apr 15, 2026 |
| 11 | gemini-3-flash-preview (68×) | ★ 82 | ▼3 | 🧠 Low | 13s | 3.4K | Apr 15, 2026 |
| 12 | step-3.5-flash:free (41×) | ★ 81 | — | 🧠 Low | 125s | 11.5K | Apr 07, 2026 |
| 13 | deepseek-v3.2 (51×) | ★ 80 | ▼10 | 🧠 Low | 119s | 3.7K | Apr 15, 2026 |
| 14 | grok-4.1-fast (67×) | ★ 79 | ▼11 | 🧠 Low | 24s | 4.8K | Apr 15, 2026 |
| 15 | claude-haiku-4.5 (52×) | ★ 79 | ▼13 | 🧠 Low | 31s | 5.0K | Apr 15, 2026 |
| 16 | gemini-2.5-flash-lite (66×) | ★ 77 | ▼8 | 🧠 Low | 28s | 10.9K | Apr 15, 2026 |
| 17 | trinity-large-preview:free (50×) | ★ 77 | ▼8 | 🧠 Low | 931s | 2.7K | Apr 15, 2026 |
| 18 | gpt-oss-120b (67×) | ★ 75 | ▼10 | 🧠 Low | 37s | 2.5K | Apr 15, 2026 |
| 19 | gemini-2.5-flash (50×) | ★ 75 | ▼17 | 🧠 Low | 35s | 7.3K | Apr 15, 2026 |
| 20 | gemini-2.0-flash-lite-001 (49×) | ★ 73 | ▼19 | 🧠 Low | 19s | 2.9K | Apr 15, 2026 |
| 21 | gpt-5-nano (65×) | ★ 68 | ▼24 | 🧠 Low | 28s | 3.7K | Apr 15, 2026 |
PokéBench is an open-source visual coding benchmark that tests how well AI models generate complex, structured visual output. Unlike text-based benchmarks, it measures spatial reasoning, color theory, typography, and artistic ability, all from a single prompt. The Pokémon card format was chosen because it demands all of these skills at once: illustration, layout, typography, and coherent design.
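To make "a card drawn from pure code" concrete, here is a minimal sketch of a card-shaped SVG built only from primitives (rectangles, a circle, text). It is not the benchmark's prompt, harness, or any model's output; `render_card`, the layout, the colors, and the placeholder creature name are all assumptions made for illustration.

```python
from xml.sax.saxutils import escape


def render_card(name: str, hp: int, accent: str = "#f5c542") -> str:
    """Return a tiny card-shaped SVG document as a string (hypothetical layout)."""
    name = escape(name)  # keep characters like & and < from breaking the markup
    return f"""<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 250 350" width="250" height="350">
  <!-- card frame -->
  <rect x="5" y="5" width="240" height="340" rx="12" fill="{accent}" stroke="#333" stroke-width="4"/>
  <!-- header: name on the left, HP on the right -->
  <text x="20" y="38" font-family="sans-serif" font-size="18" font-weight="bold">{name}</text>
  <text x="230" y="38" font-family="sans-serif" font-size="14" text-anchor="end">HP {hp}</text>
  <!-- illustration window, drawn from primitives only (no external images) -->
  <rect x="20" y="50" width="210" height="140" fill="#fff" stroke="#333"/>
  <circle cx="125" cy="120" r="45" fill="#e74c3c"/>
  <!-- attack text area -->
  <text x="20" y="225" font-family="sans-serif" font-size="13">Spark 20</text>
</svg>"""
```

Writing the result of `render_card("Voltmouse", 60)` to a `.svg` file and opening it in a browser shows the frame, header, and placeholder art. Even this toy version touches layout, typography, and color at once, which is the combination the benchmark is described as scoring.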
Show your model’s PokéBench score. Copy the badge code below and paste it into your README, blog, or docs.
- [claude-sonnet-4-5](https://pokebench.info?model=claude-sonnet-4-5)
- [claude-opus-4-6](https://pokebench.info?model=claude-opus-4-6)
- [openai/gpt-5.2](https://pokebench.info?model=openai/gpt-5.2)
- [z-ai/glm-5](https://pokebench.info?model=z-ai/glm-5)
- [google/gemini-3.1-pro-preview](https://pokebench.info?model=google/gemini-3.1-pro-preview)
- [claude-sonnet-4-6](https://pokebench.info?model=claude-sonnet-4-6)
- [moonshotai/kimi-k2.5](https://pokebench.info?model=moonshotai/kimi-k2.5)
- [minimax/minimax-m2.1](https://pokebench.info?model=minimax/minimax-m2.1)
- [minimax/minimax-m2.5](https://pokebench.info?model=minimax/minimax-m2.5)
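The links above all follow one pattern: the site URL plus a `model` query parameter holding the model slug. A minimal sketch of generating such a link for a README follows. `badge_link` and `badge_markdown` are hypothetical helpers, and the link label is an assumption; the badge image URL does not appear on this page, so the sketch produces a plain Markdown link and no image.

```python
from urllib.parse import quote

BASE_URL = "https://pokebench.info"  # taken from the links on this page


def badge_link(model: str) -> str:
    """Build the per-model link, keeping '/' literal as the page's links do."""
    return f"{BASE_URL}?model={quote(model, safe='/')}"


def badge_markdown(model: str) -> str:
    """Markdown link labelled with the model slug (the label text is an assumption)."""
    return f"[PokéBench: {model}]({badge_link(model)})"
```

For example, `badge_markdown("z-ai/glm-5")` yields a link pointing at `https://pokebench.info?model=z-ai/glm-5`, matching the URL listed for that model.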