RESULTS
Ranked by Pass@1 on GameDevBench. Each score reflects one final effective attempt per task, evaluated by hidden Godot validation scripts.
LEADERBOARD
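SEELE02 rows use the final effective validation results from the Seele report. External rows follow the GameDevBench README-style leaderboard reference. Pass, Fail, and Total counts are listed only for the SEELE02 rows; a dash marks a count that is not listed.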
| Rank | Model | Harness | Feedback | Pass | Fail | Total | Pass@1 |
|---|---|---|---|---|---|---|---|
| 1 | SEELE02-pro | Seele Claw | Final effective run | 209 | 124 | 333 | 62.8% |
| 2 | SEELE02-flash | Seele Claw | Final effective run | 183 | 150 | 333 | 55.0% |
| 3 | gemini-3-pro-preview | Gemini CLI | Screenshot + Video | — | — | — | 53.8% |
| 4 | gpt-5.4 | Codex | Screenshot + Video | — | — | — | 52.0% |
| 5 | gemini-3-flash-preview | Gemini CLI | Video | — | — | — | 46.9% |
| 6 | gpt-5.4-mini | Codex | Video | — | — | — | 43.2% |
| 7 | gpt-5.4-mini | OpenHands | Baseline | — | — | — | 38.4% |
| 8 | claude-sonnet-4-5 | Claude Code | Screenshot + Video | — | — | — | 34.8% |
| 9 | gemini-3-flash-preview | OpenHands | Screenshot + Video | — | — | — | 31.8% |
| 10 | kimi-k2.5 | OpenHands | Screenshot + Video | — | — | — | 20.7% |
| 11 | claude-haiku-4-5 | Claude Code | Video | — | — | — | 18.6% |
| 12 | claude-haiku-4-5 | OpenHands | Screenshot + Video | — | — | — | 17.7% |
| 13 | qwen3.5-397b | OpenHands | Baseline | — | — | — | 5.4% |
WHAT IS GAMEDEVBENCH?
GameDevBench evaluates agents on real Godot projects derived from web and video tutorials. Tasks require edits across scenes, scripts, UI, physics, shaders, TileSets, particles, resources, and runtime behavior. A submission passes only when the Godot validation scripts report a pass.
Official categories cover 2D Graphics & Animation, 3D Graphics & Animation, User Interface, and Gameplay Logic.
Pass@1 counts one final effective attempt per task. Model self-reporting is not used as evidence of success.
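As a quick check of the arithmetic, the sketch below recomputes Pass@1 from the Pass and Fail counts in the two SEELE02 leaderboard rows. The function name `pass_at_1` and the `rows` dictionary are illustrative assumptions, not part of GameDevBench or the Seele report; the only inputs taken from the page are the counts themselves.

```python
def pass_at_1(passed: int, total: int) -> float:
    """Return Pass@1 as a percentage, assuming exactly one attempt per task."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * passed / total


# (pass, fail) counts copied from the SEELE02 rows of the leaderboard.
rows = {
    "SEELE02-pro": (209, 124),
    "SEELE02-flash": (183, 150),
}

for model, (passed, failed) in rows.items():
    total = passed + failed  # 333 tasks in both rows
    print(f"{model}: {passed}/{total} = {pass_at_1(passed, total):.1f}%")
```

With one attempt per task, Pass@1 reduces to the plain pass rate, so the printed values should match the 62.8% and 55.0% shown in the table.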
EXAMPLE TASKS
These examples are condensed from real GameDevBench task_config instructions. They keep the substance of each task while staying readable on the page.
Update projectile behavior so enemy hits advance a QuestManager kill step, remove the projectile on impact, and clean it up after it travels off screen.
Clone a SafePlatform into an UnsafePlatform with an AnimationPlayer that shifts warning colors and disables Area2D processing during the red danger window.
Create a viewport-filling Control scene with launch, pause, and restart panels, exact node names, offsets, labels, buttons, font overrides, CanvasLayer, and script wiring.
Update a Godot shader with border-smoothing uniforms and wire the required ShaderMaterial parameters on the MeshInstance3D while preserving its texture input.
Modify TileSet metadata so six waterfall atlas tiles render above the player with a higher z-index and half-opacity modulation, without changing the map structure.
Place six tight semi-transparent Polygon2D circles over yellow star coins in a platformer image, minimizing spill outside each coin.
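To make the first example above more concrete, here is a minimal engine-agnostic sketch of the projectile logic it describes: a hit advances a kill step, the projectile is removed on impact, and it is cleaned up once it leaves the screen. It is written in Python rather than GDScript, and every name in it (`QuestManager.advance_kill_step`, `Projectile.update`, `HIT_RADIUS`, the one-dimensional positions) is a hypothetical stand-in; the real task's node names, signals, and validation checks are not shown on this page. In a Godot project the removal steps would typically be `queue_free()` calls.

```python
from dataclasses import dataclass
from typing import List, Optional

HIT_RADIUS = 8.0  # assumed collision distance in pixels (hypothetical)


@dataclass
class QuestManager:
    """Hypothetical stand-in for the project's quest tracker."""
    kills_required: int = 3
    kills: int = 0

    def advance_kill_step(self) -> None:
        # Clamp so extra hits never overshoot the quest goal.
        self.kills = min(self.kills + 1, self.kills_required)

    @property
    def complete(self) -> bool:
        return self.kills >= self.kills_required


@dataclass
class Projectile:
    x: float
    speed: float  # pixels per second along the x axis
    alive: bool = True

    def update(self, dt: float, enemy_xs: List[float],
               quests: QuestManager, screen_width: float) -> Optional[int]:
        """Advance one frame; return the index of the enemy hit, if any."""
        if not self.alive:
            return None
        self.x += self.speed * dt
        for i, enemy_x in enumerate(enemy_xs):
            if abs(enemy_x - self.x) <= HIT_RADIUS:
                quests.advance_kill_step()  # the hit advances the quest
                self.alive = False          # removed on impact
                return i
        if self.x < 0 or self.x > screen_width:
            self.alive = False              # cleaned up off screen
        return None
```

The sketch checks overlap only at the end of each step, so a very fast projectile could skip past an enemy between frames; a physics engine's collision callbacks would normally handle that case.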
SOURCES