What One Prompt Built

The question was simple: how far can one prompt go?

Rogue Extract gameplay, enemies swarming, projectiles firing

That’s Rogue Extract. A Vampire Survivors-style roguelite in Godot 4. One prompt produced the entire thing: a playable game, full design documentation, a multi-version art pipeline, and a strategy document that was more honest about the project’s gaps than most human-written postmortems.

Corvus, hooded plague doctor with green glowing eyes, 64x64 pixels

What the prompt produced

The first git commit contained 344 GDScript files, 181 scene files, and 38 resources. Five working weapons (Ice Spear, Tornado, Javelin, Acid Flask, Toxic Cloud), seventeen enemy types with coded behavior, a meta-progression system with gold economy and permanent unlocks, and multi-platform export for both Windows and web.

It also produced a 179-line game design document, a 668-line strategy assessment, and a prioritized work queue. And the art pipeline: four Python scripts totaling over 4,000 lines, the most advanced version running 2,292 lines with reference-driven generation, variant scoring, and auto-deployment into Godot scene files.

The sprite coherence problem

The interesting research wasn’t the game code. It was the art pipeline. A “toxic slime” and a “plague bat” generated separately share nothing: palette, proportions, outline weight, all different. The fix was a strict style prefix with locked hex codes:

STYLE_PREFIX = (
    "16-bit retro indie pixel art, "
    "BOLD shapes, THICK 2-pixel black outlines, HIGH CONTRAST. "
    "Limited 24-color palette: purple-blacks (#0D0B1A, #1A1333), "
    "toxic greens (#2D8B4E, #3EBF68), amber golds (#D4A030, #F0C850), "
    "corrupted reds (#8B2D2D, #BF3E3E), bone whites (#D4C8B0, #E8DCC8). "
)
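One way a prefix like this could be applied is to prepend it to every per-sprite description, so each request carries the same outline and contrast rules. The sketch below is an assumption about that wiring, not the pipeline's actual code: `build_sprite_prompt` and the subject strings are hypothetical, and the prefix is abbreviated.

```python
# Hypothetical helper: prepend one shared style prefix to every sprite request.
STYLE_PREFIX = (
    "16-bit retro indie pixel art, "
    "BOLD shapes, THICK 2-pixel black outlines, HIGH CONTRAST. "
)  # abbreviated; the palette clause of the full prefix is omitted here


def build_sprite_prompt(subject: str, size: int = 64) -> str:
    """Return a prompt that always starts with the locked style rules."""
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")
    return f"{STYLE_PREFIX}Subject: {subject}, {size}x{size} pixels."


# Two enemies generated separately still share the same style instructions.
prompts = [build_sprite_prompt(s) for s in ("toxic slime", "plague bat")]
```

Because the only thing that varies between requests is the subject clause, any drift in palette or outline weight can be traced to the model rather than to the prompt.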

Walk cycles were harder. Four frames of one character with identical proportions and subtle pose changes. Ask Gemini for all four at once and you get four different characters. The pipeline solved this with two-stage generation: create a reference sprite first, then feed it back as a multimodal input with frame-by-frame instructions.

4 frames, two-stage reference-driven generation
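The two-stage idea can be sketched independently of any particular model client. In the example below, `ImageGenerator` is a hypothetical callable that takes a prompt plus an optional reference image and returns image bytes; a real pipeline would wrap its model call behind it. The frame instructions are illustrative, not the pipeline's actual wording.

```python
from typing import Callable, List, Optional

# Hypothetical signature: (prompt, optional reference image bytes) -> image bytes.
ImageGenerator = Callable[[str, Optional[bytes]], bytes]

FRAME_INSTRUCTIONS = [
    "Frame 1: neutral standing pose.",
    "Frame 2: left foot forward.",
    "Frame 3: neutral standing pose, body raised one pixel.",
    "Frame 4: right foot forward.",
]  # illustrative pose notes only


def generate_walk_cycle(character: str, generate: ImageGenerator) -> List[bytes]:
    """Generate four walk frames that all derive from one reference sprite."""
    # Stage 1: create a single reference sprite with no image input.
    reference = generate(f"Reference sprite of {character}, facing right.", None)

    # Stage 2: request each frame separately, always passing the same reference.
    frames = []
    for instruction in FRAME_INSTRUCTIONS:
        prompt = (
            f"Redraw the attached {character} with identical proportions, "
            f"palette, and outline. {instruction}"
        )
        frames.append(generate(prompt, reference))
    return frames
```

Requesting frames one at a time against a fixed reference trades extra calls for consistency: every frame is anchored to the same image instead of to the model's fresh interpretation of the text.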

The honest assessment

The strategy doc is the most interesting artifact. It didn’t just plan. It graded itself:

Area	Designed	Built	Gap
Run length	15 minutes	5 minutes	67% short
Characters	6	1	No variety
Enemy art	17 coded	6 with art	11 on placeholders
Sound assets	47 needed	~12	74% missing
Behavior AI	Beehave (installed)	Unused	All basic vector math

Those numbers came from the same prompt that built the game. The system that generated 344 scripts also generated a document explaining exactly where those scripts fall short. “Every run feels identical,” it wrote. “Same arena, same enemy sequence, same weapon options, one character. Zero run variety.”
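The percentages in the gap column follow from the designed and built figures. A quick check, treating the "~12" sound assets as exactly 12 (an assumption) and using a hypothetical helper name:

```python
def shortfall_percent(designed: float, built: float) -> float:
    """Share of the designed amount that is still missing, in percent."""
    return (designed - built) / designed * 100


run_length_gap = shortfall_percent(15, 5)   # minutes designed vs. built
sound_gap = shortfall_percent(47, 12)       # treats "~12" as exactly 12
enemies_on_placeholders = 17 - 6            # coded enemies minus those with art

print(f"{run_length_gap:.0f}% short, {sound_gap:.0f}% missing, "
      f"{enemies_on_placeholders} on placeholders")
```

The rounded results match the table: 67% short, 74% missing, 11 on placeholders.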

What broke

The Amalgam boss, rendered as a white rectangle: the damage flash shader gets stuck.

Green fringe bleeds through the chromakey removal. Eleven of seventeen enemies still run on placeholder art. The player character vanishes into its own floor tiles (dark purple on dark purple). The 519% CPU crash got fixed with 135 lines of GDScript that throttle spawning when the frame rate drops below 30 FPS and cull the farthest enemies when the count exceeds 200. The white flash shader remains broken.
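The throttle-and-cull logic is simple enough to sketch. The version below is a Python illustration of the idea, not the project's GDScript: the function names are hypothetical, and "throttle" is read here as pausing spawns entirely, which is only one possible interpretation. The two thresholds are the ones quoted above.

```python
import math
from typing import List, Tuple

MIN_FPS = 30        # threshold from the text
MAX_ENEMIES = 200   # threshold from the text

Position = Tuple[float, float]


def should_spawn(current_fps: float) -> bool:
    """Allow spawning only while the frame rate is at or above the threshold."""
    return current_fps >= MIN_FPS


def cull_farthest(enemies: List[Position], player: Position) -> List[Position]:
    """Keep at most MAX_ENEMIES, dropping those farthest from the player."""
    if len(enemies) <= MAX_ENEMIES:
        return list(enemies)
    # Sort nearest-first so the slice discards the most distant enemies.
    by_distance = sorted(enemies, key=lambda e: math.dist(e, player))
    return by_distance[:MAX_ENEMIES]
```

Culling by distance removes the enemies the player is least likely to notice, which keeps the on-screen swarm intact while capping the per-frame work.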

What happened next

The commit went in at 11:42 PM. By 11:44 PM, six art improvement iterations had run. By 1:31 AM, ten gameplay iterations had completed: bug fixes, balance tuning, weapon adjustments. The overnight automation loop took over and has been running nightly since, each cycle pulling from the work queue, testing, committing if stable, rolling back if not.
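The shape of that loop (pull, test, commit or roll back) can be sketched with the side effects injected as callables. Everything named here is hypothetical, including the assumption that one cycle drains the whole queue; the actual loop may take one task per night.

```python
from collections import deque
from typing import Callable, Deque, List


def run_cycle(
    queue: Deque[str],
    apply_task: Callable[[str], None],
    is_stable: Callable[[], bool],
    commit: Callable[[str], None],
    rollback: Callable[[], None],
) -> List[str]:
    """Process every queued task once; return the tasks that were committed."""
    committed = []
    while queue:
        task = queue.popleft()   # pull the next item from the work queue
        apply_task(task)         # make the change
        if is_stable():          # run the tests
            commit(task)
            committed.append(task)
        else:
            rollback()           # discard the change and move on
    return committed
```

Rolling back on failure means a bad iteration costs one cycle of work rather than corrupting the next one.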

The prompt produced a playable game. The automation loop is trying to produce a good one.

344 GDScript files

181 Scenes

17 Enemies

5 Weapons

2,292 Lines (art pipeline v4)