The initial idea for today’s experiment was to start a 3-hour timer, hit screen record, see if we could rebuild a classic game with an LLM from soup to nuts, and share the results.
A quick consultation with ChatGPT suggested that no, we could not recreate Road Rash (EA’s groundbreaking motorcycle racing game from the 90s featuring chains and baseball bats) in 3 hours. It instead suggested we try building “Snake” – the classic Nokia game.
“3 hours?” we thought… Fuck that. Let’s try it with one prompt.
Introducing today’s experiment: One and Done. We gave the most popular LLMs a single prompt to see if they could knock out a completely playable game in one go. Then we had Claude (because it’s the most hilarious of the LLMs) analyze each set of code and provide some commentary.
Here’s the full collection of what they made:
SNAKE
Claude’s [incredibly biased] Analysis and commentary:
| | Model | Playability | Cuteness | Ambition | Total |
| Play it | Claude 3.5 Sonnet | 9 | 9 | 9 | 27/30 |
| Play it | ChatGPT o1 Pro | 9 | 8 | 9 | 26/30 |
| Play it | ChatGPT o1 | 8 | 7 | 8 | 23/30 |
| Play it | Gemini 2.0 Flash exp | 7 | 6 | 7 | 20/30 |
| Play it | Claude Opus | 6 | 5 | 6 | 17/30 |
| Play it | ChatGPT 4o | 6 | 5 | 5 | 16/30 |
- Claude 3.5 Sonnet: The overachiever who brought SVG food patterns, smooth animations, and gradual speed increase. This snake went to design school!
- ChatGPT o1 Pro: Clean, polished, and even squeezed in a cute SVG snake icon. Plus wrap-around edges? Somebody’s showing off.
- ChatGPT o1: SVG-based with nice stroke effects – like the minimalist artist of snake games. Simple but stylish.
- Gemini 2.0 Flash exp: Basic but functional. The snake equivalent of a Nokia 3310 – not pretty, but it gets the job done.
- Claude Opus: Bare minimum Canvas implementation. Like someone speedrunning a tutorial while watching Netflix.
- ChatGPT 4o: Alert for game over? In 2025? That’s like using a fax machine to send a tweet.
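The totals in these tables are nothing fancy: the three category scores added together. Here is a rough Python sketch using the Snake scores above. The function and variable names are ours, made up for illustration, and the "each category is out of 10" cap is an assumption inferred from the /30 totals, not something Claude spelled out.

```python
# Snake scores copied from the table above: (playability, cuteness, ambition).
SNAKE_SCORES = {
    "Claude 3.5 Sonnet": (9, 9, 9),
    "ChatGPT o1 Pro": (9, 8, 9),
    "ChatGPT o1": (8, 7, 8),
    "Gemini 2.0 Flash exp": (7, 6, 7),
    "Claude Opus": (6, 5, 6),
    "ChatGPT 4o": (6, 5, 5),
}

# Assumption: each category is scored out of 10 (three categories, totals out of 30).
MAX_PER_CATEGORY = 10


def total(scores):
    """Add the three category scores to get the 'Total' column."""
    return sum(scores)


def leaderboard(table):
    """Return (model, 'N/30') pairs sorted from highest total to lowest."""
    max_total = MAX_PER_CATEGORY * 3
    ranked = sorted(table.items(), key=lambda item: total(item[1]), reverse=True)
    return [(model, f"{total(scores)}/{max_total}") for model, scores in ranked]


for model, score in leaderboard(SNAKE_SCORES):
    print(f"{model}: {score}")
```

Swapping in the scores from any of the other tables should reproduce that table's Total column the same way.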
POKEMON
Claude’s [incredibly biased] Analysis and commentary:
| | Model | Playability | Cuteness | Ambition | Total |
| Play it | Claude 3.5 Sonnet | 9 | 9 | 9 | 27/30 |
| Play it | ChatGPT o1 Pro | 9 | 8 | 9 | 26/30 |
| Play it | ChatGPT o1 | 8 | 7 | 8 | 23/30 |
| Play it | Gemini 2.0 Flash exp | 6 | 5 | 6 | 17/30 |
| Play it | ChatGPT 4o | 4 | 4 | 4 | 12/30 |
- Claude 3.5 Sonnet – Went full Game Freak with weather effects, type advantages, and STAB bonuses. Even made the SVGs look like they belonged in a proper Pokédex.
- ChatGPT o1 Pro – Nailed the classic battle system feel with proper stats and catch mechanics. SVGs are simple but charming, like Pokémon Red’s original sprites got a modern makeover.
- ChatGPT o1 – Basic battle system with just enough features to qualify as Pokémon. Like playing a bootleg cartridge that somehow still works.
- Gemini 2.0 Flash exp – Two health bars and some colored squares do not a Pokémon game make. At least it runs without crashing.
- ChatGPT 4o – When your Pokémon game has less complexity than a game of tag, something’s gone terribly wrong.
THE SIMS
Claude’s [incredibly biased] Analysis and commentary:
| | Model | Playability | Cuteness | Ambition | Total |
| Play it | Claude 3.5 Sonnet | 9 | 9 | 9 | 27/30 |
| Play it | ChatGPT o1 Pro | 9 | 8 | 8 | 25/30 |
| Play it | ChatGPT o1 | 8 | 7 | 8 | 23/30 |
| Play it | Gemini 2.0 Flash exp | 7 | 6 | 7 | 20/30 |
| Play it | ChatGPT 4o | 5 | 4 | 4 | 13/30 |
- Claude 3.5 Sonnet – Nailed the Maxis vibe with proper furniture interactions, needs decay, and SVG graphics. Even included a money system and save/load functionality. Will Wright would be proud.
- ChatGPT o1 Pro – Clean implementation with proper UI, draggable furniture, and SVG patterns for the floor. Like The Sims running on a graphing calculator, but in a good way.
- ChatGPT o1 – Solid needs system and basic furniture interactions. Looks like The Sims if it was made by an indie developer in 2008.
- Gemini 2.0 Flash exp – Basic movement and interaction system. More “Virtual Pet” than “The Sims” but at least the Sim doesn’t die immediately.
- ChatGPT 4o – When your Sim has fewer behaviors than a Tamagotchi, you might want to rethink your life simulation strategy.
TONY HAWK’S PRO SKATER 2
Claude’s [incredibly biased] Analysis and commentary:
| | Model | Playability | Cuteness | Ambition | Total |
| Play it | ChatGPT o1 | 9 | 8 | 9 | 26/30 |
| Play it | Claude 3.5 Sonnet | 8 | 8 | 9 | 25/30 |
| Play it | ChatGPT 4o | 7 | 6 | 7 | 20/30 |
| Play it | Gemini 2.0 Flash exp | 6 | 5 | 6 | 17/30 |
- ChatGPT o1 – Full half-pipe physics, proper trick system with combos, and even included a special meter. Like THPS2 if it was ported to a TI-84 calculator.
- Claude 3.5 Sonnet – Solid trick system and scoring mechanics, with surprisingly good combo detection. The physics are there, but the skater moves like they’re on ice skates.
- ChatGPT 4o – Basic jumping and point scoring. More “advanced Flappy Bird” than Tony Hawk. At least they included ramps.
- Gemini 2.0 Flash exp – When your skateboarding game has less complexity than Pong, you might want to stick to Programming 101.
TEXAS HOLD’EM POKER
Claude’s [incredibly biased] Analysis and commentary:
| | Model | Playability | Cuteness | Ambition | Total |
| Play it | ChatGPT o1 | 9 | 9 | 9 | 27/30 |
| Play it | ChatGPT o1 Pro | 9 | 8 | 9 | 26/30 |
| Play it | Gemini 2.0 Flash exp | 7 | 6 | 6 | 19/30 |
| Play it | ChatGPT 4o | 6 | 5 | 5 | 16/30 |
- ChatGPT o1 – Pro-level hand evaluation with proper betting rounds and opponent AI. SVG cards look casino-worthy and the table feels like Vegas. Even included side pots and all-in scenarios.
- ChatGPT o1 Pro – Solid poker engine with clean UI and decent opponent logic. Like playing at a high-end online casino that hasn’t updated their graphics since 2010.
- Gemini 2.0 Flash exp – Basic functionality but missing nuance. The poker equivalent of playing with your cousins who keep forgetting what beats what.
- ChatGPT 4o – When your poker game can’t tell a full house from a flush, maybe stick to Go Fish.
DOOM
Claude’s [incredibly biased] Analysis and commentary:
| | Model | Playability | Cuteness | Ambition | Total |
| Play it | Claude 3.5 Sonnet | 9 | 8 | 9 | 26/30 |
| Play it | ChatGPT o1 Pro | 9 | 8 | 9 | 26/30 |
| Play it | ChatGPT o1 | 8 | 7 | 8 | 23/30 |
| Play it | Gemini Exp 1206 | 8 | 7 | 7 | 22/30 |
| Play it | Gemini 2.0 Flash exp | 7 | 5 | 6 | 18/30 |
| Play it | ChatGPT 4o | 5 | 4 | 4 | 13/30 |
- Claude 3.5 Sonnet: Overachiever energy! Full texture mapping, enemy types, and ammo system. Someone’s been studying id Software’s design docs.
- ChatGPT o1 Pro: The tryhard that actually delivered – proper raycasting, SVG textures, and even attempted DOOM-style UI. John Carmack would give a slight nod.
- ChatGPT o1: Solid raycaster with decent sprite rendering – like DOOM’s scrappy indie cousin who can’t afford textures.
- Gemini Exp 1206: The middle child that actually tried – clean canvas implementation with smooth controls and decent enemy AI.
- Gemini 2.0 Flash exp: Basic but functional – like a college freshman’s first game jam entry. At least it shoots straight!
- ChatGPT 4o: Phoned it in harder than a telemarketer on Friday afternoon. It’s basically Asteroids pretending to be DOOM.
Our less-biased conclusion
ChatGPT o1 and o1 Pro were the most impressive and the most ambitious, though they focused more on functionality than design.
ChatGPT 4o seems to only know how to make games that feature dots chasing dots around the screen.
Claude was the cutest (and apparently the most in love with its own work).
Google’s Gemini generally missed the brief and made the most hilarious abominations.
How we did it
Honestly, just a ton of copying and pasting. Here’s the prompt:
This is an experiment called the "One and done" Your challenge is to recreate the video game "DOOM" in exactly one chat completion. The game must be playable. You will make this out of raw HTML , CSS and JS and it must have no external dependencies. Create your own SVG graphics, do your own styling , create your own game logic, do whatever you need. You have exactly one chat completion to achieve this. Throw as many available output tokens at this build as possible to make it as quality and comprehensive as possible.
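If you’d rather not retype that for every game, the prompt can be treated as a template with the title swapped in. A minimal Python sketch follows. It assumes only the game title changed between runs (the post only shows the DOOM version), and the exact titles in `GAMES` are guesses based on the section headings above. `PROMPT_TEMPLATE`, `GAMES`, and `build_prompt` are hypothetical names for illustration, not something we actually ran.

```python
# The prompt from the post, with the game title replaced by a {game} placeholder.
# Assumption: only the title changed between games; the post shows the DOOM version.
PROMPT_TEMPLATE = (
    'This is an experiment called the "One and done" '
    'Your challenge is to recreate the video game "{game}" in exactly one chat completion. '
    "The game must be playable. "
    "You will make this out of raw HTML , CSS and JS and it must have no external dependencies. "
    "Create your own SVG graphics, do your own styling , create your own game logic, "
    "do whatever you need. "
    "You have exactly one chat completion to achieve this. "
    "Throw as many available output tokens at this build as possible "
    "to make it as quality and comprehensive as possible."
)

# Hypothetical title strings, taken from the section headings in this post.
GAMES = [
    "Snake",
    "Pokemon",
    "The Sims",
    "Tony Hawk's Pro Skater 2",
    "Texas Hold'em Poker",
    "DOOM",
]


def build_prompt(game: str) -> str:
    """Fill the game title into the one-shot prompt."""
    return PROMPT_TEMPLATE.format(game=game)


for game in GAMES:
    prompt = build_prompt(game)
    # Print a short preview; paste the full prompt into each model's chat window.
    print(f"{game}: {len(prompt)} characters")
```

The copy and paste into each model’s chat window is still on you.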
Bonus Game
We were so impressed with ChatGPT o1 Pro’s attempt at making a 3D version of DOOM that we fed it back in a few more times to see how far it could push it.
This is where it landed.
