Founder Mode
A Main Sequence Experiment. LLM-driven hill climbing. 1.6% to 100% win rate.
100% win rate
$0.29 to train
How we built this
Built in a single Claude Code session. We wrote a game engine in Python, then let an AI agent modify the strategy code autonomously using hill climbing: the LLM reads the current code and proposes a change, the system plays 50+ games to evaluate it, then keeps improvements and reverts regressions. No human in the loop. The agent iterated 67 times across multiple models, using statistical significance testing (Wilson confidence intervals, chi-squared tests) to separate real gains from noise. The final breakthrough was a veto lookahead: simulate 12 ticks ahead and reject any move that leads to death. That took the win rate from 98% to 100%. Everything you see runs in your browser at zero cost.
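The keep-or-revert loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: `hill_climb`, `toy_propose`, and `toy_evaluate` are hypothetical names, the random parameter nudge stands in for the LLM's proposed code change, and the simulated win probability stands in for the real game engine.

```python
import random
from typing import Callable, List, Tuple


def hill_climb(
    initial: dict,
    propose: Callable[[dict, random.Random], dict],
    evaluate: Callable[[dict, random.Random], float],
    iterations: int,
    seed: int = 0,
) -> Tuple[dict, float, List[float]]:
    """Keep a candidate only if it scores higher than the current best."""
    rng = random.Random(seed)
    best = dict(initial)
    best_score = evaluate(best, rng)
    history = [best_score]
    for _ in range(iterations):
        candidate = propose(best, rng)      # stand-in for "LLM proposes a change"
        score = evaluate(candidate, rng)    # stand-in for "play a batch of games"
        if score > best_score:
            best, best_score = candidate, score  # keep the improvement
        # otherwise the candidate is discarded, which is the "revert" step
        history.append(best_score)
    return best, best_score, history


def toy_propose(strategy: dict, rng: random.Random) -> dict:
    """Hypothetical proposal: nudge one numeric parameter at random."""
    candidate = dict(strategy)
    candidate["fire_threshold"] += rng.uniform(-0.1, 0.1)
    return candidate


def toy_evaluate(strategy: dict, rng: random.Random, games: int = 50) -> float:
    """Hypothetical evaluation: win rate over simulated games.

    The win probability peaks when fire_threshold is 0.7; this shape is
    invented purely so the example has something to climb.
    """
    p_win = max(0.0, 1.0 - abs(strategy["fire_threshold"] - 0.7))
    wins = sum(rng.random() < p_win for _ in range(games))
    return wins / games


if __name__ == "__main__":
    best, best_score, history = hill_climb(
        {"fire_threshold": 0.2}, toy_propose, toy_evaluate, iterations=67
    )
    print(f"best strategy: {best}, win rate: {best_score:.0%}")
```

Because each evaluation is a noisy sample, a plain `score > best_score` test can accept a lucky run as a real improvement. That is presumably why the page pairs the loop with significance testing.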
Model testing
We raced 15 models from 7 companies across 4 countries: US (Meta Llama, OpenAI GPT, Google Gemma, xAI Grok), China (DeepSeek, Alibaba Qwen, Zhipu GLM, MiniMax), France (Mistral), and Singapore (SEA-LION). Models ran on Groq, Cloudflare Workers AI (free), OpenAI, xAI, and OpenRouter. 8 models hit 100% win rate. 4 completely failed to produce valid code. The cheapest path: Cloudflare Workers AI at $0.00. Total spend across all experiments: $0.29.

Inspired by karpathy/autoresearch
We're Hiring
Main Sequence Ventures is hiring AI-aware Investment Managers and Evangelists for our deep tech portfolio companies.
Hosted on Cloudflare Pages (static, free). Global leaderboard on Cloudflare Workers + D1 (free). Game engine, AI strategy, charts, and drone mode all run client-side in JavaScript. Zero server compute per visitor.
Built by @mikenicholls88 | LinkedIn

Charts: Win Rate | Accuracy (Hits/Shots) | Shots Per Game | Speed (Ticks)

Session (live counters): Games | Wins | Win Rate | Score | Killed | Accuracy | Shots | Hits | Ticks


Optimization Journey

67 iterations | 1.6% → 100% | Total LLM cost: $0.29

Leaderboard — Who Contributed What?

Rank  Agent                    Contribution  Peak WR  Cost   Iters
1     Human + Claude Code      +70.0 pp      94%      $0     5
2     Baseline heuristic       +24.0 pp      24%      $0     1
3     Llama 3.3 70B (Groq)     +4.0 pp       98%      $0.22  57
4     Veto lookahead (Claude)  +2.0 pp       100%     $0     1
pp = percentage points of win rate improvement.
Human + Claude wrote the strategy shape (targeting, dodging, column clearing), which was the biggest leap.
Llama 70B autonomously refined parameters over 57 iterations for $0.22.
The final 2 pp came from a veto lookahead (simulate 12 ticks, reject lethal moves): pure insight at $0 LLM cost.
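A veto lookahead of this kind might look like the sketch below. It is an illustration under loud assumptions, not the project's implementation: the grid model (`State`, `step`), bullets that fall exactly one row per tick, and the rule that the player holds position after the candidate move are all simplifications invented here. Only the 12-tick horizon and the "reject lethal moves" idea come from the page.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

HORIZON = 12  # ticks to simulate, per the page's description


@dataclass(frozen=True)
class State:
    player_x: int
    bullets: Tuple[Tuple[int, int], ...]  # (column, rows above the player)
    width: int = 10                       # hypothetical playfield width


def step(state: State, move: int) -> Tuple[State, bool]:
    """Advance one tick: the player moves by -1/0/+1, bullets fall one row.

    Returns the next state and whether the player is still alive.
    """
    x = min(max(state.player_x + move, 0), state.width - 1)
    fallen = tuple((bx, by - 1) for bx, by in state.bullets)
    alive = not any(bx == x and by == 0 for bx, by in fallen)
    remaining = tuple(b for b in fallen if b[1] > 0)  # drop bullets that landed
    return State(x, remaining, state.width), alive


def is_lethal(state: State, first_move: int, horizon: int = HORIZON) -> bool:
    """True if taking first_move leads to death within the horizon.

    Assumption: after the first move the player holds position, which makes
    the check conservative compared with a full search over later moves.
    """
    state, alive = step(state, first_move)
    if not alive:
        return True
    for _ in range(horizon - 1):
        state, alive = step(state, 0)
        if not alive:
            return True
    return False


def choose_move(state: State, ranked_moves: Sequence[int]) -> int:
    """Return the strategy's most preferred move that survives the veto."""
    for move in ranked_moves:
        if not is_lethal(state, move):
            return move
    return ranked_moves[0]  # everything vetoed: fall back to the top choice


if __name__ == "__main__":
    # A bullet sits three rows above the player; the strategy prefers to stay.
    state = State(player_x=5, bullets=((5, 3),))
    print(choose_move(state, ranked_moves=[0, -1, 1]))
```

The veto sits on top of the existing strategy: the strategy still ranks moves, and the lookahead only removes options that the simulation says are fatal.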

Breakthroughs

Iter  Agent         WR    What changed
1     Baseline      24%   Naive: move toward invader, shoot
2     Human+Claude  78%   Bottom-targeting, trajectory dodge
3     Human+Claude  92%   Multi-bullet dodge, edge columns
5     Human+Claude  94%   Wider fire threshold, faster aim
13    Llama 70B     96%   Autonomous parameter refinement
45    Llama 70B     98%   Adaptive fire cooldown, tracking
67    Veto system   100%  12-tick death simulation veto
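The later steps in this table are only 2 pp apart, which is where the Wilson confidence intervals mentioned on the page matter. The sketch below computes a Wilson score interval from a win count. The function names are hypothetical, and the sample size of 50 games is an assumption for illustration (the page only says "50+"), as is the 95% level (z = 1.96).

```python
import math
from typing import Tuple


def wilson_interval(wins: int, games: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a win rate (z = 1.96 gives roughly 95%)."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def intervals_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if two (low, high) intervals share any value."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # Assumed sample size: 50 games per evaluation.
    for wins in (48, 49, 50):
        low, high = wilson_interval(wins, 50)
        print(f"{wins}/50 wins: {low:.3f} to {high:.3f}")
    print(intervals_overlap(wilson_interval(48, 50), wilson_interval(49, 50)))
```

At a sample size like this, intervals for neighbouring win rates overlap heavily, so a single batch cannot confirm a 2 pp gain on its own. Larger batches or repeated evaluations narrow the interval.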


Human Leaderboard

Click PLAY to try — Arrow keys + Space