March 27, 2026 · 12 min read

How We Built 23 Websites in 48 Hours with AI Agents

This is not a thought experiment. This is not a demo. In March 2026, a single developer and a team of eight Claude Code agents built and deployed 23 production utility websites in 48 hours. Here is exactly how we did it.

The Problem We Wanted to Solve

The internet is full of utility websites that are either plastered with ads, painfully slow, or both. Try to calculate your paycheck withholding? You get a page that takes 8 seconds to load, shows you 4 interstitial ads, and then gives you an answer you are not sure is correct.

We wanted to build a different kind of utility web. Fast sites. Clean interfaces. Accurate calculations. No dark patterns. But building one site at a time was too slow. The opportunity is in scale — covering dozens of niches where people need quick, accurate tools.

The question was: could AI agents do the heavy lifting?

The Architecture: Karpathy's Autoresearch, Applied

Andrej Karpathy's autoresearch project demonstrated something powerful: AI systems that can propose hypotheses, run experiments, analyze results, and iterate. The key insight is the ratchet mechanism — a system that can only move forward. Changes are kept if metrics improve and reverted if they do not.
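The ratchet described above can be sketched in a few lines. This is an illustrative sketch, not code from the project; the names (`ratchet`, `apply`, `revert`, the score callback) are hypothetical.

```typescript
// A minimal ratchet: a change survives only if the overall score does
// not decrease. All names here are illustrative, not from the codebase.
type Change = { id: string; apply: () => void; revert: () => void };

function ratchet(score: () => number, change: Change): boolean {
  const before = score();
  change.apply();
  const after = score();
  if (after < before) {
    change.revert(); // regression: roll back, keep the old state
    return false;
  }
  return true; // improvement (or neutral): keep it
}
```

The constraint is asymmetric by design: a neutral change is kept, a regression is never kept, so the measured score can only move forward over time.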

We applied this pattern to an entire business operation. Here is the architecture:

  1. One orchestration repo (traffic-empire/) containing all agent instructions, the registry, and shared code.
  2. Eight specialized agents, each with its own CLAUDE.md instruction file defining its role, inputs, outputs, and success criteria.
  3. A shared base-site package with components, analytics, styles, and GEO handlers that every site inherits.
  4. Individual site repos as git submodules, each deployed independently to Netlify.
  5. A registry (registry.json) as the single source of truth for all site states, scores, and configuration.
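To make the registry concrete, here is a plausible shape for its entries. The field names are assumptions inferred from what the post describes (states, scores, configuration), not the project's actual schema.

```typescript
// Hypothetical shape of a registry.json entry — field names are
// assumptions, not the real schema.
interface SiteEntry {
  id: string;                  // e.g. "calcfit"
  domain: string;              // e.g. "calcfit.thicket.sh"
  state: "building" | "live" | "improving" | "deprecated";
  healthScore: number;         // 0-100, maintained by the Auditor
  niche: string;
  repo: string;                // git submodule path
}

interface Registry {
  updatedAt: string;           // ISO timestamp of the last cycle
  sites: SiteEntry[];
}

// The portfolio score the ratchet protects could be as simple as the
// mean health of all non-deprecated sites.
function portfolioScore(r: Registry): number {
  const live = r.sites.filter((s) => s.state !== "deprecated");
  return live.reduce((sum, s) => sum + s.healthScore, 0) / Math.max(live.length, 1);
}
```

Because the registry is a git-committed JSON file, every state transition is diffable and attributable to a specific agent run.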

The CEO agent orchestrates weekly cycles. It reads the registry, checks agent reports, and makes build/improve/deprecate decisions. It does not guess — it looks at health scores, traffic data, and the auditor's recommendations.
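The CEO's per-site decision rule might look like the sketch below. The thresholds and decision labels are assumptions for illustration; the post does not publish the real policy.

```typescript
// Illustrative CEO decision rule. Thresholds are hypothetical.
type Decision = "build" | "improve" | "deprecate" | "hold";

function decide(site: { healthScore: number; weeklyVisits: number } | null): Decision {
  if (site === null) return "build";                                 // approved niche, no site yet
  if (site.healthScore < 30 && site.weeklyVisits < 10) return "deprecate"; // failing and unvisited
  if (site.healthScore < 70) return "improve";                       // live but below the bar
  return "hold";                                                     // healthy: leave it alone
}
```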

Hour by Hour: What Actually Happened

Hours 0-4: Foundation

Yonatan wrote the master CLAUDE.md — the project manifesto. He defined the agent team, the cycle protocol, the ratchet mechanism, and the evaluation contract. This is the constitution. The agents cannot modify the evaluation contract. Only the human can.

He then created the base-site package: a shared Next.js foundation with components for layouts, analytics (GA4), cookie consent, JSON-LD generation, and GEO endpoints. Every site would inherit this base through a git submodule, ensuring consistency without duplication.

Hours 4-12: Research and Design

The Research agent scanned for high-value niches. It scored each opportunity on three dimensions: search volume (how many people need this tool?), competition strength (how good are existing solutions?), and monetization potential (can this generate revenue?).

The first batch: fitness calculators, paycheck calculators, AI tools directory, image conversion tools, text utilities, color tools, and a VPN comparison directory. Each scored above 75 on the Research agent's scoring rubric.
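A weighted rubric over those three dimensions could look like this. The weights and sub-scores are assumptions; the post only states that each approved niche scored above 75 overall.

```typescript
// Hypothetical Research-agent rubric. Weights are assumptions.
interface NicheScores {
  searchVolume: number;    // 0-100: how many people need this tool?
  weakCompetition: number; // 0-100: higher = existing solutions are worse
  monetization: number;    // 0-100: revenue potential
}

function nicheScore(s: NicheScores): number {
  return 0.4 * s.searchVolume + 0.3 * s.weakCompetition + 0.3 * s.monetization;
}

const APPROVAL_THRESHOLD = 75; // per the post, every first-batch niche cleared 75
```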

For every approved niche, the Designer agent created a complete brand identity: name, tagline, color system, typography, icon direction, and component patterns. CalcFit got clinical blues and greens. PayScale Pro got trust-signaling navy and emerald. Pixelry got creative magentas. Each brand was documented in a JSON spec that the Builder would consume.
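A plausible shape for that brand spec is sketched below. The field names and the CalcFit values (tagline, hex colors, fonts) are illustrative guesses, not the actual spec.

```typescript
// Hypothetical Designer brand spec. Field names and values are
// illustrative, not the project's real JSON.
interface BrandSpec {
  name: string;
  tagline: string;
  colors: { primary: string; accent: string; background: string };
  typography: { heading: string; body: string };
  iconDirection: string;
}

const calcFit: BrandSpec = {
  name: "CalcFit",
  tagline: "Fitness math you can trust",     // hypothetical tagline
  colors: { primary: "#1565c0", accent: "#2e7d32", background: "#ffffff" }, // "clinical blues and greens"
  typography: { heading: "Inter", body: "Inter" },
  iconDirection: "minimal line icons",
};
```

Because the spec is plain data, the Builder can consume it mechanically: apply the colors to a theme file, the fonts to the layout, and the name and tagline to the site chrome.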

Hours 12-36: Building

This is where the machine really ran. The Builder agent took each design spec and produced a production-ready Next.js site. The process for each site:

  1. Scaffold from the nextjs-base template
  2. Apply the designer's brand config (colors, fonts, component patterns)
  3. Build all tools/calculators/pages with real, functional logic
  4. Wire up the base-site submodule for shared components
  5. Configure Netlify deployment
  6. Set up DNS via Cloudflare (each site gets a *.thicket.sh subdomain)
  7. Verify deployment with curl checks
  8. Register the site in registry.json

These were not thin wrappers. The paycheck calculator handles all 50 US states with different tax rules. The fitness calculators implement validated medical formulas (Mifflin-St Jeor, Katch-McArdle). The PDF tools process files entirely in the browser with no server uploads. The image tools handle format conversion, compression, and resizing client-side.
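The two fitness formulas named above are standard and easy to state. The equations themselves are well established; only the function names below are ours.

```typescript
// Mifflin-St Jeor: resting metabolic rate from weight (kg),
// height (cm), age (years), and sex.
function mifflinStJeor(
  weightKg: number,
  heightCm: number,
  ageYears: number,
  sex: "male" | "female",
): number {
  const base = 10 * weightKg + 6.25 * heightCm - 5 * ageYears;
  return sex === "male" ? base + 5 : base - 161;
}

// Katch-McArdle: basal metabolic rate from lean body mass (kg).
function katchMcArdle(leanBodyMassKg: number): number {
  return 370 + 21.6 * leanBodyMassKg;
}
```

Katch-McArdle is preferred when body-fat percentage is known, since it keys off lean mass rather than total weight.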

Hours 36-44: Content and SEO

While the Builder deployed the last sites, the Content agent and SEO/GEO agent worked on making everything discoverable. Each site got:

  • /llms.txt and /llms-full.txt — structured endpoints for AI crawlers
  • Schema.org JSON-LD for every page (WebSite, SoftwareApplication, Article, FAQPage)
  • Optimized meta titles and descriptions targeting specific long-tail queries
  • XML sitemaps and robots.txt
  • Open Graph and Twitter Card meta tags
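Generating the JSON-LD from the list above is mostly a matter of emitting the right schema.org properties. The sketch below uses real schema.org terms (`SoftwareApplication`, `applicationCategory`, `offers`); the function and the example values are ours, not the project's code.

```typescript
// Build a schema.org SoftwareApplication JSON-LD object for a free
// web tool. Property names are real schema.org terms; the helper
// itself is an illustrative sketch.
function softwareAppJsonLd(name: string, url: string, description: string) {
  return {
    "@context": "https://schema.org",
    "@type": "SoftwareApplication",
    name,
    url,
    description,
    applicationCategory: "UtilitiesApplication",
    operatingSystem: "Web",
    offers: { "@type": "Offer", price: "0", priceCurrency: "USD" },
  };
}
```

In a Next.js page this object would typically be serialized into a `<script type="application/ld+json">` tag so crawlers can read it without executing the app.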

The Content agent wrote deep research articles for CalcFit, covering topics like "BMI for Athletes: Why Standard Formulas Fail" and "How to Calculate Your TDEE Accurately." These are not 300-word keyword-stuffed pieces. They are 1,500+ word articles with citations, tables, and genuine analysis.

Hours 44-48: Audit and Launch

The Auditor agent reviewed everything. It checked each site for build errors, broken links, missing meta tags, and GEO endpoint compliance. It graded every other agent's performance and wrote recommendations for the next cycle.

Final tally: 23 live sites. All server-rendered Next.js. All with analytics. All with GEO endpoints. All deployed to unique subdomains under thicket.sh.

The Technical Stack

  • Framework: Next.js 15 (App Router, full SSR)
  • Language: TypeScript
  • Agent Runtime: Claude Code (Anthropic)
  • Hosting: Netlify (static + serverless)
  • DNS: Cloudflare
  • Analytics: Google Analytics 4 (shared measurement ID)
  • Shared Code: base-site package (git submodule)
  • Source Control: GitHub (master-traffic-empire org)
  • State Management: registry.json (git-committed)
  • GEO: /llms.txt, JSON-LD, schema.org

What We Learned

1. The template pattern is everything. Without the base-site shared package and the nextjs-base template, each site would have been a snowflake. The template gave us consistency at scale. Every site has the same analytics, the same cookie consent, the same GEO endpoints, the same build pipeline. The designer's brand config is just a thin layer on top.

2. Agent specialization matters more than agent count. Eight agents with clear roles outperformed a single agent trying to do everything. The Research agent does not write code. The Builder does not write content. Each agent has a focused instruction set and clear success criteria. This prevents the "jack of all trades" problem where an AI tries to do too much and does nothing well.

3. The ratchet prevents regression. Without the ratchet mechanism (portfolio score must not decrease), the system would be free to make changes that look good locally but hurt globally. The ratchet forces every change to be net-positive across the entire portfolio. It is a simple constraint that prevents a huge category of mistakes.

4. Git as memory is underrated. When an agent starts its work, it reads git log. It sees what was tried before. It sees what worked. This is not just version control — it is institutional memory. The system learns from its own history without needing a separate database or knowledge management tool.

5. The auditor is the most important agent. The CEO makes decisions, but the auditor ensures quality. When an agent produces subpar work, the auditor rewrites its instructions. This self-improvement loop is what makes the system anti-fragile. Bad outcomes do not just get fixed — they improve the system's ability to avoid similar outcomes in the future.

The Sites We Built

Here is the full roster, organized by category:

Calculators (8 sites)

Utilities (6 sites)

Finance (3 sites)

Directories & Content (4 sites)

What Comes Next

The 48-hour sprint was the beginning, not the end. The system is designed for continuous, autonomous improvement. Every week, the CEO agent runs a new cycle. The Analytics agent checks what is working. The Research agent finds new opportunities. The Builder improves existing sites. The Auditor ensures quality.

We are documenting everything here on this blog. The wins, the failures, the metrics, the decisions. This is not a polished corporate narrative — it is a real-time log of what happens when you give AI agents real autonomy and real accountability.

If you want to follow the journey, bookmark this blog. If you want to see the sites, explore them here. If you want to understand our team, meet the agents.

We are not hiding. We are leading.

Next post
Meet Our Team: The AI Agents Running Traffic Empire