How I Shipped 100k LOC in 2 Weeks with Coding Agents
The Problem
Over the last two weeks, I shipped 100,000 lines of high-quality code using AI agents. But here’s what I learned from talking to engineers across companies: we’re being asked to adopt AI coding tools (Cursor, Windsurf, GitHub Copilot, Claude Code) without the instructions, support, or infrastructure needed to get ROI from them in production.
When we onboard developers, we give them documentation, coding standards, proven workflows, and collaboration tools. When we “deploy” AI agents, we give them nothing. They start fresh every time. No project context, no memory of patterns, no proven workflows.
So I compiled AI Coding Infrastructure, the missing support layer that agents need. Five components:
- Project Memory (AGENTS.md): Your tech stack, patterns, and conventions, which agents read automatically before every response
- Proven Workflows (Skills): Battle-tested TDD, debugging, and code review patterns that agents MUST follow
- Specialization (Sub-Agents): 114+ domain experts working in parallel, not one generalist
- Planning Systems (ExecPlans): Self-contained living docs for complex features
- Autonomous Execution (Ralph): Continuous loops for overnight autonomous development
Get started: github.com/flora131/agent-instructions
How It Works
Project Memory
A single markdown file (AGENTS.md or CLAUDE.md) in your project root contains your tech stack, architectural patterns, and coding conventions. Agents automatically read this before every response, transforming them from stateless tools into stateful team members who know your project.
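As a concrete illustration, a minimal AGENTS.md might look like the sketch below. The stack, patterns, and conventions shown are hypothetical placeholders, not part of the shipped template:

```markdown
# AGENTS.md

## Tech Stack
- Python 3.12, FastAPI, PostgreSQL (placeholder examples)

## Architectural Patterns
- Service layer between API routes and the database; no direct DB calls in route handlers

## Coding Conventions
- Type hints everywhere; tests live next to the module they cover
- Conventional Commits for commit messages
```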
Skills: Mandatory Workflow Discipline
Skills are proven patterns from Anthropic Skills [2] and Superpowers [3]: spec-driven development (SDD), test-driven development (TDD), systematic debugging, code review, brainstorming, etc.
The difference: Mandatory First Response Protocol. Before ANY response, agents MUST:
- List available skills
- Check if ANY skill matches
- If yes → Read and follow it exactly
Agents can’t rationalize “this is too simple for TDD” or “let me gather info first.” If the skill applies, they MUST use it. This prevents the most common failure mode: shortcuts that seem reasonable but create bugs and technical debt.
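One way to pin this protocol into project memory is a short rule block in AGENTS.md/CLAUDE.md. The wording below is an illustrative sketch, not the framework’s canonical text:

```markdown
## Mandatory First Response Protocol (illustrative wording)
Before responding to ANY request:
1. List the available skills.
2. Check whether ANY skill matches the request.
3. If a skill matches, read it and follow it exactly. No exceptions for tasks that seem too simple.
```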
Credit: Superpowers framework by Jesse Vincent (obra) [3]
Custom Skills: Tailored to Your Workflow
You can also create custom skills for your unique workflows and patterns. One incredibly useful skill I built is a meta-prompting skill called prompt-engineer, which helps agents improve their own prompts based on project context and past failures. It was built using the skill-creation guidance from Anthropic Skills [2].
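For reference, a custom skill is typically a small markdown file with frontmatter describing when it applies. The prompt-engineer sketch below is illustrative; it assumes a SKILL.md-style layout like Anthropic’s published skills, and the step wording is hypothetical:

```markdown
---
name: prompt-engineer
description: Improve an agent's own prompts using project context and records of past failures.
---

# prompt-engineer

1. Collect the failing prompt and the relevant project context from AGENTS.md.
2. Identify what the prompt under-specified (missing constraints, patterns, conventions).
3. Rewrite the prompt, citing the specific convention or past failure behind each change.
4. Record the before/after pair so future iterations can learn from it.
```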
Credit: Built on Anthropic’s Task tool [2]
Sub-Agents: Parallel Specialization
114+ specialized agents (Python Pro, React Specialist, Cloud Architect, Security Engineer, ML Engineer, etc.) orchestrated by Agent Organizer. Up to 50 agents work in parallel on independent tasks.
Example: Building a notification system? Agent Organizer dispatches Backend Developer, WebSocket Engineer, Database Optimizer, and React Specialist all simultaneously.
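Sub-agents are also defined as markdown files. The sketch below shows a hypothetical WebSocket Engineer definition; it assumes a frontmatter format similar to Claude Code sub-agents (name, description, tools), and the body text is a placeholder:

```markdown
---
name: websocket-engineer
description: Designs and implements real-time messaging over WebSockets; use for notification and live-update features.
tools: Read, Edit, Bash
---

You are a WebSocket specialist. Follow the project's existing pub/sub conventions
from AGENTS.md, write tests first, and hand integration points back to the
orchestrating agent with a short summary.
```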
Credit: Uses Awesome Claude Code Subagents [1]
ExecPlans: Self-Contained Living Documents
For complex features, agents auto-generate ExecPlans in specs/: fully self-contained documents that anyone can implement from without external context. Each plan is updated as work progresses and includes Purpose, Progress, Decision Log, Plan of Work, and Validation sections.
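A skeleton with those sections might look like this (headings from the post; the bullet content is placeholder text):

```markdown
# ExecPlan: <feature name>

## Purpose
Why this feature exists and what "done" means.

## Progress
- [x] Schema drafted
- [ ] API endpoints implemented

## Decision Log
- <date>: Chose <option A> over <option B> because <reason>.

## Plan of Work
1. Step-by-step tasks, each small enough to verify independently.

## Validation
How the result is checked against the spec (tests, benchmarks, manual QA).
```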
Nothing gets lost across context switches or handoffs.
Ralph: Overnight Autonomous Development
Named after the Ralph Wiggum meme [4], Ralph enables agents to run in continuous loops with no manual intervention.
How it works: Four files in .ralph/ directory:
- prompt.md — Your instructions (“Port TypeScript to Python”)
- sync.sh — Single iteration (reads prompt, runs Claude CLI, logs output)
- ralph.sh — Continuous loop (runs sync, sleeps 10s, repeats)
- visualize.py — Colored output showing progress
The agent works, commits changes, sleeps 10 seconds, and continues until the task is complete. It manages its own context across iterations.
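A minimal version of the continuous loop could be as simple as the sketch below. It assumes sync.sh is the single-iteration script shown in the appendix, and it leaves the stop condition to you, since the post doesn’t prescribe one:

```bash
#!/usr/bin/env bash
# .ralph/ralph.sh (sketch): run one iteration, pause, repeat.
while true; do
  ./.ralph/sync.sh || echo "iteration failed; retrying"   # one agent iteration
  sleep 10                                                # pause between iterations
  # Add a completion check here (e.g. stop when the agent writes a DONE marker
  # or the test suite passes); this sketch loops until you kill it.
done
```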
Real-world results:
- Ships 6+ repos overnight at YC hackathons
- Builds programming languages autonomously
- Migrates entire codebases between technologies
Best on cloud VMs (AWS EC2, Google Compute Engine, DigitalOcean) running inside a tmux session.
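To keep the loop alive after you disconnect from the VM, a detached tmux session works; for example:

```bash
# Start the loop in a detached tmux session named "ralph"
tmux new-session -d -s ralph 'bash .ralph/ralph.sh'

# Reattach later to watch progress
tmux attach -t ralph
```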
Credit: Geoffrey Huntley [4], high-level implementation from repomirror [6]
Real Results
Two weeks. 100,000 lines of production-quality code. I still handled the final 20% as the engineer, but the infrastructure got me to 80% much faster than before.
What worked:
- Sub-agents prevented bottlenecks through parallel execution
- Mandatory TDD caught bugs in design phase
- ExecPlans survived context switches
- Ralph built core features overnight
Critical human touch:
- Reviewing agent-written plans: Caught architectural issues and edge cases before implementation
- Refining skills and sub-agents: Customizing agent behavior to match my workflow; this learning loop was essential
- Final integration testing, business logic decisions, performance optimization
This transforms you from a passive user into a power user who configures how agents think and collaborate.
My AI-Augmented Development Workflow
The infrastructure above enables a research-driven development process that balances AI assistance with human oversight. Here’s the workflow that produced 100k lines of quality code:
Research & Requirements
Start with comprehensive research into the problem space: user needs, existing solutions, relevant technologies, and constraints. This research feeds into a Product Requirements Document (PRD) that articulates the what and why, including the problem statement, target users, success metrics, and business objectives without prescribing implementation details.
AI-Assisted Design
This is where AI coding infrastructure becomes critical. Brainstorm with your coding agent to explore technical possibilities. The agent leverages its knowledge of patterns and best practices to generate multiple approaches, identify challenges, and discuss trade-offs. This exploratory phase surfaces ideas that might not emerge from solo brainstorming.
Formalize the output into a Technical Design/Spec (often auto-generated as an ExecPlan). This describes the how: architecture decisions, API designs, data models, technology stack, system components, and scalability/security considerations.
Human Validation Loop
Critical checkpoint: Experienced engineers review the AI-assisted spec. This human oversight catches edge cases, validates assumptions, and ensures alignment with organizational standards. This acknowledges that AI assistance needs human verification. I spent significant time here catching architectural issues before implementation.
Incorporate feedback into a Refined Technical Design/Spec. This might involve adjusting architecture, adding clarifications, or reconsidering technology choices. The refined spec represents the agreed-upon technical approach with human validation baked in.
Execution
Break the refined spec into an Implementation Plan Doc (ExecPlans in specs/). This includes task decomposition, effort estimates, dependency mapping, and milestone definitions.
During Implementation, sub-agents work in parallel on independent tasks. Ralph handles overnight autonomous development for foundational features. Mandatory TDD skills catch bugs in the design phase.
Testing validates against both PRD objectives and technical spec requirements: unit tests, integration tests, performance testing, and final QA.
Why This Works
AI-augmented but human-validated: Balances the speed and breadth of AI with the judgment and experience of senior engineers. AI assists exploration and implementation while humans validate critical decisions.
Separation of concerns: Clear distinction between product requirements (PRD), technical design (Spec), and execution planning (Plan Doc/ExecPlan). Each artifact serves its specific purpose.
Feedback integration: Explicit human review loop after initial spec ensures first drafts benefit from iteration before implementation begins.
Research-driven: Starting with deep research rather than jumping to requirements ensures decisions are grounded in solid understanding of the problem space.
This workflow is particularly effective for complex projects where upfront planning investment pays dividends, for teams leveraging AI coding tools, and for organizations that want to maintain human control over critical technical decisions while benefiting from AI capabilities.
5-Minute Setup
- Copy template files to your project (AGENTS.md/CLAUDE.md, specs/, optionally .ralph/); see the sketch after this list
- Open the project in your AI coding tool (Cursor, Windsurf, GitHub Copilot, Claude Code)
- Ask: “Set up agent instructions, skills, and sub-agent support for this project”
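If you prefer the command line for the copy step, here is a hedged sketch; it assumes the template files sit at the repository root, so check the repo’s README for the actual layout:

```bash
# Clone the templates and copy them into your project (paths are assumptions)
git clone https://github.com/flora131/agent-instructions
cp agent-instructions/AGENTS.md your-project/AGENTS.md
cp -r agent-instructions/specs  your-project/specs
cp -r agent-instructions/.ralph your-project/.ralph   # optional: autonomous loops
```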
The agent analyzes your codebase and populates AGENTS.md, then installs 100+ skills and 114+ sub-agents into your config directory.
You get: Project memory, mandatory workflows, specialized agents, ExecPlan templates, Ralph setup.
Repository: github.com/flora131/agent-instructions [5]
Why Open Source This?
Developers aren’t seeing production ROI from AI coding tools. Without infrastructure and support, these tools never reach their potential.
This infrastructure made the difference for me: from inconsistent results to 100,000 lines in two weeks. If it helps others build faster, it should be shared.
PRs welcome [5]: Build skills for your workflow, create domain sub-agents, improve setup, find better ExecPlan patterns, extend Ralph.
We’re at an inflection point. AI coding tools are being deployed widely, but the infrastructure layer is missing. Let’s build it together and make it easier and faster for developers to get real value from these tools.
Credits
Built on excellent work by:
- Superpowers [3]: Mandatory skill checking, TDD discipline, systematic debugging (Jesse Vincent)
- Anthropic Skills [2]: Skills system and reusable patterns framework
- Ralph Method [4]: Continuous agent loops for autonomous development (Geoffrey Huntley)
- Sub-Agent Architecture: Anthropic’s Task tool and orchestration patterns [2]
Additional Resources
Complete Ralph Setup Script
The sync script that powers autonomous execution:
```bash
#!/usr/bin/env bash
# Single Ralph iteration: feed the prompt to the Claude CLI, log the streamed
# JSON output, and render colored progress.
cat .ralph/prompt.md | \
  claude -p --output-format=stream-json --verbose \
    --dangerously-skip-permissions --add-dir . | \
  tee -a .ralph/claude_output.jsonl | \
  uv run --no-project .ralph/visualize.py --debug
```

Place this in .ralph/sync.sh and make it executable. The continuous loop (.ralph/ralph.sh) repeatedly calls this script with 10-second sleeps between iterations.
Key Takeaways
- AI Coding Infrastructure is the missing support layer for coding agents, providing project memory, proven workflows, specialization, planning systems, and autonomous execution
- Mandatory skill checking prevents agents from rationalizing away best practices, making it structurally impossible to skip proven workflows like TDD
- 114+ specialized sub-agents enable parallel execution (up to 50 agents) with domain expertise instead of one generalist
- Ralph method enables overnight autonomous development through continuous agent loops
- 5-minute setup via single prompt installs the complete infrastructure across any AI coding tool