<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Alex Lavaee&apos;s Blog</title><description>Posts about machine learning, AI, and all things tech.</description><link>https://alexlavaee.me/</link><language>en-us</language><item><title>Microsoft Build’s Developer Story Is the Agent Stack</title><link>https://alexlavaee.me/blog/microsoft-build-agentic-developer-stack/</link><guid isPermaLink="true">https://alexlavaee.me/blog/microsoft-build-agentic-developer-stack/</guid><description>A practical look at Microsoft Build’s coding-agent announcements: Copilot, GitHub, MAI-Code-1-Flash, Windows dev infrastructure, local AI hardware, and developer skepticism.</description><pubDate>Wed, 03 Jun 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.CqkeEnpr.png&quot; alt=&quot;Microsoft Build’s Developer Story Is the Agent Stack&quot; /&gt;&lt;p&gt;A practical look at Microsoft Build’s coding-agent announcements: Copilot, GitHub, MAI-Code-1-Flash, Windows dev infrastructure, local AI hardware, and developer skepticism.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/microsoft-build-agentic-developer-stack/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Claude Opus 4.8 Is About Reliability</title><link>https://alexlavaee.me/blog/claude-code-opus-4-8-developer-verdict/</link><guid isPermaLink="true">https://alexlavaee.me/blog/claude-code-opus-4-8-developer-verdict/</guid><description>A practical developer-focused look at Claude Opus 4.8 in Claude Code: what changed, what technical users are reporting, how pricing works in real projects, and where it fits against other coding models.</description><pubDate>Fri, 29 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.CPvBgE78.png&quot; alt=&quot;Claude Opus 4.8 Is About Reliability&quot; /&gt;&lt;p&gt;A 
practical developer-focused look at Claude Opus 4.8 in Claude Code: what changed, what technical users are reporting, how pricing works in real projects, and where it fits against other coding models.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/claude-code-opus-4-8-developer-verdict/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>The New Shape of Supply-Chain Trust</title><link>https://alexlavaee.me/blog/shai-hulud-github-and-the-new-shape-of-supply-chain-trust/</link><guid isPermaLink="true">https://alexlavaee.me/blog/shai-hulud-github-and-the-new-shape-of-supply-chain-trust/</guid><description>The latest Shai-Hulud waves and GitHub’s poisoned-extension breach show that supply-chain security now includes developer laptops, IDEs, CI runners, AI tooling, package scripts, and cloud credentials.</description><pubDate>Thu, 28 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.BFeErTWC.png&quot; alt=&quot;The New Shape of Supply-Chain Trust&quot; /&gt;&lt;p&gt;The latest Shai-Hulud waves and GitHub’s poisoned-extension breach show that supply-chain security now includes developer laptops, IDEs, CI runners, AI tooling, package scripts, and cloud credentials.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/shai-hulud-github-and-the-new-shape-of-supply-chain-trust/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Workflows Are the New Interface for Coding Agents</title><link>https://alexlavaee.me/blog/workflows-are-the-interface-for-coding-agents/</link><guid isPermaLink="true">https://alexlavaee.me/blog/workflows-are-the-interface-for-coding-agents/</guid><description>Claude Code Dynamic Workflows make one thing obvious: the future of coding agents is scripted, inspectable orchestration. 
Atomic takes that idea and makes it developer-owned.</description><pubDate>Thu, 28 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.C8RxKmz9.png&quot; alt=&quot;Workflows Are the New Interface for Coding Agents&quot; /&gt;&lt;p&gt;Claude Code Dynamic Workflows make one thing obvious: the future of coding agents is scripted, inspectable orchestration. Atomic takes that idea and makes it developer-owned.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/workflows-are-the-interface-for-coding-agents/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Alignment is moving into the agent control plane</title><link>https://alexlavaee.me/blog/alignment-moved-fleet-control-plane/</link><guid isPermaLink="true">https://alexlavaee.me/blog/alignment-moved-fleet-control-plane/</guid><description>Plan Mode, Outcomes, Skills, and agent-as-judge workflows point toward a shared pattern: reliable coding agents depend less on a single prompt and more on the planning, steering, memory, and verification systems around the model.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.cPSAstw-.png&quot; alt=&quot;Alignment is moving into the agent control plane&quot; /&gt;&lt;p&gt;Plan Mode, Outcomes, Skills, and agent-as-judge workflows point toward a shared pattern: reliable coding agents depend less on a single prompt and more on the planning, steering, memory, and verification systems around the model.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/alignment-moved-fleet-control-plane/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>How to align coding agents with your plans better than markdown, without burning tokens</title><link>https://alexlavaee.me/blog/html-artifacts-coding-agent-plans/</link><guid 
isPermaLink="true">https://alexlavaee.me/blog/html-artifacts-coding-agent-plans/</guid><description>Thariq Shihipar at Claude Code has been making the case that HTML beats markdown for agent output. We agree. The DeepSeek-OCR screenshot trick is what makes the token cost honest enough to do this at every plan stage.</description><pubDate>Wed, 13 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.CDZwxbWq.png&quot; alt=&quot;How to align coding agents with your plans better than markdown, without burning tokens&quot; /&gt;&lt;p&gt;Thariq Shihipar at Claude Code has been making the case that HTML beats markdown for agent output. We agree. The DeepSeek-OCR screenshot trick is what makes the token cost honest enough to do this at every plan stage.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/html-artifacts-coding-agent-plans/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Atomic&apos;s Ralph Loop: a deterministic plan → orchestrate → review for long-running, ambiguous work</title><link>https://alexlavaee.me/blog/atomic-ralph-loop/</link><guid isPermaLink="true">https://alexlavaee.me/blog/atomic-ralph-loop/</guid><description>Geoffrey Huntley&apos;s Ralph primitive opened up a rich ecosystem of looped coding agents. 
Atomic&apos;s built-in Ralph builds on that lineage with RFC-driven planning, schema-enforced dual review, deterministic file-grouped finding clusters, and a captured branch changeset injected into both reviewers — designed for unattended long-running work where every step needs to be inspectable after the fact.</description><pubDate>Thu, 07 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.SMpsuvTC.png&quot; alt=&quot;Atomic&apos;s Ralph Loop: a deterministic plan → orchestrate → review for long-running, ambiguous work&quot; /&gt;&lt;p&gt;Geoffrey Huntley&apos;s Ralph primitive opened up a rich ecosystem of looped coding agents. Atomic&apos;s built-in Ralph builds on that lineage with RFC-driven planning, schema-enforced dual review, deterministic file-grouped finding clusters, and a captured branch changeset injected into both reviewers — designed for unattended long-running work where every step needs to be inspectable after the fact.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/atomic-ralph-loop/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>The Coding Benchmark We Actually Need</title><link>https://alexlavaee.me/blog/programbench-memorization-vs-engineering/</link><guid isPermaLink="true">https://alexlavaee.me/blog/programbench-memorization-vs-engineering/</guid><description>The benchmarks worth caring about measure the GDP a coding agent can generate, not how much of SQLite it can recall. ProgramBench gets one key thing right while other design choices deserve scrutiny. 
Here&apos;s how to combine its best idea with real outcome measurement to build the benchmark we actually need.</description><pubDate>Wed, 06 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.BM7-4jma.png&quot; alt=&quot;The Coding Benchmark We Actually Need&quot; /&gt;&lt;p&gt;The benchmarks worth caring about measure the GDP a coding agent can generate, not how much of SQLite it can recall. ProgramBench gets one key thing right while other design choices deserve scrutiny. Here&apos;s how to combine its best idea with real outcome measurement to build the benchmark we actually need.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/programbench-memorization-vs-engineering/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>DeepSeek V4: What&apos;s Inside, How It Compares, and Where It Actually Wins</title><link>https://alexlavaee.me/blog/deepseek-v4-architecture-benchmarks-engineer-verdict/</link><guid isPermaLink="true">https://alexlavaee.me/blog/deepseek-v4-architecture-benchmarks-engineer-verdict/</guid><description>DeepSeek V4 dropped on April 24 with a 1.6T-parameter open-weights model that costs roughly 1/7 of Claude Opus 4.7 and posts coding numbers in the same neighborhood. Here&apos;s where it actually wins, what engineers using it in real work are reporting, and what&apos;s genuinely new under the hood.</description><pubDate>Mon, 27 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.BM8nwCfs.png&quot; alt=&quot;DeepSeek V4: What&apos;s Inside, How It Compares, and Where It Actually Wins&quot; /&gt;&lt;p&gt;DeepSeek V4 dropped on April 24 with a 1.6T-parameter open-weights model that costs roughly 1/7 of Claude Opus 4.7 and posts coding numbers in the same neighborhood. 
Here&apos;s where it actually wins, what engineers using it in real work are reporting, and what&apos;s genuinely new under the hood.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/deepseek-v4-architecture-benchmarks-engineer-verdict/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Software Quality Has Never Been More Vulnerable</title><link>https://alexlavaee.me/blog/claude-code-postmortem-shipping-too-fast/</link><guid isPermaLink="true">https://alexlavaee.me/blog/claude-code-postmortem-shipping-too-fast/</guid><description>Anthropic published a detailed postmortem on three Claude Code regressions between March and April. The document is admirable — it&apos;s also a mirror. Every team shipping AI-assisted software right now is operating under the same conditions, and software has never been more vulnerable because of it.</description><pubDate>Fri, 24 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.Bvnqj4eT.png&quot; alt=&quot;Software Quality Has Never Been More Vulnerable&quot; /&gt;&lt;p&gt;Anthropic published a detailed postmortem on three Claude Code regressions between March and April. The document is admirable — it&apos;s also a mirror. Every team shipping AI-assisted software right now is operating under the same conditions, and software has never been more vulnerable because of it.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/claude-code-postmortem-shipping-too-fast/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>GPT-5.5: The Honest Take on OpenAI&apos;s Response to Opus 4.7</title><link>https://alexlavaee.me/blog/gpt-5-5-honest-take/</link><guid isPermaLink="true">https://alexlavaee.me/blog/gpt-5-5-honest-take/</guid><description>OpenAI shipped GPT-5.5 today, exactly one week after Claude Opus 4.7. 
It leads on Terminal-Bench 2.0 and hard math, trails Opus 4.7 on SWE-Bench Pro and tool use, and doubles the API price. Here&apos;s what the benchmarks actually say, how the model was built, and what early users are reporting.</description><pubDate>Thu, 23 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.D1oFyFgu.png&quot; alt=&quot;GPT-5.5: The Honest Take on OpenAI&apos;s Response to Opus 4.7&quot; /&gt;&lt;p&gt;OpenAI shipped GPT-5.5 today, exactly one week after Claude Opus 4.7. It leads on Terminal-Bench 2.0 and hard math, trails Opus 4.7 on SWE-Bench Pro and tool use, and doubles the API price. Here&apos;s what the benchmarks actually say, how the model was built, and what early users are reporting.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/gpt-5-5-honest-take/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Atomic&apos;s Workflow SDK: Deterministically Extending Coding Agents</title><link>https://alexlavaee.me/blog/atomic-workflow-sdk-coding-agent-orchestration/</link><guid isPermaLink="true">https://alexlavaee.me/blog/atomic-workflow-sdk-coding-agent-orchestration/</guid><description>Coding agents are brilliant inside a session and fragile outside it. Atomic is an open-source TypeScript SDK that wraps deterministic workflows around Claude Code, Copilot CLI, or opencode — without reimplementing their harness. Here&apos;s the gap it closes and why a general agent framework isn&apos;t the answer.</description><pubDate>Wed, 22 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.D89FFHWZ.png&quot; alt=&quot;Atomic&apos;s Workflow SDK: Deterministically Extending Coding Agents&quot; /&gt;&lt;p&gt;Coding agents are brilliant inside a session and fragile outside it. 
Atomic is an open-source TypeScript SDK that wraps deterministic workflows around Claude Code, Copilot CLI, or opencode — without reimplementing their harness. Here&apos;s the gap it closes and why a general agent framework isn&apos;t the answer.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/atomic-workflow-sdk-coding-agent-orchestration/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Open Claude Design: A Weekend Harness Built on Atomic</title><link>https://alexlavaee.me/blog/open-claude-design-atomic-harness/</link><guid isPermaLink="true">https://alexlavaee.me/blog/open-claude-design-atomic-harness/</guid><description>Anthropic shipped Claude Design on April 17. Three days later we shipped an open-source replica built as an Atomic workflow — 5 phases, the same pipeline ported across three coding agents (Claude, Copilot CLI, opencode). Here&apos;s what that reveals about building thin harnesses around coding agents.</description><pubDate>Mon, 20 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Anthropic shipped Claude Design on April 17. Three days later we shipped an open-source replica built as an Atomic workflow — 5 phases, the same pipeline ported across three coding agents (Claude, Copilot CLI, opencode). Here&apos;s what that reveals about building thin harnesses around coding agents.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/open-claude-design-atomic-harness/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Claude Opus 4.7: Anthropic&apos;s Agentic Reliability Release, Explained</title><link>https://alexlavaee.me/blog/claude-opus-4-7-technical-breakdown/</link><guid isPermaLink="true">https://alexlavaee.me/blog/claude-opus-4-7-technical-breakdown/</guid><description>Anthropic shipped Opus 4.7 today. 
The headline is SWE-Bench Verified at 87.6%, but the real story for software engineers is what changed in how the model behaves on long-running, autonomous work — loop resistance, self-verification, and finer-grained control over reasoning effort and token budget.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.CvhU4Qd-.png&quot; alt=&quot;Claude Opus 4.7: Anthropic&apos;s Agentic Reliability Release, Explained&quot; /&gt;&lt;p&gt;Anthropic shipped Opus 4.7 today. The headline is SWE-Bench Verified at 87.6%, but the real story for software engineers is what changed in how the model behaves on long-running, autonomous work — loop resistance, self-verification, and finer-grained control over reasoning effort and token budget.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/claude-opus-4-7-technical-breakdown/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>The Rise of Edge AI — A New Layer in the Coding Agent Stack</title><link>https://alexlavaee.me/blog/open-source-ai-edge-coding-agents/</link><guid isPermaLink="true">https://alexlavaee.me/blog/open-source-ai-edge-coding-agents/</guid><description>Compression breakthroughs, edge-optimized model releases, and local runtime optimization are converging fast. Edge AI is emerging as a distinct — and in many scenarios, preferred — layer in the coding agent stack.</description><pubDate>Tue, 14 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.BPreAl19.png&quot; alt=&quot;The Rise of Edge AI — A New Layer in the Coding Agent Stack&quot; /&gt;&lt;p&gt;Compression breakthroughs, edge-optimized model releases, and local runtime optimization are converging fast. 
Edge AI is emerging as a distinct — and in many scenarios, preferred — layer in the coding agent stack.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/open-source-ai-edge-coding-agents/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>The Memory Wall Is Coming Down — What It Means for Coding Agents</title><link>https://alexlavaee.me/blog/attention-memory-coding-agents/</link><guid isPermaLink="true">https://alexlavaee.me/blog/attention-memory-coding-agents/</guid><description>Research breakthroughs in attention optimization, community-built memory tools, and production harness architectures are converging on the same problem. Understanding how these layers connect is essential for anyone building with or for coding agents.</description><pubDate>Mon, 13 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.D9reOJ_g.png&quot; alt=&quot;The Memory Wall Is Coming Down — What It Means for Coding Agents&quot; /&gt;&lt;p&gt;Research breakthroughs in attention optimization, community-built memory tools, and production harness architectures are converging on the same problem. Understanding how these layers connect is essential for anyone building with or for coding agents.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/attention-memory-coding-agents/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Why My AI-Generated UI Looked Generic (and How I Fixed It)</title><link>https://alexlavaee.me/blog/lessons-learned-designing-with-ai/</link><guid isPermaLink="true">https://alexlavaee.me/blog/lessons-learned-designing-with-ai/</guid><description>AI coding agents produce statistically average UIs by default. 
After months of iteration, I found a workflow that actually produces distinctive, polished interfaces: a design sandbox for rapid prototyping, structured design skills like Impeccable, and machine-readable design language definitions.</description><pubDate>Thu, 09 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.De_2E4wX.png&quot; alt=&quot;Why My AI-Generated UI Looked Generic (and How I Fixed It)&quot; /&gt;&lt;p&gt;AI coding agents produce statistically average UIs by default. After months of iteration, I found a workflow that actually produces distinctive, polished interfaces: a design sandbox for rapid prototyping, structured design skills like Impeccable, and machine-readable design language definitions.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/lessons-learned-designing-with-ai/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Marc Andreessen on Why AI&apos;s Messiest Problems Are Its Biggest Opportunity</title><link>https://alexlavaee.me/blog/andreessen-latent-space-ai-thesis/</link><guid isPermaLink="true">https://alexlavaee.me/blog/andreessen-latent-space-ai-thesis/</guid><description>Marc Andreessen argues that society&apos;s messiness — not model capabilities — is where the real building opportunity lies. We break down his Latent Space podcast thesis and connect it to Claude Mythos, the Unix mindset for agents, and what it all means for software engineers.</description><pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.MH5WdIWV.png&quot; alt=&quot;Marc Andreessen on Why AI&apos;s Messiest Problems Are Its Biggest Opportunity&quot; /&gt;&lt;p&gt;Marc Andreessen argues that society&apos;s messiness — not model capabilities — is where the real building opportunity lies. 
We break down his Latent Space podcast thesis and connect it to Claude Mythos, the Unix mindset for agents, and what it all means for software engineers.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/andreessen-latent-space-ai-thesis/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Meta Harnesses: How Coding Agents Learn to Optimize Their Own Scaffolding</title><link>https://alexlavaee.me/blog/meta-harness-automated-agent-optimization/</link><guid isPermaLink="true">https://alexlavaee.me/blog/meta-harness-automated-agent-optimization/</guid><description>Stanford&apos;s Meta-Harness system uses coding agents to automatically optimize the scaffolding around LLMs — achieving state-of-the-art results across coding, math, and classification tasks. The secret: giving the optimizer raw execution traces instead of compressed feedback.</description><pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.DlPfn_fN.png&quot; alt=&quot;Meta Harnesses: How Coding Agents Learn to Optimize Their Own Scaffolding&quot; /&gt;&lt;p&gt;Stanford&apos;s Meta-Harness system uses coding agents to automatically optimize the scaffolding around LLMs — achieving state-of-the-art results across coding, math, and classification tasks. 
The secret: giving the optimizer raw execution traces instead of compressed feedback.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/meta-harness-automated-agent-optimization/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Three Things I Learned Using Coding Agents with 1M-Token Models</title><link>https://alexlavaee.me/blog/copilot-cli-tips-long-context-models/</link><guid isPermaLink="true">https://alexlavaee.me/blog/copilot-cli-tips-long-context-models/</guid><description>After extensive work with coding agents (primarily Copilot CLI and the SDK) alongside Codex 5.4 and Opus/Sonnet 4.6, three patterns emerged: the effective context window is far smaller than advertised, sub-agents are essential for long-horizon work, and Skills + CLIs beat MCP servers for context control.</description><pubDate>Sat, 04 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.UftH8em_.png&quot; alt=&quot;Three Things I Learned Using Coding Agents with 1M-Token Models&quot; /&gt;&lt;p&gt;After extensive work with coding agents (primarily Copilot CLI and the SDK) alongside Codex 5.4 and Opus/Sonnet 4.6, three patterns emerged: the effective context window is far smaller than advertised, sub-agents are essential for long-horizon work, and Skills + CLIs beat MCP servers for context control.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/copilot-cli-tips-long-context-models/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>From RPI to QRSPI: Rebuilding the First Structured Workflow for Coding Agents</title><link>https://alexlavaee.me/blog/from-rpi-to-qrspi/</link><guid isPermaLink="true">https://alexlavaee.me/blog/from-rpi-to-qrspi/</guid><description>Dex Horthy&apos;s Research-Plan-Implement was the first widely adopted structured workflow for AI coding agents. Three failure modes forced a complete redesign into QRSPI.
Here&apos;s what changed, what practitioners are validating, and what it reveals about where AI-assisted development is headed.</description><pubDate>Mon, 30 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.IetzFCeq.png&quot; alt=&quot;From RPI to QRSPI: Rebuilding the First Structured Workflow for Coding Agents&quot; /&gt;&lt;p&gt;Dex Horthy&apos;s Research-Plan-Implement was the first widely adopted structured workflow for AI coding agents. Three failure modes forced a complete redesign into QRSPI. Here&apos;s what changed, what practitioners are validating, and what it reveals about where AI-assisted development is headed.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/from-rpi-to-qrspi/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Owning the Execution Layer: The Frontier in AI-Driven Development</title><link>https://alexlavaee.me/blog/why-openai-bought-a-python-package-manager/</link><guid isPermaLink="true">https://alexlavaee.me/blog/why-openai-bought-a-python-package-manager/</guid><description>OpenAI acquired Astral. Anthropic acquired Bun. Both companies are buying the execution layer underneath their coding agents — the runtimes, package managers, linters, and test runners that determine whether an agent can actually ship code.</description><pubDate>Tue, 24 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.D5ZwH9YQ.png&quot; alt=&quot;Owning the Execution Layer: The Frontier in AI-Driven Development&quot; /&gt;&lt;p&gt;OpenAI acquired Astral. Anthropic acquired Bun.
Both companies are buying the execution layer underneath their coding agents — the runtimes, package managers, linters, and test runners that determine whether an agent can actually ship code.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/why-openai-bought-a-python-package-manager/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>My Honest Take on the Coding Agent Landscape</title><link>https://alexlavaee.me/blog/honest-take-coding-agent-landscape/</link><guid isPermaLink="true">https://alexlavaee.me/blog/honest-take-coding-agent-landscape/</guid><description>An honest assessment after going deep with Factory&apos;s Droids, Devin, Claude Code, Cursor, Windsurf, Codex, and a dozen more. The features blend together, the cycle of hope and disappointment repeats, and the thing that&apos;s actually missing isn&apos;t another desktop app or a better model.</description><pubDate>Sun, 22 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.C_NwfG0z.png&quot; alt=&quot;My Honest Take on the Coding Agent Landscape&quot; /&gt;&lt;p&gt;An honest assessment after going deep with Factory&apos;s Droids, Devin, Claude Code, Cursor, Windsurf, Codex, and a dozen more. The features blend together, the cycle of hope and disappointment repeats, and the thing that&apos;s actually missing isn&apos;t another desktop app or a better model.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/honest-take-coding-agent-landscape/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>We&apos;re Burning Tokens and Calling It Progress</title><link>https://alexlavaee.me/blog/agent-observability-measurement-gap/</link><guid isPermaLink="true">https://alexlavaee.me/blog/agent-observability-measurement-gap/</guid><description>Jensen Huang projects $1 trillion in compute demand through 2027. 
But 84% of developers use AI coding tools and almost nobody is measuring if they work. Two companies are building the observability infrastructure that separates teams that improve from teams that just spend.</description><pubDate>Thu, 19 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.BvqRB_-N.png&quot; alt=&quot;We&apos;re Burning Tokens and Calling It Progress&quot; /&gt;&lt;p&gt;Jensen Huang projects $1 trillion in compute demand through 2027. But 84% of developers use AI coding tools and almost nobody is measuring if they work. Two companies are building the observability infrastructure that separates teams that improve from teams that just spend.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/agent-observability-measurement-gap/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Stop Building Subagents. Start Writing Skills.</title><link>https://alexlavaee.me/blog/skills-over-subagents/</link><guid isPermaLink="true">https://alexlavaee.me/blog/skills-over-subagents/</guid><description>Most teams overuse subagents when skills are the better primitive. The architectural case for progressive context disclosure, automatic project scoping, and portable expertise across 30+ AI coding tools.</description><pubDate>Tue, 17 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.BMOf0r-U.png&quot; alt=&quot;Stop Building Subagents. Start Writing Skills.&quot; /&gt;&lt;p&gt;Most teams overuse subagents when skills are the better primitive. The architectural case for progressive context disclosure, automatic project scoping, and portable expertise across 30+ AI coding tools.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/skills-over-subagents/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Your Agent&apos;s Bottleneck Isn&apos;t the Model. 
It&apos;s the Context.</title><link>https://alexlavaee.me/blog/context-infrastructure-agent-bottleneck/</link><guid isPermaLink="true">https://alexlavaee.me/blog/context-infrastructure-agent-bottleneck/</guid><description>AI coding agents burn through hundreds of thousands of tokens grepping files and hallucinating APIs. A new class of context infrastructure tools is emerging to fix both problems — for your codebase and for external libraries.</description><pubDate>Tue, 10 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.FniH5I5g.png&quot; alt=&quot;Your Agent&apos;s Bottleneck Isn&apos;t the Model. It&apos;s the Context.&quot; /&gt;&lt;p&gt;AI coding agents burn through hundreds of thousands of tokens grepping files and hallucinating APIs. A new class of context infrastructure tools is emerging to fix both problems — for your codebase and for external libraries.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/context-infrastructure-agent-bottleneck/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Your AI Agent Writes Plausible Code. Plausible Is 20,000x Slower Than Correct.</title><link>https://alexlavaee.me/blog/plausible-code-is-not-correct-code/</link><guid isPermaLink="true">https://alexlavaee.me/blog/plausible-code-is-not-correct-code/</guid><description>A developer reimplemented SQLite in Rust with LLMs — 576,000 lines that compiled, passed tests, and ran 20,171x slower than the real thing. The bugs weren&apos;t syntactic. They were semantic. Here&apos;s why architecture, specs, test-driven contracts, and targeted review are the fix.</description><pubDate>Mon, 09 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.DAJuWO_M.png&quot; alt=&quot;Your AI Agent Writes Plausible Code. 
Plausible Is 20,000x Slower Than Correct.&quot; /&gt;&lt;p&gt;A developer reimplemented SQLite in Rust with LLMs — 576,000 lines that compiled, passed tests, and ran 20,171x slower than the real thing. The bugs weren&apos;t syntactic. They were semantic. Here&apos;s why architecture, specs, test-driven contracts, and targeted review are the fix.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/plausible-code-is-not-correct-code/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>GPT-5.4: The Real Leap Isn&apos;t Coding</title><link>https://alexlavaee.me/blog/gpt-5-4-the-real-leap-isnt-coding/</link><guid isPermaLink="true">https://alexlavaee.me/blog/gpt-5-4-the-real-leap-isnt-coding/</guid><description>GPT-5.4&apos;s coding benchmarks barely moved. But computer use jumped from 47% to 75%, tool search cuts MCP token usage by 47%, and knowledge work hit 83% across 44 professions. Here&apos;s what actually matters for developers.</description><pubDate>Thu, 05 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.VXwTgHfG.png&quot; alt=&quot;GPT-5.4: The Real Leap Isn&apos;t Coding&quot; /&gt;&lt;p&gt;GPT-5.4&apos;s coding benchmarks barely moved. But computer use jumped from 47% to 75%, tool search cuts MCP token usage by 47%, and knowledge work hit 83% across 44 professions. 
Here&apos;s what actually matters for developers.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/gpt-5-4-the-real-leap-isnt-coding/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>AI Agents Demand More Engineering Discipline, Not Less</title><link>https://alexlavaee.me/blog/engineering-discipline-ai-agents/</link><guid isPermaLink="true">https://alexlavaee.me/blog/engineering-discipline-ai-agents/</guid><description>Four industry leaders independently converged on the same conclusion: engineering discipline is the competitive moat when building with AI agents. Here&apos;s the day-one infrastructure that makes agent-generated code reliable.</description><pubDate>Wed, 04 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.D8MPJaTl.png&quot; alt=&quot;AI Agents Demand More Engineering Discipline, Not Less&quot; /&gt;&lt;p&gt;Four industry leaders independently converged on the same conclusion: engineering discipline is the competitive moat when building with AI agents. 
Here&apos;s the day-one infrastructure that makes agent-generated code reliable.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/engineering-discipline-ai-agents/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>How to Harness Coding Agents with the Right Infrastructure</title><link>https://alexlavaee.me/blog/harness-engineering-why-coding-agents-need-infrastructure/</link><guid isPermaLink="true">https://alexlavaee.me/blog/harness-engineering-why-coding-agents-need-infrastructure/</guid><description>A technical deep dive into harness engineering — the converging discipline across OpenAI, Anthropic, and independent practitioners that makes coding agents reliable on complex work.</description><pubDate>Tue, 03 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.BMzVYpVo.png&quot; alt=&quot;How to Harness Coding Agents with the Right Infrastructure&quot; /&gt;&lt;p&gt;A technical deep dive into harness engineering — the converging discipline across OpenAI, Anthropic, and independent practitioners that makes coding agents reliable on complex work.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/harness-engineering-why-coding-agents-need-infrastructure/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Inside the Cloud VMs Powering Autonomous Coding Agents</title><link>https://alexlavaee.me/blog/cloud-vms-autonomous-agent-infrastructure/</link><guid isPermaLink="true">https://alexlavaee.me/blog/cloud-vms-autonomous-agent-infrastructure/</guid><description>A technical deep dive into the isolated VM infrastructure that lets AI coding agents operate for hours without human intervention — from Cursor&apos;s cloud agents and Firecracker microVMs to snapshot bootstrapping, computer use, and secrets management.</description><pubDate>Thu, 26 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img 
src=&quot;https://alexlavaee.me/_astro/cover.UELI87o6.png&quot; alt=&quot;Inside the Cloud VMs Powering Autonomous Coding Agents&quot; /&gt;&lt;p&gt;A technical deep dive into the isolated VM infrastructure that lets AI coding agents operate for hours without human intervention — from Cursor&apos;s cloud agents and Firecracker microVMs to snapshot bootstrapping, computer use, and secrets management.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/cloud-vms-autonomous-agent-infrastructure/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Designing the Multi-Agent Development Environment</title><link>https://alexlavaee.me/blog/parallel-agent-sessions-infrastructure-gap/</link><guid isPermaLink="true">https://alexlavaee.me/blog/parallel-agent-sessions-infrastructure-gap/</guid><description>The biggest constraint in multi-agent development isn&apos;t model capability. It&apos;s that nobody&apos;s built the orchestration, window management, and resource isolation layers end to end. A technical deep dive into what each tool does architecturally, where it breaks, and what the missing product looks like.</description><pubDate>Wed, 25 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.BMVlpRLK.png&quot; alt=&quot;Designing the Multi-Agent Development Environment&quot; /&gt;&lt;p&gt;The biggest constraint in multi-agent development isn&apos;t model capability. It&apos;s that nobody&apos;s built the orchestration, window management, and resource isolation layers end to end. A technical deep dive into what each tool does architecturally, where it breaks, and what the missing product looks like.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/parallel-agent-sessions-infrastructure-gap/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Junior Engineers Don&apos;t Need Protection from AI. 
They Need Agency.</title><link>https://alexlavaee.me/blog/junior-engineers-agency-not-protection/</link><guid isPermaLink="true">https://alexlavaee.me/blog/junior-engineers-agency-not-protection/</guid><description>The discourse assumes juniors need protection from AI tools. They don&apos;t. They need trust, a disciplined workflow, and room to build capability on their own terms.</description><pubDate>Tue, 24 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.-8U1hdu1.png&quot; alt=&quot;Junior Engineers Don&apos;t Need Protection from AI. They Need Agency.&quot; /&gt;&lt;p&gt;The discourse assumes juniors need protection from AI tools. They don&apos;t. They need trust, a disciplined workflow, and room to build capability on their own terms.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/junior-engineers-agency-not-protection/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>If Your Claws Aren&apos;t Out, You&apos;re Already Falling Behind</title><link>https://alexlavaee.me/blog/claws-layer-autonomous-agents/</link><guid isPermaLink="true">https://alexlavaee.me/blog/claws-layer-autonomous-agents/</guid><description>Karpathy just named the layer most engineers are missing: Claws. Here&apos;s the data behind it, and how to start building it today.</description><pubDate>Mon, 23 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.s-eW-AS1.png&quot; alt=&quot;If Your Claws Aren&apos;t Out, You&apos;re Already Falling Behind&quot; /&gt;&lt;p&gt;Karpathy just named the layer most engineers are missing: Claws. 
Here&apos;s the data behind it, and how to start building it today.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/claws-layer-autonomous-agents/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Gemini 3.1 Pro, Opus 4.6, and Codex 5.3: A Technical Breakdown of Three Models, Three #1 Positions</title><link>https://alexlavaee.me/blog/gemini-3-1-pro-opus-codex-technical-comparison/</link><guid isPermaLink="true">https://alexlavaee.me/blog/gemini-3-1-pro-opus-codex-technical-comparison/</guid><description>Google just reclaimed #1 on SWE-Bench Verified with Gemini 3.1 Pro. But Codex still leads terminal work, and Claude still leads real-world preference. Here&apos;s what&apos;s technically different about each model—and what engineers are actually experiencing.</description><pubDate>Thu, 19 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.Dn_j63h5.png&quot; alt=&quot;Gemini 3.1 Pro, Opus 4.6, and Codex 5.3: A Technical Breakdown of Three Models, Three #1 Positions&quot; /&gt;&lt;p&gt;Google just reclaimed #1 on SWE-Bench Verified with Gemini 3.1 Pro. But Codex still leads terminal work, and Claude still leads real-world preference. Here&apos;s what&apos;s technically different about each model—and what engineers are actually experiencing.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/gemini-3-1-pro-opus-codex-technical-comparison/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>The New SDLC: A Practical Guide to Agentic Engineering</title><link>https://alexlavaee.me/blog/new-sdlc-agentic-engineering/</link><guid isPermaLink="true">https://alexlavaee.me/blog/new-sdlc-agentic-engineering/</guid><description>Coding is practically solved. The engineer&apos;s job is shifting from writing code to designing systems, writing specs, and orchestrating agents. 
Here&apos;s what the new software development lifecycle looks like and how to adopt it today.</description><pubDate>Wed, 18 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.CZzdjfSW.png&quot; alt=&quot;The New SDLC: A Practical Guide to Agentic Engineering&quot; /&gt;&lt;p&gt;Coding is practically solved. The engineer&apos;s job is shifting from writing code to designing systems, writing specs, and orchestrating agents. Here&apos;s what the new software development lifecycle looks like and how to adopt it today.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/new-sdlc-agentic-engineering/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Claude Sonnet 4.6: What Developers Actually Need to Know</title><link>https://alexlavaee.me/blog/sonnet-4-6-technical-breakdown/</link><guid isPermaLink="true">https://alexlavaee.me/blog/sonnet-4-6-technical-breakdown/</guid><description>Sonnet 4.6 scores within 1.2 points of Opus 4.6 on SWE-bench at roughly 60% of the cost. We break down the benchmarks, architecture changes, pricing math, developer reactions, and what it means for your agentic workflows.</description><pubDate>Tue, 17 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.DhmtH1cM.png&quot; alt=&quot;Claude Sonnet 4.6: What Developers Actually Need to Know&quot; /&gt;&lt;p&gt;Sonnet 4.6 scores within 1.2 points of Opus 4.6 on SWE-bench at roughly 60% of the cost. 
We break down the benchmarks, architecture changes, pricing math, developer reactions, and what it means for your agentic workflows.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/sonnet-4-6-technical-breakdown/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Google DeepMind&apos;s Delegation Framework for Coding Agent Architecture</title><link>https://alexlavaee.me/blog/intelligent-agent-delegation/</link><guid isPermaLink="true">https://alexlavaee.me/blog/intelligent-agent-delegation/</guid><description>Google DeepMind&apos;s new paper formalizes delegation as more than task decomposition — it&apos;s a transfer of authority, accountability, and trust. Here&apos;s what that means for how we build coding agents, with concrete patterns you can apply today.</description><pubDate>Mon, 16 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.B-D3swmq.png&quot; alt=&quot;Google DeepMind&apos;s Delegation Framework for Coding Agent Architecture&quot; /&gt;&lt;p&gt;Google DeepMind&apos;s new paper formalizes delegation as more than task decomposition — it&apos;s a transfer of authority, accountability, and trust. Here&apos;s what that means for how we build coding agents, with concrete patterns you can apply today.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/intelligent-agent-delegation/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Codex Spark and the Two-Mode Future of Coding Agents</title><link>https://alexlavaee.me/blog/codex-spark-speed-depth-modes/</link><guid isPermaLink="true">https://alexlavaee.me/blog/codex-spark-speed-depth-modes/</guid><description>OpenAI&apos;s Codex Spark trades intelligence for speed at 1,000+ tokens/sec on Cerebras hardware. 
The real story isn&apos;t the model—it&apos;s the infrastructure overhaul and the emerging split between speed mode and depth mode in coding agents.</description><pubDate>Thu, 12 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.y5zn6w6M.png&quot; alt=&quot;Codex Spark and the Two-Mode Future of Coding Agents&quot; /&gt;&lt;p&gt;OpenAI&apos;s Codex Spark trades intelligence for speed at 1,000+ tokens/sec on Cerebras hardware. The real story isn&apos;t the model—it&apos;s the infrastructure overhaul and the emerging split between speed mode and depth mode in coding agents.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/codex-spark-speed-depth-modes/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>GLM-5 and the Open Model Convergence</title><link>https://alexlavaee.me/blog/glm5-open-model-convergence/</link><guid isPermaLink="true">https://alexlavaee.me/blog/glm5-open-model-convergence/</guid><description>GLM-5 hit 77.8% on SWE-bench Verified under an MIT license. The benchmark gap between open and closed models is closing fast. Here&apos;s what that means for how you architect your coding agent infrastructure—and what to do about it.</description><pubDate>Thu, 12 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.DAtELm9U.png&quot; alt=&quot;GLM-5 and the Open Model Convergence&quot; /&gt;&lt;p&gt;GLM-5 hit 77.8% on SWE-bench Verified under an MIT license. The benchmark gap between open and closed models is closing fast. 
Here&apos;s what that means for how you architect your coding agent infrastructure—and what to do about it.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/glm5-open-model-convergence/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>OpenAI&apos;s Agent-First Codebase Learnings</title><link>https://alexlavaee.me/blog/openai-agent-first-codebase-learnings/</link><guid isPermaLink="true">https://alexlavaee.me/blog/openai-agent-first-codebase-learnings/</guid><description>OpenAI shipped a million lines of code with zero human-written code. The engineering patterns they discovered—progressive disclosure, layered architecture, feedback loops—are patterns you can adopt today. Here&apos;s a practical breakdown.</description><pubDate>Wed, 11 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.I9-Syhqq.png&quot; alt=&quot;OpenAI&apos;s Agent-First Codebase Learnings&quot; /&gt;&lt;p&gt;OpenAI shipped a million lines of code with zero human-written code. The engineering patterns they discovered—progressive disclosure, layered architecture, feedback loops—are patterns you can adopt today. Here&apos;s a practical breakdown.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/openai-agent-first-codebase-learnings/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Five Architectural Primitives Every Agent Swarm Rediscovers</title><link>https://alexlavaee.me/blog/five-primitives-agent-swarms/</link><guid isPermaLink="true">https://alexlavaee.me/blog/five-primitives-agent-swarms/</guid><description>Cursor ran thousands of agents to build a browser. Anthropic ran 16 to build a C compiler. Both independently converged on the same five design patterns. 
Here&apos;s the technical breakdown of why, and how you can apply them.</description><pubDate>Tue, 10 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.B0RtEU-w.png&quot; alt=&quot;Five Architectural Primitives Every Agent Swarm Rediscovers&quot; /&gt;&lt;p&gt;Cursor ran thousands of agents to build a browser. Anthropic ran 16 to build a C compiler. Both independently converged on the same five design patterns. Here&apos;s the technical breakdown of why, and how you can apply them.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/five-primitives-agent-swarms/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Building Self-Improving Coding Agents: How Factory&apos;s Signals Pipeline Closes the Feedback Loop</title><link>https://alexlavaee.me/blog/building-self-improving-coding-agents/</link><guid isPermaLink="true">https://alexlavaee.me/blog/building-self-improving-coding-agents/</guid><description>Factory&apos;s Signals system auto-resolves 73% of agent issues in under 4 hours using LLM judges, friction telemetry, and a closed-loop pipeline. Here&apos;s how it works and how you can adopt similar patterns in your own agent infrastructure.</description><pubDate>Mon, 09 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.B8OVgyP7.png&quot; alt=&quot;Building Self-Improving Coding Agents: How Factory&apos;s Signals Pipeline Closes the Feedback Loop&quot; /&gt;&lt;p&gt;Factory&apos;s Signals system auto-resolves 73% of agent issues in under 4 hours using LLM judges, friction telemetry, and a closed-loop pipeline. 
Here&apos;s how it works and how you can adopt similar patterns in your own agent infrastructure.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/building-self-improving-coding-agents/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Opus 4.6, GPT-5.3 Codex, Agent Teams, and Fleet Mode: What Developers Actually Need to Know</title><link>https://alexlavaee.me/blog/opus-codex-agent-teams-deep-dive/</link><guid isPermaLink="true">https://alexlavaee.me/blog/opus-codex-agent-teams-deep-dive/</guid><description>Four major AI releases dropped within 24 hours. Here&apos;s a technical deep dive into Opus 4.6, GPT-5.3 Codex, Claude Code&apos;s agent teams, and Copilot CLI&apos;s Fleet Mode—and how to start using them effectively.</description><pubDate>Thu, 05 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.qWk_MX4m.png&quot; alt=&quot;Opus 4.6, GPT-5.3 Codex, Agent Teams, and Fleet Mode: What Developers Actually Need to Know&quot; /&gt;&lt;p&gt;Four major AI releases dropped within 24 hours. Here&apos;s a technical deep dive into Opus 4.6, GPT-5.3 Codex, Claude Code&apos;s agent teams, and Copilot CLI&apos;s Fleet Mode—and how to start using them effectively.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/opus-codex-agent-teams-deep-dive/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Codex macOS: Orchestration-First Agent Desktop</title><link>https://alexlavaee.me/blog/codex-macos-orchestration-desktop/</link><guid isPermaLink="true">https://alexlavaee.me/blog/codex-macos-orchestration-desktop/</guid><description>I spent a week exploring OpenAI&apos;s new Codex macOS app. 
Here&apos;s what I learned about its orchestration-first approach, how it differs from the Claude workflow I&apos;ve grown attached to, and whether it&apos;s worth adding to your toolkit.</description><pubDate>Wed, 04 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.CdMl0kQx.png&quot; alt=&quot;Codex macOS: Orchestration-First Agent Desktop&quot; /&gt;&lt;p&gt;I spent a week exploring OpenAI&apos;s new Codex macOS app. Here&apos;s what I learned about its orchestration-first approach, how it differs from the Claude workflow I&apos;ve grown attached to, and whether it&apos;s worth adding to your toolkit.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/codex-macos-orchestration-desktop/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Agent-Operated CI/CD: The Architecture Making AI Coding Agents Actually Work</title><link>https://alexlavaee.me/blog/agent-operated-cicd-pipelines/</link><guid isPermaLink="true">https://alexlavaee.me/blog/agent-operated-cicd-pipelines/</guid><description>A practical guide to wiring AI coding agents into your CI/CD pipeline with GitHub Actions. Includes working configurations for Copilot Autofix, OpenAI Codex, and Claude Code with proper guardrails.</description><pubDate>Tue, 03 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.qLv2GmTh.png&quot; alt=&quot;Agent-Operated CI/CD: The Architecture Making AI Coding Agents Actually Work&quot; /&gt;&lt;p&gt;A practical guide to wiring AI coding agents into your CI/CD pipeline with GitHub Actions. 
Includes working configurations for Copilot Autofix, OpenAI Codex, and Claude Code with proper guardrails.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/agent-operated-cicd-pipelines/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Evolving Coding Agent Infrastructure: The Rise of the Meta-Framework Layer</title><link>https://alexlavaee.me/blog/evolving-coding-agent-infrastructure/</link><guid isPermaLink="true">https://alexlavaee.me/blog/evolving-coding-agent-infrastructure/</guid><description>How hooks, skills, and tool orchestration are transforming developer infrastructure. A deep dive into Claude Code&apos;s layered stack and why the most important code you write this year won&apos;t be features.</description><pubDate>Mon, 02 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;img src=&quot;https://alexlavaee.me/_astro/cover.CEBBmPQ3.png&quot; alt=&quot;Evolving Coding Agent Infrastructure: The Rise of the Meta-Framework Layer&quot; /&gt;&lt;p&gt;How hooks, skills, and tool orchestration are transforming developer infrastructure. A deep dive into Claude Code&apos;s layered stack and why the most important code you write this year won&apos;t be features.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/evolving-coding-agent-infrastructure/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Building AI Agents That Work at Any Scale</title><link>https://alexlavaee.me/blog/openai-data-agent-patterns/</link><guid isPermaLink="true">https://alexlavaee.me/blog/openai-data-agent-patterns/</guid><description>OpenAI built a data agent serving 3.5k users across 600 petabytes. The architectural patterns that made it work are the same ones that power a 3,000-line coding agent CLI.</description><pubDate>Thu, 29 Jan 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;OpenAI built a data agent serving 3.5k users across 600 petabytes. 
The architectural patterns that made it work are the same ones that power a 3,000-line coding agent CLI.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/openai-data-agent-patterns/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Atomic: Building Reliable AI Coding Agent Infrastructure</title><link>https://alexlavaee.me/blog/building-reliable-ai-coding-agent-infrastructure/</link><guid isPermaLink="true">https://alexlavaee.me/blog/building-reliable-ai-coding-agent-infrastructure/</guid><description>A technical guide to implementing procedural memory, specialized sub-agents, and autonomous ralph loops for AI coding assistants across platforms.</description><pubDate>Wed, 28 Jan 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;A technical guide to implementing procedural memory, specialized sub-agents, and autonomous ralph loops for AI coding assistants across platforms.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/building-reliable-ai-coding-agent-infrastructure/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Atomic: Automated Procedures and Memory for AI Coding Agents</title><link>https://alexlavaee.me/blog/atomic-workflow/</link><guid isPermaLink="true">https://alexlavaee.me/blog/atomic-workflow/</guid><description>Building on AI Coding Infrastructure, Atomic introduces a research-to-execution flywheel where specifications become lasting memory. Here&apos;s what we learned scaling multi-agent workflows.</description><pubDate>Mon, 08 Dec 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Building on AI Coding Infrastructure, Atomic introduces a research-to-execution flywheel where specifications become lasting memory.
Here&apos;s what we learned scaling multi-agent workflows.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/atomic-workflow/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>How I Shipped 100k LOC in 2 Weeks with Coding Agents</title><link>https://alexlavaee.me/blog/ai-coding-infrastructure/</link><guid isPermaLink="true">https://alexlavaee.me/blog/ai-coding-infrastructure/</guid><description>Open sourcing my developer workflow with AI agents—skills, sub-agents, and autonomous execution. A 5-minute setup that provides the missing infrastructure layer for AI coding tools.</description><pubDate>Wed, 12 Nov 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Open sourcing my developer workflow with AI agents—skills, sub-agents, and autonomous execution. A 5-minute setup that provides the missing infrastructure layer for AI coding tools.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/ai-coding-infrastructure/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Continuous Self-Learning in AI Agents</title><link>https://alexlavaee.me/blog/self-evolving-llm-agents/</link><guid isPermaLink="true">https://alexlavaee.me/blog/self-evolving-llm-agents/</guid><description>An overview of two frameworks for memory and context management to enable continuous self-learning systems.</description><pubDate>Mon, 10 Nov 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;An overview of two frameworks for memory and context management to enable continuous self-learning systems.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/self-evolving-llm-agents/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Context Engineering Navigator</title><link>https://alexlavaee.me/blog/context-engineering-cheat-sheet/</link><guid isPermaLink="true">https://alexlavaee.me/blog/context-engineering-cheat-sheet/</guid><description>An interactive cheat
sheet covering context engineering techniques for LLMs including retrieval, processing, management, and dynamic assembly strategies.</description><pubDate>Fri, 19 Sep 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;An interactive cheat sheet covering context engineering techniques for LLMs including retrieval, processing, management, and dynamic assembly strategies.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/context-engineering-cheat-sheet/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Building Products with Agentic-Powered IDEs</title><link>https://alexlavaee.me/blog/context-engineering-ai-ides/</link><guid isPermaLink="true">https://alexlavaee.me/blog/context-engineering-ai-ides/</guid><description>How context engineering transforms AI-powered development tools from disappointing to transformative through smart prompting, MCP servers, and strategic tool integration.</description><pubDate>Wed, 23 Jul 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;How context engineering transforms AI-powered development tools from disappointing to transformative through smart prompting, MCP servers, and strategic tool integration.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://alexlavaee.me/blog/context-engineering-ai-ides/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title>Memorization, Generalization, and Reasoning</title><link>https://alexlavaee.me/blog/memorization-generalization-and-reasoning/</link><guid isPermaLink="true">https://alexlavaee.me/blog/memorization-generalization-and-reasoning/</guid><description>A deep dive into the concepts of memorization, generalization, and reasoning in large language models.</description><pubDate>Mon, 23 Jun 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;A deep dive into the concepts of memorization, generalization, and reasoning in large language models.&lt;/p&gt;&lt;p&gt;&lt;a 
href=&quot;https://alexlavaee.me/blog/memorization-generalization-and-reasoning/&quot;&gt;Read more on the blog →&lt;/a&gt;&lt;/p&gt;</content:encoded></item></channel></rss>