The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A new Google whitepaper argues that in AI-driven software development, the model itself accounts for only about 10% of the system’s behavior. The other 90% comes from harnesses, context engineering, and verification, which the paper treats as the true sources of value and control.

A new Google whitepaper titled The New SDLC With Vibe Coding emphasizes that the AI model accounts for only about 10% of the system’s behavior. Instead, the paper argues that harnesses, context engineering, and verification are where the real value and control lie, marking a significant shift in software engineering practices as AI becomes more integrated into development workflows.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, reports that as of early 2026, 85% of professional developers use AI coding agents regularly, with 51% using them daily. It states that roughly 41% of all new code is generated by AI, but emphasizes that the model itself is only a small part of the overall system. By the paper’s estimate, about 90% of an agent’s behavior is shaped by the harness: the prompts, rules, tools, and observability around the model. As evidence, it cites an agent that moved from outside the top 30 to the top 5 on Terminal Bench 2.0 by changing only the harness, with the same model. By contrast, the paper reports that changing the model alone had minimal impact.

At a glance
When: published May 2026
The development: A Google whitepaper introduces a new perspective on the SDLC, arguing that the AI model constitutes only a small part of the system, with most value lying in harnesses and context management.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified

Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
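
To make the test/eval distinction concrete, here is a minimal sketch in Python. It is an illustration, not the paper's method: `slugify`, `fake_summarize`, `EVAL_CASES`, and the 0.8 threshold are all hypothetical stand-ins. The test checks one exact answer; the eval scores many outputs against a loose rubric and passes on the aggregate.

```python
from typing import Callable, List, Tuple

def slugify(title: str) -> str:
    """Deterministic helper, standing in for agent-written code."""
    return "-".join(title.lower().split())

def test_slugify() -> bool:
    # Test: deterministic code has exactly one right answer.
    return slugify("Hello New SDLC") == "hello-new-sdlc"

def fake_summarize(text: str) -> str:
    # Hypothetical stand-in for a model call: returns the first sentence.
    return text.split(".")[0] + "."

# Each eval case pairs an input with keywords a good summary should keep.
EVAL_CASES: List[Tuple[str, List[str]]] = [
    ("The harness shapes agent behavior. Models matter less.", ["harness"]),
    ("Evals score non-deterministic output. Tests cannot.", ["evals"]),
    ("Context engineering cuts token cost. It also helps first-pass success.",
     ["context", "token"]),
]

def eval_pass_rate(summarize: Callable[[str], str],
                   cases: List[Tuple[str, List[str]]]) -> float:
    # Eval: no single right answer, so score against a rubric
    # (keeps the keywords, is shorter than the input) and aggregate.
    passed = 0
    for text, keywords in cases:
        out = summarize(text).lower()
        if all(k in out for k in keywords) and len(out) < len(text):
            passed += 1
    return passed / len(cases)

def ci_gate(threshold: float = 0.8) -> bool:
    # The gate requires both: tests pass AND the eval score clears the bar.
    return test_slugify() and eval_pass_rate(fake_summarize, EVAL_CASES) >= threshold
```

In a real pipeline the rubric would be richer (a graded checklist or a judge model, for example), but the shape stays the same: an exact assertion for deterministic code, a thresholded score for everything else.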
The idea worth building your strategy around

Agent = Model + Harness
Model: ~10%
Harness: ~90% (prompts · tools · context · hooks · sandboxes · observability). This is your surface area, not the provider’s.

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
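
The "Agent = Model + Harness" framing can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Harness` and `Agent` classes, the `CALL <tool> <arg>` convention, and `stub_model` are hypothetical and not taken from the whitepaper or any real framework. The point is only that the same model behaves differently once the harness offers it a tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Model = Callable[[str], str]  # prompt in, text out

@dataclass
class Harness:
    system_prompt: str
    rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)  # minimal observability

    def build_prompt(self, task: str) -> str:
        # Context assembly: everything the model sees is decided here.
        tool_names = ", ".join(sorted(self.tools)) or "none"
        return "\n".join([self.system_prompt, *self.rules,
                          f"Tools: {tool_names}", f"Task: {task}"])

@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        reply = self.model(self.harness.build_prompt(task))
        # Hypothetical convention: "CALL <tool> <arg>" requests a tool.
        if reply.startswith("CALL "):
            name, _, arg = reply[5:].partition(" ")
            tool = self.harness.tools.get(name)
            reply = tool(arg) if tool else f"error: missing tool {name}"
        self.harness.log.append(reply)
        return reply

def stub_model(prompt: str) -> str:
    # Hypothetical model: uses the 'wc' tool only if the harness lists it.
    if "Tools: wc" in prompt:
        return "CALL wc one two three"
    return "I cannot count words."

bare = Agent(stub_model, Harness("You are a helper."))
tooled = Agent(stub_model, Harness("You are a helper.",
                                   tools={"wc": lambda s: str(len(s.split()))}))
```

Calling `bare.run("count words")` and `tooled.run("count words")` gives different answers from an identical model, which is the kind of "missing tool" configuration failure the quote above describes.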
The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding: low CapEx · high OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Over time it crosses over to costing 3–10× more per feature.

Agentic Engineering: high CapEx · low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success, plus intelligent model routing that sends the easy work to cheap models.
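
The crossover can be worked through with simple arithmetic. The sketch below assumes a linear cost model (total = CapEx + OpEx per feature × features), which is a simplification, and the numbers are hypothetical, in arbitrary units. The only figure borrowed from the paper is that the per-feature gap falls in the 3–10× range (5× here).

```python
import math
from typing import Optional, Tuple

Costs = Tuple[float, float]  # (capex, opex_per_feature)

def cumulative_cost(capex: float, opex_per_feature: float, features: int) -> float:
    # Linear model: pay capex once, then opex for every feature shipped.
    return capex + opex_per_feature * features

def crossover_feature_count(vibe: Costs, agentic: Costs) -> Optional[int]:
    """Smallest feature count at which the agentic approach is strictly cheaper."""
    (capex_v, opex_v), (capex_a, opex_a) = vibe, agentic
    if opex_v <= opex_a:
        return None  # agentic never catches up under this model
    # Solve capex_a + opex_a * n < capex_v + opex_v * n for n.
    break_even = (capex_a - capex_v) / (opex_v - opex_a)
    return max(0, math.floor(break_even) + 1)

# Hypothetical inputs: no upfront cost vs. 4000 units of specs, evals, and context.
VIBE: Costs = (0.0, 500.0)
AGENTIC: Costs = (4000.0, 100.0)
```

With these made-up inputs the two approaches cost the same at 10 features, and the agentic approach is cheaper from the 11th feature onward. Real costs are unlikely to be linear, so treat the result as a way to reason about the trade-off, not as a forecast.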
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.
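
Routing across models can start as a small rule-based function that lives in your own harness. The sketch below is an assumption-heavy illustration: the tier names, prices, keyword hints, and token limit are all hypothetical and do not refer to any real provider or to the paper's own routing logic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical price, arbitrary units

CHEAP = ModelTier("small-model", 0.10)
STRONG = ModelTier("large-model", 2.00)

# Hypothetical signals that a task needs the stronger model.
HARD_HINTS = ("refactor", "architecture", "security", "migration")

def route(task: str, context_tokens: int, token_limit: int = 4000) -> ModelTier:
    """Send easy, small-context work to the cheap tier; escalate the rest."""
    text = task.lower()
    if context_tokens > token_limit or any(hint in text for hint in HARD_HINTS):
        return STRONG
    return CHEAP

def estimated_cost(tier: ModelTier, tokens: int) -> float:
    return tier.cost_per_1k_tokens * tokens / 1000
```

For example, `route("rename a variable", 500)` returns the cheap tier, while `route("Refactor the auth module", 500)` escalates. Because the function depends only on your own tier definitions, swapping providers means editing two constants, not the workflow.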

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

Why Focus on Harnesses and Context Matters

This shift redefines where development teams should invest their resources. Instead of chasing the latest model advancements, organizations should prioritize building robust harnesses, optimizing context management, and verifying outputs. This approach can lead to lower costs, better security, and more reliable AI systems, fundamentally changing how AI-driven software is built and maintained.

Evolution of AI in Software Development

The whitepaper arrives amid a surge in AI adoption: by early 2026, a majority of developers had integrated AI tools into their workflows. It challenges the common assumption that the model’s capabilities are the primary driver of AI performance. Earlier practice focused heavily on model improvements, but the evidence the paper cites suggests that configuration, scaffolding, and context engineering have a greater impact. Its Terminal Bench 2.0 example, where a harness-only change lifted an agent from outside the top 30 to the top 5, points the same way: adjustments to prompts and tools can matter more than a model upgrade.

“The biggest shift isn’t a new language or framework; it’s moving from writing code to expressing intent and trusting machines to turn that into working software.”

— Addy Osmani

Unanswered Questions About Implementation and Impact

While the whitepaper presents compelling data and experiments, it remains unclear how universally applicable these findings are across different industries and scales. Specific strategies for building effective harnesses and context management at enterprise levels are still being developed, and real-world case studies are limited. Additionally, the long-term implications for AI model development priorities are still evolving.

Next Steps for Developers and Organizations

Organizations should begin evaluating their current AI workflows, focusing on harness design, context management, and verification processes. Future research and industry collaborations are expected to refine best practices for scalable harness construction. Monitoring how these principles influence AI system reliability, cost, and security will be key in the coming months.
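
One way to begin that evaluation is to record a few numbers per agent run and track them over time. The sketch below is a minimal illustration; the `RunRecord` fields and the three metrics are assumptions chosen for this example, not metrics prescribed by the whitepaper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class RunRecord:
    task_id: str
    tokens_used: int           # total tokens across all attempts
    attempts: int              # 1 means the first pass was accepted
    passed_verification: bool  # tests and evals both passed

def summarize(runs: List[RunRecord]) -> Dict[str, Optional[float]]:
    """Aggregate reliability and cost signals for a batch of agent runs."""
    if not runs:
        return {"first_pass_rate": None, "verified_rate": None,
                "tokens_per_verified_task": None}
    verified = [r for r in runs if r.passed_verification]
    first_pass = sum(1 for r in verified if r.attempts == 1)
    total_tokens = sum(r.tokens_used for r in runs)  # failed runs still cost tokens
    return {
        "first_pass_rate": first_pass / len(runs),
        "verified_rate": len(verified) / len(runs),
        "tokens_per_verified_task": total_tokens / len(verified) if verified else None,
    }

# Hypothetical sample data.
SAMPLE_RUNS = [
    RunRecord("a", 1000, 1, True),
    RunRecord("b", 3000, 3, True),
    RunRecord("c", 3000, 2, False),
    RunRecord("d", 2000, 1, True),
]
```

On the sample data, half the tasks pass on the first attempt and each verified task costs 3000 tokens on average once failed runs are counted. Watching how those numbers move after a harness change gives a rough read on whether the change helped.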

Key Questions

Why is the model only 10% of the system’s behavior?

The whitepaper argues, citing experiments, that the harness (prompts, rules, tools, and context management) plays a much larger role in shaping AI behavior than the model itself, which is why it recommends shifting focus there.

How does this change the way organizations should develop AI systems?

Instead of investing primarily in acquiring or upgrading models, organizations should prioritize building robust harnesses, optimizing context loading, and verifying outputs. This can reduce costs and improve system reliability.

What are harnesses and why are they so important?

Harnesses include prompts, rules, tools, and observability mechanisms that control and shape AI behavior. They are crucial because they determine how effectively an AI system performs, often more than the underlying model.

Are these findings applicable to all AI development projects?

The whitepaper presents strong evidence, but applicability may vary based on project scale, industry, and existing infrastructure. More case studies are needed to confirm universal relevance.

What should developers focus on first based on this new insight?

Developers should start by evaluating their current harnesses and context management strategies, then experiment with improvements to reduce costs and increase reliability.

Source: ThorstenMeyerAI.com
