A Method for Rapid Product Development Using AI Agents
AI‑accelerated software development and the infrastructure that makes it repeatable at scale.
Summary
This post outlines a practical, repeatable method for using AI agents to accelerate product development from concept to deployable software. I have applied this methodology across multiple projects over the past 18 months, iterating on the approach until it became broadly applicable. The gains have been material: producing MVPs and proofs‑of‑concept in hours rather than the days or weeks they required only a couple of years ago, and achieving even larger speed‑ups on subsequent iterations as the codebase, documentation, and tests provide rich context for the agents to work with.
The method is deliberately simple. It emphasizes high‑quality inputs, tight feedback loops, disciplined test generation, and tool integration that lets teams adopt the latest models with minimal disruption. It scales from a single engineer to a team and preserves quality while moving fast.
What This Method Optimizes
The methodology focuses on four outcomes.
Going from concept to a deployable product. The process starts with concise but complete documentation, then moves through planning, implementation, testing, and review—each step producing tangible artifacts the next step consumes—until there is a working, deployable build.
Speed without sacrificing quality. Some people believe moving quickly to meet aggressive deadlines requires cutting corners on design, performance, and code structure. I do not. With the right prompts and agent workflows, models can adhere to architecture guidelines, implement cleanly decomposed components, and maintain readability and extensibility. Quality here is not accidental; it is a product of deliberate prompting and disciplined outputs.
Repeatability. The same project can be advanced through successive cycles, and the same cycle can be re‑run against different models. Models vary in strengths and weaknesses; being able to compare outputs side‑by‑side—or combine the best parts across runs—raises quality and, when appropriate, makes the result less easily replicable. Repeatability also lowers the cost of course correction: revert the last commit, adjust prompts, and re‑run the agent to produce an alternative path.
Expandability from solo to team. Many demonstrations focus on a single developer. The hard part begins when multiple contributors collaborate with agents. Coordination issues, drift from architecture, and inconsistent prompting can derail teams. I will address team‑scale patterns, handoffs, and anti‑patterns in the next installment of this series.
A note on inputs: outcome quality is bounded by prompt quality. Even a capable agent will produce weak output when given sparse or ambiguous instructions. Good prompts are specific, structured, and traceable back to clear requirements.
Accelerating Development With AI Agents
This is an iterative, agent‑assisted process that moves a product from requirements to v1 and beyond, with the engineer acting more as a product manager and architect than as a coder. In practice this has accelerated concept‑to‑delivery by roughly 5–7× for initial iterations, with even larger gains on later iterations because the agents can leverage prior documentation, code, and tests as context.
The process is not static. Sustained advantages require continually evaluating new models from major vendors and emerging players, and having the agility to switch with minimal disruption to development and QA. This means avoiding lock‑in to agent frameworks that assume a fixed model, and instead treating the model as a swappable component behind a stable workflow.
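As an illustration, here is a minimal Python sketch of that separation. The names (ModelClient, WorkflowStep, run_step) are hypothetical, not any framework's API: each step reads markdown artifacts, calls whatever model sits behind a thin interface, and writes the next artifact.

```python
# Minimal sketch of a model-agnostic workflow step (hypothetical names, no vendor SDK).
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


class ModelClient(Protocol):
    """Anything that can turn a prompt into text; each vendor gets a small adapter."""
    def complete(self, prompt: str) -> str: ...


@dataclass
class WorkflowStep:
    name: str          # e.g. "planning"
    prompt_file: str   # e.g. "prompts/plan.md"
    inputs: list[str]  # e.g. ["requirements.md", "architecture.md"]
    output: str        # e.g. "plan.md"


def run_step(step: WorkflowStep, model: ModelClient) -> None:
    # Concatenate the step prompt with the markdown artifacts it consumes,
    # call the model, and persist the result as the next artifact in the chain.
    parts = [Path(p).read_text(encoding="utf-8") for p in [step.prompt_file, *step.inputs]]
    Path(step.output).write_text(model.complete("\n\n---\n\n".join(parts)), encoding="utf-8")
```

Switching providers then means writing one new adapter that satisfies ModelClient; the prompts, artifacts, and QA steps stay untouched.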
High Level Overview
At its core, the method applies closed‑loop agent workflows across a handful of simple phases. Each phase has explicit inputs and expected outputs. You do not need heavyweight frameworks; you need clarity and consistency.
Integrate agents where the team already works. Productivity comes from using LLMs via IDE plugins and exposing additional capabilities through MCP servers. The right integrations let agents run tests, execute scaffolding commands, and comment on pull requests without leaving the developer’s environment.
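For example, a project-specific capability such as running the test suite can be exposed to agents through a small MCP server. The sketch below assumes the official Python MCP SDK (the `mcp` package) and pytest; adapt tool names and commands to your own project.

```python
# Sketch of an MCP server exposing a "run_tests" tool (assumes the `mcp` Python SDK and pytest).
import subprocess

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("project-tools")


@mcp.tool()
def run_tests(path: str = "tests") -> str:
    """Run the project's test suite and return the combined output for the agent to inspect."""
    result = subprocess.run(["pytest", path, "-q"], capture_output=True, text=True)
    return result.stdout + result.stderr


if __name__ == "__main__":
    mcp.run()  # serve over stdio so an MCP-capable IDE or agent can call the tool
```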
The process begins with a concise set of foundational documents describing product requirements, design intent, expected behavior, preferred technologies, architecture, scalability, and deployment. These do not need to be perfect on day one. Draft high‑level versions by hand, then iterate with AI assistance for brainstorming, validation, and assumption testing. This step is decisive; rushing it invites ambiguity that compounds later.
Use Markdown Format
Use markdown for both prompts and document outputs. Markdown is easy to read and diff, comfortable for engineers, familiar to most models (which are often trained on documents in this format), and supported by common tools. The format cleanly represents structure, highlights key points, and includes code blocks or pseudocode where needed. Google Docs can now export to markdown, so you don’t have to trade editor convenience for the optimal format.
Each iteration follows the same pattern and can be repeated:
Requirements
Architecture and infrastructure setup (as needed, more important early; often skipped later)
Planning
Implementation
Testing
Reviews
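Each phase consumes the artifact produced by the one before it, so a single iteration leaves a traceable chain of files: requirements.md (plus architecture.md when present) → plan.md → tasks.md → code and tests on a task branch → review comments on the pull request. The file names are a convention, not a requirement.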
Features and tests advance together. Application code generation and test generation must move in lockstep. Tests are created or updated whenever code changes. In practice, I maintain two concurrent prompt streams derived from the same requirements and planning step: one drives product feature implementation, the other test specs and execution. The result is a feedback loop that both pushes development forward and enforces correctness.
Software quality and correctness still require discipline. Modern agentic workflows can produce remarkably good code quickly, but they are not infallible. You will still see duplication, oversized functions, and occasional divergence from intended architecture. Left unchecked, these issues accumulate across iterations. The cycle below builds in checks to prevent that drift.
Development Cycle - Process Specifics
Think of a cycle as a focused increment, similar to an agile sprint. The typical outcome is a completed feature, but the same cycle can jumpstart a new project, produce an MVP, or deliver a proof‑of‑concept. Treat this as a template, not a rigid recipe: skip steps, refine scope, or add checkpoints as needed. The separation into steps exists because agents perform best when each request concentrates on one objective at a time.
Every step produces a file that becomes input to the next. Markdown remains the standard format for prompts and outputs. It’s plain text, easy to author and review, and makes structure and emphasis clear to both humans and models.
Important
The steps below describe the meaning and intent of the process.
The prompt wording shown is an example and can be further refined and expanded to provide additional clarity and effectiveness with advanced reasoning models. Treat the prompts as examples, not final products.
1) Requirements
Capture high‑level requirements that define the expected behavior of the system—what it should do. Use case narratives are often effective here: for each user action or input, describe the expected output or side effect. Include a feature list and UX specifications so the agent can design to what matters. Store this in requirements.md.
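For illustration, a hypothetical excerpt from requirements.md for a simple note‑taking product, with short requirement IDs that the plan can reference later:

- R1. A visitor can create an account with an email address and password; the account stays inactive until the confirmation email is accepted.
- R2. A signed‑in user can create, edit, and archive notes; archived notes are excluded from search by default.
- R3. Non‑functional: search over up to 10,000 notes per user returns within 500 ms.
- UX: the note editor autosaves every 10 seconds and clearly indicates unsaved changes.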
When ready to generate a plan, instruct the agent explicitly not to write code yet and to produce a single output file, plan.md, that explains how the product will meet the requirements. A refined prompt:
You are planning an implementation, not writing code.
Inputs:
- requirements.md (authoritative functional and non-functional requirements)
- architecture.md (optional; treat as constraints and preferences)
Task:
- Produce **plan.md** with a stepwise implementation plan that explains how each requirement will be satisfied.
- For every plan step, reference the specific requirement IDs it covers.
- Include a brief rationale where choices are not obvious.
- Add a traceability table mapping `requirement → plan step(s)`.
Constraints:
- Do **not** generate any application code.
- Validate completeness by listing any requirements not yet addressed and proposing how to handle them in the next iteration.
2) Architecture
This optional step matters most when starting a new project or when you want to steer technical choices. Capture the preferred tech stack, frameworks, key components, cloud provider or vendor, and the specific hosting services you want to use in a separate file, typically called architecture.md. A concise example covers programming languages; core frameworks (for example, React, Flask, Django, or Spring) and libraries; descriptions of major application components such as loaders, parsers, transformers, and repositories; and deployment choices such as Vercel, Firebase, or Amplify.
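For illustration, a hypothetical architecture.md excerpt in that spirit:

- Languages: TypeScript (frontend), Python (backend)
- Frameworks and libraries: React for the UI; Flask for the API
- Components: loaders for ingestion, parsers, transformers, and a repository layer over the database
- Deployment: Vercel for the frontend, Firebase for authentication and data; separate local/dev/prod configurations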
Most capable agents can scaffold initial infrastructure by generating configuration files and shell commands to set up the application, connect to services, and perform a first deployment. If you maintain architecture.md, extend the earlier prompt as follows:
In addition to requirements.md, read **architecture.md** and treat it as a constraint set.
- Conform to specified languages, frameworks, and component boundaries.
- Where feasible, include shell commands or scripts to scaffold the project and integrate external services.
- Reflect deployment targets and environment configuration (local/dev/prod) in the plan.
3) Planning
By now, the agent has produced plan.md. The point of this step is to pause and inspect the proposed work before any code is written. Review the plan, add details, re‑order steps, or remove unnecessary work. If you missed something in the earlier documents, update requirements.md and/or architecture.md and regenerate the plan. Iterate until plan.md reflects what you intend to build.
When the plan is ready, ask the agent to convert it into a concrete task list in tasks.md.
Use this sample prompt:
Generate **tasks.md** from **plan.md**.
Task:
- Create a complete, ordered set of implementation tasks.
- For each task, provide:
1. What will be implemented and how it will be implemented.
2. Inputs, triggering actions, and expected outputs.
3. Dependencies on other tasks.
4. Acceptance criteria and **detailed testing instructions** (unit tests and, when applicable, integration tests).
5. Cross-references to the relevant sections of **plan.md**.
Constraints:
- Do **not** generate any application code in this step.
- Validate correctness and completeness; if gaps exist, propose follow-up tasks.
4) Tasks
You now have tasks.md. Review it for clarity and ordering. If you don’t like the result, refine plan.md and regenerate. Implementation is where the agent generates code and produces a runnable artifact.
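For illustration, a single hypothetical entry in tasks.md following that structure (continuing the note‑taking example):

Task 3 – Implement note search endpoint
1. What and how: add a search endpoint to the API that queries the repository layer with a case‑insensitive match.
2. Inputs/outputs: a query string as input; a JSON list of matching notes as output, excluding archived notes by default.
3. Dependencies: Task 1 (data model), Task 2 (repository layer).
4. Acceptance criteria and testing: unit tests for the repository query; an integration test asserting archived notes are excluded; covers R2 and R3.
5. Plan reference: plan.md, search step.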
Implement tasks one at a time in the order defined in tasks.md. Focus is your ally: when multiple unrelated tasks are combined into a single request, hallucinations and omissions become more likely and the quality of the generated code drops.
It’s a good practice to create a separate Git branch for each task. This keeps work isolated, makes rollback trivial, and lets you run the agent multiple times or apply tactical manual edits using faster, lower‑level models after the agent has done the heavy lifting. It also enables targeted AI‑assisted code reviews on a per‑task basis before merging to the main branch. Agents like OpenAI Codex do this automatically as part of their standard workflow.
5) Testing
Ensure each task includes explicit testing instructions; these instructions are used to generate test suites. Unless the work is a disposable proof‑of‑concept, maintain a comprehensive test suite alongside the main codebase. This pays for itself by preventing regressions and lowering maintenance costs.
The task implementation prompt should require the agent to generate unit tests—and integration tests when applicable—alongside the code. In my experience, tests generated in the same run as the implementation are more complete because the agent holds the full context. Where your tools allow it, instruct the agent to run the entire test suite after generating code, detect runtime or startup errors, and fix them within the same iteration. This practice helps to redirect your time from debugging code toward building features, which is much more productive.
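A prompt fragment in this spirit can be appended to each task’s implementation request (wording illustrative; refine it for your tooling):

For the task you are implementing:
- Generate unit tests for every acceptance criterion and integration tests where external services are involved.
- Run the full test suite after generating code; report and fix any failures, runtime errors, or startup errors within this iteration.
- Do not weaken or delete existing tests to make them pass; flag suspected regressions instead.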
6) Review
Every task should go through code review.
GitHub makes this easy and integrates with multiple review agents. You control which model reviews the code; choose an advanced one and make sure it respects the guidance in AGENTS.md (or a similar file) that captures architecture and style conventions from architecture.md.
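A hypothetical AGENTS.md excerpt distilled from architecture.md and the team’s conventions might read:

- Respect the component boundaries defined in architecture.md; data access goes through the repository layer only.
- Prefer small, single‑purpose functions; flag duplication of existing utilities instead of adding more.
- Every code change must include or update tests; the full suite must pass before merge.
- Call out any deviation from the declared frameworks or deployment targets in review comments.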
Automated code reviews are a checkpoint to catch bugs and prevent incremental changes from degrading structure or drifting from the architecture. They do not replace human review. Treat the agent like a fast and consistent teammate whose job is to find mistakes and enforce standards; then apply human judgment on top.
Practical Notes on Tooling
Agents perform best when tethered to the developer’s workflow. IDE plugins allow inline generation and refactoring, MCP servers expose environment- and project‑specific tools, and CI hooks run tests or linters automatically. Keep the model abstracted behind these integrations so you can switch providers as capabilities shift. The goal is to make the workflow stable even as the model changes.
Why This Works
The method compresses cycle time by reducing ambiguity early, enforcing traceability between requirements, plan, tasks, and tests, and letting agents operate in well‑bounded scopes. Markdown artifacts create a durable, inspectable record at each step, enabling easy review, rollback, and re‑runs against different models. As the project matures, the agents benefit from richer context—documentation, code, and tests—which compounds speed in later iterations.
Bonus Capabilities
This workflow enables a few additional advantages that were impractical before agent‑assisted development.
You can rerun the workflow with different agents and models, feeding in the same input .md files (requirements.md, architecture.md, plan.md, and so on) to compare outcomes at modest incremental cost. You might run one pass with OpenAI’s GPT‑5.1, another with Gemini 3, and another with Anthropic’s Opus or Sonnet, then choose the best output or merge the strongest parts into the final product. Before agents, achieving the same comparison would have required hiring several different teams to build parallel versions, multiplying cost and calendar time for a similar insight.
Agents also constantly improve. If you revisit the product six months later with newer model versions, the results often improve again. Data portability between versions matters here, as do comprehensive test suites to validate compatibility and catch regressions quickly.
Conclusion and What’s Next
This is a pragmatic, agent‑driven development cycle that moves a product from idea to working software quickly while keeping quality front and center. It scales from a solo builder to a team, encourages repeatability, and treats the model as a swappable component behind a stable workflow.
In the next installment, I will focus on expanding this process from an individual agent‑assisted developer to a full team: collaboration patterns that work, handoff protocols, prompt management at scale, and the failure modes that derail multi‑developer agentic projects—and how to avoid them.
Original article has been posted on my website: https://olegkozlov.dev/posts/agent-driven-development-1


