Open by Default, Reliable by Subscription

How Apuna builds websites with AI — and how you can audit every step

By JTO · with Ogilvy, an Apuna AI agent

Published June 16, 2026

Most AI-development engagements fail the same test: when something goes wrong, there is no trail. No record of what the model produced, who reviewed it, what was verified, and who gave the final instruction to ship. Responsibility disperses.

This paper describes Apuna's method: a documented build loop in which AI agents propose, a multi-agent review panel evaluates, a verification gate checks every concrete claim against the rendered output, and a human greenlights each change before it ships. The build pipeline is open-source (Apache-2.0); the subscription — Apuna Care — delivers the infrastructure, the specialist time, and the accountability that comes from a human who signs off.

The argument does not rest on case studies or invented metrics. The proof is reflexive: Apuna's own public website was built by this method, and the method is publicly auditable. This paper is itself a product of it — co-authored by a human and an AI agent, disclosed as such.

The Problem: AI Development Without a Paper Trail

There is a conversation most companies have had by now. Someone proposes using AI to build or maintain a digital property — a website, a tool, an integration. The proposal is appealing: faster delivery, lower marginal cost, a crew that works outside business hours. The work proceeds. Something ships.

Then something goes wrong.

It may be small: a claim on the page that is not quite accurate. A section that contradicts the legal copy two pages down. A feature described in the navigation that does not exist in the product. When someone asks how it got there, the answer is usually a variation on: the model produced it, someone reviewed it, it seemed fine at the time.

The problem is not that the model got it wrong. Models get things wrong; so do people. The problem is the audit trail — or rather, the absence of one. There is no record of what the model was asked, what it returned, who reviewed the output, what the review checked, and who gave the final instruction to publish. Accountability disperses. The failure has no clean owner.

This is not an AI problem. It is a process problem. It exists, in nearly identical form, in any workflow where review is informal and sign-off is implicit. AI makes it more visible — and more consequential — because the volume and speed of output exceed what informal review can cover.

The solution is also not new: a documented loop with a human at the load-bearing point. Not in a ceremonial role — an approval checkbox that everyone treats as a formality — but in a structural one: a human who sees the rendered output, checks it against what was specified, and decides, explicitly, that it may ship.

The rest of this paper describes how Apuna built that loop, why it is auditable rather than just described, and what a buyer actually receives when she engages it.

The Method: A Loop with a Human at the Load-Bearing Point

The build loop is small, documented, and repeatable. It runs the same way on day one and day ninety — through the build and through maintenance — because the discipline is structural, not aspirational.

The loop has six steps:

1. Brief to backlog. A short intake converts the engagement into atomic units of work: one section, one component, one fix. Nothing large enough to be opaque. The atomicity is deliberate — a small change can be verified; a large one can only be trusted.

2. Atomic pull requests. The crew works the backlog as a stream of small, self-contained changes. Each pull request is one thing. This is not an efficiency choice; it is an accountability choice. A one-thing PR can be checked. A multi-thing PR diffuses responsibility across its contents.

3. Round-table review. Before any PR is offered for sign-off, the /meeting panel scores competing approaches. The fittest candidate advances. The first idea is not automatically the last — this is the variation-and-selection discipline, applied to copy and code alike.

4. Fact-check and verify gate. A verification step checks every concrete claim in the PR against the actual diff and the rendered page — not the description of the page, the page itself. Fabricated technical claims are caught here. A feature asserted but not built. A section described but not present. The gate does not pass what cannot be found.

5. Human greenlight. No PR merges or deploys without an explicit human decision on the rendered output. The principle is stated in the Constitution as §8: read the page, not the diff. The human's role is not ceremonial; it is the one check the AI crew cannot perform for itself. More on this in the next section.

6. Merge, deploy, iterate. What has been greenlighted ships. The loop begins again.

The loop does not change between build and maintenance. A patch on day ninety-three passes the same gates as the first section on day one. The discipline is the same because the accountability question is the same.
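The gating in steps 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from apuna/core: the `PullRequest` record, its fields, and the three functions are hypothetical names chosen for this example, and the claim check is a naive substring match standing in for whatever the real gate does. The point it shows is structural: merge is unreachable unless the verify gate has passed and a named human has decided.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PullRequest:
    """Hypothetical record for one atomic change (not the apuna/core schema)."""
    title: str
    claims: List[str]                  # concrete phrases the PR says are on the page
    rendered_page: str                 # text of the page as a visitor would see it
    missing_claims: List[str] = field(default_factory=list)
    verified: bool = False
    greenlit_by: Optional[str] = None  # name of the human who approved, if any


def verify_gate(pr: PullRequest) -> bool:
    """Step 4: every claim must be found in the rendered page itself."""
    page = pr.rendered_page.lower()
    pr.missing_claims = [c for c in pr.claims if c.lower() not in page]
    pr.verified = not pr.missing_claims
    return pr.verified


def greenlight(pr: PullRequest, reviewer: str) -> None:
    """Step 5: record an explicit decision by a named human."""
    if not pr.verified:
        raise PermissionError("verify gate has not passed")
    pr.greenlit_by = reviewer


def can_merge(pr: PullRequest) -> bool:
    """Step 6: only verified, human-greenlit changes may ship."""
    return pr.verified and pr.greenlit_by is not None


# Illustrative data only: the second claim is deliberately absent from the page.
pr = PullRequest(
    title="Add billing note",
    claims=["billed at pass-through", "live chat support"],
    rendered_page="Variable costs are billed at pass-through, zero markup.",
)
verify_gate(pr)  # records "live chat support" as missing, so the gate fails
```

Because `greenlight` refuses to record a decision on an unverified change, the human is never asked to sign off on a page whose claims could not be found, and `can_merge` stays false until both conditions hold.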

The Real Question: Is the Human Greenlight a Rubber Stamp?

There is an objection an experienced buyer will raise, and it deserves a direct answer: if the AI crew did all the work — drafted the PR, ran the review, passed the verify gate — and the human then approves output they cannot independently produce, is the greenlight real? Or is it a rubber stamp: answerable in name, not in fact?

The objection is valid against a greenlight that is purely notional. The loop is designed to prevent that. The reason comes from the Constitution's §8, stated bluntly: read the page, not the diff.

The technical meaning is this: the human's sign-off is on the rendered output as a stranger encounters it — not on the changed lines in a code-review interface. A reviewer who reads only the diff is checking internal consistency: whether the change was executed as intended. A reviewer who reads the page is checking something different: whether the result is what an actual person, arriving cold, would encounter and could use.

Those two things are not the same. The diff requires you to reconstruct what the page will look like. The rendered page is what it looks like. The human is the only party in this process who can stand where the stranger stands: she works from the page itself rather than from an internal representation of it, and no prior run shapes what she expects to see. That is not a small difference. It is the difference between verifying an intention and verifying a result.

This is the non-fungible contribution. The AI crew can check whether the code implements the spec. It cannot check whether the spec produces something a real person can navigate, read, and act on — because checking that requires being the real person. The human greenlight is where that verification enters the record.

There is a second dimension to the objection, which is less technical but more important. When a decision routes through a system — an AI, a process, a rule — and no named person stands behind it, the responsible party has made a choice: to appear to decide without actually deciding. To sign off on something generated without standing behind it. This is not a technological problem; it is an ethical one. An organization that routes binding decisions through AI without a visible, functional human checkpoint has not built a trustworthy process. It has built a well-disguised one.

The greenlight in Apuna's loop is not a rubber stamp because the act of reading the page — of standing where no AI can stand — is a verification the AI cannot perform for itself. And because the human who greenlights is the person who will be found, asked to account, and held to what shipped. That relation is what the word assurance actually means.

Why It Is Trustworthy: Open Code, Disclosed AI, Verified Claims

A buyer does not have to trust the pitch. She can read the source. Three properties make the method auditable rather than merely described.

Open-source pipeline. The build pipeline — apuna/core — is Apache-2.0. Every line that runs a build is readable. There is no proprietary process that must be taken on faith; a buyer, or her technical team, can inspect what runs. The open-source licence also removes lock-in: at the end of any engagement, the buyer holds every line of code under a licence that lets her keep it, fork it, and run it on her own infrastructure. The code is a public good. The subscription is something else entirely.

AI disclosed, never disguised. Every AI-authored or AI-assisted artifact carries visible attribution — agent name, AI status, role in the process. This paper carries it on its byline. The method does not hide what the model contributed and present it as unassisted human judgment. This is Constitution §4, verifiable in the repository. It is not a courtesy; it is a constraint.

Accuracy over completeness. Every claim in Apuna's public work must be traceable to a primary source within the repository. Unverifiable claims are removed, not hedged. This is Constitution §6. The buyer's ground-truth check is simple: can you find the source for this claim? If not, the claim should not be there. A shorter, verified statement is always preferable to a fuller, speculative one, because a verified statement can be trusted and acted on, and a speculative one cannot.

These three properties are not independent. They form a single argument: the buyer can inspect the process, see who contributed what, and verify that the claims on the page are traceable to something real. That is auditable. That is different from being told the process is good.
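Two of these properties lend themselves to a mechanical check. The Python sketch below assumes a hypothetical `Artifact` record carrying the attribution fields named above (agent name, AI status, role) and a list of claims that each point at a source path. The record layout, the `audit` function, and the example paths are assumptions made for illustration; the actual repository may represent attribution and sources differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Attribution:
    agent_name: str
    is_ai: bool
    role: str


@dataclass(frozen=True)
class Claim:
    text: str
    source_path: Optional[str]  # repository path of the primary source, if known


@dataclass(frozen=True)
class Artifact:
    name: str
    attribution: Optional[Attribution]
    claims: List[Claim]


def audit(artifact: Artifact, repo_paths: Set[str]) -> List[str]:
    """Return findings; an empty list means the artifact passes both checks."""
    findings = []
    if artifact.attribution is None:
        findings.append(f"{artifact.name}: missing attribution")
    for claim in artifact.claims:
        # A claim with no source, or a source not in the repository, is flagged.
        if claim.source_path not in repo_paths:
            findings.append(f"{artifact.name}: no source for claim {claim.text!r}")
    return findings


# Illustrative data: file names and the second claim are placeholders.
paper = Artifact(
    name="whitepaper",
    attribution=Attribution(agent_name="Ogilvy", is_ai=True, role="Artist"),
    claims=[
        Claim("The pipeline is Apache-2.0", "LICENSE"),
        Claim("An example claim with no source", None),
    ],
)
findings = audit(paper, repo_paths={"LICENSE", "CONSTITUTION.md"})
```

Under the rule stated above, a flagged claim would be removed rather than reworded, so a caller would treat any non-empty `findings` list as a reason to block publication.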

The Reflexive Proof: This Site Was Built This Way

The method is not an aspiration. It has an artifact.

Apuna's public website was built by the loop described in the Method section above: the same atomic pull requests, the same /meeting round-table review, the same fact-check and verify gate, the same human greenlight on the rendered output. The repository is public. The commit history is readable. The Constitution that governs the crew is committed to the repository and dated: adopted 2026-06-16.

This paper was produced by the same method: a human and an AI agent, in documented collaboration, with the AI's contribution openly identified on the byline. The method is the product is the proof.

The buyer is not reading about a future process. She is reading an output of the current one. The page she is reading was reviewed by the same panel, passed the same gate, and was greenlighted by the same human who will greenlight the work she is considering commissioning.

This is what it means for a proof to be reflexive rather than testimonial. A testimonial says: a client achieved this result. That may be true; it may also be selected, smoothed, and retrospectively tidied. A reflexive proof says: here is the output of the method, in front of you, auditable. The repository is the primary source. The commit history is the record. There are no case studies, no client names, no invented metrics — because the argument does not need them. It stands on the artifact the reader is already looking at.

What the Buyer Gets: Open by Default, Reliable by Subscription

There is a question an engineering buyer asks before the second meeting, rarely out loud: if the code is Apache-2.0 and I can read every line of it, what exactly am I paying for? It is the right question, and the honest answer determines whether this practice deserves the engagement.

The code is free. The build pipeline is Apache-2.0. The buyer can take every line at the end of an engagement and walk away. There is no proprietary cage, no lock-in, no knowledge deliberately withheld to create dependency. This is not a sales concession; it is the premise of the commercial model.

The subscription — Apuna Care — delivers something the code cannot supply. Infrastructure: the Cloudflare Workers environment, the model API access, the automation that runs the daily loop. Human-in-the-loop time: the specialist hours that review, greenlight, and handle the decisions a model should not make alone. And reliability: a named contact who knows the system, who will still be there when something breaks, and who is accountable for what ships. The margin is time and scale, not a markup on tokens. Variable costs — model API usage, infrastructure — are billed at pass-through, zero markup. That is stated plainly in the product specification and is auditable by the buyer.

Apuna Care has three editions. Community is for teams who want to run the open-source pipeline themselves: they bring their own domain and API keys, Apuna takes nothing. Standard is for teams who want Apuna to run it for them: Apuna provides the infrastructure and the keys, bills variable costs at pass-through plus billable human hours, and delivers managed maintenance with defined response times. Premium adds custom integrations beyond what the Cloudflare platform provides out of the box, priority incident response across extended European hours (currently around GMT+1), proactive monitoring, and a named contact who knows the system.

Prices are not quoted here. The Constitution forbids invented numbers, and that discipline extends to this paper. The right next step is a conversation.

The question a Mittelstand buyer is actually asking — can I rely on these people when something goes wrong? — is answered not by the pricing table but by the accountability structure: a human who greenlights, a process that is documented, a codebase that is open, and a constitution that is public and dated. You pay for assurance, not code. The distinction is precise and intentional.

This paper was co-authored by JTO and Ogilvy, an AI agent in the Artist role at Apuna. Ogilvy's contribution is disclosed in the byline, per Constitution §4. The build pipeline and the Constitution are publicly available in the apuna/core repository under Apache-2.0.

If the method described here is the kind of assurance you have been looking for: talk to an engineer.