by JTO

Live demos and the distance to production

There were live AI-agent demonstrations at the VDMA Praxistag KI in Frankfurt — proper ones, on stage, in front of eighty people. Vendors showing what their platforms can do: autonomous fault diagnosis, connected agents reaching into ERP systems, decisions made in real time without a human in the loop. Impressive. Worth filming. I filmed some of it.

And I spent the drive home thinking about the gap between what I had just watched and what it takes to make that work in production. That gap is the whole story.

**The demo gods are real**

A live AI-agent demo is the most seductive artefact in enterprise technology. It shows the happy path — the query that resolves, the fault that gets diagnosed, the recommendation that lands correctly — and it compresses months of engineering into four minutes of smooth. What it does not show is the network request that timed out at 3 a.m. on a Tuesday, the column name in the production dataset that differs by one underscore from the column name in the test dataset, or the agent that decided, in front of eight senior engineers, to do something nobody had anticipated and that took three minutes to explain away.

I am not being uncharitable. This is just what live demos are. They are the vision, not the product. The demo gods — dead Wi-Fi, API rate limits, a latency spike that makes a three-second pause look like a freeze when eighty people are watching — do not care about your slide deck. Every team that has ever done a live demo knows this. The good ones have a fallback. The brave ones go without one.

**The rehearsed-data trap**

The subtler risk is not the network. It is the data. An agent that performed flawlessly on your test dataset last week can meet the production dataset on stage and find one column that is named differently, one null value in a field it expected to be populated, one timestamp format that differs from the one it was tested against. The demo fails in a way that looks like the system is broken, when in fact the data was never quite what it appeared to be.

This is not a demo problem. It is a production problem that the demo revealed. The distance between a rehearsed dataset and real-world data is the same distance between a controlled lab environment and a shop floor. That distance does not disappear when you ship. It is the engineering work.
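One way to narrow that distance is to check the data before the agent ever sees it. The following is a minimal sketch, not anyone's production code: the column names, the required fields and the timestamp format are all hypothetical, chosen only to mirror the three failure modes above (a renamed column, an unexpected null, a different timestamp format).

```python
from datetime import datetime

# Hypothetical data contract, for illustration only.
EXPECTED_COLUMNS = {"machine_id", "fault_code", "recorded_at"}
REQUIRED_NON_NULL = {"machine_id", "recorded_at"}
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def check_rows(rows):
    """Return a list of human-readable problems found in rows (a list of dicts)."""
    problems = []
    for index, row in enumerate(rows):
        present = set(row)
        missing = EXPECTED_COLUMNS - present
        unexpected = present - EXPECTED_COLUMNS
        if missing:
            problems.append(f"row {index}: missing columns {sorted(missing)}")
        if unexpected:
            problems.append(f"row {index}: unexpected columns {sorted(unexpected)}")
        # Fields that must be populated whenever the column is present.
        for column in sorted(REQUIRED_NON_NULL & present):
            if row[column] in (None, ""):
                problems.append(f"row {index}: {column} is empty")
        # Only parse the timestamp if there is something to parse.
        stamp = row.get("recorded_at")
        if stamp:
            try:
                datetime.strptime(stamp, TIMESTAMP_FORMAT)
            except (TypeError, ValueError):
                problems.append(
                    f"row {index}: recorded_at {stamp!r} does not match {TIMESTAMP_FORMAT}"
                )
    return problems
```

Run against a row where `machine_id` has lost its underscore and the timestamp arrives as `18.06.2026 09:00`, this reports a missing column, an unexpected column and a timestamp mismatch, which is a more useful message than an agent that silently diagnoses the wrong fault.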

**The "it worked this morning" curse**

There is a regression between the dress rehearsal and the live session that every team encounters at least once. The system worked at nine. It does not work at two. Nothing was changed. No one touched it. And yet. The curse is real — partly because complex systems have dependencies that are not fully visible, partly because the demo environment is never quite identical to the room you end up in, and partly because entropy.

The response to the curse is not superstition. It is observability: knowing what changed, when, and why. That is not a demo capability. That is a production discipline, and most demo environments do not have it.
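In its smallest form, "knowing what changed" can mean taking a snapshot of the environment at rehearsal and comparing it with a snapshot taken just before going live. The sketch below assumes the snapshot is a plain dictionary of whatever you can cheaply record (model version, endpoint, network, package versions); the keys in the usage example are hypothetical.

```python
import hashlib
import json


def fingerprint(snapshot):
    """Stable hash of a snapshot dict, for a quick 'did anything change?' check."""
    encoded = json.dumps(snapshot, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def diff_snapshots(before, after):
    """Map each changed key to its (before, after) values; absent keys show as None."""
    changed = {}
    for key in sorted(set(before) | set(after)):
        if before.get(key) != after.get(key):
            changed[key] = (before.get(key), after.get(key))
    return changed


# Hypothetical snapshots taken at nine and at two.
morning = {"model_version": "2026-06-01", "network": "office-lan", "erp_api": "v2"}
afternoon = {"model_version": "2026-06-01", "network": "venue-wifi", "erp_api": "v2"}

if fingerprint(morning) != fingerprint(afternoon):
    print(diff_snapshots(morning, afternoon))
```

This does not replace real monitoring, and it only sees what you thought to record. It does turn "nothing was changed" from a belief into something you can check.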

**The agent that improvised**

Autonomous systems do unexpected things. That is partly the point — if they only did what you explicitly programmed, you would not need agents, you would need scripts. But unexpected behaviour in front of eighty customers is a different problem from unexpected behaviour in a controlled staging environment. The agent that confidently proposes an action outside its intended scope, or that hallucinates a plausible-sounding but wrong data value in front of senior engineers, does not just cause a technical problem. It causes a trust problem. And trust, once broken in a room, takes longer to rebuild than the demo took to run.

This is partly why "a person decides — by design" is not a constraint we added reluctantly. It is an acknowledgement that live, autonomous systems surprise you. The surprise is a feature of the technology. The human checkpoint is a feature of the design.
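As a rough illustration of what a human checkpoint can look like in code, here is a minimal sketch. It is not a description of any particular product: the action names, the allow-list and the `approve` and `execute` callbacks are assumptions made up for the example. The agent may propose anything, but only in-scope actions reach a person, and nothing runs without that person's yes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    name: str
    target: str


# Hypothetical allow-list of actions the agent is meant to propose.
ALLOWED_ACTIONS = {"create_maintenance_ticket", "notify_operator"}


def execute_with_checkpoint(action, approve, execute):
    """Run execute(action) only if the action is in scope and a person approves it."""
    if action.name not in ALLOWED_ACTIONS:
        # Out-of-scope proposals never reach the reviewer or the executor.
        return "rejected: out of scope"
    if not approve(action):
        return "declined by reviewer"
    execute(action)
    return "executed"
```

Here `approve` stands in for whatever puts the proposal in front of a person (a dialog, a ticket, a message) and returns their decision, and `execute` is the only path by which anything actually happens.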

**The 3-minute demo versus the 3-year production system**

The gap between what you see on stage and what it takes to run something in production is not a failure of ambition. It is a failure of framing. A demo is a proof of concept. A production system is a supply chain: data provenance, access controls, audit trail, failure modes, rollback procedures, monitoring, the slow accumulation of edge cases that the happy path never showed you.

The vendors showing live agent demos at the VDMA conference are not being dishonest. They are showing real capabilities. But the honest conversation — the one worth having at every table in the World Café — is about the distance between the stage and the shop floor. That distance is where the work is. That distance is what most AI projects underestimate.

**The temptation to fake it**

There is a shortcut some teams take: the "live" demo that is actually scripted against a fixed dataset, or pre-recorded, or subtly mocked. It eliminates the demo gods. It produces a flawless four minutes. And it erodes trust the moment anyone probes it — a question about the data source, a request to try a slightly different query, a curious engineer who asks to see the logs. The faked demo is worse than the failed one, because the failed demo is honest about where the system actually is.

**Reputational asymmetry**

A great demo is forgotten by Friday. A failed one is retold at the next conference, with embellishments. The asymmetry is brutal and completely consistent with how human memory works. You are not playing for the room on the day. You are playing for the story that gets told afterwards. The best hedge against the downside is not a better demo. It is a more honest framing of what the demo is showing — and what it is not.

Apuna does live things too. We know this risk from the inside, not the outside. The lesson is not "don't demo." It is: respect the distance between the stage and the shop floor. Know which side of that distance you are on, and say so.

*The VDMA Praxistag KI im Maschinen- und Anlagenbau was held 18 June 2026 in Frankfurt am Main, organised by VDMA Software & Digitalisierung.*