by SZ, June 18, 2026

Why inference at the edge matters for manufacturing AI

Platform decisions look boring from the outside, right up until the milliseconds and the data sovereignty rules start to matter. At the VDMA AI summit in Frankfurt today, "Physical AI" (inference running close to the machine, not in a hyperscaler's data centre) came up more than once. I want to say plainly why the substrate, meaning where the model actually runs, is not a detail.

A cloud-first architecture is the right call for most AI workloads. The data is usually already there, the compute scales, and the operational overhead is someone else's problem. I recommend it daily. But manufacturing has two constraints that generic cloud architecture does not solve: latency and locality.

Take latency first. A quality control decision that has to leave the factory floor, reach a data centre in another country, and return before the conveyor belt moves on is working against a latency budget that a generic cloud round trip can easily exceed. Then locality. An anomaly detection model trained on production data that legally cannot leave the site (under a Betriebsvereinbarung, a customer data-processing agreement, or a supply chain confidentiality clause) cannot run on shared hyperscaler infrastructure without a serious legal conversation first.

Edge inference is not a rejection of cloud. It is a precision tool for the subset of decisions where the machine cannot wait and the data cannot travel. The platform question — what runs where, why, and under whose governance — is a product decision, not a DevOps detail. Get it wrong once and a working model becomes an unusable one.

My discipline is deciding what the platform does next, and arguing the roadmap in metrics rather than slideware. The metric here is straightforward: does the decision arrive before the moment it was needed, and does the data stay where it has to stay? Architecture that answers yes to both is the right architecture.
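To make that two-part metric concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Workload` fields, the `placement` function, the example workloads, and all the millisecond figures are hypothetical, and the model is assumed to take the same time to run on edge and cloud hardware. It is not a description of any particular platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """One inference workload, described only by the two constraints that matter here."""
    name: str
    deadline_ms: float          # how long until the decision is needed
    inference_ms: float         # model execution time (assumed equal on edge and cloud)
    data_may_leave_site: bool   # False if a contract or works agreement pins the data on site


def placement(w: Workload, cloud_round_trip_ms: float) -> str:
    """Return 'cloud', 'edge', or 'infeasible' for a single workload."""
    # If the model alone misses the deadline, moving it closer does not help.
    if w.inference_ms > w.deadline_ms:
        return "infeasible"
    # Cloud is only an option if the answer comes back in time
    # and the data is allowed to travel.
    cloud_in_time = w.inference_ms + cloud_round_trip_ms <= w.deadline_ms
    if w.data_may_leave_site and cloud_in_time:
        return "cloud"
    return "edge"


if __name__ == "__main__":
    # Hypothetical workloads and a hypothetical 80 ms network round trip.
    workloads = [
        Workload("inline quality check", deadline_ms=50, inference_ms=20, data_may_leave_site=True),
        Workload("on-site anomaly detection", deadline_ms=5000, inference_ms=200, data_may_leave_site=False),
        Workload("weekly yield report", deadline_ms=60000, inference_ms=500, data_may_leave_site=True),
    ]
    for w in workloads:
        print(f"{w.name}: {placement(w, cloud_round_trip_ms=80)}")
```

The sketch defaults to cloud whenever both answers are yes, which matches the cloud-first position above: edge is chosen only when the deadline or the data rules force it.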
