I spent Thursday at the Emirates Stadium for Dynatrace Innovate London. I went as an independent voice, not a buyer and not a seller, with a bag of books and one question I ask at every event like this: what is your team actually wrestling with right now?
The vendor keynotes were polished, and the platform demos did what platform demos do. That is not what I want to write about. The part worth your time was the customer stories. When practitioners stand up in front of three hundred peers and describe what worked, what did not, and what they are still nervous about, you learn more in forty minutes than a year of analyst decks will teach you. Three patterns ran through every one of them.
Pattern one: the foundation came first, always
Not a single customer who reported real gains from AI had started with AI. They started with the boring work. A regulated bank described six months of ingesting logs, wiring up queries and building dashboards before any of the clever automation arrived. A large photo-products group described merging five legacy platforms and roughly 200 microservices, and getting their telemetry in order before they let a model near any of it. Their line was the one I keep repeating to clients: work out what needs to be true for you before AI can take part.
This matters because a broken foundation does not stay neutral when you add AI. It gets amplified. Point a model at noisy, disconnected, half-instrumented telemetry and it will give you confident answers that are wrong. The firms seeing value had earned it by doing the groundwork first. There is no shortcut, and anyone selling you one is selling you a future incident.
Pattern two: the unit of measure moved from health to outcome
The most striking talk of the day came from a national hospitality brand. Their observability leads did not talk about uptime or error rates. They talked about a booking screen that took around 2.7 seconds to load, the work they did to bring it closer to 1.4, and what that shift was worth to the business in revenue and retained customers. On stage, they put numbers to it. I will not repeat those figures as fact, because a slide at a vendor event is a claim and not an audited result, but the framing is the point: they measured a technical change in customers retained and pounds earned, not milliseconds saved.
That is the shift I have been arguing for in the book and the newsletter. For years, observability answered the question "Is it up?" The teams pulling ahead are answering a different question: "What is it worth?" When you can connect a load-time improvement to conversion, you stop begging for engineering time and start being asked for more of it. The tool matters far less than the decision to make business impact the unit of measure.
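To make "What is it worth?" concrete, here is a minimal sketch of the arithmetic, not the method anyone presented on stage. The function name, the traffic volume, the conversion rates and the booking value are all hypothetical, chosen only to show the shape of the calculation. In practice the two conversion rates would come from your own before-and-after measurements.

```python
def estimated_uplift(sessions: int, conv_before: float, conv_after: float,
                     avg_booking_value: float) -> dict:
    """Translate a change in conversion rate into bookings and revenue for one period.

    All inputs are assumptions supplied by the caller; nothing here is measured.
    """
    if sessions < 0:
        raise ValueError("sessions must not be negative")
    for rate in (conv_before, conv_after):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("conversion rates must be between 0 and 1")

    # Extra bookings the same traffic would produce at the new conversion rate.
    extra_bookings = sessions * (conv_after - conv_before)
    return {
        "extra_bookings": round(extra_bookings, 1),
        "extra_revenue": round(extra_bookings * avg_booking_value, 2),
    }


# Hypothetical inputs: 100,000 monthly sessions on the booking screen,
# conversion moving from 3.0% to 3.5%, average booking worth 80 pounds.
result = estimated_uplift(100_000, 0.030, 0.035, 80.0)
print(result)
```

The number that comes out is only as good as the attribution behind it. A load-time change rarely ships alone, so treat the result as the opening of a conversation with the business, not a figure for the annual report.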
Pattern three: the human is coming out of the loop slowly, and on purpose
Every customer was moving toward more automation, and every serious one was doing it with the brakes on. The bank had cut its incident root-cause time dramatically by putting natural language over its query engine, so an ordinary engineer could ask a question directly instead of waiting a day or two for an expert to write the query. The honest detail was the best moment of the session. Their own words, roughly: it is not one hundred percent correct, and that is the point. It gets the team to the right area fast. It is not yet reliable enough to trust blind.
That is exactly the right posture, and it is rarer than it should be. They are drafting fixes automatically and still routing them through a human, in a regulated shop, with guardrails, and only planning to remove that human once confidence and controls justify it. Nobody in the room who runs something that matters was automating the accountability. They were automating the toil and keeping the judgement. If you have read anything I have written on trust-gated operations, you will know why that made me happy.
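For anyone who wants to see what "brakes on" can look like in practice, here is a minimal sketch of a trust gate. It is an illustration of the posture, not the bank's implementation: the `ProposedFix` fields, the `route_fix` function and the 0.95 threshold are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    HUMAN_REVIEW = "human_review"
    AUTO_APPLY = "auto_apply"


@dataclass(frozen=True)
class ProposedFix:
    description: str
    confidence: float        # hypothetical score between 0.0 and 1.0
    guardrails_passed: bool  # hypothetical: change stays inside an approved scope


def route_fix(fix: ProposedFix, auto_apply_enabled: bool = False,
              min_confidence: float = 0.95) -> Route:
    """Decide where a drafted fix goes next.

    The default sends everything to a human. Auto-apply is an explicit
    opt-in, and even then only for fixes that pass the guardrails and
    meet the confidence threshold.
    """
    if auto_apply_enabled and fix.guardrails_passed and fix.confidence >= min_confidence:
        return Route.AUTO_APPLY
    return Route.HUMAN_REVIEW


fix = ProposedFix("Restart stuck consumer", confidence=0.99, guardrails_passed=True)
print(route_fix(fix))                           # human review, because auto-apply is off
print(route_fix(fix, auto_apply_enabled=True))  # auto-apply, because every condition is met
```

The design choice worth copying is the default. The human stays in the path unless someone deliberately turns auto-apply on, which keeps the decision to remove them an accountable one.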

The line I wrote down
The afternoon had a communication session that had nothing to do with observability and everything to do with why good observability work still fails to land. The speaker's favourite line was this: the winner of the communication race is the person who speaks last. He talked about the curse of knowledge, the trap where the more you understand, the more you assume everyone else does too. He described the job of a technical leader as director of translation, turning MCP, DQL and distributed tracing into something a board can act on.
I thought about every brilliant observability programme I have seen stall, not because the engineering was wrong, but because nobody translated it upward. If you run one of these teams, that is your real second job. You are not only building the platform. You are the chief storyteller for what it makes possible, and if you cannot tell that story to the people who hold the budget, the platform does not get its second year.
What I took away
Innovate London was a good day, and the credit goes to the customers who stood up and were honest, not just triumphant. Strip away the vendor gloss and the message from the floor was steady and unglamorous. Build the foundation before you reach for AI. Measure what the business measures. Automate the work, keep the human on the accountability, and remove that human only when your guardrails have earned it. Then go and learn to explain all of it in plain language, because the best observability in the world is worth nothing if the person paying for it cannot see what it bought them.
None of that is new. It is just easier to believe when three hundred of your peers are nodding along in a football stadium.
Allan Mann is the author of Metrics & Mayhem: A CTO's Guide to Observability That Actually Works and writes the Observability Digest.
