As enterprises move from experimenting with generative AI to deploying agentic systems in production, the conversation is shifting. The question executives are asking is no longer “Can this model reason?” but “Can this system be trusted?”
To explore what that shift really means, I sat down with Maria Zervou, Chief AI Officer for EMEA at Databricks. Maria works closely with customers across regulated and fast-moving industries and spends her time at the intersection of AI architecture, governance, and real-world execution.
Throughout the conversation, Maria kept returning to the same point: success with agentic AI isn’t about the model. It’s about the systems around it: data, engineering discipline, and clear accountability.
Catherine Brown: Many executives I speak with still equate AI quality with how impressive the model seems. You’ve argued that’s the wrong frame. Why?
Maria Zervou: The biggest misunderstanding I see is people confusing a model’s cleverness or perceived reasoning ability with quality. Those are not the same thing.
Quality, especially in agentic systems, is about compounding reliability. You’re no longer evaluating a single response. You’re evaluating a system that might take hundreds of steps: retrieving data, calling tools, making decisions, escalating issues. Even small errors can compound in unpredictable ways.
So the questions change. Did the agent use the right data? Did it find the right sources? Did it know when to stop or escalate? That’s where quality really lives.
And importantly, quality means different things to different stakeholders. Technical teams often focus on KPIs like cost, latency, or throughput. End users care about brand compliance, tone, and legal constraints. So if those perspectives aren’t aligned, you end up optimizing the wrong thing.
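The compounding effect Maria describes can be illustrated with simple arithmetic. The sketch below is not from the interview: it assumes every step succeeds independently with the same probability, which real agent workflows rarely satisfy, and the function name `end_to_end_success` is hypothetical.

```python
def end_to_end_success(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps.

    Illustrative assumption: each step has the same success probability
    and failures are uncorrelated.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # A step that is right 99% of the time looks fine in isolation,
    # but the chance of a fully correct run shrinks as steps accumulate.
    for steps in (1, 10, 100):
        rate = end_to_end_success(0.99, steps)
        print(f"{steps:>3} steps at 99% each -> {rate:.1%}")
    # Prints 99.0%, 90.4%, and 36.6% respectively.
```

Under these assumptions, a workflow of 100 steps that are each 99% reliable completes without error only about a third of the time, which is why evaluating a single response says little about the system as a whole.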
Catherine: That’s interesting, especially because many leaders assume AI systems must be “perfect” to be usable, particularly in regulated environments. How should companies in highly regulated industries approach AI initiatives?
Maria: In highly regulated sectors, you do need very high accuracy, but the first benchmark should be human performance. Humans make mistakes today, all the time. If you don’t anchor expectations in reality, you’ll never move forward.
What matters more is traceability and accountability. When something goes wrong, can you trace why a decision was made? Who owns the outcome? What data was used? If you can’t answer those questions, the system isn’t production-ready, regardless of how impressive the output looks.
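One way to picture those three questions is as fields on a record that the agent writes for every decision. This is a minimal sketch under stated assumptions, not a Databricks API: the `DecisionTrace` class, its field names, and `unanswered_questions` are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass(frozen=True)
class DecisionTrace:
    """Hypothetical audit record for a single agent decision."""
    decision: str
    rationale: str                 # why the decision was made
    owner: str                     # who owns the outcome
    data_sources: Tuple[str, ...]  # what data was used
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def unanswered_questions(trace: DecisionTrace) -> List[str]:
    """Return the traceability questions this record cannot answer."""
    missing = []
    if not trace.rationale.strip():
        missing.append("why was the decision made?")
    if not trace.owner.strip():
        missing.append("who owns the outcome?")
    if not trace.data_sources:
        missing.append("what data was used?")
    return missing


if __name__ == "__main__":
    incomplete = DecisionTrace(
        decision="approve claim",
        rationale="",
        owner="claims-operations",
        data_sources=(),
    )
    # Lists the two questions this record leaves unanswered.
    print(unanswered_questions(incomplete))
```

A record that leaves any question unanswered is a signal that the workflow is not yet auditable, however good the output looks.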
Catherine: You talk a lot about domain-specific agents versus general-purpose models. How should executives think about that distinction?
Maria: A general-purpose model is essentially a very capable reasoning engine trained on very large and diverse datasets. But it doesn’t understand your business. A domain-specific agent uses the same base models, but it becomes more powerful through context. You force it into a predefined use case. You limit the space it can search. You teach it what your KPIs mean, what your terminology means, and what actions it’s allowed to take.
That constraint is actually what makes it better. By narrowing the domain, you reduce hallucinations and increase the reliability of outputs. Most of the value doesn’t come from the model itself. It comes from the proprietary data it can securely access, the semantic layer that defines meaning, and the tools it’s allowed to use. Essentially, it can reason on your data. That’s where competitive advantage lives.
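The idea of constraining an agent to a predefined use case, a vocabulary, and a set of permitted actions can be sketched as a small allow-list. Everything here is an illustrative assumption: `AgentScope`, its fields, and the example use case are hypothetical and do not describe any specific product.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass
class AgentScope:
    """Hypothetical description of what a domain-specific agent may do."""
    use_case: str
    allowed_actions: FrozenSet[str]
    glossary: Dict[str, str]  # business terminology the agent should use

    def authorize(self, action: str) -> bool:
        """Allow only actions that were explicitly granted."""
        return action in self.allowed_actions

    def define(self, term: str) -> Optional[str]:
        """Look up the business meaning of a term, if one is recorded."""
        return self.glossary.get(term)


if __name__ == "__main__":
    scope = AgentScope(
        use_case="invoice disputes",
        allowed_actions=frozenset({"lookup_invoice", "escalate_to_human"}),
        glossary={"DSO": "days sales outstanding"},
    )
    print(scope.authorize("lookup_invoice"))  # True
    print(scope.authorize("issue_refund"))    # False
    print(scope.define("DSO"))                # days sales outstanding
```

The design choice is deny-by-default: anything not listed is refused, which mirrors the point that narrowing the space is what makes the agent more reliable.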
Catherine: Where do you typically see AI agent workflows break when organizations try to move from prototype to production?
Maria: There are three main failure points. The first is pace mismatch. The technology moves faster than most organizations. Teams jump into building agents before they’ve done the foundational work on data access, security, and structure.
The second is tacit knowledge. A lot of what makes employees effective lives in people’s heads or scattered documents. If that knowledge isn’t codified in a form an agent can use, the system will never behave the way the business expects.
The third is infrastructure. Many teams don’t plan for scale or real-world usage. They build something that works once, in a demo, but collapses under production load.
All three issues tend to show up together.
Catherine: You’ve said before that capturing business knowledge is as important as choosing the right model. How do you see organizations doing that well?
Maria: It starts with recognizing that AI systems are not one-off projects. They’re living systems. One practical approach is to record and transcribe meetings and treat that as raw material. You then structure, summarize, and tag that information so the system can retrieve it later. Over time, you’re building a knowledge base that reflects how the business actually thinks.
Equally important is how you design evaluations. Early versions of an agent should be used by business stakeholders, not just engineers. Their feedback (what feels right, what doesn’t, why something is wrong) becomes training data.
Building an effective evaluation system, customized to that agent’s specific purpose, is essential to ensuring high-quality outputs, which is ultimately critical for any AI project in production. Our own usage data shows that customers who use AI evaluation tools get nearly 6x more AI projects into production than those who don’t.
In effect, you’re codifying the business brain into evaluation criteria.
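To make “codifying the business brain into evaluation criteria” concrete, here is a minimal sketch in which stakeholder feedback becomes named, repeatable checks on an agent’s output. The `Criterion` type, the `evaluate` function, and both example rules are hypothetical; production evaluation tooling is typically far richer than string checks.

```python
from typing import Callable, Dict, List, NamedTuple


class Criterion(NamedTuple):
    """A named business rule expressed as a pass/fail check on output text."""
    name: str
    check: Callable[[str], bool]


def evaluate(output: str, criteria: List[Criterion]) -> Dict[str, bool]:
    """Run every criterion against one agent output."""
    return {criterion.name: criterion.check(output) for criterion in criteria}


# Hypothetical rules distilled from stakeholder feedback.
CRITERIA = [
    Criterion("mentions_source", lambda text: "source:" in text.lower()),
    Criterion("avoids_guarantee_language",
              lambda text: "guaranteed" not in text.lower()),
]

if __name__ == "__main__":
    good = "Revenue rose 4%. Source: Q3 finance report."
    bad = "Returns are guaranteed to rise next quarter."
    print(evaluate(good, CRITERIA))
    print(evaluate(bad, CRITERIA))
```

Each new piece of feedback from a business user can be added as another criterion, so the evaluation suite grows alongside the agent.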
Catherine: That sounds expensive and time-consuming. How do you balance rigor with speed?
Maria: This is where I talk about minimum viable governance. You don’t solve governance for the entire enterprise on day one. You solve it for the specific domain and use case you’re working on. You make sure the data is controlled, traceable, and auditable for that agent. Then, as the system proves valuable, you expand.
What helps is having repeatable building blocks: patterns that already encode good engineering and governance practices. That’s the thinking behind approaches like Agent Bricks, where teams can start from refined foundations instead of reinventing workflows, evaluations, and controls from scratch each time.
Executives should still insist on a few non-negotiables up front: clear business KPIs, a named executive sponsor, evaluations built with business users, and strong software engineering fundamentals. The first project will be painful, but it sets the pattern for everything that follows and makes subsequent agents much faster to deploy.
If you skip that step, you end up with what I call “demoware”: impressive prototypes that never quite become real.
Catherine: Can you share examples where agents have materially changed how work gets done?
Maria: Internally at Databricks, we’ve seen this in a few places. In Professional Services, agents are used to scan customer environments during migrations. Instead of engineers manually reviewing every schema and system, the agent generates recommended workflows based on best practices. That dramatically reduces time spent on repetitive analysis.
In Field Engineering, agents automatically generate demo environments tailored to a customer’s industry and use case. What used to take hours of manual prep now happens much faster, with higher consistency.
In both cases, the agent didn’t replace expertise; it amplified it.
Catherine: If you had to distill this for a CIO or CDO just starting down this path, what should they focus on first?
Maria: Start with the data. Trusted agents require a unified, controllable, and auditable data foundation. If your data is fragmented or inaccessible, the agent will fail, no matter how good the model is. Second, be clear about ownership. Who owns quality? Who owns outcomes? Who decides when the agent is “good enough”? And finally, remember that agentic AI is not about showing how smart the system is. It’s about whether the system reliably helps the business make better decisions, faster, without introducing new risk.
Closing Thoughts
Agentic AI represents a real shift, from tools that assist humans to systems that act on their behalf. But as Maria makes clear, success depends far less on model sophistication than on discipline: in data, in governance, and in engineering.
For executives, the challenge is not whether agents are coming. It’s whether their organizations are ready to build systems that can be trusted once they arrive.
To learn more about building an effective operating model, download the Databricks AI Maturity Model.
