You’ve heard it by now: “AI agents will run the plant.” Predict, optimize, act. But in manufacturing operations where safety, quality, and uptime are on the line, claims aren’t enough. Agents have to earn trust.

At Operations Calling 2025, Tulip brought together three leaders shaping the conversation around industrial AI. The panel was moderated by David Rogers, Senior Solutions Architect at Databricks, who works directly with manufacturers deploying AI in production environments. He was joined by Pattie Maes, Professor at the MIT Media Lab and a pioneer in software agents, and Ashtad Engineer, Worldwide Head of Automotive and Manufacturing Solutions at AWS. Together, they explored what industrial AI agents can realistically do today, what still blocks autonomy on the shop floor, and what conditions must be in place before agents can safely influence operations. The discussion surfaced a consistent theme: progress is real, but scaling agents on the shop floor requires far more structure and discipline than most headlines suggest.

We’ve distilled seven key takeaways to help teams evaluate the claims about AI agents, understand how to extract real business value from industrial agents, and build a safe path toward introducing agents at scale.

1. Industrial AI Means Constraints

In manufacturing, systems are physical, environments are constrained, and mistakes have consequences.

In manufacturing, an agent is a system that can take inputs from machines, logs, or enterprise systems, interpret them in context, and generate recommendations or take action toward a defined goal.

“Industrial AI is about applying AI in controlled, constrained environments, with guardrails and predictability,” - Ashtad Engineer, Worldwide Head of Automotive and Manufacturing Solutions, AWS

That’s what makes the industrial context so different from consumer chatbots or office tools. It’s not just about smart suggestions; it’s about ensuring those suggestions are repeatable, explainable, and safe.

This is why early wins for agents show up in structured, bounded workflows:

  • Computer vision for quality inspection

  • AI-assisted planning for maintenance and scheduling

  • Data onboarding and clean-up tasks

As Pattie Maes, Professor at the MIT Media Lab, put it, “AI agents don’t need to be fully autonomous to be useful,” but they do need clear context and constraints to act responsibly.


2. Advisory vs. Autonomous Agents

Advisory agent

An AI system that surfaces insights or recommendations from operational data, but requires a human to review and execute the decision.

Autonomous agent

An AI system that takes action on its own in a live production environment, such as changing a setting or triggering a step without human approval, and therefore must meet strict safety, validation, and accountability requirements.

Manufacturing today runs on tightly coordinated physical processes. Every decision affects safety, product quality, throughput, and often regulatory compliance. To manage that complexity, plants rely on sensors, connected machines, MES and ERP systems, and strict operating procedures. In this environment, the most common AI agents are still advisory. You see them in:

  • Vision systems that inspect parts or packaging and flag possible defects

  • Maintenance copilots that analyze sensor data and equipment history to recommend work orders or suggest the best downtime window

  • Planning and scheduling tools that suggest sequence changes, capacity adjustments, or inventory moves when conditions shift

Being in the “advisory zone” means these agents read production data and policies, then generate summaries, ranked recommendations, or next-best actions. But they do not act on their own. A human still reviews and approves any change to setpoints, schedules, or system records. Operators stay in control, while AI reduces cognitive load and helps teams make faster, more informed decisions, without taking autonomous action on the line.
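
To make the "advisory zone" concrete, here is a minimal sketch of an approval gate in Python. It is an illustration under stated assumptions, not a description of any vendor's implementation: the `Recommendation` shape, the `apply_if_approved` helper, and the tag name `oven_3.temp_setpoint_c` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Recommendation:
    """A change proposed by an advisory agent (hypothetical shape)."""
    target: str                        # the setpoint, schedule entry, or record to change
    proposed_value: float
    rationale: str                     # short explanation shown to the reviewer
    approved_by: Optional[str] = None  # stays None until a human signs off


def apply_if_approved(rec: Recommendation,
                      write: Callable[[str, float], None]) -> bool:
    """Write the change only when a named human has approved it."""
    if rec.approved_by is None:
        return False  # advisory only: nothing touches the line
    write(rec.target, rec.proposed_value)
    return True


# Usage: the agent proposes, the operator decides.
setpoints = {}  # stand-in for the real control system interface
rec = Recommendation("oven_3.temp_setpoint_c", 182.0,
                     "Last five lots trended 2 C below target")
first_attempt = apply_if_approved(rec, setpoints.__setitem__)   # not approved yet
rec.approved_by = "operator_17"
second_attempt = apply_if_approved(rec, setpoints.__setitem__)  # approved, so written
```

The design point is that the write path is unreachable without a recorded approver, so the agent can never be the one that changes the line.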



3. Where Agents Are Working Today

AI agents are showing up first in areas where the work is clear and structured. These are tasks that follow defined processes and have clear boundaries, which makes them lower risk and easier to scale.

Today, that includes:

  • Quality inspection
    Vision systems that check parts or packaging and flag possible defects. This leads to fewer defects reaching customers, less rework, and more consistent quality.

  • Maintenance support
    Tools that analyze machine data and repair history to suggest work orders or the best time for planned downtime. This leads to faster troubleshooting, reduced downtime, and better use of maintenance resources.

  • Data cleanup and onboarding
    Systems that organize and label production data so teams can use it for reporting or analysis. This leads to cleaner data, fewer manual errors, and faster insights.

  • Troubleshooting support
    Agents that search SOPs, manuals, and past incidents to suggest likely causes and next steps. This leads to shorter issue resolution times and less reliance on tribal knowledge.

  • Shift summaries and reporting
    Tools that turn logs and operator notes into draft reports for supervisors to review. This saves time on documentation and makes reporting more consistent.

All of these examples reflect advisory agents in action, supporting decisions while operators stay in control.

These use cases deliver real, measurable improvements in efficiency, consistency, and uptime without handing control over to automation.

“Structured workflows like data cleaning and onboarding, that’s where agent value is very real today,” - Ashtad Engineer, Worldwide Head of Automotive and Manufacturing Solutions, AWS

These are practical, lower-risk places to start with AI.



4. The Real Blockers: Explainability, the Ability to Replay Decisions, Security, and Liability

Before any agent influences production, four conditions need to be in place: you must be able to explain its logic, replay the scenario, secure the system, and own the outcome.

Operators and engineers need more than a recommendation. They need to see how it was generated and simulate what would happen if they followed it.

“Explainability and replayability are critical…Operators want to know: How did the agent come to that conclusion?” - Ashtad Engineer, Worldwide Head of Automotive and Manufacturing Solutions, AWS
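
One way to picture replayability is a decision record that stores exactly what the agent saw and what it said. The Python sketch below is an assumption-laden illustration: `DecisionRecord`, `decide`, `replay`, and the `vibration_rule` stand-in agent are hypothetical names, and the stand-in is a deterministic rule. A model-based agent would also need its model version and any sampling settings pinned before a replay could be expected to match.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict

Agent = Callable[[Dict[str, float]], str]


@dataclass(frozen=True)
class DecisionRecord:
    """Everything needed to explain and re-run one recommendation."""
    agent_version: str
    inputs: Dict[str, float]   # the exact readings the agent saw
    recommendation: str
    explanation: str           # human-readable reasoning shown to the operator

    def fingerprint(self) -> str:
        """Stable SHA-256 hash of the record, usable in an audit trail."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def vibration_rule(inputs: Dict[str, float]) -> str:
    """Stand-in agent: a deterministic threshold rule (the 7.0 limit is illustrative)."""
    if inputs["vibration_mm_s"] > 7.0:
        return "schedule bearing inspection"
    return "no action"


def decide(agent: Agent, version: str, inputs: Dict[str, float],
           explanation: str) -> DecisionRecord:
    """Run the agent and capture inputs, output, and explanation together."""
    return DecisionRecord(version, dict(inputs), agent(inputs), explanation)


def replay(record: DecisionRecord, agent: Agent) -> bool:
    """Re-run the agent on the stored inputs and check the result matches."""
    return agent(record.inputs) == record.recommendation


# Usage: record a decision, then replay it later during a review.
record = decide(vibration_rule, "rule-v1", {"vibration_mm_s": 8.2},
                "Vibration exceeded the 7.0 mm/s limit")
reproduced = replay(record, vibration_rule)
```

If a replay does not reproduce the stored recommendation, that mismatch is itself useful evidence that the agent or its inputs changed since the decision was made.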

Security and data privacy add another layer. When agents access enterprise systems, cloud environments, or vendor-managed models, questions arise: Who owns the data? Can it be isolated? Is IP protected?

The final blocker is liability. If an agent causes rework, downtime, or worse, who's responsible? The manufacturer? The vendor? The model provider?



5. Validation + Drift: The Ops Reality Vendors Skip

In manufacturing, it’s not enough for an AI agent to work once. It has to keep working as conditions change.

Regulated industries such as biopharma and medical device manufacturing require formal validation. If an agent changes anything in a live system, such as triggering a step or updating a record, it must be tested, documented, and traceable. There’s no way around that.

Even in non-regulated plants, things change over time. Materials vary. Machines wear down. Processes get adjusted. When the real world changes, the data going into the AI model changes too. And when the data changes, the model’s accuracy can drop. This is called model drift: a model slowly becomes less accurate because the environment it was trained on has changed.

What worked last month may not work next quarter. If no one is monitoring performance, small errors can build up until the agent makes a bad recommendation.
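
A drift check does not have to be elaborate to be useful. The Python sketch below compares recent sensor readings against a baseline window and flags the input for review when the mean has shifted too far. It is one simple technique among many, not a method described by the panel: the function names, the sample readings, and the threshold of 2.0 baseline standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def drift_score(baseline: Sequence[float], recent: Sequence[float]) -> float:
    """Shift of the recent mean from the baseline mean, in baseline standard deviations."""
    sigma = stdev(baseline)  # sample standard deviation; needs at least two points
    if sigma == 0:
        raise ValueError("baseline has no variation; cannot scale the shift")
    return abs(mean(recent) - mean(baseline)) / sigma


def needs_review(baseline: Sequence[float], recent: Sequence[float],
                 threshold: float = 2.0) -> bool:
    """Flag the input for human review when the shift exceeds the threshold."""
    return drift_score(baseline, recent) > threshold


# Usage: readings captured when the model was validated vs. this week's readings.
baseline_temps = [10.0, 10.2, 9.8, 10.1, 9.9]
stable_week = [10.0, 10.1, 9.9]
shifted_week = [10.6, 10.7, 10.5]

stable_flag = needs_review(baseline_temps, stable_week)    # mean barely moved
shifted_flag = needs_review(baseline_temps, shifted_week)  # mean moved by several sigma
```

A check like this watches the inputs rather than the model's accuracy, so it can raise a flag before bad recommendations start to appear.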

That’s why AI systems need monitoring, version control, and regular review. Agents need a clear change history, with guardrails for retraining, rollback, and periodic re-validation. AI in operations isn’t “set it and forget it.” It has to be managed, validated, and monitored like any other critical production system.


6. Autonomy Needs System Understanding: The Digital Twin Direction

If agents are ever going to take autonomous action, they need more than data. They need context and causality. That means knowing both the current system state and how the system will respond to a change.

“Autonomy requires understanding the system’s state and response dynamics. That’s your digital twin.” - Ashtad Engineer, Worldwide Head of Automotive and Manufacturing Solutions, AWS

Digital twins help bridge the gap. By combining first-principles modeling (physics, chemistry, flow rates) with real-time empirical data, they allow teams to simulate outcomes before taking action.

This kind of system-level reasoning is essential for safe autonomy. Without it, agents are guessing. And in manufacturing, a wrong guess can mean wasted product, safety risks, or failed audits.

That’s why autonomy remains rare in production. But with digital twin foundations in place, teams can begin to test agent behavior in controlled, simulated environments before handing over control. Simulation first. Autonomy second.
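
As a toy illustration of "simulation first," the Python sketch below tests a proposed inflow change against a first-principles tank model before anything is applied. It is deliberately simplified and rests on assumptions: a single tank with constant flows, a mass balance integrated with fixed Euler steps, and hypothetical names and limits throughout. A real digital twin would also calibrate such a model against live plant data.

```python
from typing import List


def simulate_tank_level(level_m: float, inflow_m3_s: float, outflow_m3_s: float,
                        area_m2: float, horizon_s: float,
                        dt_s: float = 1.0) -> List[float]:
    """Euler integration of the mass balance dh/dt = (q_in - q_out) / A."""
    levels = [level_m]
    for _ in range(int(horizon_s / dt_s)):
        level_m += (inflow_m3_s - outflow_m3_s) / area_m2 * dt_s
        level_m = max(0.0, level_m)  # a tank cannot hold a negative level
        levels.append(level_m)
    return levels


def safe_to_apply(levels: List[float], high_limit_m: float) -> bool:
    """Accept the change only if the simulated level stays below the high limit."""
    return max(levels) < high_limit_m


# Usage: test two candidate inflow setpoints over a 10-minute horizon.
HIGH_LIMIT_M = 1.5
aggressive = simulate_tank_level(1.0, 0.012, 0.010, area_m2=2.0, horizon_s=600)
cautious = simulate_tank_level(1.0, 0.0105, 0.010, area_m2=2.0, horizon_s=600)

aggressive_ok = safe_to_apply(aggressive, HIGH_LIMIT_M)  # level climbs to about 1.6 m
cautious_ok = safe_to_apply(cautious, HIGH_LIMIT_M)      # level climbs to about 1.15 m
```

The agent's proposal is rejected in simulation, where a wrong guess costs nothing, instead of on the line.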


7. Standards: One Winner vs. Federated Reality

A common hope in AI tooling is that one universal protocol will emerge, something that lets all agents, tools, and systems speak the same language.

“Ideally, there’s one open protocol created collectively,” - Pattie Maes, Professor, MIT Media Lab

But the complexity on the plant floor is real. Most plants run a patchwork of protocols across decades-old equipment, vendor-specific APIs, and homegrown systems. Standardizing it all under one protocol? Not happening anytime soon.

Instead, the practical approach is federated:

  • Accept mixed systems

  • Build translation layers

  • Focus on semantic consistency (shared meaning, not shared syntax)

If agents can reason about a “batch,” “setpoint,” or “alarm” across systems, even if the protocols differ, they can still be effective.
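
A translation layer for shared meaning can be sketched in a few lines of Python. Everything vendor-specific here is invented for illustration: the two payload shapes, their field names, and the severity cut-offs are hypothetical, not real vendor formats. What the sketch shows is the pattern, in which each source gets its own translator and the agent only ever sees the shared `Alarm` type.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Alarm:
    """The shared meaning of an alarm, whatever system raised it."""
    asset: str
    code: str
    severity: int  # 1 = low, 2 = medium, 3 = high


def from_vendor_a(msg: Dict[str, Any]) -> Alarm:
    # Hypothetical payload: {"machine": ..., "alm": ..., "prio": "LOW" | "MED" | "HIGH"}
    severity = {"LOW": 1, "MED": 2, "HIGH": 3}[msg["prio"]]
    return Alarm(msg["machine"], msg["alm"], severity)


def from_vendor_b(msg: Dict[str, Any]) -> Alarm:
    # Hypothetical payload: {"equipmentId": ..., "alarmCode": ..., "level": 0-100}
    level = msg["level"]
    severity = 3 if level >= 67 else 2 if level >= 34 else 1
    return Alarm(msg["equipmentId"], msg["alarmCode"], severity)


# One translator per source system; adding a system means adding an entry here.
TRANSLATORS: Dict[str, Callable[[Dict[str, Any]], Alarm]] = {
    "vendor_a": from_vendor_a,
    "vendor_b": from_vendor_b,
}


def normalize(source: str, msg: Dict[str, Any]) -> Alarm:
    """Translate a source-specific message into the shared Alarm type."""
    return TRANSLATORS[source](msg)


# Usage: two different wire formats, one meaning.
alarm_a = normalize("vendor_a", {"machine": "press_2", "alm": "E101", "prio": "HIGH"})
alarm_b = normalize("vendor_b", {"equipmentId": "press_2", "alarmCode": "E101", "level": 80})
```

Because both messages normalize to the same `Alarm`, downstream logic can compare and reason about them without knowing which protocol they arrived on.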

So the future isn’t one protocol to rule them all. It’s interoperability through shared meaning, backed by governance that keeps it traceable.


What This Means for Manufacturers Right Now

Industrial AI agents aren’t magic, and they’re not autonomous (yet). What’s working today are advisory agents embedded in human-managed workflows, scoped to specific, structured problems.

If you’re leading operations, quality, or IT/OT, here’s a pragmatic path forward:

  • Start with embedded agents inside workflows, not standalone copilots.

  • Focus on areas like maintenance support, data cleanup, and inspection where the process is structured and risk is bounded.

  • Build trust before autonomy: require explainability, replayability, approvals, and clear boundaries.

  • Treat agents as part of a composable orchestration layer, not a new monolith.

  • Invest in the unsexy stuff: shared vocabularies, validation workflows, versioning, and drift monitoring.

  • Only explore autonomy where system state and response dynamics are well understood or simulated.

AI agents can help, but only when they’re grounded in your reality, governed by your processes, and accountable to your standards. That’s not hype. That’s the work.


How Tulip Helps Teams Operationalize Agent Workflows Safely

Tulip isn’t an AI agent. It’s the platform that helps manufacturers build, manage, and scale human-in-the-loop workflows where agents can assist without overstepping.

With Tulip, teams create structured frontline apps that standardize work, enforce approvals, and capture context in real time. AI tools like copilots or vision models can be embedded directly into these workflows, all with clear guardrails.

Tulip’s platform also brings the controls that agent adoption demands:

  • Secure connectivity across OT and enterprise systems

  • Permissions, versioning, and audit trails baked into every app

  • A composable architecture that grows with your needs, not against them

That means your operators stay in control. Your data stays protected. And your workflows stay compliant, whether you’re in discrete, batch, or regulated manufacturing.

AI agents aren’t a shortcut; they’re a layer. Tulip helps you build that layer with confidence.