Workload Automation Blog Mainframe Blog

SOAP as the Control Plane for AI: Why Everything Runs—And Still Isn’t Under Control

6 minute read
BMC Software

TL;DR

AI isn’t the hard part anymore. Operationalizing AI is.

Most AI projects stall because no one really has control over how the whole thing behaves in production.

Pipelines run. Workflows exist. But they’re stitched together across tools, scripts, and “somebody who knows how it works.” What you have isn’t a system. It’s coordination by coincidence.

A SOAP is the layer that turns all of that into something you can actually run and rely on.

Why So Many AI Projects Stall — and What’s Missing in the Middle

The model works. The use case is valid. And then it stalls. Not because the model failed, but because everything around it starts to break.

The problem shows up in the middle:

  • data isn’t where it needs to be
  • steps don’t run in the right order
  • systems drift out of sync

Individually, those pieces exist. Together, they don’t behave like a system. That’s why “we got it working” rarely turns into “this runs reliably in production.”

There’s a layer that’s supposed to hold all of this together. In most environments, it doesn’t exist as a single system. It’s spread across schedulers, pipelines, scripts, and a lot of tribal knowledge.

When you actually pull that together into something coherent, it’s what’s referred to as a Service Orchestration and Automation Platform (SOAP).

Is SOAP Just Rebranded Orchestration?

It sounds like it. Most teams already have plenty of orchestration. And yet the common experience is: everything runs but it still doesn’t feel under control.

That’s because orchestration is built to execute workflows—even across systems—but it doesn’t inherently give you control over how those workflows behave at the system level.

The problem is that real systems don’t stay in boundaries. Workflows interact. Dependencies cross tools. Upstream delays ripple. AI steps behave inconsistently. So you end up here: everything ran, but the outcome is still wrong.

That’s not an execution failure. It’s a control failure. That’s the shift:

  • Orchestration defines how the workflow is supposed to run across systems.
  • SOAP is what actually keeps those workflows running correctly across the environment.

You don’t feel that difference when you’re building a pipeline. You feel it when:

  • that pipeline connects to everything else
  • something upstream is late
  • something downstream quietly breaks
  • and no single tool can explain what actually happened

In practice, this is where platforms like Control‑M show up—not as “another orchestrator,” but as the layer that coordinates workflows across data, applications, and AI so they behave like a system.

Takeaway: It’s not more orchestration. It’s the layer that makes everything already orchestrated actually behave predictably.

What a Control Plane Has to Get Right

Once you think in terms of a SOAP as the control plane, the question shifts pretty quickly from: “how do we run this workflow?” to “what does it take to keep this thing behaving under real conditions?”

In practice, there are a few things a control plane needs to get right if it’s going to hold up in production:

1. It actually has to run reliably

Not just once. Not just in a clean path. Across:

  • cloud + onprem
  • internal systems + external APIs
  • workloads that don’t all behave the same way

AI makes this harder, not easier. Latency varies. Dependencies drift. Retries don’t always help. At some point, you realize the pipeline is only as reliable as the least predictable thing in it.

2. It has to understand what depends on what

This is the one most teams feel immediately. Something upstream is delayed, and you don’t find out until something downstream fails—or worse, runs with bad data.

Without system-level awareness, you’re always reacting. With it, you can actually see what’s at risk, what’s impacted, and what needs attention now.

3. It has to explain what’s going on

This is where things usually fall back to people. Someone knows how the workflow works, why it fails in weird ways, and what “normal” looks like. And everyone else is stuck asking them.

A control plane starts to pull that knowledge into the system itself:

  • what this workflow does
  • what changed
  • what likely caused the issue

Not magic. Just enough context to stop everything from being a guessing game.

Takeaway: You’re no longer just running workflows. You’re running a system where behavior is visible, dependencies are understood, and issues are explainable. That’s what lets AI pipelines move from “it works” to something you can actually rely on.

What This Looks Like When It Actually Works

This is where the control plane idea stops being conceptual and starts showing up in real workflows.

1. When a “quick change” isn’t a scramble anymore

You get the request: “Can we refresh this dashboard with updated projections today?”

Without a control plane, that usually turns into a scramble—figuring out which pipelines are involved, coordinating across a few teams, manually triggering jobs, and then watching closely to see what breaks.

With a control plane, that same request is already understood as a workflow. The dependencies are mapped, the execution path is known, and the change can be applied in one place without chasing it across tools.

The difference isn’t just speed. It’s that the work becomes predictable and repeatable instead of reactive.

2. When AI actually makes it to production

Moving revenue forecasting models, fraud analysis pipelines, or supply chain optimization workflows from proof-of-concept to operational requires enterprise-grade discipline: governed data ingestion, controlled model execution, managed LLM invocation.

With a control plane, ingestion, transformation, inference, and downstream updates are all coordinated, observable, and governed the same way. At that point, moving to production stops feeling like starting over.

3. When governance isn’t a periodic fire drill

In most environments, workflow sprawl builds up quietly over time. Pipelines stick around long after they’re needed, dependencies overlap, and no one really has a clear view of what’s still in use. You don’t notice it until something breaks, performance slips, or there’s an audit coming up.

Without a control plane, governance is something you piece together after the fact. With one, the system itself starts to surface what’s changed, what’s no longer used, and where things are drifting. Governance becomes continuous instead of reactive.

Why the Lack of a Control Plane Feels Worse With AI

This isn’t a new problem. The gaps in how workflows are coordinated have always been there. AI just makes them obvious.

You now have more steps, more variability, more external dependencies (LLMs, APIs), and less predictable behavior. So the same coordination gaps that were manageable before start to show up everywhere.

Without a Control Plane

No shared context

  • Each part runs its piece
  • Dependencies are implicit
  • Failures are discovered late
  • People connect the dots manually

With a Control Plane (SOAP)

  • One layer coordinates execution across everything
  • Dependencies are explicit and visible
  • Failures are understood in context
  • Impact is clear before it spreads

Where SOAP Changes Day-to-Day Work

This is where it gets real.

  • Instead of debugging across tools, you can actually see the workflow end to end.
  • Instead of isolated failures, you get context about what those failures affect.
  • Instead of relying on “who knows this pipeline,” the system carries that understanding.

With a SOAP, AI workloads stop being special cases and start behaving like everything else you run in production—the same way mature teams handle automation and orchestration across the rest of the stack.

Where to Start (Without Turning This Into a Rewrite)

You don’t need to replatform everything. You need to expose where you don’t have control.

1. Start with one pipeline that matters

Not a clean one. A real one. Map what actually happens: where data comes from, what it triggers, and what breaks if something is late. This is where hidden dependencies show up.

2. Count how many “control planes” you actually have

Most teams have multiple orchestration tools, scripts filling gaps, and people connecting the dots. Everything is orchestrated. Nothing is coordinated.

3. Make the system visible before you automate it

You should be able to answer, in one place: what does this workflow actually do, what depends on it, and what happens if it fails or drifts. If you can’t answer those, automation just makes troubleshooting harder.

4. Treat AI workloads like real workloads

This is where things quietly break. AI steps often have different retry behavior, weaker monitoring, and less consistent control. That works in testing. It doesn’t in production. Treat them like everything else: observable, governed, and part of the same system—which is increasingly what agentic orchestration is being built to handle.

5. Don’t try to fix everything at once

If one pipeline becomes visible, predictable and understandable, that’s already meaningful progress. From there, it scales.

What You’re Actually Building Toward

Not a new tool. Not a perfect architecture. Just this: a single layer that understands how your workflows behave and keeps them from drifting.

Once you have that, automation gets easier, failures are less surprising, and scaling doesn’t multiply chaos.

To Sum Up: The Shift That Actually Matters

AI isn’t a modeling problem anymore. It’s an operations problem: can you run complex, cross-system workflows reliably under real conditions? That’s the same shift driving teams to operationalize data and AI projects through orchestration.

If this feels familiar

If your current setup works, but only with careful coordination, tribal knowledge, and a few “don’t touch that” pipelines, you’re not behind. You’re just missing the layer that turns all of it into a system you can actually control. That’s the role SOAP is starting to play.

If you’re trying to move from “we got it working” to “we can run this reliably, every day”—here’s a practical guide to turning complex workflows into AI-powered outcomes: Orchestration: The Missing Layer in Enterprise AI.

This guide goes deeper into what’s missing when it comes to operationalizing AI—what it actually looks like in production, why it shows up so consistently, and how teams are starting to close the gap with a real control plane.

BMC named a Leader in 2025 Gartner® Magic Quadrant for Service Orchestration and Automation Platforms

See why BMC is named a Leader in this Gartner Magic Quadrant report
Read the report ›

These postings are my own and do not necessarily represent BMC's position, strategies, or opinion.

See an error or have a suggestion? Please let us know by emailing blogs@bmc.com.

About Us

As BMC, we are committed to a shared purpose for customers in every industry and around the globe. BMC empowers 86% of the Forbes Global 50 to accelerate business value faster than humanly possible by automating critical applications, systems, and services to take advantage of cloud, data, and emerging AI technologies. BMC, now operating as an independent company, helps the world’s most forward-thinking IT organizations turn AI into action—unlocking human potential to multiply productivity so teams can focus on the work that matters most.
Learn more about BMC ›

About the author

BMC Software

BMC works with 86% of the Forbes Global 50 and customers and partners around the world to create their future. With our history of innovation, industry-leading automation, operations, and service management solutions, combined with unmatched flexibility, we help organizations free up time and space to become an Autonomous Digital Enterprise that conquers the opportunities ahead.