The moat is not a bug. It is the business model.
Legacy desktop apps with no API are a moat. Here is how it actually works.
The takes on this go in two directions. Either “legacy apps will get APIs eventually” (they will not, the missing API is the pricing power) or “an AI agent can just look at the pixels” (it can, and it stalls in production for reasons that have nothing to do with the model). Both miss the structural point. The no-API state is a moat, and the only thing that actually leaks past it is the operating system, not the vendor and not the LLM.
Direct answer (verified 2026-05-21)
Legacy desktop apps with no API form a moat because the data is captive, the switching cost is a multi-year reimplementation, and every cheaper automation layer (iPaaS, browser agents, vision agents) bounces off the native Windows surface. The one practical leak is reading the OS accessibility tree (Windows UI Automation, AT-SPI on Linux, AX on macOS), the same surface screen readers use.
A reference implementation is open source under MIT at github.com/mediar-ai/terminator. The serializer that turns a SAP, Jack Henry, or Epic window into a structured indented tree the model can read is recording_processor.rs:1016.
What “no API moat” actually means
Moats are usually framed as switching costs, network effects, or regulatory capture. The no-API moat is a fourth thing that gets less airtime: it is a moat of data immobility. The vendor owns a system of record, the system of record has no documented way to get data out, and every workflow that touches that data gets routed through a human typing into a desktop UI. The human labor pool is the integration layer. The vendor charges as if it is the only integration option, because it is.
This is why customers do not leave SAP, Jack Henry, or Epic when the contract comes up. The migration cost is not the new system; it is moving twenty years of state out of an interface that was deliberately not built to release it. Anyone who has tried knows the conversation: the export module costs another seven figures, ships on a quarterly cadence, and covers about 60% of the fields you need. The vendor is not being malicious. They are pricing their pricing power.
The shape of the moat is what determines what kind of tool can break it. Network-effect moats fall to better networks. Switching cost moats fall to ten-year incumbents going lazy. Data-immobility moats fall to whoever can route around the missing API without asking the vendor for permission. That is the whole shape of this conversation.
The five categories holding most of the moat
When people say “legacy desktop with no API” they usually picture one of these five surfaces. The buyers staring down the vendor quotes for them are CFOs, COOs, and RPA center-of-excellence leads at companies in finance, healthcare, F&B, manufacturing, and insurance.
ERP and accounting cores
SAP GUI, SAP Business One, Oracle EBS, JD Edwards, Microsoft Dynamics AX. The general ledger lives here and the audit posture says it cannot be touched. Vendors price like utilities because the switching cost is a multi-year reimplementation and a SOX exposure window.
Banking core
Jack Henry, Fiserv, FIS. Green-screen and Win32 surfaces in front of mainframe back-ends. Community and regional banks that try to replace these end up running both for two years and paying twice.
Clinical EHR
Epic, Cerner (now Oracle Health), eClinicalWorks, MEDITECH. Workflow lives in Hyperspace or Citrix-published thick clients. Public APIs (FHIR, USCDI) cover a fraction of the surface an intake clerk actually touches.
Mainframe terminals
Reflection, PuTTY, Rocket BlueZone in front of z/OS, AS/400, and 30-year-old COBOL forms. The 'modernization' has been a procurement line item at most enterprises since the late 1990s.
Custom internal apps
VB6, Delphi, MFC, WinForms apps built by a contractor in 2003 and now central to one workflow nobody wants to rewrite. Vendor is gone. Source code is in a Subversion repo somebody might still have the password to.
Why every cheaper automation layer bounces off
The last twenty years of enterprise automation have produced four serious attempts to cross this moat. Each has failed for a different reason. The reasons are worth naming because the failure modes tell you what the next attempt has to look like.
iPaaS and connector platforms (Workato, Zapier, Mulesoft). These work where there is an API. The moat is defined by the absence of one. The category cannot reach SAP GUI, Jack Henry, or Epic Hyperspace because there is no endpoint to subscribe a webhook to. They are real businesses; they are not in the conversation here.
Selector-based RPA (UiPath, Automation Anywhere, Blue Prism). The first generation that reached the desktop. They reached it through brittle selectors: a stored path through the control tree for each recorded click. The path breaks when the UI shifts, which happens with every SAP support pack, every Windows update, every theme change. The economics are well documented: certified developers, $100K+ implementations, $250K+ annual maintenance. The labor savings have to clear that hurdle, and often they do not, which is why most RPA estates are dramatically smaller in practice than the procurement spreadsheet projected.
Browser-based AI agents (Skyvern, Browser Use, CloudCruise). The current wave. They are excellent at what they do. What they do is web automation. SAP GUI, Jack Henry, and Epic Hyperspace are not web pages, so a headless Chrome reaches none of them. You can pin a browser agent to the moat-protected workflows and watch it idle. The category is right about the future of new SaaS; it is not in the moat conversation.
Vision-on-pixels agents (Anthropic Computer Use, OpenAI computer-use, the “agentic RPA” pitches). These at least clear the surface bar: an LLM can read a screen. They stall on three other things. Latency (30 to 60 seconds per UI step), cost (tokens per step times steps per workflow times workflows per queue), and determinism (the model can pick a different button on the second run). For audited workloads (general ledger posts, claims approvals, charting), the non-determinism alone disqualifies the architecture in procurement. They are demo-ready, but not production-ready for the workloads behind the moat.
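The latency point can be made concrete with a back-of-envelope sketch. The 30 to 60 seconds per step comes from the paragraph above; the 20 steps per item and the 1,000-item queue are hypothetical numbers chosen only to show how the multiplication stacks, and the model assumes the queue is drained serially.

```python
def queue_hours(seconds_per_step: float, steps_per_item: int, items_in_queue: int) -> float:
    """Wall-clock hours to drain a queue serially, one model call per UI step."""
    return seconds_per_step * steps_per_item * items_in_queue / 3600


# Hypothetical workload: 20 UI steps per item, 1,000 items waiting.
low = queue_hours(30, 20, 1_000)
high = queue_hours(60, 20, 1_000)
print(f"{low:.0f} to {high:.0f} hours")  # prints "167 to 333 hours"
```

Token cost scales along the same three factors, which is why a per-step inference bill is hard to bound for a long queue.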
The one bypass: a tree the OS already publishes
The legacy desktop apps do have a machine-readable surface. Nobody from the vendor side put it there; the operating system did, because screen readers have been a regulated accessibility requirement on Windows since the late 1990s. JAWS, NVDA, and Narrator do not look at pixels to read a SAP screen to a blind user. They ask the window for its tree of controls, and the window returns a structured graph of role, name, value, and state. That graph is published by every Win32, WinForms, WPF, and standard-controls UI ever shipped, including the 1995 ones.
The surface is UI Automation on Windows, AT-SPI on Linux, and the AX accessibility API on macOS. It is headless in the sense that matters: you read every control's role, name, and state programmatically with no rendering, no screenshot, and no inference call. The vendor did not give you an API; the operating system did. The moat leaks there.
The reference implementation is open source under MIT at github.com/mediar-ai/terminator. The function that flattens a Windows window into the indented format the model reads is generate_simplified_ui_tree_string at apps/desktop/src-tauri/src/recording_processor.rs:1016. A real SAP customer-master window comes out looking like this:
1. I. [Window] 'Customer Master Data Maintenance'
2. II. [Toolbar] 'Standard'
3. III. [Button] 'Save' {focusable=true}
4. II. [Group] 'General Data'
5. III. [Edit] 'Customer' {value="0000470192"}
6. III. [Edit] 'Name 1' {value="Imperial Treasure Pte Ltd"}
7. III. [Edit] 'Country' {value="SG"}
8. III. [ComboBox] 'Reconciliation Account' {selected=true}
9. III. [CheckBox] 'Posting Block' {checked=false}
About 40 lines of text for a screen that has 80 controls. The model reads the tree once, at authoring time, when a human walks the workflow on screen-share. It emits a TypeScript file that targets each control by role and name. The runtime that replays the file in production is plain Rust calling UI Automation. Grep the runtime for “openai,” “claude,” or “anthropic” and you find zero matches. That is the architectural commitment: the model is in the recorder, not in the hot path that types into your general ledger at 4am.
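To show what "targets each control by role and name" can mean in practice, here is a minimal sketch that parses the serialized lines above and looks a control up by role and name. It is an illustration only: `parse_tree`, `find`, and the regular expressions are hypothetical helpers written for this article's sample, not code from the terminator repository, and they assume control names contain no apostrophes.

```python
import re
from typing import NamedTuple, Optional


class Node(NamedTuple):
    line: int
    depth: int
    role: str
    name: str
    attrs: dict


# Assumed line shape: {LineNumber}. {RomanNumeralIndent}. [Role] 'Name' {Attributes}
LINE_RE = re.compile(r"^(\d+)\.\s+([IVXLC]+)\.\s+\[([^\]]+)\]\s+'([^']*)'(?:\s+\{(.*)\})?\s*$")
ATTR_RE = re.compile(r'(\w+)=(?:"([^"]*)"|([^,\s]+))')
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100}


def roman_to_int(numeral: str) -> int:
    """Convert the Roman-numeral indent marker to a depth (I = 1)."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN[ch]
        nxt = ROMAN[numeral[i + 1]] if i + 1 < len(numeral) else 0
        total += -value if nxt > value else value
    return total


def parse_tree(text: str) -> list:
    """Turn serialized lines into Node records; lines that do not match are skipped."""
    nodes = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue
        attrs = {}
        for a in ATTR_RE.finditer(m.group(5) or ""):
            attrs[a.group(1)] = a.group(2) if a.group(2) is not None else a.group(3)
        nodes.append(Node(int(m.group(1)), roman_to_int(m.group(2)), m.group(3), m.group(4), attrs))
    return nodes


def find(nodes: list, role: str, name: str) -> Optional[Node]:
    """Return the first node with the given role and name, or None."""
    return next((n for n in nodes if n.role == role and n.name == name), None)


SAMPLE = """\
1. I. [Window] 'Customer Master Data Maintenance'
2. II. [Toolbar] 'Standard'
3. III. [Button] 'Save' {focusable=true}
4. II. [Group] 'General Data'
5. III. [Edit] 'Customer' {value="0000470192"}
6. III. [Edit] 'Name 1' {value="Imperial Treasure Pte Ltd"}
7. III. [Edit] 'Country' {value="SG"}
8. III. [ComboBox] 'Reconciliation Account' {selected=true}
9. III. [CheckBox] 'Posting Block' {checked=false}
"""

nodes = parse_tree(SAMPLE)
country = find(nodes, "Edit", "Country")
print(country.attrs["value"])  # prints "SG"
```

The lookup never mentions a position or a pixel coordinate, which is the property the authored workflow file depends on.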
What the math looks like once the moat leaks
70%
cost reduction reported to the board by an LG-supplier F&B chain after moving an SAP B1 workload from UiPath to a tree-based agent.
$750K
annual saving at a mid-market insurance carrier on claims intake. 30 minutes per claim down to 2, at their existing headcount math.
6 weeks
shaved off customer onboarding at a community bank running Jack Henry. 8 weeks down to 2.
“The runtime is priced per minute, not per seat. The $10K turn-key program fee converts to credits with a bonus, so it is effectively prepaid usage. The math gets boring fast against a six-figure RPA license renewal.”
Mediar pricing
What this means for the buyer
The mental shift is the part most procurement decks skip. Stop treating the no-API state as a problem the vendor is going to solve. They are not, because the missing endpoint is the pricing. Stop treating it as a problem the next iPaaS or browser agent is going to solve. They do not reach those surfaces. Stop treating it as a problem worth waiting on. The labor cost of having clerks retype the data the system of record will not export keeps compounding.
The pragmatic move is to treat the OS accessibility surface as the API and route automation through it. Days to a first production workflow rather than months. Per-minute runtime rather than per-seat licensing. A workflow artifact that lives in a git repo your team can diff and audit. A runtime that does not call an LLM on the hot path, so the bill is bounded and the action sequence is deterministic across runs.
The moat does not disappear. The incumbent ERP, EHR, or banking core vendor still owns the system of record, still charges what they charge, still controls the export module. What changes is that the data is no longer captive to the human labor pool you have been pricing as the integration layer. That is the part worth ten or eleven figures across the industry, depending on which analyst you trust.
Have a SAP, Jack Henry, Epic, or mainframe workflow you want quoted?
Book 25 minutes. We will record one pass of the workflow on a screen-share, show you the authored TypeScript file, and price the runtime in minutes. No slides.
Common questions
Why is a legacy desktop app with no API a moat?
Three reasons stacked. First, the data is captive: without a documented REST or SOAP surface, customers cannot pipe it into anything else without an export, which the vendor controls. Second, the switching cost is structural: replacing SAP, Jack Henry, or Epic is not a software project, it is a two to four year reimplementation with regulatory and audit exposure. Third, every cheaper automation layer (Zapier, iPaaS, web SDKs, browser agents) bounces off because there is nothing to call. The combined effect is that the incumbent vendor can keep raising prices without losing customers, and customers can keep paying clerks to manually retype the data the vendor will not export. That is the moat.
Is 'no API' temporary? Won't vendors ship one eventually?
Some have, partially. SAP has OData, Oracle has REST endpoints for some modules, Epic has FHIR. The public APIs cover a fraction of what the desktop UI exposes, and the part they do cover is throttled, license-gated, or limited to read paths. The reason is economic, not technical: an unconditional API would commoditize the data the vendor owns. Twenty-five years into this story, the API gap is no longer a roadmap item the customer can plan around. It is the vendor's pricing power, written down as missing endpoints.
What about browser-based AI agents? Do they break the moat?
Not for the systems above. Browser agents (Skyvern, Browser Use, CloudCruise, the various OpenAI computer-use demos with Chrome) only work where the target is a web page. SAP GUI, Jack Henry's teller window, Epic's Hyperspace client, and a mainframe terminal in Reflection are all native Windows processes painting their own controls. A headless browser sees nothing. They are great for new SaaS where the SDK is the moat. The legacy desktop layer is exactly where they cannot reach.
What about vision-based agents that look at pixels?
Vision agents can technically see anything on screen, so they at least clear the 'can you read the surface' bar. They stall on cost, latency, and audit. One LLM call per UI action stacks into 30 to 60 seconds per step, the token bill compounds across a queue of thousands of items, and the action sequence is non-deterministic across runs, which is unacceptable for any post against a general ledger. They are demo-ready, not production-ready, for the legacy desktop workloads that hold the moat.
So what does actually break the moat?
Reading the OS-level accessibility tree. Every visible Windows control already publishes a structured node through UI Automation (UIA), the same surface JAWS and Narrator read aloud to a blind user. The role, the name, the value, the state are all there, in microseconds, with no inference cost. It is the API the vendor never gave you, attached at the OS layer rather than in the application. The reference implementation is open source under MIT at github.com/mediar-ai/terminator. The serializer that turns a SAP window into a structured indented tree is in recording_processor.rs at line 1016. AT-SPI on Linux and the AX accessibility API on macOS are the equivalent surfaces.
If the OS already publishes this, why has nobody built on it before?
Two reasons. First, the historical RPA generation (UiPath, Automation Anywhere, Blue Prism) used selectors rather than the full tree because their products predate the era when you could reasonably ship an AI model to interpret a serialized tree at authoring time. They picked one node per click and stored a path; the path breaks when the UI shifts. Second, the current AI-agent wave has been visually obvious in the browser, which made vision-on-pixels look like the universal solution. The accessibility tree is unglamorous, tied to each operating system's own API, and reads to most ML researchers like a footnote. It is also the only path that gives you sub-millisecond reads and deterministic replay on the workloads that actually pay for automation.
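A toy sketch of the difference between a stored path and a role-and-name lookup, assuming a control tree held as nested dicts. The helpers `by_path` and `by_role_and_name` and both sample windows are hypothetical; real RPA selectors are richer than a list of child indexes, so read this as a simplified picture of why a positional path is fragile.

```python
from typing import Optional


def by_path(root: dict, path: list) -> Optional[dict]:
    """Follow child indexes from the root; return None if the path no longer exists."""
    node = root
    for index in path:
        children = node.get("children", [])
        if index >= len(children):
            return None
        node = children[index]
    return node


def by_role_and_name(root: dict, role: str, name: str) -> Optional[dict]:
    """Depth-first search of the whole tree for a matching role and name."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node["role"] == role and node["name"] == name:
            return node
        stack.extend(reversed(node.get("children", [])))
    return None


general_data = {"role": "Group", "name": "General Data", "children": [
    {"role": "Edit", "name": "Customer"},
    {"role": "Edit", "name": "Country"},
]}
toolbar = {"role": "Toolbar", "name": "Standard"}

# Hypothetical window as recorded, and the same window after a patch adds a banner pane.
before = {"role": "Window", "name": "Customer Master", "children": [toolbar, general_data]}
after = {"role": "Window", "name": "Customer Master", "children": [
    toolbar, {"role": "Pane", "name": "Support Pack Notice"}, general_data,
]}

recorded_path = [1, 1]  # second child of the window, then its second child
print(by_path(before, recorded_path)["name"])            # prints "Country"
print(by_path(after, recorded_path))                     # prints "None"
print(by_role_and_name(after, "Edit", "Country")["name"])  # prints "Country"
```

In this toy case the path fails loudly. A shifted path can also land on a different control of the same kind, which is the worse outcome for an audited workflow.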
What is the cost difference once you do break the moat?
Three reported deployments give the shape. An LG-supplier F&B chain on SAP Business One moved off UiPath onto a tree-based agent and reported a 70% reduction in automation cost to their board. A mid-market insurance carrier doing claims intake went from 30 minutes per claim to 2 minutes, saving roughly $750K a year at their existing headcount math. A community bank running Jack Henry compressed customer onboarding from 8 weeks to 2 weeks. The pattern: the moat made the workload look uneconomic to automate at UiPath prices ($100K+ implementations, $250K+ annual maintenance on selector cascades), so it stayed manual. Once the cost of automating drops by something like the 70% the F&B chain reported, the workload tips and the labor savings show up.
What does the structured tree the model actually reads look like?
For a SAP Customer Master Data window, about 40 lines of text in the format '{LineNumber}. {RomanNumeralIndent}. [Role] \'Name\' {Attributes}'. Role is one of Window, Button, Edit, ComboBox, CheckBox, Tree, TreeItem, and a handful more. Name is the human-readable label a sighted user reads. Attributes are state: value, checked, selected, focusable. No image, no OCR, no pixel coordinates. The model gets this once, at authoring time, and emits a TypeScript workflow file that targets each control by role and name. The runtime that replays the file has zero LLM calls.
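The format described above can be reproduced with a short serializer. This is a sketch of the idea only, assuming an in-memory tree of `Control` objects; it is not the `generate_simplified_ui_tree_string` function from the repository, and the comma separator between attributes is an assumption, since the sample lines only ever show one attribute each.

```python
from dataclasses import dataclass, field


@dataclass
class Control:
    role: str
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def to_roman(n: int) -> str:
    """Depth marker: 1 -> I, 2 -> II, 4 -> IV, and so on."""
    pairs = [(100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def fmt_value(value) -> str:
    """Booleans print bare (true/false); everything else is double-quoted."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return f'"{value}"'


def serialize(root: Control) -> str:
    """Pre-order walk emitting one numbered, depth-marked line per control."""
    lines = []

    def walk(node: Control, depth: int) -> None:
        attrs = ", ".join(f"{k}={fmt_value(v)}" for k, v in node.attrs.items())
        suffix = f" {{{attrs}}}" if attrs else ""
        lines.append(f"{len(lines) + 1}. {to_roman(depth)}. [{node.role}] '{node.name}'{suffix}")
        for child in node.children:
            walk(child, depth + 1)

    walk(root, 1)
    return "\n".join(lines)


window = Control("Window", "Customer Master Data Maintenance", children=[
    Control("Toolbar", "Standard", children=[
        Control("Button", "Save", {"focusable": True}),
    ]),
    Control("Group", "General Data", children=[
        Control("Edit", "Country", {"value": "SG"}),
    ]),
])
print(serialize(window))
```

The output is plain text with no image data, which is what keeps the authoring-time prompt small.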
What happens when the legacy UI changes?
A tree-based runtime has more options than either selectors or pixels. The Mediar runtime walks four strategies in focus_state.rs: match by accessibility/automation ID, match by window plus bounds, match by visible text, and finally focus the parent window so the next event has a chance to fire. Most enterprise UI patches (a label rename, a panel reorder, a control ID change) are absorbed by one of the first three. A change that breaks all four flags the run for human review rather than silently posting against the wrong field, which is the conservative default for anything touching a general ledger or a clinical chart.
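The ordered fallback can be sketched as a chain of matchers tried in sequence. This is not the code in focus_state.rs, which is Rust and is not reproduced here; the dict-shaped elements, the function names, and the `NeedsHumanReview` exception are all hypothetical and exist only to illustrate the order of the four strategies and the review flag.

```python
from typing import Optional


class NeedsHumanReview(Exception):
    """Raised when none of the strategies can locate the recorded target."""


def by_automation_id(target: dict, snapshot: dict) -> Optional[dict]:
    wanted = target.get("automation_id")
    if not wanted:
        return None
    return next((e for e in snapshot["elements"] if e.get("automation_id") == wanted), None)


def by_window_and_bounds(target: dict, snapshot: dict) -> Optional[dict]:
    window, bounds = target.get("window"), target.get("bounds")
    if window is None or bounds is None:
        return None
    return next((e for e in snapshot["elements"]
                 if e.get("window") == window and e.get("bounds") == bounds), None)


def by_visible_text(target: dict, snapshot: dict) -> Optional[dict]:
    text = target.get("text")
    if not text:
        return None
    return next((e for e in snapshot["elements"] if e.get("text") == text), None)


def focus_parent_window(target: dict, snapshot: dict) -> Optional[dict]:
    # Last resort: settle for the parent window if it is still open.
    window = target.get("window")
    if window in snapshot["windows"]:
        return {"window": window, "parent_only": True}
    return None


STRATEGIES = [
    ("automation_id", by_automation_id),
    ("window_and_bounds", by_window_and_bounds),
    ("visible_text", by_visible_text),
    ("parent_window", focus_parent_window),
]


def resolve(target: dict, snapshot: dict) -> tuple:
    """Try each strategy in order; return (strategy_name, match) for the first hit."""
    for name, strategy in STRATEGIES:
        match = strategy(target, snapshot)
        if match is not None:
            return name, match
    raise NeedsHumanReview(f"no strategy matched {target.get('text')!r}")


recorded = {"automation_id": "txtCountry", "window": "Customer Master",
            "bounds": (10, 20, 100, 24), "text": "Country"}
# Hypothetical patch: the control ID changed and the field moved, the label did not.
patched = {"windows": ["Customer Master"], "elements": [
    {"automation_id": "fld_0042", "window": "Customer Master",
     "bounds": (10, 60, 100, 24), "text": "Country"},
]}
print(resolve(recorded, patched)[0])  # prints "visible_text"
```

Raising instead of guessing is the design choice that matters here: a run that cannot place its target stops and asks, rather than typing into whichever field happens to be nearest.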
What is the right buyer mental model for this?
Treat the moat as a fact, not a problem to solve. The incumbent ERP, EHR, or banking core vendor is not going to ship the API. The cheaper automation layers are not going to reach the desktop. The labor cost of the workflow is going to keep compounding. The only practical move is to put an automation layer on top of the OS surface the vendor cannot remove, deploy it in days not months, and price the runtime per minute rather than per seat. That is what Mediar sells. The $10K turn-key program fee converts to credits at $0.75 per minute of runtime. The math gets boring fast.
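The per-minute arithmetic is short enough to write down. The $10K fee and the $0.75 per minute rate are the figures quoted above, and the $250K comparison is the annual RPA maintenance figure cited earlier in this piece. The sketch ignores the credit bonus, whose size is not stated here, so treat the result as a lower bound on the minutes bought.

```python
def runtime_minutes(prepaid_usd: float, usd_per_minute: float = 0.75) -> float:
    """Minutes of runtime a prepaid amount buys at a flat per-minute rate (no bonus applied)."""
    return prepaid_usd / usd_per_minute


program = runtime_minutes(10_000)
print(f"{program:,.0f} minutes, about {program / 60:,.0f} hours")
# prints "13,333 minutes, about 222 hours"

# For scale: the same rate applied to a $250K annual maintenance budget.
maintenance = runtime_minutes(250_000)
print(f"{maintenance:,.0f} minutes")  # prints "333,333 minutes"
```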
Adjacent reading
AI agents on legacy desktop systems with no API
The technical companion. The exact serialized tree format the model reads, the four-strategy fallback when the UI shifts, and the source-code grep that disambiguates tree-based from vision-based architectures.
UiPath brittle selector maintenance cost
Where the moat-protected automation money actually goes inside a UiPath estate: not licenses, but the cascade of selector breaks every quarterly SAP support pack triggers.
Where RPA stalls on legacy apps
The failure taxonomy for the previous automation generation. Why selectors are not the same as tree reads, and why pixel-matching never recovered.