A buyer's runtime checklist
A robotic process automation tool is five operational guarantees in a trench coat.
The pages that currently rank for this topic answer the question with a leaderboard of logos. That is fine for a marketing audience and useless for a buyer who has to live with the runtime. The actual decision sits below the logos, in five guarantees the executor makes when no human is watching. This page names them, gives you the question to ask any vendor, and points at the exact line of code where Mediar's open-source runtime Terminator answers each one.
Direct answer · verified 2026-05-01
A robotic process automation tool is software that records a desktop or browser workflow once and replays it later by walking the live accessibility tree, with five runtime guarantees: bounded execution time, position-tolerant matching, multi-strategy element resolution, capped concurrency, and an auditable workflow file on disk.
The five guarantees are what separate a tool you can run in production from one that pages your team weekly. The Mediar runtime publishes its answer to each one in the open-source Terminator agent at github.com/mediar-ai/terminator. If a vendor cannot answer all five with concrete numbers and file references, the runtime is not load-bearing. Move on.
The five guarantees, at a glance
A claim and a vendor question for each. The detail sections below expand them and point at the real Mediar source location that answers each one.
1. Bounded execution
A scheduled run that hangs forever is not an automation, it is a leak. Ask the vendor for the hard wall-clock cap on a single run and the cleanup interval that reaps stuck executions.
2. Position-tolerant matching
If a window moves five pixels and the bot fails, the bot is not a tool, it is a script you wrote in a hostile editor. Ask which fields the runtime ignores when comparing two captures of the same screen.
3. Multi-strategy element resolution
One selector per step is the 2003 Blue Prism contract. Ask how many independent strategies the runtime will try, in what order, and what the failure mode is when all of them miss.
4. Capped concurrency with backpressure
Production RPA fleets run dozens of workflows in parallel against the same Windows session. Ask for the concurrency primitive (semaphore, queue, lease), the default limit, and whether it survives a host restart.
5. Auditable workflow files
If the bot's source of truth is a binary blob you open in proprietary studio software, you cannot diff it, code-review it, or roll it back. Ask for a text format the workflow lives in on disk.
The numbers Mediar's runtime publishes for each guarantee
These four numbers are visible in the open repo:

- Concurrency: 10 executions, the default ceiling per executor host
- Poll interval: 5 seconds, how often the queue looks for new work
- Cleanup window: 15 minutes, the wall-clock age past which a stuck execution is marked failed
- Per-run timeout: 3600 seconds, the hard ceiling on a single workflow
Sources: queue_processor.rs lines 32-55 (concurrency, ticker), queue_processor.rs lines 60-69 (15-minute cleanup), typescript_executor.rs line 292 (3600-second timeout). All four are environment-overridable. The point is not the specific number, it is that a number exists and lives in source you can read.
Guarantee 1
Bounded execution time
The first failure mode an RPA program hits in production is not a wrong click. It is a stuck run. A modal dialog the recorder did not see, a license server that timed out, a credentials prompt no one dismissed. Without a hard wall-clock cap, a single stuck workflow occupies a queue slot forever and silently halves your throughput. Two stuck workflows halve it again.
Question to ask any vendor
"What is the hard wall-clock cap on a single workflow run, what is the cleanup interval that reaps stuck executions, and where do those constants live in the source?"
What a bad answer looks like
"We do not impose a cap, the workflow runs as long as it needs to." Translation: a stuck run will sit in your queue until a human notices and kills it. Combined with per-bot licensing, this is how RPA programs quietly stall.
What Mediar's answer is
Three independent timers. The scheduled-trigger path enforces MAX_SCHEDULED_EXECUTION_SECS = 30 * 60 in apps/desktop/src-tauri/src/workflow_scheduler.rs line 161 and fires stop_execution against the MCP server when it trips. The on-demand executor wraps the MCP call in tokio::time::timeout(Duration::from_secs(3600)) at crates/executor/src/services/typescript_executor.rs line 292. The queue processor wakes a 60-second cleanup_ticker that calls cleanup_stale_executions(15) and marks anything older than 15 minutes as failed (queue_processor.rs lines 60-69).
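For readers who want the shape without opening the repo, here is a minimal sketch of two of those mechanisms, mirroring the cited constants; run_workflow and cleanup_stale_executions are hypothetical stand-ins, not Mediar's code.

```rust
use std::time::Duration;
use tokio::time::{interval, timeout};

// Hypothetical stand-ins for the real executor calls.
async fn run_workflow(_id: u64) { /* drive the UI via MCP */ }
async fn cleanup_stale_executions(_max_age_minutes: u64) { /* mark old rows failed */ }

// A hard wall-clock ceiling on a single run, mirroring the
// 3600-second timeout in typescript_executor.rs.
async fn execute_with_cap(id: u64) {
    match timeout(Duration::from_secs(3600), run_workflow(id)).await {
        Ok(()) => println!("run {id} finished inside the cap"),
        Err(_) => println!("run {id} hit the wall clock; the queue slot is reclaimed"),
    }
}

// The 60-second cleanup ticker that reaps executions stuck in the
// database for more than 15 minutes.
async fn cleanup_loop() {
    let mut tick = interval(Duration::from_secs(60));
    loop {
        tick.tick().await;
        cleanup_stale_executions(15).await;
    }
}

#[tokio::main]
async fn main() {
    tokio::spawn(cleanup_loop());
    execute_with_cap(42).await;
}
```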
Guarantee 2
Position-tolerant matching
The second failure mode is a five-pixel window move. The user docked their browser to the right half of the screen, or moved an external monitor, or switched DPI. A pixel-coordinate-based recorder fails. A selector that includes the bounding box fails. What you want is a runtime that explicitly throws away those fields before comparing two snapshots of the same UI, so cosmetic motion does not become a maintenance ticket.
Question to ask any vendor
"Which fields does the runtime ignore when comparing two captures of the same screen, and where is the function that strips them?"
What a bad answer looks like
"The recorder captures the exact screen state at record time and replays against it." That is a macro recorder, not an RPA tool. The bot will fail any time a window moves.
What Mediar's answer is
The function is remove_volatile_dom_attributes in apps/desktop/src-tauri/src/dom_tree_diff.rs at line 6. It walks the JSON tree and drops the keys "x", "y", "width", "height", and the "value" field on input elements (input values are captured separately in the meaningful event stream). A unit test at line 124, test_dom_diff_no_changes, asserts that comparing two trees identical except for x going from 100 to 200 produces None. The diff size is also capped at 50 lines so an avalanche of cosmetic changes does not blow up the analyzer.
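A minimal sketch of the stripping pattern, assuming a serde_json tree where the element name lives under a hypothetical "tag" key; the real function is remove_volatile_dom_attributes, this is only the shape of the walk.

```rust
use serde_json::Value;

/// Recursively drop layout-only keys before diffing two captures of
/// the same tree, so cosmetic motion produces no diff.
fn strip_volatile(node: &mut Value) {
    if let Value::Object(map) = node {
        // Geometry moves when a window moves; it is not identity.
        for key in ["x", "y", "width", "height"] {
            map.remove(key);
        }
        // Hypothetical shape: drop "value" on input elements, since
        // typed values live in the recorded event stream instead.
        if map.get("tag").and_then(Value::as_str) == Some("input") {
            map.remove("value");
        }
        for child in map.values_mut() {
            strip_volatile(child);
        }
    } else if let Value::Array(items) = node {
        for item in items.iter_mut() {
            strip_volatile(item);
        }
    }
}

fn main() {
    let mut a = serde_json::json!({"tag": "button", "x": 100, "y": 40, "text": "Submit"});
    let mut b = serde_json::json!({"tag": "button", "x": 200, "y": 40, "text": "Submit"});
    strip_volatile(&mut a);
    strip_volatile(&mut b);
    assert_eq!(a, b); // a hundred-pixel move is no diff at all
}
```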
Guarantee 3
Multi-strategy element resolution
This is the guarantee buyers underweight and engineers obsess over, and the engineers have it right. A runtime that has one match strategy per step is the 2003 Blue Prism contract. The first time a label changes from "Submit" to "Submit claim", the run fails. A modern runtime carries four pieces of evidence per step and walks them in order, treating the next as a fallback when the previous misses.
What 'self-healing' actually means at the source level
Start with the legacy contract. The recorder stores one selector per step: maybe an XPath, maybe a control id, maybe an image hash of the button. At replay time, the runtime asks the live UI for that exact selector. If it matches, the click runs. If it does not, the run fails and a maintenance ticket is filed in the orchestrator. A 'maintenance ticket' means a developer reopening the studio software, re-recording the broken step, redeploying the workflow, and praying the next layout shift waits a week. The contract, summarized:
- One match strategy per step, no fallback
- A label rename, a DPI change, or a tab reorder fails the run
- Maintenance cost scales with workflow_count multiplied by UI_change_rate
- Recorder and runtime are coupled to the studio software's vendor
The Mediar cascade lives in apps/desktop/src-tauri/src/focus_state.rs at lines 168 to 196, inside restore_focus_state. Strategy 1 is find_element_by_id (automation id). Strategy 2 is find_element_by_window_and_bounds. Strategy 3 is find_element_by_text. Strategy 4 is restore_window_focus, which only refocuses the parent window so the next step can retry. Each helper lives in its own function so the ordering is testable in isolation. The whole file is 403 lines and public.
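A sketch of the cascade's control flow. The helper names are the real ones from the file; their signatures here are hypothetical, and the real implementations carry far more context than this.

```rust
// Hypothetical finder signatures; the real helpers live in focus_state.rs.
struct Target {
    automation_id: Option<String>,
    text: String,
}
struct Element;

fn find_element_by_id(_id: &str) -> Option<Element> { None }
fn find_element_by_window_and_bounds(_t: &Target) -> Option<Element> { None }
fn find_element_by_text(_text: &str) -> Option<Element> { None }
fn restore_window_focus(_t: &Target) -> bool { true }

/// Walk the strategies in order; each one is an independent piece of
/// evidence, and the next only runs when the previous misses.
fn resolve(target: &Target) -> Result<bool, String> {
    if let Some(id) = &target.automation_id {
        if find_element_by_id(id).is_some() {
            return Ok(true); // strategy 1: automation id
        }
    }
    if find_element_by_window_and_bounds(target).is_some() {
        return Ok(true); // strategy 2: window plus bounds
    }
    if find_element_by_text(&target.text).is_some() {
        return Ok(true); // strategy 3: visible text
    }
    // Strategy 4: refocus the parent window only, so the step can
    // pause for re-recording instead of clicking the wrong control.
    if restore_window_focus(target) {
        return Ok(false);
    }
    Err("all four strategies missed".into())
}

fn main() {
    let target = Target { automation_id: Some("btn-submit".into()), text: "Submit".into() };
    println!("{:?}", resolve(&target)); // Ok(false): window refocused, step paused
}
```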
Guarantee 4
Capped concurrency with backpressure
A real RPA fleet does not run one workflow at a time. It runs dozens, sometimes hundreds, against shared Windows sessions. The runtime has to gate them with a concrete primitive (a semaphore, a lease table, a queue with a worker pool), expose a default ceiling, and survive a host restart without orphaning claims. Marketing language about "unlimited robots" without naming the primitive means the limit got pushed into the billing layer instead of being a real engineering decision.
Mediar uses a Tokio Semaphore. The constant is MAX_CONCURRENT_EXECUTIONS, default 10, overridable per host via the environment, declared in crates/executor/src/services/queue_processor.rs lines 32 to 35. A 5-second ticker drives the claim loop. A separate 60-second ticker drives the cleanup loop. Stale entries in the cancellation registry are pruned with a 1000-deep budget (line 70) so the data structure does not grow unbounded across long-running hosts.
The line worth noticing is "permit 7/10": that is the semaphore handing out one of its ten slots. Once permits run out, the claim loop spins without claiming; that spin is the backpressure surface that protects the database from a runaway producer.
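A sketch of that claim loop, with a hypothetical claim_pending_execution standing in for the real database call; the permit-or-spin structure is the point, not the specific signatures.

```rust
use std::sync::Arc;
use std::time::Duration;
use tokio::sync::Semaphore;
use tokio::time::interval;

// Hypothetical stand-ins for the real database and executor calls.
async fn claim_pending_execution() -> Option<u64> { None }
async fn run(_id: u64) {}

#[tokio::main]
async fn main() {
    // Mirrors MAX_CONCURRENT_EXECUTIONS: a concrete primitive with a
    // concrete ceiling, ten permits per executor host.
    let permits = Arc::new(Semaphore::new(10));
    let mut tick = interval(Duration::from_secs(5)); // the 5-second claim ticker

    loop {
        tick.tick().await;
        // try_acquire is the backpressure surface: when all ten permits
        // are out, the loop spins without claiming and pending work
        // stays in the database instead of piling onto the host.
        let Ok(permit) = permits.clone().try_acquire_owned() else {
            continue;
        };
        match claim_pending_execution().await {
            Some(id) => {
                tokio::spawn(async move {
                    run(id).await;
                    drop(permit); // the slot returns when the run ends
                });
            }
            None => drop(permit), // nothing pending; release immediately
        }
    }
}
```

try_acquire_owned never blocks, which is the design choice that matters: a full host refuses work instead of queueing it in memory, so pending runs stay durable in the database until a permit frees up.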
Guarantee 5
Auditable workflow files
The last guarantee is the one that decides whether your RPA program scales past five workflows. The bot's source of truth has to be plain text on disk. A binary .xaml that only opens in studio software cannot be code-reviewed, cannot be diffed in a pull request, and cannot be rolled back without that software running. Compliance teams cannot audit it, and a new engineer cannot read it on their second day.
Mediar emits the workflow as a TypeScript file with eight semantic fields per step: step_title, user_intent, what_was_clicked, what_was_typed, expected_outcome, validation_rules, fallback_behavior, retry_policy. Every one of them is a string a human can read. The file checks into git, runs through whatever review process you already have for application code, and shows up in git blame when the workflow changes. The executor that runs this file (typescript_executor.rs) is 871 lines and contains zero references to gemini, claude, or openai. The model is gone after authoring; the runtime is plain Rust calling MCP. That separation is what makes the file auditable: it is a deterministic spec, not a model invocation.
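For illustration, here is the shape of one step as the Rust executor might deserialize it. The eight field names are from the format above; the struct itself is hypothetical, not Mediar's type.

```rust
use serde::Deserialize;

/// Hypothetical mirror of one step's eight semantic fields, every one
/// of them a plain string a reviewer can read without tooling.
#[derive(Debug, Deserialize)]
struct WorkflowStep {
    step_title: String,        // "Open the claims queue"
    user_intent: String,       // why the human did this
    what_was_clicked: String,  // the control, by name, not by pixel
    what_was_typed: String,    // the literal input, if any
    expected_outcome: String,  // what the screen should show after
    validation_rules: String,  // how the runtime checks the outcome
    fallback_behavior: String, // what to do when validation fails
    retry_policy: String,      // how many attempts, how spaced
}
```

Because every field is prose, renaming a control shows up as a one-line diff in review, the same as renaming a variable.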
Question to ask any vendor
"Show me the workflow on disk. Open it in a plain text editor. Tell me which lines change when I rename a field, and where that diff shows up in your version control story."
What a bad answer looks like
"The workflow lives in our orchestrator, you can export it as an XML package." That is a binary blob with an XML wrapper. You will not diff it.
Honest scoreboard
Five guarantees, scored against where the category sits today. The scoring reflects what we have observed across active buyer evaluations of UiPath, Power Automate Desktop, Blue Prism, and Automation Anywhere; it is not a published vendor ranking. Run the questions above against any vendor and score them yourself.
What this checklist does not measure
Five things, in the spirit of being honest. The runtime checklist is necessary, not sufficient.
- It does not measure how well the recorder generates a workflow from a single demonstration. That is an authoring-time question and depends on the model behind step analysis, labeling, and synthesis.
- It does not measure how well the tool handles applications that do not expose accessibility properly. SAP GUI, Oracle EBS, Jack Henry, Fiserv, FIS, and Epic all expose enough of a tree to be automatable; some custom apps do not, and no runtime survives a screen with zero structure.
- It does not measure compliance posture. SOC 2, HIPAA, audit logs, on-prem deployment are real concerns and are orthogonal to the runtime details on this page.
- It does not measure unit economics. Per-seat licensing versus per-runtime-minute billing changes total cost more than the runtime details do, and is a separate decision.
- It does not measure people. The fastest way to fail an RPA program is to buy great tooling and assign one part-time developer to a hundred workflows. No runtime saves that.
Frequently asked
What is a robotic process automation tool, in one paragraph?
A piece of software that records a UI workflow once while a person performs the task, then replays the workflow later by walking the live accessibility tree to find the same controls. The 'tool' is really three things in one: a recorder, a workflow file format, and a runtime. The runtime is the part that matters when you are picking between vendors. It is what survives a UI update on a Tuesday morning, what kills a stuck run before it eats your queue, and what writes the audit log compliance asks for. The other two layers are a packaging detail.
Why is bounded execution time the first thing on this checklist?
Because a single hung workflow occupying a queue slot is the cheapest way to kill an RPA program. Mediar's scheduled-trigger path caps a run at 30 minutes (MAX_SCHEDULED_EXECUTION_SECS in apps/desktop/src-tauri/src/workflow_scheduler.rs at line 161) and kills it with stop_execution against the MCP server. The on-demand executor adds a 3600-second wall-clock timeout in crates/executor/src/services/typescript_executor.rs at line 292. A periodic cleanup_stale_executions(15) call on a 60-second ticker reaps anything in the database stuck for more than 15 minutes. Three independent timers protecting three distinct failure modes is the right shape; one timer is not.
What does 'position-tolerant' actually mean at the source level?
It means the runtime explicitly drops x, y, width, and height fields before it compares two snapshots of the same DOM tree. In Mediar that happens in remove_volatile_dom_attributes at apps/desktop/src-tauri/src/dom_tree_diff.rs line 6. The same function also drops the value field on input elements (input values live in the recorded events, not in the DOM diff). A test in the same file (test_dom_diff_no_changes at line 124) asserts that two trees identical except for x going from 100 to 200 produce a diff of None. If a vendor cannot point you at code that does this, their bot will fail every time IT pushes an interface refresh.
How many element-resolution strategies should a runtime have?
Four is enough; one is not. Mediar's restore_focus_state in apps/desktop/src-tauri/src/focus_state.rs walks them in order from line 168: automation id, window plus bounds, visible text, parent window. Each strategy has its own find_* helper. If all four return None, the call returns Ok(false) and the step pauses for re-recording instead of clicking on the wrong control. The cascade is what lets a workflow survive normal application updates without a person opening the studio software.
What does the concurrency model look like in production?
In Mediar, a Tokio Semaphore with MAX_CONCURRENT_EXECUTIONS permits (default 10, environment-overridable) gates how many workflows a single executor host runs at once. The queue processor wakes every 5 seconds, claims one pending execution per available permit, and spawns it. A separate cleanup ticker fires every 60 seconds to drop stale claims and stale cancellation entries. The whole loop is in crates/executor/src/services/queue_processor.rs and is around 270 lines. If a vendor's docs talk about 'unlimited robots' without naming a primitive, they have either pushed the limit into a billing layer or they have not stress-tested the runtime.
What does 'auditable workflow file' mean?
Plain text on disk that a code reviewer can read without installing the vendor's software. Mediar emits the workflow as a TypeScript file with eight semantic fields per step (step_title, user_intent, what_was_clicked, what_was_typed, expected_outcome, validation_rules, fallback_behavior, retry_policy). It checks into your source control system, runs through the same review process as application code, and shows up in git blame when something changes. Tools that store the workflow as a binary .xaml or a closed JSON blob fail this test: you cannot diff them, you cannot review them, and you cannot roll one back without studio software running.
Why does the open-source angle matter for a buying decision?
Because every claim in this checklist is verifiable. The Mediar runtime ships as the Terminator agent at github.com/mediar-ai/terminator. You can read the queue processor, the focus_state cascade, the dom_tree_diff stripper, and the workflow scheduler without signing an NDA. Closed-source RPA vendors have to ask you to take their reliability claims on faith. We do not, and the reason this page exists is that we think a buyer should not need to.
How does Mediar's runtime billing change the math?
Per-seat licensing and per-unattended-robot licensing turn the buying decision into a procurement negotiation. Mediar bills $0.75 per minute of workflow runtime against a $10,000 program fee that converts to credits. A workflow that runs 23 steps in 174 seconds bills 2.9 minutes, end of math. Quiet weeks cost less. Busy weeks cost more. Idle bots cost nothing. There is no per-developer floor and no minimum to negotiate around.
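The arithmetic, worked in exact integer cents so there is no rounding dispute; a sketch that assumes only the published rate.

```rust
fn main() {
    let rate_cents_per_minute: u64 = 75; // $0.75 per runtime minute
    let run_seconds: u64 = 174;          // the 23-step run above
    // Work in tenths of a cent so the division stays exact.
    let cost_millicents = run_seconds * rate_cents_per_minute * 10 / 60;
    println!(
        "{run_seconds} s -> {}.{} cents",
        cost_millicents / 10,
        cost_millicents % 10
    ); // 174 s -> 217.5 cents, call it $2.18
}
```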
Want to run the five questions on a real workflow?
Book a 30-minute call with the founders. Bring one workflow. We walk through the runtime guarantees against the actual screens the bot has to drive, with the source on a screen share.
Keep reading
Three companion pieces that go deeper on individual parts of this checklist.
Tools for robotic process automation: the 28 named primitives an RPA runtime is built from
The companion catalogue. If this page is the runtime checklist, that one is the API surface that satisfies it. Read the two together to see which tool calls satisfy which guarantee.
What robotic process automation is, in three numbers: six event types, four stages, four match strategies
Sibling piece on the recording side. Walks the six meaningful event types the recorder admits and the four-stage pipeline that turns a recording into a runnable file.
Robotic process automation with UiPath: where Studio, Robot, and Orchestrator end and the limits begin
If you are evaluating UiPath specifically, this page maps the studio + robot + orchestrator architecture against the same five guarantees and shows where Mediar lands on each.