RPA Wiki: Robotic Process Automation, in plain terms

A short reference you can actually use. It covers what RPA is, the glossary that matters, and the one detail most RPA references skip: how a bot decides what to click. That decision, not the vendor logo, determines how often your automation breaks.

Matthew Diakonov
9 min read

Direct answer · verified 2026-06-20

RPA (robotic process automation) uses software “robots” to automate rule-based, repetitive tasks by operating an application’s user interface the way a person would: clicking, typing, copying, and pasting.

It is distinct from AI: classic RPA executes a predefined script and does not learn or reason on its own. Source: Robotic process automation (Wikipedia).

RPA at a glance

Full name
Robotic Process Automation
Also written
RPA, software robots, software bots, digital workers
Category
Business process automation
Does
Drives existing application interfaces (clicks, keystrokes, copy/paste, form fills) to run rule-based, repetitive tasks
Not the same as
AI. Classic RPA follows a predefined script. It does not learn or reason on its own
Common vendors
UiPath, Automation Anywhere, Blue Prism, Microsoft Power Automate
Typical buyer
Ops, finance, IT, and the RPA Center of Excellence

How an RPA bot actually “sees” your screen

Most RPA references stop at “the bot mimics a human.” That hides the most important engineering decision in the whole field. A bot has to locate an element before it can click it, and there are four main ways to do that. The method chosen at build time decides whether your automation runs for years or breaks on the next interface update.

| Perception layer | How it finds elements | Breaks when | Maintenance cost |
| --- | --- | --- | --- |
| Image / template matching | Stores a screenshot of a button and searches the screen for matching pixels (OCR for text). | Theme change, resolution change, font smoothing, a redrawn icon. | High. Visuals change often and silently. |
| Fixed coordinates | Clicks a recorded x/y position, sends keystrokes blindly. | Window moves or resizes, a field shifts, a popup appears. | Highest. The most brittle of all. |
| DOM / web selectors | Targets HTML elements by id, class, or XPath in a browser. | Markup is refactored, classes are minified, the app is not a web app. | Medium for web, unusable for native desktop apps. |
| Accessibility tree | Reads the structured element tree the OS already exposes (the same data screen readers use), targeting elements by name and role. | Rarely. A relabelled field, not a moved pixel. | Lowest, and works on legacy desktop apps with no API. |

The bottom row is why the field is shifting. Reading the accessibility tree means the bot tracks an element’s identity (its name and role), not its appearance, so cosmetic UI changes stop being outages. More detail in RPA selectors vs the accessibility tree and where RPA stalls on legacy apps.
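The difference between targeting by position and targeting by identity can be shown with a toy model. The sketch below is not any vendor's API: `Element`, `find_by_name_and_role`, `find_by_coordinates`, and `build_window` are hypothetical names invented for illustration, and the "window" is just an in-memory tree. It assumes a redesign that moves a Save button without renaming it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """A toy UI node: identity (role, name) plus where it is drawn (x, y)."""
    role: str
    name: str
    x: int = 0
    y: int = 0
    children: List["Element"] = field(default_factory=list)


def find_by_name_and_role(root: Element, role: str, name: str) -> Optional[Element]:
    """Depth-first search on identity; position is never consulted."""
    if root.role == role and root.name == name:
        return root
    for child in root.children:
        hit = find_by_name_and_role(child, role, name)
        if hit is not None:
            return hit
    return None


def find_by_coordinates(root: Element, x: int, y: int) -> Optional[Element]:
    """What a fixed-coordinate bot effectively does: take whatever sits at (x, y)."""
    if (root.x, root.y) == (x, y):
        return root
    for child in root.children:
        hit = find_by_coordinates(child, x, y)
        if hit is not None:
            return hit
    return None


def build_window(save_x: int) -> Element:
    """Hypothetical invoice window; only the Save button's x position varies."""
    return Element("Window", "Invoice", 0, 0, [
        Element("Edit", "Amount", 100, 40),
        Element("Button", "Save", save_x, 200),
        Element("Button", "Cancel", 300, 200),
    ])


before = build_window(save_x=200)
after = build_window(save_x=400)  # redesign: Save moved to the right

# Name + role resolves in both layouts.
print(find_by_name_and_role(before, "Button", "Save").x)  # 200
print(find_by_name_and_role(after, "Button", "Save").x)   # 400
# The recorded coordinate (200, 200) finds nothing after the redesign.
print(find_by_coordinates(after, 200, 200))               # None
```

A real accessibility tree carries far more than a role and a name, but the failure mode is the same: the coordinate lookup depends on layout, the identity lookup does not.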

What accessibility-tree execution looks like

Mediar’s open-source Terminator SDK (Rust, MIT license) is one concrete implementation of the bottom row. It locates controls by name and role on the Windows Accessibility Tree, the same data a screen reader consumes, rather than by recorded coordinates or stored screenshots. In the executor examples it addresses a target as role:Document and waits for the element to resolve, instead of clicking a pixel position. When a label moves, the agent re-resolves it by name; there is no brittle selector to rebuild.
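The "address a target, then wait for it to resolve" pattern can be sketched in a few lines of Python. This is an illustration only, not Terminator's API: `parse_selector` and `wait_for` are hypothetical helpers, the `key:value` grammar is an assumption modelled on the `role:Document` string mentioned above, and the resolver is a stand-in that returns `None` until the element "appears".

```python
import time
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def parse_selector(selector: str) -> Tuple[str, str]:
    """Split 'role:Document' into ('role', 'Document'). Hypothetical mini-grammar."""
    key, sep, value = selector.partition(":")
    key, value = key.strip(), value.strip()
    if not sep or not key or not value:
        raise ValueError(f"expected 'key:value', got {selector!r}")
    return key, value


def wait_for(resolve: Callable[[], Optional[T]],
             timeout_s: float = 5.0,
             poll_s: float = 0.05) -> T:
    """Poll resolve() until it returns a non-None value or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        found = resolve()
        if found is not None:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError(f"element did not resolve within {timeout_s}s")
        time.sleep(poll_s)


# Stand-in resolver: the element only becomes available on the third poll.
attempts = {"n": 0}


def fake_resolve() -> Optional[str]:
    attempts["n"] += 1
    return "document-handle" if attempts["n"] >= 3 else None


key, value = parse_selector("role:Document")
handle = wait_for(fake_resolve, timeout_s=1.0, poll_s=0.01)
print(key, value, handle)  # role Document document-handle
```

Waiting on resolution rather than sleeping a fixed interval is the design choice worth copying: the bot proceeds as soon as the control exists and fails loudly if it never does.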

Record once, then execute against the accessibility tree

[Sequence diagram. Participants: You (record once), Mediar agent, Accessibility tree, SAP GUI. Steps, in order:]

1. Demonstrate the task once
2. Read elements by name + role
3. Resolve the real control
4. Field labels, roles, values
5. Structured element handle
6. Type / click the resolved control
7. (Self-loop on the Mediar agent) UI relabelled? Re-resolve by name, no rebuild

The self-loop on the agent is the part that matters for maintenance: when the interface is relabelled, the element is re-resolved by its name and role, not rediscovered by appearance.
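A minimal sketch of that re-resolve loop, under stated assumptions: `ToyUI`, `StaleElementError`, and `click_with_reresolve` are hypothetical names, and the toy models a redraw that invalidates cached handles while the control keeps its role and name. It is not how Mediar or any specific product implements the step.

```python
from typing import Dict, List, Tuple


class StaleElementError(Exception):
    """Raised by the toy UI when a cached handle no longer points at a live control."""


class ToyUI:
    """Stand-in for an application: maps (role, name) to an opaque handle id."""

    def __init__(self) -> None:
        self._next_id = 1
        self._live: Dict[Tuple[str, str], int] = {}
        self.clicked: List[int] = []

    def add(self, role: str, name: str) -> None:
        self._live[(role, name)] = self._next_id
        self._next_id += 1

    def redraw(self) -> None:
        """Simulate a UI rebuild: same controls, all new handle ids."""
        for key in list(self._live):
            self._live[key] = self._next_id
            self._next_id += 1

    def resolve(self, role: str, name: str) -> int:
        return self._live[(role, name)]

    def click(self, handle: int) -> None:
        if handle not in self._live.values():
            raise StaleElementError(handle)
        self.clicked.append(handle)


def click_with_reresolve(ui: ToyUI, role: str, name: str,
                         cached_handle: int, max_attempts: int = 2) -> int:
    """Try the cached handle; if it is stale, re-resolve by role and name and retry.

    Returns the handle that was actually clicked.
    """
    handle = cached_handle
    for _ in range(max_attempts):
        try:
            ui.click(handle)
            return handle
        except StaleElementError:
            handle = ui.resolve(role, name)
    raise StaleElementError(f"could not click {role}:{name}")


ui = ToyUI()
ui.add("Button", "Post")
old = ui.resolve("Button", "Post")
ui.redraw()  # the cached handle is now stale
new = click_with_reresolve(ui, "Button", "Post", old)
print(old != new, ui.clicked == [new])  # True True
```

The point of the sketch is that the recovery path needs no stored screenshot or recorded position, only the identity the bot was given in the first place.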

The RPA landscape, by category

People look up “RPA” expecting a vendor list. It is more useful to group tools by what they are good at, because that maps directly to where each one breaks.

Legacy desktop RPA

UiPath, Automation Anywhere, Blue Prism. Studio-built bots, orchestrators, attended and unattended runners. Powerful, but selector and image maintenance grows with every UI change.

Office / cloud automation

Microsoft Power Automate, Workato, Zapier. Strong on SaaS and APIs. Weaker the moment a workflow touches a native desktop app or a green-screen terminal.

Document / IDP layer

Intelligent document processing that reads PDFs and scans, usually bolted onto an RPA bot to feed it structured fields.

AI desktop agents

Newer agents that watch a workflow once, then execute through OS accessibility APIs. No recorded pixels, no selectors to maintain. Mediar and its open-source Terminator SDK sit here.

RPA glossary

The terms you will hit reading any RPA documentation, defined without the marketing.

Bot (software robot)

A configured automation that performs a defined task by operating application interfaces. Not a physical robot.

Attended automation

Runs on a person's machine, triggered by the user, often mid-task. Think of a bot that fills a form while an agent is on a call.

Unattended automation

Runs on a server or VM with no human present, usually on a schedule or queue. The bulk of back-office RPA.

Orchestrator

The control plane that schedules bots, distributes jobs, stores credentials, and reports runs. UiPath calls it Orchestrator; other vendors use different names.

Selector

The rule a bot uses to find a UI element (an XPath, an image, a coordinate, or an accessibility name/role). The selector strategy is the single biggest driver of how often a bot breaks.

Accessibility tree

A structured representation of on-screen controls that the operating system exposes for assistive tech (screen readers). On Windows it is exposed through UI Automation. Bots can read it to target elements by name and role instead of pixels.

Self-healing

A bot's ability to keep working when the interface shifts. Real self-healing comes from re-resolving an element by a stable identity (its name and role), not from a second backup screenshot.

IDP (intelligent document processing)

Extracting structured data from unstructured documents (invoices, claims, forms) so a bot can act on it.

Center of Excellence (CoE)

The internal team that owns RPA standards, pipeline, and governance. The CoE lead is usually the technical sponsor for any new automation tool.

Hyperautomation

An umbrella term for combining RPA with AI, process mining, and IDP to automate end to end rather than task by task.

Idempotency

The property that makes a run safe to repeat: running it twice has the same effect as running it once. If a bot reruns a posted transaction, idempotency is what stops it from double-posting.
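One common way to get idempotency is to derive a key from the fields that identify the business event and refuse to post the same key twice. The sketch below assumes an in-memory `Ledger` and an `idempotency_key` helper, both hypothetical; a real bot would keep the seen keys in durable storage or rely on the target system's own duplicate check.

```python
import hashlib
from typing import Dict


class Ledger:
    """Toy posting target that remembers which idempotency keys it has seen."""

    def __init__(self) -> None:
        self.posted: Dict[str, dict] = {}

    def post(self, key: str, txn: dict) -> bool:
        """Post txn once per key. Returns True if posted, False if it was a repeat."""
        if key in self.posted:
            return False
        self.posted[key] = txn
        return True


def idempotency_key(txn: dict) -> str:
    """Derive a stable key from the transaction's fields, independent of dict order."""
    raw = "|".join(f"{k}={txn[k]}" for k in sorted(txn))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


ledger = Ledger()
txn = {"invoice": "INV-1001", "amount": "250.00", "vendor": "ACME"}
key = idempotency_key(txn)

print(ledger.post(key, txn))  # True: first run posts
print(ledger.post(key, txn))  # False: rerun is a no-op
print(len(ledger.posted))     # 1
```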

For a broader set of definitions across automation, see the Mediar glossary and the longer What is RPA guide.

Stuck maintaining brittle RPA selectors?

Bring one workflow that keeps breaking and we will show you what accessibility-tree execution does with it.

RPA, common questions

What is RPA in one sentence?

Robotic process automation is software that runs rule-based, repetitive tasks by driving the same application interface a person uses, clicking, typing, copying, and pasting, so the work happens faster and without manual effort.

Is RPA the same as AI?

No. Classic RPA follows a predefined script and does not learn or reason. AI agents perceive context and decide what to do. The two are increasingly combined (often called hyperautomation), but traditional RPA on its own is deterministic automation, not intelligence.

Why do RPA bots break so often?

Because of how they find elements on screen. Bots built on recorded coordinates or stored screenshots break whenever the layout, theme, or resolution shifts. Bots that read the operating system's accessibility tree target elements by name and role, so they survive cosmetic UI changes that would break a pixel matcher.

What is the difference between attended and unattended RPA?

Attended automation runs on a person's machine and is usually triggered by that person during their work. Unattended automation runs on a server with no human present, typically on a schedule or from a queue. Most back-office volume is unattended.

Can RPA automate legacy desktop apps with no API?

Tools that rely on the accessibility tree can, because the operating system exposes those controls even when the app has no API. This is exactly where browser-only and API-only automation cannot help: SAP GUI, mainframe terminals, banking core screens, and older EHR clients.

What does Mediar's Terminator SDK do differently?

Terminator is an open-source (MIT) Rust SDK that locates UI elements by name and role on the Windows Accessibility Tree rather than by pixels or recorded coordinates. Because there are no brittle selectors to maintain, automations re-resolve elements when a UI is relabelled instead of needing a rebuild. It is on GitHub at github.com/mediar-ai/terminator.
