RPA Wiki: Robotic Process Automation, in plain terms
A short reference you can actually use: what RPA is, the glossary that matters, and the one detail most RPA references skip, namely how a bot decides what to click. That decision, not the vendor logo, determines how often your automation breaks.
Direct answer · verified 2026-06-20
RPA (robotic process automation) uses software “robots” to automate rule-based, repetitive tasks by operating an application’s user interface the way a person would: clicking, typing, copying, and pasting.
It is distinct from AI: classic RPA executes a predefined script and does not learn or reason on its own. Source: Robotic process automation (Wikipedia).
RPA at a glance
- Full name: Robotic Process Automation
- Also written: RPA, software robots, software bots, digital workers
- Category: Business process automation
- Does: Drives existing application interfaces (clicks, keystrokes, copy/paste, form fills) to run rule-based, repetitive tasks
- Not the same as: AI. Classic RPA follows a predefined script. It does not learn or reason on its own
- Common vendors: UiPath, Automation Anywhere, Blue Prism, Microsoft Power Automate
- Typical buyer: Ops, finance, IT, and the RPA Center of Excellence
How an RPA bot actually “sees” your screen
Most RPA references stop at “the bot mimics a human.” That hides the most important engineering decision in the whole field. A bot has to locate an element before it can click it, and there are only four ways to do that. The method chosen at build time is what decides whether your automation runs for years or breaks on the next interface update.
| Perception layer | How it finds elements | Breaks when | Maintenance cost |
|---|---|---|---|
| Image / template matching | Stores a screenshot of a button and searches the screen for matching pixels (OCR for text). | Theme change, resolution change, font smoothing, a redrawn icon. | High. Visuals change often and silently. |
| Fixed coordinates | Clicks a recorded x/y position, sends keystrokes blindly. | Window moves or resizes, a field shifts, a popup appears. | Highest. The most brittle of all. |
| DOM / web selectors | Targets HTML elements by id, class, or XPath in a browser. | Markup is refactored, classes are minified, the app is not a web app. | Medium for web, unusable for native desktop apps. |
| Accessibility tree | Reads the structured element tree the OS already exposes (the same data screen readers use), targeting elements by name and role. | Rarely. A relabelled field, not a moved pixel. | Lowest, and works on legacy desktop apps with no API. |
The bottom row is why the field is shifting. Reading the accessibility tree means the bot tracks an element’s identity (its name and role), not its appearance, so cosmetic UI changes stop being outages. More detail in RPA selectors vs the accessibility tree and where RPA stalls on legacy apps.
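The difference between the second and fourth rows of the table can be sketched with a toy model. Everything below is hypothetical and for illustration only: `Node`, `find_by_identity`, and `find_by_point` are made-up names, and the tree is a hand-built stand-in for what an operating system would expose, not a real accessibility API.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Node:
    """One control in a toy accessibility tree (hypothetical model)."""
    role: str
    name: str
    bounds: tuple[int, int, int, int]  # x, y, width, height
    children: list["Node"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def find_by_identity(root: Node, role: str, name: str) -> Optional[Node]:
    """Accessibility-style lookup: match on role and name, ignore position."""
    return next((n for n in walk(root) if n.role == role and n.name == name), None)


def find_by_point(root: Node, x: int, y: int) -> Optional[Node]:
    """Coordinate-style lookup: last control in traversal order containing (x, y)."""
    hit = None
    for n in walk(root):
        bx, by, bw, bh = n.bounds
        if bx <= x < bx + bw and by <= y < by + bh:
            hit = n
    return hit


def build_window(submit_y: int) -> Node:
    """Build the same form with the Submit button at a given vertical offset."""
    return Node("Window", "Invoice entry", (0, 0, 800, 600), [
        Node("Edit", "Amount", (20, 40, 200, 24)),
        Node("Button", "Submit", (20, submit_y, 80, 24)),
    ])


before = build_window(submit_y=80)
after = build_window(submit_y=140)  # a layout change pushed the button down

recorded_click = (30, 90)  # coordinates recorded against the old layout

# The recorded point hits Submit before the change and the bare window after it.
print(find_by_point(before, *recorded_click).name)
print(find_by_point(after, *recorded_click).name)

# The identity lookup finds Submit in both layouts, wherever it moved to.
print(find_by_identity(after, "Button", "Submit").bounds)
```

In this sketch the recorded click silently lands on the wrong control after the layout change, while the role-and-name lookup is unaffected. It would still fail if the button were renamed, which matches the "Breaks when" column for the accessibility-tree row.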
What accessibility-tree execution looks like
Mediar’s open-source Terminator SDK (Rust, MIT license) is one concrete implementation of the bottom row. It locates controls by name and role on the Windows Accessibility Tree, the same data a screen reader consumes, rather than by recorded coordinates or stored screenshots. In the executor examples it addresses a target as role:Document and waits for the element to resolve, instead of clicking a pixel position. When a label moves, the agent re-resolves it by name; there is no brittle selector to rebuild.
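The "wait for the element to resolve" pattern can be illustrated generically. The sketch below is not Terminator's API: `wait_for` and the fake locator are hypothetical names, and the locator simply pretends that a `Document` element appears on the third lookup.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(locate: Callable[[], Optional[T]],
             timeout: float = 5.0,
             interval: float = 0.1) -> T:
    """Poll `locate` until it returns a non-None element or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        element = locate()
        if element is not None:
            return element
        if time.monotonic() >= deadline:
            raise TimeoutError(f"element did not resolve within {timeout}s")
        time.sleep(interval)


def make_slow_locator(ready_after: int):
    """Return a fake locator that yields None for the first `ready_after` calls."""
    calls = {"n": 0}

    def locate():
        calls["n"] += 1
        return {"role": "Document"} if calls["n"] > ready_after else None

    return locate, calls


locate, calls = make_slow_locator(ready_after=2)
element = wait_for(locate, timeout=2.0, interval=0.01)
print(element, "resolved after", calls["n"], "lookups")
```

The point of the design is that the bot acts only once the target exists in the tree, rather than sleeping for a fixed time and then clicking a position and hoping the window has finished loading.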
[Figure: Record once, then execute against the accessibility tree]
The self-loop on the agent is the part that matters for maintenance: when the interface is relabelled, the element is re-resolved by its name and role, not rediscovered by appearance.
The RPA landscape, by category
People look up “RPA” expecting a vendor list. It is more useful to group tools by what they are good at, because that maps directly to where each one breaks.
Legacy desktop RPA
UiPath, Automation Anywhere, Blue Prism. Studio-built bots, orchestrators, attended and unattended runners. Powerful, but selector and image maintenance grows with every UI change.
Office / cloud automation
Microsoft Power Automate, Workato, Zapier. Strong on SaaS and APIs. Weaker the moment a workflow touches a native desktop app or a green-screen terminal.
Document / IDP layer
Intelligent document processing that reads PDFs and scans, usually bolted onto an RPA bot to feed it structured fields.
AI desktop agents
Newer agents that watch a workflow once, then execute through OS accessibility APIs. No recorded pixels, no selectors to maintain. Mediar and its open-source Terminator SDK sit here.
RPA glossary
The terms you will hit reading any RPA documentation, defined without the marketing.
Bot (software robot)
A configured automation that performs a defined task by operating application interfaces. Not a physical robot.
Attended automation
Runs on a person's machine, triggered by the user, often mid-task. Think of a bot that fills in a form while an agent is on a call.
Unattended automation
Runs on a server or VM with no human present, usually on a schedule or queue. The bulk of back-office RPA.
Orchestrator
The control plane that schedules bots, distributes jobs, stores credentials, and reports runs. UiPath calls it Orchestrator; other vendors use different names.
Selector
The rule a bot uses to find a UI element (an XPath, an image, a coordinate, or an accessibility name/role). The selector strategy is the single biggest driver of how often a bot breaks.
Accessibility tree
A structured representation of on-screen controls that the operating system exposes for assistive tech (screen readers). On Windows it is exposed through UI Automation. Bots can read it to target elements by name and role instead of pixels.
Self-healing
A bot's ability to keep working when the interface shifts. Real self-healing comes from re-resolving an element by a stable identity (its name and role), not from a second backup screenshot.
IDP (intelligent document processing)
Extracting structured data from unstructured documents (invoices, claims, forms) so a bot can act on it.
Center of Excellence (CoE)
The internal team that owns RPA standards, pipeline, and governance. The CoE lead is usually the technical sponsor for any new automation tool.
Hyperautomation
An umbrella term for combining RPA with AI, process mining, and IDP to automate end to end rather than task by task.
Idempotency
A run that is safe to repeat. If a bot reruns a posted transaction, idempotency is what stops it from double-posting.
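One common way to get idempotency is to attach a stable key to each transaction and refuse to post the same key twice. The sketch below is a minimal illustration under stated assumptions: `Ledger` is a hypothetical in-memory stand-in for the target system, and the key is derived from the transaction's content, which only works if that content includes a unique business identifier such as an invoice number.

```python
import hashlib
import json


class Ledger:
    """Hypothetical stand-in for the system a bot posts transactions to."""

    def __init__(self) -> None:
        self.entries: list[dict] = []
        self._seen: set[str] = set()

    @staticmethod
    def idempotency_key(txn: dict) -> str:
        """Derive a stable key from the transaction's content."""
        canonical = json.dumps(txn, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def post(self, txn: dict) -> bool:
        """Post once per key; return False when the call is a duplicate."""
        key = self.idempotency_key(txn)
        if key in self._seen:
            return False
        self._seen.add(key)
        self.entries.append(txn)
        return True


ledger = Ledger()
txn = {"invoice": "INV-1001", "amount": "250.00"}

first = ledger.post(txn)         # original run
second = ledger.post(dict(txn))  # bot reruns after a crash or retry

print(first, second, len(ledger.entries))
```

In a real deployment the set of seen keys would need to live somewhere durable (a database or the target system itself), since an in-memory set is lost exactly when a bot crashes and reruns.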
For a broader set of definitions across automation, see the Mediar glossary and the longer What is RPA guide.
RPA, common questions
What is RPA in one sentence?
Robotic process automation is software that runs rule-based, repetitive tasks by driving the same application interface a person uses, clicking, typing, copying, and pasting, so the work happens faster and without manual effort.
Is RPA the same as AI?
No. Classic RPA follows a predefined script and does not learn or reason. AI agents perceive context and decide what to do. The two are increasingly combined (often called hyperautomation), but traditional RPA on its own is deterministic automation, not intelligence.
Why do RPA bots break so often?
Because of how they find elements on screen. Bots built on recorded coordinates or stored screenshots break whenever the layout, theme, or resolution shifts. Bots that read the operating system's accessibility tree target elements by name and role, so they survive cosmetic UI changes that would break a pixel matcher.
What is the difference between attended and unattended RPA?
Attended automation runs on a person's machine and is usually triggered by that person during their work. Unattended automation runs on a server with no human present, typically on a schedule or from a queue. Most back-office volume is unattended.
Can RPA automate legacy desktop apps with no API?
Tools that rely on the accessibility tree can, because the operating system exposes those controls even when the app has no API. This is exactly where browser-only and API-only automation cannot help: SAP GUI, mainframe terminals, banking core screens, and older EHR clients.
What does Mediar's Terminator SDK do differently?
Terminator is an open-source (MIT) Rust SDK that locates UI elements by name and role on the Windows Accessibility Tree rather than by pixels or recorded coordinates. Because there are no brittle selectors to maintain, automations re-resolve elements when a UI is relabelled instead of needing a rebuild. It is on GitHub at github.com/mediar-ai/terminator.