HEALTHCARE AUTOMATION
Automate Epic in Citrix without touching the VDA
UIA selectors do not cross the Citrix ICA boundary. The major RPA vendors (UiPath, Blue Prism, Automation Anywhere) solve that by installing software on your IDN's gold image. Mediar solves it from the client endpoint, on day one, with no change request.
Last updated: April 2026
Why client-side selectors fail inside Citrix
Citrix Workspace on the client renders the entire remote application (Epic Hyperspace or Hyperdrive) inside one opaque window owned by wfica32.exe. The ICA protocol that connects client and session host carries pixels, keyboard input, mouse input, and a few side channels like clipboard. It does not carry the Windows UI Automation tree.
From the client's perspective, every Epic button, field, and chart cell lives behind a single bitmap. Standard RPA tools that depend on UIA, MSAA, or DOM selectors see nothing useful. They report one element: the Citrix window itself.
This is not a Citrix bug. It is the protocol. The controls run on the VDA (Virtual Delivery Agent, the Windows server hosting the Epic session), and only their rendered pixels cross the wire.
What actually crosses the ICA boundary
Anything you build client-side has to compose from these primitives:
| Channel | Crosses ICA? | Useful for |
|---|---|---|
| Pixels (the remote window's bitmap) | Yes | Screenshot, OCR, vision model analysis |
| Synthetic keyboard input | Yes | Typing, hotkeys, Epic's keyboard-first workflows |
| Synthetic mouse clicks | Yes | Clicking targets located visually |
| Clipboard (usually mapped) | Yes | Side channel for getting structured text in and out |
| UIA accessibility tree | No | Why traditional client-side selectors fail |
| DOM, process info, file handles inside session | No | Why CDP-based browser automation alone does not work |
How Mediar automates Epic inside Citrix
Under the hood, Mediar uses Terminator, our open-source desktop automation engine. For Citrix sessions, Terminator switches to a vision-driven backend that composes the primitives above into something that feels like real UIA automation:
Detect the Citrix session
Mediar recognizes wfica32.exe, CDViewer, and Citrix Workspace window classes, then switches into Citrix mode automatically.
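A minimal sketch of the detection step, assuming the path of the process that owns the foreground window is already known. The function names and the `"vision"` / `"uia"` labels are hypothetical and are not Terminator's API. Only the executable names come from the text above (`cdviewer.exe` is assumed to be the file name behind CDViewer), and window-class matching is left out.

```python
from pathlib import PureWindowsPath

# Executables named in the text above; the set itself is illustrative.
CITRIX_CLIENT_PROCESSES = frozenset({"wfica32.exe", "cdviewer.exe"})


def is_citrix_client_process(image_path: str) -> bool:
    """Return True if the process image path points at a Citrix client executable."""
    # PureWindowsPath parses Windows paths on any OS; compare case-insensitively.
    name = PureWindowsPath(image_path).name.lower()
    return name in CITRIX_CLIENT_PROCESSES


def choose_backend(image_path: str) -> str:
    """Pick the vision backend for Citrix windows and native UIA for everything else."""
    return "vision" if is_citrix_client_process(image_path) else "uia"
```

In this sketch the decision is made per window, so a workflow that moves between a local application and a Citrix-hosted Epic window would switch backends as focus changes.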
Capture the remote window
Pixel-perfect capture of the Citrix-rendered Epic window, with DPI matching across client and session resolutions.
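DPI matching comes down to mapping a point in the captured bitmap back to the coordinate space in which input is sent. The following is a sketch under the assumption that the capture covers exactly the Citrix window's rectangle; `to_screen_point` and its parameters are illustrative names, not part of any Mediar or Citrix API.

```python
from typing import Tuple


def to_screen_point(
    px: float,
    py: float,
    capture_size: Tuple[int, int],
    window_rect: Tuple[int, int, int, int],
) -> Tuple[int, int]:
    """Map a point in the captured bitmap to client desktop coordinates.

    capture_size is (width, height) of the bitmap in pixels.
    window_rect is (left, top, right, bottom) of the window on the desktop.
    """
    cap_w, cap_h = capture_size
    left, top, right, bottom = window_rect
    if cap_w <= 0 or cap_h <= 0:
        raise ValueError("capture size must be positive")
    # Scale factors differ from 1.0 whenever capture and desktop DPI disagree.
    scale_x = (right - left) / cap_w
    scale_y = (bottom - top) / cap_h
    return (round(left + px * scale_x), round(top + py * scale_y))
```

For example, a 3840x2160 capture of a window occupying a 1920x1080 rectangle gives a scale of 0.5 on both axes, so a target found at the bitmap's center maps to the center of the window on the desktop.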
Reconstruct a pseudo UIA tree
Element segmentation models (OmniParser class) turn the screenshot into a tree of interactive elements with inferred roles, labels, and bounding boxes.
Same selector API
Workflows written as role:Button name:'Sign Order' run unchanged. The selector engine queries the pseudo tree when the target lives inside Citrix, real UIA otherwise.
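To make the idea concrete, here is a rough sketch of a pseudo tree and a selector lookup over it. It is not Terminator's implementation: `PseudoElement`, `parse_selector`, `find_all`, and `center` are hypothetical names, and the parser only handles the `key:value` form shown above.

```python
import shlex
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple

BBox = Tuple[int, int, int, int]  # left, top, right, bottom in screenshot pixels
SUPPORTED_KEYS = ("role", "name")


@dataclass
class PseudoElement:
    """One element inferred from pixels, standing in for a UIA node."""
    role: str
    name: str
    bbox: BBox
    children: List["PseudoElement"] = field(default_factory=list)

    def walk(self) -> Iterator["PseudoElement"]:
        """Yield this element and all descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def parse_selector(selector: str) -> Dict[str, str]:
    """Split "role:Button name:'Sign Order'" into {'role': ..., 'name': ...}."""
    criteria: Dict[str, str] = {}
    # shlex keeps the quoted value together and strips the quotes.
    for token in shlex.split(selector):
        key, sep, value = token.partition(":")
        if not sep or key not in SUPPORTED_KEYS or not value:
            raise ValueError(f"unsupported selector token: {token!r}")
        criteria[key] = value
    return criteria


def find_all(root: PseudoElement, selector: str) -> List[PseudoElement]:
    """Return every element whose fields match all selector criteria exactly."""
    criteria = parse_selector(selector)
    return [
        el for el in root.walk()
        if all(getattr(el, key) == value for key, value in criteria.items())
    ]


def center(bbox: BBox) -> Tuple[int, int]:
    """Click point for an element: the middle of its bounding box."""
    left, top, right, bottom = bbox
    return ((left + right) // 2, (top + bottom) // 2)
```

The point of the design is that the workflow author only ever writes the selector string. Whether the match comes from a real UIA node or from a segmented bounding box is decided behind the lookup.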
VLM grounding for hard cases
When segmentation misses a custom Epic widget or a dense flowsheet, a vision-language model locates the target by description and returns coordinates.
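The fallback described here can be pictured as an ordered chain of locators. This is only a sketch: `locate_with_fallback` and the locator callables are hypothetical, and a real segmentation model or VLM would sit behind each callable.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[int, int]
# A locator takes a target description and returns a click point, or None on a miss.
Locator = Callable[[str], Optional[Point]]


def locate_with_fallback(
    description: str,
    locators: Sequence[Tuple[str, Locator]],
) -> Tuple[str, Point]:
    """Try each named locator in order and return the first hit with its source."""
    for source, locator in locators:
        point = locator(description)
        if point is not None:
            return source, point
    raise LookupError(f"no locator found a target for {description!r}")
```

Ordering the chain with segmentation first and the VLM second keeps the slower call off the common path, and returning the source name makes it possible to log which method located each target.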
Keyboard-first execution
Epic was designed for hotkey-driven workflows. Mediar prefers keyboard input over mouse coordinates wherever possible, which is far more reliable because a hotkey does not depend on where a control is rendered.
Verify after act
Every action is bracketed by before and after screenshots, which are perceptually diffed to confirm the screen changed in the expected way. If it did not, Mediar retries once, then escalates.
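The loop can be sketched roughly as follows. Everything here is an assumption for illustration: `act_and_verify`, `changed_fraction`, and `ActionNotVerified` are hypothetical names, screenshots are modeled as flat sequences of pixel values, and the plain per-pixel comparison is a stand-in for a real perceptual diff.

```python
from typing import Callable, Sequence

Frame = Sequence[int]  # flattened pixel values; a stand-in for a real screenshot


class ActionNotVerified(Exception):
    """Raised when the screen never changed as expected; the caller escalates."""


def changed_fraction(before: Frame, after: Frame) -> float:
    """Fraction of pixels that differ. A simple substitute for a perceptual diff."""
    if len(before) != len(after):
        return 1.0  # a resized frame counts as fully changed
    if not before:
        return 0.0
    differing = sum(1 for a, b in zip(before, after) if a != b)
    return differing / len(before)


def act_and_verify(
    capture: Callable[[], Frame],
    act: Callable[[], None],
    expected: Callable[[Frame, Frame], bool],
    retries: int = 1,
) -> int:
    """Run act(), confirm via screenshots, retry up to `retries` times.

    Returns the number of attempts used, or raises ActionNotVerified.
    """
    for attempt in range(1, retries + 2):
        before = capture()
        act()
        after = capture()
        if expected(before, after):
            return attempt
    raise ActionNotVerified(f"not verified after {retries + 1} attempts")
```

With the default `retries=1` this gives one retry and then an exception, which a workflow runner could turn into a human-review task. Blindly repeating an action is only safe when the action is idempotent, so a real loop would likely check the after-state before retrying something like an order submission.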
Layout caching
Stable Epic screens cache their element positions. Mediar only re-segments when the visual hash changes, dropping per-action latency to tens of milliseconds.
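As an illustration of the caching idea, the sketch below keys segmentation results by a hash of the screenshot bytes. `LayoutCache` is a hypothetical name, and SHA-256 over raw bytes is a simplification: it only matches byte-identical frames, whereas a visual hash would presumably tolerate small differences such as a blinking caret.

```python
import hashlib
from typing import Callable, Dict, List


class LayoutCache:
    """Cache segmentation output per distinct screen image."""

    def __init__(self, segment: Callable[[bytes], List[str]]) -> None:
        self._segment = segment          # the expensive vision call
        self._cache: Dict[str, List[str]] = {}
        self.misses = 0                  # how many times segmentation actually ran

    def elements_for(self, pixels: bytes) -> List[str]:
        """Return cached elements for this frame, segmenting only on a new hash."""
        key = hashlib.sha256(pixels).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._segment(pixels)
        return self._cache[key]
```

A production cache would also need an eviction policy and, for PHI, care about what is retained; this sketch stores only the hash and the element list, not the screenshot itself.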
Why this matters for clinics and Community Connect tenants
Most small and mid-size clinics on Epic are Community Connect tenants of a larger IDN. The Citrix VDA, the gold image, and every change request flow through the parent IDN's EUC team, Epic application team, InfoSec review, and a change advisory board. Getting third-party software onto that gold image is a multi-month project that usually does not happen at all.
UiPath, Blue Prism, and Automation Anywhere all need that install. They cannot ship a Citrix automation without one. For the typical clinic, that means the answer is no.
Mediar runs entirely on the client endpoint the clinician already uses. It acts inside the existing Citrix session under the clinician's own credentials, so it inherits the same access controls and audit trails Epic already enforces. No gold-image change. No parent-IDN signoff. No new attack surface on the VDA.
Mediar vs. the VDA-install approach
Both approaches work. The trade-off is structural, not a feature checklist.
| Aspect | Mediar (client-side) | UiPath / Blue Prism / AA |
|---|---|---|
| Where it runs | Client endpoint only | VDA agent + client extension |
| VDA install required | No | Yes |
| Parent IDN approval needed | No | Yes (months of CAB review) |
| Time to first workflow | Same day | Months |
| Works in Community Connect tenants | Yes | Only if parent IDN agrees |
| Selector reliability | Vision + verify-after-act | Real UIA tree |
| Per-action latency | ~50 ms cached, ~500 ms cold | Sub-millisecond |
| Survives Epic UI updates | Yes (semantic targeting) | Often breaks on selector drift |
What you trade off
Vision-based selectors are not free. Here is what you give up, and how Mediar mitigates each:
Latency
Vision and segmentation add 50 to 500 ms per action versus sub-millisecond UIA. Mitigated heavily by layout caching for stable screens.
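How much caching helps depends on how often a screen is already cached. A back-of-the-envelope sketch, using the 50 ms and 500 ms figures above as defaults; the hit rate is an assumed input, not a measured Mediar number, and `expected_latency_ms` is an illustrative name.

```python
def expected_latency_ms(
    hit_rate: float,
    cached_ms: float = 50.0,
    cold_ms: float = 500.0,
) -> float:
    """Average per-action latency for a given cache hit rate between 0 and 1."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Weighted average of the cached path and the cold (re-segment) path.
    return hit_rate * cached_ms + (1.0 - hit_rate) * cold_ms
```

At an assumed 90 percent hit rate the average works out to roughly 95 ms per action; at 50 percent it is 275 ms. Workflows that stay on a few stable screens sit near the low end, while workflows that scroll through dense charts sit closer to the cold figure.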
Off-screen state
Mediar can only see what is currently rendered. Workflows must scroll to reveal data, then re-segment. Same constraint a human user has.
Theme and DPI sensitivity
Resolution changes, dark-mode toggles, and Citrix DPI mismatch can shift bounding boxes. Verify-after-act catches drift, but pixel-based targeting is slightly more brittle than real UIA.
Boolean state queries
Asking 'is this checkbox checked' requires a visual inspection rather than a tree property read. Reliable, but slower than native UIA.
Where Hyperdrive fits in
Epic is steadily migrating customers from Hyperspace (the legacy Win32 client typically delivered through Citrix) to Hyperdrive, a Chromium-based client designed to install locally on each endpoint. As of mid-2025, roughly 15 percent of Epic sessions were running Hyperdrive on the endpoint, and that share is growing quickly.
Three deployment modes matter for automation:
- Hyperspace via Citrix: The opaque-window case described above. Mediar handles it with vision and keyboard.
- Hyperdrive via Citrix: Same opaque window. The DOM lives on the VDA, not the client. Mediar handles this the same way.
- Hyperdrive installed locally: The Citrix layer disappears. Mediar uses native UIA on the Windows endpoint and runs at full speed.
Most large IDNs are landing on a hybrid: physicians and ambulatory clinics on local Hyperdrive, ED and inpatient kiosks on Citrix-published Hyperdrive (for badge tap and fast user switching), and BYOD or remote users on Citrix or VMware Horizon. Mediar works across all three with the same workflows.
What teams automate in Epic with Mediar
Prior authorizations
- Pull patient demographics from chart
- Submit to payer portal
- Track status, write outcome back to Epic
- Flag denials for human review
Refill processing
- Triage queued refill requests
- Verify against allergies and active meds
- Route to provider for sign-off
- Send notification to patient
Order entry & follow-through
- Place order sets from voice or template
- Confirm correct encounter context
- Log every action for audit
- Verify with after-action screenshot
Chart abstraction
- Read flowsheets, notes, and labs
- Extract structured data with vision + OCR
- Drop results into a spreadsheet or registry
- Run nightly across patient cohorts
Frequently Asked Questions
Why can't standard RPA tools see inside a Citrix Epic session?
Citrix uses the ICA protocol, which transports pixels and keyboard or mouse input but not the Windows UI Automation (UIA) tree. The Citrix Workspace app on the client renders the remote app (Epic Hyperspace or Hyperdrive) inside a single opaque window. Client-side UIA only sees that wrapper window, never the buttons and fields inside.
How do UiPath, Blue Prism, and Automation Anywhere solve the Citrix problem?
All three install a runtime component on the Citrix VDA (the Windows session host where Epic actually runs) and ship a matching extension on the client. The VDA component reads the real UIA tree and proxies it back over a custom HDX virtual channel. This works, but it requires installing software on the parent IDN's gold image, which means months of change-advisory-board review and HIPAA sign-off.
Can Mediar automate Epic in Citrix without installing anything on the VDA?
Yes. Mediar runs entirely on the Citrix client endpoint. It captures the Citrix window, uses vision-language models and element segmentation to reconstruct a pseudo UIA tree from pixels, and sends synthetic keyboard and mouse input back to the session. This means workflows can launch on day one, with no VDA gold image change and no parent IDN approval.
Is screen-scraping reliable enough for clinical workflows?
Modern vision models (OmniParser, UI-TARS, Claude and GPT vision) are far more reliable than the pixel-template tools of 2014. We combine vision with keyboard-first navigation (Epic was built for hotkey workflows), clipboard side-channels for data extraction, and a verify-after-act loop that re-screenshots to confirm every action landed. For most order entry, refill, prior auth, and chart-review tasks this is production-grade.
Does Hyperdrive change anything?
Hyperdrive (Epic's new Chromium-based client replacing Hyperspace) is friendly to automation when installed locally on the endpoint, because the whole Citrix layer disappears. When Hyperdrive runs inside Citrix, the ICA boundary problem is unchanged. Mediar handles all three cases: Hyperspace via Citrix, Hyperdrive via Citrix, and Hyperdrive installed locally.
What about small clinics that are tenants in a parent IDN's Epic environment?
Community Connect clinics typically have zero authority over the parent IDN's VDA gold image. This is exactly the case where the VDA-install approach (UiPath, Blue Prism, Automation Anywhere) is structurally blocked. Client-side automation is the only viable option, and Mediar is built for it.
Is this HIPAA-safe? Does PHI leave the endpoint?
Mediar can run with vision and OCR fully on-device using self-hosted models (Qwen2-VL, OmniParser), so screenshots of PHI never leave the endpoint. Hosted vision models are available for non-PHI workflows or under a BAA. Every action is logged for audit, and the agent acts under the clinician's existing Citrix session, so it inherits the same access controls and audit trails Epic already enforces.
Automate Epic without waiting on IT
Mediar runs on the Citrix client endpoint. No VDA install, no gold-image change, no parent-IDN signoff. Keyboard, vision, and the clinician's normal session.