
Compliance paperwork automation: the layer most articles skip

Read ten articles on this category and they all describe the same thing: pulling evidence from cloud APIs, generating policies, running vendor questionnaires. That is the GRC layer, and Vanta and Drata do it well. Underneath, in regulated industries, sits a second layer that the same articles never reach: the legacy desktop form where the regulated record has to land. Different audit shape, different tooling, different failure modes. This page draws the line and traces what the second layer actually puts in front of an auditor.

Matthew Diakonov
11 min

Direct answer (verified 2026-05-08)

What is compliance paperwork automation? Two layers under one phrase.

The first layer is governance, risk, and compliance: software that collects evidence from cloud systems with APIs, generates policy documents, runs vendor questionnaires, and produces audit reports. Vanta, Drata, Sprinto, and Scrut all sit here. Most articles on this category describe only this layer.

The second layer is the system of record: the regulated form in the legacy desktop app where the record has to actually post. The CMS-1500 in Epic, the new-account flow in Jack Henry SilverLake, the FNOL form in Guidewire, the vendor master record in SAP GUI. That layer is what Mediar handles. The runtime is a Rust program executing a TypeScript workflow file, with no LLM call between a read and a write, so the same input always posts the same record.

Authoritative scope for the GRC layer: Vanta’s compliance-automation guide. Open-source executor for the system-of-record layer: github.com/mediar-ai/terminator (MIT, Rust, ripgrep-able).

Why the category collapses two layers into one

Strip the marketing off and the phrase “compliance paperwork” covers two separate problems that share almost no tooling.

Layer one is the GRC layer. The compliance team has to prove to an auditor that the cloud systems are configured correctly, that vendors were reviewed, that policies exist, that controls are tested. Most of the inputs are available through APIs: AWS, GitHub, Okta, Workday, Jira. A platform like Vanta or Drata polls those APIs, files the responses as evidence, and assembles the report. The reviewable artifact is a collection of API responses plus generated policy documents.

Layer two is the system-of-record layer. The regulated record has to land in a specific application. A patient registration in Epic. A new account in Jack Henry SilverLake. A first notice of loss in Guidewire ClaimCenter. A vendor master record in SAP GUI. The application has a UI but often no documented API for the screen flow. The reviewable artifact has to be assembled from what the runtime actually did during the post: the click, the typed value, the verification that the next dialog opened, the screenshot of the saved state.

Both are real. Both produce paperwork. Both touch compliance. The rest of this page is about why the second one needs different tooling.

What auditors expect from each layer

The two layers produce different audit packs. A SOC 2 Type II evidence pack from a GRC tool is a folder of API responses and policy PDFs, time-stamped, with each control mapped to a piece of evidence. The reviewer’s job is to confirm the evidence is current, the policy was acknowledged, and the control was tested.

A regulated transaction posted into a system of record produces a different artifact. A KYC review at a community bank ends with a field changed in a customer record. A 12-month chart audit in a regional clinic ends with notes added to a patient encounter. A quarterly inventory reconciliation ends with line items moved between plants in SAP. The auditor wants to see who did it, when, what changed, and whether the system confirmed the change.

For a manual operator the answer is the application’s built-in activity log. For an automated workflow the answer has to come from the workflow runtime itself. That trace has to be precise, per step, and reproducible from the same input. If it is not, every audit becomes a debate about which row in the database came from a person and which row came from the bot, with no clean way to tell.

What an auditor opens, by layer

Two paths into the audit file, actor by actor: operator, GRC tool, Mediar runtime, system of record, audit file. On the GRC side, the operator logs into Vanta to refresh evidence and the GRC tool writes the evidence-collection report. On the system-of-record side, the operator triggers a workflow run with the input record, the Mediar runtime drives the form (type_into_element + verify_element_exists), the next dialog is observed (post-condition met), and the runtime writes the per-step trace, screenshots, and duration. Two artifacts on audit day, not one.

The trace you can actually verify, step by step

Here is the part most pages on this category never get to. When the Mediar workflow recorder converts a captured event into an executable step, it does not just record the click and the typed value. It attaches a set of post-conditions to the step, and the runtime enforces them on every replay. Those post-conditions are the spine of the audit trail.

The function that does this lives in apps/desktop/src-tauri/src/mcp_converter.rs, around line 110. It is called add_mcp_action_fields, and every step that goes through the converter receives the same shape:

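Here is a minimal sketch of a converted step, reconstructed from the field names documented on this page. The verify and diff fields are the documented ones; the selector syntax, the example values, and the surrounding step fields (tool_name, arguments) are illustrative, not the literal serialized format.

```typescript
// Sketch of one converted step. Field names below the dividing comments
// match the converter's documented output; values are invented.
const step = {
  tool_name: "type_into_element",       // the recorded action
  process: "epic.exe",                  // hypothetical target process
  selector: "automationid:MRN_Field",   // primary locator (illustrative syntax)
  arguments: { text_to_type: "{{patient.mrn}}" },

  // Post-conditions enforced on every replay:
  verify_element_exists: "name:Patient Lookup", // element that must now exist
  verify_element_not_exists: null,              // or one that must have disappeared
  verify_timeout_ms: 2000,                      // default; 5000 for navigation steps

  // Review aids, when enabled:
  include_tree_after_action: true, // accessibility-tree snapshot after the action
  ui_diff_before_after: true,      // which subtree changed, before vs after
};
```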

Read the field names slowly. The two verify_* fields are slots for a post-condition selector: after the action, the runtime looks for an element that should now exist (or should have disappeared) and waits up to verify_timeout_ms for that condition to be satisfied. The include_tree_after_action and ui_diff_before_after fields, when enabled, capture a snapshot of the accessibility tree before and after the action, so a reviewer can see exactly which subtree changed.

For navigation steps the same converter also writes an expected_navigation block with the destination URL and a wait timeout. For application-switch steps it writes an expected_application_switch block naming the target application and process. None of this is best-effort: a step that does not satisfy its post-condition fails loudly, and the failure is recorded as a StepResult with status Failed and a duration in milliseconds.
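A hedged sketch of those two blocks, using the documented field names with invented values:

```typescript
// The block shapes the converter writes for navigation and app-switch
// steps. Field names are documented; destinations are hypothetical.
const navigationStep = {
  expected_navigation: {
    url: "https://portal.example.com/claims", // hypothetical destination
    wait_for_navigation: true,
    timeout_ms: 5000, // navigation steps get the longer default
  },
};

const applicationSwitchStep = {
  expected_application_switch: {
    to_application: "SAP Logon", // hypothetical target application
    to_process: "saplogon.exe",
    wait_for_focus: true,
    timeout_ms: 5000, // timeout value here is an assumption
  },
};
```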

What gets stored per execution

A workflow run does not vanish into a log file. The schema for every run is defined in supabase/migrations/20250630000001_create_workflow_executions.sql. The columns relevant to a compliance review are workflow_id, client_id, status, execution_params, results, execution_logs, screenshots, error_message, queued_at, started_at, completed_at, and execution_duration_seconds.

Inside that row, the StepResult sequence (defined in crates/executor/src/models/execution.rs) records, for every step, the step_id, tool_name, status, duration_ms, and retry_count. If a step retries because a dialog took an extra moment to render, the retry count is in the row. If the four-strategy match cascade fell back from automation id to parent window before resolving, that fact is recoverable from the execution logs. None of this is a marketing claim; it is the database schema.
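As a reading aid, here is the shape of that row and its step records rendered as TypeScript interfaces. The field names are the ones above; the types are assumptions, since the real definitions live in the SQL migration and the Rust executor crate.

```typescript
// Field names from the migration and the StepResult described above;
// the TypeScript types are assumptions for readability.
interface StepResult {
  step_id: string;
  tool_name: string;            // e.g. "type_into_element"
  status: "Success" | "Failed"; // "Failed" is documented; "Success" is assumed
  duration_ms: number;
  retry_count: number;          // late-rendering dialogs show up here
}

interface WorkflowExecutionRow {
  workflow_id: string;
  client_id: string;
  status: string;
  execution_params: unknown;
  results: StepResult[];        // the per-step trace
  execution_logs: string[];     // records locator-strategy fallbacks, among other events
  screenshots: string[];        // references to captured images
  error_message: string | null;
  queued_at: string;            // ISO 8601 timestamps assumed
  started_at: string;
  completed_at: string;
  execution_duration_seconds: number;
}
```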

$750K/year

Claims intake at one mid-market carrier went from 30 minutes per claim to 2 minutes. That is the AP-team headcount math, not a press-release number.

Mediar customer, insurance vertical

Why the runtime cannot have an LLM in it

Determinism is the load-bearing claim. A regulator is not auditing a model. A regulator is auditing whether the same input always produced the same record. A probabilistic step in the runtime breaks that property: two identical claims can produce two different field values, two identical KYC packets can produce two different beneficiary records, two identical journal entries can produce two different account postings.

Mediar splits inference and execution at the workflow boundary. The model only runs once, during the offline recording-processing pass, where it reads the captured event stream and writes the workflow file. After that file is reviewed and checked in, the runtime is a Rust program executing the file step by step, calling the Windows UI Automation accessibility framework directly. There is no inference library loaded into the executor process.

A compliance team can verify this independently. The Terminator SDK is open source under MIT at github.com/mediar-ai/terminator. Clone the repo, run ripgrep on the executor crate at crates/executor for any of the common inference SDKs (gemini, claude, openai, anthropic, langchain), confirm zero hits, then run the runtime against a test environment and capture the trace. That zero is the architectural bet: the workflow file is the testable unit, the same way a SQL migration is the testable unit for a database change.

The honest counterargument

A determinism story is only useful if the underlying locator is stable. If the application UI shifts and the recorded automation id no longer matches, a deterministic runtime simply fails deterministically, which is not the same as succeeding. That is a real problem and it is what most discussions of pixel-based RPA stall on.

The mitigation is in apps/desktop/src-tauri/src/focus_state.rs, which walks four strategies in order: recorded automation id, window handle plus bounds, visible text content, and parent window as a last fallback. Three of those four strategies do not depend on absolute position, so a routine UI tweak (a panel reorders, a button shifts down a row, a form gains a tab) usually still resolves through one of the earlier strategies. Only when all four miss does the step fail, and at that point the runtime queues the step for re-recording rather than guessing. That failure mode, loud and queueable, is what makes the trace honest.
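In sketch form, the cascade reads like this. The real implementation is Rust in focus_state.rs; every function and type name below is hypothetical, while the strategy order and the fail-loud ending are the documented behavior.

```typescript
// Hypothetical sketch of the four-strategy cascade; the real code is Rust.
interface Step { step_id: string; selector: string; }
interface UIElement { automationId?: string; name?: string; }

type Strategy =
  | "automation_id"  // recorded automation id
  | "window_bounds"  // window handle plus bounds
  | "text_content"   // visible text content
  | "parent_window"; // last fallback

declare function tryResolve(step: Step, s: Strategy): Promise<UIElement | null>;
declare function queueForReRecording(step: Step): Promise<void>;

async function resolveElement(step: Step): Promise<UIElement> {
  const cascade: Strategy[] = [
    "automation_id", "window_bounds", "text_content", "parent_window",
  ];
  for (const strategy of cascade) {
    const element = await tryResolve(step, strategy);
    if (element) return element; // first hit wins
  }
  // All four missed: fail loudly and queue the step, never improvise.
  await queueForReRecording(step);
  throw new Error(`step ${step.step_id}: no locator strategy resolved`);
}
```

The design choice worth noticing is the absence of fuzzy ranking: the first strategy that resolves wins, and a total miss becomes a queued re-recording rather than a best guess.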

The other honest limit is the application surface. Some Citrix-streamed legacy apps remote the screen as pixels and never expose an accessibility tree to the client. In that case the desktop agent has to run inside the Citrix session, which requires IT cooperation. Browser-only flows in modern SaaS are usually a better fit for browser-based agents than for desktop automation. Those are not failures of the architecture; they are the cases where a different tool is the right answer.

We moved an F&B chain (an LG customer) from UiPath to Mediar; their CFO told the board they are now saving 70 percent on costs. The audit pack got smaller and the workflows became diffable.
Mediar deployment notes
Manufacturing, SAP B1

Where the two layers actually meet

Picture a regional bank running through a SOC 2 Type II audit while also onboarding 200 small-business customers a week. Drata pulls AWS, GitHub, and Okta evidence and produces the SOC 2 report. That report does not care how the customer record got into Fiserv DNA; it only cares that the access controls around Fiserv DNA exist. The customer record itself still has to land in the core system, and that is what the bank’s onboarding team does by hand on most days, taking the new-account form from PDF to PDF to screen eight separate times.

A Mediar workflow can do that post deterministically and produce the per-step trace with screenshots. Drata still owns the SOC 2 report. Mediar owns the per-customer audit trail. They sit next to each other in the audit pack on review day. Treat them as one tool and you end up either trying to make Drata do data entry into a green-screen (it cannot) or trying to make Mediar prove that AWS IAM policies are configured correctly (it should not).

The same shape repeats in healthcare (Vanta plus Epic), insurance (Sprinto plus Guidewire), manufacturing (Drata plus SAP GUI). The GRC layer and the system-of-record layer answer different questions and produce different artifacts. A page about compliance paperwork automation that addresses only one of them is silently picking a side.

Compliance paperwork automation, in detail

What does compliance paperwork automation actually mean?

It is two distinct categories that share a phrase. The first is the governance, risk, and compliance layer: software that pulls evidence from cloud systems with APIs, generates policy documents, runs vendor questionnaires, and produces audit reports. Vanta, Drata, Sprinto, and Scrut sit there. The second is the system-of-record layer: software that fills the regulated form inside the legacy desktop app where the record actually persists, then writes a structured trace of what it did. CMS-1500 in Epic, the new-account onboarding screen in Jack Henry SilverLake, the FNOL form in Guidewire, the vendor master record in SAP GUI. The pipelines are different, the failure modes are different, and what an auditor opens at the end is different. Most articles on this topic describe only the first.

Which layer does Mediar work on?

The second one. Mediar is a Windows desktop automation platform. Its job is to fill the form inside the legacy app where the regulated record actually posts, and to leave behind a per-step trace that an auditor can read. The cloud-API evidence-collection tools handle the first layer well, and the two are complementary, not competitive. A bank can run Drata for SOC 2 evidence collection and Mediar for new-account paperwork that has to land in Fiserv DNA without breaking anything.

Why is the system-of-record layer harder than the cloud-API layer?

Because the cloud-API layer has hooks. AWS, GitHub, Okta, and Workday all expose APIs that a GRC tool can poll. The system-of-record layer often does not. An Epic registration session involves modal lookup dialogs, validation popups, and field-level autocomplete that the documented APIs do not expose. SAP GUI is a Windows desktop client that talks to the application server through DIAG. A Jack Henry green-screen runs over a 5250 emulator. The forms are real, the regulator wants the record posted there, and there is no REST endpoint to call. The automation has to happen at the UI layer, and the audit trail has to be assembled from what the runtime observed during the run.

What is the artifact a Mediar workflow produces for the regulated form?

Three things. First, the workflow file itself, a TypeScript or YAML sequence checked into source control. A reviewer can diff it the way they would diff a stored procedure. Second, a per-execution row in the workflow_executions table (defined in supabase/migrations/20250630000001_create_workflow_executions.sql) with started_at, completed_at, execution_duration_seconds, results, execution_logs, screenshots, and error_message. Third, the StepResult sequence inside that row, which records the tool_name, status, duration_ms, and retry_count for every step. Together those three are the audit pack. Two of the three were written by the runtime, deterministically.
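For concreteness, an invented example of the first artifact. The exact step schema is Mediar's; the workflow name, selectors, and values here are illustrative.

```typescript
// Invented example of a checked-in workflow file; reviewers diff this
// the way they would diff a stored procedure.
export const newAccountWorkflow = {
  name: "new-account-fiserv-dna", // hypothetical workflow
  steps: [
    {
      tool_name: "type_into_element",
      selector: "automationid:AcctHolderName", // hypothetical selector
      arguments: { text_to_type: "{{applicant.legal_name}}" },
      verify_element_exists: "name:Address Panel", // post-condition
      verify_timeout_ms: 2000,
    },
    {
      tool_name: "click_element", // illustrative tool name
      selector: "name:Save",
      verify_element_exists: "name:Account Created", // saved-state confirmation
      verify_timeout_ms: 2000,
    },
  ],
};
```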

What is auto-attached to every step at conversion time?

The McpConverter in apps/desktop/src-tauri/src/mcp_converter.rs writes seven fields onto every workflow step at conversion time: process, selector, verify_element_exists, verify_element_not_exists, verify_timeout_ms (2000 milliseconds default, 5000 milliseconds for navigation steps), include_tree_after_action, and ui_diff_before_after. For navigation steps the same converter writes an expected_navigation block (url, wait_for_navigation, timeout_ms). For application-switch steps it writes an expected_application_switch block (to_application, to_process, wait_for_focus, timeout_ms). Those fields are post-conditions: they are what the runtime checks after each action to decide whether the step actually did what was recorded. They are why the trace can say not just 'I clicked save' but 'I clicked save and then the next dialog appeared within 2000 ms', which is what an auditor wants to read.

Why does determinism matter for compliance paperwork?

Because the regulator is not auditing a model. The regulator is auditing whether the same input always produced the same record. A probabilistic step in the runtime breaks that property: two identical claims can produce two different field values, two identical KYC packets can produce two different beneficiary records, two identical journal entries can produce two different account postings. The Mediar runtime has no LLM call between a read and a write. The model only runs once, during the offline recording-processing pass, where it reads the captured event stream and writes the workflow file. After that file is reviewed and checked in, the runtime is a Rust program executing that file step by step. A reviewer can verify there is no inference library in the production executor crate at crates/executor in github.com/mediar-ai/terminator with a single ripgrep.

How is this different from RPA tools like UiPath or Automation Anywhere?

Two ways that matter for compliance paperwork. First, the locator strategy. Classic RPA records pixel templates or fragile selectors. When the form moves, the bot misclicks. The Mediar runtime walks a four-strategy match cascade in apps/desktop/src-tauri/src/focus_state.rs: recorded automation id, window handle plus bounds, visible text content, parent window. Three of those four strategies do not depend on absolute position, so a routine UI tweak resolves through one of the earlier strategies and the workflow keeps running. Second, the audit shape. The RPA studios produce a binary process artifact that compliance teams have a hard time reviewing. Mediar produces a TypeScript file. Diffing the file is the same skill as diffing any other source code change.

Does this replace Vanta, Drata, or Sprinto?

No. Those tools answer a different question: how does the compliance team prove to an auditor that the cloud systems are configured correctly, that vendors have been reviewed, that policies exist, and that controls are tested. Mediar does not collect evidence from AWS or Okta. It posts the regulated record into the system where the regulated record has to live. A team running Drata for SOC 2 still needs something to do the patient intake into Epic, the new-account flow into Jack Henry, the supplier creation into SAP. That something is what this page is about.

What about document storage and retention, the other thing 'compliance paperwork' sometimes means?

Document repositories like Mitratech and Moxo solve a third question: where does the file live after it has been generated, who can access it, when does it expire, who has to sign off. They are not redundant with either layer above. A regulated bank typically has all three: a GRC tool collecting evidence (layer one), a system-of-record automation tool posting the record (layer two), and a document repository holding the artifact (layer three). The category called 'compliance paperwork automation' on most marketing pages flattens these three into one and then describes only the first.

What is the open-source surface a compliance team can verify independently?

The Terminator SDK at github.com/mediar-ai/terminator under MIT. A compliance team can clone the repo, ripgrep the executor crate at crates/executor for inference libraries (zero matches against gemini, claude, openai), read the four-strategy match cascade, run the runtime against a test environment, and confirm the determinism claim independently. The orchestration layer, the cloud workflow runner, and the recording pipeline are commercial. A team that wants to wire form-fill primitives into their own queue can build directly on Terminator without taking a runtime dependency on Mediar's cloud.

What does an auditor actually read on audit day?

Three artifacts. The workflow file, the same way they would read a SQL migration. The execution row, with the started_at and completed_at timestamps, the screenshots, and the per-step durations. The StepResult sequence inside that row, naming each tool_name and step_id. If a step fails the four-strategy match, the runtime emits a failure record into the same trace. Nothing is silently retried with a different element. That property, plus the absence of an LLM in the runtime, is what lets a compliance team sign off the workflow itself rather than the model.
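A hedged example of what that failure record can look like in the trace, reusing the StepResult field names from earlier; every value is invented.

```typescript
// Illustrative failure record; field names follow the StepResult shape
// described earlier, values are invented.
const failedStep = {
  step_id: "step-14",
  tool_name: "click_element",
  status: "Failed",  // all four locator strategies missed
  duration_ms: 2000, // waited out verify_timeout_ms before failing
  retry_count: 2,    // retries are recorded, not hidden
};
```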

Where does this break, honestly?

Three places. First, applications that do not expose a Windows accessibility tree at all, including some Citrix-streamed legacy apps that remote the screen as pixels. Mediar can still run inside the Citrix session if the desktop agent is installed there, but that requires IT cooperation. Second, browser-only flows in modern SaaS, where browser-based agents are honestly fine. Third, anything where the source document is so unstructured that a vision model cannot extract the destination schema reliably; the recording-time vision pass can fail on a heavily handwritten or low-quality scan, at which point the workflow needs a human-in-the-loop queue at the entry point. Naming those cases plainly is part of being honest about which layer the tool sits on.

See the per-step trace on your own forms

Bring one regulated workflow you currently run by hand. We will record it once and walk you through the trace shape an auditor would see at the end.