Skyvern-AI/skyvern @ d1de195

What is Skyvern commit d1de19556efa2de6ab3420de0aa908d732a62024?

Short version: it is the commit where Skyvern stopped being wired to a single model. Below is exactly what it changed, the router package it introduced, and the one thing the diff quietly tells you about the ceiling of any browser agent.

Matthew Diakonov
6 min read

Direct answer · verified 2026-06-21

Commit d1de19556efa2de6ab3420de0aa908d732a62024 in Skyvern-AI/skyvern is “Implement LLM router (#95)”, authored by Kerem Yilmaz (GitHub ykeremy) and dated 2024-03-17. It changed 16 files (+485 / -308), deleted the hardcoded OpenAI client, and added a provider-agnostic LLM router that resolves to OpenAI, Anthropic, or Azure.

View the commit on GitHub

What the diff actually contains

Before this commit, Skyvern talked to exactly one place: skyvern/forge/sdk/api/open_ai.py. Model choice was not a setting; it was a file. The d1de195 change deleted that file (and a small chat_completion_price.py helper) and replaced both with a new package, skyvern/forge/sdk/api/llm/. Two of the added files carry the weight: api_handler_factory.py builds the call handler, and config_registry.py holds the table of which model keys are allowed.

git show --stat d1de195

File list and stats are from the GitHub commit API for this revision. The two red lines are deletions; the six green lines are the new llm/ package.

The registry: three providers, three enable flags

The interesting code is in config_registry.py. Each provider is wrapped in a settings check, so a model key only exists if you turned its provider on. At this revision the registry reads, in effect:

if ENABLE_OPENAI:
    register("OPENAI_GPT4_TURBO", "gpt-4-turbo-preview")
    register("OPENAI_GPT4V", "gpt-4-vision-preview")
if ENABLE_ANTHROPIC:
    register("ANTHROPIC_CLAUDE3", "anthropic/claude-3-opus-20240229")
if ENABLE_AZURE:
    register("AZURE_OPENAI_GPT4V", "azure/<deployment>")

Three of those keys (GPT4V, Claude 3, and the Azure deployment) are registered as vision-capable, which matters because Skyvern reasons over a screenshot of the rendered page, not just the DOM. The router does not pick a model for you; it only guarantees that the one you named in config is wired and reachable.
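To make the gating concrete, here is a minimal sketch of a flag-gated registry. It is not Skyvern's code: the class names, the build_registry function, and its boolean parameters are assumptions made for illustration. Only the model keys and model strings come from the listing above. Azure is left out because its deployment name is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMConfig:
    model_name: str        # provider model string handed to the client
    supports_vision: bool  # True if the model accepts screenshots


class ConfigRegistry:
    """Hypothetical stand-in for the table kept in config_registry.py."""

    def __init__(self) -> None:
        self._configs: dict[str, LLMConfig] = {}

    def register(self, key: str, config: LLMConfig) -> None:
        if key in self._configs:
            raise ValueError(f"duplicate LLM key: {key}")
        self._configs[key] = config

    def get(self, key: str) -> LLMConfig:
        # Raises KeyError when the key's provider was never enabled.
        return self._configs[key]

    def keys(self) -> list[str]:
        return list(self._configs)


def build_registry(enable_openai: bool, enable_anthropic: bool) -> ConfigRegistry:
    """Register a key only when its provider flag is on (assumed flag names)."""
    registry = ConfigRegistry()
    if enable_openai:
        registry.register("OPENAI_GPT4_TURBO", LLMConfig("gpt-4-turbo-preview", False))
        registry.register("OPENAI_GPT4V", LLMConfig("gpt-4-vision-preview", True))
    if enable_anthropic:
        registry.register(
            "ANTHROPIC_CLAUDE3", LLMConfig("anthropic/claude-3-opus-20240229", True)
        )
    return registry
```

With only the OpenAI flag on, the Anthropic key is simply absent from the table, so a typo or a disabled provider fails at lookup time rather than at request time.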

One router, three providers

Diagram: Skyvern agent -> LLM router -> OpenAI, Anthropic, or Azure

How a request resolves after this commit

Once the router lands, the agent no longer imports a provider. It asks the factory for a handler keyed by the configured model name, the factory looks the key up in the registry, and only then does a real provider get called. If the key is not registered (because its enable flag is off), you get a clean configuration error instead of a deep stack trace from inside a vendor SDK.

Resolving a completion through the router

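Sequence diagram (participants: agent.py, factory, registry, provider):
1. agent.py -> factory: get_handler(model_key)
2. factory -> registry: lookup(model_key)
3. registry -> factory: LLMConfig or KeyError
4. factory -> provider: call if enabled
5. provider -> agent.py: completion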
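The resolution path can be sketched in a few lines. This is an illustration under stated assumptions, not the contents of api_handler_factory.py: the exception name, the tuple-based registry, and the fake_openai stub are all hypothetical. It shows a missing key turning into a configuration error before any provider code runs.

```python
from typing import Callable, Dict, Tuple

# model key -> (provider name, model name); contents are illustrative.
Registry = Dict[str, Tuple[str, str]]
# (model_name, prompt) -> completion text
ProviderCall = Callable[[str, str], str]


class LLMKeyNotEnabledError(Exception):
    """Hypothetical config error raised before any vendor SDK is touched."""


def get_handler(
    model_key: str, registry: Registry, providers: Dict[str, ProviderCall]
) -> Callable[[str], str]:
    try:
        provider_name, model_name = registry[model_key]
    except KeyError:
        raise LLMKeyNotEnabledError(
            f"{model_key!r} is not registered; is its enable flag on?"
        ) from None
    call = providers[provider_name]

    def handler(prompt: str) -> str:
        # The agent only ever sees this closure, never the provider itself.
        return call(model_name, prompt)

    return handler


def fake_openai(model_name: str, prompt: str) -> str:
    """Stub provider so the sketch runs without network access."""
    return f"[{model_name}] {prompt}"


if __name__ == "__main__":
    registry = {"OPENAI_GPT4V": ("openai", "gpt-4-vision-preview")}
    handler = get_handler("OPENAI_GPT4V", registry, {"openai": fake_openai})
    print(handler("click the login button"))
    # prints: [gpt-4-vision-preview] click the login button
```

Returning a closure keeps the provider import out of the caller, which is the property the paragraph above describes: the agent holds a handler, not a vendor client.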

This is a clean abstraction and it did exactly what it set out to do. But notice what it leaves unchanged. skyvern/webeye/actions/handler.py was modified in the same commit, yet its role stayed the same: it is the part that turns a model decision into an action, and every action there is a browser action. Swapping OpenAI for Claude changes the brain. It does not change the hands.
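A toy dispatch table shows why the model swap cannot widen the reach. Everything here is hypothetical: the ActionType values and function names are illustrative and are not taken from Skyvern's handler.py. The point is only that the set of executable actions is fixed by the action layer, whichever model produced the decision.

```python
from enum import Enum
from typing import Callable, Dict


class ActionType(Enum):
    # Illustrative action names; every one assumes a browser tab exists.
    NAVIGATE = "navigate"
    CLICK = "click"
    INPUT_TEXT = "input_text"


def navigate(target: str) -> str:
    return f"browser: goto {target}"


def click(target: str) -> str:
    return f"browser: click {target}"


def input_text(target: str) -> str:
    return f"browser: type into {target}"


BROWSER_ACTIONS: Dict[ActionType, Callable[[str], str]] = {
    ActionType.NAVIGATE: navigate,
    ActionType.CLICK: click,
    ActionType.INPUT_TEXT: input_text,
}


def execute(decision: dict) -> str:
    """Run a model decision; the model's identity plays no part here."""
    # ValueError for any action name that is not in the enum.
    action = ActionType(decision["action"])
    return BROWSER_ACTIONS[action](decision["target"])
```

Calling execute({"action": "click", "target": "#login"}) works for a decision from any provider, while a decision such as {"action": "click_native_control", "target": "SAP GUI field"} raises ValueError no matter which model proposed it.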

Why this commit is a good lens on the browser boundary

People treat “which model” as the big lever in agent automation. The d1de195 commit is a useful reminder that it is not. Skyvern is a genuinely good open-source browser agent (AGPL-3.0, 20k+ stars), and the router made it model-portable. But the reach of the system is set by the action layer, and that layer speaks Chromium. The line below is the line the router cannot move.

Feature | Skyvern (browser agent) | Mediar (accessibility tree)
Where actions land | Inside a Chromium tab: DOM + rendered pixels | OS accessibility tree, the layer screen readers use
SAP GUI, mainframe green screens | No DOM to read, out of reach | Native controls exposed via accessibility APIs
Jack Henry, Fiserv, FIS, Epic, Cerner | Thick desktop clients, no browser surface | Read and drive the same controls a user sees
Model choice | Router resolves to OpenAI / Anthropic / Azure | Model-agnostic too; the difference is the action layer
What changing the LLM fixes | Reasoning quality on a reachable page | Same, but the reachable surface is the desktop, not a tab

This is not a knock on Skyvern. For automating a web app, a browser agent is the right tool and the router commit made it a better one. The point is narrower: if your stalled workflow lives in a thick Windows client with no API, no model on the router's list will reach it, because the action handler never had a path there. That gap is the reason the accessibility-API approach exists. We dig into that input layer in accessibility tree vs pixels and trace Skyvern's exact run loop in Skyvern as RPA.

Stuck where the browser agent stops?

If your workflow lives in SAP GUI, a mainframe, or a banking core, a 20-minute call shows whether the accessibility-tree approach reaches it.

Frequently asked questions

What is Skyvern commit d1de19556efa2de6ab3420de0aa908d732a62024?

It is the commit titled 'Implement LLM router (#95)', authored by Kerem Yilmaz (GitHub ykeremy) and merged into Skyvern-AI/skyvern on 2024-03-17. It changed 16 files (+485 / -308) and replaced Skyvern's single hardcoded OpenAI client with a provider-agnostic LLM router that can resolve to OpenAI, Anthropic, or Azure.

What files did the commit add and remove?

It deleted skyvern/forge/sdk/api/open_ai.py and skyvern/forge/sdk/api/chat_completion_price.py, then added a new package at skyvern/forge/sdk/api/llm/ containing api_handler_factory.py, config_registry.py, models.py, utils.py, exceptions.py and an __init__.py. It also touched config.py, agent.py, app.py, handler.py, the env example, and the setup script.

Which model keys did the router register?

In that commit's config_registry.py the router registers OPENAI_GPT4_TURBO (gpt-4-turbo-preview), OPENAI_GPT4V (gpt-4-vision-preview), ANTHROPIC_CLAUDE3 (anthropic/claude-3-opus-20240229), and AZURE_OPENAI_GPT4V. Each block is gated behind an enable flag: ENABLE_OPENAI, ENABLE_ANTHROPIC, and ENABLE_AZURE respectively.

Does swapping the LLM change what Skyvern can automate?

No. The router changes which model reasons about the page, not what actions are reachable. Every Skyvern action is still bounded by a Chromium tab: navigate a URL, click a rendered element, type into a field, read the DOM and the pixels. A different model does not give the browser agent a handle on a native Windows control.

Why would someone search this exact commit hash?

Usually because they hit a reference to it: a dependency audit, a blame line, a changelog entry, or a tutorial that pinned the LLM abstraction to this revision. The short answer is that d1de195 is the historical moment Skyvern became multi-provider, so it is the natural anchor for anyone tracing how model selection works in the codebase.

Is this where browser automation stops being enough?

For browser-native SaaS, no. For data that lives in SAP GUI, a mainframe green screen, Jack Henry, Fiserv, FIS, Epic, or Cerner, yes. Those are native desktop controls with no DOM and no headless API. Mediar reads them through the OS accessibility tree, the same interface screen readers use, which is the layer the router commit's architecture cannot reach.
