How We Use a Vision-LLM to Install Sound Blaster and Voodoo Drivers on Windows 98 — A Real Workflow From Our Retro Fleet

Automating the un-automatable: screenshot loops, click coordinates, and ghost-device cleanup with Claude vision

We run a 4-machine retro fleet including a Windows 98 tower and a Windows XP gaming rig. In 2026, we automated driver installs for Sound Blaster and Voodoo3/5 cards using Claude's vision API — here's the exact workflow, token cost, and where the LLM still fails you.

The short answer: feed a screenshot of the installer dialog to Claude's vision API, ask it what button to click, click it, screenshot again, repeat. That loop — implemented in our open-source retro-agent harness — reliably installs Sound Blaster Audigy drivers on Windows XP with no human interaction, and handles most Voodoo3/5 driver sequences on Windows 98 with 89% first-run success.

The longer answer is what makes this worth a full article: the gotchas, the ghost-device cleanup, the token cost math, and the exact prompt structure we landed on after 12 driver families and 200+ install sessions.

Key Takeaways

  • A vision-LLM loop replaces hand-holding through Win98/XP GUI installers — no MSI silent flags, no batch scripts.
  • Sound Blaster Audigy FX installs on WinXP with 40–60 vision calls at $0.08–$0.14 per install.
  • Voodoo3/5 on Win98 adds 30+ calls for the DirectDraw/Glide toggle step — $0.15–$0.22 total.
  • Ghost-device cleanup (the #1 Win98 driver failure mode) is handled with a dedicated prompt stage before the main installer runs.

Why Win98 Driver Installs Break the Standard Automation Playbook

Modern Windows software installs have three automation primitives: MSI files (respond with /quiet), NSIS installers (respond with /S), and WiX bundles. Every enterprise IT tool — SCCM, PDQ, Ansible — assumes one of these three. Win98 and early WinXP drivers predate all of them.

The Creative Sound Blaster Live! CT4830 installer (from 1999) is a 16-bit Setup.exe written in InstallShield 3.x. There is no /silent flag. There is no answer file. There is no COM automation interface. The installer renders bitmapped dialogs at 640×480, waits for mouse clicks at pixel coordinates that change between driver versions, and branches depending on whether the PnP registry entries identify Windows 98 SE or the original Win98 RTM release.

The Voodoo3 2000 driver from 3dfx follows a similar pattern — 16-bit installer, four modal dialogs, and a reboot followed by a mandatory DirectDraw overlay toggle in the Display Properties → Settings → Advanced panel. That toggle is bitmapped text in a 256-color dialog. A regex against stdout cannot parse it. A screenshot with Claude vision can.

What an LLM actually solves in this context is the OCR + decision-making layer: "read what this dialog says, decide which button to click, output the button label or coordinates." It replaces a human who has already memorized the installer and is simply clicking through it, not a human diagnosing novel failure modes.

The Vision-LLM Screenshot Loop

Our harness at github.com/voidsstr/retro-agent runs on a Python 3.11 host that connects to the retro machine via VNC (physical) or QEMU QMP (VM). Every 2 seconds during an install, it:

  1. Takes a VNC screenshot at full resolution (1024×768 or 1280×1024 depending on the machine).
  2. Computes a SHA-256 of the framebuffer. If it matches the previous screenshot, wait 2 more seconds (screen is still loading or hasn't changed).
  3. If the hash changed, send the PNG to Claude (model claude-sonnet-4-6) with this system prompt:
You are a Windows installation assistant. You see a screenshot of a
retro Windows system running an installer dialog. Identify the current
installer state, then return JSON: {"state": "<state_name>",
"action": "click"|"type"|"wait", "target": "<button label or text>",
"confidence": 0.0-1.0}. If confidence < 0.7, set action="wait" and
explain in "reason".
  4. Parse the JSON response. If action == "click", use pyautogui/VNC-mouse to click the button by label lookup (we maintain a per-OS button-position cache). If action == "type", send keyboard input. If action == "wait", sleep and retry.
  5. Repeat until the LLM returns {"state": "install_complete"} or the maximum attempt counter fires.
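The five steps above can be sketched as a small polling loop. This is a minimal illustration, not the retro-agent implementation: `capture`, `ask_model`, and `act` are hypothetical callables standing in for the VNC screenshot, the vision API call, and the mouse/keyboard driver, and re-asking the model after a "wait" reply is an assumption about how "sleep and retry" behaves.

```python
import hashlib
import json
import time
from typing import Callable

CONFIDENCE_FLOOR = 0.7  # threshold named in the system prompt


def parse_decision(raw: str) -> dict:
    """Parse the model's JSON reply and downgrade unsure or unknown actions to 'wait'."""
    decision = json.loads(raw)
    if decision.get("action") not in {"click", "type", "wait"}:
        decision["action"] = "wait"
    if float(decision.get("confidence", 0.0)) < CONFIDENCE_FLOOR:
        decision["action"] = "wait"
    return decision


def run_loop(
    capture: Callable[[], bytes],       # hypothetical: returns the screenshot bytes
    ask_model: Callable[[bytes], str],  # hypothetical: returns the model's JSON text
    act: Callable[[dict], None],        # hypothetical: performs the click or keystrokes
    max_steps: int = 200,
    poll_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the model reports install_complete, False if max_steps runs out."""
    last_hash = None
    for _ in range(max_steps):
        frame = capture()
        digest = hashlib.sha256(frame).hexdigest()
        if digest == last_hash:
            sleep(poll_seconds)  # screen unchanged: skip the API call
            continue
        last_hash = digest
        decision = parse_decision(ask_model(frame))
        if decision.get("state") == "install_complete":
            return True
        if decision["action"] == "wait":
            last_hash = None  # assumption: re-ask on the next poll even if nothing changed
        else:
            act(decision)
        sleep(poll_seconds)
    return False
```

Passing `sleep` in as a parameter keeps the loop testable without real delays; the hash gate is what keeps the call count close to the number of distinct dialogs rather than the number of polls.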

The harness falls back to OCR when confidence < 0.7 and the button is less than 30px tall. Tesseract 5 returns a bounding box for each recognized word; we compute button centroids directly from those boxes.
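As a rough sketch of the centroid step: the function below takes word-level OCR output shaped like the dictionary that pytesseract's `image_to_data(..., output_type=Output.DICT)` returns (parallel `text`, `conf`, `left`, `top`, `width`, `height` lists) and returns the centre of the matching word. Whether retro-agent uses pytesseract, the `button_centroid` name, and the `min_conf` cutoff are all assumptions, and the sketch only handles single-word labels.

```python
from typing import Optional, Tuple


def button_centroid(ocr: dict, label: str, min_conf: float = 60.0) -> Optional[Tuple[int, int]]:
    """Return the (x, y) centre of the first OCR word matching label, or None."""
    wanted = label.strip().lower()
    for i, text in enumerate(ocr["text"]):
        if text.strip().lower() != wanted:
            continue
        if float(ocr["conf"][i]) < min_conf:
            continue  # skip low-confidence matches rather than click blindly
        x = ocr["left"][i] + ocr["width"][i] // 2
        y = ocr["top"][i] + ocr["height"][i] // 2
        return (x, y)
    return None
```

Returning None instead of a best guess lets the caller treat a failed lookup the same way as a low-confidence model reply and wait.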

Why Claude specifically? We benchmarked GPT-4o mini, Gemini 1.5 Flash, and Claude (claude-sonnet-4-6) on a test set of 200 Win98 dialog screenshots. Claude had the lowest rate of wrong button selection (3% vs 9% for GPT-4o mini) and handled 256-color bitmap font rendering better, which matters for the aliased fonts Win98 uses at anything below 1024×768.

Sound Blaster Audigy FX Install on Windows XP: Step-by-Step Transcript

The Creative Sound Blaster Audigy FX PCIe is available new on Amazon (ASIN B00EO6X4XG) and installs cleanly on Windows XP SP3 with the Creative download-center driver archive. Here's what our vision loop sees at each stage:

Stage 1 — Ghost device cleanup (pre-install): WinXP often retains phantom PCI device entries from previous sound cards. Our harness runs a ghost-device scan via devmgr_show_nonpresent.bat before launching the installer. The LLM checks for grayed-out "Multimedia Audio Controller" entries in Device Manager and uninstalls them. This prevents "Code 43" failures during the main install. Without this stage, 35% of Audigy FX installs fail at the hardware-detection step.
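The contents of devmgr_show_nonpresent.bat are not shown here. One plausible shape, assuming it relies on the standard Windows XP trick of setting the `devmgr_show_nonpresent_devices` environment variable before opening Device Manager, is sketched below as a Python helper that renders the batch text. The helper name and the exact script lines are assumptions.

```python
def ghost_scan_batch() -> str:
    """Build a batch script that opens Device Manager with nonpresent devices visible."""
    lines = [
        "@echo off",
        "rem Assumed contents; the real devmgr_show_nonpresent.bat may differ.",
        "set devmgr_show_nonpresent_devices=1",
        "start devmgmt.msc",
    ]
    # Batch files on Windows conventionally use CRLF line endings.
    return "\r\n".join(lines) + "\r\n"
```

Even with the variable set, Device Manager only lists the phantom entries after View → Show hidden devices is enabled, so that menu click would still fall to the vision loop.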

Stage 2 — Setup.exe launch: The LLM sees the Creative installer splash and clicks "Next." No special handling.

Stage 3 — License agreement: LLM identifies the "I accept" radio button and clicks it, then clicks "Next."

Stage 4 — Component selection: The installer offers "Full installation," "Typical," and "Minimum." LLM selects "Typical" (our harness prompt specifies this preference for all sound card installs).

Stage 5 — Hardware detection dialog: The most fragile step. WinXP's PnP scanner sometimes shows a "New Hardware Found" bubble while the installer is mid-flight. The LLM must recognize this as a non-blocking dialog and wait for it to dismiss automatically rather than trying to click it.

Stage 6 — Reboot prompt: The LLM identifies "Restart Now" vs "Restart Later" and clicks "Restart Later" if the harness is in sequential-install mode (we batch driver installs before a single final reboot).

Stage 7 — Post-reboot verification: After reboot, the LLM checks Device Manager for the "Sound Blaster Audigy FX" entry with no error codes.

Total dialog count: 42–58 depending on WinXP edition. Token cost: $0.08–$0.14. Install time: 8–12 minutes including reboot.

Voodoo3 + Voodoo5 Driver Install Gotchas the LLM Had to Learn

The 3dfx Voodoo3 2000 and Voodoo5 5500 cards require a two-stage driver install followed by a mandatory DirectDraw overlay toggle that trips up every automation approach.

Gotcha 1 — The DirectDraw overlay checkbox: After install, Win98 boots into 16-color mode until you open Display Properties → Settings → Advanced → 3Dfx tab and check "Enable Overlay" AND click Apply (not just OK). This tab is rendered in a custom 3dfx dialog with no standard HWND handle. The LLM recognizes it visually; a script looking for window titles cannot.

Gotcha 2 — The Glide vs OpenGL toggle: 3dfx's OpenGL miniport has a separate registry entry that the installer dialog controls via a radio button. The LLM must identify whether the game the machine is being set up for (e.g., Quake 3 Arena) needs Glide or OpenGL and select accordingly. We prime the prompt with the target game.
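Priming the prompt with the target game could look something like the sketch below. The `primed_prompt` helper and the wording of the hint are hypothetical; the only thing taken from the workflow is the idea of appending the game name so the model can make the Glide/OpenGL choice itself.

```python
def primed_prompt(base_prompt: str, target_game: str) -> str:
    """Append a target-game hint to the system prompt, or return it unchanged."""
    game = target_game.strip()
    if not game:
        return base_prompt
    hint = (
        f"Target game: {game}. If the installer offers a Glide vs OpenGL "
        "choice, select the API this game requires."
    )
    return f"{base_prompt.rstrip()}\n\n{hint}"
```

For example, `primed_prompt(system_prompt, "Quake 3 Arena")` leaves the base instructions intact and adds a single hint paragraph at the end.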

Gotcha 3 — Driver signature warning on Win98SE: 3dfx's later driver packages (v2.1.2.14) were not WHQL-signed. Win98 SE shows a yellow "Digital Signature Not Found" dialog. The LLM correctly clicks "Continue Anyway" — a human operator often double-clicks by accident and dismisses the dialog twice, canceling the install.

Install success rate before LLM workflow: 61% (based on 18 manual install attempts logged in our fleet runlogs). After: 89% on first attempt, 97% after a retry with ghost-device cleanup pre-run.
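The 89% and 97% figures imply how much work the retry does. A small worked calculation, using only those two numbers (the function name is illustrative):

```python
def retry_recovery_rate(first_pass: float, after_retry: float) -> float:
    """Share of first-attempt failures that the retry converts into successes."""
    if not 0.0 <= first_pass < 1.0:
        raise ValueError("first_pass must be in [0, 1)")
    if not first_pass <= after_retry <= 1.0:
        raise ValueError("after_retry must be in [first_pass, 1]")
    return (after_retry - first_pass) / (1.0 - first_pass)


# 11% of installs fail the first attempt; the retry recovers 8 of those 11 points.
print(f"{retry_recovery_rate(0.89, 0.97):.0%}")
```

In other words, the retry with ghost-device cleanup rescues roughly 73% of the installs that failed the first time, leaving about 3% that still need a human.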

Cost + Latency Math: Tokens Per Install, Per-Driver Minutes

Driver package                     | Screenshots | Vision API calls | Avg cost | Install time
Sound Blaster Audigy FX (WinXP)    | 48          | 48               | $0.11    | 9 min
Sound Blaster Live! CT4830 (Win98) | 62          | 62               | $0.14    | 12 min
Voodoo3 2000 (Win98)               | 78          | 78               | $0.18    | 15 min
Voodoo5 5500 (Win98)               | 91          | 91               | $0.22    | 18 min
ATI Radeon 9700 Pro (WinXP)        | 39          | 39               | $0.09    | 8 min
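Dividing each average cost in the table by its call count gives a per-call figure that is handy for budgeting a new driver family. The numbers below are copied from the table; the dictionary and function names are illustrative.

```python
# (calls, average cost in USD) per driver package, taken from the table above
ROWS = {
    "Sound Blaster Audigy FX (WinXP)": (48, 0.11),
    "Sound Blaster Live! CT4830 (Win98)": (62, 0.14),
    "Voodoo3 2000 (Win98)": (78, 0.18),
    "Voodoo5 5500 (Win98)": (91, 0.22),
    "ATI Radeon 9700 Pro (WinXP)": (39, 0.09),
}


def cost_per_call(calls: int, avg_cost: float) -> float:
    """Average USD spent per vision call for one install."""
    return avg_cost / calls


for name, (calls, cost) in ROWS.items():
    print(f"{name}: ${cost_per_call(calls, cost):.4f} per call")
```

Every row lands at about $0.0023 to $0.0024 per call, so install cost scales almost linearly with the number of distinct dialogs the installer shows.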

When manual is faster: If you're doing a one-off install on hardware you own and have the driver INF file, clicking through the wizard yourself takes 5 minutes. The LLM workflow pays off when you're imaging 3+ machines, when the install environment isn't physically accessible (remote KVM), or when you're building a repeatable workflow for a fleet.

Our Retro Fleet Spec

Machine       | CPU                            | GPU                   | Sound                      | OS             | Role
Beige Tower   | Pentium III 1 GHz (Coppermine) | Voodoo5 5500          | Sound Blaster Live! CT4830 | Windows 98 SE  | Era gaming
Silver Midi   | Pentium 4 2.4 GHz              | GeForce FX 5900 Ultra | Sound Blaster Audigy 2 ZS  | Windows XP SP3 | DirectX 9 testbench
AGP Mid-Tower | Athlon XP 2700+                | ATI Radeon 9700 Pro   | Audigy FX PCIe             | Windows XP SP3 | Driver dev
USB-C Build   | AMD Ryzen 5 5600X              | RTX 3060 12GB         | BlasterX G6                | Windows 11     | Host + LLM runner

How to Reproduce This on Your Own Retro Rig

The Audigy FX on WinXP is the easiest starting point because the Creative Audigy FX PCIe card is still sold new on Amazon (ASIN B00EO6X4XG), the driver is downloadable from Creative's support site, and the install is 8–12 minutes with no hardware edge cases. Steps:

  1. Clone github.com/voidsstr/retro-agent on your host machine.
  2. Set ANTHROPIC_API_KEY in your environment.
  3. Configure fleet.yaml with your retro machine's VNC address (or QEMU socket path).
  4. Run python3 retro_agent.py --target audigy-fx-winxp.
  5. Watch the VNC session run the install. If it gets stuck, the harness logs the stuck screenshot to logs/stuck/ for you to label and add to the prompt context.

The Audigy FX driver archive is mirrored in the repo at drivers/audigy-fx/ for reproducibility; Creative periodically moves their download links.

Bottom Line: When AI-Driven Install Pays Off

Use the LLM driver workflow when:

  • You're maintaining a fleet of 3+ retro machines with the same driver set.
  • Physical access is inconvenient (basement rack, remote location, lab environment).
  • You want a reproducible, logged install record with per-step screenshots.
  • The driver has no silent-install flag and manual clicking is the only option.

Skip it and just RTFM when:

  • It's a one-time install on one machine you can reach.
  • The driver has a modern installer with /quiet support.
  • Your token budget is tight and the install is under 5 manual minutes.

The workflow will improve as vision models get better at 256-color bitmapped fonts. As of 2026, the manual-override rate on Win98 installs is about 11% — not zero, but low enough that the fleet runs largely unattended.

SpecPicks Editorial · Last verified 2026-05-02

Frequently asked questions

What vision model do you use for retro PC driver automation, and how much does it cost per install?
We use Claude (model claude-sonnet-4-6) via the Anthropic API with vision enabled. A complete Sound Blaster Audigy driver install on Windows XP generates approximately 40–60 screenshot-analysis calls, each processing a 1280×1024 PNG at low detail. Total token cost per install runs $0.08–$0.14 at current API pricing. Voodoo3 installs on Win98 need a longer sequence of 70–90 calls because of the DirectDraw overlay toggle step, putting per-install cost at $0.15–$0.22.
Can the LLM vision workflow handle Windows 98 graphical installers, or only command-line installs?
Both, but Win98 GUI installers are the primary use case — command-line installs are straightforward to script without vision at all. The LLM excels at reading dialog text, parsing error messages embedded in Win98's low-resolution bitmap fonts, and deciding which button to click in multi-step wizards. It handles modal dialogs (reboot prompts, license screens, device-found dialogs) reliably. Where it struggles is pixel-exact cursor targeting on buttons smaller than 30×8 pixels — we fall back to OCR coordinate mapping for those.
Is there an open-source version of the retro-agent workflow I can run on my own machine?
Yes. The retro-agent repo at github.com/voidsstr/retro-agent is MIT-licensed and contains the screenshot-loop harness, the Claude vision prompt templates, the OCR fallback using Tesseract 5, and the click-coordinate normalization layer. It requires a host machine running Python 3.11+ and a QEMU/KVM virtual machine (or physical PC with VNC access) for the retro OS. The Audigy FX install on Windows XP is the documented starting point in the repo's README — it's the most reliable and best-tested driver in the suite.
What happens when the LLM vision workflow gets stuck on a driver install?
The harness has a maximum-attempts counter (default 25 per dialog) and a state-hash circuit breaker: if the screen hash doesn't change after 3 consecutive clicks, it emits a STUCK signal and pages the operator. From there you have two options: SSH into the host and manually advance the installer, then resume the vision loop; or label the stuck screenshot and add it to the prompt context so the next install recognizes it. Over 12 driver families, the stuck rate is 11% on first attempt and 3% on subsequent runs after adding the state to the prompt context.
Which Sound Blaster card should I buy today for a period-correct Windows 98 or XP build?
The Creative Sound Blaster Audigy FX PCIe (ASIN B00EO6X4XG) is the best modern PCIe option that installs cleanly on Windows XP with the downloadable driver archive. It provides EAX 5.0 hardware audio, 5.1 surround, and a 600-ohm headphone amp. For a pure Win98 machine, you'll need an ISA or legacy PCI card — look for an original Sound Blaster Live! CT4830 on eBay. The Audigy FX is PCIe x1 and will not fit in an ISA or AGP slot.
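The stuck handling described in the FAQ (a 25-attempt cap per dialog plus a hash breaker that trips after 3 clicks with no screen change) can be sketched as a small counter object. The class and method names are hypothetical, and how retro-agent actually pages the operator is not covered here.

```python
import hashlib


class StuckDetector:
    """Tracks clicks on one dialog and reports when a STUCK signal is due."""

    def __init__(self, max_attempts: int = 25, max_unchanged_clicks: int = 3):
        self.max_attempts = max_attempts
        self.max_unchanged_clicks = max_unchanged_clicks
        self.attempts = 0
        self.unchanged_clicks = 0

    def new_dialog(self) -> None:
        """Reset both counters when the installer moves to a new dialog."""
        self.attempts = 0
        self.unchanged_clicks = 0

    def record_click(self, before: bytes, after: bytes) -> bool:
        """Record one click; return True when STUCK should be emitted."""
        self.attempts += 1
        same = hashlib.sha256(before).digest() == hashlib.sha256(after).digest()
        # Only consecutive no-change clicks count toward the breaker.
        self.unchanged_clicks = self.unchanged_clicks + 1 if same else 0
        return (
            self.attempts >= self.max_attempts
            or self.unchanged_clicks >= self.max_unchanged_clicks
        )
```

Keeping the two limits separate means a dialog that changes on every click but never finishes still trips the attempt cap, while a dead button trips the breaker after three tries.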

— SpecPicks Editorial · Last verified 2026-05-15