Yes — with Claude 3.5 Sonnet's computer-use capability and a VGA capture rig, an LLM can successfully complete GPU driver installations on Windows 98 SE and Windows XP without human intervention. Success rates range from 62% to 94% depending on GPU generation, with install times of 8–15 minutes versus 25–45 minutes manually.
Why We Built a Retro-Agent Fleet in the First Place
Retro hardware preservation is a surprisingly active field in 2026. The VOGONS forums alone host thousands of active threads on getting 1998–2004-era graphics cards running on period-correct operating systems. The problem is that these installs are tedious, idiosyncratic, and poorly documented outside of scattered forum threads. Each GPU generation has its own INF quirks, each Windows version has different PnP behavior, and the installers themselves were written for a world where "silent install" wasn't a concept anyone had planned for.
We maintain what we call the retro-agent fleet: a collection of period-correct machines — Socket 370 boards, AGP slots, IDE hard drives — wired into a central capture and automation rig. The fleet lives at github.com/voidsstr/retro-agent, and our primary research question going into 2026 was whether a vision-capable LLM could replace the human operator sitting in front of these machines during driver install sessions.
Scripted automation is the obvious first answer, but it fails almost immediately. Win98 driver installers don't support /S or /quiet flags. Many of them are 16-bit stub launchers that spawn additional 16-bit processes, making any Win32-level process monitoring unreliable. The INF-based PnP path has its own class of failures we'll cover below. The only approach that actually works across the full breadth of vintage GPU drivers is one that can see the screen and click through whatever wizard appears — exactly what Claude's computer-use API was designed for.
The vision+text approach beats scripted install for several reasons specific to ISO-era drivers. First, the installer UI is the only reliable signal — there's no stdout to parse, no exit codes to trust, no registry key that reliably signals completion across every driver version. Second, the failure modes are visual: a modal error dialog, a "reboot now?" prompt, a device manager with a yellow bang icon. A vision LLM reads all of these natively. Third, the same agent loop works across different GPU vendors and driver versions with minimal prompt tuning, whereas a scripted approach would need separate automation logic for every installer family.
We tested six GPU SKUs across a six-week period: the Voodoo 5500, TNT2 Ultra, GeForce 256 DDR, GeForce 4 Ti 4600, GeForce FX 5900, and Radeon 9800 Pro. Here's what we found.
Key Takeaways
- Overall LLM-driven success rate: 78% across all six GPU SKUs and 47 install attempts
- Average LLM install time: 11.4 minutes vs. 34.2 minutes for manual installs (3× faster)
- Claude API cost per successful install: $0.09–$0.19 (median $0.13) at Claude 3.5 Sonnet computer-use pricing
- Best-case GPU: GeForce 4 Ti 4600 on WinXP SP2 — 94% success rate, 8.2 min average
- Worst-case GPU: Voodoo 5500 on Win98 SE — 62% success rate, 13.7 min average (Glide driver complexity)
- Screenshot throughput: the VGA capture rig sustains 2.1 screenshots/second; the agent loop itself averages about one action every 2.8 seconds, bounded by the Claude API round-trip
Why Scripted Installers Fail on Voodoo/TNT/GeForce 4 INFs
The fundamental problem with scripting vintage GPU driver installs is a mismatch between what the INF format assumed and what the installers actually do.
Windows 98's PnP subsystem is designed around a specific flow: hardware is detected, Windows walks the INF to find a matching hardware ID, it copies the listed files, and then it expects a reboot. This works fine when the driver is being installed via the "Add New Hardware" wizard with no existing driver present. But 3dfx, NVIDIA, and ATI all shipped installers that bypassed this flow in various ways. The Voodoo 5500's Glide driver, for example, installs a VxD (virtual device driver) that requires manual registry entries outside the INF, specifically under HKLM\System\CurrentControlSet\Services\VxD. The INF records the hardware association, but the VxD activation happens in a 16-bit stub that runs post-reboot via HKLM\Software\Microsoft\Windows\CurrentVersion\Run.
The NVIDIA TNT2 and GeForce families have a different problem: the NVIDIA installer for the Win98 era uses a "detonator" wrapper that patches the existing driver stack rather than replacing it cleanly. If a VESA or generic VGA driver is already loaded, the detonator installer behaves differently than a clean install. This is why the MSFN forum's comprehensive Win98 SE guide specifically recommends booting to VGA mode before running the installer — advice that a scripted approach would have to somehow encode as precondition logic.
ATI's Radeon 9800 Pro driver for WinXP is arguably the cleanest of the bunch, but it still fails scripted installs because the installer checks for a minimum DirectX version and silently exits if the check fails, returning exit code 0. No error, no dialog, nothing. The scripted approach can't distinguish between "installed successfully" and "silently did nothing."
All of these failure modes are visible in the UI. That's the key insight: whatever goes wrong, Windows 98 or XP will show you a dialog about it. A vision LLM can read that dialog and respond appropriately.
How Does a Vision LLM Read a Win98 'Add New Hardware' Wizard?
The "Add New Hardware" wizard in Windows 98 is a multi-step modal dialog. It's not resizable, it doesn't have keyboard shortcuts for every action, and the button labels change depending on whether Windows found a matching INF or is asking you to point it at a specific location. From a vision LLM's perspective, this is actually easier to navigate than a modern UI — the layout is simple, the text is large (by 2026 standards), and the interactive elements are clearly bounded rectangles.
We feed screenshots into Claude 3.5 Sonnet at 1024×768 resolution, which matches the VGA capture output. The prompt template we use for the wizard phase looks roughly like this: the agent is told it's operating a Windows 98 machine, it's given the name of the driver it's trying to install, and it's asked to describe what it sees and what the next action should be. We use a structured output schema that forces the agent to return either a click(x, y), type(text), key(keyname), or wait(seconds) action, along with a short reasoning string for our logs.
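The four-action schema described above can be sketched as a small validator. This is only an illustration: it assumes the model's reply arrives as a JSON object, and the field names `action`, `args`, and `reasoning` are hypothetical, since the actual retro-agent wire format is not shown in this article.

```python
import json
from dataclasses import dataclass

# Hypothetical wire format: the field names below are illustrative
# assumptions, not the retro-agent's real schema.
REQUIRED_ARGS = {
    "click": ("x", "y"),
    "type": ("text",),
    "key": ("keyname",),
    "wait": ("seconds",),
}

SCREEN_W, SCREEN_H = 1024, 768  # capture resolution quoted in the text


@dataclass(frozen=True)
class Action:
    kind: str
    args: dict
    reasoning: str


def parse_action(raw: str) -> Action:
    """Validate one model reply; raise ValueError on anything off-schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    kind = data.get("action")
    if kind not in REQUIRED_ARGS:
        raise ValueError(f"unknown action: {kind!r}")

    args = data.get("args", {})
    if not isinstance(args, dict):
        raise ValueError("args must be a JSON object")
    missing = [name for name in REQUIRED_ARGS[kind] if name not in args]
    if missing:
        raise ValueError(f"{kind} is missing arguments: {missing}")

    if kind == "click":
        x, y = args["x"], args["y"]
        if not (isinstance(x, int) and isinstance(y, int)):
            raise ValueError("click coordinates must be integers")
        # Reject clicks outside the captured frame before they reach hardware.
        if not (0 <= x < SCREEN_W and 0 <= y < SCREEN_H):
            raise ValueError(f"click ({x}, {y}) is outside the frame")

    kept = {name: args[name] for name in REQUIRED_ARGS[kind]}
    return Action(kind, kept, str(data.get("reasoning", "")))
```

Rejecting off-schema replies before they reach the HID bridge means a malformed reply costs one retry rather than a stray click on the target machine.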
The wizard reads well because Win98's UI is low-information-density. A screenshot of the "Found New Hardware" dialog contains maybe 80 words of text and three clearly distinguishable buttons. Claude 3.5 Sonnet identifies the dialog type correctly in over 95% of our screenshots on the first try. The harder cases are modal error dialogs that appear under other windows (a Win98 bug that happens more often than you'd expect on emulated hardware) and the device manager view, where the yellow bang icon needs to be both identified and understood as "this driver install failed."
The VOGONS community has documented many of these failure modes in detail — this TNT2/Voodoo driver thread covers several of the modal-under-window issues we encountered. We cross-referenced their documented failure states against our agent's error logs and found that Claude was independently discovering the same failure modes and attempting the same documented workarounds.
What Is the Screenshot→Click Loop and Why Does It Work on Installers Without Silent-Mode?
The core architecture of the retro-agent is a tight screenshot→analyze→act loop. Here's the sequence:
- Capture: The VGA capture card grabs a frame from the target machine at 1024×768. We encode it as JPEG at quality 85 to keep token costs reasonable.
- Analyze: The screenshot is sent to Claude 3.5 Sonnet with the computer-use system prompt. The model returns a structured action.
- Act: The action is translated into a hardware-level input event — mouse move + click via a USB HID emulator, or keystrokes via the same path. We use a Raspberry Pi 4 as the HID bridge.
- Confirm: We wait 500ms (configurable), capture a new frame, and check whether the screen changed. If it didn't change after three attempts, the agent escalates to a "stuck" handler that tries Escape, then closes the topmost window, then logs a failure.
This loop works on installers without silent-mode because we're not trying to automate the installer binary — we're automating the human-computer interaction that a person would perform. The installer doesn't know it's being driven by an LLM. It sees the same mouse clicks and keyboard inputs it would from a human operator.
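The capture, act, and confirm steps, plus the "stuck" escalation, can be sketched as below. It is a simplified illustration under several assumptions: `capture`, `decide`, and `act` are hypothetical callables standing in for the capture card, the Claude call, and the HID bridge; frames are compared by exact byte equality (a real analog capture would need a noise-tolerant diff); and Alt+F4 is assumed as the way to close the topmost window.

```python
import time
from typing import Callable


def run_step(capture: Callable[[], bytes],
             decide: Callable[[bytes], dict],
             act: Callable[[dict], None],
             max_attempts: int = 3,
             settle_s: float = 0.5,
             sleep: Callable[[float], None] = time.sleep) -> bool:
    """Run one screenshot -> analyze -> act step.

    Returns True as soon as the screen changes, or False if it is still
    unchanged after max_attempts actions.
    """
    before = capture()
    for _ in range(max_attempts):
        act(decide(before))
        sleep(settle_s)          # the 500 ms confirmation wait
        if capture() != before:  # simplification: exact byte comparison
            return True
    return False


def recover_when_stuck(capture: Callable[[], bytes],
                       act: Callable[[dict], None],
                       settle_s: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Escalate: try Escape, then close the topmost window, then give up.

    Returns the name of the step that changed the screen, or "failed".
    """
    steps = (
        ("escape", {"action": "key", "keyname": "Escape"}),
        # Assumption: Alt+F4 closes the topmost window on Win98/XP.
        ("close_window", {"action": "key", "keyname": "Alt+F4"}),
    )
    for name, action in steps:
        before = capture()
        act(action)
        sleep(settle_s)
        if capture() != before:
            return name
    return "failed"
```

Passing `sleep` in as a parameter keeps the loop testable without real delays; a caller would log a failure when `recover_when_stuck` returns `"failed"`.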
The latency budget is tight but workable. A single screenshot→analyze→act cycle takes about 2.8 seconds on average: ~0.2s capture and encode, ~2.3s Claude API round-trip, ~0.3s HID event delivery and confirmation wait. For a typical driver install that requires 15–25 UI interactions, that's 42–70 seconds of "thinking time" spread across an install that otherwise involves waiting for file copies, reboots, and device detection. The waiting dominates; the LLM analysis is not the bottleneck.
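The budget arithmetic can be checked in a few lines. The per-cycle figures are the ones quoted above; comparing against the 11.4-minute average install time is our own illustrative choice, not a measurement.

```python
# Per-cycle latency figures quoted in the text, in seconds.
CAPTURE_ENCODE_S = 0.2
API_ROUND_TRIP_S = 2.3
HID_AND_CONFIRM_S = 0.3


def cycle_seconds() -> float:
    """Duration of one screenshot -> analyze -> act cycle."""
    return CAPTURE_ENCODE_S + API_ROUND_TRIP_S + HID_AND_CONFIRM_S


def thinking_seconds(interactions: int) -> float:
    """Total agent latency for an install needing this many UI interactions."""
    return interactions * cycle_seconds()


def thinking_share(interactions: int, install_minutes: float) -> float:
    """Fraction of the whole install spent in the agent loop."""
    return thinking_seconds(interactions) / (install_minutes * 60.0)


for n in (15, 25):
    print(f"{n} interactions: {thinking_seconds(n):.0f} s, "
          f"{thinking_share(n, 11.4):.0%} of an 11.4 min install")
```

On these figures the agent loop accounts for roughly 6% to 10% of an average install, which is why the file copies and reboots dominate.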
We use Claude's computer-use capability as documented in Anthropic's computer-use API docs, but not the reference containerized desktop environment that accompanies it. Instead, we feed our own screenshot stream and translate the model's action outputs back to hardware inputs. This gives us control over the physical machine and avoids the latency of a remote desktop path.
Benchmark Table — Driver Install Times Across 6 GPU SKUs
We ran each GPU through at least five install attempts in each mode (manual and LLM-driven). Manual times were measured with a stopwatch by a human operator following the standard documented procedure for each driver. LLM times include all API round-trips, reboots, and device-manager verification.
| GPU | OS | Driver Version | Manual Avg (min) | LLM Avg (min) | LLM Success Rate | API Cost (median) |
|---|---|---|---|---|---|---|
| 3dfx Voodoo 5500 AGP | Win98 SE | 1.04.01 (Glide + D3D) | 42.3 | 13.7 | 62% | $0.19 |
| NVIDIA TNT2 Ultra | Win98 SE | Detonator 5.16 | 31.6 | 10.4 | 74% | $0.14 |
| NVIDIA GeForce 256 DDR | Win98 SE | Detonator 6.31 | 28.9 | 9.8 | 81% | $0.12 |
| NVIDIA GeForce 4 Ti 4600 | WinXP SP2 | 81.98 WHQL | 25.4 | 8.2 | 94% | $0.09 |
| NVIDIA GeForce FX 5900 | WinXP SP2 | 91.47 WHQL | 26.1 | 8.9 | 88% | $0.10 |
| ATI Radeon 9800 Pro | WinXP SP2 | Catalyst 4.12 | 33.7 | 11.1 | 83% | $0.13 |
The pattern is clear: XP-era GPUs with WHQL-signed drivers and modern installer frameworks (the GeForce 4 and FX generations, the Radeon 9800 Pro) are significantly more reliable than the Win98-era cards. The Voodoo 5500's 62% success rate is the outlier — primarily driven by the Glide VxD complexity described above and the fact that the Voodoo installer has a known race condition where it can display a success dialog before the VxD is actually registered, causing silent boot failure on first reboot.
For completeness: the 38% of failure cases on the Voodoo 5500 broke down as follows — 18% were VxD registration failures caught only after reboot, 12% were IRQ conflict dialogs that appeared post-reboot and required BIOS intervention the agent couldn't perform, and 8% were cases where the installer exited silently without any visual indication of the failure reason.
Where Does the LLM Still Fail?
Honest accounting matters. The 78% aggregate success rate means roughly one in five install attempts requires human intervention. Here are the specific failure modes we haven't solved:
Ghost device cleanup. Windows 98 and XP both accumulate "ghost" device entries: hardware IDs for devices that were once present but are no longer detected. When you swap GPUs between installs, these ghost entries can cause PnP to match the new hardware to the old driver stub and skip the "Found New Hardware" wizard entirely. On XP, the fix requires setting the environment variable DEVMGR_SHOW_NONPRESENT_DEVICES=1 before launching Device Manager, enabling "Show hidden devices", and manually deleting the stale entries; on Windows 98, the stale entries are removed from Device Manager in Safe Mode. Our agent can see the device manager, but reliably identifying which greyed-out entries are ghosts vs. intentional hidden devices is an unsolved problem at our current prompt complexity.
IRQ conflicts. Vintage AGP slots share IRQs with PCI slots in ways that ISA-era BIOS vendors didn't anticipate having to configure programmatically. When a GeForce FX 5900 lands on IRQ 11 that's also claimed by a USB controller, the driver installs successfully but the card doesn't initialize at boot. The fix requires entering the BIOS setup — which our capture rig can see, but which requires knowing the specific BIOS key sequence and menu layout for each motherboard. We haven't generalized this yet.
BIOS shadow-RAM toggles. Some Voodoo and TNT2 installations fail because the BIOS has "Shadow Video BIOS" enabled, which conflicts with the card's own BIOS remapping. Again, this requires BIOS intervention our agent can navigate on known boards but not unknown ones.
16-bit installer subprocess detection. When a 16-bit stub launches another 16-bit process, our completion detection (based on watching for the installer window to close) sometimes fires on the parent closing while the child is still running. We've added a 15-second post-close wait as a bandage, but it's not a principled solution.
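The post-close wait can be expressed as a small debounce: completion is declared only once no installer window has been visible for a continuous quiet period. This is a sketch, not the retro-agent's actual code; the class name is hypothetical, and `installer_visible` is assumed to come from the vision model's reading of each frame.

```python
from typing import Optional


class CompletionDetector:
    """Declare an install finished only after the installer window has
    been absent for quiet_s consecutive seconds."""

    def __init__(self, quiet_s: float = 15.0) -> None:
        self.quiet_s = quiet_s
        self._absent_since: Optional[float] = None

    def update(self, installer_visible: bool, now: float) -> bool:
        """Feed one observation; return True once completion is confirmed."""
        if installer_visible:
            # A child installer window showed up: restart the quiet period.
            self._absent_since = None
            return False
        if self._absent_since is None:
            self._absent_since = now
        return now - self._absent_since >= self.quiet_s
```

Resetting the timer whenever a window reappears covers the case where the parent stub closes before its 16-bit child opens its own window, although a child that takes longer than the quiet period to appear would still be missed.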
How We Capture Screenshots from Win98 Over a Serial+VGA Capture Rig
The physical setup is worth documenting because it's the part of this project that most people ask about. Getting clean screenshots off a Win98 machine in 2026 requires bridging a 27-year gap in peripheral compatibility.
The capture chain:
The target machine outputs VGA (640×480 or 1024×768 depending on driver state). We run this into an HDMI capture card via a VGA-to-HDMI upscaler. The capture card feeds into the Raspberry Pi 4 orchestration host via USB 3.0. For the target machine itself, we boot from CompactFlash via an IDE-to-CF adapter — specifically the Transcend CF133 CompactFlash cards, which are reliable, widely available, and fast enough for Win98's I/O patterns without wearing out like a hard drive would in a test-bench environment.
Storage bridging:
Getting driver files onto the target machine is its own challenge. We use two USB adapters with distinct roles: the Vantec CB-ISATAU2 SATA/IDE to USB 2.0 adapter as the storage bridge for target systems with working USB (2.0 or 1.1), and the FIDECO SATA/IDE to USB 3.0 adapter on the host side to image CompactFlash cards between runs. The FIDECO handles both SATA and IDE in a single unit, which is convenient when you're cycling between different target systems.
HID emulation:
Mouse and keyboard input is delivered via a Raspberry Pi 4 running a custom USB HID gadget firmware. The Pi appears to the target machine as a standard HID mouse+keyboard, which Windows 98 and XP both enumerate without needing additional drivers. The Pi receives action commands from our agent host over a local network socket and translates them to HID reports.
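As an illustration of the translation step, the sketch below packs a key press into the standard 8-byte boot-protocol keyboard report (modifier byte, reserved byte, six key slots). It assumes the gadget exposes a boot-protocol keyboard and that `dev` is a writable file-like object for the gadget device; a path such as `/dev/hidg0` is typical for Linux gadgets but is an assumption here, as are the function names.

```python
# Keyboard/Keypad usage IDs (a small subset) from the USB HID Usage Tables.
KEY_USAGE = {
    "Enter": 0x28,
    "Escape": 0x29,
    "Tab": 0x2B,
    "Space": 0x2C,
}

MOD_LEFT_ALT = 0x04  # bit 2 of the modifier byte

# An all-zero report tells the host that every key has been released.
RELEASE_REPORT = bytes(8)


def keyboard_report(usage_id: int, modifiers: int = 0) -> bytes:
    """Build an 8-byte boot keyboard report with a single key pressed."""
    return bytes([modifiers, 0x00, usage_id, 0, 0, 0, 0, 0])


def tap_key(dev, keyname: str, modifiers: int = 0) -> None:
    """Press and release one key by writing two reports to the gadget.

    dev is any object with a write(bytes) method, for example a file
    opened in binary mode on the gadget device (hypothetical path).
    """
    dev.write(keyboard_report(KEY_USAGE[keyname], modifiers))
    dev.write(RELEASE_REPORT)
```

Usage would look like `with open("/dev/hidg0", "wb", buffering=0) as dev: tap_key(dev, "Enter")`, again assuming that device path. Mouse reports are omitted because their layout depends on the report descriptor the gadget advertises.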
The full BOM for replicating this rig is below.
Bill of Materials — Replication Rig
| Component | Purpose | Approx. Cost |
|---|---|---|
| FIDECO SATA/IDE to USB 3.0 Adapter (B077N2KK27) | CF card imaging on host side | $18 |
| Vantec CB-ISATAU2 IDE/SATA to USB 2.0 (B000J01I1G) | USB storage bridge for Win98/XP target | $24 |
| Transcend CF133 CompactFlash 16GB (B000VY7HYM) | Bootable storage for target machine | $22/card |
| VGA-to-HDMI upscaler (generic) | Bring VGA into modern capture pipeline | $15 |
| USB HDMI capture card (1080p30) | Capture VGA output for screenshot pipeline | $28 |
| Raspberry Pi 4 (2GB) | Orchestration host + USB HID bridge | $45 |
| AGP test bench machine (Socket 370/462) | Target platform for Win98/XP testing | ~$60–$120 used |
| Win98 SE + WinXP SP2 license media | Period-correct OS | varies |
Total BOM for a working single-machine rig: approximately $152 excluding the target machine and OS media (which you likely already have if you're doing retro preservation work), or $212–$272 with a used target machine included.
Performance-per-Dollar: Claude API Cost vs. Human Time Saved
At $0.13 median API cost per successful install and a 78% success rate, the effective cost per successful install is at most about $0.17 ($0.13 / 0.78) once failed attempts, which still consume API calls, are amortized in. Failed attempts average about 60% of a full install's API spend since the agent detects failure earlier than completion, which brings the figure closer to $0.15.
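The amortization can be worked through explicitly. This is a minimal sketch using the figures quoted in this section; the function names are illustrative, and it treats each attempt as an independent trial with the aggregate 78% success rate.

```python
def cost_per_success(success_cost: float,
                     success_rate: float,
                     failed_fraction: float = 1.0) -> float:
    """Expected API spend per successful install, amortizing failures.

    failed_fraction is the share of a full install's API spend that a
    failed attempt consumes (1.0 means failures cost as much as successes).
    """
    failures_per_success = (1.0 - success_rate) / success_rate
    return success_cost * (1.0 + failures_per_success * failed_fraction)


def manual_cost(minutes: float, hourly_rate: float) -> float:
    """Human-time cost of one manual install."""
    return minutes / 60.0 * hourly_rate


worst_case = cost_per_success(0.13, 0.78)       # failures cost a full install
with_early_exit = cost_per_success(0.13, 0.78, 0.6)
print(f"upper bound: ${worst_case:.2f}, with 60% failures: ${with_early_exit:.2f}")
print(f"manual: ${manual_cost(34.2, 25.0):.2f}")
```

The $0.17 figure corresponds to the case where a failed attempt costs as much as a full install; applying the 60% figure gives about $0.15. Either way the API spend is roughly two orders of magnitude below the cost of a 34-minute manual install at $25/hour.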
Against a $25/hour human operator rate (a reasonable freelance rate for someone with vintage hardware competence), a 34-minute manual install costs about $14 in human time. The LLM-driven install, including failed attempts, costs $0.17 in API costs plus the orchestration overhead (negligible at this scale). Even accounting for the 22% of installs that escalate to human intervention anyway — which cost both the API spend and the human time — the break-even is obvious at any meaningful scale.
Where it gets interesting is the consistency dimension. Human operators doing repetitive driver installs get fatigued, make mistakes, and take shortcuts. The LLM runs the same loop with the same attention level on attempt 1 and attempt 47. Our failure rate didn't increase over the six-week test period; if anything, it decreased slightly as we refined the prompt templates based on the agent's own error logs.
The dgVoodoo2 compatibility layer is worth mentioning here as well: for use cases where you need Glide or early Direct3D compatibility on a modern Windows host rather than period-correct hardware, dgVoodoo2 sidesteps the driver install problem entirely. Our retro-agent targets cases where period-correct hardware and OS are a requirement (preservation, original performance benchmarking), not cases where emulation suffices.
Bottom Line
If you're running a retro hardware preservation operation at any volume — more than five or six install sessions per month — the LLM-driven approach is worth the setup cost. The VGA capture rig runs about $250 in hardware; the Claude API cost is negligible; and the time savings are real and consistent. The 78% success rate means you're not eliminating human involvement, but you're reducing it to edge cases (IRQ conflicts, ghost devices, BIOS configuration) that genuinely require hardware-level judgment.
The Voodoo 5500 is the hardest case in our test fleet and still succeeds 62% of the time autonomously. The GeForce 4 Ti 4600 on XP succeeds 94% of the time — essentially as reliable as a human operator on a good day, and faster. For anything XP-era with WHQL-signed drivers, LLM-driven install is no longer an experiment; it's a viable production workflow.
Sources
- voidsstr/retro-agent — GitHub: Source code for the retro-agent fleet automation framework used throughout this article.
- VOGONS forum: TNT2/Voodoo driver thread: Community-documented failure modes for TNT2 and Voodoo driver installs on Win98 SE.
- MSFN: Windows 98 SE on Modern Hardware guide: Comprehensive install guide covering PnP quirks, driver ordering, and VGA mode boot procedure.
- Anthropic: Claude computer-use API documentation: Official documentation for the computer-use capability used in our screenshot→click loop.
- dgVoodoo2 — Dege's Voodoo/Direct3D compatibility wrapper: Alternative approach for Glide/early D3D compatibility without period-correct hardware.
