Open WebUI — self-hosted ChatGPT for your local models

Author: SpecPicks Editorial

The polished web UI for Ollama that actually looks like ChatGPT.

By SpecPicks Editorial · Published 2026-04-21 · Last verified 2026-06-21 · 5 min read


Open WebUI (formerly Ollama WebUI) is the answer to "how do I give my family a ChatGPT-like interface to my local Ollama?"

What it does

Multi-user auth with RBAC — kids get one account, adults another, admin gets model management
RAG pipeline built in — drop a PDF, ask questions, Open WebUI handles embedding + retrieval
Model switching per conversation
Function calling / tool use (via pipelines)
Responsive design — works on your phone over LAN

Install with Docker

bash

docker run -d -p 3000:8080 \
 --add-host=host.docker.internal:host-gateway \
 -v open-webui:/app/backend/data \
 --name open-webui \
 --restart always \
 ghcr.io/open-webui/open-webui:main

Hit http://localhost:3000, create the admin account, point it at your Ollama instance (defaults work if Ollama runs on the same machine).
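The container can take a little while to answer after docker run returns. A minimal readiness check might look like the sketch below, assuming the -p 3000:8080 mapping from the command above; http_status and wait_until_up are hypothetical helper names, not part of Open WebUI.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status for url, or 0 if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, just not with a 2xx status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, ...


def wait_until_up(url: str, fetch: Callable[[str], int] = http_status,
                  attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll url until it returns HTTP 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # give the container time to finish starting
    return False
```

Calling wait_until_up("http://localhost:3000") returns True once the UI responds; the fetch parameter exists only so the polling logic can be exercised without a running container.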

Hooking up RAG

Admin → Documents → Upload PDF/Markdown/TXT
Open WebUI chunks + embeds automatically (default: sentence-transformers all-MiniLM-L6-v2 running locally on CPU; switch to nomic-embed-text via Ollama by setting RAG_EMBEDDING_ENGINE=ollama and RAG_EMBEDDING_MODEL=nomic-embed-text)
In chat, toggle the document → Open WebUI injects relevant chunks into your prompt

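For serious RAG, swap in a dedicated vector DB such as Qdrant. Chroma is the store Open WebUI ships with, and the VECTOR_DB environment variable selects an alternative; check the docs for the backends your version supports.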
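To make the embed, retrieve, inject flow concrete, here is a toy sketch. It is not Open WebUI's code: the word-count "embedding" stands in for a real model such as all-MiniLM-L6-v2, and the function names and sample chunks are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question: str, chunks: list, k: int = 2) -> list:
    """Rank chunks by similarity to the question and keep the best k."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, chunks: list, k: int = 2) -> str:
    """Inject the retrieved chunks ahead of the user's question."""
    context = "\n".join(f"- {c}" for c in top_chunks(question, chunks, k))
    return f"Use this context to answer.\n{context}\n\nQuestion: {question}"


# Hypothetical chunks, as if split from an uploaded household-notes document.
chunks = [
    "The router admin password is stored in the family password manager.",
    "The NAS backup runs every night at 02:00 to the external drive.",
    "Guest Wi-Fi is reset on the first day of each month.",
]
prompt = build_prompt("When does the NAS backup run?", chunks, k=1)
```

Only the NAS chunk reaches the prompt here, which is the whole point of retrieval: the model sees the relevant slice of the document rather than all of it.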

Why not just SillyTavern / LibreChat / LM Studio?

SillyTavern: roleplay-focused, heavier customization per character. Different use case.
LibreChat: fuller OpenAI-style multi-provider support, but heavier setup.
LM Studio: desktop app only, single-user. Great for solo dev; not for a family.

Open WebUI is the sweet spot for "one server, many users, local-first."

Deployment playbook — family, team, or public

Family / home use (single container, trusted LAN)

bash

docker run -d -p 3000:8080 \
 --add-host=host.docker.internal:host-gateway \
 -v open-webui:/app/backend/data \
 --name open-webui --restart always \
 ghcr.io/open-webui/open-webui:main

Access at http://<home-server-ip>:3000. The first user to sign up becomes admin. RBAC is on by default: new sign-ups land as pending accounts, and you approve each family member from the admin UI.

Team use (behind Caddy/Traefik, auth via OIDC)

Add a reverse proxy for HTTPS and SSO. Example Caddyfile:

caddyfile

chat.example.com {
    reverse_proxy localhost:3000
    forward_auth auth.example.com {
        uri /api/verify
        copy_headers Remote-User Remote-Groups
    }
}

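Open WebUI reads the signed-in user from proxy headers when trusted-header auth is enabled; configure it with WEBUI_AUTH_TRUSTED_EMAIL_HEADER=Remote-User. As the variable name suggests, the header value is treated as the account's email address, so make sure your auth service puts an email there. Keep port 3000 reachable only through the proxy, because anyone who can hit the container directly can set that header themselves.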
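Trusted-header login boils down to the application believing whatever identity header the proxy forwards. The sketch below illustrates that idea only; it is not Open WebUI's implementation, and trusted_email is a hypothetical helper.

```python
from typing import Mapping, Optional


def trusted_email(headers: Mapping[str, str], header_name: str) -> Optional[str]:
    """Return the email asserted by the proxy, or None if absent or malformed."""
    wanted = header_name.lower()
    for name, value in headers.items():
        if name.lower() == wanted:  # HTTP header names are case-insensitive
            email = value.strip().lower()
            return email if "@" in email else None
    return None
```

Nothing in this lookup can tell a header set by the proxy from one set by a client, which is why the application port should never be exposed alongside the proxy.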

Public-facing (rate-limited, captcha, strict resource limits)

Don't. If you need a public chat UI, use LibreChat — Open WebUI was built for trusted-environment use and doesn't harden the abuse surface by default.

Hooking into Ollama / LiteLLM / OpenAI

In admin → Connections:

Ollama: add http://host.docker.internal:11434 — Open WebUI detects installed models automatically.
OpenAI-compatible (LiteLLM, vLLM, copilot-api): add the URL + key. Any OpenAI-shaped endpoint works; LiteLLM in particular is a widely used multi-provider proxy and pairs well with Open WebUI.
Anthropic native: enable via the Anthropic connection type; paste your API key. Supports Claude 4.x.
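"OpenAI-shaped" means the endpoint accepts a POST to /chat/completions with a bearer token and a messages array. A sketch of that request, built but not sent, is below; the base URL, key, and model name are placeholders, and chat_request is a hypothetical helper.

```python
import json
import urllib.request


def chat_request(base_url: str, api_key: str, model: str,
                 prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: substitute your own proxy URL, key, and model name.
req = chat_request("http://localhost:4000/v1", "sk-placeholder", "my-model", "Say hi")
# To send it: urllib.request.urlopen(req), then json.load() the response body.
```

If a backend answers this request correctly from the command line, it should also work when you add the same URL and key as a connection.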

Building RAG without losing your mind

Set the embedding model in admin → Settings → Documents → Embedding model. The default is sentence-transformers all-MiniLM-L6-v2 (CPU-local, ~500 MB RAM); for better quality switch the engine to Ollama and pick nomic-embed-text.
Upload docs via admin → Documents. Per-user collections are also supported.
In chat, click the document-picker icon to scope the conversation to a collection.

RAG caveats:

Default retriever is vector-only (cosine similarity on the embedding store). Turn on ENABLE_RAG_HYBRID_SEARCH to add BM25 lexical search plus a CrossEncoder reranker, which is usually worth it. For more control, swap in Qdrant as the vector store.
Max chunk size matters. 512 tokens is default; bump to 1024 for long-document use cases.
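Hybrid search has to merge two rankings, one from the vector store and one from lexical matching. Reciprocal rank fusion is one common way to do that, shown here as a generic technique; it is an assumption that anything like it applies inside Open WebUI, and the chunk ids are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Merge several best-first rankings; each hit scores 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["chunk-7", "chunk-2", "chunk-9"]   # semantic neighbours
lexical_hits = ["chunk-2", "chunk-4", "chunk-7"]  # exact keyword matches
fused = reciprocal_rank_fusion([vector_hits, lexical_hits])
```

A chunk that ranks well in both lists (chunk-2 here) rises above one that tops only a single list, which is why hybrid retrieval tends to help with part numbers, names, and other exact strings that embeddings blur.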

Pipelines — custom logic without forking

Open WebUI's Pipelines feature lets you inject pre/post hooks:

Filter pipelines: middleware that mutates the request on the way in (inlet) and/or the response on the way out (outlet) — e.g. scrub PII, redact secrets, add a system-prompt prefix.
Pipe pipelines: replace the whole model call with custom logic — e.g. route to different backends based on token count, or wrap a non-OpenAI provider.
Manifold pipelines: a single pipeline that exposes multiple models in the picker (multi-model aggregation).
Valves (not a separate pipeline type): Pydantic-typed configuration knobs that any pipeline can expose to the admin UI — use them to surface API keys, toggles, and thresholds without redeploying.
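The "route by token count" idea for a pipe can be sketched in a few lines. This shows the routing decision only: the model names, the threshold, and the four-characters-per-token estimate are all assumptions, and a real pipe would call the chosen backend instead of returning the request.

```python
class Pipe:
    """Toy router: short prompts go to a small model, long ones to a large one."""

    def __init__(self, small: str = "small-model", large: str = "large-model",
                 threshold_tokens: int = 2000):
        self.small = small
        self.large = large
        self.threshold_tokens = threshold_tokens

    @staticmethod
    def estimate_tokens(messages: list) -> int:
        # Rough heuristic (about 4 characters per token), not a real tokenizer.
        chars = sum(len(str(m.get("content", ""))) for m in messages)
        return chars // 4

    def pipe(self, body: dict) -> dict:
        tokens = self.estimate_tokens(body.get("messages", []))
        target = self.large if tokens > self.threshold_tokens else self.small
        # Return a copy with the model swapped; the caller's dict is left untouched.
        return {**body, "model": target}
```

In a real deployment the two names and the threshold are natural candidates for valves, so an admin can tune them from the UI.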

Example filter that adds a system-prompt prefix:

python

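from typing import Optional

class Filter:
    def inlet(self, body: dict, user: Optional[dict] = None) -> dict:
        # Prepend a system message before the request reaches the model.
        body.setdefault("messages", []).insert(0, {
            "role": "system",
            "content": "Always include units with every numerical answer.",
        })
        return body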
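The outlet side works the same way in reverse. As a sketch of the "redact secrets" use case, the filter below masks key-like strings in assistant messages; the regular expression is a hypothetical pattern for "sk-" style keys, and the body shape (a messages list of role/content dicts) is assumed to match the inlet example.

```python
import re
from typing import Optional

# Hypothetical pattern: strings that look like "sk-" style API keys.
SECRET_PATTERN = re.compile(r"\bsk-[A-Za-z0-9_-]{8,}")


class Filter:
    def outlet(self, body: dict, user: Optional[dict] = None) -> dict:
        """Mask key-like strings in assistant messages before they are shown."""
        for message in body.get("messages", []):
            content = message.get("content")
            if message.get("role") == "assistant" and isinstance(content, str):
                message["content"] = SECRET_PATTERN.sub("[redacted]", content)
        return body
```

Pattern-based redaction only catches what the pattern describes, so treat it as a safety net rather than a guarantee.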

How we tested and compared

Numbers in this article reflect our own SpecPicks family deployment — Open WebUI on an Ubuntu VM, Ollama on a bare-metal RTX 5090, three active users, ~40 chats/day for three months. Pipeline patterns are cross-referenced against the Open WebUI GitHub repository (issue tracker + discussions) and community feedback on r/LocalLLaMA.

Alternatives — when Open WebUI isn't right

LibreChat: closer to a straight ChatGPT clone; better for multi-tenant public deployments.
SillyTavern: RP / character focus. Different audience.
Big-AGI: prettier UI, less admin surface. Good solo-use pick.
LM Studio: desktop app only, single user. Good dev tool; not for sharing.

Frequently asked questions

Can Open WebUI replace ChatGPT for a family?

Yes, that's its primary pitch: multi-user auth, RAG, model switching, and a mobile-friendly UI. The one thing it doesn't match ChatGPT on out of the box is image generation (though you can wire ComfyUI behind it via pipelines).

How do I keep the family out of the Ollama admin?

Keep family accounts on the regular user role and reserve the admin role for yourself. Regular users can chat, upload documents to their own collections, and pick from enabled models; they can't install new ones or see other users' data.

Is Open WebUI audited / secure enough for small business use?

For trusted-LAN use, yes. For anything public-facing, add a proper auth layer (OIDC via Authelia, Authentik, or your identity provider). Open WebUI itself doesn't hold security certifications; treat it as "hobbyist-quality security, production-quality UI."

What's the difference between Open WebUI and Ollama's built-in webui?

ollama serve exposes Ollama's HTTP API (port 11434 by default) rather than a full web app; on its own that gives you no auth, no RAG, and no multi-user support. Open WebUI is the "production" layer on top: same Ollama backend, much more surface area.

Does Open WebUI work on Mac / Apple Silicon?

Yes — runs fine in Docker Desktop. Performance is bottlenecked by the model host (your Ollama / inference backend), not the UI container.

Sources

Open WebUI GitHub repository — 133k+ stars, active issue tracker, canonical reference.
LiteLLM documentation — pairing guide for using Open WebUI with multi-provider routing.
r/LocalLLaMA — community deployment patterns.
ComfyUI documentation — image-gen pipeline to optionally wire in.


— SpecPicks Editorial · Last verified 2026-04-21


Frequently asked questions

What are the hardware requirements for running Open WebUI?

Open WebUI itself has minimal hardware requirements, as it is a lightweight Docker container. Performance depends on the backend model host (e.g., Ollama or vLLM). For inference workloads, a system with a modern GPU, such as an NVIDIA RTX series card, is recommended.

Can Open WebUI integrate with other AI models besides Ollama?

Yes, Open WebUI supports integration with multiple AI models. It works with OpenAI-compatible APIs, LiteLLM, vLLM, and Anthropic's Claude. These integrations can be configured in the admin settings by providing the appropriate API URLs and keys, making it versatile for various use cases.

How does Open WebUI handle document-based RAG workflows?

Open WebUI supports RAG (Retrieval-Augmented Generation) by letting users upload documents in formats like PDF, Markdown, or TXT. It automatically chunks and embeds the content, by default with sentence-transformers all-MiniLM-L6-v2, or with nomic-embed-text via Ollama if you switch the embedding engine. For advanced use, a dedicated vector database such as Qdrant can improve retrieval performance.

Is Open WebUI suitable for public-facing deployments?

Open WebUI is designed for trusted environments like family or team use. It lacks built-in protections for public-facing deployments, such as rate-limiting or advanced abuse prevention. For public use, it is recommended to use alternatives like LibreChat, which are better suited for such scenarios.

What customization options are available in Open WebUI?

Open WebUI offers extensive customization through its Pipelines feature. Users can create pre/post-processing hooks to modify input/output, route requests to different backends, or add toggles for specific features. This flexibility allows for tailored workflows without modifying the core application.

