Open WebUI — self-hosted ChatGPT for your local models

The polished web UI for Ollama that actually looks like ChatGPT.

Open WebUI wraps Ollama with multi-user auth, RAG, function calling, and a UI that doesn't embarrass you.

Open WebUI (formerly Ollama WebUI) is the answer to "how do I give my family a ChatGPT-like interface to my local Ollama?"

What it does

  • Multi-user auth with RBAC — kids get one account, adults another, admin gets model management
  • RAG pipeline built in — drop a PDF, ask questions, Open WebUI handles embedding + retrieval
  • Model switching per conversation
  • Function calling / tool use (via pipelines)
  • Responsive design — works on your phone over LAN

Install with Docker

docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

Hit http://localhost:3000, create the admin account, point it at your Ollama instance (defaults work if Ollama runs on the same machine).
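If you prefer Compose over a raw docker run, the same setup can be sketched as a compose file. This mirrors the flags above (port mapping, named volume, host-gateway alias); treat it as a starting point, not canonical Open WebUI documentation:

```yaml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: always
    ports:
      - "3000:8080"            # host 3000 -> container 8080
    extra_hosts:
      - "host.docker.internal:host-gateway"   # reach Ollama on the host
    volumes:
      - open-webui:/app/backend/data          # persists users, chats, docs

volumes:
  open-webui:
```

docker compose up -d gives you the same container with upgrades reduced to pull + up.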

Hooking up RAG

  1. Admin → Documents → Upload PDF/Markdown/TXT
  2. Open WebUI chunks + embeds automatically (default: a built-in SentenceTransformers model; you can point it at nomic-embed-text via Ollama)
  3. In chat, toggle the document → Open WebUI injects relevant chunks into your prompt

For serious RAG, swap in a dedicated vector DB (Qdrant or Chroma, selectable via the VECTOR_DB environment variable).
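The retrieve-and-inject step above is conceptually simple. A minimal sketch, with toy 3-dimensional vectors standing in for real embedding output (nomic-embed-text produces 768 dimensions, but the math is identical):

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_chunks(query_vec, chunks, k=2):
    # Rank stored chunks by similarity to the query embedding, keep the best k.
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:k]]

def build_prompt(question, retrieved):
    # Inject the retrieved chunks ahead of the user's question.
    context = "\n---\n".join(retrieved)
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"

# Toy "embeddings" in place of real model output.
chunks = [
    {"text": "The warranty lasts two years.", "vec": [0.9, 0.1, 0.0]},
    {"text": "Shipping takes 3-5 days.",      "vec": [0.1, 0.9, 0.0]},
    {"text": "Returns need a receipt.",       "vec": [0.7, 0.2, 0.1]},
]
prompt = build_prompt("How long is the warranty?",
                      top_chunks([1.0, 0.0, 0.0], chunks))
```

That's the whole trick: the model never "reads your PDF", it reads the top-scoring chunks pasted into the prompt.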

Why not just SillyTavern / LibreChat / LM Studio?

  • SillyTavern: roleplay-focused, heavier customization per character. Different use case.
  • LibreChat: fuller OpenAI-style multi-provider, but heavier setup.
  • LM Studio: desktop app only, single-user. Great for solo dev; not for a family.

Open WebUI is the sweet spot for "one server, many users, local-first."

Deployment playbook — family, team, or public

Family / home use (single container, trusted LAN)

Use the same single-container docker run command from the install section above.

Access at http://<home-server-ip>:3000. First user to sign up is admin. RBAC is on by default; create pending accounts for family, approve from admin UI.

Team use (behind Caddy/Traefik, auth via OIDC)

Add a reverse proxy for HTTPS and SSO. Example Caddyfile:

chat.example.com {
  reverse_proxy localhost:3000
  forward_auth auth.example.com {
    uri /api/verify
    copy_headers Remote-User Remote-Groups
  }
}

Open WebUI reads REMOTE_USER / REMOTE_GROUPS from headers when enabled; configure via WEBUI_AUTH_TRUSTED_EMAIL_HEADER=Remote-User.
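On the container side, that wiring might look like the run command below (assumes the Remote-User header carries the user's email; variable name per Open WebUI's env-var docs). Make sure port 3000 is reachable only from the proxy, because anyone who can hit the container directly can spoof the trusted header:

```shell
docker run -d -p 127.0.0.1:3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e WEBUI_AUTH_TRUSTED_EMAIL_HEADER=Remote-User \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```

Binding to 127.0.0.1 keeps the only path in through Caddy.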

Public-facing (rate-limited, captcha, strict resource limits)

Don't. If you need a public chat UI, use LibreChat — Open WebUI was built for trusted-environment use and doesn't harden the abuse surface by default.

Hooking into Ollama / LiteLLM / OpenAI

In admin → Connections:

  • Ollama: add http://host.docker.internal:11434 — Open WebUI detects installed models automatically.
  • OpenAI-compatible (LiteLLM, vLLM, copilot-api): add the URL + key. Any OpenAI-shaped endpoint works; LiteLLM specifically is the industry-standard multi-provider proxy and pairs excellently with Open WebUI.
  • Anthropic: there's no native Anthropic connection type; route Claude through LiteLLM (or another OpenAI-compatible proxy) and add it as an OpenAI-shaped endpoint.
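For the LiteLLM route, a minimal proxy config might look like this. Model names are placeholders, and the os.environ/ prefix is LiteLLM's convention for reading keys from the environment; point Open WebUI's OpenAI-compatible connection at the proxy's /v1 endpoint afterward:

```yaml
model_list:
  - model_name: gpt-4o                 # name Open WebUI will display
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
```

Run litellm with this config and every provider shows up in Open WebUI as just another model in the picker.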

Building RAG without losing your mind

  1. Set the embedding model in admin → Settings → Documents → Embedding model. The built-in SentenceTransformers default is fine to start; nomic-embed-text via Ollama also works well.
  2. Upload docs via admin → Documents. Per-user collections are also supported.
  3. In chat, click the document-picker icon to scope the conversation to a collection.

RAG caveats:

  • Default retrieval is plain vector similarity (ChromaDB under the hood); hybrid BM25 + reranking exists but is opt-in, and it's solid rather than stellar. For more control, swap in Qdrant as the vector store via the VECTOR_DB environment variable.
  • Max chunk size matters. 512 tokens is default; bump to 1024 for long-document use cases.
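To see why chunk size and overlap matter, here's a rough sketch of fixed-size chunking. It counts words rather than tokens (a real tokenizer such as tiktoken would track the 512-token setting more faithfully); the overlap keeps sentences from being cut cold at chunk boundaries:

```python
def chunk_text(text, max_words=120, overlap=20):
    # Split text into fixed-size word windows with a small overlap,
    # approximating token-based chunking.
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        end = min(start + max_words, len(words))
        chunks.append(" ".join(words[start:end]))
        if end == len(words):
            break
        start = end - overlap  # step back so adjacent chunks share context
    return chunks
```

Bigger chunks mean fewer retrieval hits carry more surrounding context; smaller chunks retrieve more precisely but can strand an answer across a boundary.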

Pipelines — custom logic without forking

Open WebUI's Pipelines feature lets you inject pre/post hooks:

  • Filter pipelines: mutate input before it hits the model (e.g., scrub PII, redact secrets).
  • Pipe pipelines: replace the whole model call (e.g., route to different backends based on token count).
  • Valves: configurable parameters exposed on any pipeline, surfaced as toggles/fields in the UI (e.g., a "high quality mode" switch that swaps to a larger model).

Example filter that adds a system-prompt prefix:

class Filter:
    def inlet(self, body: dict, user: dict | None = None) -> dict:
        # Runs on every request before it reaches the model;
        # mutate the request body and return it.
        body["messages"].insert(0, {
            "role": "system",
            "content": "Always include units with every numerical answer.",
        })
        return body
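A pipe that routes on conversation length could sketch out like this. The routing logic is the point; the exact method signature Open WebUI's Pipelines server expects may differ from this illustration, and the model names are hypothetical:

```python
class Pipe:
    # Sketch of pipe-style routing: long conversations go to a bigger model.
    SMALL_MODEL = "llama3.2:3b"   # hypothetical model names
    LARGE_MODEL = "llama3.1:70b"

    def estimate_tokens(self, messages):
        # Rough heuristic: ~4 characters per token.
        return sum(len(m.get("content", "")) for m in messages) // 4

    def pick_model(self, messages, threshold=2000):
        # Route by estimated context size before dispatching the call.
        if self.estimate_tokens(messages) > threshold:
            return self.LARGE_MODEL
        return self.SMALL_MODEL
```

In a real pipe, pick_model's result decides which backend the request is forwarded to.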

How we tested and compared

Numbers in this article reflect our own SpecPicks family deployment — Open WebUI on an Ubuntu VM, Ollama on a bare-metal RTX 5090, three active users, ~40 chats/day for three months. Pipeline patterns are cross-referenced against the Open WebUI GitHub repository (issue tracker + discussions) and community feedback on r/LocalLLaMA.

Alternatives — when Open WebUI isn't right

  • LibreChat — more ChatGPT-clone; better for multi-tenant public deployments.
  • SillyTavern — RP / character focus. Different audience.
  • Big-AGI — prettier UI, less admin surface. Good solo-use pick.
  • LM Studio — desktop app only, single user. Good dev tool; not for sharing.

Frequently asked questions

Can Open WebUI replace ChatGPT for a family?

Yes — that's its primary pitch. Multi-user auth, RAG, model switching, mobile-friendly UI. The one thing it doesn't match ChatGPT on is native image generation, though you can wire ComfyUI or AUTOMATIC1111 behind it via the built-in image-generation settings.

How do I keep the family out of the Ollama admin?

Don't give non-admin accounts the "admin" role. Regular users can chat, upload documents to their own collections, and pick from enabled models; they can't install new ones or see other users' data.

Is Open WebUI audited / secure enough for small business use?

For trusted-LAN use, yes. For anything public-facing, add a proper auth layer (OIDC via Authelia, Authentik, or your identity provider). Open WebUI itself doesn't hold security certifications; treat it as "hobbyist-quality security, production-quality UI."

What's the difference between Open WebUI and Ollama's built-in webui?

Ollama itself doesn't really ship a web UI: ollama serve exposes an HTTP API, and the newer desktop app is a minimal single-user chat window. No auth, no RAG, no multi-user. Open WebUI is the "production" layer on top: same Ollama backend, much more surface area.

Does Open WebUI work on Mac / Apple Silicon?

Yes — runs fine in Docker Desktop. Performance is bottlenecked by the model host (your Ollama / inference backend), not the UI container.

Sources

  1. Open WebUI GitHub repository — 133k+ stars, active issue tracker, canonical reference.
  2. LiteLLM documentation — pairing guide for using Open WebUI with multi-provider routing.
  3. r/LocalLLaMA — community deployment patterns.
  4. ComfyUI documentation — image-gen pipeline to optionally wire in.

— SpecPicks Editorial · Last verified 2026-04-21
