CVE-2026-45315: Open WebUI XSS Vulnerability

CVE-2026-45315 Overview

CVE-2026-45315 is a stored cross-site scripting (XSS) vulnerability in Open WebUI, a self-hosted artificial intelligence platform designed to run entirely offline. Versions prior to 0.9.3 derive the file extension from the user-supplied filename when handling audio transcription uploads. The /cache/{path} route then serves these files through FileResponse, setting Content-Type based on the on-disk extension and omitting any Content-Disposition header. An authenticated user with the default-enabled chat.stt permission can upload a polyglot WAV+HTML file and trick another user into opening it, executing attacker-controlled script in the Open WebUI origin [CWE-79].

Critical Impact
A verified user can hijack other users' sessions, steal API keys, and execute actions in the Open WebUI origin by delivering a malicious URL.

Affected Products

Open WebUI versions prior to 0.9.3
Deployments with the default-on chat.stt (speech-to-text) permission
Self-hosted Open WebUI instances exposed to multiple verified users

Discovery Timeline

2026-05-15 - CVE-2026-45315 published to NVD
2026-05-19 - Last updated in NVD database

Technical Details for CVE-2026-45315

Vulnerability Analysis

The vulnerability resides in the audio transcription upload endpoint of Open WebUI. The endpoint takes the file extension from the user-supplied filename and writes the upload to CACHE_DIR/audio/transcriptions/. The /cache/{path} route returns the stored file using FastAPI's FileResponse (re-exported from Starlette), which derives Content-Type from the on-disk file extension rather than from the file's contents. Because no Content-Disposition: attachment header is emitted, browsers render the response inline.

An attacker with the default chat.stt permission uploads a polyglot file that is simultaneously a valid WAV audio file and valid HTML. Because the attacker names the file pwn.html, the server stores it with an .html extension and later serves it as text/html. When a victim opens the resulting URL, the embedded <script> content executes in the Open WebUI origin, gaining access to session cookies, local storage tokens, and any in-app API.
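
The extension-driven content type described above can be illustrated with Python's standard mimetypes module, which makes the same kind of name-only guess. This is a minimal sketch, not Open WebUI's code: the function name guessed_content_type and the fallback value are assumptions made for illustration.

```python
import mimetypes


def guessed_content_type(stored_name: str) -> str:
    """Return the MIME type implied by a stored file's name alone."""
    # guess_type looks only at the name; the file's bytes are never inspected.
    guessed, _encoding = mimetypes.guess_type(stored_name)
    # The fallback is this sketch's choice, not necessarily the framework's.
    return guessed or "application/octet-stream"


if __name__ == "__main__":
    for name in ("recording.wav", "pwn.html"):
        print(name, "->", guessed_content_type(name))
```

A file whose bytes begin with a WAV header is still reported as text/html once its name ends in .html, which is why the polyglot renders as a page.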

Root Cause

The root cause is improper neutralization of input during web page generation [CWE-79], combined with insufficient validation of upload metadata. The application trusts the client-supplied filename extension to determine storage naming and downstream MIME handling. No allowlist constrains accepted extensions to audio formats, and no Content-Disposition header forces download semantics.
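
One way to close the gap described above is to accept only audio extensions and to let the server choose the stored name. The sketch below is an illustration under stated assumptions, not the vendor's patch: ALLOWED_AUDIO_EXTENSIONS and safe_storage_name are hypothetical names, and the set of accepted formats is a guess.

```python
import uuid
from pathlib import PurePosixPath

# Hypothetical allowlist; the formats a real deployment accepts may differ.
ALLOWED_AUDIO_EXTENSIONS = frozenset(
    {".wav", ".mp3", ".m4a", ".ogg", ".flac", ".webm"}
)


def safe_storage_name(client_filename: str) -> str:
    """Map a client-supplied filename to a server-chosen on-disk name."""
    # Normalize Windows separators so only the final path component is examined.
    final_component = PurePosixPath(client_filename.replace("\\", "/")).name
    # Only the last suffix counts, so "pwn.wav.html" is treated as ".html".
    extension = PurePosixPath(final_component).suffix.lower()
    if extension not in ALLOWED_AUDIO_EXTENSIONS:
        raise ValueError(f"unsupported audio extension: {extension!r}")
    # A random stem means nothing from the client-supplied name reaches disk.
    return f"{uuid.uuid4().hex}{extension}"
```

Rejecting the upload outright, rather than silently renaming it, keeps the failure visible to both the user and the logs.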

Attack Vector

Exploitation requires a verified account with the default-enabled speech-to-text permission and user interaction from the victim. The attacker uploads a crafted polyglot file through the transcription endpoint, then shares the resulting /cache/...html URL with another user via chat, email, or a shared workspace. Opening the link renders attacker JavaScript in the Open WebUI origin, enabling session theft and arbitrary actions on behalf of the victim.

No verified proof-of-concept code is published. See the GitHub Security Advisory GHSA-m8f9-9whg-f4xr for vendor details.

Detection Methods for CVE-2026-45315

Indicators of Compromise

Files stored under CACHE_DIR/audio/transcriptions/ with non-audio extensions such as .html, .htm, .svg, or .xml
HTTP requests to /cache/audio/transcriptions/*.html returning Content-Type: text/html
Audio transcription uploads where the filename extension does not match the audio MIME type
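
The request-based indicator above can be checked against reverse-proxy access logs. The following is a minimal sketch that assumes the common "combined" log format, in which the request line appears in double quotes; the names suspicious_cache_request and AUDIO_EXTENSIONS are hypothetical. Default access logs usually omit the response Content-Type, so this sketch keys on the requested extension instead.

```python
import re
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import unquote, urlsplit

# Matches the quoted request line, e.g. "GET /path HTTP/1.1".
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[0-9.]+"')
CACHE_PREFIX = "/cache/audio/transcriptions/"
# Hypothetical allowlist of extensions expected in the transcription cache.
AUDIO_EXTENSIONS = frozenset({".wav", ".mp3", ".m4a", ".ogg", ".flac", ".webm"})


def suspicious_cache_request(log_line: str) -> Optional[str]:
    """Return the requested path if it targets a non-audio cache file."""
    match = REQUEST_RE.search(log_line)
    if match is None or match["method"] not in {"GET", "HEAD"}:
        return None
    # Drop the query string and decode percent-escapes before inspecting.
    path = unquote(urlsplit(match["target"]).path)
    if not path.startswith(CACHE_PREFIX):
        return None
    extension = PurePosixPath(path).suffix.lower()
    return None if extension in AUDIO_EXTENSIONS else path
```

Running each log line through suspicious_cache_request and collecting the non-None results gives a list of cache URLs worth reviewing by hand.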

Detection Strategies

Inspect Open WebUI access logs for GET /cache/audio/transcriptions/ responses with text/html, image/svg+xml, or scriptable content types
Audit the transcription cache directory for files whose magic bytes do not match their extension
Alert on outbound requests from browser sessions to unrecognized external hosts immediately after a user opens a /cache/ URL
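
The cache audit described above can be sketched as a small script that compares each file's leading bytes with its extension and looks for markup inside supposed audio. This is a rough illustration under stated assumptions: only the RIFF/WAVE signature is checked, and HTML_MARKERS, MAX_SCAN_BYTES, and audit_transcription_cache are hypothetical names.

```python
from pathlib import Path
from typing import List, Tuple

# Markers that suggest browser-renderable markup inside a supposed audio file.
HTML_MARKERS = (b"<script", b"<html", b"<svg", b"<!doctype")
MAX_SCAN_BYTES = 1024 * 1024  # hypothetical cap on how much of each file is read


def looks_like_wav(data: bytes) -> bool:
    # A RIFF/WAVE container starts with "RIFF", a 4-byte size, then "WAVE".
    return data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def audit_transcription_cache(cache_dir: Path) -> List[Tuple[str, List[str]]]:
    """Return (file name, reasons) for every cached file that looks suspicious."""
    findings = []
    for path in sorted(cache_dir.rglob("*")):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            data = handle.read(MAX_SCAN_BYTES)
        extension = path.suffix.lower()
        reasons = []
        if extension == ".wav" and not looks_like_wav(data):
            reasons.append("extension is .wav but RIFF/WAVE header is missing")
        if looks_like_wav(data) and extension != ".wav":
            reasons.append(f"RIFF/WAVE content stored as {extension or 'no extension'}")
        lowered = data.lower()
        if any(marker in lowered for marker in HTML_MARKERS):
            reasons.append("contains HTML or script markup")
        if reasons:
            findings.append((path.name, reasons))
    return findings
```

Calling audit_transcription_cache(Path("CACHE_DIR/audio/transcriptions")) with the deployment's real cache path substituted returns an empty list when nothing stands out. Other audio formats would need their own signature checks.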

Monitoring Recommendations

Forward Open WebUI application and reverse-proxy logs to a centralized SIEM for query and correlation
Track creation of files under CACHE_DIR/audio/transcriptions/ and flag any extension outside an audio allowlist
Monitor session token usage from unexpected user agents or geolocations following cache URL access

How to Mitigate CVE-2026-45315

Immediate Actions Required

Upgrade Open WebUI to version 0.9.3 or later, which contains the vendor fix
Revoke the default chat.stt permission for untrusted user roles until upgrade is complete
Purge existing contents of CACHE_DIR/audio/transcriptions/ to remove any planted polyglot files
Rotate session secrets and force re-authentication to invalidate potentially stolen sessions

Patch Information

The vendor fixed CVE-2026-45315 in Open WebUI 0.9.3. Upgrade guidance and the full advisory are available at the Open WebUI Security Advisory GHSA-m8f9-9whg-f4xr.

Workarounds

Restrict the chat.stt permission so only trusted users can submit transcription uploads
Place a reverse proxy in front of Open WebUI that forces Content-Disposition: attachment and a restrictive Content-Security-Policy on /cache/ responses
Configure the reverse proxy to override Content-Type to application/octet-stream for files served from /cache/audio/transcriptions/
Limit Open WebUI access to authenticated users on trusted networks until the patch is applied

nginx

# Example nginx override forcing safe headers on Open WebUI cache responses
location /cache/ {
    proxy_pass http://open_webui_upstream;
    proxy_hide_header Content-Type;
    proxy_hide_header Content-Disposition;
    add_header Content-Type "application/octet-stream" always;
    add_header Content-Disposition "attachment" always;
    add_header Content-Security-Policy "default-src 'none'; sandbox" always;
    add_header X-Content-Type-Options "nosniff" always;
}
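
After deploying an override like the one above, it helps to confirm that the expected headers actually reach the client. The sketch below checks a header mapping for the four values set in the nginx example. The function name cache_header_problems is hypothetical, and how the headers are fetched is left to the caller.

```python
from typing import List, Mapping


def cache_header_problems(headers: Mapping[str, str]) -> List[str]:
    """List the hardening headers that are missing or wrong on a /cache/ response."""
    # Header names are case-insensitive, so compare on lowercased keys and values.
    normalized = {key.lower(): value.strip().lower() for key, value in headers.items()}
    problems = []
    if not normalized.get("content-disposition", "").startswith("attachment"):
        problems.append("Content-Disposition does not force a download")
    media_type = normalized.get("content-type", "").split(";")[0].strip()
    if media_type != "application/octet-stream":
        problems.append(f"Content-Type is {media_type or 'missing'}")
    if normalized.get("x-content-type-options") != "nosniff":
        problems.append("X-Content-Type-Options is not nosniff")
    if "sandbox" not in normalized.get("content-security-policy", ""):
        problems.append("Content-Security-Policy lacks the sandbox directive")
    return problems
```

The headers can come from any HTTP client, for example the headers attribute of the response returned by urllib.request.urlopen for a known cache URL on the instance. An empty list means all four checks passed.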