CVE-2026-45315 Overview
CVE-2026-45315 is a stored cross-site scripting (XSS) vulnerability in Open WebUI, a self-hosted artificial intelligence platform designed to run entirely offline. Versions prior to 0.9.3 derive the file extension from the user-supplied filename when handling audio transcription uploads. The /cache/{path} route then serves these files through FileResponse, setting Content-Type based on the on-disk extension and omitting any Content-Disposition header. An authenticated user with the default-enabled chat.stt permission can upload a polyglot WAV+HTML file and trick another user into opening it, executing attacker-controlled script in the Open WebUI origin [CWE-79].
Critical Impact
A verified user can hijack other users' sessions, steal API keys, and execute actions in the Open WebUI origin by delivering a malicious URL.
Affected Products
- Open WebUI versions prior to 0.9.3
- Deployments with the default-on chat.stt (speech-to-text) permission
- Self-hosted Open WebUI instances exposed to multiple verified users
Discovery Timeline
- 2026-05-15 - CVE-2026-45315 published to NVD
- 2026-05-19 - Last updated in NVD database
Technical Details for CVE-2026-45315
Vulnerability Analysis
The vulnerability resides in the audio transcription upload endpoint in Open WebUI. The endpoint trusts the file extension from the user-supplied filename and writes the upload to CACHE_DIR/audio/transcriptions/. The /cache/{path} route returns the stored file using FastAPI's FileResponse, which derives Content-Type from the on-disk file extension. Because no Content-Disposition: attachment header is emitted, browsers render the response inline.
An attacker with the default chat.stt permission uploads a polyglot file that is simultaneously a valid WAV audio file and valid HTML. When the attacker names the file pwn.html, the server stores it with an .html extension and later serves it as text/html. When a victim opens the resulting URL, embedded <script> content executes in the Open WebUI origin, granting access to session cookies, local storage tokens, and any in-app API.
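The extension-to-MIME mapping described above can be illustrated with Python's standard mimetypes module, which Starlette's FileResponse consults when no explicit media type is supplied. This is a minimal sketch, not Open WebUI's code: the helper name served_content_type and the sample filenames are hypothetical.

```python
import mimetypes


def served_content_type(stored_name: str) -> str:
    """Guess the Content-Type an extension-driven file server would emit."""
    guessed, _encoding = mimetypes.guess_type(stored_name)
    # Fall back to a generic binary type when the extension is unknown.
    return guessed or "application/octet-stream"


if __name__ == "__main__":
    # The file contents are never inspected; only the name decides the type.
    for name in ("recording.wav", "pwn.html"):
        print(name, "->", served_content_type(name))
```

Because only the stored name is consulted, a file whose bytes are valid WAV audio is still labeled text/html once it carries an .html extension.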
Root Cause
The root cause is improper neutralization of input during web page generation [CWE-79], combined with insufficient validation of upload metadata. The application trusts the client-supplied filename extension to determine storage naming and downstream MIME handling. No allowlist constrains accepted extensions to audio formats, and no Content-Disposition header forces download semantics.
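One way to close the gap described above is to check the client-supplied extension against an audio allowlist and to generate the storage name on the server. The sketch below is illustrative only and is not the vendor's patch: ALLOWED_AUDIO_EXTENSIONS and safe_storage_name are hypothetical names, and the set of extensions is an assumption to adjust per deployment.

```python
import uuid
from pathlib import PurePosixPath

# Hypothetical allowlist; adjust to the audio formats your deployment accepts.
ALLOWED_AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".flac", ".webm"}


def safe_storage_name(client_filename: str) -> str:
    """Return a server-generated name with an allowlisted audio extension.

    Raises ValueError when the client-supplied extension is not allowlisted.
    """
    ext = PurePosixPath(client_filename).suffix.lower()
    if ext not in ALLOWED_AUDIO_EXTENSIONS:
        raise ValueError(f"unsupported audio extension: {ext!r}")
    # Discard the client-chosen base name and keep only the vetted extension.
    return f"{uuid.uuid4().hex}{ext}"
```

With this check, an upload named pwn.html is rejected before it reaches disk, so the cache route never has an .html file to serve.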
Attack Vector
Exploitation requires a verified account with the default-enabled speech-to-text permission and user interaction from the victim. The attacker uploads a crafted polyglot file through the transcription endpoint, then shares the resulting /cache/...html URL with another user via chat, email, or a shared workspace. Opening the link runs the attacker's JavaScript in the Open WebUI origin, enabling session theft and arbitrary actions on behalf of the victim.
No verified proof-of-concept code has been published. See the GitHub Security Advisory GHSA-m8f9-9whg-f4xr for vendor details.
Detection Methods for CVE-2026-45315
Indicators of Compromise
- Files stored under CACHE_DIR/audio/transcriptions/ with non-audio extensions such as .html, .htm, .svg, or .xml
- HTTP requests to /cache/audio/transcriptions/*.html returning Content-Type: text/html
- Audio transcription uploads where the filename extension does not match the audio MIME type
Detection Strategies
- Inspect Open WebUI access logs for GET /cache/audio/transcriptions/ responses with text/html, image/svg+xml, or other scriptable content types
- Audit the transcription cache directory for files whose magic bytes do not match their extension
- Alert on outbound requests from browser sessions to attacker-controlled hosts immediately after a user opens a /cache/ URL
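The access-log inspection above can be approximated by flagging requests under /cache/audio/transcriptions/ whose path does not end in an audio extension. This is a rough sketch that assumes the reverse proxy writes the common "combined" log format. The function name, the extension set, and the sample paths in the usage note are hypothetical. Default log formats usually omit the response Content-Type, so the sketch keys on the request path instead.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical allowlist of extensions expected in the transcription cache.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".flac", ".webm"}
CACHE_PREFIX = "/cache/audio/transcriptions/"

# Matches the quoted request line of a combined-format access log entry.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<target>\S+) HTTP/[\d.]+"')


def suspicious_cache_requests(lines):
    """Return request paths under the cache prefix with non-audio extensions."""
    hits = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue
        # Drop any query string before inspecting the extension.
        path = urlsplit(match.group("target")).path
        if not path.startswith(CACHE_PREFIX):
            continue
        if PurePosixPath(path).suffix.lower() not in AUDIO_EXTENSIONS:
            hits.append(path)
    return hits
```

Feeding the function the lines of an access log returns entries such as /cache/audio/transcriptions/pwn.html, which can then be correlated with the requesting user and source address.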
Monitoring Recommendations
- Forward Open WebUI application and reverse-proxy logs to a centralized SIEM for query and correlation
- Track creation of files under CACHE_DIR/audio/transcriptions/ and flag any extension outside an audio allowlist
- Monitor session token usage from unexpected user agents or geolocations following cache URL access
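The file-level checks above can be sketched as a periodic audit of the transcription cache. The example assumes the caller passes the resolved CACHE_DIR/audio/transcriptions/ path. The function name, the extension allowlist, and the short list of magic-byte signatures are hypothetical and not exhaustive (for example, MP3 files without an ID3 tag and M4A files do not start with any of the listed signatures), so the second finding type is a prompt for review rather than proof of abuse.

```python
from pathlib import Path

# Hypothetical allowlist of extensions expected in the transcription cache.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".flac", ".webm"}

# Leading signatures of a few common audio containers (not exhaustive).
AUDIO_MAGIC = (b"RIFF", b"ID3", b"OggS", b"fLaC", b"\x1a\x45\xdf\xa3")


def audit_transcription_cache(cache_dir):
    """Return (file name, reason) pairs for files that deserve a closer look."""
    findings = []
    for path in sorted(Path(cache_dir).rglob("*")):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            head = handle.read(16)
        has_audio_header = head.startswith(AUDIO_MAGIC)
        if path.suffix.lower() not in AUDIO_EXTENSIONS:
            reason = "non-audio extension"
            if has_audio_header:
                # Audio bytes under a browser-renderable name suggest a polyglot.
                reason += " with audio header (possible polyglot)"
            findings.append((path.name, reason))
        elif not has_audio_header:
            findings.append((path.name, "audio extension, unrecognized header"))
    return findings
```

Note that a WAV+HTML polyglot begins with a valid RIFF header, so a magic-byte check alone would pass it; the extension check is what flags it.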
How to Mitigate CVE-2026-45315
Immediate Actions Required
- Upgrade Open WebUI to version 0.9.3 or later, which contains the vendor fix
- Revoke the default chat.stt permission for untrusted user roles until upgrade is complete
- Purge existing contents of CACHE_DIR/audio/transcriptions/ to remove any planted polyglot files
- Rotate session secrets and force re-authentication to invalidate potentially stolen sessions
Patch Information
The vendor fixed CVE-2026-45315 in Open WebUI 0.9.3. Upgrade guidance and the full advisory are available at the Open WebUI Security Advisory GHSA-m8f9-9whg-f4xr.
Workarounds
- Restrict the chat.stt permission so only trusted users can submit transcription uploads
- Place a reverse proxy in front of Open WebUI that forces Content-Disposition: attachment and a restrictive Content-Security-Policy on /cache/ responses
- Configure the reverse proxy to override Content-Type to application/octet-stream for files served from /cache/audio/transcriptions/
- Limit Open WebUI access to authenticated users on trusted networks until the patch is applied
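# Example nginx override forcing safe headers on Open WebUI cache responses.
# "open_webui_upstream" is a placeholder for your own upstream definition.
# This matches every /cache/ response; narrow the location to
# /cache/audio/transcriptions/ if other cached media must still render inline.
location /cache/ {
    proxy_pass http://open_webui_upstream;

    # Drop the upstream values so the overrides below are the only ones sent.
    proxy_hide_header Content-Type;
    proxy_hide_header Content-Disposition;

    # Serve as an opaque download and disable MIME sniffing and scripting.
    add_header Content-Type "application/octet-stream" always;
    add_header Content-Disposition "attachment" always;
    add_header Content-Security-Policy "default-src 'none'; sandbox" always;
    add_header X-Content-Type-Options "nosniff" always;
}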
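After applying a header override like the one above, it helps to confirm that cache responses can no longer render inline. The sketch below checks a set of response headers, however they were captured, against the three properties the workaround relies on. The function name and the list of scriptable types are hypothetical choices for illustration.

```python
# Media types a browser may render with script execution (illustrative list).
SCRIPTABLE_TYPES = {
    "text/html",
    "application/xhtml+xml",
    "image/svg+xml",
    "text/xml",
    "application/xml",
}


def cache_header_problems(headers):
    """Return a list of reasons the given response headers are still risky."""
    normalized = {key.lower(): value.strip() for key, value in headers.items()}
    problems = []

    # Ignore parameters such as "; charset=utf-8" when comparing the type.
    media_type = normalized.get("content-type", "").split(";", 1)[0].strip().lower()
    if media_type in SCRIPTABLE_TYPES:
        problems.append(f"scriptable Content-Type: {media_type}")

    if not normalized.get("content-disposition", "").lower().startswith("attachment"):
        problems.append("Content-Disposition is not attachment")

    if normalized.get("x-content-type-options", "").lower() != "nosniff":
        problems.append("X-Content-Type-Options is not nosniff")

    return problems
```

An empty list means the response is delivered as a download with sniffing disabled; any entry indicates the override is not taking effect for that URL.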


