
CVE-2026-5497: vLLM Out-of-Memory DoS Vulnerability

CVE-2026-5497 is an out-of-memory denial of service flaw in vLLM 0.8.0 and later that allows unauthenticated attackers to crash servers via malicious video data URLs. This article covers technical details, impact, and mitigation.

Published: June 11, 2026

CVE-2026-5497 Overview

CVE-2026-5497 is an unauthenticated Denial of Service (DoS) vulnerability in vLLM versions 0.8.0 and later. The flaw resides in the VideoMediaIO.load_base64() method, which processes video/jpeg data URLs without enforcing a frame count limit. An attacker can submit a single API request containing thousands of comma-separated base64-encoded JPEG frames. The server decodes every frame into memory, exhausts available RAM, and crashes. The vulnerability is reachable through the OpenAI-compatible chat completions API, making any internet-exposed vLLM inference endpoint a target. The weakness is categorized under [CWE-400] Uncontrolled Resource Consumption.

Critical Impact

Unauthenticated attackers can crash vLLM inference servers with a single crafted HTTP request, disrupting any LLM-backed application or service.

Affected Products

  • vLLM version 0.8.0 and later, prior to the release that contains the upstream fix
  • vLLM deployments exposing the OpenAI-compatible chat completions API
  • Any application stack consuming vLLM as an inference backend with multimodal video input enabled

Discovery Timeline

  • 2026-06-11 - CVE-2026-5497 published to NVD
  • 2026-06-11 - Last updated in NVD database

Technical Details for CVE-2026-5497

Vulnerability Analysis

The vulnerability lies in how vLLM handles multimodal input, specifically the parsing of video/jpeg data URLs. When a client submits a chat completion request containing an embedded video data URL, VideoMediaIO.load_base64() splits the base64 payload on comma delimiters. Each resulting segment is treated as an individual JPEG frame and decoded into a memory buffer. No upper bound is applied to the number of frames extracted from a single request. An attacker can submit a request containing thousands of comma-separated segments, forcing the server to allocate memory proportional to the attacker-controlled frame count. Memory pressure escalates rapidly, triggering an Out-of-Memory (OOM) condition that terminates the inference worker.
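The amplification described above can be made concrete with a rough estimate. The sketch below is not vLLM code: the helper names are hypothetical, and the 640x480 RGB frame size is an assumption chosen only for illustration. It counts the comma-delimited segments in a payload and multiplies by the size of one decoded 8-bit frame.

```python
def count_segments(payload: str) -> int:
    """Number of comma-delimited segments in the base64 part of a data URL."""
    return payload.count(",") + 1


def estimated_decoded_bytes(frames: int, width: int, height: int, channels: int = 3) -> int:
    """Bytes needed to hold `frames` decoded 8-bit frames in memory at once."""
    return frames * width * height * channels


# Hypothetical request: 5,000 segments. "AAAA" is only a placeholder for a
# base64 JPEG; each frame is ASSUMED to decode to 640x480 RGB.
payload = ",".join(["AAAA"] * 5000)
frames = count_segments(payload)
total_bytes = estimated_decoded_bytes(frames, width=640, height=480)

print(f"frames: {frames}")
print(f"estimated decoded size: {total_bytes / 2**30:.2f} GiB")
```

Under these assumed dimensions, 5,000 frames come to about 4.3 GiB of raw pixel data. Because JPEG is a compressed format, the request that carries those frames can be far smaller than the memory needed to hold them once decoded.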

Root Cause

The root cause is missing input validation on a user-controlled resource quantity. VideoMediaIO.load_base64() derives the frame count directly from the structure of the input string. No configuration parameter, hard-coded ceiling, or memory budget gate prevents pathological inputs from being fully expanded. This pattern matches [CWE-400] Uncontrolled Resource Consumption.

Attack Vector

The attack is delivered over the network and requires no authentication and no user interaction. An attacker sends a POST request to the vLLM /v1/chat/completions endpoint with a message containing a video/jpeg data URL composed of many comma-delimited base64 JPEG segments. The server enumerates and decodes each segment, exhausts host memory, and the kernel OOM killer terminates the vLLM process. Repeated requests prevent recovery, sustaining the denial of service.

Verified proof-of-concept details are documented in the Huntr Bounty Report. The upstream fix is available in the GitHub Commit Details.

Detection Methods for CVE-2026-5497

Indicators of Compromise

  • Inbound HTTP requests to /v1/chat/completions containing data:video/jpeg;base64, payloads with abnormally high comma counts.
  • vLLM worker processes terminated by the Linux OOM killer with Out of memory: Killed process entries in dmesg or journalctl.
  • Sudden spikes in resident set size (RSS) for the vLLM Python process immediately preceding a crash.
  • Repeated 5xx responses or connection resets from the vLLM API endpoint following a single malformed request.

Detection Strategies

  • Inspect request bodies at an API gateway or reverse proxy and flag video/jpeg data URLs whose payload size or comma count exceeds operational baselines.
  • Correlate process termination events on vLLM hosts with preceding API requests to identify the triggering client and payload signature.
  • Monitor host-level memory utilization for sharp, short-duration spikes that align with single inbound requests.
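The request-inspection strategy can be sketched as a small check that a gateway plugin or log processor might run. This is a minimal illustration, not part of vLLM or any gateway product: the function names and the threshold of 64 segments are assumptions, and the code walks the whole JSON body instead of relying on a particular message schema.

```python
import json
from typing import Any, Iterator

VIDEO_JPEG_PREFIX = "data:video/jpeg;base64,"


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string value found anywhere in a parsed JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def max_video_segment_count(raw_body: bytes) -> int:
    """Largest comma-delimited segment count in any video/jpeg data URL (0 if none)."""
    try:
        body = json.loads(raw_body)
    except (ValueError, RecursionError):
        return 0  # not parseable JSON; leave the decision to other controls
    best = 0
    for text in iter_strings(body):
        if text.startswith(VIDEO_JPEG_PREFIX):
            payload = text[len(VIDEO_JPEG_PREFIX):]
            best = max(best, payload.count(",") + 1)
    return best


def is_suspicious(raw_body: bytes, threshold: int = 64) -> bool:
    """Flag bodies whose segment count exceeds an ASSUMED operational baseline."""
    return max_video_segment_count(raw_body) > threshold
```

Counting commas avoids decoding any base64 at the inspection layer, so the check itself stays cheap. The threshold should be tuned to the frame counts that legitimate clients actually send.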

Monitoring Recommendations

  • Enable structured access logging on the vLLM endpoint and ship logs to a SIEM for retention and query.
  • Alert on oom-killer kernel events targeting vLLM or its Python interpreter.
  • Track request rate, payload size distribution, and 5xx error rate per source IP to surface abuse patterns.
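Alerting on OOM kills can start from a simple log parser. The sketch below matches the "Out of memory: Killed process" kernel message quoted in the indicators above; the exact fields that follow it vary by kernel version, so only the PID and process name are extracted. The process-name prefixes used to decide whether a kill is vLLM-related are assumptions and should be adjusted to the deployment.

```python
import re
from typing import NamedTuple, Optional

# Case-insensitive so that cgroup-scoped messages ("Memory cgroup out of
# memory: Killed process ...") are matched as well.
OOM_KILL_RE = re.compile(
    r"out of memory: Killed process (?P<pid>\d+) \((?P<comm>[^)]*)\)",
    re.IGNORECASE,
)


class OomKill(NamedTuple):
    pid: int
    comm: str  # kernel process name, which may be truncated


def parse_oom_kill(line: str) -> Optional[OomKill]:
    """Return the killed PID and process name, or None if the line is unrelated."""
    match = OOM_KILL_RE.search(line)
    if match is None:
        return None
    return OomKill(pid=int(match.group("pid")), comm=match.group("comm"))


def is_vllm_related(kill: OomKill, prefixes: tuple = ("vllm", "python")) -> bool:
    """ASSUMED naming: vLLM workers show up as 'vllm...' or 'python...'."""
    return kill.comm.lower().startswith(prefixes)
```

A log shipper could feed each line from dmesg or journalctl through parse_oom_kill and raise an alert when is_vllm_related returns True.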

How to Mitigate CVE-2026-5497

Immediate Actions Required

  • Upgrade vLLM to the patched release that includes commit 58ee61422169ce17e08248f8efa1e9df434fe395.
  • Restrict network exposure of the vLLM API to authenticated clients via a reverse proxy or service mesh.
  • Enforce request body size limits at the ingress layer to cap the maximum payload an attacker can submit.
  • Disable multimodal video input on deployments that do not require it.

Patch Information

The upstream fix introduces a frame count limit in VideoMediaIO.load_base64(), preventing unbounded expansion of comma-delimited base64 segments. Review the GitHub Commit Details for the exact code change and integrate it into your build pipeline. Operators running forks should backport the change rather than relying on configuration alone.
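The general shape of a frame count limit can be sketched as follows. This is not the upstream patch, and the names and the ceiling of 32 frames are hypothetical; consult the linked commit for the actual change. The point of the sketch is the ordering: the segment count is checked before any segment is decoded.

```python
import base64
from typing import List

MAX_FRAMES = 32  # hypothetical ceiling, not the value used upstream


class TooManyFramesError(ValueError):
    """Raised when a payload contains more segments than the configured limit."""


def split_frames_bounded(payload: str, max_frames: int = MAX_FRAMES) -> List[bytes]:
    """Decode comma-delimited base64 segments, rejecting oversized inputs first."""
    # Counting commas is cheap and allocates nothing proportional to frame size.
    frame_count = payload.count(",") + 1
    if frame_count > max_frames:
        raise TooManyFramesError(
            f"payload has {frame_count} segments; limit is {max_frames}"
        )
    return [base64.b64decode(segment) for segment in payload.split(",")]
```

Rejecting the request up front means an oversized payload costs one pass over the string instead of thousands of image decodes.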

Workarounds

  • Place vLLM behind an API gateway that rejects requests whose video/jpeg data URLs contain more than a small, fixed number of commas.
  • Apply a maximum request body size (for example, 1 MB) at the reverse proxy to limit attacker leverage.
  • Run vLLM under a cgroup or container with a memory limit so OOM events terminate only the worker rather than destabilizing the host.
  • Require authentication and per-client rate limiting on the chat completions endpoint until patching is complete.
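```nginx
# Example NGINX ingress hardening for vLLM

# http context: shared rate-limit zone keyed by client address
limit_req_zone $binary_remote_addr zone=vllm:10m rate=10r/s;

# server context: cap the request body size
client_max_body_size 1m;

location /v1/chat/completions {
    # Allow short bursts, reject sustained floods from one client
    limit_req zone=vllm burst=20 nodelay;
    proxy_pass http://vllm_upstream;
    proxy_read_timeout 60s;
}
```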

Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.

Vulnerability Details

  • Type: DoS
  • Vendor/Tech: vLLM
  • Severity: High
  • CVSS Score: 7.5
  • Known Exploited: No

CVSS Vector

  • CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H

Impact Assessment

  • Confidentiality: None
  • Integrity: None
  • Availability: High

CWE References

  • CWE-400

Technical References

  • GitHub Commit Details
  • Huntr Bounty Report

Related CVEs

  • CVE-2026-9540: vllm-project vllm DoS Vulnerability
  • CVE-2026-44223: Vllm Vllm DOS Vulnerability
  • CVE-2026-44222: Vllm Token Injection DoS Vulnerability
  • CVE-2026-34755: vLLM Engine DoS Vulnerability
©2026 SentinelOne, All Rights Reserved.