
CVE-2026-9540: vllm-project vllm DoS Vulnerability

CVE-2026-9540 is a denial of service flaw in vllm-project vllm 0.19.0 affecting the OpenAI-compatible serving path. This post covers the technical details, affected versions, and mitigation strategies.

Published: May 28, 2026

CVE-2026-9540 Overview

CVE-2026-9540 is a denial of service vulnerability affecting vllm-project vllm version 0.19.0. The flaw resides in the OpenAI-compatible Serving Path component, where a remote attacker can manipulate request processing to disrupt service availability; the advisory does not specify which operation is affected. No authentication or user interaction is required to trigger the condition, so any client with network access to the endpoint can launch the attack. The issue is classified under CWE-404 (Improper Resource Shutdown or Release). Public exploit details are available, and a pull request to remediate the defect is pending acceptance upstream.

Critical Impact

Remote, unauthenticated attackers can degrade or disable vLLM inference endpoints exposed through the OpenAI-compatible API, interrupting downstream LLM-dependent applications.

Affected Products

  • vllm-project vllm 0.19.0
  • Deployments exposing the OpenAI-compatible Serving Path
  • LLM inference services built on the affected vLLM release

Discovery Timeline

  • 2026-05-26 - CVE-2026-9540 published to NVD
  • 2026-05-26 - Last updated in NVD database

Technical Details for CVE-2026-9540

Vulnerability Analysis

The vulnerability is in the OpenAI-compatible Serving Path of vLLM 0.19.0. This component exposes REST endpoints that mimic the OpenAI API, so clients can send completion, chat, and embedding requests. An attacker can submit crafted input to these endpoints and trigger a denial of service condition. The defect maps to CWE-404, which indicates that resources allocated during request handling are not properly released or shut down. As a result, repeated abusive requests can exhaust process resources or stall request servicing. The Exploit Prediction Scoring System (EPSS) probability for this issue is currently low at 0.06%, but exploit material is already public per the VulDB vulnerability record.

Root Cause

The root cause is improper resource shutdown or release in the OpenAI-compatible Serving Path handlers. The component fails to reclaim resources tied to certain request patterns, leading to availability degradation over time or under repeated attacks. The upstream maintainers have proposed a fix in pull request #37594, tracked against issue #37343.
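
The general CWE-404 pattern can be illustrated with a small, generic sketch. This is not the vLLM code and does not reproduce the actual defect, which the advisory leaves unspecified. `SlotPool`, `handle_leaky`, `handle_safe`, and `aborted_request` are hypothetical names used only to show how a request that fails partway through can leave a resource held when the release step is not guaranteed to run.

```python
class SlotPool:
    """Hypothetical fixed-size pool standing in for any per-request resource."""

    def __init__(self, size):
        self.size = size
        self.in_use = 0

    def acquire(self):
        if self.in_use >= self.size:
            raise RuntimeError("no free slots")
        self.in_use += 1

    def release(self):
        self.in_use -= 1


def handle_leaky(pool, work):
    """CWE-404 pattern: release is skipped whenever work() raises."""
    pool.acquire()
    result = work()
    pool.release()
    return result


def handle_safe(pool, work):
    """Release is guaranteed by try/finally, even if work() raises."""
    pool.acquire()
    try:
        return work()
    finally:
        pool.release()


def aborted_request():
    """Stand-in for a request that fails partway through processing."""
    raise ConnectionError("client disconnected mid-request")


def run_aborted(handler, pool, attempts):
    """Send several failing requests and report how many slots stay held."""
    for _ in range(attempts):
        try:
            handler(pool, aborted_request)
        except (ConnectionError, RuntimeError):
            pass
    return pool.in_use
```

With a pool of two slots, two aborted requests through `handle_leaky` leave both slots held, so every later request is refused. The same traffic through `handle_safe` leaves the pool empty.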

Attack Vector

The attack vector is network-based and requires no authentication or user interaction. An attacker reaches the affected serving endpoint over HTTP and submits malformed or abusive input. Because vLLM is frequently deployed as a backend for chat assistants, RAG pipelines, and agentic systems, a successful attack interrupts dependent services. Public discussion of vLLM latency and resource behavior is available on the Ingero blog. Refer to the VulDB submission record for additional disclosure context.

Detection Methods for CVE-2026-9540

Indicators of Compromise

  • Spikes in request latency or worker stalls on vLLM OpenAI-compatible endpoints with no corresponding legitimate traffic increase
  • Repeated requests to /v1/completions, /v1/chat/completions, or /v1/embeddings from a small set of source IPs preceding service degradation
  • Growth in process memory, file descriptors, or thread counts in vLLM workers without recovery between requests
  • Unexplained restarts or health-check failures of vLLM serving processes
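
The "growth without recovery" indicator can be checked mechanically. The following is a minimal sketch, assuming periodic samples of one worker resource (for example resident memory or open file descriptors) have already been collected by some other tool. The function name and both thresholds are illustrative assumptions, not values from the advisory.

```python
def grows_without_recovery(samples, min_growth=1.5, max_dip=0.05):
    """Return True if a resource series climbs and never meaningfully drops.

    samples    -- chronological readings of one resource for one worker
    min_growth -- final reading must be at least this multiple of the first
    max_dip    -- largest tolerated drop below the running peak (fraction)
    """
    if len(samples) < 3 or samples[0] <= 0:
        return False  # too little data to judge a trend

    peak = samples[0]
    for value in samples[1:]:
        if value < peak * (1 - max_dip):
            return False  # the resource was released, so this is normal churn
        peak = max(peak, value)

    return samples[-1] >= samples[0] * min_growth
```

A series such as `[100, 120, 150, 149, 180]` is flagged, while `[100, 180, 105, 170, 110]` is not, because the drop back toward the baseline shows the resource being released between requests.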

Detection Strategies

  • Baseline normal request volume and latency for the OpenAI-compatible endpoints and alert on sustained deviations
  • Inspect HTTP access logs for high-rate or anomalously structured requests targeting the serving path
  • Correlate vLLM worker resource metrics with request-level telemetry to identify resource non-release patterns
  • Track CVE-2026-9540 indicators against threat intelligence sourced from the VulDB CTI feed
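
The access-log inspection above can be sketched as a sliding-window counter per source IP. This sketch assumes the reverse proxy writes the common/combined log format with English month names and that each client's lines arrive in chronological order. The window and threshold are placeholders to tune against a measured baseline.

```python
import re
from collections import defaultdict, deque
from datetime import datetime

# Matches the start of a common/combined log line: ip, timestamp, method, path.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)'
)

# Endpoints named in the indicators of compromise.
WATCHED_PATHS = ("/v1/completions", "/v1/chat/completions", "/v1/embeddings")


def flag_noisy_clients(lines, window_s=10, threshold=100):
    """Return the set of IPs exceeding `threshold` watched requests per window."""
    recent = defaultdict(deque)  # ip -> timestamps inside the current window
    flagged = set()

    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue
        path = match["path"].split("?", 1)[0]
        if path not in WATCHED_PATHS:
            continue

        ts = datetime.strptime(match["ts"], "%d/%b/%Y:%H:%M:%S %z").timestamp()
        window = recent[match["ip"]]
        window.append(ts)
        # Drop timestamps that have aged out of the window.
        while window and ts - window[0] > window_s:
            window.popleft()
        if len(window) > threshold:
            flagged.add(match["ip"])

    return flagged
```

Flagged addresses are candidates for review rather than proof of an attack, since a legitimate batch client can produce the same pattern.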

Monitoring Recommendations

  • Export vLLM Prometheus metrics and alert on queue depth, GPU utilization stalls, and pending request counts
  • Forward reverse-proxy and API gateway logs to a centralized analytics platform for rate and pattern analysis
  • Monitor container restart counts and OOM events on hosts running vLLM 0.19.0
  • Add synthetic probes against inference endpoints to detect availability loss quickly
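
As a rough illustration of alerting on pending request counts, the sketch below parses Prometheus text exposition output and compares one gauge against a threshold. The metric name `vllm:num_requests_waiting` and the threshold of 50 are assumptions to verify against the metrics your deployed vLLM version actually exports. The parser is deliberately simple and assumes samples carry no trailing timestamp.

```python
def sum_metric(exposition_text, metric):
    """Sum all series of `metric` in Prometheus text format, or None if absent."""
    total = 0.0
    seen = False
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]  # strip the label set, if any
        if name == metric:
            total += float(value)
            seen = True
    return total if seen else None


def backlog_alert(exposition_text, max_waiting=50,
                  metric="vllm:num_requests_waiting"):
    """Return True when the number of queued requests exceeds max_waiting."""
    waiting = sum_metric(exposition_text, metric)
    return waiting is not None and waiting > max_waiting
```

In production this check would normally live in a Prometheus alerting rule. The Python form is only meant to make the logic explicit.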

How to Mitigate CVE-2026-9540

Immediate Actions Required

  • Inventory all vLLM deployments and identify any instances running version 0.19.0 exposing the OpenAI-compatible Serving Path
  • Restrict network exposure of vLLM inference endpoints to trusted clients using firewalls, VPNs, or service mesh policies
  • Place an authenticating reverse proxy or API gateway in front of vLLM to require credentials and enforce request validation
  • Apply rate limiting and request size limits at the gateway to reduce abuse surface until a patched release is available
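
For the inventory step, a small script can report whether the vLLM package installed in a given Python environment matches the affected release. This is a minimal sketch: the advisory names only 0.19.0, so a non-matching version is not proof that an instance is unaffected. The helper names are hypothetical.

```python
from importlib import metadata

# Versions named as affected in the advisory.
AFFECTED_VERSIONS = {"0.19.0"}


def normalize(version):
    """Drop any local build suffix such as '+cu124' before comparing."""
    return version.split("+", 1)[0].strip()


def is_affected(version):
    """Return True if the version string matches an affected release."""
    return normalize(version) in AFFECTED_VERSIONS


def installed_vllm_status():
    """Return (version, affected) for the local install, or None if absent."""
    try:
        version = metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None
    return version, is_affected(version)


if __name__ == "__main__":
    status = installed_vllm_status()
    if status is None:
        print("vllm is not installed in this environment")
    else:
        print(f"vllm {status[0]}: affected={status[1]}")
```

Running the script inside each container image or virtual environment gives a per-deployment answer that can be fed into an asset inventory.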

Patch Information

At the time of publication, the upstream fix is staged in vllm pull request #37594 and awaits acceptance. Track the vllm-project repository and issue #37343 for the merged commit and a tagged release that includes the fix. Upgrade to the first vLLM release that incorporates the merged pull request once it is published.

Workarounds

  • Terminate client connections with strict timeouts at the reverse proxy to limit resource hold time on vLLM workers
  • Run vLLM under a process supervisor with resource limits (cgroups, Kubernetes requests/limits) and automatic restart on failure
  • Block or throttle anonymous traffic to the OpenAI-compatible endpoints and require API keys validated upstream
  • Isolate vLLM workloads in dedicated namespaces or nodes so that a denial of service event does not impact unrelated services
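
```nginx
# Example NGINX snippet to rate-limit and bound request size in front of vLLM
http {
    # Track clients by IP and allow each a sustained 10 requests per second
    limit_req_zone $binary_remote_addr zone=vllm_rl:10m rate=10r/s;
    # Reject request bodies larger than 1 MB
    client_max_body_size 1m;

    # Assumed backend address; adjust to where vLLM actually listens
    upstream vllm_backend {
        server 127.0.0.1:8000;
    }

    server {
        listen 443 ssl;
        # ssl_certificate and ssl_certificate_key must also be set here
        location /v1/ {
            # Allow bursts of up to 20 extra requests without delay; reject the rest
            limit_req zone=vllm_rl burst=20 nodelay;
            proxy_read_timeout 30s;
            proxy_send_timeout 30s;
            proxy_pass http://vllm_backend;
        }
    }
}
```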
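
To reason about what `rate=10r/s` with `burst=20 nodelay` admits, the following is a simplified model of leaky-bucket accounting in the style of NGINX `limit_req`. It is an approximation written for illustration, not NGINX source code, and the class name is hypothetical. Time is passed in explicitly so the behavior is easy to test.

```python
class LeakyBucket:
    """Simplified per-client model of rate plus burst limiting with nodelay."""

    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s
        self.burst = burst
        self.excess = 0.0   # requests currently counted above the steady rate
        self.last = None    # time of the last accepted request

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) is accepted."""
        if self.last is None:
            self.last = now
            return True

        elapsed = max(0.0, now - self.last)
        # The bucket drains at `rate`; each new request adds one unit.
        excess = max(0.0, self.excess - self.rate * elapsed) + 1.0
        if excess > self.burst:
            return False  # rejected requests do not change the bucket state

        self.excess = excess
        self.last = now
        return True
```

Under this model, a client sending 30 requests at the same instant gets 21 accepted (one plus the burst of 20) and 9 rejected. One second later the bucket has drained by 10, so 10 more are accepted before rejections resume.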

Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.

Vulnerability Details

  • Type: DoS
  • Vendor/Tech: vLLM
  • Severity: MEDIUM
  • CVSS Score: 5.5
  • EPSS Probability: 0.06%
  • Known Exploited: No

CVSS Vector

  • CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N/E:P/CR:X/IR:X/AR:X/MAV:X/MAC:X/MAT:X/MPR:X/MUI:X/MVC:X/MVI:X/MVA:X/MSC:X/MSI:X/MSA:X/S:X/AU:X/R:X/V:X/RE:X/U:X

Impact Assessment

  • Confidentiality: None (VC:N in the CVSS vector)
  • Integrity: None
  • Availability: Low

CWE References

  • CWE-404

Technical References

  • GitHub vLLM Project Repository
  • GitHub vLLM Issue #37343
  • GitHub vLLM Pull Request #37594
  • Ingero Blog on vLLM Latency
  • VulDB Submission #814645
  • VulDB Vulnerability #365601
  • VulDB CTI for #365601

Related CVEs

  • CVE-2026-44223: vLLM DoS Vulnerability
  • CVE-2026-44222: vLLM Token Injection DoS Vulnerability
  • CVE-2026-34755: vLLM Engine DoS Vulnerability
  • CVE-2026-34756: vLLM OpenAI API Server DoS Vulnerability