CVE-2026-34755 Overview
A resource exhaustion vulnerability exists in vLLM, an inference and serving engine for large language models (LLMs). The vulnerability is located in the VideoMediaIO.load_base64() method within vllm/multimodal/media/video.py. This method processes video/jpeg data URLs by splitting them by comma to extract individual JPEG frames but fails to enforce a frame count limit. The num_frames parameter (default: 32), which is enforced by the load_bytes() code path, is completely bypassed in the video/jpeg base64 path. This allows an attacker to send a single API request containing thousands of comma-separated base64-encoded JPEG frames, causing the server to decode all frames into memory and crash with an Out-of-Memory (OOM) condition.
Critical Impact
An authenticated attacker can crash vLLM inference servers through memory exhaustion by sending a single malicious API request containing excessive base64-encoded JPEG frames, leading to denial of service.
Affected Products
- vLLM versions 0.7.0 through 0.18.x
- vLLM deployments using multimodal video processing capabilities
- Systems accepting video/jpeg data URLs via the vLLM API
Discovery Timeline
- 2026-04-06 - CVE-2026-34755 published to NVD
- 2026-04-07 - Last updated in NVD database
Technical Details for CVE-2026-34755
Vulnerability Analysis
This vulnerability falls under CWE-770 (Allocation of Resources Without Limits or Throttling). The core issue lies in inconsistent input validation between two code paths handling video data. While the load_bytes() method properly enforces the num_frames limit (defaulting to 32 frames), the load_base64() method completely bypasses this validation when processing base64-encoded JPEG frames.
The vulnerable function parses comma-separated base64 data without any upper bound on the number of segments it will process. Each decoded JPEG frame consumes memory proportional to its decoded size, and when thousands of frames are submitted in a single request, the cumulative memory allocation rapidly exhausts available system memory.
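The asymmetry between the two code paths can be illustrated with a simplified sketch. The helper names below are hypothetical and do not reproduce vLLM's actual implementation; they only contrast a capped decode loop with the unbounded one described above:

```python
import base64

NUM_FRAMES = 32  # default cap enforced on the load_bytes() path

def load_frames_limited(segments, num_frames=NUM_FRAMES):
    # Consistent/patched behavior: decode at most num_frames segments.
    return [base64.b64decode(s) for s in segments[:num_frames]]

def load_frames_unbounded(segments):
    # Vulnerable pattern: every comma-separated segment is decoded,
    # so memory use grows linearly with attacker-controlled input.
    return [base64.b64decode(s) for s in segments]

# Simulate a request carrying far more frames than the intended cap.
frame = base64.b64encode(b"\xff\xd8\xff\xe0" + b"\x00" * 64).decode()
payload = ",".join([frame] * 1000)
segments = payload.split(",")

safe = load_frames_limited(segments)      # 32 frames decoded
unsafe = load_frames_unbounded(segments)  # all 1000 frames decoded
```

With real JPEG frames of meaningful size, the unbounded path's allocations scale with the attacker's segment count rather than with any server-side limit.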
Root Cause
The root cause is a missing bounds check in the VideoMediaIO.load_base64() method. When video data is submitted as a video/jpeg data URL containing base64-encoded frames, the method splits the input by comma and iterates through all resulting segments without enforcing the num_frames limit. This creates an asymmetric vulnerability where one code path (byte loading) is protected while another (base64 loading) is not, allowing attackers to exploit the unprotected path.
Attack Vector
The attack is network-based and requires low-privilege authentication to access the vLLM API. An attacker constructs a malicious API request containing a video/jpeg data URL with thousands of comma-separated base64-encoded JPEG frames. When the server processes this request, the load_base64() method decodes every frame into memory without limitation, causing memory exhaustion and server crash. The attack requires no user interaction and can be executed with a single HTTP request.
The exploitation mechanism involves:
- An authenticated attacker crafts an API request targeting vLLM's multimodal video processing endpoint
- The request contains a video/jpeg data URL with an excessive number of comma-separated base64-encoded JPEG frames (potentially thousands)
- The VideoMediaIO.load_base64() method processes the request, splitting by comma and decoding all frames
- Each decoded frame allocates memory, quickly exhausting available resources
- The server crashes with an OOM error, causing denial of service
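The shape of such a payload can be sketched as follows. The JPEG bytes here are a tiny stub for illustration only, not a real decodable image, and the frame count is an arbitrary example:

```python
import base64

# Stub "frame": JPEG start-of-image + end-of-image markers only.
stub_jpeg = base64.b64encode(b"\xff\xd8\xff\xd9").decode()

# An attacker repeats the frame thousands of times in one data URL.
n_frames = 10_000
data_url = "data:video/jpeg;base64," + ",".join([stub_jpeg] * n_frames)

# A parser that splits on commas sees one segment per frame.
_, _, body = data_url.partition(",")
segment_count = body.count(",") + 1
```

Because the request body is just text, the payload is cheap for the attacker to generate but forces the server to perform one base64 decode and one image decode per segment.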
For technical details, refer to the GitHub Security Advisory.
Detection Methods for CVE-2026-34755
Indicators of Compromise
- Unusually large API requests to vLLM endpoints containing video/jpeg data URLs
- Rapid memory consumption spikes on vLLM server processes
- Server crashes with OOM (Out-of-Memory) errors following multimodal API requests
- API requests containing data URLs with abnormally high numbers of comma-separated base64 segments
Detection Strategies
- Monitor API request sizes to vLLM multimodal endpoints and alert on requests exceeding normal thresholds
- Implement memory consumption monitoring for vLLM server processes with alerts for rapid increases
- Log and analyze requests containing video/jpeg data URLs for unusual patterns
- Deploy request payload inspection to count comma-separated segments in base64 video data
Monitoring Recommendations
- Configure application-level logging to capture request metadata for multimodal endpoints
- Set up infrastructure monitoring to track memory utilization trends on vLLM servers
- Implement rate limiting on API endpoints accepting video data to reduce attack surface
- Enable crash dump analysis to identify OOM conditions related to video processing
How to Mitigate CVE-2026-34755
Immediate Actions Required
- Upgrade vLLM to version 0.19.0 or later where the vulnerability is fixed
- Implement API gateway request size limits to restrict excessively large video data payloads
- Deploy memory limits on vLLM server processes to prevent complete system crashes
- Monitor for exploitation attempts through unusual API traffic patterns
Patch Information
The vulnerability is fixed in vLLM version 0.19.0. The fix ensures that the num_frames limit is enforced consistently across all code paths, including the load_base64() method for processing video/jpeg data URLs. Organizations should upgrade to version 0.19.0 or later. For additional details, see the GitHub Security Advisory.
Workarounds
- Implement API gateway rules to limit the size of incoming requests containing video data
- Deploy request validation middleware to count and limit comma-separated segments in base64 video data
- Configure memory limits (cgroups, container limits) to prevent single requests from exhausting system resources
- Consider temporarily disabling multimodal video processing endpoints if not required
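As a stop-gap at the gateway layer, a reverse proxy can cap request body size before traffic reaches vLLM. The nginx fragment below is a sketch; the 10 MB limit and backend address are assumptions to be sized against your largest legitimate multimodal request:

```nginx
# nginx reverse-proxy fragment: reject oversized payloads before they
# reach the vLLM backend. 10m is an assumed limit, not a recommendation.
server {
    listen 443 ssl;
    client_max_body_size 10m;   # requests above this get HTTP 413

    location /v1/ {
        proxy_pass http://vllm-backend:8000;  # assumed backend address
    }
}
```

This does not remove the vulnerability, since a crafted payload can still fit under any size cap, but it bounds the worst-case memory a single request can demand.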
# Example container memory limit configuration
# Limit vLLM container to 16GB to prevent complete system OOM
docker run --memory=16g --memory-swap=16g vllm/vllm-openai:latest
# Example systemd memory limit for the vLLM service
# Add under the [Service] section of the vLLM unit file
[Service]
MemoryMax=16G
MemoryHigh=14G


