CVE-2026-34755 Overview
A resource exhaustion vulnerability exists in vLLM, an inference and serving engine for large language models (LLMs). The vulnerability is located in the VideoMediaIO.load_base64() method within vllm/multimodal/media/video.py. This method processes video/jpeg data URLs by splitting them by comma to extract individual JPEG frames but fails to enforce a frame count limit. The num_frames parameter (default: 32), which is enforced by the load_bytes() code path, is completely bypassed in the video/jpeg base64 path. This allows an attacker to send a single API request containing thousands of comma-separated base64-encoded JPEG frames, causing the server to decode all frames into memory and crash with an Out-of-Memory (OOM) condition.
Critical Impact
An authenticated attacker can crash vLLM inference servers through memory exhaustion by sending a single malicious API request containing excessive base64-encoded JPEG frames, leading to denial of service.
Affected Products
- vLLM versions 0.7.0 through 0.18.x
- vLLM deployments using multimodal video processing capabilities
- Systems accepting video/jpeg data URLs via the vLLM API
Discovery Timeline
- 2026-04-06 - CVE-2026-34755 published to NVD
- 2026-04-07 - Last updated in NVD database
Technical Details for CVE-2026-34755
Vulnerability Analysis
This vulnerability falls under CWE-770 (Allocation of Resources Without Limits or Throttling). The core issue lies in inconsistent input validation between two code paths handling video data. While the load_bytes() method properly enforces the num_frames limit (defaulting to 32 frames), the load_base64() method completely bypasses this validation when processing base64-encoded JPEG frames.
The vulnerable function parses comma-separated base64 data without any upper bound on the number of segments it will process. Each decoded JPEG frame consumes memory proportional to its decoded size, and when thousands of frames are submitted in a single request, the cumulative memory allocation rapidly exhausts available system memory.
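The asymmetry between the two code paths can be illustrated with a simplified sketch. The helper names below are hypothetical and do not reproduce vLLM's actual implementation; they only contrast a capped decode loop with the unbounded one described above:

```python
import base64

NUM_FRAMES = 32  # default cap enforced on the load_bytes() path

def load_frames_limited(segments, num_frames=NUM_FRAMES):
    # Consistent/patched behavior: decode at most num_frames segments.
    return [base64.b64decode(s) for s in segments[:num_frames]]

def load_frames_unbounded(segments):
    # Vulnerable pattern: every comma-separated segment is decoded,
    # so memory use grows linearly with attacker-controlled input.
    return [base64.b64decode(s) for s in segments]

# Simulate a request carrying far more frames than the intended cap.
frame = base64.b64encode(b"\xff\xd8\xff\xe0" + b"\x00" * 64).decode()
payload = ",".join([frame] * 1000)
segments = payload.split(",")

safe = load_frames_limited(segments)      # 32 frames decoded
unsafe = load_frames_unbounded(segments)  # all 1000 frames decoded
```

With real JPEG frames of meaningful size, the unbounded path's allocations scale with the attacker's segment count rather than with any server-side limit.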
Root Cause
The root cause is a missing bounds check in the VideoMediaIO.load_base64() method. When video data is submitted as a video/jpeg data URL containing base64-encoded frames, the method splits the input by comma and iterates through all resulting segments without enforcing the num_frames limit. This creates an asymmetric vulnerability where one code path (byte loading) is protected while another (base64 loading) is not, allowing attackers to exploit the unprotected path.
Attack Vector
The attack is network-based and requires low-privilege authentication to access the vLLM API. An attacker constructs a malicious API request containing a video/jpeg data URL with thousands of comma-separated base64-encoded JPEG frames. When the server processes this request, the load_base64() method decodes every frame into memory without limitation, causing memory exhaustion and server crash. The attack requires no user interaction and can be executed with a single HTTP request.
The exploitation mechanism involves:
- An authenticated attacker crafts an API request targeting vLLM's multimodal video processing endpoint
- The request contains a video/jpeg data URL with an excessive number of comma-separated base64-encoded JPEG frames (potentially thousands)
- The VideoMediaIO.load_base64() method processes the request, splitting by comma and decoding all frames
- Each decoded frame allocates memory, quickly exhausting available resources
- The server crashes with an OOM error, causing denial of service
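The shape of such a payload can be sketched as follows. The JPEG bytes here are a tiny stub for illustration only, not a real decodable image, and the frame count is an arbitrary example:

```python
import base64

# Stub "frame": JPEG start-of-image + end-of-image markers only.
stub_jpeg = base64.b64encode(b"\xff\xd8\xff\xd9").decode()

# An attacker repeats the frame thousands of times in one data URL.
n_frames = 10_000
data_url = "data:video/jpeg;base64," + ",".join([stub_jpeg] * n_frames)

# A parser that splits on commas sees one segment per frame.
_, _, body = data_url.partition(",")
segment_count = body.count(",") + 1
```

Because the request body is just text, the payload is cheap for the attacker to generate but forces the server to perform one base64 decode and one image decode per segment.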
For technical details, refer to the GitHub Security Advisory.
Detection Methods for CVE-2026-34755
Indicators of Compromise
- Unusually large API requests to vLLM endpoints containing video/jpeg data URLs
- Rapid memory consumption spikes on vLLM server processes
- Server crashes with OOM (Out-of-Memory) errors following multimodal API requests
- API requests containing data URLs with abnormally high numbers of comma-separated base64 segments
Detection Strategies
- Monitor API request sizes to vLLM multimodal endpoints and alert on requests exceeding normal thresholds
- Implement memory consumption monitoring for vLLM server processes with alerts for rapid increases
- Log and analyze requests containing video/jpeg data URLs for unusual patterns
- Deploy request payload inspection to count comma-separated segments in base64 video data
Monitoring Recommendations
- Configure application-level logging to capture request metadata for multimodal endpoints
- Set up infrastructure monitoring to track memory utilization trends on vLLM servers
- Implement rate limiting on API endpoints accepting video data to reduce attack surface
- Enable crash dump analysis to identify OOM conditions related to video processing
How to Mitigate CVE-2026-34755
Immediate Actions Required
- Upgrade vLLM to version 0.19.0 or later where the vulnerability is fixed
- Implement API gateway request size limits to restrict excessively large video data payloads
- Deploy memory limits on vLLM server processes to prevent complete system crashes
- Monitor for exploitation attempts through unusual API traffic patterns
Patch Information
The vulnerability is fixed in vLLM version 0.19.0. The fix ensures that the num_frames limit is enforced consistently across all code paths, including the load_base64() method for processing video/jpeg data URLs. Organizations should upgrade to version 0.19.0 or later. For additional details, see the GitHub Security Advisory.
Workarounds
- Implement API gateway rules to limit the size of incoming requests containing video data
- Deploy request validation middleware to count and limit comma-separated segments in base64 video data
- Configure memory limits (cgroups, container limits) to prevent single requests from exhausting system resources
- Consider temporarily disabling multimodal video processing endpoints if not required
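As a stop-gap at the gateway layer, a reverse proxy can cap request body size before traffic reaches vLLM. The nginx fragment below is a sketch; the 10 MB limit and backend address are assumptions to be sized against your largest legitimate multimodal request:

```nginx
# nginx reverse-proxy fragment: reject oversized payloads before they
# reach the vLLM backend. 10m is an assumed limit, not a recommendation.
server {
    listen 443 ssl;
    client_max_body_size 10m;   # requests above this get HTTP 413

    location /v1/ {
        proxy_pass http://vllm-backend:8000;  # assumed backend address
    }
}
```

This does not remove the vulnerability, since a crafted payload can still fit under any size cap, but it bounds the worst-case memory a single request can demand.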
# Example container memory limit configuration
# Limit vLLM container to 16GB to prevent complete system OOM
docker run --memory=16g --memory-swap=16g vllm/vllm-openai:latest
# Example systemd memory limit for the vLLM service
# Add under the [Service] section of the vLLM unit file
[Service]
MemoryMax=16G
MemoryHigh=14G


