CVE-2026-22773 Overview
CVE-2026-22773 is a Denial of Service vulnerability affecting vLLM, a popular inference and serving engine for large language models (LLMs). The vulnerability allows authenticated users to crash vLLM instances serving multimodal models that use the Idefics3 vision model implementation by sending a specially crafted 1x1 pixel image. This triggers a tensor dimension mismatch that results in an unhandled runtime error, causing complete server termination.
Critical Impact
Authenticated attackers can cause complete denial of service to vLLM inference servers by submitting malicious image payloads, disrupting AI/ML workloads and potentially affecting production LLM services.
Affected Products
- vLLM versions 0.6.4 through 0.11.x (prior to 0.12.0)
- vLLM deployments using Idefics3 vision model implementation
- Multimodal LLM serving configurations
Discovery Timeline
- 2026-01-10 - CVE-2026-22773 published to NVD
- 2026-01-13 - Last updated in NVD database
Technical Details for CVE-2026-22773
Vulnerability Analysis
This vulnerability exists in the image processing pipeline of vLLM's Idefics3 vision model implementation. When processing multimodal inputs that combine text and images, the Idefics3 model expects image tensors with specific dimensional requirements. The vulnerability stems from improper allocation of resources without limits (CWE-770), where the system fails to validate image dimensions before tensor operations.
When a malformed 1x1 pixel image is submitted to the inference endpoint, the vision model's tensor processing logic encounters a dimension mismatch during the image embedding phase. This mismatch triggers an unhandled runtime exception that propagates up the call stack, bypassing any error recovery mechanisms and causing the entire vLLM server process to terminate.
The attack is particularly impactful because vLLM is designed to handle high-throughput inference workloads, meaning a single malicious request can disrupt service for all concurrent users and queued requests.
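To make the failure mode concrete, consider how a patch-based vision encoder slices an image into fixed-size patches before embedding. The sketch below is illustrative only (it is not vLLM's actual code, and the patch size of 14 is an assumption): a 1x1 image yields zero patches, so any downstream reshape that expects at least one patch blows up at runtime.

```python
def patch_grid(width: int, height: int, patch: int = 14):
    """Patches per axis under floor division (illustrative, not vLLM's code)."""
    return width // patch, height // patch

def embed_patch_count(width: int, height: int, patch: int = 14) -> int:
    """Simulate the embedding stage's patch count; a 0-patch image crashes it."""
    cols, rows = patch_grid(width, height, patch)
    n = cols * rows
    if n == 0:
        # Without an up-front dimension check, this surfaces as an unhandled
        # runtime error deep inside the model's tensor operations.
        raise RuntimeError(f"shape mismatch: 0 patches for {width}x{height} image")
    return n

embed_patch_count(224, 224)  # 16 x 16 = 256 patches, fine
# embed_patch_count(1, 1)    # raises RuntimeError: the crash path this CVE exploits
```

The point of the sketch is that the error arises several layers below the API surface, which is why it escapes the request-handling error path and takes down the whole process.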
Root Cause
The root cause is insufficient input validation in the Idefics3 vision model's image preprocessing pipeline. The code assumes incoming images meet minimum dimensional requirements for tensor operations without explicitly validating these constraints. When boundary-case images (such as 1x1 pixel images) are processed, the resulting tensor shapes are incompatible with downstream operations, causing the runtime error.
The underlying issue is classified as CWE-770 (Allocation of Resources Without Limits or Throttling), as the system fails to properly constrain and validate the image input resources before processing them in tensor operations.
Attack Vector
The attack can be executed remotely over the network by any authenticated user with access to the vLLM inference API. The attacker needs to:
- Identify a vLLM deployment serving a multimodal model with Idefics3 vision capabilities
- Craft a valid API request containing a 1x1 pixel image payload
- Submit the request to the multimodal inference endpoint
The vulnerability manifests when the Idefics3 vision model attempts to process the malformed image, resulting in a tensor dimension mismatch during the embedding generation phase. This causes an unhandled runtime exception that terminates the server process. For technical implementation details, refer to the GitHub Security Advisory.
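For defensive testing in a controlled environment you own, a 1x1 image payload can be generated with the Python standard library alone; no imaging library is required because a PNG's dimensions live in its IHDR chunk. The sketch below builds a minimal, valid 1x1 RGB PNG:

```python
import struct
import zlib

def png_chunk(ctype: bytes, data: bytes) -> bytes:
    """Length-prefixed PNG chunk with CRC-32 over type + data."""
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data)))

def make_1x1_png() -> bytes:
    """Minimal 1x1 8-bit RGB PNG: signature + IHDR + IDAT + IEND."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)  # 1x1, depth 8, RGB
    raw = b"\x00" + b"\x00\x00\x00"  # filter byte + one black pixel
    return (b"\x89PNG\r\n\x1a\n"
            + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", zlib.compress(raw))
            + png_chunk(b"IEND", b""))
```

Submitting such a payload against a patched deployment should produce a rejection response rather than a crash, which makes it a useful regression check after upgrading.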
Detection Methods for CVE-2026-22773
Indicators of Compromise
- Unexpected vLLM server process terminations or crashes
- Error logs containing tensor dimension mismatch exceptions related to image processing
- API requests containing unusually small image payloads (particularly 1x1 pixel images)
- Repeated server restarts following multimodal inference requests
Detection Strategies
- Monitor vLLM server logs for unhandled runtime errors in the Idefics3 vision model components
- Implement request logging to capture image dimensions before processing
- Deploy application-level health checks to detect unexpected server terminations
- Analyze API traffic patterns for requests with minimal image payloads targeting multimodal endpoints
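The log-monitoring strategy above can be sketched as a simple pattern scan. The regexes below are assumptions for illustration; the exact error strings vary with the vLLM and PyTorch versions in use, so tune them against your own crash logs:

```python
import re

# Illustrative signatures; actual messages differ across vLLM/PyTorch versions.
CRASH_PATTERNS = [
    re.compile(r"RuntimeError: .*(shape|size) mismatch", re.IGNORECASE),
    re.compile(r"dimension .*does not match", re.IGNORECASE),
]

def suspicious_lines(log_lines):
    """Return log lines matching tensor-mismatch signatures worth alerting on."""
    return [ln for ln in log_lines
            if any(p.search(ln) for p in CRASH_PATTERNS)]
```

Feeding these matches into your alerting pipeline, correlated with the request payloads logged at ingestion, is what connects a crash back to the offending image.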
Monitoring Recommendations
- Configure alerting for vLLM process crashes or unexpected restarts
- Implement log aggregation to correlate tensor-related exceptions with incoming request payloads
- Monitor inference API latency spikes that may indicate service degradation before crashes
- Track request patterns from individual users for anomalous small image submissions
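Per-user tracking of anomalously small image submissions can be as simple as a counter keyed by user. A minimal sketch, with the pixel threshold and alert count chosen purely for illustration:

```python
from collections import Counter

SMALL_PIXEL_THRESHOLD = 64  # illustrative: flag images under 64 total pixels

class SmallImageTracker:
    """Count tiny-image submissions per user; alert past a threshold."""

    def __init__(self, alert_after: int = 3):
        self.alert_after = alert_after
        self.counts = Counter()

    def record(self, user: str, width: int, height: int) -> bool:
        """Record one request; return True once the user should trigger an alert."""
        if width * height < SMALL_PIXEL_THRESHOLD:
            self.counts[user] += 1
        return self.counts[user] >= self.alert_after
```

In production this state would live in whatever your gateway already uses for rate limiting (e.g. a shared cache) rather than in-process memory.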
How to Mitigate CVE-2026-22773
Immediate Actions Required
- Upgrade vLLM to version 0.12.0 or later immediately
- Review access controls to restrict multimodal inference endpoints to trusted users
- Implement request rate limiting on API endpoints as a temporary protective measure
- Deploy health monitoring to enable rapid restart of crashed instances
Patch Information
The vulnerability has been patched in vLLM version 0.12.0. Organizations should upgrade to this version or later to remediate the vulnerability. The patch adds proper validation of image dimensions before tensor processing operations, ensuring that malformed images are rejected with an appropriate error response rather than causing server crashes.
For detailed patch information, see the GitHub Security Advisory.
Workarounds
- Implement input validation at the API gateway level to reject images below minimum dimensional thresholds
- Deploy vLLM instances behind a reverse proxy that filters requests with malformed image payloads
- Use container orchestration with automatic restart policies to minimize downtime from crashes
- Consider temporarily disabling Idefics3 vision model support if not required for production workloads
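The gateway-level validation workaround can be implemented without decoding the image at all, since a PNG's width and height sit at fixed offsets in the IHDR chunk. A sketch of such a pre-filter (function names and the minimum dimension are illustrative assumptions; other formats like JPEG would need their own header parsers):

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def png_dimensions(data: bytes):
    """Read width/height straight from the IHDR chunk; no image decode needed."""
    if not data.startswith(PNG_SIG) or data[12:16] != b"IHDR":
        raise ValueError("not a PNG")
    return struct.unpack(">II", data[16:24])

def gateway_allows(data: bytes, min_dim: int = 28) -> bool:
    """Reject payloads below a minimum dimension before they reach vLLM."""
    width, height = png_dimensions(data)
    return width >= min_dim and height >= min_dim
```

Because this never hands the bytes to an image library, it adds negligible latency and cannot itself be crashed by a malformed pixel buffer.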
# Configuration example - Upgrade vLLM to patched version
# (quote the version specifier so the shell does not treat ">=" as a redirect)
pip install --upgrade "vllm>=0.12.0"
# Verify installed version
pip show vllm | grep Version


