CVE-2026-24779: Vllm Vllm SSRF Vulnerability

CVE-2026-24779 Overview

CVE-2026-24779 is a Server-Side Request Forgery (SSRF) vulnerability in vLLM, an inference and serving engine for large language models (LLMs). The flaw resides in the MediaConnector class within the multimodal feature set. The load_from_url and load_from_url_async methods fetch media from user-supplied URLs and rely on two different Python parsing libraries to enforce host restrictions. Because these parsers interpret backslashes differently, attackers can bypass the host allowlist and coerce the vLLM server into issuing arbitrary requests against internal network resources. All versions prior to 0.14.1 are affected.

Critical Impact
An authenticated attacker can pivot from a compromised vLLM pod to internal services, scan internal networks, interact with sibling pods in llm-d deployments, and falsify management endpoint data such as KV cache state.

Affected Products

vLLM versions prior to 0.14.1
vLLM deployments using the multimodal MediaConnector interface
Containerized llm-d environments running vulnerable vLLM pods

Discovery Timeline

2026-01-27 - CVE-2026-24779 published to NVD
2026-01-30 - Last updated in NVD database

Technical Details for CVE-2026-24779

Vulnerability Analysis

The vulnerability is classified as Server-Side Request Forgery [CWE-918]. vLLM's MediaConnector exposes two URL-loading methods used to retrieve images and other media supplied by API clients. Before fetching a URL, the connector validates the hostname against a restriction list to prevent requests to internal addresses. The validation step and the actual fetch step parse the URL using two different libraries, creating a parser-differential condition.

Specifically, the host extraction during validation and the host used during the outbound HTTP request disagree on how to handle backslash characters embedded in the authority component. An attacker submits a URL where the validator sees an allowed external host while the underlying HTTP client resolves and connects to an internal target. This produces an authenticated SSRF reachable from any client able to submit multimodal inputs.

Root Cause

The root cause is inconsistent URL parsing between urllib.parse.urlparse (used in validation paths) and the HTTP client stack that performs the actual connection. Backslash handling diverges between the parsers, so the hostname examined during security checks does not match the hostname used to dispatch the request. The fix standardizes parsing on urllib3.util.parse_url.

Attack Vector

Exploitation requires the ability to send a multimodal request to a vLLM endpoint. The attacker crafts a URL whose authority component contains a backslash separating an attacker-controlled allowed host from an internal target. The validator extracts the allowed host while the HTTP client connects to the internal target. In an llm-d deployment, the attacker can target management endpoints to falsely report metrics such as KV cache state, induce scheduling errors, or enumerate cluster-internal services.

python

# Security patch in vllm/connections.py
 from collections.abc import Mapping, MutableMapping
 from pathlib import Path
-from urllib.parse import urlparse
 
 import aiohttp
 import requests
+from urllib3.util import parse_url
 
 from vllm.version import __version__ as VLLM_VERSION
# Source: https://github.com/vllm-project/vllm/commit/f46d576c54fb8aeec5fc70560e850bed38ef17d7

The patch replaces urlparse with urllib3.util.parse_url so that validation and request dispatch use the same parser, eliminating the backslash-handling divergence.

python

# Security patch in vllm/envs.py
     try:
         return int(port)
     except ValueError as err:
-        from urllib.parse import urlparse
+        from urllib3.util import parse_url
 
-        parsed = urlparse(port)
+        parsed = parse_url(port)
         if parsed.scheme:
             raise ValueError(
                 f"VLLM_PORT '{port}' appears to be a URI. "
# Source: https://github.com/vllm-project/vllm/commit/f46d576c54fb8aeec5fc70560e850bed38ef17d7

Detection Methods for CVE-2026-24779

Indicators of Compromise

Multimodal API requests containing URLs with backslash (\) characters in the authority component or between hostnames
Outbound connections from vLLM pods to RFC1918, link-local, or cluster-internal addresses such as 169.254.0.0/16, 10.0.0.0/8, or kube-apiserver
Unexpected requests from vLLM pods to llm-d management endpoints or sibling pod metrics interfaces
Anomalous KV cache state reports or scheduler metric inconsistencies originating from a single pod

Detection Strategies

Inspect HTTP request payloads to the vLLM /v1/chat/completions and multimodal endpoints for URLs containing backslashes or mixed-encoding authority components.
Compare DNS resolutions and TCP destinations from vLLM pods against an allowlist of expected external media hosts.
Audit application logs for MediaConnector fetches that resolve to private IP ranges.

Monitoring Recommendations

Enable egress traffic logging at the pod or service-mesh layer for all vLLM workloads.
Alert on vLLM pods initiating connections to the Kubernetes API server, cluster DNS, or other in-cluster services not in their baseline.
Track the vLLM version in deployment manifests and flag any instance running below 0.14.1.

How to Mitigate CVE-2026-24779

Immediate Actions Required

Upgrade all vLLM instances to version 0.14.1 or later, which contains the parser unification fix.
Restrict network egress from vLLM pods using NetworkPolicies or service-mesh rules so that only required external media hosts are reachable.
Require authentication on vLLM endpoints and restrict multimodal input submission to trusted clients.

Patch Information

The fix is published in vLLM 0.14.1 via commit f46d576c54fb8aeec5fc70560e850bed38ef17d7 and pull request #32746. The patch replaces urllib.parse.urlparse with urllib3.util.parse_url in vllm/connections.py and vllm/envs.py, ensuring consistent host parsing between validation and request dispatch. Full details are available in the GitHub Security Advisory GHSA-qh4c-xf7m-gxfc.

Workarounds

Disable multimodal URL ingestion if not required, forcing clients to upload media payloads directly instead of by reference.
Place vLLM behind an egress proxy that re-validates destination hostnames and blocks private address ranges.
Apply Kubernetes NetworkPolicies that deny vLLM pod access to llm-d management endpoints, cluster DNS, and the Kubernetes API server.

bash

# Kubernetes NetworkPolicy denying vLLM pod egress to internal cluster ranges
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: vllm-egress-restrict
spec:
  podSelector:
    matchLabels:
      app: vllm
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 10.0.0.0/8
        - 172.16.0.0/12
        - 192.168.0.0/16
        - 169.254.0.0/16
EOF