CVE-2025-30165 Overview
CVE-2025-30165 is an insecure deserialization vulnerability [CWE-502] in vLLM, an inference and serving engine for large language models. The flaw affects multi-node deployments using the V0 engine, where secondary hosts open a ZeroMQ SUB socket and connect to an XPUB socket on the primary host. Data received on the SUB socket is deserialized using Python pickle, which allows arbitrary code execution when malicious payloads are processed. The V1 engine is not affected, and V0 has been disabled by default since v0.8.0. The vLLM maintainers have decided not to release a patch and instead recommend network isolation as the mitigation.
Critical Impact
An attacker with adjacent network access can execute arbitrary code on secondary vLLM hosts by delivering a crafted pickle payload to the SUB socket, enabling lateral movement across an entire multi-node deployment.
Affected Products
- vLLM multi-node deployments using the V0 engine
- vLLM configurations using tensor parallelism across multiple hosts
- vLLM versions running the V0 engine, either by default (releases prior to v0.8.0) or because it was explicitly enabled in a later release
Discovery Timeline
- 2025-05-06 - CVE-2025-30165 published to NVD
- 2025-07-31 - Last updated in NVD database
Technical Details for CVE-2025-30165
Vulnerability Analysis
The vulnerability resides in vLLM's distributed device communicator code, specifically in the shared memory broadcast logic that handles inter-host messaging. In a multi-node V0 deployment, secondary vLLM hosts establish a ZeroMQ SUB socket connection to an XPUB socket on the primary host. Each message received over this channel is passed directly to Python's pickle.loads() for deserialization.
Python pickle is well-documented as unsafe for untrusted input because the deserialization process can invoke arbitrary callables through __reduce__ methods. An attacker who can deliver bytes to the SUB socket on a secondary host gains remote code execution under the privileges of the vLLM process.
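The mechanism can be illustrated with a deliberately harmless sketch. The class name Demo is hypothetical, and os.getpid stands in for whatever callable an attacker would choose; this is not vLLM code, only a demonstration that unpickling calls whatever __reduce__ names.

```python
import os
import pickle


class Demo:
    """Stand-in for an attacker-controlled object (hypothetical name)."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call os.getpid()".
        # A malicious payload would name a dangerous callable instead.
        return (os.getpid, ())


payload = pickle.dumps(Demo())

# The receiver only sees bytes; it does not need the Demo class at all.
result = pickle.loads(payload)

# loads() returned the result of the callable, not a Demo instance.
print(type(result).__name__, result == os.getpid())
```

The receiver ends up with an integer rather than a Demo object, which shows that the callable chosen by the sender ran inside the receiving process.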
Root Cause
The root cause is the use of pickle to deserialize network-received data without authentication, integrity verification, or use of a safe serialization format such as JSON or protobuf. The trust boundary assumes that any traffic arriving on the SUB socket originates from a legitimate primary host, but ZeroMQ provides no built-in authentication on this channel in the default configuration.
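As a rough illustration of what authentication plus a data-only format could look like, the sketch below frames each message as an HMAC-SHA256 tag followed by a JSON body. The function names pack and unpack, the frame layout, and the shared key are all assumptions made for this example; they are not part of vLLM or ZeroMQ, and the maintainers have not adopted this approach.

```python
import hashlib
import hmac
import json

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def pack(message: dict, key: bytes) -> bytes:
    """Serialize to JSON and prepend an HMAC-SHA256 tag (hypothetical framing)."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def unpack(frame: bytes, key: bytes) -> dict:
    """Verify the tag before parsing; reject anything that fails the check."""
    if len(frame) < TAG_LEN:
        raise ValueError("frame too short")
    tag, body = frame[:TAG_LEN], frame[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication tag mismatch")
    # JSON parsing yields plain data only; it cannot invoke callables.
    return json.loads(body.decode("utf-8"))


# Assumed to be distributed to cluster nodes out of band.
key = b"shared-secret-distributed-out-of-band"
frame = pack({"op": "broadcast", "step": 7}, key)
print(unpack(frame, key))
```

Verifying the tag before parsing means a forged or modified frame is rejected without being interpreted. A shared key does not help if the primary host itself is compromised, but parsing into plain data still removes the code-execution primitive.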
Attack Vector
Exploitation requires the attacker to deliver a malicious pickle payload to a secondary host's SUB socket. Two practical paths exist. First, an attacker who has already compromised the primary vLLM host can pivot to all secondary hosts by sending crafted messages over the existing XPUB channel. Second, an attacker on the adjacent network can use ARP cache poisoning or similar Layer 2 redirection to impersonate the primary host and deliver the payload directly. The GitHub Security Advisory GHSA-9pcc-gvx5-r5wm points to the relevant code paths in vllm/distributed/device_communicators/shm_broadcast.py.
// No verified exploit code is published.
// The vulnerable pattern follows the form:
// socket.recv() -> pickle.loads(data) -> arbitrary code execution
// See the vendor advisory for the exact source locations.
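The recv-then-loads pattern above lets the payload name any importable global. Where pickle cannot be replaced outright, the Python documentation describes restricting globals by overriding Unpickler.find_class. The sketch below follows that idea; the ALLOWED set and the restricted_loads helper are illustrative assumptions, not vLLM code, and this technique narrows the attack surface rather than making pickle safe for untrusted input.

```python
import builtins
import io
import pickle

# Hypothetical allowlist: only these (module, name) pairs may be resolved.
ALLOWED = {("builtins", "dict"), ("builtins", "list"), ("builtins", "set")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global object.
        if (module, name) in ALLOWED:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers and scalars need no globals, so they load normally.
print(restricted_loads(pickle.dumps({"a": [1, 2]})))

# A stream that references any other global is rejected.
try:
    restricted_loads(pickle.dumps(len))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```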
Detection Methods for CVE-2025-30165
Indicators of Compromise
- Unexpected child processes spawned by the vLLM Python worker on secondary nodes, such as shells, network tools, or interpreters.
- Outbound network connections originating from vLLM hosts to addresses outside the cluster's defined peer set.
- ARP table anomalies on cluster subnets, including duplicate MAC-to-IP mappings or rapid ARP cache updates for the primary vLLM host address.
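The ARP indicator can be checked with a short script. The sketch below assumes the Linux /proc/net/arp column layout and flags any MAC address that answers for more than one IP address; the function name find_shared_macs and the sample table are invented for illustration. Multi-homed hosts and proxy ARP can produce the same pattern legitimately, so a hit is a lead to investigate rather than proof of compromise.

```python
from collections import defaultdict


def find_shared_macs(arp_table: str) -> dict[str, list[str]]:
    """Return MAC addresses that map to more than one IP address."""
    ips_by_mac = defaultdict(set)
    for line in arp_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue
        ip, mac = fields[0], fields[3].lower()
        if mac == "00:00:00:00:00:00":  # incomplete entry, no resolved MAC
            continue
        ips_by_mac[mac].add(ip)
    return {mac: sorted(ips) for mac, ips in ips_by_mac.items() if len(ips) > 1}


# Invented sample in the /proc/net/arp layout.
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
10.0.0.1         0x1         0x2         aa:bb:cc:00:00:01     *        eth0
10.0.0.2         0x1         0x2         aa:bb:cc:00:00:02     *        eth0
10.0.0.3         0x1         0x2         aa:bb:cc:00:00:02     *        eth0
"""

print(find_shared_macs(SAMPLE))
```

On a Linux node the same function could be fed the contents of /proc/net/arp and run on a schedule, with results compared against the known address of the primary host.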
Detection Strategies
- Audit Python-based ML serving stacks for pickle.loads calls on data read from network sockets, using static code review or runtime instrumentation such as Python audit hooks.
- Inspect ZeroMQ traffic patterns between vLLM nodes for unexpected publishers or messages from non-peer source addresses.
- Alert on process lineage where the vLLM worker process forks unexpected binaries, indicating successful deserialization-driven code execution.
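One possible way to observe unpickling at runtime is a CPython audit hook. The pickle module raises the pickle.find_class audit event, with the module and name as arguments, whenever an unpickler resolves a global. The sketch below records those events; the hook name and the in-memory list are assumptions for illustration, and a real deployment would forward events to its logging pipeline. The hook sees every unpickle in the process, not only network-sourced data, so filtering is left to the consumer.

```python
import pickle
import sys

seen_globals = []


def pickle_audit_hook(event, args):
    # Record each (module, name) pair an unpickler resolves.
    if event == "pickle.find_class":
        seen_globals.append(tuple(args))


# Audit hooks cannot be removed once installed in a process.
sys.addaudithook(pickle_audit_hook)

# Unpickling a reference to a global triggers the event.
pickle.loads(pickle.dumps(len))
print(seen_globals)
```

Alerting on unexpected names, for example anything from os or subprocess, would be a natural next step, since vLLM's own messages should resolve only a small and predictable set of globals.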
Monitoring Recommendations
- Enable host-based runtime monitoring on every node in the vLLM cluster to capture process execution, file writes, and network connection events.
- Centralize ARP and DHCP logs from switches serving the inference cluster subnet to detect Layer 2 redirection attempts.
- Track vLLM version and engine configuration (V0 vs V1) across all nodes to identify hosts still exposed to this vulnerability pattern.
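Version tracking can be scripted per node. The sketch below reads the installed vLLM version through importlib.metadata and compares it against the v0.8.0 threshold stated above. The helper names are hypothetical, and the check only answers whether V0 is the default for that release; it cannot tell whether V0 was explicitly re-enabled on a newer release, which still has to be verified from the deployment configuration.

```python
import re
from importlib import metadata
from typing import Optional

V0_DEFAULT_BEFORE = (0, 8, 0)  # threshold taken from the advisory text


def parse_version(version: str) -> tuple:
    """Leading numeric release components, e.g. '0.8.5.post1' -> (0, 8, 5)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad short versions such as '0.8' so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def v0_is_default(version: str) -> bool:
    return parse_version(version) < V0_DEFAULT_BEFORE


def installed_vllm_version() -> Optional[str]:
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


version = installed_vllm_version()
if version is None:
    print("vllm is not installed on this host")
else:
    print(version, "V0 default" if v0_is_default(version) else "V1 default")
```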
How to Mitigate CVE-2025-30165
Immediate Actions Required
- Migrate all multi-node vLLM deployments to the V1 engine, which has been the default since v0.8.0 and is not affected by this vulnerability.
- Isolate vLLM inter-node communication on a dedicated, trusted network segment with no exposure to user workloads or general infrastructure traffic.
- Restrict access to ZeroMQ ports used by vLLM through host firewalls, allowing only known cluster peer IP addresses.
Patch Information
The vLLM maintainers have decided not to release a code fix for this issue. The rationale is that the V0 engine has been disabled by default since v0.8.0 and the fix would be invasive. The official guidance, documented in the GitHub Security Advisory GHSA-9pcc-gvx5-r5wm, is to ensure the deployment environment runs on a secure network when the V0 multi-host pattern is still in use.
Workarounds
- Disable the V0 engine and run vLLM under the V1 engine to remove the vulnerable code path entirely.
- Deploy static ARP entries for the primary vLLM host on each secondary node to prevent ARP cache poisoning attacks on the cluster subnet.
- Where multi-host tensor parallelism is required, place all vLLM nodes inside an encrypted overlay network or VPN to authenticate peers at the network layer.
# Example: pin the primary vLLM host MAC address on each secondary node
# Replace <primary_ip> and <primary_mac> with your cluster values
sudo arp -s <primary_ip> <primary_mac>
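# Restrict the ZeroMQ port to known peers only (example using iptables)
# Run on the primary host, where the XPUB socket listens; 5555 is a placeholder port
# Add one ACCEPT rule per secondary host, then drop all other sources
sudo iptables -A INPUT -p tcp --dport 5555 -s <secondary_ip> -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 5555 -j DROP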


