CVE-2026-7301: LMSYS SGLang RCE Vulnerability

CVE-2026-7301 Overview

CVE-2026-7301 is an insecure deserialization vulnerability [CWE-502] in SGLang, the LLM serving runtime maintained by LMSYS. The multimodal generation runtime scheduler exposes a ZeroMQ ROUTER socket that binds to 0.0.0.0 by default. The socket processes incoming messages through a sink that calls pickle.loads() without validation. Unauthenticated attackers can send a crafted pickle payload over the network to achieve remote code execution on the host running the scheduler. The flaw affects SGLang version 0.5.10 and is reachable whenever the scheduler is exposed to untrusted networks.

Critical Impact
Unauthenticated remote attackers can execute arbitrary code on any SGLang inference host whose scheduler socket is reachable over the network.

Affected Products

LMSYS SGLang 0.5.10
SGLang multimodal generation runtime scheduler
Deployments exposing the scheduler ROUTER socket on 0.0.0.0

Discovery Timeline

2026-05-18 - CVE-2026-7301 published to the National Vulnerability Database (NVD)
2026-05-19 - Last updated in NVD

Technical Details for CVE-2026-7301

Vulnerability Analysis

SGLang's multimodal generation runtime uses a scheduler process that communicates with worker components through a ZeroMQ ROUTER socket. Two design decisions combine to create remote code execution exposure. First, the socket binds to 0.0.0.0, accepting connections from any network interface rather than a loopback or restricted address. Second, the message sink hands raw inbound bytes to Python's pickle.loads() for deserialization.

Python's documentation warns that pickle is unsafe for untrusted input. The pickle format supports opcodes such as REDUCE, which calls an arbitrary importable callable with attacker-supplied arguments during unpickling. An attacker can therefore construct a payload that runs operating system commands the moment the scheduler decodes the message.
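The mechanism can be illustrated with a deliberately benign sketch. The class name Demo and the choice of sum as the callable are assumptions made for illustration only; a hostile payload would name a callable that runs commands instead.

```python
import pickle
import pickletools


class Demo:
    """Hypothetical class used only to illustrate __reduce__."""

    def __reduce__(self):
        # Tell pickle to "rebuild" this object by calling sum([1, 2, 3]).
        # A hostile payload would name a dangerous callable here instead.
        return (sum, ([1, 2, 3],))


payload = pickle.dumps(Demo())

# Unpickling calls sum() and returns its result, not a Demo instance.
result = pickle.loads(payload)

# List the opcode names in the stream without executing anything.
opcode_names = [opcode.name for opcode, _arg, _pos in pickletools.genops(payload)]

print(result)
print("REDUCE" in opcode_names)
```

The receiver never needs the Demo class: the stream references only the callable and its arguments, which is why a remote sender controls what runs.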

Root Cause

The root cause is the combination of an unauthenticated network listener and unsafe deserialization on its inputs. Treating the inter-process channel as trusted while binding it to all interfaces removes the network boundary that the design assumed. No authentication, signing, or schema validation gates the pickle stream before pickle.loads() runs.
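One way to picture the missing gate is a frame format that is authenticated before it is parsed and that uses a data-only encoding. The sketch below is an assumption-laden illustration, not SGLang's fix: the function names seal and open_frame, the tag-then-body layout, and the pre-shared key are all hypothetical.

```python
import hashlib
import hmac
import json

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256


def seal(message: dict, key: bytes) -> bytes:
    """Serialize a message as JSON and prepend an HMAC-SHA256 tag."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def open_frame(frame: bytes, key: bytes) -> dict:
    """Verify the tag first, then parse the body as JSON (never pickle)."""
    if len(frame) < TAG_SIZE:
        raise ValueError("frame too short")
    tag, body = frame[:TAG_SIZE], frame[TAG_SIZE:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many tag bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad authentication tag")
    message = json.loads(body)
    if not isinstance(message, dict):
        raise ValueError("unexpected message shape")
    return message


key = b"example-pre-shared-key"  # assumption: distributed out of band
frame = seal({"op": "generate", "request_id": 7}, key)
print(open_frame(frame, key))
```

The sketch omits replay protection and key rotation, so it shows only the ordering that matters here: reject unauthenticated bytes before any parser sees them, and keep the parser incapable of invoking code.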

Attack Vector

The attack vector is network-based and requires no privileges or user interaction. An attacker locates an SGLang scheduler whose ZeroMQ port is exposed, connects to the ROUTER socket with a compatible socket type such as DEALER, and transmits a malicious pickle frame. Upon receipt, the sink calls pickle.loads(), which executes the embedded __reduce__ payload as the SGLang process user. Cloud-hosted inference servers and GPU clusters that bind the scheduler to public interfaces are directly exploitable.

No verified public exploit code is currently catalogued. Technical details are described in the AntiProof blog on three RCEs in SGLang and the SGLang GitHub repository.

Detection Methods for CVE-2026-7301

Indicators of Compromise

SGLang scheduler processes spawning unexpected child processes such as sh, bash, python, or curl.
Outbound network connections from the SGLang host to attacker-controlled infrastructure following inbound traffic on the scheduler port.
ZeroMQ ROUTER sockets bound to 0.0.0.0 on hosts running SGLang 0.5.10.
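Pickle frames in captured scheduler traffic that import dangerous callables, for example the GLOBAL sequences c__builtin__\nexec (Python 2), cbuiltins\nexec (Python 3), or cposix\nsystem. Payloads written with pickle protocol 4 or later carry the same module and attribute names as strings followed by a STACK_GLOBAL opcode.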
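Captured frames can be triaged for such opcodes without ever unpickling them. The following is a minimal sketch using the standard pickletools module; the function name risky_opcodes and the chosen opcode set are assumptions, and a legitimate pickle of custom objects would also trigger it, so treat a hit as a lead rather than proof.

```python
import pickle
import pickletools

# Opcodes that import names or call/construct objects during unpickling.
RISKY = {"GLOBAL", "STACK_GLOBAL", "INST", "OBJ",
         "NEWOBJ", "NEWOBJ_EX", "REDUCE", "BUILD"}


def risky_opcodes(frame: bytes) -> list:
    """Return risky opcode names found in a frame, parsing without executing."""
    found = []
    try:
        for opcode, _arg, _pos in pickletools.genops(frame):
            if opcode.name in RISKY:
                found.append(opcode.name)
    except Exception:
        # Not a well-formed pickle stream; keep whatever was parsed so far.
        pass
    return found


class Probe:
    """Hypothetical benign stand-in for a hostile payload."""

    def __reduce__(self):
        return (len, ("abc",))


plain = pickle.dumps({"tokens": [1, 2, 3], "done": False})
suspicious = pickle.dumps(Probe())

print(risky_opcodes(plain))
print(risky_opcodes(suspicious))
```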

Detection Strategies

Inventory SGLang deployments and identify any scheduler instance whose listening socket is reachable from non-loopback addresses.
Inspect process trees for shell or interpreter children spawned by the SGLang scheduler under its service account; such children may indicate that a hostile payload reached pickle.loads().
Correlate network telemetry with process telemetry to flag inbound connections on scheduler ports followed by new process creation events.
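The correlation step can be sketched as a simple time-window join. The event shapes NetEvent and ProcEvent, the parent process name "sglang", and the five-second window are all hypothetical; real telemetry schemas will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetEvent:
    timestamp: float   # seconds since epoch
    source_ip: str
    dest_port: int


@dataclass(frozen=True)
class ProcEvent:
    timestamp: float
    parent_name: str
    child_name: str


SUSPICIOUS_CHILDREN = {"sh", "bash", "python", "curl"}


def correlate(net_events, proc_events, scheduler_port,
              window_seconds=5.0, parent_name="sglang"):
    """Pair inbound scheduler connections with child processes spawned soon after."""
    hits = []
    for proc in proc_events:
        if proc.parent_name != parent_name:
            continue
        if proc.child_name not in SUSPICIOUS_CHILDREN:
            continue
        for net in net_events:
            if net.dest_port != scheduler_port:
                continue
            delta = proc.timestamp - net.timestamp
            # Keep only spawns that follow the connection within the window.
            if 0 <= delta <= window_seconds:
                hits.append((net, proc))
    return hits


net = [NetEvent(100.0, "203.0.113.7", 30000), NetEvent(100.0, "10.0.0.5", 8080)]
proc = [ProcEvent(102.0, "sglang", "bash"),
        ProcEvent(200.0, "sglang", "sh"),
        ProcEvent(101.0, "sglang", "nvidia-smi")]

for net_event, proc_event in correlate(net, proc, scheduler_port=30000):
    print(net_event.source_ip, "->", proc_event.child_name)
```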

Monitoring Recommendations

Alert on any new outbound connection initiated by the SGLang scheduler process to non-cluster destinations.
Monitor for binds to 0.0.0.0 by Python processes serving model inference workloads.
Log and review ZeroMQ traffic for non-cluster source addresses contacting the scheduler.
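On Linux, wildcard binds can be spotted by parsing the kernel's socket table. The sketch below assumes the /proc/net/tcp text format (it also accepts /proc/net/tcp6 rows); the function name wildcard_listeners and the sample rows are illustrative, and the function does not map sockets to processes.

```python
LISTEN_STATE = "0A"  # TCP_LISTEN as encoded in /proc/net/tcp


def wildcard_listeners(proc_net_tcp_text: str) -> list:
    """Return ports of listening sockets bound to the all-zero (wildcard) address."""
    ports = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        addr_hex, _sep, port_hex = local_address.rpartition(":")
        # 00000000 is 0.0.0.0; thirty-two zeros is :: in /proc/net/tcp6.
        if state == LISTEN_STATE and addr_hex and set(addr_hex) == {"0"}:
            ports.append(int(port_hex, 16))
    return sorted(ports)


SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:7530 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 12345
   1: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 12346
   2: 0100007F:7531 0100007F:C000 01 00000000:00000000 00:00000000 00000000  1000        0 12347
"""

print(wildcard_listeners(SAMPLE))
```

On a live host the same function can be fed the contents of /proc/net/tcp; any returned port that belongs to a scheduler is worth alerting on.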

How to Mitigate CVE-2026-7301

Immediate Actions Required

Restrict the SGLang scheduler ROUTER socket to 127.0.0.1 or a private cluster interface using firewall rules or host-level binding configuration.
Place SGLang inference hosts behind a network policy that denies all ingress on scheduler ports from outside the trusted compute network.
Audit cloud security groups, Kubernetes NetworkPolicy objects, and ingress controllers for any rules that expose SGLang ports to the internet.
Rotate credentials and inspect for post-exploitation artifacts on any host whose scheduler was previously exposed.

Patch Information

No vendor advisory or fixed version is listed in the NVD record at the time of publication. Monitor the SGLang GitHub repository for an updated release that replaces pickle.loads() with a safe serialization format and changes the default bind address. Until a patched release is available, treat all SGLang 0.5.10 deployments as vulnerable.

Workarounds

Override the scheduler bind address to 127.0.0.1 and route worker traffic through an authenticated reverse proxy or SSH tunnel.
Run SGLang inside a network namespace or container with no external interfaces attached to the scheduler port.
Apply host firewall rules with iptables or nftables to drop traffic to the scheduler port from all addresses except known worker hosts.

bash

# Restrict scheduler port to loopback only (replace 30000 with your scheduler port).
# -I inserts at the top of the INPUT chain so an earlier ACCEPT rule cannot shadow these.
sudo iptables -I INPUT 1 -p tcp --dport 30000 -s 127.0.0.1 -j ACCEPT
sudo iptables -I INPUT 2 -p tcp --dport 30000 ! -s 127.0.0.1 -j DROP

# Verify no SGLang process is binding to 0.0.0.0
ss -tlnp | grep -E 'python|sglang'
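After applying a workaround, it helps to confirm from an untrusted vantage point that the scheduler port no longer accepts connections. The following is a minimal sketch using only the standard socket module; the function name port_reachable is hypothetical, and the local listener exists only to make the example self-contained.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Self-contained demonstration against a throwaway local listener.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

open_result = port_reachable("127.0.0.1", demo_port)
listener.close()
closed_result = port_reachable("127.0.0.1", demo_port)

print(open_result, closed_result)
```

Run the check from a host outside the trusted compute network against the scheduler's address and port; a True result means the port is still exposed.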