CVE-2026-7304: LMSYS SGLang RCE Vulnerability

CVE-2026-7304 Overview

CVE-2026-7304 is an unauthenticated remote code execution vulnerability in the SGLang multimodal generation runtime maintained by LMSYS. When the runtime is launched with the --enable-custom-logit-processor option, serialized Python objects submitted by remote clients are passed to dill.loads() without validation. An attacker who can reach the inference endpoint can craft a malicious serialized payload that runs arbitrary code in the server process. The flaw is classified under CWE-502: Deserialization of Untrusted Data.

Critical Impact
Unauthenticated attackers with network access to an SGLang server started with --enable-custom-logit-processor can execute arbitrary Python code on the host, leading to full compromise of the model-serving infrastructure.

Affected Products

LMSYS SGLang 0.5.10
SGLang deployments started with the --enable-custom-logit-processor flag
Multimodal generation runtime exposed over the network

Discovery Timeline

2026-05-18 - CVE-2026-7304 published to NVD
2026-05-19 - Last updated in NVD database

Technical Details for CVE-2026-7304

Vulnerability Analysis

SGLang is a serving runtime for large language and multimodal models. The runtime exposes a feature that allows clients to supply a custom logit processor, a callable that adjusts token probabilities during generation. When the operator starts the server with --enable-custom-logit-processor, SGLang accepts a serialized Python object from the request payload and reconstructs it with dill.loads().

dill extends Python's pickle protocol and inherits its security weaknesses. A pickle stream can name any importable callable together with its arguments, typically planted through a __reduce__ method when the attacker builds the payload, and the loader calls that callable during deserialization. SGLang performs no signature check, allow-list, or sandboxing around the call, so the inference worker runs whatever the attacker encodes.
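The mechanism can be illustrated with a harmless sketch. It uses the standard-library pickle module rather than dill, and the Demo class is a hypothetical stand-in that asks the loader to call the built-in len; a hostile payload would name a dangerous callable instead.

```python
import base64
import pickle


class Demo:
    """Hypothetical, harmless stand-in for an attacker-built object."""

    def __reduce__(self):
        # Tells pickle to record "call len('four')" in the byte stream.
        # A hostile payload would name a dangerous callable here instead.
        return (len, ("four",))


# Protocol 4 is pinned so the header bytes are the same on every Python version.
payload = pickle.dumps(Demo(), protocol=4)

# Loading does not rebuild a Demo instance; it calls len("four").
result = pickle.loads(payload)

print(result)                               # 4
print(payload[:2])                          # b'\x80\x04'
print(base64.b64encode(payload)[:4])        # b'gASV'
```

The loader returns whatever the named callable returns, which is why no validation after the dill.loads() call can undo the damage: the code has already run by the time the result is available.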

Root Cause

The root cause is insecure deserialization of untrusted input. The custom logit processor feature trusts that callers will send benign Python callables. dill.loads() is not a safe parser and is not designed for untrusted data. Combined with the lack of authentication on the inference endpoint, any client that can send a generation request can supply a payload that executes code.

Attack Vector

The attack is network-based and requires no authentication or user interaction. An attacker sends a generation request to the SGLang HTTP endpoint with a malicious dill-serialized object in the logit-processor field. The server deserializes the payload, calls the callable embedded in it, and the attacker gains code execution as the SGLang process user. Typical post-exploitation steps include exfiltrating model weights, stealing API keys from environment variables, and pivoting into the GPU cluster.

For a full write-up of the deserialization path and related issues, see the AntiProof analysis of three RCEs in SGLang and the SGLang Python source tree.

Detection Methods for CVE-2026-7304

Indicators of Compromise

SGLang processes spawning child processes such as sh, bash, python, curl, or wget that are not part of normal inference workflows.
Outbound network connections from inference hosts to unfamiliar IP addresses shortly after a generation request.
HTTP request bodies to SGLang endpoints containing base64-encoded blobs that begin with gASV (the base64 form of the pickle protocol 4 header bytes \x80\x04\x95), or raw payloads that begin with \x80\x04.
Modifications to ~/.ssh/authorized_keys, cron entries, or systemd units on model-serving hosts.

Detection Strategies

Inspect SGLang launch arguments across the fleet for the --enable-custom-logit-processor flag and treat any match as in-scope for incident response.
Apply network IDS signatures that flag pickle magic bytes (\x80\x02, \x80\x04, \x80\x05) inside JSON fields posted to inference APIs.
Correlate process-creation telemetry with the SGLang parent PID to surface unexpected interpreter or shell invocations.
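The header-matching strategy can be sketched as a small JSON scanner. This is a minimal sketch under stated assumptions: the field names in the sample body are hypothetical, the protocol 3 header and the hex check are additions beyond the byte sequences listed above, and a production signature would need tuning against false positives.

```python
import base64
import binascii
import json

# Two-byte headers written by pickle protocols 2 through 5 (PROTO opcode + version).
# Protocol 3 is included for completeness; it is not listed in the advisory text.
PICKLE_HEADERS = (b"\x80\x02", b"\x80\x03", b"\x80\x04", b"\x80\x05")


def looks_like_pickle(value: str) -> bool:
    """Return True if a string holds pickle bytes raw, base64-encoded, or hex-encoded."""
    candidates = [value.encode("latin-1", errors="ignore")]
    try:
        candidates.append(base64.b64decode(value, validate=True))
    except (binascii.Error, ValueError):
        pass  # not valid base64
    try:
        candidates.append(bytes.fromhex(value))  # hex encoding is an assumption
    except ValueError:
        pass  # not valid hex
    return any(c.startswith(PICKLE_HEADERS) for c in candidates)


def find_suspect_fields(node, path="$"):
    """Walk parsed JSON and return the paths of string values that look like pickle."""
    hits = []
    if isinstance(node, dict):
        for key, child in node.items():
            hits += find_suspect_fields(child, f"{path}.{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            hits += find_suspect_fields(child, f"{path}[{index}]")
    elif isinstance(node, str) and looks_like_pickle(node):
        hits.append(path)
    return hits


# Synthetic blob: protocol 4 header plus an empty frame, not a real payload.
blob = base64.b64encode(b"\x80\x04\x95" + bytes(8)).decode()
sample_body = json.dumps(
    {"text": "hello", "sampling_params": {"custom_logit_processor": blob}}
)
hits = find_suspect_fields(json.loads(sample_body))
print(hits)  # ['$.sampling_params.custom_logit_processor']
```

Matching on the header alone is cheap but easy to evade, so it is better treated as a triage signal than as a control.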

Monitoring Recommendations

Forward SGLang stdout, stderr, and access logs to a centralized analytics platform and alert on tracebacks referencing dill, pickle, or _reconstructor.
Monitor GPU hosts for unsigned binaries executing from /tmp, /dev/shm, or the SGLang working directory.
Track egress from inference subnets and alert on connections to non-allowlisted destinations.
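The traceback alert in the first recommendation could be prototyped as follows. This is a rough sketch that assumes logs contain standard Python tracebacks as plain text lines; the sample log lines are invented for illustration and are not real SGLang output.

```python
import re

# Names from the monitoring recommendation above.
SUSPECT = re.compile(r"\b(dill|pickle|_reconstructor)\b")
TRACEBACK_START = "Traceback (most recent call last):"


def suspicious_tracebacks(lines):
    """Return (line_number, text) for traceback lines that mention a serializer."""
    hits = []
    in_traceback = False
    for number, line in enumerate(lines, 1):
        if line.startswith(TRACEBACK_START):
            in_traceback = True
            continue
        if in_traceback:
            if SUSPECT.search(line):
                hits.append((number, line.rstrip()))
            # The unindented exception line closes the traceback block.
            if line and not line[0].isspace():
                in_traceback = False
    return hits


# Invented sample lines, for illustration only.
sample_log = [
    "INFO request ok",
    "Traceback (most recent call last):",
    '  File "/x/dill/_dill.py", line 1, in loads',
    "    return load(file)",
    "AttributeError: example failure",
    "INFO pickle cache warm",
]
alerts = suspicious_tracebacks(sample_log)
print(alerts)
```

Restricting matches to lines inside a traceback keeps routine log messages that happen to mention pickle from raising alerts.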

How to Mitigate CVE-2026-7304

Immediate Actions Required

Restart every SGLang instance that currently runs with --enable-custom-logit-processor, omitting that flag, and keep it disabled until a patched version is deployed.
Place SGLang endpoints behind an authenticated reverse proxy and restrict ingress to known client subnets.
Rotate credentials, API tokens, and model artifacts accessible to any SGLang host that may have been exposed.
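To find instances that need a restart, a host-level check along these lines may help. It is a sketch that assumes a Linux host with a readable /proc filesystem and that SGLang processes have "sglang" somewhere in their command line; the function names are illustrative.

```python
from pathlib import Path

FLAG = "--enable-custom-logit-processor"


def parse_cmdline(raw: bytes) -> list:
    """Split the NUL-separated contents of /proc/<pid>/cmdline into arguments."""
    return [arg for arg in raw.decode(errors="replace").split("\0") if arg]


def is_exposed(argv) -> bool:
    """True if the arguments look like an SGLang launch that carries the flag."""
    return any("sglang" in arg for arg in argv) and FLAG in argv


def scan_proc(proc_root="/proc"):
    """Return (pid, command line) for each running process that matches."""
    hits = []
    for cmdline in Path(proc_root).glob("[0-9]*/cmdline"):
        try:
            argv = parse_cmdline(cmdline.read_bytes())
        except OSError:
            continue  # process exited or access was denied
        if is_exposed(argv):
            hits.append((int(cmdline.parent.name), " ".join(argv)))
    return hits


for pid, command in scan_proc():
    print(pid, command)
```

Container manifests and launch scripts should be checked separately, since a process scan only sees what is running at that moment.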

Patch Information

No fixed version is listed in the NVD record at the time of publication. Track the SGLang upstream repository for releases that replace dill.loads() with a safe serialization scheme, and review the AntiProof advisory for vendor coordination updates.
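As an illustration of what a safe serialization scheme can look like, the sketch below has clients send a processor name and JSON parameters, which the server resolves against a fixed allow-list. This is not the upstream fix and not SGLang's API; every name in it is hypothetical.

```python
import json

NEG_INF = float("-inf")


def scale_temperature(logits, *, temperature):
    """Divide every logit by a positive temperature."""
    if not isinstance(temperature, (int, float)) or temperature <= 0:
        raise ValueError("temperature must be a positive number")
    return [x / temperature for x in logits]


def ban_tokens(logits, *, token_ids):
    """Set the logits of banned token ids to negative infinity."""
    banned = set(token_ids)
    return [NEG_INF if i in banned else x for i, x in enumerate(logits)]


# Server-side allow-list: the only code a request can select.
REGISTRY = {"scale_temperature": scale_temperature, "ban_tokens": ban_tokens}


def build_processor(spec_json: str):
    """Turn a JSON spec like {"name": ..., "params": {...}} into a callable."""
    spec = json.loads(spec_json)
    if not isinstance(spec, dict):
        raise ValueError("spec must be a JSON object")
    name = spec.get("name")
    if not isinstance(name, str) or name not in REGISTRY:
        raise ValueError(f"unknown processor: {name!r}")
    params = spec.get("params", {})
    if not isinstance(params, dict):
        raise ValueError("params must be a JSON object")
    func = REGISTRY[name]
    return lambda logits: func(logits, **params)


processor = build_processor('{"name": "ban_tokens", "params": {"token_ids": [1]}}')
print(processor([0.5, 1.0, 2.0]))  # [0.5, -inf, 2.0]
```

The trade-off is flexibility: clients can only combine behaviors the operator has registered, but the request body is parsed as data and never selects code outside the allow-list.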

Workarounds

Disable the custom logit processor feature in all deployments and remove --enable-custom-logit-processor from launch scripts and container manifests.
Run SGLang under a low-privilege service account with restrictive filesystem and network policies, for example using seccomp or AppArmor profiles.
Terminate TLS at an authenticating gateway and reject requests whose payloads contain raw pickle byte sequences.

bash

# Configuration example: launch SGLang without the vulnerable flag
python -m sglang.launch_server \
  --model-path /models/llava \
  --host 127.0.0.1 \
  --port 30000
# Do NOT add: --enable-custom-logit-processor