CVE-2026-3059 Overview
CVE-2026-3059 is an unauthenticated remote code execution vulnerability in the SGLang large language model serving framework maintained by LMSYS. The flaw resides in the multimodal generation module, where the ZeroMQ (ZMQ) broker deserializes incoming messages using Python's pickle.loads() without any authentication or integrity checks. Any attacker with network access to the ZMQ socket can submit a crafted pickle payload and execute arbitrary code in the context of the SGLang process. The vulnerability is tracked under CWE-502: Deserialization of Untrusted Data.
Critical Impact
Unauthenticated attackers can achieve remote code execution on hosts running the SGLang multimodal generation service, compromising model serving infrastructure and any data accessible to the process.
Affected Products
- LMSYS SGLang multimodal generation module (versions prior to v0.5.10)
- Deployments exposing the SGLang scheduler ZMQ broker on reachable interfaces
- Multimodal inference pipelines built on affected SGLang releases
Discovery Timeline
- 2026-03-12 - CVE-2026-3059 published to the National Vulnerability Database (NVD)
- 2026-04-07 - Last updated in NVD
Technical Details for CVE-2026-3059
Vulnerability Analysis
SGLang is a serving framework for large language models and multimodal models. Its multimodal generation runtime uses a ZMQ-based scheduler to dispatch work between client and worker processes. The scheduler client component, implemented in python/sglang/multimodal_gen/runtime/scheduler_client.py, reads messages from a ZMQ socket and reconstructs Python objects with pickle.loads().
Pickle is not a safe deserialization format for untrusted input. A pickle stream can name any importable callable together with its arguments, typically via an object's __reduce__ method, and pickle.loads() invokes that callable while reconstructing the object. Because the ZMQ broker accepts connections without authentication, message signing, or transport-level access control, anyone who can reach the listening socket controls what the server deserializes.
An attacker who reaches the broker can deliver a pickle payload whose __reduce__ method returns a tuple referencing os.system, subprocess.Popen, or any other callable. Deserialization triggers execution in the SGLang worker context, yielding code execution with the privileges of the serving process.
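The mechanism described above can be illustrated with a deliberately harmless sketch. The class name Demo is hypothetical and the callable is the built-in len; nothing here is taken from SGLang. It only shows that pickle.loads() calls whatever the byte stream names instead of rebuilding the original object.

```python
import pickle


class Demo:
    """Hypothetical stand-in for an object whose pickle form is attacker-chosen."""

    def __reduce__(self):
        # pickle records this (callable, args) pair at dump time and calls it
        # at load time. A harmless built-in is used here for illustration.
        return (len, ("abcd",))


payload = pickle.dumps(Demo())

# The receiver never gets a Demo instance back: it gets the return value of
# the callable that the payload named.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

The receiving side needs no definition of Demo for this to happen, which is why validating the object after pickle.loads() returns is too late.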
Root Cause
The root cause is unsafe use of pickle.loads() on data received from an untrusted network source. The ZMQ broker channel was treated as a trusted internal channel, but nothing enforced that assumption: there was no binding restriction, no authentication layer (such as CurveZMQ), and no safer serialization format (such as JSON or protobuf).
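As a rough illustration of the safer direction mentioned above, the sketch below frames messages as JSON and authenticates them with an HMAC before parsing. It is not the change made in the SGLang patch; the key, the frame layout, and the function names are assumptions made for this example, and key distribution is out of scope.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret for illustration; a real deployment would load
# this from a secret store rather than hard-code it.
SHARED_KEY = b"replace-with-a-real-secret"
TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def encode_message(obj, key=SHARED_KEY):
    """Serialize obj as JSON and prepend an HMAC-SHA256 tag."""
    body = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def decode_message(frame, key=SHARED_KEY):
    """Verify the tag first, then parse the JSON body."""
    tag, body = frame[:TAG_LEN], frame[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message authentication failed")
    return json.loads(body.decode("utf-8"))
```

JSON can only produce plain data types (dicts, lists, strings, numbers, booleans, null), so even an authenticated sender cannot make the receiver call an arbitrary function during parsing.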
Attack Vector
Exploitation requires network reachability to the SGLang multimodal scheduler ZMQ endpoint. No user interaction or prior authentication is needed. The attacker crafts a malicious pickle object, transmits it through a ZMQ client connection, and the broker deserializes it on receipt. Deployments that bind the broker to 0.0.0.0 or expose it through container networking, Kubernetes services, or shared inference clusters are directly reachable. See the SGLang Security Advisory GHSA-3cp7-c6q2-94xr and the Orca Security RCE analysis for technical details.
Detection Methods for CVE-2026-3059
Indicators of Compromise
- Unexpected child processes spawned by the SGLang Python worker, such as shells, curl, wget, or interpreters launched outside normal model serving flows
- Outbound network connections from SGLang processes to unfamiliar hosts shortly after inbound ZMQ traffic
- New files written under the SGLang working directory or /tmp that contain shell scripts, loaders, or cron entries
- ZMQ traffic to the multimodal scheduler port from clients outside the documented inference cluster
Detection Strategies
- Inspect process trees for python or SGLang workers spawning /bin/sh, bash, python -c, or other interpreters as child processes
- Monitor for anomalous deserialization-related stack frames in application logs, including unhandled exceptions originating in pickle or scheduler_client.py
- Apply network analytics to flag external connections reaching the ZMQ scheduler port on inference hosts
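The process-tree check in the first strategy could be prototyped along the following lines. This is a sketch over already-collected process records, not an integration with any particular telemetry source; the Proc type, the name lists, and the matching rules are assumptions for illustration and would need tuning per environment.

```python
from typing import Iterable, List, NamedTuple, Tuple

# Hypothetical list of child process names worth flagging.
SUSPICIOUS_CHILDREN = {"sh", "bash", "dash", "zsh", "curl", "wget"}


class Proc(NamedTuple):
    pid: int
    ppid: int
    name: str


def suspicious_children(
    procs: Iterable[Proc],
    parent_markers: Tuple[str, ...] = ("python", "sglang"),
) -> List[Tuple[Proc, Proc]]:
    """Return (parent, child) pairs where a serving process spawned a shell or downloader."""
    procs = list(procs)
    by_pid = {p.pid: p for p in procs}
    hits = []
    for child in procs:
        parent = by_pid.get(child.ppid)
        if parent is None:
            continue
        parent_matches = any(m in parent.name.lower() for m in parent_markers)
        if parent_matches and child.name.lower() in SUSPICIOUS_CHILDREN:
            hits.append((parent, child))
    return hits
```

Legitimate tooling sometimes shells out from Python, so results like these are better treated as leads to compare against a baseline than as proof of compromise.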
Monitoring Recommendations
- Forward SGLang stdout, stderr, and crash logs to a centralized logging pipeline and alert on tracebacks referencing pickle.loads
- Capture process telemetry on GPU and inference nodes to baseline expected child processes and detect deviations
- Track ingress traffic to model serving ports and alert on connections from non-allowlisted source addresses
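For the log-alerting recommendation, a minimal filter might look like the sketch below. The pattern list is an assumption based on the strings named in this advisory (pickle.loads, scheduler_client.py) plus the standard UnpicklingError exception name; real log formats vary, and a successful exploit may leave no traceback at all.

```python
import re
from typing import Iterable, List, Tuple

# Strings that suggest a deserialization failure in the affected code path.
PATTERN = re.compile(r"pickle\.loads|UnpicklingError|scheduler_client\.py")


def flag_log_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every log line matching PATTERN."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]
```

A malformed payload tends to produce exactly this kind of traceback, so hits are most useful as a signal of probing rather than of confirmed exploitation.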
How to Mitigate CVE-2026-3059
Immediate Actions Required
- Upgrade SGLang to release v0.5.10 or later, which contains the fix shipped in pull request #20904
- Restrict network reachability of the multimodal scheduler ZMQ broker to loopback or a trusted internal segment until patches are applied
- Rotate any credentials, API keys, or model artifacts that were accessible from hosts running an unpatched SGLang instance
- Audit inference hosts for the indicators of compromise listed above before returning them to service
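A quick way to triage hosts against the fixed release is to compare version strings. The helper below is a simplified sketch: it reads only the leading major.minor.patch numbers, and the function names are hypothetical.

```python
import re
from typing import Tuple

FIXED_RELEASE = (0, 5, 10)  # first release containing the fix, per this advisory


def parse_release(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from strings such as '0.5.9' or 'v0.5.10'."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_patched(version: str) -> bool:
    """True if the release number is at or above the fixed release.

    Suffixes such as 'rc1' or '.post1' are ignored, so a pre-release of the
    fixed version would be reported as patched; check those by hand.
    """
    return parse_release(version) >= FIXED_RELEASE
```

On a host, the installed version could be obtained with importlib.metadata.version("sglang"), assuming the package is installed under that distribution name, and passed to is_patched().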
Patch Information
The maintainers released SGLang v0.5.10, which addresses the unsafe deserialization path in the multimodal scheduler client. Review the fix in pull request #20904 and the GHSA-3cp7-c6q2-94xr advisory for the full list of changes.
Workarounds
- Bind the ZMQ broker to 127.0.0.1 and route all inference traffic through an authenticated reverse proxy when an immediate upgrade is not possible
- Enforce host firewall rules or Kubernetes NetworkPolicies that allow only known client IPs to reach the scheduler port
- Run SGLang workers as a low-privilege account inside a hardened container to limit blast radius if exploitation occurs
- Place the inference cluster behind a mutually authenticated transport such as a service mesh with mTLS until the patched release is deployed
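When auditing configurations for the loopback-binding workaround, a small check on ZMQ endpoint strings can help. The sketch below assumes endpoints are written in the usual ZMQ form (for example tcp://127.0.0.1:5555); how a given SGLang deployment sets its endpoint is not covered here, and the function name is hypothetical.

```python
import ipaddress


def is_local_only_endpoint(endpoint: str) -> bool:
    """Return True if a ZMQ bind endpoint is not reachable from other hosts."""
    scheme, sep, rest = endpoint.partition("://")
    if not sep:
        raise ValueError(f"not a ZMQ endpoint: {endpoint!r}")
    # ipc (Unix domain socket) and inproc (in-process) never leave the host.
    if scheme in ("ipc", "inproc"):
        return True
    if scheme != "tcp":
        raise ValueError(f"unsupported transport: {scheme!r}")
    host, _, _port = rest.rpartition(":")
    host = host.strip("[]")  # IPv6 literals are written as [::1]
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Wildcards ('*'), interface names, and other hostnames are treated
        # as potentially exposed.
        return False
```

A loopback bind only narrows who can connect; other local users and processes on the same host can still reach the socket, so it complements the upgrade rather than replacing it.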
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.


