CVE-2026-7669 Overview
CVE-2026-7669 is a code injection vulnerability in sgl-project SGLang versions up to and including 0.5.9. The flaw resides in the get_tokenizer function within python/sglang/srt/utils/hf_transformers_utils.py, which handles HuggingFace transformer tokenizer loading. When a caller passes trust_remote_code=False, SGLang silently re-invokes AutoTokenizer.from_pretrained with trust_remote_code=True, overriding the explicit security setting. A model repository containing a malicious tokenizer.py referenced via auto_map in tokenizer_config.json will execute arbitrary Python in the SGLang process. The weakness is classified under CWE-74 (Improper Neutralization of Special Elements in Output Used by a Downstream Component, 'Injection').
Critical Impact
Loading an attacker-controlled HuggingFace model in SGLang executes arbitrary Python code in the inference process, even when callers explicitly set trust_remote_code=False.
Affected Products
- sgl-project SGLang versions up to and including 0.5.9
- Deployments using HuggingFace transformers==5.3.0 (pinned in pyproject.toml)
- Both tokenizer_mode="auto" and tokenizer_mode="slow" configurations
Discovery Timeline
- 2026-05-02 - CVE-2026-7669 published to NVD
- 2026-05-05 - Last updated in NVD database
Technical Details for CVE-2026-7669
Vulnerability Analysis
The vulnerability emerges from an interaction between SGLang's tokenizer loading logic and HuggingFace transformers v5. When get_tokenizer() requests a tokenizer with trust_remote_code=False, transformers v5 returns a TokenizersBackend instance as the generic fallback for tokenizer classes not present in its registry. SGLang treats this fallback as a failure and retries the call with trust_remote_code=True to recover. This silent escalation overrides the caller's explicit security boundary without emitting any log line or warning. Because transformers==5.3.0 is pinned in pyproject.toml, every current SGLang release exhibits the behavior. The exploit is public, and the vendor did not respond to early disclosure outreach.
Root Cause
The root cause is an unsafe fallback path that re-issues the tokenizer load with elevated trust when the first attempt does not return a recognized tokenizer class. The retry ignores the security intent encoded in the original trust_remote_code=False argument. The condition triggering the retry is reachable for any tokenizer class HuggingFace v5 routes through TokenizersBackend, which is the generic catch-all path.
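For illustration, the simplified sketch below reconstructs the unsafe fallback pattern described above. It is not the actual SGLang source; the function name and the class-name check are assumptions based on the public description of the flaw.

# Illustrative reconstruction of the unsafe fallback pattern described
# above -- NOT the actual SGLang source code
from transformers import AutoTokenizer

def get_tokenizer_sketch(name, trust_remote_code=False):
    tokenizer = AutoTokenizer.from_pretrained(
        name, trust_remote_code=trust_remote_code
    )
    # Flawed recovery logic: transformers v5 returns a generic
    # TokenizersBackend for tokenizer classes missing from its registry,
    # and the retry below treats that as a failure and escalates trust,
    # silently discarding the caller's trust_remote_code=False.
    if type(tokenizer).__name__ == "TokenizersBackend":
        tokenizer = AutoTokenizer.from_pretrained(
            name, trust_remote_code=True
        )
    return tokenizer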
Attack Vector
An attacker publishes a HuggingFace model repository containing a tokenizer_config.json with an auto_map entry pointing at a malicious tokenizer.py. When an SGLang operator or downstream service loads that model identifier, the second AutoTokenizer.from_pretrained call honors auto_map and imports the attacker's Python module. Code execution occurs in the SGLang process context, with access to model weights, GPU memory, environment secrets, and any network reachability the inference host has. The attack is network-reachable but requires the target to load a specific model, contributing to the high attack complexity rating.
No verified exploit code is reproduced here. See the GitHub PoC Repository and VulDB Vulnerability #360817 for technical artifacts.
Detection Methods for CVE-2026-7669
Indicators of Compromise
- Unexpected child processes or outbound network connections originating from the SGLang Python process after a model load
- Presence of auto_map entries in tokenizer_config.json of cached HuggingFace models under ~/.cache/huggingface/
- Loaded modules in the SGLang process with paths inside HuggingFace cache directories rather than site-packages
- Filesystem writes or credential access from the inference worker shortly after a new model identifier is requested
Detection Strategies
- Audit all SGLang model load requests and correlate the model repository identifier against an allowlist of trusted publishers
- Inspect tokenizer_config.json for any auto_map keys before permitting a model into the serving environment
- Hook or instrument AutoTokenizer.from_pretrained to log the effective trust_remote_code value and alert on True when the caller passed False (see the sketch after this list)
- Monitor for Python import events sourced from cache paths using EDR or eBPF-based file-execution telemetry
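The sketch below illustrates the instrumentation strategy from the third bullet above, assuming a standard transformers installation; the wrapper, logger name, and log format are our own and should be adapted to local tooling. It must run before SGLang constructs any tokenizer.

# Detection sketch: log the effective trust_remote_code value on every
# tokenizer load; wrapper and log format are illustrative
import logging

from transformers import AutoTokenizer

log = logging.getLogger("tokenizer-audit")
_original = AutoTokenizer.from_pretrained.__func__  # unwrap classmethod

def _audited(cls, *args, **kwargs):
    name = args[0] if args else kwargs.get("pretrained_model_name_or_path")
    effective = kwargs.get("trust_remote_code", False)
    log.warning("from_pretrained(%r) trust_remote_code=%s", name, effective)
    if effective:
        # An effective True after the external caller passed False is
        # the CVE-2026-7669 escalation signature
        log.error("remote-code tokenizer load attempted for %r", name)
    return _original(cls, *args, **kwargs)

AutoTokenizer.from_pretrained = classmethod(_audited)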
Monitoring Recommendations
- Forward SGLang stdout, stderr, and Python audit hook events into a centralized log pipeline for retention and analytics (an audit hook sketch follows this list)
- Alert on any process spawned by the inference worker that is not in a known-good baseline (shell, curl, wget, ssh)
- Track egress connections from inference hosts to non-HuggingFace destinations during model bootstrap windows
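One way to surface such events from inside the process is CPython's audit hook mechanism (PEP 578), sketched below; the cache path and the alert action are deployment-specific assumptions.

# Monitoring sketch: flag Python imports resolved from the HuggingFace
# cache instead of site-packages; install before any model load
import os
import sys

HF_CACHE = os.path.expanduser("~/.cache/huggingface")

def _audit(event, args):
    # The "import" audit event carries (module, filename, path,
    # meta_path, path_hooks); filename is None for built-ins
    if event == "import" and args[1] and str(args[1]).startswith(HF_CACHE):
        print(f"ALERT: import from HF cache: {args[0]} ({args[1]})",
              file=sys.stderr)

sys.addaudithook(_audit)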
How to Mitigate CVE-2026-7669
Immediate Actions Required
- Restrict SGLang deployments to load only models from a vetted internal registry or specific allowlisted HuggingFace repositories
- Run SGLang inference workers as unprivileged users inside containers with read-only filesystems and no outbound internet beyond model registries
- Pre-fetch and audit tokenizer artifacts in an isolated environment, rejecting any model whose tokenizer_config.json contains an auto_map entry (a sketch of this check follows this list)
- Block or proxy huggingface.co traffic from production inference hosts and serve approved models from an internal mirror
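The sketch below implements the pre-admission check from the third action above: it walks a pre-fetched model directory and rejects anything whose tokenizer_config.json declares an auto_map. The directory layout and exit-code convention are assumptions to adapt to your registry workflow.

# Mitigation sketch: reject pre-fetched models whose tokenizer config
# declares an auto_map entry (the remote-code hook)
import json
import pathlib
import sys

def audit_model_dir(model_dir):
    """Return True if the model directory is safe to admit."""
    for cfg in pathlib.Path(model_dir).rglob("tokenizer_config.json"):
        config = json.loads(cfg.read_text())
        if "auto_map" in config:
            print(f"REJECT: {cfg} declares auto_map={config['auto_map']}")
            return False
    return True

if __name__ == "__main__":
    sys.exit(0 if audit_model_dir(sys.argv[1]) else 1)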
Patch Information
No vendor patch has been published for SGLang at the time of NVD disclosure. The vendor was contacted prior to public disclosure but did not respond. Track the VulDB Vulnerability #360817 entry and the SGLang project for upstream fixes, and pin to a fixed release once available.
Workarounds
- Patch get_tokenizer locally to remove the fallback that re-invokes AutoTokenizer.from_pretrained with trust_remote_code=True
- Downgrade transformers below v5 if compatibility allows, since the TokenizersBackend fallback path is the trigger
- Wrap AutoTokenizer.from_pretrained with a monkeypatch that forces trust_remote_code=False regardless of internal callers (see the sketch after this list)
- Strip auto_map from any cached tokenizer_config.json before loading
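A minimal sketch of the monkeypatch workaround from the third bullet follows. It pins trust_remote_code=False for every caller, including SGLang's internal retry, so it also breaks models that legitimately need remote code; pair it with repository allowlisting. It must run before SGLang creates any tokenizer.

# Workaround sketch: force trust_remote_code=False for all callers,
# including SGLang's internal escalating retry
from transformers import AutoTokenizer

_original = AutoTokenizer.from_pretrained.__func__  # unwrap classmethod

def _pinned(cls, *args, **kwargs):
    kwargs["trust_remote_code"] = False  # override any internal True
    return _original(cls, *args, **kwargs)

AutoTokenizer.from_pretrained = classmethod(_pinned)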
# Configuration example: scan cached tokenizer configs for auto_map abuse
find ~/.cache/huggingface -name 'tokenizer_config.json' \
    -exec grep -l '"auto_map"' {} \;

# Launch SGLang fully offline against a pre-vetted local model; the
# offline flags prevent any fetch from huggingface.co during bootstrap
export HF_HUB_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
python -m sglang.launch_server --model-path /opt/models/approved/llama-3 \
    --tokenizer-mode slow