CVE-2026-5760 Overview
CVE-2026-5760 is a critical Remote Code Execution (RCE) vulnerability affecting SGLang's reranking endpoint (/v1/rerank). SGLang renders a model's Jinja2 chat template with an unsandboxed jinja2.Environment(), so a model file that carries a malicious tokenizer.chat_template can run attacker-controlled code when that template is rendered. This allows attackers to execute arbitrary code on systems running vulnerable versions of SGLang.
Critical Impact
Once a model carrying a malicious chat template has been loaded, an unauthenticated remote request can trigger full code execution on the affected SGLang server through unsandboxed Jinja2 template rendering, potentially leading to complete system compromise.
Affected Products
- SGLang versions utilizing the /v1/rerank endpoint with unsandboxed Jinja2 template rendering
Discovery Timeline
- 2026-04-20 - CVE-2026-5760 published to NVD
- 2026-04-20 - Last updated in NVD database
Technical Details for CVE-2026-5760
Vulnerability Analysis
This vulnerability is classified under CWE-94 (Improper Control of Generation of Code, 'Code Injection'). The core issue lies in how SGLang processes Jinja2 templates within model tokenizer configurations. When a loaded model file contains a malicious tokenizer.chat_template, the template is rendered without sandboxing. Jinja2's default Environment class exposes Python object attributes to template expressions, which lets an attacker break out of the template context and execute arbitrary Python code.
The vulnerability is particularly dangerous because rendering can be triggered remotely through the /v1/rerank API endpoint without any authentication. An attacker who can supply or manipulate model files can embed malicious Jinja2 template code, which then executes when a rerank request causes the loaded model's chat template to be rendered.
Root Cause
The root cause is the use of an unsandboxed jinja2.Environment() for rendering chat templates from model tokenizer files. Jinja2's standard environment allows template authors to access Python objects, traverse object hierarchies, and ultimately execute arbitrary code through special attributes and methods. Without sandboxing mechanisms like jinja2.sandbox.SandboxedEnvironment, malicious templates can exploit Python's introspection capabilities to escape the template context.
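To make the introspection path concrete, the following minimal sketch uses plain Python rather than a template and stops at enumeration. It shows how an ordinary value, here an empty string, leads to the base object class and from there to the direct subclasses of object currently loaded in the interpreter. This is the kind of traversal an unsandboxed template expression can perform. The function name reachable_classes is illustrative and is not part of SGLang or Jinja2.

```python
def reachable_classes(value):
    """Return the names of the classes reachable from an arbitrary value.

    Only special attributes are used, with no imports and no builtins
    lookup, mirroring what a template expression has available.
    """
    # The last entry of any method resolution order is always `object`.
    base = value.__class__.__mro__[-1]
    # Direct subclasses of `object` that the interpreter has loaded so far.
    return sorted({cls.__name__ for cls in base.__subclasses__()})


names = reachable_classes("")
print(f"{len(names)} classes reachable from an empty string")
```

A sandboxed environment refuses the first step (access to an underscore-prefixed attribute), which is why the traversal cannot start there.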
Attack Vector
The attack is network-based and requires no privileges or user interaction. An attacker can exploit this vulnerability by:
- Crafting a malicious model file containing a weaponized tokenizer.chat_template with Jinja2 Server-Side Template Injection (SSTI) payloads
- Having the SGLang server load this malicious model (through model repository poisoning, man-in-the-middle attacks during model downloads, or direct model file manipulation)
- Triggering the reranking endpoint (/v1/rerank) which causes the malicious template to be rendered
- Achieving arbitrary code execution through Jinja2 SSTI techniques that access Python runtime functions
The vulnerability can be exploited through common Jinja2 SSTI techniques that leverage Python's object model to access dangerous functions like os.system(), subprocess.Popen(), or similar code execution primitives. For detailed technical information, refer to the GitHub RCE PoC Repository.
Detection Methods for CVE-2026-5760
Indicators of Compromise
- Unusual subprocess spawning from SGLang server processes
- Suspicious network connections originating from the SGLang application
- Unexpected file system modifications in directories accessible to the SGLang process
- Anomalous model files containing complex or obfuscated Jinja2 template syntax
Detection Strategies
- Monitor for suspicious Jinja2 template patterns in model configuration files, particularly those containing Python object traversal syntax (e.g., __class__, __mro__, __subclasses__, __globals__)
- Implement network-level monitoring for outbound connections from SGLang servers to unexpected destinations
- Deploy endpoint detection rules to identify process injection or child process spawning from the SGLang application
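The template pattern check above can be sketched as a small heuristic scanner. This is an illustration under stated assumptions, not a complete detector: the pattern list extends the attributes named above with __builtins__, __import__, os.system, and subprocess as assumptions of this sketch, the function name find_suspicious_tokens is hypothetical, and extracting the chat template string from a model file is left out. Obfuscated payloads (for example, attribute names assembled from string fragments) will not match.

```python
import re

# Heuristic patterns only; an assumption of this sketch, not an exhaustive list.
SUSPICIOUS = re.compile(
    r"__(?:class|mro|subclasses|globals|builtins|import)__"
    r"|\bos\.system\b|\bsubprocess\b"
)


def find_suspicious_tokens(chat_template):
    """Return the sorted, de-duplicated suspicious tokens found in a template."""
    return sorted({match.group(0) for match in SUSPICIOUS.finditer(chat_template)})


benign = "{% for m in messages %}{{ m['role'] }}: {{ m['content'] }}{% endfor %}"
probe = "{{ x.__class__.__mro__ }}"

print(find_suspicious_tokens(benign))
print(find_suspicious_tokens(probe))
```

A non-empty result is a reason to inspect the template manually; an empty result does not prove the template is safe.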
Monitoring Recommendations
- Enable detailed logging for all requests to the /v1/rerank endpoint
- Implement file integrity monitoring for model directories and tokenizer configuration files
- Configure alerts for process execution patterns indicative of code injection, such as shells or interpreters spawned by the SGLang process
How to Mitigate CVE-2026-5760
Immediate Actions Required
- Review all loaded model files for suspicious tokenizer.chat_template content
- Restrict network access to the /v1/rerank endpoint to trusted sources only
- Implement model file validation and integrity checking before loading
- Consider disabling the reranking endpoint if not actively required
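Integrity checking before load can be as simple as comparing a model file's SHA-256 digest against a value recorded when the file was vetted. The sketch below assumes such a trusted digest is available from a source the attacker cannot modify; the function names sha256_of_file and verify_model_file are illustrative and are not part of SGLang.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_file(path, expected_sha256):
    """Return True only if the file's digest matches the trusted value."""
    actual = sha256_of_file(path)
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A deployment script would call verify_model_file before starting the server and refuse to continue on a mismatch. A digest check detects tampering after vetting; it does not help if the file was malicious when the digest was recorded.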
Patch Information
Organizations should monitor the SGLang project for security updates that implement sandboxed Jinja2 template rendering. The fix should involve replacing jinja2.Environment() with jinja2.sandbox.SandboxedEnvironment() or implementing equivalent protections. Review the CERT Vulnerability Advisory for official guidance and updated patch information.
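The difference between the two environments can be observed with a benign probe template. The following is a minimal sketch, not the SGLang patch: the probe string and the helper render_probe are illustrative, and the probe only counts the classes in a string's method resolution order.

```python
from jinja2 import Environment
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment

# Benign probe: walks from a string to its class hierarchy and nothing more.
PROBE = "{{ x.__class__.mro() | length }}"


def render_probe(env):
    """Render PROBE in `env`; return the output, or None if the sandbox refuses."""
    try:
        return env.from_string(PROBE).render(x="")
    except SecurityError:
        return None


unsandboxed = render_probe(Environment())          # traversal succeeds
sandboxed = render_probe(SandboxedEnvironment())   # traversal is refused

print("unsandboxed:", unsandboxed)
print("sandboxed:", sandboxed)
```

The default environment renders the probe, while the sandboxed one raises SecurityError as soon as the template uses the result of the underscore-prefixed attribute access. Sandboxing narrows what a template can reach; it complements provenance controls on model files and does not replace them.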
Workarounds
- Deploy a web application firewall (WAF) to filter requests to the /v1/rerank endpoint
- Implement strict model file provenance controls, only loading models from verified and trusted sources
- Run SGLang in a containerized environment with restricted capabilities and network isolation
- Apply principle of least privilege to the SGLang process, limiting its access to system resources
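# Example: dropping plaintext HTTP packets that contain the rerank path, using iptables
# Assumes the server listens on TCP port 8080; adjust the port and rules for your deployment
# Note: the string match inspects raw packet payloads, so it has no effect on TLS-encrypted traffic
iptables -A INPUT -p tcp --dport 8080 -m string --string "/v1/rerank" --algo kmp -j DROP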