CVE-2025-9959: Smolagents Sandbox Escape RCE Vulnerability

CVE-2025-9959 Overview

CVE-2025-9959 is a code injection vulnerability affecting HuggingFace's smolagents library. The vulnerability stems from incomplete validation of Python dunder (double underscore) attributes, which allows an attacker to escape the Local Python execution environment sandbox enforced by smolagents. Exploitation requires a prompt injection attack that tricks the AI agent into generating and executing malicious code, which can lead to arbitrary code execution outside the sandboxed environment.

Critical Impact
An attacker can leverage prompt injection to escape the Python sandbox and execute arbitrary code on the host system, potentially compromising confidentiality and integrity of the underlying infrastructure.

Affected Products

HuggingFace smolagents (versions released before the fix in GitHub Pull Request #1551)

Discovery Timeline

2025-09-03 - CVE-2025-9959 published to NVD
2026-04-15 - Last updated in NVD database

Technical Details for CVE-2025-9959

Vulnerability Analysis

This vulnerability is classified under CWE-94 (Improper Control of Generation of Code, 'Code Injection'). The smolagents library provides a Local Python execution environment that is intended to sandbox code executed by AI agents. However, the sandbox validates Python dunder attributes (special attributes with double underscores such as __class__, __bases__, and __subclasses__) incompletely.

Python dunder attributes provide powerful introspection capabilities that can be abused to traverse the object hierarchy and access restricted modules or functions. When the sandbox fails to properly validate all dunder attribute access patterns, an attacker can craft payloads that leverage these attributes to break out of the restricted execution environment.
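The traversal described above can be illustrated with a harmless snippet. This is a minimal sketch that only inspects types and is not the exploit for this CVE; the variable names are illustrative.

```python
# Benign demonstration: only introspection, no restricted object is invoked.
value = ()  # any ordinary object works as a starting point

tuple_type = value.__class__                 # the tuple type
root_type = tuple_type.__base__              # the root type, object
loaded_types = root_type.__subclasses__()    # direct subclasses of object currently loaded

print(tuple_type, root_type)
print(f"{len(loaded_types)} classes reachable from an empty tuple")

# Many of these classes live in modules the running code never imported itself.
reachable_modules = sorted({str(cls.__module__) for cls in loaded_types})
print(reachable_modules[:10])
```

Starting from an empty tuple, three attribute lookups reach classes defined across the interpreter, which is why a sandbox has to treat every dunder access as sensitive.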

Root Cause

The root cause of this vulnerability lies in the incomplete blocklist or allowlist implementation for Python dunder attributes within the smolagents sandbox. The sandbox validation logic fails to account for all possible attribute traversal techniques that can be used to access prohibited objects or modules. This is a common pitfall in Python sandbox implementations, as the language's dynamic nature and rich introspection capabilities make it difficult to fully restrict code execution.
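The following sketch shows why a blocklist of specific dunder names tends to be incomplete. It is a hypothetical checker written for illustration, not the smolagents source; `BLOCKED_DUNDERS`, `naive_check`, and `stricter_check` are assumed names.

```python
import ast

# Hypothetical, deliberately incomplete blocklist (not taken from smolagents).
BLOCKED_DUNDERS = {"__class__", "__subclasses__", "__globals__"}


def is_dunder(name: str) -> bool:
    """Return True for names of the form __something__."""
    return len(name) > 4 and name.startswith("__") and name.endswith("__")


def naive_check(source: str) -> bool:
    """Return True if the code passes a blocklist of specific dunder names."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and node.attr in BLOCKED_DUNDERS:
            return False
    return True


def stricter_check(source: str) -> bool:
    """Return True only if no dunder appears as an attribute or a string literal."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and is_dunder(node.attr):
            return False
        # Catches getattr(obj, "__name__")-style access with a literal name.
        if (isinstance(node, ast.Constant) and isinstance(node.value, str)
                and is_dunder(node.value)):
            return False
    return True


samples = [
    "x = ().__class__",                  # listed name: both checks reject it
    "x = type(()).__mro__",              # unlisted dunder: the blocklist misses it
    "x = getattr((), '__reduce_ex__')",  # string-based access: the blocklist misses it
]
for code in samples:
    print(f"{code!r}: naive={naive_check(code)} stricter={stricter_check(code)}")
```

Even the stricter check is only a static approximation: a name assembled at run time (for example by concatenating string fragments) passes it, so validation also has to happen when the attribute is actually resolved.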

Attack Vector

The attack requires an initial prompt injection to manipulate the AI agent into generating malicious code. Once the agent is tricked into creating code that utilizes specific dunder attribute traversal patterns, the malicious code can escape the sandbox. The attack is network-accessible and requires user interaction (the victim must interact with a compromised prompt or malicious input). Upon successful exploitation, an attacker could achieve code execution with the privileges of the process running the smolagents library.

The exploitation flow typically proceeds as follows:

The attacker crafts a malicious prompt that instructs the agent to generate specific Python code
The agent generates code containing dunder attribute traversal patterns
The sandbox fails to detect the malicious patterns during validation
The code executes and escapes the sandbox, gaining access to restricted functionality

For detailed technical information about the exploitation mechanism, see the JFrog Vulnerability Report.

Detection Methods for CVE-2025-9959

Indicators of Compromise

Unusual Python code patterns in agent logs containing dunder attribute chains (e.g., __class__.__bases__.__subclasses__)
Evidence of prompt injection attempts in user input logs
Unexpected process spawning or file system access from the smolagents process
Anomalous network connections originating from the agent execution environment

Detection Strategies

Implement input validation and sanitization for all prompts submitted to AI agents
Monitor agent-generated code for suspicious dunder attribute access patterns
Deploy runtime application self-protection (RASP) solutions to detect sandbox escape attempts
Enable verbose logging for the smolagents execution environment to capture code execution traces
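Monitoring generated code for dunder access can start with a simple pattern scan over logged snippets. The sketch below assumes the generated code is available as plain strings; the `HIGH_RISK` set and the `flag_suspicious` helper are illustrative choices, not part of smolagents.

```python
import re

# Matches attribute access such as ".__class__", including chained forms.
DUNDER_ACCESS = re.compile(r"\.\s*(__[A-Za-z][A-Za-z0-9_]*__)")
# Matches quoted dunder names, as used with getattr(obj, "__name__").
DUNDER_STRING = re.compile(r"""["'](__[A-Za-z][A-Za-z0-9_]*__)["']""")

# Dunders that ordinary generated code rarely needs; tune this to the workload.
HIGH_RISK = {
    "__class__", "__bases__", "__base__", "__mro__", "__subclasses__",
    "__globals__", "__builtins__", "__dict__", "__getattribute__", "__import__",
}


def flag_suspicious(code: str) -> list[str]:
    """Return the high-risk dunder names referenced in a snippet of generated code."""
    found = DUNDER_ACCESS.findall(code) + DUNDER_STRING.findall(code)
    return sorted({name for name in found if name in HIGH_RISK})


logged_snippets = [
    "result = sum(values) / len(values)",
    "leak = ().__class__.__base__.__subclasses__()",
    "fn = getattr(obj, '__globals__')",
]
for snippet in logged_snippets:
    hits = flag_suspicious(snippet)
    if hits:
        print(f"ALERT {hits}: {snippet}")
```

A text scan like this is a detection aid only. Obfuscated names can evade it, so it complements the patch rather than replacing it.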

Monitoring Recommendations

Configure alerting for code execution events containing known sandbox escape patterns
Monitor process behavior for the smolagents application using endpoint detection and response (EDR) solutions
Implement anomaly detection for file system and network activity from AI agent processes
Review agent interaction logs regularly for signs of prompt injection attacks
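Where the agent's generated code runs inside the same Python interpreter as the host application, CPython audit hooks offer one way to observe process and network events from within that process. This is a minimal sketch; `WATCHED_EVENTS`, `alerts`, and `audit_hook` are illustrative names, and a real deployment would forward events to its own alerting pipeline.

```python
import sys

# Audit events raised by CPython for process creation, shell commands, and sockets.
WATCHED_EVENTS = {
    "subprocess.Popen", "os.system", "os.exec", "os.posix_spawn", "socket.connect",
}

alerts: list[tuple[str, tuple]] = []


def audit_hook(event: str, args: tuple) -> None:
    """Record watched events so they can be reviewed or forwarded."""
    if event in WATCHED_EVENTS:
        alerts.append((event, args))


# Hooks cannot be removed once added, so install this early in the agent process.
sys.addaudithook(audit_hook)

# Simulate an event without spawning anything; real calls raise it automatically.
sys.audit("os.system", b"id")
print(alerts)
```

Audit hooks are a monitoring aid, not a security boundary: code that has already escaped the sandbox may be able to interfere with in-process state, so out-of-process monitoring such as EDR remains necessary.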

How to Mitigate CVE-2025-9959

Immediate Actions Required

Update smolagents to the latest patched version that addresses the dunder attribute validation issue
Review and audit any AI agent deployments using smolagents for signs of compromise
Implement additional input validation layers for prompts submitted to agents
Consider temporarily disabling or restricting access to the Local Python execution environment until patched

Patch Information

A fix for this vulnerability has been merged into the smolagents repository. Organizations should update to a release that includes GitHub Pull Request #1551 and review the pull request for the implementation details of the fix.
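A deployment can check its installed smolagents version against the first fixed release. The sketch below takes that release as a parameter because this advisory does not state a version number; `parse_version`, `is_patched`, and `check_smolagents` are hypothetical helper names.

```python
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Parse the leading numeric components of a version string such as '1.2.3'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_patched(installed: str, first_fixed: str) -> bool:
    """Return True if the installed version is at or above the first fixed release."""
    a, b = parse_version(installed), parse_version(first_fixed)
    width = max(len(a), len(b))
    a = a + (0,) * (width - len(a))  # pad so that '1.2' compares equal to '1.2.0'
    b = b + (0,) * (width - len(b))
    return a >= b


def check_smolagents(first_fixed: str) -> str:
    """Report whether the locally installed smolagents meets the given fixed version."""
    try:
        installed = metadata.version("smolagents")
    except metadata.PackageNotFoundError:
        return "smolagents is not installed"
    status = "patched" if is_patched(installed, first_fixed) else "VULNERABLE"
    return f"smolagents {installed}: {status} (first fixed release: {first_fixed})"


# Placeholder version numbers for illustration, not real smolagents releases.
print(is_patched("2.0.1", "2.1.0"))
```

Call `check_smolagents` with the first release that contains the fix from Pull Request #1551. The parser ignores pre-release suffixes, so a production check would be better served by a full version-parsing library.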

Workarounds

Implement strict prompt filtering to detect and block potential prompt injection attempts before they reach the agent
Deploy network segmentation to isolate AI agent execution environments from critical infrastructure
Use containerization or virtual machines to provide an additional layer of isolation for the execution environment
Disable the Local Python executor and use alternative execution backends if available

bash

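# Example: restrict the smolagents process with a hardened container
# (your-smolagents-container is a placeholder image name).
# Note: --network=none also blocks calls to remote model APIs; relax it if the agent needs them.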
docker run --rm \
  --security-opt=no-new-privileges \
  --cap-drop=ALL \
  --read-only \
  --network=none \
  your-smolagents-container