CVE-2025-32375 Overview
CVE-2025-32375 is an insecure deserialization vulnerability affecting BentoML, a popular Python library for building online serving systems optimized for AI applications and model inference. The vulnerability exists in BentoML's runner server component, where improper handling of deserialized data allows attackers to execute arbitrary code on vulnerable systems. By crafting malicious POST requests with specific headers and parameters, unauthenticated remote attackers can achieve remote code execution, gaining initial access and potentially exposing sensitive information on the target server.
Critical Impact
This insecure deserialization vulnerability enables unauthenticated remote code execution on BentoML runner servers, potentially compromising AI/ML infrastructure and exposing sensitive model data and server information.
Affected Products
- BentoML versions prior to 1.4.8
- BentoML runner server deployments
- AI/ML serving infrastructure using vulnerable BentoML versions
Discovery Timeline
- 2025-04-09 - CVE-2025-32375 published to NVD
- 2025-04-22 - Last updated in NVD database
Technical Details for CVE-2025-32375
Vulnerability Analysis
This vulnerability falls under CWE-502 (Deserialization of Untrusted Data), a well-known weakness class that can lead to remote code execution. The BentoML runner server deserializes data taken from incoming POST requests without first establishing that the data is trustworthy, which gives attackers an opportunity to inject malicious payloads.
In Python-based applications, insecure deserialization commonly occurs when using modules like pickle or similar serialization libraries that can execute arbitrary code during the deserialization process. When an attacker controls the data being deserialized, they can craft payloads that execute system commands, establish reverse shells, or perform other malicious actions upon deserialization.
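To make the mechanism concrete, here is a minimal and deliberately harmless sketch of why unpickling untrusted bytes is dangerous. It is a generic Python illustration, not BentoML code and not a working exploit: the class name `Demo` is hypothetical, and the callable invoked during loading is the built-in `len`.

```python
import pickle


class Demo:
    """Hypothetical class whose pickled form names a callable to run on load."""

    def __reduce__(self):
        # pickle records "call len('abc')" instead of the object's state.
        # An attacker would put a dangerous callable in this position.
        return (len, ("abc",))


payload = pickle.dumps(Demo())

# Loading the bytes calls len("abc"); no Demo instance is ever rebuilt.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

The point of the sketch is that whoever produces the bytes, not the code that loads them, decides which callable runs during `pickle.loads`.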
The impact of successful exploitation is significant: attackers gain the ability to execute arbitrary code with the privileges of the BentoML service, potentially compromising the entire AI/ML serving infrastructure, accessing trained models, and pivoting to other systems within the network.
Root Cause
The root cause of CVE-2025-32375 is the insecure handling of user-supplied data during deserialization in BentoML's runner server. The application fails to properly validate or restrict the types of objects that can be deserialized from incoming requests. This allows attackers to embed malicious serialized objects that execute arbitrary code when processed by the server's deserialization routines.
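As an illustration of what "restricting the types of objects that can be deserialized" can look like, the sketch below follows the allow-list pattern described in the Python `pickle` documentation. It is not the change BentoML shipped in 1.4.8; the names `RestrictedUnpickler`, `restricted_loads`, and the contents of `SAFE_BUILTINS` are assumptions chosen for the example.

```python
import builtins
import io
import pickle

# Assumed allow-list for this example; a real service would tailor it.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only explicitly allowed built-ins; reject every other global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads, but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps([1, 2, range(3)])))
```

An allow-list narrows what a payload can reference, but avoiding pickle entirely for data that crosses a trust boundary remains the more robust design.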
Attack Vector
The attack is conducted over the network and requires no authentication or user interaction. An attacker exploits the vulnerability by sending specially crafted POST requests to the BentoML runner server endpoint. By setting specific headers and parameters in the request, the attacker causes the server to pass attacker-supplied data to the vulnerable deserialization code.
The exploitation process involves:
- Identifying a BentoML runner server exposed to the network
- Crafting a malicious serialized payload designed for Python deserialization
- Sending the payload via POST request with specific headers and parameters
- Triggering arbitrary code execution when the server deserializes the malicious data
For detailed technical information about this vulnerability, refer to the GitHub Security Advisory GHSA-7v4r-c989-xh26.
Detection Methods for CVE-2025-32375
Indicators of Compromise
- Unusual POST requests to BentoML runner server endpoints with malformed or suspicious headers
- Unexpected child processes spawned by the BentoML service
- Anomalous network connections initiated from the BentoML server to external addresses
- Evidence of serialized Python objects in request payloads (base64-encoded pickle data patterns)
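For triaging captured request bodies against the last indicator, a heuristic along the following lines may help. It uses `pickletools.genops`, which disassembles a pickle stream without executing it, and flags streams that both import a global and call or instantiate one. The function names and the opcode groupings are assumptions made for this sketch, and it only recognizes streams that parse cleanly up to the STOP opcode, so treat it as a log-analysis aid rather than a security control.

```python
import base64
import binascii
import pickle
import pickletools

# Opcodes that import a global, and opcodes that call or instantiate one.
IMPORT_OPCODES = {"GLOBAL", "STACK_GLOBAL", "INST"}
CALL_OPCODES = {"REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


def pickle_opcode_names(data):
    """Return the opcode names if `data` parses as a complete pickle, else an empty set."""
    names = set()
    try:
        # genops only disassembles the stream; nothing is executed.
        for opcode, _arg, _pos in pickletools.genops(data):
            names.add(opcode.name)
    except Exception:
        return set()
    return names


def looks_like_code_bearing_pickle(data):
    """Check the raw bytes and, if they are valid base64, the decoded bytes too."""
    candidates = [data]
    try:
        candidates.append(base64.b64decode(data, validate=True))
    except (binascii.Error, ValueError):
        pass
    for candidate in candidates:
        names = pickle_opcode_names(candidate)
        if names & IMPORT_OPCODES and names & CALL_OPCODES:
            return True
    return False


print(looks_like_code_bearing_pickle(pickle.dumps(range(3))))
```

Legitimate pickles of custom classes also match this pattern, so a hit means "inspect this request", not "this request is malicious".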
Detection Strategies
- Monitor HTTP traffic to BentoML endpoints for requests containing suspicious serialized data patterns
- Implement application-level logging to capture all deserialization operations and flag anomalies
- Deploy Web Application Firewall (WAF) rules to detect and block known Python deserialization payloads
- Use endpoint detection and response (EDR) solutions to identify unauthorized code execution from the BentoML process
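One possible way to capture deserialization activity at the application level is a CPython audit hook (Python 3.8 or later). The interpreter raises a `pickle.find_class` audit event whenever an unpickler resolves a global, so a hook can record which modules and names are being loaded. The sketch below is generic Python, not a BentoML feature; the hook name and the in-memory list are assumptions, and a real deployment would forward the records to its logging pipeline instead.

```python
import pickle
import sys

# (module, name) pairs resolved by any unpickler in this process.
resolved_globals = []


def pickle_audit_hook(event, args):
    # CPython raises "pickle.find_class" each time an unpickler resolves a global.
    if event == "pickle.find_class":
        module, name = args
        resolved_globals.append((module, name))


# Audit hooks cannot be removed once installed, so register exactly once.
sys.addaudithook(pickle_audit_hook)

# Harmless demonstration: unpickling a range object resolves builtins.range.
pickle.loads(pickle.dumps(range(3)))
print(resolved_globals)
```

Alerting on unexpected entries, such as globals from `os` or `subprocess`, would be a reasonable next step, though the hook only observes activity and does not block it unless it raises an exception.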
Monitoring Recommendations
- Enable verbose logging on BentoML runner servers and forward logs to a centralized SIEM platform
- Set up alerts for unexpected process creation or network connections from BentoML service accounts
- Monitor for signs of data exfiltration or lateral movement originating from ML infrastructure
- Implement file integrity monitoring on BentoML installation directories
How to Mitigate CVE-2025-32375
Immediate Actions Required
- Upgrade BentoML to version 1.4.8 or later immediately
- Audit which BentoML runner servers are reachable from untrusted networks, and restrict access to known clients using firewalls or network segmentation
- Review logs for any signs of exploitation attempts targeting BentoML endpoints
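One way to start auditing network exposure is to test, from a host outside the trusted network, whether runner ports accept TCP connections. The sketch below is a minimal reachability check; the function name, the host, and port 3000 are placeholders assumed for the example, and it should only be run against systems you are authorized to test.

```python
import socket


def is_port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical inventory of runner endpoints; replace with your own.
    for host, port in [("127.0.0.1", 3000)]:
        status = "reachable" if is_port_reachable(host, port) else "not reachable"
        print(f"{host}:{port} is {status}")
```

A port that is reachable from an untrusted vantage point is a candidate for the firewall or segmentation controls listed above.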
Patch Information
BentoML has addressed this vulnerability in version 1.4.8. Organizations should upgrade to this version or later to remediate the insecure deserialization issue. Patch details are available in the GitHub Security Advisory GHSA-7v4r-c989-xh26.
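To confirm whether a given environment still needs the upgrade, a small check like the following can compare the installed version against 1.4.8. It is a sketch using only the standard library: the helper names are assumptions, and the parser looks only at the numeric release segment, so a pre-release such as `1.4.8rc1` would be treated as 1.4.8.

```python
import re
from importlib import metadata

FIXED_VERSION = (1, 4, 8)  # first fixed release, per the advisory text


def release_tuple(version):
    """Parse the leading numeric release segment, e.g. '1.4.7' -> (1, 4, 7)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # pad '1.4' to (1, 4, 0)
    return tuple(parts)


def is_vulnerable(version):
    """True if the version is older than the first fixed release."""
    return release_tuple(version) < FIXED_VERSION


def installed_bentoml_version():
    """Return the installed bentoml version string, or None if not installed."""
    try:
        return metadata.version("bentoml")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_bentoml_version()
    if version is None:
        print("bentoml is not installed in this environment")
    else:
        print(f"bentoml {version}: vulnerable={is_vulnerable(version)}")
```

Run it inside each virtual environment or container image that serves models, since different deployments may pin different versions.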
Workarounds
- Place BentoML runner servers behind a reverse proxy with strict request filtering
- Implement network segmentation to isolate ML serving infrastructure from untrusted networks
- Use authentication mechanisms at the network or application layer to restrict access to runner endpoints
- Consider deploying a WAF with rules to detect and block serialized Python object payloads
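# Network isolation example using iptables
# Restrict access to the BentoML runner server port to trusted networks only.
# Port 3000 and 10.0.0.0/8 are assumed here; substitute the port your runner
# server actually listens on and your own trusted address range.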
iptables -A INPUT -p tcp --dport 3000 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 3000 -j DROP


