CVE-2025-32375: BentoML Insecure Deserialization RCE Flaw

CVE-2025-32375 Overview

CVE-2025-32375 is an insecure deserialization vulnerability in BentoML, a popular Python library for building online serving systems optimized for AI applications and model inference. The flaw resides in BentoML's runner server component, which deserializes request data without adequate safeguards. By sending a crafted POST request with specific headers and parameters, an unauthenticated remote attacker can execute arbitrary code on the server, gaining initial access and potentially exposing sensitive information.

Critical Impact
This insecure deserialization vulnerability enables unauthenticated remote code execution on BentoML runner servers, potentially compromising AI/ML infrastructure and exposing sensitive model data and server information.

Affected Products

BentoML versions prior to 1.4.8
BentoML runner server deployments
AI/ML serving infrastructure using vulnerable BentoML versions

Discovery Timeline

2025-04-09 - CVE-2025-32375 published to NVD
2025-04-22 - Last updated in NVD database

Technical Details for CVE-2025-32375

Vulnerability Analysis

This vulnerability falls under CWE-502 (Deserialization of Untrusted Data), a well-known weakness class that can lead to remote code execution. The BentoML runner server deserializes data from incoming POST requests without first validating it, giving attackers an opportunity to inject malicious payloads.

In Python-based applications, insecure deserialization commonly occurs when using modules like pickle or similar serialization libraries that can execute arbitrary code during the deserialization process. When an attacker controls the data being deserialized, they can craft payloads that execute system commands, establish reverse shells, or perform other malicious actions upon deserialization.
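To make the mechanism concrete, the following minimal Python sketch shows how a pickle stream can name a callable that runs at load time. The class name `Marker` and the harmless call to `sorted` are illustrative assumptions; they are not taken from BentoML or from any published exploit. An attacker would substitute a dangerous callable.

```python
import pickle


class Marker:
    """Benign stand-in for an attacker-controlled object (hypothetical)."""

    def __reduce__(self):
        # pickle records "call sorted([3, 1, 2])" in the stream.
        # A real payload would name a dangerous callable here instead.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Marker())

# Loading the bytes does not rebuild a Marker; it invokes the recorded callable.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

The loaded value is whatever the recorded callable returns, which is why deserializing untrusted pickle data is equivalent to letting the sender choose a function for the server to run.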

The impact of successful exploitation is significant: attackers can execute arbitrary code with the privileges of the BentoML service, potentially compromising the entire AI/ML serving infrastructure, accessing trained models, and pivoting to other systems within the network.

Root Cause

The root cause of CVE-2025-32375 is the insecure handling of user-supplied data during deserialization in BentoML's runner server. The application fails to properly validate or restrict the types of objects that can be deserialized from incoming requests. This allows attackers to embed malicious serialized objects that execute arbitrary code when processed by the server's deserialization routines.
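One general way to restrict which object types can be deserialized is the allow-list pattern that the Python documentation describes for `pickle.Unpickler.find_class`. The sketch below illustrates that general technique only; it is not BentoML's actual fix, and the `ALLOWED_GLOBALS` set and the `restricted_loads` helper are hypothetical names chosen for this example.

```python
import io
import pickle

# Hypothetical allow-list: only these (module, name) pairs may be resolved.
ALLOWED_GLOBALS = {
    ("builtins", "list"),
    ("builtins", "dict"),
    ("builtins", "set"),
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks to import a global object.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers load normally because they reference no globals.
print(restricted_loads(pickle.dumps([1, 2, {"a": 3}])))

# A stream that references a callable outside the allow-list is rejected.
try:
    restricted_loads(pickle.dumps(sorted))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

An allow-list narrows the attack surface but does not make pickle safe for untrusted input in general; avoiding pickle for network-facing data remains the more robust design.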

Attack Vector

The attack is conducted over the network and requires no authentication or user interaction. An attacker can exploit this vulnerability by sending specially crafted POST requests to the BentoML runner server endpoint. The malicious payload is embedded in specific headers and parameters, which are then processed by the vulnerable deserialization code.

The exploitation process involves:

Identifying a BentoML runner server exposed to the network
Crafting a malicious serialized payload designed for Python deserialization
Sending the payload via POST request with specific headers and parameters
Triggering arbitrary code execution when the server deserializes the malicious data

For detailed technical information about this vulnerability, refer to the GitHub Security Advisory GHSA-7v4r-c989-xh26.

Detection Methods for CVE-2025-32375

Indicators of Compromise

Unusual POST requests to BentoML runner server endpoints with malformed or suspicious headers
Unexpected child processes spawned by the BentoML service
Anomalous network connections initiated from the BentoML server to external addresses
Evidence of serialized Python objects in request payloads (base64-encoded pickle data patterns)
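The last indicator above can be checked without ever unpickling the data. The sketch below is a heuristic, assuming captured request bodies are available as bytes. It uses the standard library's `pickletools.genops` to list opcodes and flags those that import or call objects. The `RISKY_OPCODES` set and the function names are assumptions made for this example, not part of any BentoML or vendor tooling.

```python
import base64
import binascii
import pickle
import pickletools

# Opcodes that import or call objects; plain data pickles rarely need them.
RISKY_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST",
    "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def risky_pickle_opcodes(data: bytes) -> set:
    """Return risky opcode names in a complete pickle stream, without loading it."""
    found = set()
    try:
        for opcode, _arg, _pos in pickletools.genops(data):
            if opcode.name in RISKY_OPCODES:
                found.add(opcode.name)
    except Exception:
        # Not a well-formed stream ending in STOP: treat as "not a pickle".
        return set()
    return found


def scan_body(body: bytes) -> set:
    """Scan raw bytes first, then a base64-decoded copy if nothing was found."""
    found = risky_pickle_opcodes(body)
    if not found:
        try:
            found = risky_pickle_opcodes(base64.b64decode(body, validate=True))
        except (binascii.Error, ValueError):
            pass
    return found


benign = pickle.dumps([1, 2, 3])
suspicious = pickle.dumps(sorted)  # a bare reference to a global callable
print(scan_body(benign))
print(scan_body(suspicious))
print(scan_body(base64.b64encode(suspicious)))
```

Because the scanner only accepts complete streams, a deliberately truncated payload would be missed, so a result of "nothing found" should not be read as proof that a body is harmless.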

Detection Strategies

Monitor HTTP traffic to BentoML endpoints for requests containing suspicious serialized data patterns
Implement application-level logging to capture all deserialization operations and flag anomalies
Deploy Web Application Firewall (WAF) rules to detect and block known Python deserialization payloads
Use endpoint detection and response (EDR) solutions to identify unauthorized code execution from the BentoML process

Monitoring Recommendations

Enable verbose logging on BentoML runner servers and forward logs to a centralized SIEM platform
Set up alerts for unexpected process creation or network connections from BentoML service accounts
Monitor for signs of data exfiltration or lateral movement originating from ML infrastructure
Implement file integrity monitoring on BentoML installation directories

How to Mitigate CVE-2025-32375

Immediate Actions Required

Upgrade BentoML to version 1.4.8 or later immediately
Audit network exposure of BentoML runner servers and restrict access using firewalls or network segmentation
Review logs for any signs of exploitation attempts targeting BentoML endpoints
Implement network-level access controls to limit who can communicate with runner server endpoints

Patch Information

BentoML has addressed this vulnerability in version 1.4.8. Organizations should upgrade to this version or later to remediate the insecure deserialization issue. Patch details are available in the BentoML GitHub Security Advisory GHSA-7v4r-c989-xh26.
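To confirm whether a given Python environment has the fix, the installed package version can be compared against 1.4.8. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the parser is deliberately simple: it looks only at leading numeric components, so a pre-release such as `1.4.8rc1` would be treated as 1.4.8.

```python
from importlib import metadata

PATCHED = (1, 4, 8)  # first fixed release according to the advisory


def parse_release(version: str) -> tuple:
    """Extract leading numeric components, e.g. '1.4.8rc1' -> (1, 4, 8)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_patched(version: str) -> bool:
    return parse_release(version) >= PATCHED


def installed_bentoml_status() -> str:
    try:
        version = metadata.version("bentoml")
    except metadata.PackageNotFoundError:
        return "bentoml is not installed in this environment"
    state = "patched" if is_patched(version) else "VULNERABLE to CVE-2025-32375"
    return f"bentoml {version}: {state}"


print(installed_bentoml_status())
```

Running the script inside each serving environment or container image gives a quick inventory of which deployments still need the upgrade.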

Workarounds

Place BentoML runner servers behind a reverse proxy with strict request filtering
Implement network segmentation to isolate ML serving infrastructure from untrusted networks
Use authentication mechanisms at the network or application layer to restrict access to runner endpoints
Consider deploying a WAF with rules to detect and block serialized Python object payloads

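```bash
# Network isolation example using iptables
# Restrict access to the BentoML runner server to trusted networks only.
# Port 3000 and 10.0.0.0/8 are examples; substitute the port your runner
# server listens on and your own trusted address range.
iptables -A INPUT -p tcp --dport 3000 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 3000 -j DROP
```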
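Where host firewall rules cannot be changed, the same source-address restriction can be approximated at the application or proxy layer. The sketch below assumes a hypothetical hook that receives the client IP as a string before a request is processed. `TRUSTED_NETWORKS` and `is_trusted` are illustrative names, not BentoML settings.

```python
import ipaddress

# Mirrors the iptables example: only 10.0.0.0/8 is treated as trusted.
TRUSTED_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted(client_ip: str) -> bool:
    """Return True only if client_ip parses and falls inside a trusted network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Unparseable addresses are rejected rather than allowed by default.
        return False
    return any(addr in net for net in TRUSTED_NETWORKS)


for candidate in ("10.1.2.3", "192.168.1.5", "not-an-ip"):
    print(candidate, is_trusted(candidate))
```

A check like this is only as reliable as the address it is given; behind a reverse proxy, the client IP must come from a header the proxy sets and that clients cannot spoof.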
