CVE-2025-15031 Overview
A path traversal vulnerability exists in MLflow's pyfunc extraction process that allows for arbitrary file writes due to improper handling of tar archive entries. The vulnerability stems from the use of tarfile.extractall without proper path validation, enabling attackers to craft malicious tar.gz files containing ../ (dot-dot-slash) sequences or absolute paths that escape the intended extraction directory. It poses significant risk in multi-tenant environments or any deployment where untrusted artifacts may be ingested.
Critical Impact
Successful exploitation can lead to arbitrary file overwrites on the target system, potentially enabling remote code execution through strategic file placement such as overwriting configuration files, cron jobs, or application binaries.
Affected Products
- MLflow (latest version and potentially earlier versions)
- MLflow pyfunc model extraction component
- Environments ingesting untrusted MLflow artifacts
Discovery Timeline
- 2026-03-18 - CVE-2025-15031 published to NVD
- 2026-03-19 - Last updated in NVD database
Technical Details for CVE-2025-15031
Vulnerability Analysis
This vulnerability is classified as CWE-22 (Path Traversal) and affects the artifact extraction functionality within MLflow's pyfunc module. The core issue lies in the unsafe use of Python's tarfile.extractall() method without implementing proper validation of archive member paths before extraction.
When MLflow processes tar.gz archives containing pyfunc models, it fails to sanitize file paths within the archive. An attacker can craft a malicious archive with entries containing path traversal sequences (../) or absolute paths, allowing files to be written outside the intended extraction directory.
The vulnerability requires adjacent network access for exploitation, meaning an attacker must have some level of network proximity to the target system or the ability to submit artifacts to a shared MLflow instance. In multi-tenant MLflow deployments where users can upload model artifacts, this vulnerability becomes particularly dangerous as a malicious user could compromise the underlying infrastructure.
Root Cause
The root cause is the direct use of tarfile.extractall() without path validation checks on archive members. Historically, Python's tarfile module has not rejected malicious member paths by default: extraction filters were introduced in Python 3.12 and only become the default behavior in Python 3.14, so on earlier versions this responsibility falls to the application developer. The MLflow pyfunc extraction code fails to validate that extracted file paths remain within the intended destination directory, violating the principle of secure archive handling.
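To illustrate why unvalidated member names are dangerous, the following minimal sketch shows how a naive join of destination and member name resolves. It is not MLflow's code: the function names and the DEST directory are hypothetical, and posixpath is used so the result is the same on any platform.

```python
import posixpath


def resolve_target(destination: str, member_name: str) -> str:
    """Return the path a naive extractor would write to for a member name."""
    # join() discards `destination` entirely when member_name is absolute,
    # and normpath() collapses any "../" components.
    return posixpath.normpath(posixpath.join(destination, member_name))


def escapes(destination: str, member_name: str) -> bool:
    """Return True if the resolved target falls outside the destination."""
    dest = posixpath.normpath(destination)
    target = resolve_target(dest, member_name)
    return posixpath.commonpath([dest, target]) != dest


DEST = "/srv/mlflow/extract"  # hypothetical extraction directory

for name in ("model/MLmodel", "../../../etc/cron.d/job", "/etc/passwd"):
    print(name, "->", resolve_target(DEST, name), "| escapes:", escapes(DEST, name))
```

The relative entry with three ../ components resolves to /etc/cron.d/job, and the absolute entry ignores the destination altogether. Both are the cases an extractor has to reject.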
Attack Vector
The attack requires adjacent network access and the ability to submit a malicious tar.gz archive to an MLflow instance. The attack flow is as follows:
- The attacker crafts a malicious tar.gz archive containing entries with path traversal sequences (e.g., ../../../../etc/cron.d/malicious) or absolute paths
- The malicious archive is uploaded to MLflow as a pyfunc model artifact
- When the model is loaded or the archive is extracted, the tarfile.extractall() function processes the malicious entries
- Files are written outside the intended extraction directory, potentially overwriting critical system files or placing executable content in sensitive locations
The vulnerability does not require authentication in default configurations, though deployment-specific access controls may apply. Exploitation does not require user interaction once the malicious artifact is submitted.
Detection Methods for CVE-2025-15031
Indicators of Compromise
- Unexpected file modifications in system directories such as /etc, /var, or application configuration paths
- Tar archive extractions writing to paths outside designated MLflow artifact directories
- Presence of files with path traversal patterns in uploaded MLflow model archives
- Unusual cron jobs, startup scripts, or configuration file modifications coinciding with MLflow model operations
Detection Strategies
- Monitor file system activity during MLflow model extraction operations for writes outside expected directories
- Implement file integrity monitoring (FIM) on critical system directories
- Analyze uploaded tar.gz archives for path traversal patterns before processing
- Review MLflow server logs for model artifact upload and extraction events from untrusted sources
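The archive analysis step could be sketched as follows. This is a hedged example, not part of MLflow: find_suspicious_members and build_sample_archive are hypothetical helpers, and the policy of flagging every symbolic or hard link is a deliberately conservative assumption that may be too strict for some workflows.

```python
import io
import tarfile


def find_suspicious_members(fileobj):
    """Return (name, reason) pairs for entries that could escape an extraction root."""
    findings = []
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar.getmembers():
            name = member.name
            parts = name.replace("\\", "/").split("/")
            # Leading slash/backslash or a drive letter such as "C:" means absolute.
            if name.startswith(("/", "\\")) or (len(name) > 1 and name[1] == ":"):
                findings.append((name, "absolute path"))
            elif ".." in parts:
                findings.append((name, "parent-directory component"))
            elif member.issym() or member.islnk():
                findings.append((name, f"link to {member.linkname}"))
    return findings


def build_sample_archive():
    """Build a small in-memory tar.gz with one benign and one traversal entry."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in ("model/MLmodel", "../../etc/cron.d/job"):
            data = b"placeholder"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


for name, reason in find_suspicious_members(build_sample_archive()):
    print(f"suspicious entry: {name} ({reason})")
```

The scan only reads archive headers and never extracts anything, so it can run on uploads before they reach the extraction code.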
Monitoring Recommendations
- Deploy endpoint detection and response (EDR) solutions to monitor for suspicious file write patterns
- Configure audit logging for file operations in directories where MLflow extracts artifacts
- Implement network segmentation to limit adjacent network access to MLflow deployments
- Establish baselines for normal MLflow artifact extraction behavior to detect anomalies
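As a rough illustration of baselining, the sketch below records a SHA-256 digest for every file under a directory and compares two snapshots. It is an assumption-laden stand-in for a real file integrity monitoring product: snapshot and diff_snapshots are hypothetical names, and the approach reads whole files into memory, which suits small configuration directories only.

```python
import hashlib
import os


def snapshot(root):
    """Map each regular file under root (by relative path) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            # Skip symlinks and special files; only hash regular files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def diff_snapshots(before, after):
    """Report files added, removed, or modified between two snapshots."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(
            path for path in before.keys() & after.keys()
            if before[path] != after[path]
        ),
    }
```

Taking one snapshot before a model extraction and one after, then checking diff_snapshots for entries outside the expected artifact directory, gives a simple signal that an archive wrote somewhere it should not have.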
How to Mitigate CVE-2025-15031
Immediate Actions Required
- Restrict artifact upload capabilities to trusted users only
- Implement input validation on all uploaded archives before extraction
- Isolate MLflow instances processing untrusted artifacts in sandboxed environments
- Review recently uploaded model artifacts for signs of malicious path traversal sequences
Patch Information
Refer to the Huntr Security Bounty for the latest information on patches and remediation guidance from the security researchers who discovered this vulnerability. Monitor MLflow's official repository for security updates addressing this path traversal issue.
Workarounds
- Implement a wrapper around tar extraction that validates all member paths stay within the destination directory
- Use containerization to limit the impact of arbitrary file writes
- Deploy MLflow in read-only file systems where feasible, with only specific directories writable
- Enable strict access controls on directories accessible from MLflow extraction paths
The recommended approach for safe tar extraction involves checking each archive member before extraction:
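# Safe tar extraction pattern: validate every member before calling extractall
import os
import tarfile

def safe_extract(tar_path, destination):
    # Resolve symlinks in the destination so the comparison uses real paths
    abs_destination = os.path.realpath(destination)
    with tarfile.open(tar_path, 'r:gz') as tar:
        for member in tar.getmembers():
            abs_member_path = os.path.realpath(os.path.join(abs_destination, member.name))
            # Ensure the extracted path is the destination itself or lies beneath it
            if os.path.commonpath([abs_destination, abs_member_path]) != abs_destination:
                raise ValueError(f"Attempted path traversal: {member.name}")
            # Links can redirect later writes outside the destination, so reject them
            if member.issym() or member.islnk():
                raise ValueError(f"Link entries are not allowed: {member.name}")
        tar.extractall(destination)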
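On Python builds that ship tarfile extraction filters (3.12 and later, plus some security backports to older branches), the standard library's "data" filter performs similar checks. The sketch below assumes such a build and fails closed otherwise. The function name extract_with_data_filter is hypothetical and this is not the upstream MLflow fix.

```python
import tarfile


def extract_with_data_filter(tar_path, destination):
    """Extract an archive using the built-in 'data' filter, or refuse to run."""
    # tarfile.data_filter only exists on builds that support extraction filters.
    if not hasattr(tarfile, "data_filter"):
        raise RuntimeError("this Python build lacks tarfile extraction filters")
    with tarfile.open(tar_path, "r:*") as tar:
        # With the default errorlevel, a member that would land outside
        # `destination` raises a tarfile.FilterError subclass.
        tar.extractall(destination, filter="data")
```

Passing the filter explicitly keeps the behavior the same across Python versions, rather than relying on the interpreter's default.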
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.


