CVE-2023-3765: LF Projects MLflow Path Traversal Flaw

CVE-2023-3765 Overview

CVE-2023-3765 is a critical Absolute Path Traversal vulnerability in the mlflow/mlflow GitHub repository, affecting versions prior to 2.5.0. It allows unauthenticated remote attackers to traverse the file system and potentially read, write, or delete arbitrary files on the affected system. Because MLflow is widely used as an open-source platform for managing the machine learning lifecycle, the flaw poses a significant risk to organizations running affected versions.

Critical Impact
This path traversal vulnerability enables attackers to bypass directory restrictions and access arbitrary files on the system, potentially leading to complete system compromise, data exfiltration, or remote code execution through file manipulation.

Affected Products

LF Projects MLflow versions prior to 2.5.0
Deployments running on Microsoft Windows operating systems
Any MLflow installation utilizing the vulnerable PyFuncBackend CLI functionality

Discovery Timeline

2023-07-19 - CVE-2023-3765 published to NVD
2024-11-21 - Last updated in NVD database

Technical Details for CVE-2023-3765

Vulnerability Analysis

This vulnerability falls under CWE-36 (Absolute Path Traversal), where the application fails to properly sanitize user-supplied input before using it in file path operations. The root cause lies in the PyFuncBackend CLI component, which accepts path parameters without adequate validation, allowing attackers to specify absolute paths that escape the intended directory structure.

The vulnerability is particularly dangerous because it can be exploited remotely over the network without authentication or user interaction. In CVSS terms the scope is changed: a successful exploit can affect resources beyond the vulnerable component itself, potentially compromising the entire host system and any data accessible to the MLflow process.

Root Cause

The vulnerability originates in the mlflow/pyfunc/backend.py module, where input parameters such as --input-path and --output-path are passed to file operations without proper sanitization. Prior to the fix, the application failed to validate whether supplied paths were constrained to the expected working directory, allowing absolute paths to be specified that could traverse outside the application's sandboxed environment.
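
The missing containment check described above can be illustrated with a minimal sketch. This is not the code from the MLflow patch; the function name is_within_directory and the base directory used in the usage note are hypothetical, chosen only to show how an absolute or ../-style path can be rejected before it reaches a file operation.

```python
from pathlib import Path


def is_within_directory(base_dir: str, user_path: str) -> bool:
    """Return True only if user_path resolves to a location inside base_dir.

    Hypothetical helper for illustration; not part of MLflow.
    """
    base = Path(base_dir).resolve()
    # Joining base with an absolute user_path discards base entirely,
    # which is exactly the absolute-path-traversal case to catch.
    candidate = (base / user_path).resolve()
    # Accept the base itself or anything that has base as an ancestor.
    return candidate == base or base in candidate.parents
```

For example, with a base directory of /srv/mlflow/artifacts (an assumed location), a relative path such as inputs/data.json is accepted, while /etc/passwd and ../../../etc/passwd are both rejected because they resolve outside the base directory.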

The security patch introduces proper input handling through the shlex module and refactors the prediction backend to use a safer argument parsing mechanism that prevents path manipulation attacks.

Attack Vector

The attack vector is network-based, requiring no authentication or user interaction. An attacker can exploit this vulnerability by sending malicious requests to the MLflow server with crafted path parameters containing absolute file paths. This allows:

Reading sensitive configuration files and credentials
Writing malicious content to arbitrary locations
Overwriting critical system or application files
Potential remote code execution through file manipulation

The security patch addresses this by restructuring the PyFuncBackend to use a dedicated subprocess script with proper argument parsing:

python

"""
This script should be executed in a fresh python interpreter process using `subprocess`.
"""
import argparse

from mlflow.pyfunc.scoring_server import _predict


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-uri", required=True)
    parser.add_argument("--input-path", required=False)
    parser.add_argument("--output-path", required=False)
    parser.add_argument("--content-type", required=True)
    return parser.parse_args()


def main():
    args = parse_args()
    _predict(
        model_uri=args.model_uri,
        input_path=args.input_path if args.input_path else None,
        output_path=args.output_path if args.output_path else None,
        content_type=args.content_type,
    )


if __name__ == "__main__":
    main()

Source: GitHub Commit Details

The patch also adds an import of Python's standard-library shlex module, which provides shell-style quoting and splitting, to the backend module:

diff

 import pathlib
 import subprocess
 import posixpath
+import shlex
 import sys
 import warnings
 import ctypes

Source: GitHub Commit Details
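
The excerpt above shows only the import, not how the patch uses it. As a hedged sketch of the general technique, the following shows how shlex.quote can neutralize shell metacharacters when a command string is assembled from user-supplied values. The helper build_predict_command and the script name in the usage note are hypothetical; the flag names are taken from the argparse script shown earlier.

```python
import shlex


def build_predict_command(script_path, model_uri, input_path, content_type):
    """Assemble a shell command string with every argument quoted.

    Hypothetical helper for illustration; not the function used in MLflow.
    """
    args = [
        "python", script_path,
        "--model-uri", model_uri,
        "--input-path", input_path,
        "--content-type", content_type,
    ]
    # shlex.quote wraps any value containing shell metacharacters in
    # single quotes, so the shell treats it as one literal argument.
    return " ".join(shlex.quote(arg) for arg in args)
```

Calling build_predict_command("predict.py", "models:/m/1", "in.json; rm -rf /", "json") yields a string in which the input path appears as the single quoted token 'in.json; rm -rf /' rather than as a second shell command. Note that shlex.quote targets POSIX shells; quoting rules on Windows differ, and passing an argument list to subprocess without a shell avoids the problem altogether.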

Detection Methods for CVE-2023-3765

Indicators of Compromise

Unusual file access patterns in MLflow server logs showing absolute path references
Unexpected read or write operations to system files outside the MLflow working directory
Web server access logs containing path traversal sequences (e.g., ../, absolute paths like /etc/ or C:\)
Anomalous process behavior from MLflow subprocess operations accessing sensitive directories
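
The log-based indicators above could be checked with a small script along these lines. It is a minimal sketch under stated assumptions: the pattern list is illustrative rather than exhaustive, the log lines in the usage note are made up, and no particular MLflow log format or endpoint is assumed.

```python
import re
from urllib.parse import unquote

# Illustrative patterns only: relative traversal sequences, a few sensitive
# Unix directories, and Windows drive-letter paths. The lookbehind keeps
# URL schemes such as "http://" from matching as a drive letter.
SUSPICIOUS = re.compile(
    r"\.\./|\.\.\\|/etc/|/proc/|/root/|(?<![A-Za-z])[A-Za-z]:[\\/]"
)


def find_suspicious_lines(log_lines):
    """Return the log lines whose URL-decoded form matches a pattern."""
    hits = []
    for line in log_lines:
        # Decode percent-encoding once so that %2F and %5C are inspected
        # as "/" and "\". Double-encoded payloads are not handled here.
        if SUSPICIOUS.search(unquote(line)):
            hits.append(line)
    return hits
```

Given a request line such as GET /api/example?path=..%2F..%2Fetc%2Fpasswd, the function flags it because the decoded form contains ../. Matches are leads for investigation, not proof of exploitation, and the patterns should be tuned to the paths that matter in a given environment.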

Detection Strategies

Implement file integrity monitoring (FIM) on critical system directories to detect unauthorized access or modifications
Configure web application firewalls (WAF) to detect and block requests containing path traversal patterns
Enable detailed logging for MLflow API endpoints and monitor for suspicious path parameters
Deploy endpoint detection and response (EDR) solutions to identify abnormal file system access from MLflow processes

Monitoring Recommendations

Set up alerts for file access attempts outside the designated MLflow data directories
Monitor network traffic to MLflow servers for requests with unusually long or suspicious path parameters
Implement log aggregation and analysis to correlate potential exploitation attempts across multiple systems
Regularly audit MLflow server access logs for patterns indicative of directory traversal attempts

How to Mitigate CVE-2023-3765

Immediate Actions Required

Upgrade MLflow to version 2.5.0 or later immediately
Audit existing MLflow deployments to identify any instances running vulnerable versions
Review server logs for any historical evidence of exploitation attempts
Implement network segmentation to limit MLflow server access to authorized users and systems only
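
To support the audit step above, a short script can report whether the MLflow package installed in a given Python environment predates 2.5.0. This is a sketch, not an official tool: the function names are hypothetical, and the version parser deliberately ignores pre-release suffixes, so a build such as 2.5.0rc1 is treated as 2.5.0 and should be reviewed manually.

```python
from importlib.metadata import PackageNotFoundError, version

PATCHED = (2, 5, 0)  # first fixed release according to the advisory


def parse_release(version_string):
    """Parse the leading numeric release segment, e.g. '2.4.2' -> (2, 4, 2)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions so that '2.5' compares equal to '2.5.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_vulnerable(version_string):
    """Return True if the version is older than the patched release."""
    return parse_release(version_string) < PATCHED


def installed_mlflow_is_vulnerable():
    """Check the current environment; None means MLflow is not installed."""
    try:
        return is_vulnerable(version("mlflow"))
    except PackageNotFoundError:
        return None
```

Running installed_mlflow_is_vulnerable() inside each virtual environment or container image gives a quick per-environment answer; is_vulnerable("2.4.2") returns True, while is_vulnerable("2.5.0") returns False.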

Patch Information

The vulnerability has been addressed in MLflow version 2.5.0. The fix restructures the PyFuncBackend to run predictions through a dedicated subprocess script that parses its arguments with the argparse module, and adds an import of the shlex module for shell-safe handling of command arguments.

Patch Commit: 6dde93758d42455cb90ef324407919ed67668b9b

For detailed information, refer to the GitHub Commit Details and the Huntr Bounty Report.

Workarounds

If immediate patching is not possible, restrict network access to MLflow servers to trusted IP addresses only
Implement a reverse proxy with input validation rules to filter out malicious path parameters
Run MLflow with minimal file system permissions to limit the impact of successful exploitation
Consider containerizing MLflow deployments with restricted volume mounts to contain potential path traversal attacks

bash

# Example: Restricting MLflow network access using firewall rules
# Allow access only from trusted network (adjust IP range as needed)
iptables -A INPUT -p tcp --dport 5000 -s 10.0.0.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 5000 -j DROP

# Verify MLflow version
pip show mlflow | grep Version

# Upgrade MLflow to patched version
# (quote the specifier so the shell does not treat ">" as output redirection)
pip install --upgrade "mlflow>=2.5.0"
