CVE-2023-2780: Mlflow Path Traversal Vulnerability

CVE-2023-2780 Overview

CVE-2023-2780 is a path traversal vulnerability in MLflow, the open-source platform for the machine learning lifecycle, prior to version 2.3.1. This vulnerability allows attackers to use backslash path traversal sequences (\..\..\filename) to access arbitrary files on the system hosting MLflow instances. The flaw exists in how MLflow handles file URI model version sources, enabling unauthorized file system access through malicious path manipulation.

Critical Impact
This path traversal vulnerability enables unauthenticated remote attackers to read arbitrary files on the MLflow server, potentially exposing sensitive machine learning models, configuration files, credentials, and system data.

Affected Products

MLflow versions prior to 2.3.1
LF Projects MLflow (all distributions before the security patch)
Self-hosted MLflow tracking server deployments

Discovery Timeline

2023-05-17 - CVE-2023-2780 published to NVD
2024-11-21 - Last updated in NVD database

Technical Details for CVE-2023-2780

Vulnerability Analysis

This path traversal vulnerability (CWE-29) stems from improper validation of file paths when MLflow processes model version sources. The platform failed to adequately sanitize file URI inputs, allowing attackers to break out of intended directory boundaries using backslash-encoded traversal sequences. When a malicious file URI is provided as a model version source, the application processes the path without proper validation, granting access to files outside the intended model storage directory.

The vulnerability is particularly dangerous in MLflow deployments because the tracking server often has access to sensitive ML artifacts, experiment data, and potentially cloud credentials used for model storage backends. An attacker could leverage this to exfiltrate trained models, hyperparameters, or access credentials stored in configuration files.

Root Cause

The root cause lies in MLflow's handling of file URIs as model version sources without adequate path validation. Prior to the fix, the application accepted file URIs that could contain directory traversal sequences, allowing users to specify paths that escape the intended model artifact directory. The lack of input sanitization on the source parameter for model versions enabled arbitrary file system access.

Attack Vector

The attack is network-based and requires no authentication or user interaction. An attacker can craft a malicious request to the MLflow tracking server specifying a file URI with path traversal sequences as a model version source. The server processes this URI and returns content from arbitrary file system locations accessible to the MLflow process.

The following patch demonstrates how MLflow addressed this vulnerability by introducing a security control to disable file URI model version sources by default:

python

MLFLOW_DEFAULT_PREDICTION_DEVICE = _EnvironmentVariable(
    "MLFLOW_DEFAULT_PREDICTION_DEVICE", str, None
)

#: Specifies whether or not to allow using a file URI as a model version source.
#: Please be aware that setting this environment variable to True is potentially risky
#: because it can allow access to arbitrary files on the specified filesystem
#: (default: ``False``).
MLFLOW_ALLOW_FILE_URI_AS_MODEL_VERSION_SOURCE = _BooleanEnvironmentVariable(
    "MLFLOW_ALLOW_FILE_URI_AS_MODEL_VERSION_SOURCE", False
)

Source: GitHub Commit Details

The handler modification adds the necessary import and check for file URIs:

python

from mlflow.utils.proto_json_utils import message_to_json, parse_dict
from mlflow.utils.validation import _validate_batch_log_api_req
from mlflow.utils.string_utils import is_string_type
from mlflow.utils.uri import is_local_uri, is_file_uri
from mlflow.utils.file_utils import local_file_uri_to_path
from mlflow.tracking.registry import UnsupportedModelRegistryStoreURIException
from mlflow.environment_variables import MLFLOW_ALLOW_FILE_URI_AS_MODEL_VERSION_SOURCE

_logger = logging.getLogger(__name__)
_tracking_store = None

Source: GitHub Commit Details

Detection Methods for CVE-2023-2780

Indicators of Compromise

HTTP requests to MLflow API endpoints containing backslash-encoded path traversal sequences (\..\..\)
Requests to model registry endpoints with file:// URI schemes containing unusual paths
Access log entries showing requests attempting to access files outside the MLflow artifact directory
Unexpected file read operations by the MLflow process targeting system configuration files

Detection Strategies

Monitor MLflow access logs for requests containing path traversal patterns such as \..\, /../, or URL-encoded variants
Implement network-level detection rules for HTTP requests containing file:// URIs with traversal sequences
Deploy application-layer firewalls to block requests with suspicious path patterns targeting MLflow endpoints
Configure SentinelOne to alert on anomalous file access patterns from the MLflow server process

Monitoring Recommendations

Enable verbose logging on MLflow tracking servers to capture all model version source parameters
Set up alerts for file system access events from the MLflow process to sensitive directories like /etc/, /home/, or Windows system paths
Monitor for unusual outbound data transfers from MLflow server hosts that may indicate data exfiltration
Implement file integrity monitoring on MLflow deployment hosts to detect unauthorized configuration changes

How to Mitigate CVE-2023-2780

Immediate Actions Required

Upgrade MLflow to version 2.3.1 or later immediately
If upgrade is not immediately possible, ensure the MLFLOW_ALLOW_FILE_URI_AS_MODEL_VERSION_SOURCE environment variable is not set to True
Review MLflow access logs for evidence of exploitation attempts
Restrict network access to MLflow tracking servers to trusted hosts and users only

Patch Information

The vulnerability has been addressed in MLflow version 2.3.1. The fix introduces a new environment variable MLFLOW_ALLOW_FILE_URI_AS_MODEL_VERSION_SOURCE which defaults to False, effectively disabling the ability to use file URIs as model version sources. Organizations should apply the patch by upgrading to MLflow 2.3.1 or later. The security commit is available at GitHub Commit Details. Additional details about the vulnerability disclosure can be found at the Huntr Bug Bounty Information page.

Workarounds

Deploy MLflow behind a reverse proxy that filters requests containing path traversal sequences
Implement network segmentation to limit MLflow server access to trusted internal networks only
Run MLflow with minimal file system permissions using a dedicated service account with restricted access
Use container isolation for MLflow deployments to limit the blast radius of potential file system access

bash

# Configuration example
# Ensure file URI model version sources are disabled (default behavior in 2.3.1+)
export MLFLOW_ALLOW_FILE_URI_AS_MODEL_VERSION_SOURCE=False

# Run MLflow with restricted permissions
mlflow server --host 127.0.0.1 --port 5000 --backend-store-uri postgresql://user:pass@db/mlflow

# For containerized deployments, use read-only file systems where possible
docker run --read-only -v /mlflow/artifacts:/mlflow/artifacts:rw mlflow/mlflow