CVE-2026-4137: MLflow Insecure Permissions RCE Vulnerability

CVE-2026-4137 Overview

CVE-2026-4137 is an insecure file permissions vulnerability [CWE-378] in mlflow/mlflow versions prior to 3.11.0. The get_or_create_nfs_tmp_dir() function in mlflow/utils/file_utils.py creates temporary directories with world-writable permissions (0o777), and the _create_model_downloading_tmp_dir() function in mlflow/pyfunc/__init__.py creates directories with group-writable permissions (0o770). Local attackers who can write to these directories can tamper with cloudpickle-serialized model artifacts and achieve arbitrary code execution when those artifacts are deserialized via cloudpickle.load(). The issue belongs to the same vulnerability class as CVE-2025-10279, whose fix was only partial.

Critical Impact
Local attackers on shared NFS environments such as Databricks can replace serialized model artifacts and execute arbitrary code in the context of users loading those models.

Affected Products

mlflow/mlflow versions prior to 3.11.0
Databricks deployments where NFS is enabled by default
Spark UDF workloads relying on MLflow temporary directories

Discovery Timeline

2026-05-18 - CVE-2026-4137 published to NVD
2026-05-19 - Last updated in NVD database

Technical Details for CVE-2026-4137

Vulnerability Analysis

MLflow creates working directories used to stage model artifacts before they are loaded into Python processes. To support Spark User-Defined Function (UDF) access across processes, the project chose overly permissive modes. get_or_create_nfs_tmp_dir() applied chmod 0o777 to NFS-backed temporary directories, allowing any local user to write to them. _create_model_downloading_tmp_dir() applied chmod 0o770, granting any group member the same write capability.
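
The difference between the modes named above is easiest to see by decoding their bits. The following is a minimal sketch using only the Python standard library; the helper names describe_mode and who_can_write are hypothetical and are not part of MLflow.

```python
import os
import stat
import tempfile


def describe_mode(mode: int) -> str:
    """Render a directory's permission bits as an ls-style string."""
    return stat.filemode(stat.S_IFDIR | mode)


def who_can_write(mode: int) -> list[str]:
    """List the permission classes that hold the write bit in `mode`."""
    classes = []
    if mode & stat.S_IWUSR:
        classes.append("owner")
    if mode & stat.S_IWGRP:
        classes.append("group")
    if mode & stat.S_IWOTH:
        classes.append("others")
    return classes


# mkdtemp default, the two vulnerable modes, and the patched mode.
for mode in (0o700, 0o777, 0o770, 0o750):
    writers = ", ".join(who_can_write(mode))
    print(f"{mode:#o} {describe_mode(mode)} writable by: {writers}")

# os.chmod sets the requested bits exactly; it is not filtered by the umask.
staging_dir = tempfile.mkdtemp()
try:
    os.chmod(staging_dir, 0o750)
    applied = stat.S_IMODE(os.stat(staging_dir).st_mode)
    print(describe_mode(applied))
finally:
    os.rmdir(staging_dir)
```

Only 0o777 and 0o770 grant write access beyond the owner, which is the property the attack depends on.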

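MLflow uses cloudpickle to serialize and deserialize Python model objects. When a staging directory that an attacker can write to holds a cloudpickle artifact, the attacker can replace the artifact with a malicious payload. Write permission on the directory is enough, because it allows a file to be unlinked and recreated even when the file itself belongs to another user. Loading the model invokes cloudpickle.load(), which executes attacker-controlled code in the victim process.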

Root Cause

The root cause is improper permission assignment on temporary directories [CWE-378]. The developers widened permissions beyond what Spark UDF cross-process access required. The fix for CVE-2025-10279 reduced exposure but did not address both vulnerable code paths.

Attack Vector

A local user on the shared host or NFS mount enumerates MLflow staging directories. The attacker replaces a cloudpickle artifact with a crafted object whose __reduce__ method executes arbitrary commands. When another user or service loads the model, code executes under that account.
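
Why a replaced artifact leads to code execution can be illustrated with a deliberately harmless payload. This is a sketch that assumes cloudpickle artifacts are read by the same unpickling machinery as the standard pickle module, so it uses pickle directly; the Payload class is hypothetical, and print stands in for whatever callable an attacker would choose.

```python
import contextlib
import io
import pickle


class Payload:
    """Object whose pickled form is an instruction to call a function."""

    def __reduce__(self):
        # pickle records "call print(...)" instead of the object's state.
        # An attacker would name a different callable here.
        return (print, ("code ran inside pickle.loads",))


blob = pickle.dumps(Payload())

# Capture stdout to show that the callable runs during deserialization,
# before the caller has touched the returned object.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    restored = pickle.loads(blob)

print(repr(captured.getvalue()))
print(restored)  # the return value of print(), not a Payload instance
```

The loader never inspects what the callable does, which is why integrity of the artifact file is the only defense at this layer.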

python

# Patch in mlflow/utils/file_utils.py
    else:
        tmp_nfs_dir = tempfile.mkdtemp(dir=nfs_root_dir)
        # mkdtemp creates a directory with permission 0o700
-        # change it to be 0o777 to ensure it can be seen in spark UDF
-        os.chmod(tmp_nfs_dir, 0o777)
+        # For Spark UDFs, we need to make it accessible to other processes
+        # Use 0o750 (owner: rwx, group: r-x, others: None) instead of 0o777
+        os.chmod(tmp_nfs_dir, 0o750)
        atexit.register(shutil.rmtree, tmp_nfs_dir, ignore_errors=True)

    return tmp_nfs_dir

Source: GitHub Commit 1dcbb0c

python

# Patch in mlflow/pyfunc/__init__.py
    tmp_model_dir = tempfile.mkdtemp(dir=root_model_cache_dir)
    # mkdtemp creates a directory with permission 0o700
-    # change it to be 0o770 to ensure it can be seen in spark UDF
-    os.chmod(tmp_model_dir, 0o770)
+    # For Spark UDFs, we need to make it accessible to other processes
+    # Use 0o750 (owner: rwx, group: r-x, others: None) instead of 0o770
+    os.chmod(tmp_model_dir, 0o750)
    return tmp_model_dir

Source: GitHub Commit 1dcbb0c

Detection Methods for CVE-2026-4137

Indicators of Compromise

Temporary directories under MLflow NFS roots with mode 0o777 or 0o770
Unexpected modifications to cloudpickle .pkl artifacts inside MLflow staging directories
Python processes spawning shells or network connections immediately after cloudpickle.load() calls
MLflow model load events originating from users other than the artifact owner

Detection Strategies

Audit file modes on MLflow temporary directories using find <nfs_root> -type d -perm /g+w,o+w, which matches both the world-writable (0o777) and group-writable (0o770) cases
Monitor inotify or audit events for writes to MLflow cache directories by users who are not the directory owner
Compare cloudpickle artifact hashes against expected values from the MLflow tracking server
Inspect process trees for child processes of Python workers loading MLflow models
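
The permission audit in the first strategy can also be scripted. The sketch below is one possible approach using only the standard library; find_writable_dirs is a hypothetical helper, and the temporary tree it scans exists only to demonstrate the output.

```python
import os
import stat
import tempfile
from typing import Iterator


def find_writable_dirs(root: str) -> Iterator[tuple[str, str]]:
    """Yield (path, octal mode) for directories that are group- or world-writable.

    os.walk does not follow symlinks by default and silently skips
    directories the scanning user cannot list.
    """
    risky_bits = stat.S_IWGRP | stat.S_IWOTH
    for dirpath, _dirnames, _filenames in os.walk(root):
        try:
            mode = stat.S_IMODE(os.stat(dirpath).st_mode)
        except OSError:
            continue  # directory vanished or became unreadable mid-scan
        if mode & risky_bits:
            yield dirpath, oct(mode)


# Demonstration on a throwaway tree with one directory per mode of interest.
with tempfile.TemporaryDirectory() as root:
    for name, mode in (("world", 0o777), ("group", 0o770), ("patched", 0o750)):
        path = os.path.join(root, name)
        os.mkdir(path)
        os.chmod(path, mode)
    flagged = {
        os.path.basename(path): mode for path, mode in find_writable_dirs(root)
    }

print(flagged)
```

In practice the root argument would be the NFS or model cache root used by the deployment, and the results would feed whatever alerting is already in place.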

Monitoring Recommendations

Log all directory creation events from MLflow processes and alert when permissions exceed 0o750
Track Spark UDF invocations that load remote artifacts and correlate with file modification timestamps
Enable Linux audit rules on NFS mount points used by MLflow and Databricks

How to Mitigate CVE-2026-4137

Immediate Actions Required

Upgrade mlflow to version 3.11.0 or later on all hosts and Databricks clusters
Inventory existing MLflow temporary directories and tighten permissions to 0o750
Restrict NFS mount access to trusted users and groups only
Validate cloudpickle artifacts against trusted hashes before loading
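
Hash validation before loading might look like the following sketch. It assumes a trusted SHA-256 digest was recorded somewhere the attacker cannot write to when the model was logged; load_if_trusted is a hypothetical helper, not an MLflow API, and plain pickle stands in for cloudpickle.

```python
import hashlib
import hmac
import os
import pickle
import tempfile


def load_if_trusted(path: str, expected_sha256: str):
    """Deserialize an artifact only if its bytes match the trusted digest.

    The file is read once and the same bytes are hashed and unpickled, so
    the artifact cannot be swapped between the check and the load.
    """
    with open(path, "rb") as fh:
        data = fh.read()
    actual = hashlib.sha256(data).hexdigest()
    if not hmac.compare_digest(actual, expected_sha256):
        raise ValueError(f"digest mismatch for {path}")
    return pickle.loads(data)


with tempfile.TemporaryDirectory() as staging:
    artifact = os.path.join(staging, "model.pkl")
    with open(artifact, "wb") as fh:
        pickle.dump({"weights": [1, 2, 3]}, fh)
    with open(artifact, "rb") as fh:
        trusted_digest = hashlib.sha256(fh.read()).hexdigest()

    loaded = load_if_trusted(artifact, trusted_digest)

    # Simulate tampering by another user with write access to the directory.
    with open(artifact, "ab") as fh:
        fh.write(b"tampered")
    try:
        load_if_trusted(artifact, trusted_digest)
        tamper_rejected = False
    except ValueError:
        tamper_rejected = True

print(loaded, tamper_rejected)
```

The check is only as strong as the storage of the digest: a digest kept in the same writable directory as the artifact provides no protection.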

Patch Information

The fix in commit 1dcbb0c2fbd1f446c328830e601ca13a28219b8a replaces the 0o777 and 0o770 chmod calls with 0o750 in both mlflow/utils/file_utils.py and mlflow/pyfunc/__init__.py. Owners retain full access, group members get read and execute only, and other users have no access. See the GitHub Commit Update and the Huntr Bounty Report for details.

Workarounds

Manually set mode 0o750 on MLflow NFS and model cache directories after creation (chmod 0750 on the command line, which does not accept the Python-style 0o prefix)
Run MLflow workloads under dedicated service accounts with isolated group memberships
Disable NFS-backed temporary directories where Spark UDF support is not required
Use signed model registries and verify artifact integrity before deserialization

bash

# Configuration example: tighten MLflow temp directory permissions
MLFLOW_NFS_ROOT=/mnt/mlflow-nfs
# Match directories that are group- or world-writable in a single pass
find "$MLFLOW_NFS_ROOT" -type d -perm /g+w,o+w -exec chmod 0750 {} +
pip install --upgrade 'mlflow>=3.11.0'
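
After upgrading, the installed version can be checked from Python. The sketch below is a simplified comparison against the fixed release 3.11.0; release_tuple and is_patched are hypothetical helpers, and the parser looks only at the leading numeric components, so a pre-release such as 3.11.0rc1 is treated as 3.11.0.

```python
import re
from importlib import metadata

FIXED_VERSION = (3, 11, 0)


def release_tuple(version: str) -> tuple[int, ...]:
    """Extract the leading dotted numeric part of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_patched(version: str) -> bool:
    """Return True when the version is at or above the fixed release."""
    parts = release_tuple(version)
    padded = parts + (0,) * (3 - len(parts))  # treat "3.11" as "3.11.0"
    return padded >= FIXED_VERSION


try:
    installed = metadata.version("mlflow")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("mlflow is not installed in this environment")
else:
    print(installed, "patched" if is_patched(installed) else "vulnerable")
```

Upgrading does not change the mode of directories that were created before the upgrade, so existing directories still need the permission fix shown above.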