CVE-2024-37059: MLflow Platform RCE Vulnerability

CVE-2024-37059 Overview

CVE-2024-37059 is an insecure deserialization vulnerability in the MLflow machine learning lifecycle platform. The flaw affects MLflow versions 0.5.0 and newer. When a user loads a maliciously crafted PyTorch model that was uploaded to MLflow, the platform deserializes untrusted data and executes attacker-controlled code on the host system. The issue is tracked under CWE-502 (Deserialization of Untrusted Data). Exploitation requires a user to load the malicious model artifact; no authentication is required on the network path.

Critical Impact
A maliciously uploaded PyTorch model executes arbitrary code on the end user's system when loaded, leading to full compromise of MLflow client environments and any credentials or data accessible to them.

Affected Products

MLflow version 0.5.0 and newer (lfprojects:mlflow)
PyTorch model loading workflows within MLflow
Any client or server environment that downloads and deserializes MLflow-hosted PyTorch artifacts

Discovery Timeline

2024-06-04 - CVE-2024-37059 published to NVD
2025-02-03 - Last updated in NVD database

Technical Details for CVE-2024-37059

Vulnerability Analysis

MLflow is an open-source platform for managing the machine learning lifecycle, including experiment tracking, model registry, and model deployment. The PyTorch model flavor in MLflow relies on torch.load(), which internally uses Python's pickle module to reconstruct model objects from serialized files. Pickle deserialization is well-documented as unsafe when applied to untrusted input because the format permits arbitrary Python object construction through the __reduce__ protocol.
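
The __reduce__ behavior described above can be illustrated with a deliberately benign sketch. The class name Harmless and the choice of len as the callable are assumptions made for illustration; nothing here comes from MLflow or PyTorch code. The point is only that the pickle loader calls whatever callable the stream names.

```python
import pickle


class Harmless:
    """Stand-in for a crafted object; the callable used here is benign."""

    def __reduce__(self):
        # pickle stores "call len('abc')" instead of this object's state.
        return (len, ("abc",))


blob = pickle.dumps(Harmless())

# Loading never reconstructs Harmless. The loader invokes the stored
# callable with the stored arguments and returns the call's result.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The loaded value is the integer returned by len, not a Harmless instance, which shows that the loader ran a function chosen by whoever produced the bytes.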

An attacker who can upload a model to an MLflow tracking server, or who can convince a user to load a model from an attacker-controlled location, can embed a Python payload inside the serialized PyTorch artifact. When a victim retrieves and loads the model, the embedded payload executes in the victim's process with the privileges of the user running it. Successful exploitation yields arbitrary code execution, which an attacker can then use for credential theft, lateral movement into MLOps pipelines, and tampering with downstream model training or inference.

Root Cause

The root cause is the use of Python pickle-based deserialization on model artifacts that originate from untrusted users. MLflow treats uploaded model files as data, but pickle decoding is effectively code execution. No sandboxing, signature verification, or allow-listing of permitted classes is performed before the artifact is reconstructed.
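
For contrast, the kind of class allow-listing that the paragraph above describes as missing can be sketched with the standard library's pickle.Unpickler.find_class hook. This is a minimal illustration, not MLflow's or PyTorch's implementation; the names ALLOWED_GLOBALS, AllowListUnpickler, and restricted_loads are hypothetical, and the single allowed entry is far smaller than what a real checkpoint would need.

```python
import io
import pickle

# Hypothetical allow-list for illustration only. A real PyTorch checkpoint
# would need a larger, carefully reviewed set of permitted globals.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global not on the allow-list."""

    def find_class(self, module, name):
        if (module, name) not in ALLOWED_GLOBALS:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


def restricted_loads(data: bytes):
    """Deserialize data, rejecting streams that reference unlisted globals."""
    return AllowListUnpickler(io.BytesIO(data)).load()
```

With this sketch, a stream that references builtins.len or os.system fails with UnpicklingError before any callable is invoked. PyTorch's torch.load offers a weights_only argument that applies a similar restriction, though whether a given MLflow workflow can use it depends on the model.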

Attack Vector

The attack vector is network-based. An attacker uploads a weaponized PyTorch model to a shared MLflow instance or distributes a model file through a third-party channel. When a data scientist, training pipeline, or inference service loads the model with mlflow.pytorch.load_model(), the embedded __reduce__ gadget triggers and runs the attacker payload. User interaction is required in the form of loading the model.

No verified public exploit code is available. The vulnerability mechanism follows the standard pickle deserialization pattern: a crafted object overrides __reduce__ to return a callable such as os.system together with attacker-supplied arguments, executed automatically by the pickle loader. Refer to the HiddenLayer Security Advisory for additional technical context.

Detection Methods for CVE-2024-37059

Indicators of Compromise

Unexpected child processes spawned by Python interpreters running MLflow client code, such as shells, curl, or wget.
PyTorch model files (.pt, .pth, data.pkl inside MLflow model directories) containing pickle opcodes referencing os.system, subprocess, posix.system, or builtins.exec.
New or modified models in the MLflow registry uploaded by unverified or anonymous accounts.
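
The opcode indicator above can be checked without loading the file. The following sketch uses the standard library's pickletools.genops to list the globals a raw pickle stream would import. The function names and the SUSPICIOUS_MODULES set are assumptions for illustration, and the handling of STACK_GLOBAL is a heuristic that takes the two most recent string arguments and ignores memo lookups, so a crafted stream could evade it. Dedicated scanners handle more cases.

```python
import pickle
import pickletools

# Illustrative deny-list based on the indicators above; not exhaustive.
# Benign pickles can also reference builtins, so expect false positives.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins"}


def referenced_globals(data: bytes):
    """Return (module, name) pairs named in a pickle stream, without loading it."""
    found = []
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Heuristic: module and name are usually the last two strings pushed.
            found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


def suspicious_globals(data: bytes):
    """Filter referenced globals down to those in deny-listed modules."""
    return [
        (module, name)
        for module, name in referenced_globals(data)
        if module.split(".")[0] in SUSPICIOUS_MODULES
    ]
```

The sketch expects raw pickle bytes, such as the contents of a data.pkl file, and raises ValueError on a malformed stream.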

Detection Strategies

Statically scan MLflow artifact stores for pickle files that reference dangerous reduce targets using tools such as picklescan or fickling.
Monitor process trees originating from mlflow and python -m mlflow, and from any Python process that calls torch.load, for outbound network connections or local persistence activity.
Compare uploaded model artifact hashes against an allow-list of approved, signed models.
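
The hash comparison in the last strategy can be sketched as follows. The function names sha256_of and is_approved are hypothetical, and the sketch assumes the set of approved digests is maintained and protected somewhere outside this code; it does not cover signature verification.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model artifacts are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_approved(path: Path, approved_digests: set[str]) -> bool:
    """Return True only if the artifact's SHA-256 digest is on the allow-list."""
    return sha256_of(path) in approved_digests
```

A loader wrapper could call is_approved on each downloaded artifact and refuse to deserialize anything that returns False.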

Monitoring Recommendations

Enable audit logging on the MLflow tracking server for all artifact upload, download, and registry mutation events.
Forward MLflow access logs and host telemetry from data science workstations and training nodes into a central detection pipeline.
Alert on mlflow.pytorch.load_model invocations that immediately precede unusual file writes, credential file access, or egress traffic.
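
As one possible source of host telemetry for the recommendations above, CPython raises a pickle.find_class audit event whenever an unpickler resolves a global. The sketch below records those events in process. The names seen_globals and pickle_audit_hook are hypothetical, and this assumes the monitored code runs in an interpreter where the hook can be installed at startup. Audit hooks are process-wide and cannot be removed once added, so this is a starting point rather than a finished sensor.

```python
import pickle
import sys
from collections import OrderedDict

seen_globals = []  # (module, name) pairs resolved during unpickling


def pickle_audit_hook(event, args):
    # Only react to global lookups made by an unpickler; ignore other events.
    if event == "pickle.find_class":
        # The module and name are the last two arguments of the event.
        module, name = args[-2:]
        seen_globals.append((module, name))


sys.addaudithook(pickle_audit_hook)

# Demonstration with a benign object: loading it resolves one global.
pickle.loads(pickle.dumps(OrderedDict(a=1)))
print(seen_globals)
```

In practice the hook would forward each pair to the central detection pipeline instead of appending to a list, so that lookups of modules such as os or subprocess during model loading can raise an alert.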

How to Mitigate CVE-2024-37059

Immediate Actions Required

Upgrade MLflow to a fixed version published after the June 2024 advisory and rebuild any container images that bundle vulnerable releases.
Restrict write access to the MLflow model registry and artifact store to authenticated, trusted users only.
Audit existing PyTorch artifacts in the registry and quarantine any models with unknown provenance.

Patch Information

The MLflow maintainers addressed the issue in releases following the HiddenLayer Security Advisory. Operators should track the MLflow GitHub releases and apply the latest stable version. Verify the installed version with pip show mlflow and confirm it is newer than the advisory's fix release.

Workarounds

Place MLflow tracking servers behind authenticated reverse proxies and disable anonymous artifact uploads.
Load PyTorch models inside isolated, ephemeral containers with no network egress and no access to production credentials.
Replace pickle-based serialization with safer formats such as safetensors for model weights where workflows allow.

bash

# Upgrade MLflow and print the installed version
pip install --upgrade mlflow
python -c "import mlflow; print(mlflow.__version__)"

# Scan PyTorch artifacts for unsafe pickle opcodes before loading
pip install fickling
fickling --check-safety path/to/model/data.pkl