CVE-2024-37052 Overview
CVE-2024-37052 is an insecure deserialization vulnerability in the MLflow machine learning lifecycle platform. The flaw affects MLflow versions 1.1.0 and newer. An attacker can upload a malicious scikit-learn model that executes arbitrary code when an end user loads or interacts with it. The vulnerability is tracked under CWE-502: Deserialization of Untrusted Data. Successful exploitation gives the attacker code execution in the context of the user processing the model, enabling credential theft, lateral movement, and supply-chain compromise of downstream ML pipelines.
Critical Impact
Loading an attacker-controlled scikit-learn model artifact results in arbitrary code execution on the end user's system, compromising confidentiality, integrity, and availability.
Affected Products
- MLflow (lfprojects) version 1.1.0 and newer
- Self-hosted MLflow tracking servers and model registries
- Downstream consumers that load scikit-learn flavor models via MLflow
Discovery Timeline
- 2024-06-04 - CVE-2024-37052 published to the NVD
- 2025-02-03 - Last updated in the NVD database
Technical Details for CVE-2024-37052
Vulnerability Analysis
MLflow packages models using flavor-specific serialization formats. For the scikit-learn flavor, MLflow relies on Python's pickle (or cloudpickle) to serialize and deserialize model objects. When a user calls mlflow.sklearn.load_model() or invokes a saved model through mlflow.pyfunc.load_model(), MLflow unpickles the artifact without validating its contents.
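A pickle stream can encode instructions to call arbitrary callables. When an object is pickled, its __reduce__ method may return a callable together with arguments; that pair is written into the stream, and the unpickler invokes the callable at load time to reconstruct the object. An attacker who controls the serialized payload can therefore craft an object whose __reduce__ returns a callable such as os.system or subprocess.Popen together with attacker-chosen arguments. The interpreter executes that callable as part of the deserialization process.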
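The mechanism can be seen with a deliberately harmless callable. The sketch below is not a proof of concept for this CVE and does not involve MLflow; the Marker class is a hypothetical name, and sorted stands in for whatever callable an attacker would choose.

```python
import pickle


class Marker:
    """Benign stand-in for a crafted object (hypothetical name)."""

    def __reduce__(self):
        # Runs at pickling time. The returned (callable, args) pair is
        # written into the stream and invoked by the unpickler on load.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Marker())

# Loading calls sorted([3, 1, 2]) and returns its result, not a Marker.
result = pickle.loads(payload)
print(type(result).__name__, result)  # prints: list [1, 2, 3]
```

The loaded value is whatever the embedded callable returns, which is why validating the type of the result after loading comes too late.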
Because MLflow is commonly used in collaborative environments and shared model registries, a malicious model published by one user runs in the context of any data scientist, CI/CD job, or inference service that loads it.
Root Cause
The root cause is the use of unsafe Python serialization formats for untrusted model artifacts. MLflow treats stored model files as trusted input and passes them directly to pickle.load. No signature verification, sandboxing, or allowlist of safe classes is enforced before object reconstruction.
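To illustrate what an allowlist of safe classes could look like, the sketch below follows the find_class override pattern from the Python pickle documentation. It is not how MLflow loads models, and ALLOWED_GLOBALS, AllowlistUnpickler, and restricted_loads are hypothetical names; a real scikit-learn model would need a far larger, carefully reviewed allowlist.

```python
import io
import pickle

# Hypothetical allowlist, for illustration only.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve globals outside ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        # Every global the stream references passes through this hook.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Deserialize data, rejecting references to non-allowlisted globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

An allowlist narrows the attack surface but is only as safe as the callables it admits, so it complements isolation rather than replacing it.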
Attack Vector
The attack vector is network-based and requires user interaction. An attacker uploads a poisoned scikit-learn model to a shared MLflow tracking server, model registry, or public artifact repository. When a victim loads the model, the embedded payload executes. Refer to the HiddenLayer Security Advisory for the full technical write-up.
No verified public proof-of-concept code is available. The vulnerability mechanism is described in prose above, and the HiddenLayer advisory covers the technical details.
Detection Methods for CVE-2024-37052
Indicators of Compromise
- Unexpected child processes such as sh, bash, python, or curl spawned by MLflow worker, Jupyter, or model-serving processes
- Outbound network connections from data science workstations or inference containers to unknown hosts shortly after model load operations
- New or modified files inside MLflow artifact stores containing oversized or obfuscated model.pkl payloads
- Anomalous access to credential files (~/.aws/credentials, ~/.ssh/) or cloud metadata endpoints from Python interpreters loading models
Detection Strategies
- Inspect model.pkl artifacts with tools such as pickletools for unexpected imports, unusually large opcode streams, or references to the os, subprocess, posix, or nt modules, and review the accompanying MLmodel file for unexpected entries
- Monitor MLflow audit logs for model uploads from untrusted accounts followed by load events on production systems
- Alert on process lineage where Python processes loading MLflow artifacts spawn shells or networking utilities
Monitoring Recommendations
- Enable verbose logging on MLflow tracking servers and forward events to a centralized SIEM for correlation
- Baseline normal child-process and outbound network behavior for ML training and serving workloads, then alert on deviations
- Track model registry promotion events and require human approval before models move into production stages
How to Mitigate CVE-2024-37052
Immediate Actions Required
- Upgrade MLflow to a fixed release and review the HiddenLayer Security Advisory for version guidance
- Restrict write access to the MLflow tracking server and model registry to authenticated, trusted users only
- Audit existing scikit-learn model artifacts for unknown publishers and remove untrusted entries
- Isolate MLflow model-loading workloads in containers or VMs with no access to production credentials or secrets
Patch Information
Apply the latest MLflow security release from the lfprojects MLflow project. Validate that the running version is at or above the patched release on every tracking server, worker, and client environment before resuming model loads from shared sources.
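A version check of this kind can be scripted per environment. The sketch below uses only the standard library; the helper names are hypothetical, and the minimum version is a parameter the reader must supply from official guidance, since this page does not name a patched release. Pre-release suffixes are ignored, so "2.9.2rc1" is treated as 2.9.2.

```python
import re
from importlib import metadata


def release_tuple(version: str):
    """Parse the leading numeric release segment, e.g. '2.9.2rc1' -> (2, 9, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """True if the installed release is at or above the minimum release."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.9' and '2.9.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_mlflow_version():
    """Return the installed mlflow version string, or None if absent."""
    try:
        return metadata.version("mlflow")
    except metadata.PackageNotFoundError:
        return None
```

For example, meets_minimum(installed_mlflow_version(), minimum) could gate a deployment step once the reader has confirmed the correct minimum and that mlflow is installed.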
Workarounds
- Do not load MLflow models from untrusted or unauthenticated sources
- Place MLflow tracking servers behind authentication and authorization controls, blocking anonymous uploads
- Run model-loading code in sandboxed environments such as ephemeral containers with network egress restrictions and no production secrets mounted
- Use static analysis on pickle payloads with pickletools.dis before loading models in interactive sessions
# Inspect a suspicious model artifact before loading
python -c "import pickletools; pickletools.dis(open('model.pkl','rb'))" | head -100
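# Run MLflow model loads inside a restricted container.
# "mlflow-sandbox:local" is a placeholder for an image you have built with mlflow and
# scikit-learn preinstalled; python:3.11-slim does not include them, and --network=none
# prevents installing them at run time. --tmpfs /tmp gives a scratch area under --read-only.
docker run --rm --network=none --read-only --tmpfs /tmp \
  -v "$PWD/model:/model:ro" \
  mlflow-sandbox:local python -c "import mlflow.sklearn; mlflow.sklearn.load_model('/model')"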
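One way to enforce the rule against loading untrusted models is to pin artifacts by digest before they reach any loader. The sketch below assumes a set of trusted SHA-256 digests obtained through a separate, trusted channel; the function names and that workflow are assumptions, not an MLflow feature.

```python
import hashlib
import hmac


def sha256_file(path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted_artifact(path, trusted_digests) -> bool:
    """True only if the file's digest matches one of the trusted digests."""
    actual = sha256_file(path)
    return any(
        hmac.compare_digest(actual, expected.lower())
        for expected in trusted_digests
    )
```

A caller would check is_trusted_artifact on the model.pkl path and refuse to load the model when it returns False. This confirms only that the file is the one that was reviewed, not that its contents are benign.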