CVE-2026-10803 Overview
CVE-2026-10803 is a weak hash algorithm vulnerability [CWE-327] affecting MLflow versions up to and including 3.10.0. The flaw resides in the Dataset Digest Computation component, in mlflow/data/digest_utils.py (the mlflow.data.digest_utils module). Because dataset digests are computed with a cryptographically weak hash function, a local attacker may be able to produce collisions or otherwise undermine integrity checks that rely on those digests. An exploit has been published, and the MLflow project has not yet responded to the pull request submitted to address the issue.
Critical Impact
A local attacker may exploit the weak hash algorithm to compromise dataset integrity verification in MLflow, although the attack is rated as high complexity and exploitation is considered difficult.
Affected Products
- MLflow versions up to and including 3.10.0
- Dataset Digest Computation component (mlflow/data/digest_utils.py)
- Machine learning pipelines relying on MLflow dataset digest verification
Discovery Timeline
- 2026-06-04 - CVE-2026-10803 published to NVD
- 2026-06-04 - Last updated in NVD database
Technical Details for CVE-2026-10803
Vulnerability Analysis
The vulnerability stems from the use of a cryptographically weak hash algorithm within MLflow's dataset digest computation routine. MLflow computes digests for datasets to track lineage and verify data integrity across machine learning experiments. When this digest relies on a weak hash function, an attacker with local access can craft data that yields identical digests, undermining the trust placed in MLflow tracking metadata.
The issue is classified under [CWE-327] (Use of a Broken or Risky Cryptographic Algorithm). Exploitation requires local access and is rated as high complexity. The exploit has been published, but practical impact is limited because attackers must already have local privileges on the MLflow host.
The MLflow maintainers were notified through a pull request, but as of publication no fix has been merged. Organizations running MLflow should treat dataset digests as advisory rather than as cryptographic integrity guarantees.
Root Cause
The root cause is the selection of a weak hash function inside the mlflow.data.digest_utils module. A weak hash algorithm lacks collision resistance: an attacker can feasibly construct two different inputs that produce the same digest. That removes the integrity guarantee that a strong cryptographic hash such as SHA-256 provides, because a matching digest no longer implies matching content.
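To illustrate why collision resistance matters, the following sketch uses a deliberately weakened stand-in: SHA-256 truncated to six hex characters (24 bits). This is an assumption made purely for illustration. The advisory does not name the algorithm used in mlflow.data.digest_utils, and weak_digest and find_collision are hypothetical helpers, not MLflow functions. A simple birthday search typically finds two distinct inputs with the same short digest after a few thousand attempts, while their full SHA-256 digests still differ.

```python
import hashlib
from itertools import count


def weak_digest(data: bytes, hex_chars: int = 6) -> str:
    """Illustrative weak hash: SHA-256 truncated to `hex_chars` hex digits.

    Hypothetical stand-in only; this is NOT the function MLflow uses.
    """
    return hashlib.sha256(data).hexdigest()[:hex_chars]


def find_collision(hex_chars: int = 6) -> tuple:
    """Birthday search: return two distinct inputs sharing a weak digest."""
    seen = {}  # weak digest -> first input that produced it
    for i in count():
        candidate = f"row-{i}".encode()
        digest = weak_digest(candidate, hex_chars)
        if digest in seen:
            # Inputs are distinct because every counter value is unique.
            return seen[digest], candidate
        seen[digest] = candidate


a, b = find_collision()
print("colliding inputs:", a, b)
print("shared weak digest:", weak_digest(a))
print("full SHA-256 digests differ:",
      hashlib.sha256(a).hexdigest() != hashlib.sha256(b).hexdigest())
```

The shorter or weaker the digest, the cheaper this search becomes, which is why a matching weak digest should not be read as proof that two datasets are identical.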
Attack Vector
An attacker with local access to an MLflow environment can supply or manipulate datasets to produce colliding digests. This may permit substitution of malicious datasets while preserving the original digest values recorded in MLflow tracking. The attack vector is local, and exploitation is considered difficult.
The references do not include verified proof-of-concept code. Technical details are documented in GitHub Issue #22419, and the proposed fix is in GitHub Pull Request #22420.
Detection Methods for CVE-2026-10803
Indicators of Compromise
- Unexpected modifications to dataset files where MLflow-recorded digests remain unchanged
- Discrepancies between dataset content hashes computed with strong algorithms (SHA-256) and MLflow-reported digests
- Unauthorized local access to systems hosting MLflow tracking servers or artifact stores
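The first two indicators can be checked mechanically if an independent SHA-256 digest was stored next to each MLflow-reported digest. The sketch below is a minimal example under that assumption; DigestRecord and classify are hypothetical names, the MLflow digest is treated as an opaque string, and both digests are assumed to describe the same dataset snapshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DigestRecord:
    mlflow_digest: str  # digest reported by MLflow, treated as an opaque string
    sha256: str         # independently computed SHA-256 hex digest


def classify(recorded: DigestRecord, observed: DigestRecord) -> str:
    """Compare a stored record with a freshly observed one."""
    same_weak = recorded.mlflow_digest == observed.mlflow_digest
    same_strong = recorded.sha256 == observed.sha256
    if same_weak and same_strong:
        return "unchanged"
    if same_weak and not same_strong:
        # Content changed while the MLflow digest stayed the same:
        # the pattern described by the indicators above.
        return "suspicious"
    if not same_weak and not same_strong:
        return "changed"
    # Same content but a different MLflow digest, for example because the
    # digest was computed by a different version or with different settings.
    return "inconsistent"


baseline = DigestRecord(mlflow_digest="abc123", sha256="11" * 32)
tampered = DigestRecord(mlflow_digest="abc123", sha256="22" * 32)
print(classify(baseline, tampered))
```

A "suspicious" result does not prove an attack, but it does mean the MLflow digest alone would not have revealed the change.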
Detection Strategies
- Audit MLflow installations to identify deployments running versions up to and including 3.10.0
- Implement parallel integrity verification using SHA-256 or stronger hashes alongside MLflow digests
- Review access logs for the MLflow tracking server to identify unauthorized local activity
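For the version audit, a small script run inside each Python environment can report whether the installed MLflow falls in the affected range. This is a sketch: parse_version, is_affected, and installed_mlflow_version are hypothetical helpers, and the parser only reads the leading numeric release components. A version above 3.10.0 is merely outside the range stated in this advisory, not confirmed fixed.

```python
import re
from importlib import metadata
from typing import Optional

LAST_AFFECTED = (3, 10, 0)  # upper bound of the affected range in this advisory


def parse_version(version: str) -> tuple:
    """Extract leading numeric components, e.g. '3.10.0rc1' -> (3, 10, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version: str) -> bool:
    """True when the version is at or below the last affected release."""
    return parse_version(version) <= LAST_AFFECTED


def installed_mlflow_version() -> Optional[str]:
    """Return the installed MLflow version, or None if it is not installed."""
    try:
        return metadata.version("mlflow")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_mlflow_version()
    if found is None:
        print("mlflow is not installed in this environment")
    else:
        status = "within" if is_affected(found) else "outside"
        print(f"mlflow {found} is {status} the affected range")
```

Pre-release builds such as a 3.10.0 release candidate are conservatively reported as affected because only the numeric components are compared.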
Monitoring Recommendations
- Monitor file integrity on dataset directories referenced by MLflow experiments
- Log and review all dataset registration and modification events within MLflow
- Track changes to mlflow/data/digest_utils.py and related modules across deployments
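File integrity monitoring on dataset directories can be approximated with a SHA-256 manifest that is rebuilt periodically and compared against a trusted baseline. The sketch below assumes datasets are stored as ordinary files under one root directory; sha256_file, build_manifest, and diff_manifests are hypothetical helper names, and a dedicated file integrity monitoring tool would serve the same purpose.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large datasets need little memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifests(baseline: dict, current: dict) -> dict:
    """Report files added, removed, or modified since the baseline."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(k for k in shared if baseline[k] != current[k]),
    }
```

The baseline manifest should be kept somewhere a local attacker on the MLflow host cannot write to; otherwise it can be altered along with the data.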
How to Mitigate CVE-2026-10803
Immediate Actions Required
- Restrict local access to MLflow hosts to trusted users only
- Treat MLflow dataset digests as non-authoritative for security-sensitive integrity checks
- Supplement MLflow digests with independent SHA-256 verification for critical datasets
Patch Information
No official patch has been released by the MLflow project. A community-submitted fix is available in GitHub Pull Request #22420, but it has not been merged. Track the MLflow Repository for upstream resolution.
Workarounds
- Apply the changes proposed in the pending pull request after independent review and testing
- Compute and store secondary SHA-256 digests for all datasets registered in MLflow
- Limit MLflow tracking server access through network segmentation and host-level controls
- Review the VulDB CVE-2026-10803 entry for additional context on exploitation risk
# Independent SHA-256 verification example
sha256sum /path/to/dataset.csv > /path/to/dataset.csv.sha256
# Compare against stored value before consuming the dataset
sha256sum -c /path/to/dataset.csv.sha256


