CVE-2024-3568 Overview
The huggingface/transformers library is vulnerable to arbitrary code execution through deserialization of untrusted data in the load_repo_checkpoint() function of the TFPreTrainedModel class. Because the function calls pickle.load() on data from potentially untrusted sources, an attacker who crafts a malicious serialized payload can achieve remote code execution (RCE): a victim who loads a seemingly harmless checkpoint during a normal training run unknowingly executes the attacker's code on the targeted machine.
Critical Impact
This vulnerability enables remote code execution through malicious model checkpoints, potentially compromising machine learning environments, training pipelines, and any systems that load untrusted Huggingface Transformers model checkpoints.
Affected Products
- Huggingface Transformers (all versions prior to the security patch)
Discovery Timeline
- 2024-04-10 - CVE-2024-3568 published to NVD
- 2025-10-10 - Last updated in NVD database
Technical Details for CVE-2024-3568
Vulnerability Analysis
This insecure deserialization vulnerability (CWE-502) exists within the load_repo_checkpoint() function of the TFPreTrainedModel class in the Huggingface Transformers library. The vulnerability stems from the unsafe use of Python's pickle.load() function when loading model checkpoints from potentially untrusted sources.
Python's pickle module is inherently insecure when handling untrusted data because it can execute arbitrary code during the deserialization process. When a user loads a model checkpoint that has been maliciously crafted by an attacker, the pickle deserialization process can trigger code execution without any additional user interaction beyond the initial load operation.
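The behavior described above can be demonstrated with a harmless stand-in payload. The class name and the callable used here are illustrative only, not the actual exploit from the original report; a real payload would invoke something like os.system instead:

```python
import os
import pickle

class MaliciousPayload:
    """Stand-in for an attacker-crafted object embedded in a checkpoint."""
    def __reduce__(self):
        # During deserialization, pickle calls this callable with these
        # arguments. Here it is the harmless os.getcwd; an attacker would
        # substitute an arbitrary command.
        return (os.getcwd, ())

blob = pickle.dumps(MaliciousPayload())
result = pickle.loads(blob)  # os.getcwd() executes here, with no warning
print(type(result), result)
```

Note that the victim never has to call anything on the object: the callable runs as a side effect of pickle.loads() itself.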
The attack scenario is particularly dangerous in machine learning contexts where researchers and developers frequently download and use pre-trained models from various sources. An attacker could distribute a malicious checkpoint file disguised as a legitimate model, and any user who loads this checkpoint would unknowingly execute the attacker's payload.
Root Cause
The root cause of this vulnerability is the use of Python's pickle.load() function to deserialize checkpoint data without proper validation or sanitization. The pickle module is designed to serialize and deserialize Python objects, but it lacks any security mechanisms to prevent malicious payloads from being executed during deserialization.
When load_repo_checkpoint() processes a checkpoint file, it trusts the serialized data implicitly, allowing attackers to embed arbitrary Python code within the pickle payload. This code executes with the same privileges as the application loading the checkpoint, enabling full system compromise in many scenarios.
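As a sketch of how this implicit trust can be removed, the Python pickle documentation recommends subclassing Unpickler and overriding find_class() with an allow-list, so that no unexpected global (and hence no attacker-chosen callable) can ever be resolved. The allow-list below is illustrative, not a vetted policy:

```python
import builtins
import io
import pickle

# Hypothetical allow-list; a real deployment would enumerate exactly the
# types its checkpoints legitimately contain.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    """Resolve only allow-listed globals; reject everything else."""
    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is forbidden")

def restricted_loads(data: bytes):
    """Drop-in replacement for pickle.loads() with the allow-list applied."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers round-trip fine, since they need no global lookup:
print(restricted_loads(pickle.dumps({"a": 1})))
```

A payload that references any module-level callable (os.system, collections.OrderedDict, and so on) is rejected inside find_class(), before any code from the payload runs.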
Attack Vector
The attack vector is network-based and requires user interaction. An attacker must first create a malicious checkpoint file containing a crafted pickle payload with arbitrary code. The attacker then distributes this checkpoint through various means such as:
- Uploading to model repositories
- Sharing through collaborative platforms
- Distributing via social engineering tactics
When a victim downloads and loads this checkpoint during their normal model training or inference workflow, the malicious payload executes automatically. The attack exploits the trust relationship between machine learning practitioners and shared model resources.
The vulnerability is exploited through crafting a malicious pickle payload that, when deserialized by the load_repo_checkpoint() function, executes arbitrary code. The attack leverages Python's __reduce__ method to embed system commands within the serialized object. When the victim loads what appears to be a legitimate model checkpoint, the embedded code executes with the permissions of the running process. For detailed technical information, see the Huntr Bounty Report.
Detection Methods for CVE-2024-3568
Indicators of Compromise
- Unexpected system commands or processes spawned during model checkpoint loading operations
- Network connections initiated immediately after loading model checkpoints from untrusted sources
- Suspicious modifications to system files or configurations following model loading activities
- Unusual resource consumption patterns during checkpoint deserialization
Detection Strategies
- Monitor for calls to pickle.load() or pickle.loads() with data originating from external or untrusted sources
- Implement file integrity monitoring on checkpoint directories to detect unauthorized modifications
- Deploy application-level logging for all model loading operations in Transformers-based applications
- Analyze network traffic for suspicious outbound connections following model loading events
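The first strategy in the list above can be sketched with CPython's audit-hook machinery (Python 3.8+): the interpreter raises a "pickle.find_class" audit event whenever an unpickler resolves a global, which is exactly the step an exploit payload needs. The hook and list names below are illustrative; a real deployment would forward events to centralized logging rather than a list:

```python
import pickle
import sys

suspicious = []  # illustrative sink; replace with SIEM/logging in practice

def audit(event, args):
    # CPython raises "pickle.find_class" with (module, name) each time an
    # unpickler resolves a global during deserialization.
    if event == "pickle.find_class":
        module, name = args
        suspicious.append(f"{module}.{name}")

sys.addaudithook(audit)  # note: audit hooks cannot be removed once added

import collections
pickle.loads(pickle.dumps(collections.OrderedDict()))
print(suspicious)
```

This does not block anything by itself; it gives visibility into which globals each pickle.load()/pickle.loads() call is resolving, which an alerting rule can then match against known-dangerous names such as os.system or subprocess.Popen.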
Monitoring Recommendations
- Implement behavioral analysis to detect anomalous process spawning during ML pipeline operations
- Configure security tooling to alert on execution of shell commands from Python ML processes
- Monitor for file system changes in standard checkpoint storage locations
- Review audit logs for unexpected privilege escalation attempts in ML environments
How to Mitigate CVE-2024-3568
Immediate Actions Required
- Update the Huggingface Transformers library to the patched version immediately
- Audit all model checkpoints currently in use for potential compromise
- Restrict checkpoint loading to verified, trusted sources only
- Implement network segmentation for ML training environments
Patch Information
Huggingface has released a security patch to address this vulnerability. The fix is available in commit 693667b8ac8138b83f8adb6522ddaf42fa07c125. Organizations should update their Transformers installations to include this commit or any subsequent release containing the fix.
Workarounds
- Avoid loading model checkpoints from untrusted or unverified sources until patches are applied
- Implement checkpoint verification mechanisms using cryptographic signatures before loading
- Run ML training environments in isolated containers or sandboxed environments to limit potential impact
- Consider using safer serialization alternatives like safetensors where possible
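The verification workaround above can be approximated with a digest check before any deserialization happens. verify_checkpoint is a hypothetical helper, and the trusted digest must come from an out-of-band source such as the model publisher; a hash comparison proves integrity against a known-good value, not authorship (full signature schemes need asymmetric keys):

```python
import hashlib

def verify_checkpoint(path: str, expected_sha256: str) -> bool:
    """Compare a checkpoint file's SHA-256 digest against a trusted value
    obtained out of band. Only load the file if this returns True."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large checkpoints don't load into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

The key property is ordering: the digest is computed over the raw bytes and checked before the file is ever handed to load_repo_checkpoint(), so a tampered checkpoint is rejected without its payload being deserialized.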
# Update Huggingface Transformers to the latest patched version
pip install --upgrade transformers
# Verify the installation includes the security fix
pip show transformers