CVE-2026-31214 Overview
CVE-2026-31214 is an insecure deserialization vulnerability [CWE-502] in the torch-checkpoint-shrink.py script from the ml-engineering project. The vulnerable code exists in commit 0099885db36a8f06556efe1faf552518852cb1e0. The script calls torch.load() on PyTorch checkpoint files without setting weights_only=True. This allows the underlying pickle module to deserialize arbitrary Python objects supplied by an attacker. A remote attacker who delivers a crafted .pt checkpoint file can achieve arbitrary code execution as the user running the script. Machine learning workflows that consume third-party checkpoints face direct exposure.
Critical Impact
Loading an attacker-supplied PyTorch checkpoint triggers arbitrary code execution in the context of the user running torch-checkpoint-shrink.py.
Affected Products
- ml-engineering project, torch-checkpoint-shrink.py script
- Commit 0099885db36a8f06556efe1faf552518852cb1e0 and earlier revisions containing the unsafe torch.load() call
- Any downstream tooling or pipeline that invokes this script on untrusted .pt files
Discovery Timeline
- 2026-05-12 - CVE-2026-31214 published to NVD
- 2026-05-13 - Last updated in NVD database
Technical Details for CVE-2026-31214
Vulnerability Analysis
The vulnerability lives in the checkpoint loading path of torch-checkpoint-shrink.py. The script uses torch.load() to read PyTorch checkpoint files with the .pt extension. When weights_only=True is not in effect (on PyTorch releases before 2.6 it must be set explicitly), torch.load() deserializes the file with Python's unrestricted pickle module. The pickle format lets a serialized object name an arbitrary callable and its arguments, typically through a __reduce__ method, and the unpickler invokes that callable while loading. An attacker who controls a checkpoint file can therefore run code on the host that processes it. The attack vector is network-reachable because checkpoint files are routinely downloaded from model hubs, shared via collaboration platforms, or pulled from artifact registries.
Root Cause
The root cause is the absence of the weights_only=True parameter on the torch.load() invocation at line 57 of the script. Without that flag, PyTorch releases before 2.6 use the unrestricted pickle-based loader. That loader does not constrain which classes or callables can be reconstructed during deserialization. Trust is implicitly granted to whoever produced the checkpoint.
Attack Vector
An attacker crafts a malicious .pt file whose pickle payload was serialized from an object with a __reduce__ method that returns a callable such as os.system or subprocess.Popen together with attacker-controlled arguments. The attacker distributes the file through a model repository, a shared bucket, or a download link. When a victim runs torch-checkpoint-shrink.py against the file, torch.load() invokes the pickle deserializer, which calls the embedded callable. Code runs with the privileges of the user executing the script. See the GitHub Script Example for the affected call site.
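The mechanism described above can be illustrated with a deliberately harmless sketch using only the standard pickle module. The Payload class is hypothetical, and os.getpid stands in for a dangerous callable such as os.system. The point is that unpickling calls whatever __reduce__ named and returns its result instead of rebuilding the object.

```python
import os
import pickle


class Payload:
    """Hypothetical stand-in for an attacker-controlled object."""

    def __reduce__(self):
        # pickle records "call os.getpid()" rather than any object state.
        # A real exploit would name os.system or subprocess.Popen here.
        return (os.getpid, ())


blob = pickle.dumps(Payload())

# Loading the bytes invokes the recorded callable inside this process.
result = pickle.loads(blob)

print(type(result).__name__)  # the loaded value is an int, not a Payload
```

The serialized bytes do not need the Payload class to exist on the victim's machine. Only the named callable has to be importable there.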
Detection Methods for CVE-2026-31214
Indicators of Compromise
- Unexpected child processes spawned by Python interpreters running torch-checkpoint-shrink.py, such as shells, curl, wget, or package managers
- Outbound network connections from Python processes immediately after a checkpoint load operation
- New files written under home directories, /tmp, or model cache paths during checkpoint processing
- .pt files originating from untrusted sources or with anomalous size relative to declared model dimensions
Detection Strategies
- Hunt process trees where a python process is the parent of non-ML utilities like /bin/sh, bash, nc, or powershell during checkpoint loading windows
- Inspect .pt files for suspicious opcodes by scanning pickle streams for GLOBAL or STACK_GLOBAL references to modules such as os, subprocess, posix, or builtins
- Flag torch.load() call sites in code review and CI scanning that omit weights_only=True
- Correlate file download events for .pt artifacts with subsequent process and network activity on the same host
Monitoring Recommendations
- Enable command-line and process-creation auditing on ML training and inference hosts
- Log all file reads of .pt and .pth artifacts and the user account performing the load
- Alert on Python processes initiating outbound TCP connections to non-allowlisted destinations
- Track integrity hashes of checkpoint files against a known-good registry before consumption
How to Mitigate CVE-2026-31214
Immediate Actions Required
- Stop running torch-checkpoint-shrink.py against checkpoints from untrusted or unverified sources
- Patch the script locally by adding weights_only=True to the torch.load() call at line 57
- Audit the repository for additional torch.load() usages that omit the safe-loading flag
- Quarantine .pt files received from external collaborators until they are scanned and validated
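The audit step above could be automated with a small check built on the standard ast module. This is a minimal sketch under stated assumptions: the function name is hypothetical, and it only recognizes calls written literally as torch.load(...), so aliased imports such as "from torch import load" or flags passed through **kwargs would need extra handling.

```python
import ast


def find_unsafe_torch_load(source, filename="<string>"):
    """Return line numbers of torch.load(...) calls lacking weights_only=True."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        # Only a literal weights_only=True counts as safe.
        safe = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not safe:
            findings.append(node.lineno)
    return sorted(findings)


sample = (
    "import torch\n"
    "a = torch.load('a.pt')\n"
    "b = torch.load('b.pt', map_location='cpu', weights_only=True)\n"
    "c = torch.load('c.pt', weights_only=False)\n"
)
unsafe_lines = find_unsafe_torch_load(sample)
print(unsafe_lines)  # [2, 4]
```

Parsing the syntax tree rather than grepping avoids false matches in comments and strings, at the cost of missing call sites that do not use the literal torch.load spelling.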
Patch Information
No upstream patch is currently referenced in the NVD record. Apply a local fix by modifying the torch.load() invocation to pass weights_only=True, which restricts deserialization to tensors, primitive types, and other allowlisted types and rejects arbitrary Python objects. PyTorch versions 2.6 and later enable this behavior by default, but the script should still set the parameter explicitly so that it stays safe on older PyTorch releases and the intent is clear to readers. Track the upstream ml-engineering repository for any forthcoming fix.
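To illustrate why an allowlist stops this class of attack, the sketch below restricts the standard pickle module by overriding Unpickler.find_class, the approach described in the Python documentation. It is a conceptual analogy only and is not PyTorch's implementation of weights_only=True. The class name, the allowlist contents, and the harmless os.getpid payload are assumptions made for the demonstration.

```python
import collections
import io
import os
import pickle

# Illustrative allowlist: only this global may be resolved while loading.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks to import a global by name.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    """Unpickle bytes while refusing any global outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class _Payload:
    """Harmless stand-in for a malicious object."""

    def __reduce__(self):
        return (os.getpid, ())


# Plain data built from allowlisted types loads normally.
benign = restricted_loads(pickle.dumps(collections.OrderedDict(layer=[1.0, 2.0])))

# The payload is rejected before its callable can run.
try:
    restricted_loads(pickle.dumps(_Payload()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The rejection happens when the stream tries to resolve the callable, so the embedded function is never invoked.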
Workarounds
- Run the script inside an ephemeral container or sandbox with no network egress and no access to credentials
- Pre-validate checkpoint files by loading them with weights_only=True in an isolated process before any conversion step
- Restrict execution to dedicated service accounts without write access to source code or secrets
- Enforce checksum verification against a trusted manifest for every .pt artifact entering the pipeline
# Configuration example: safe-load wrapper
python -c "import torch; torch.load('checkpoint.pt', map_location='cpu', weights_only=True)"
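The checksum workaround above could look like the following sketch, which uses only the standard library. The manifest format (a dict mapping file names to SHA-256 hex digests) and the function names are assumptions for illustration. In practice the manifest would come from a trusted, separately protected source rather than being computed next to the file.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted(path, manifest):
    """Return True only if the file's digest matches its manifest entry."""
    expected = manifest.get(os.path.basename(path))
    return expected is not None and hmac.compare_digest(expected, sha256_of(path))


# Demonstration with a throwaway file standing in for a checkpoint.
with tempfile.TemporaryDirectory() as workdir:
    artifact = os.path.join(workdir, "checkpoint.pt")
    with open(artifact, "wb") as handle:
        handle.write(b"placeholder bytes standing in for a checkpoint")

    manifest = {"checkpoint.pt": sha256_of(artifact)}
    accepted = is_trusted(artifact, manifest)

    # Any modification after the manifest was produced is detected.
    with open(artifact, "ab") as handle:
        handle.write(b"tampered")
    rejected = not is_trusted(artifact, manifest)
```

A matching digest only shows that the file is the one the manifest author published. It says nothing about whether that author is trustworthy, so this check belongs alongside safe loading, not in place of it.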