CVE-2026-31218: Optimate Project RCE Vulnerability

CVE-2026-31218 Overview

CVE-2026-31218 is an insecure deserialization vulnerability [CWE-502] in the _load_model() function of the neural_magic_training.py script in the nebuly-ai optimate project. The affected commit is a6d302f912b481c94370811af6b11402f51d377f, dated 2024-07-21. The function loads a model state dictionary from a state_dict.pt file using torch.load() without setting the weights_only=True security parameter, which lets Python's pickle module deserialize arbitrary objects from attacker-controlled files. A remote attacker who gets a malicious state_dict.pt into the directory named by the --model argument achieves arbitrary code execution on the victim system when the script loads it.

Critical Impact
Arbitrary code execution on systems loading attacker-supplied model files through Optimate's training script.

Affected Products

nebuly-ai optimate project at commit a6d302f912b481c94370811af6b11402f51d377f (2024-07-21)
neural_magic_training.py script invoking torch.load() without weights_only=True
Downstream forks or deployments embedding the vulnerable _load_model() function

Discovery Timeline

2026-05-12 - CVE-2026-31218 published to NVD
2026-05-15 - Last updated in NVD database

Technical Details for CVE-2026-31218

Vulnerability Analysis

The vulnerability resides in the _load_model() function of neural_magic_training.py, part of the Optimate machine learning optimization project. PyTorch's torch.load() defaults to using the Python pickle module for deserialization. Pickle is not a secure serialization format because it allows arbitrary callables to execute during object reconstruction via the __reduce__ protocol.
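The __reduce__ behavior described above can be shown with a harmless sketch that uses only the standard library. The Payload class is hypothetical, and len stands in for the dangerous callable an attacker would actually choose.

```python
import pickle


class Payload:
    """Hypothetical stand-in for an attacker-crafted object."""

    def __reduce__(self):
        # pickle records "call len('abc')" instead of a Payload instance.
        # A real attack would name a callable such as os.system here.
        return (len, ("abc",))


blob = pickle.dumps(Payload())

# Loading the bytes calls len("abc"); no Payload object is ever rebuilt.
result = pickle.loads(blob)
print(type(result).__name__, result)  # prints: int 3
```

The loader never needs the Payload class to exist on the victim system. The pickle stream only carries the name of the callable and its arguments, which is why an unrestricted load of an untrusted file is equivalent to running code from that file.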

When _load_model() reads a state_dict.pt file, it invokes torch.load() without passing weights_only=True. PyTorch added weights_only as a safer loading mode that restricts unpickling to tensors, primitive types, dictionaries, and a small allowlist of other types. PyTorch releases before 2.6 default this parameter to False, so omitting it there selects the unrestricted path. Without the flag, any Python object embedded in the file is reconstructed, including objects whose __reduce__ method directs the unpickler to run shell commands or import arbitrary modules.
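The allowlisting idea behind a restricted loading mode can be illustrated with the standard library alone. The sketch below is not PyTorch's implementation; it follows the find_class override pattern from the Python pickle documentation, and the names AllowlistUnpickler, ALLOWED_GLOBALS, restricted_loads, and Payload are hypothetical.

```python
import collections
import io
import pickle

# Hypothetical allowlist: only these (module, name) pairs may be resolved.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Every global named in the pickle stream passes through here.
        if (module, name) not in ALLOWED_GLOBALS:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


def restricted_loads(data):
    """Unpickle data, refusing any global outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Payload:
    def __reduce__(self):
        return (len, ("abc",))  # benign stand-in for a dangerous callable


benign = pickle.dumps(collections.OrderedDict(weight=[0.1, 0.2]))
hostile = pickle.dumps(Payload())

print(restricted_loads(benign))  # plain containers and numbers load normally
try:
    restricted_loads(hostile)
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

The hostile stream is rejected before the named callable is resolved, so nothing from it executes. An unrestricted load resolves whatever global the file names and then calls it.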

The --model command-line argument controls the directory from which the file is read, so an attacker who can place or substitute a state_dict.pt payload in that location triggers code execution under the privileges of the user running the script.

Root Cause

The root cause is the absence of the weights_only=True argument on the torch.load() call inside _load_model(). PyTorch documentation explicitly warns that loading untrusted checkpoints without this flag is unsafe. The application also lacks integrity verification on model files, so swapped or tampered checkpoints are accepted without challenge.

Attack Vector

Exploitation requires the victim to run the training script and point --model at a directory containing the malicious checkpoint. Distribution channels include public model hubs, shared storage, supply-chain compromise of model repositories, or social engineering. Because Optimate is typically run on workstations or GPU servers with broad filesystem and network access, code execution often yields immediate access to training data, credentials, and adjacent services.

No verified proof-of-concept code is published. The vulnerability follows the well-documented pickle deserialization pattern in which a crafted object implements __reduce__ to return a callable such as os.system together with attacker-chosen arguments. Refer to the GitHub Repository for Optimate and Notion CVE-2026-31218 Details for additional context.

Detection Methods for CVE-2026-31218

Indicators of Compromise

Unexpected child processes spawned by Python interpreters running neural_magic_training.py, such as shells, curl, wget, or python -c invocations
state_dict.pt files sourced from untrusted repositories, mirrors, or user uploads
Outbound network connections initiated immediately after a model load operation
New persistence artifacts, SSH keys, or cron entries created by the user running Optimate

Detection Strategies

Static scanning of pickle files for GLOBAL, STACK_GLOBAL, and INST opcodes that reference dangerous modules such as os, subprocess, posix, or builtins, typically followed by a REDUCE opcode that calls the referenced object
Code review and SAST rules flagging torch.load() calls missing weights_only=True
Endpoint behavioral detection that correlates Python process execution with subsequent shell or network activity
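The static scanning strategy above can be sketched with the standard pickletools module, which parses opcodes without executing them. This is a heuristic under stated assumptions: the denylist and the function names scan_pickle and scan_checkpoint are hypothetical, STACK_GLOBAL operands are approximated by the two most recent string pushes, only the first pickle stream in each blob is inspected, and zip-format checkpoints are assumed to keep their pickle data in members ending in .pkl.

```python
import pickle
import pickletools
import zipfile

# Hypothetical denylist of top-level modules; tune it for your environment.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "sys"}
STRING_OPCODES = {
    "SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE",
    "STRING", "BINSTRING", "SHORT_BINSTRING",
}


def scan_pickle(data):
    """Return 'module.name' for each suspicious global named in data."""
    findings = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPCODES:
            recent_strings.append(str(arg))
            continue
        if opcode.name in ("GLOBAL", "INST"):
            # pickletools reports the operand as "module name".
            module, _, name = arg.partition(" ")
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Heuristic: module and name were the last two strings pushed.
            module, name = recent_strings[-2], recent_strings[-1]
        else:
            continue
        if module.split(".")[0] in SUSPICIOUS_MODULES:
            findings.append(f"{module}.{name}")
    return findings


def scan_checkpoint(path):
    """Scan a checkpoint file, looking inside zip archives for .pkl members."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            blobs = [archive.read(n) for n in archive.namelist()
                     if n.endswith(".pkl")]
    else:
        with open(path, "rb") as fh:
            blobs = [fh.read()]
    return [hit for blob in blobs for hit in scan_pickle(blob)]


# Hand-written protocol 0 stream naming os.system; it is parsed, never loaded.
hostile = b"cos\nsystem\n(S'echo hi'\ntR."
benign = pickle.dumps({"w": [1.0, 2.0]})

print(scan_pickle(hostile))  # prints: ['os.system']
print(scan_pickle(benign))   # prints: []
```

A scanner like this can miss payloads that fetch names through the memo or hide behind allowlisted-looking modules, so it complements safe loading rather than replacing it.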

Monitoring Recommendations

Log all command-line invocations of training scripts and capture the --model path argument
Alert on Python processes writing to autostart locations or executing interactive shells
Monitor egress from GPU and training hosts for connections to non-corporate destinations
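In-process telemetry can supplement the host-level monitoring above. The sketch below uses Python's runtime audit hooks (sys.addaudithook, available since Python 3.8) to record globals resolved during unpickling along with process and socket activity. The names WATCHED_EVENTS, observed, audit_hook, and Payload are hypothetical, and whether a given loader surfaces pickle.find_class events depends on how it wraps the unpickler, so treat this as an assumption to verify in your own environment.

```python
import pickle
import sys

# Audit events documented by CPython that are relevant to this attack pattern.
WATCHED_EVENTS = {
    "pickle.find_class",   # a pickle stream resolved a global
    "os.system",           # shell command execution
    "subprocess.Popen",    # child process creation
    "socket.connect",      # outbound network connection
}

observed = []


def audit_hook(event, args):
    # Hooks run for every audit event, so keep the work small and never raise.
    if event in WATCHED_EVENTS:
        # A real deployment would forward this to its logging pipeline.
        observed.append((event, repr(args)[:200]))


sys.addaudithook(audit_hook)


class Payload:
    def __reduce__(self):
        return (len, ("abc",))  # benign stand-in for a dangerous callable


pickle.loads(pickle.dumps(Payload()))

for event, details in observed:
    print(event, details)
```

An audit hook is a logging aid, not a sandbox: code that already runs inside the interpreter can still act before an alert is handled, so it should feed the alerting described above rather than stand in for isolation.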

How to Mitigate CVE-2026-31218

Immediate Actions Required

Stop loading state_dict.pt files from untrusted sources until the script is patched
Modify local copies of _load_model() to pass weights_only=True to torch.load()
Audit existing model directories for unexpected or recently modified checkpoint files
Restrict the user account running training jobs so that code executed during deserialization cannot reach sensitive data or credentials

Patch Information

No official patched release is listed in the CVE record. Track the GitHub Repository for Optimate for upstream fixes. As an interim fix, apply a local code change that sets weights_only=True on every torch.load() call, or replace pickle-based checkpoints with safer formats such as safetensors.

Workarounds

Convert trusted checkpoints to the safetensors format and refuse to load .pt files
Verify checkpoint integrity with cryptographic signatures or hashes before loading
Run training jobs inside containers or sandboxes with no network egress and read-only access to source code
Use a separate, unprivileged service account for model loading and training

```bash
# Configuration example: enforce safe loading and isolate the training process
# 1. Patch the call site to require weights-only deserialization
#    Replace:  state = torch.load(path)
#    With:     state = torch.load(path, weights_only=True)

# 2. Verify checkpoint integrity before use
sha256sum -c state_dict.pt.sha256 || exit 1

# 3. Run training in an isolated container with no outbound network
docker run --rm \
  --network=none \
  --read-only \
  --tmpfs /tmp \
  -v "$PWD/models:/models:ro" \
  -u 1001:1001 \
  optimate:patched \
  python neural_magic_training.py --model /models/trusted
```
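
The integrity check recommended under Workarounds can also run inside the Python process immediately before the checkpoint is opened. This is a minimal sketch using only the standard library; the helper names sha256_of and verify_checkpoint are hypothetical, and the expected digest is assumed to come from a trusted channel separate from the checkpoint itself.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected_hex):
    """Raise ValueError unless the file at path matches the expected SHA-256."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, expected_hex.strip().lower()):
        raise ValueError(f"checksum mismatch for {path}")
    return path


# Demonstration with a temporary file standing in for state_dict.pt.
with tempfile.TemporaryDirectory() as workdir:
    checkpoint = os.path.join(workdir, "state_dict.pt")
    with open(checkpoint, "wb") as fh:
        fh.write(b"trusted bytes")
    trusted_digest = sha256_of(checkpoint)  # normally published by the producer

    verify_checkpoint(checkpoint, trusted_digest)  # passes silently

    with open(checkpoint, "ab") as fh:
        fh.write(b"tampered")
    try:
        verify_checkpoint(checkpoint, trusted_digest)
    except ValueError as exc:
        print("refusing to load:", exc)
```

A hash check only establishes that the file matches what the producer published. It does not make a pickle-based checkpoint safe to load if the producer is untrusted, so it should be combined with weights_only=True or a non-pickle format.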