CVE-2026-31252: CosyVoice RCE Vulnerability

CVE-2026-31252 Overview

CVE-2026-31252 is an insecure deserialization vulnerability [CWE-502] in CosyVoice, an open-source voice synthesis framework from FunAudioLLM. The flaw exists through commit 6e01309e01bc93bbeb83bdd996b1182a81aaf11e and affects the model loading component. CosyVoice calls torch.load() on model weight files such as llm.pt, flow.pt, and hift.pt without setting weights_only=True. Without that restriction, the underlying pickle module can deserialize arbitrary Python objects. An attacker who supplies a malicious model directory can execute arbitrary code on a victim's system when the CosyVoice Web UI loads the files. The issue is also tracked under [CWE-94] (Improper Control of Generation of Code).

Critical Impact
Loading an attacker-supplied CosyVoice model directory results in arbitrary code execution in the context of the user running the Web UI.

Affected Products

CosyVoice framework through commit 6e01309e01bc93bbeb83bdd996b1182a81aaf11e
CosyVoice Web UI instances loading user-provided model directories
Deployments using torch.load() against llm.pt, flow.pt, or hift.pt files

Discovery Timeline

2026-05-11 - CVE-2026-31252 published to NVD
2026-05-12 - Last updated in NVD database

Technical Details for CVE-2026-31252

Vulnerability Analysis

CosyVoice loads model weights with PyTorch's torch.load() function. When weights_only is not enabled (it defaults to False in PyTorch releases before 2.6), torch.load() deserializes the file using Python's pickle module, which can import arbitrary classes and invoke arbitrary callables encoded inside the file. PyTorch introduced the weights_only=True parameter specifically to restrict deserialization to tensors and a small allowlist of other safe types, but CosyVoice does not enable it.

When the Web UI is pointed at a model directory, the framework iterates through expected files such as llm.pt, flow.pt, and hift.pt and deserializes each one. A crafted pickle stream, typically produced by serializing an object whose __reduce__ method returns an attacker-chosen callable and its arguments, causes that callable to run at load time. Execution occurs with the privileges of the user running the Web UI process.
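
The load-time behavior described above can be illustrated with the standard pickle module alone. This is a minimal, deliberately harmless sketch: the Demo class is hypothetical, and the callable it names (sorted) stands in for whatever an attacker would choose. It is not taken from the published PoC.

```python
import pickle


class Demo:
    """Hypothetical stand-in for an object embedded in a tampered weight file."""

    def __reduce__(self):
        # pickle records "call sorted([3, 1, 2])" instead of the object's state.
        # A malicious file would name a dangerous callable here instead.
        return (sorted, ([3, 1, 2],))


stream = pickle.dumps(Demo())

# Loading does not rebuild a Demo instance. It imports the recorded callable
# and invokes it, then returns whatever the callable returned.
result = pickle.loads(stream)
print(type(result).__name__, result)  # prints: list [1, 2, 3]
```

The loader never asked for sorted to run; the file told it to. The same mechanism applies when torch.load() performs a full unpickle of a .pt file.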

The attack vector is local and requires user interaction: the malicious files must be present on the victim's machine, and the victim must launch the Web UI and select the malicious directory. The scope is changed because code executes outside the model loader's intended boundary.

Root Cause

The root cause is the use of torch.load() without weights_only=True. This permits pickle-based object reconstruction during model deserialization, which is a documented unsafe operation when loading untrusted files.
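
Conceptually, a restricted loader differs from a full unpickle in that it refuses to import anything outside an allowlist. The sketch below shows that idea with the standard library's pickle.Unpickler.find_class hook. It is an illustration of the principle only, not PyTorch's implementation of weights_only=True, and the names AllowlistUnpickler, ALLOWED_GLOBALS, and restricted_loads are hypothetical.

```python
import collections
import io
import pickle

# Assumed allowlist for illustration only; PyTorch maintains its own list.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that only resolves globals named in ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        # find_class is consulted for every global the stream tries to import.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Deserialize data, rejecting any global outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


# A plain container of primitive values loads normally.
benign = pickle.dumps(collections.OrderedDict(layer=[0.1, 0.2]))
print(restricted_loads(benign))


class Probe:
    """Harmless object whose pickle asks the loader to call sorted()."""

    def __reduce__(self):
        return (sorted, ([3, 1, 2],))


try:
    restricted_loads(pickle.dumps(Probe()))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

In an actual CosyVoice fork the fix is the weights_only=True argument itself; the sketch is only meant to show why that argument closes the hole.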

Attack Vector

An attacker distributes a model directory through unofficial model hubs, file shares, or social engineering. The directory contains weight files that look legitimate but embed a malicious pickle payload. When the victim opens the CosyVoice Web UI and configures it to load the directory, the payload executes during deserialization. No additional privilege escalation primitive is required for code execution within the user's account.

The vulnerability manifests during the torch.load() call on the tampered .pt file. See the GitHub PoC Repository and Notion CVE Details for technical references.

Detection Methods for CVE-2026-31252

Indicators of Compromise

Unexpected child processes spawned by the Python interpreter hosting the CosyVoice Web UI during model load.
.pt files originating from untrusted sources placed in CosyVoice model directories.
Outbound network connections initiated by the Web UI process shortly after a new model directory is selected.
Modifications to user startup files, cron entries, or shell profiles immediately after launching CosyVoice.

Detection Strategies

Inspect pickle streams inside .pt files for GLOBAL or STACK_GLOBAL opcodes that reference modules like os, subprocess, or builtins, particularly when they are followed by REDUCE or BUILD opcodes that act on the imported object.
Hunt for invocations of torch.load() in source code without an explicit weights_only=True argument.
Correlate Web UI process telemetry with file read events targeting unknown llm.pt, flow.pt, or hift.pt paths.

Monitoring Recommendations

Log and alert on process trees where the CosyVoice Python process spawns shells, interpreters, or networking utilities.
Track filesystem writes by the Web UI process outside its working directory and model cache.
Audit downloads of CosyVoice model archives from non-official sources.

How to Mitigate CVE-2026-31252

Immediate Actions Required

Restrict CosyVoice Web UI usage to model directories from trusted, verified publishers.
Patch local installations to invoke torch.load() with weights_only=True for all weight files.
Run the CosyVoice Web UI under a low-privilege account isolated from sensitive data and credentials.
Block ingestion of model archives from untrusted file shares, forums, or third-party mirrors.
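
To locate the calls that need patching, a small static check over the source tree can help. The sketch below uses the standard ast module; find_unsafe_torch_load is a hypothetical helper name. It only recognizes the literal form torch.load(...), so aliased imports such as "from torch import load" would be missed.

```python
import ast


def find_unsafe_torch_load(source: str, filename: str = "<string>") -> list[int]:
    """Return line numbers of torch.load(...) calls lacking weights_only=True."""
    unsafe = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        # Only a literal True counts as safe; variables are reported for review.
        safe = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not safe:
            unsafe.append(node.lineno)
    return sorted(unsafe)


sample = (
    "import torch\n"
    "a = torch.load('llm.pt')\n"
    "b = torch.load('flow.pt', weights_only=True)\n"
    "c = torch.load('hift.pt', map_location='cpu', weights_only=False)\n"
)
print(find_unsafe_torch_load(sample))  # prints: [2, 4]
```

The source is parsed, not imported, so the check can run on a machine without PyTorch installed.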

Patch Information

No official fixed commit is referenced in the published advisory at the time of writing. Monitor the CosyVoice GitHub repository for upstream changes that add weights_only=True to model loading calls or migrate to safetensors-based formats.

Workarounds

Manually modify model loading code to pass weights_only=True to every torch.load() call.
Convert trusted models to the safetensors format and load them with the safetensors library, which does not execute arbitrary code.
Execute the Web UI inside a container or sandbox with no network egress and read-only access to sensitive paths.
Verify cryptographic hashes of model files against publisher-provided values before loading.

bash

# Configuration example: enforce safe loading in CosyVoice forks
# Replace insecure calls such as:
#   state = torch.load(model_path)
# with:
#   state = torch.load(model_path, weights_only=True, map_location='cpu')

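# Run the Web UI under a dedicated low-privilege user.
# Note: these commands do not restrict network access; block egress separately
# (for example with firewall rules or a network-isolated container).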
sudo useradd -r -s /usr/sbin/nologin cosyvoice
sudo -u cosyvoice python3 webui.py --model_dir /srv/cosyvoice/models
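
The hash verification workaround listed above can be sketched as follows. The function names and the shape of the expected manifest (file name mapped to a hex SHA-256 digest) are assumptions; the advisory does not say how or whether publishers distribute digests.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not read into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_dir(model_dir: str, expected: dict[str, str]) -> list[str]:
    """Return the names of files that are missing or fail the digest check."""
    failures = []
    for name, want in expected.items():
        path = Path(model_dir) / name
        if not path.is_file() or sha256_of(path) != want.lower():
            failures.append(name)
    return failures
```

A wrapper script would pass the model directory to the Web UI only when verify_model_dir() returns an empty list. The check is only as trustworthy as the channel the expected digests came from, so it complements weights_only=True and does not replace it.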