CVE-2026-31251: CosyVoice RCE Vulnerability

CVE-2026-31251 Overview

CVE-2026-31251 is an insecure deserialization vulnerability [CWE-502] affecting the CosyVoice speech synthesis project through commit 6e01309e01bc93bbeb83bdd996b1182a81aaf11e. The gRPC server component loads model files using torch.load() without setting the weights_only=True parameter. This permits unpickling of arbitrary Python objects from attacker-controlled model directories. When a victim launches the gRPC server pointing at a malicious model path, code embedded in the pickle stream executes during server initialization.

Critical Impact
Attackers who supply or substitute a model file in the directory loaded by the CosyVoice gRPC server gain arbitrary code execution on the host during startup.

Affected Products

CosyVoice (FunAudioLLM) through commit 6e01309e01bc93bbeb83bdd996b1182a81aaf11e
gRPC server component shipped with the CosyVoice repository
Any deployment loading untrusted CosyVoice model directories

Discovery Timeline

2026-05-11 - CVE-2026-31251 published to NVD
2026-05-12 - Last updated in NVD database

Technical Details for CVE-2026-31251

Vulnerability Analysis

The CosyVoice gRPC server initializes by loading a pretrained speech synthesis model from a directory specified at launch time. The loader invokes PyTorch's torch.load() function against files in that directory. On PyTorch releases before 2.6, weights_only defaults to False, so a call that omits the flag deserializes the file with Python's unrestricted pickle machinery. An unrestricted unpickler calls whatever callables the stream names, with arguments the stream supplies (typically planted through an object's __reduce__ method when the file was created), so loading a file can execute arbitrary Python.
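The mechanism can be illustrated with a deliberately harmless sketch that uses only the standard library. The `Marker` class and its argument string are hypothetical; the point is only that the callable recorded in the stream (here the builtin `len`) runs during loading, and a real payload would name a dangerous callable instead.

```python
import pickle


class Marker:
    """Hypothetical stand-in for an object hidden inside a model file."""

    def __reduce__(self):
        # Runs at pickling time: records "call len(<this string>)" in the stream.
        return (len, ("attacker-chosen argument",))


payload = pickle.dumps(Marker())

# Unpickling calls len() with the recorded argument. No Marker instance is
# rebuilt; the result is whatever the recorded callable returns.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

Nothing in `pickle.loads` asks the caller for consent before invoking the callable, which is why a model file has to be treated as executable content.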

An attacker who controls the contents of the model directory crafts a pickle payload that runs commands when the server starts. Execution occurs in the security context of the user running the gRPC service, which in many machine learning deployments is a privileged or GPU-enabled account.

Root Cause

The root cause is missing input validation on deserialized data. PyTorch documents weights_only=True as the safe loading mode for tensors and state dictionaries. CosyVoice omits this parameter, so the loader treats every file as a fully trusted pickle stream. The flaw is tracked under [CWE-20: Improper Input Validation] in addition to the deserialization weakness class.
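The idea behind a restricted loading mode can be sketched with the standard library's `pickle.Unpickler.find_class` hook, which is consulted for every global a stream references. This is a minimal illustration of allowlisting, not PyTorch's actual implementation; `ALLOWED_GLOBALS`, `AllowlistUnpickler`, `restricted_loads`, and `Marker` are hypothetical names.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist: the only globals this loader agrees to resolve.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the stream references; refuse unlisted ones.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Deserialize data, resolving only allowlisted globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Marker:
    """Hypothetical object whose pickle asks the loader to call len()."""

    def __reduce__(self):
        return (len, ("x",))


# A plain state-dict-like mapping loads normally.
state = OrderedDict(weight=[0.1, 0.2])
loaded = restricted_loads(pickle.dumps(state))

# A stream that references any other callable is rejected before it runs.
try:
    restricted_loads(pickle.dumps(Marker()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(loaded == state, blocked)
```

A loader that trusts every file is equivalent to an unpickler whose `find_class` resolves anything; the safe mode narrows that to a short list of known types.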

Attack Vector

Exploitation requires the victim to start the gRPC server with a model path that contains an attacker-supplied file. Attack scenarios include supply-chain substitution of model weights downloaded from public hubs, shared filesystem tampering, and social engineering that convinces a developer to test a malicious fine-tune. No authentication to the gRPC service is required because the payload triggers during server startup, before any client request.

For exploitation mechanics, see the CosyVoice repository and the public CVE analysis.

Detection Methods for CVE-2026-31251

Indicators of Compromise

Unexpected child processes spawned by the Python interpreter hosting the CosyVoice gRPC server during startup
Model files in CosyVoice directories with recent modification timestamps from untrusted users or external downloads
Outbound network connections initiated by the server process before any gRPC client traffic arrives
Pickle files whose GLOBAL or STACK_GLOBAL opcodes import callables such as os.system, posix.system, builtins.exec, or anything from the subprocess module
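The last indicator can be checked without ever loading the file, because `pickletools.genops` disassembles a stream without executing it. The sketch below is a heuristic under stated assumptions: the module and builtin lists are illustrative rather than exhaustive, `find_suspicious_globals` is a hypothetical helper, and archives written by current `torch.save` are zip files whose embedded pickle member would need to be extracted first.

```python
import os
import pickle
import pickletools

# Illustrative, non-exhaustive lists of imports worth flagging.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess"}
SUSPICIOUS_BUILTINS = {"exec", "eval", "compile", "__import__"}


def find_suspicious_globals(data: bytes) -> list[str]:
    """Return 'module.name' for each flagged import found in a pickle stream."""
    hits = []
    recent_strings = []  # STACK_GLOBAL takes its operands from pushed strings
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            module, name = recent_strings[-2], recent_strings[-1]
        else:
            if isinstance(arg, str):
                recent_strings.append(arg)
            continue
        top_level = module.split(".")[0]
        if top_level in SUSPICIOUS_MODULES or (
            module == "builtins" and name in SUSPICIOUS_BUILTINS
        ):
            hits.append(f"{module}.{name}")
    return hits


# Pickling os.system stores only a reference to it; nothing is executed.
reference_only = pickle.dumps(os.system)
benign = pickle.dumps({"weights": [1.0, 2.0]})
print(find_suspicious_globals(reference_only), find_suspicious_globals(benign))
```

Dedicated scanners handle more cases (memoized strings, nested archives, obfuscated imports), so a helper like this is better suited to triage than to a final verdict.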

Detection Strategies

Scan .pt, .pth, and .bin files in model directories with pickle inspection tools such as picklescan or fickling
Audit Python process trees for CosyVoice servers that launch shells, package managers, or interpreters as children
Hash and compare model artifacts against vendor-published checksums before each server start
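The checksum comparison in the last strategy could look like the following sketch. It assumes a manifest mapping relative file names to SHA-256 hex digests obtained from a trusted source; the manifest format and the helper names `sha256_of` and `verify_model_dir` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model weights do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_dir(model_dir: Path, expected: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every file matched."""
    actual = {
        p.relative_to(model_dir).as_posix(): p
        for p in model_dir.rglob("*")
        if p.is_file()
    }
    problems = []
    for name, want in expected.items():
        if name not in actual:
            problems.append(f"missing: {name}")
        elif sha256_of(actual[name]) != want.lower():
            problems.append(f"hash mismatch: {name}")
    # Files absent from the manifest are reported too, since a planted
    # extra file may be picked up by the loader.
    for name in sorted(set(actual) - set(expected)):
        problems.append(f"unexpected file: {name}")
    return problems
```

A startup wrapper would call `verify_model_dir` and refuse to launch the server unless the returned list is empty.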

Monitoring Recommendations

Log all invocations of the CosyVoice gRPC entrypoint with the resolved model directory path and file hashes
Alert on torch.load calls without weights_only=True discovered through static analysis of CosyVoice forks
Forward process-creation and network-connection telemetry from inference hosts to a centralized analytics platform for retrospective review
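The static-analysis recommendation can be approximated with Python's `ast` module. This sketch flags calls spelled literally as `torch.load(...)` that do not pass the literal keyword `weights_only=True`; it will miss aliased imports such as `from torch import load`, and `find_unsafe_torch_load` and the sample file names are hypothetical.

```python
import ast


def find_unsafe_torch_load(source: str, filename: str = "<source>") -> list[int]:
    """Return line numbers of torch.load calls lacking weights_only=True."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        keyword = next((k for k in node.keywords if k.arg == "weights_only"), None)
        is_safe = (
            keyword is not None
            and isinstance(keyword.value, ast.Constant)
            and keyword.value.value is True
        )
        if not is_safe:
            findings.append(node.lineno)
    return sorted(findings)


sample = '''import torch
a = torch.load("llm.pt", map_location="cpu")
b = torch.load("flow.pt", weights_only=True)
c = torch.load("hift.pt", weights_only=False)
'''
print(find_unsafe_torch_load(sample))
```

Running a check like this in continuous integration for each fork turns the recommendation into an alert that fires when an unsafe call is introduced.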

How to Mitigate CVE-2026-31251

Immediate Actions Required

Restrict the CosyVoice gRPC server to load models only from directories under administrative control
Replace torch.load(path) calls with torch.load(path, weights_only=True) in local forks until an official patch is available
Verify cryptographic hashes of every model file against a trusted source before service startup
Run the gRPC server under a low-privilege, network-isolated service account
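The first action, restricting the server to administratively controlled directories, can be enforced in a small launch wrapper. The sketch below assumes Python 3.9 or later for `Path.is_relative_to`; `is_trusted_model_dir` and the idea of a fixed list of trusted roots are illustrative assumptions rather than part of CosyVoice.

```python
from pathlib import Path


def is_trusted_model_dir(candidate, trusted_roots) -> bool:
    """True if candidate resolves to a location inside one of trusted_roots."""
    # resolve() collapses ".." segments and follows symlinks, so a path that
    # merely starts with a trusted prefix cannot escape it.
    resolved = Path(candidate).resolve()
    return any(
        resolved.is_relative_to(Path(root).resolve()) for root in trusted_roots
    )
```

A wrapper would evaluate the `--model_dir` argument with this check and exit before importing any model-loading code when it returns False.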

Patch Information

No vendor-supplied patch is referenced in the NVD entry as of the last modified date. Monitor the CosyVoice GitHub repository for upstream fixes and adopt the weights_only=True loader pattern in the interim.

Workarounds

Convert model artifacts to the safetensors format, which does not execute code during deserialization
Place model directories on read-only mounts writable only by trusted release pipelines
Block outbound network egress from inference hosts at the firewall to limit second-stage payloads

bash

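# Configuration example: replace unsafe loader and run with least privilege
# 1. Patch the loader call in CosyVoice source (a .bak copy is kept for review).
#    [^()]* stops at the call's own closing parenthesis, so wrappers such as
#    load_state_dict(torch.load(...), strict=True) stay intact. Calls with nested
#    parentheses are skipped and must be patched by hand.
sed -i.bak 's/torch\.load(\([^()]*\))/torch.load(\1, weights_only=True)/g' cosyvoice/cli/model.py
grep -n 'torch\.load(' cosyvoice/cli/model.py   # confirm every call was rewritten

# 2. Run the gRPC server as an unprivileged user with read-only model mount
sudo useradd -r -s /usr/sbin/nologin cosyvoice
sudo mount -o bind /opt/models/cosyvoice /srv/cosyvoice/models
# Some systems ignore "ro" on the initial bind, so apply it with a remount
sudo mount -o remount,bind,ro /opt/models/cosyvoice /srv/cosyvoice/models
sudo -u cosyvoice python -m cosyvoice.bin.grpc_server \
  --model_dir /srv/cosyvoice/models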