CVE-2024-12044: MMDetection RCE Vulnerability

CVE-2024-12044 Overview

CVE-2024-12044 is a remote code execution vulnerability in open-mmlab/mmdetection version v3.3.0, an open-source object detection toolbox built on PyTorch. The flaw exists in the all_reduce_dict() distributed training API, which calls pickle.loads() on untrusted data without sanitization. An attacker on the distributed training network can broadcast a malicious serialized payload to trigger arbitrary code execution on participating nodes. The vulnerability is classified under CWE-502: Deserialization of Untrusted Data.

Critical Impact
Attackers can execute arbitrary code on distributed training workers by broadcasting a crafted pickle payload, compromising confidentiality, integrity, and availability of GPU training clusters.

Affected Products

open-mmlab/mmdetection version v3.3.0
Distributed training deployments using the all_reduce_dict() API
PyTorch-based training clusters that integrate the affected module

Discovery Timeline

2025-03-20 - CVE-2024-12044 published to the National Vulnerability Database (NVD)
2026-04-15 - Last updated in the NVD

Technical Details for CVE-2024-12044

Vulnerability Analysis

The vulnerability resides in the all_reduce_dict() function within the distributed training utilities of mmdetection. This API is used to synchronize dictionary objects across worker nodes during multi-GPU or multi-node training. To exchange complex Python objects across processes, the function serializes payloads with pickle and reconstructs them on receiving nodes using pickle.loads().

Python's pickle module executes arbitrary callables defined within the byte stream during deserialization. When pickle.loads() operates on attacker-controlled bytes, it can invoke __reduce__ methods that spawn subprocesses, write files, or import arbitrary modules. The mmdetection implementation performs no integrity verification, authentication, or type filtering before deserialization.
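The behavior described above can be seen with a harmless stand-in. The sketch below is illustrative only and is not taken from mmdetection or the Huntr report: the class name Probe is hypothetical, and os.getpid stands in for the dangerous callable (os.system, subprocess.Popen) an attacker would choose.

```python
import os
import pickle


class Probe:
    """Hypothetical, harmless stand-in for a malicious object."""

    def __reduce__(self):
        # The sender picks the callable and its arguments. pickle.loads()
        # calls it on the receiving side to "rebuild" the object.
        return (os.getpid, ())


payload = pickle.dumps(Probe())

# The receiver never gets a Probe instance back. It gets whatever the
# sender-chosen callable returned, after that callable has already run.
result = pickle.loads(payload)
print(type(result).__name__)
```

Replacing os.getpid with a command-executing callable is all that separates this demonstration from a working payload, which is why filtering or authentication has to happen before pickle.loads() is reached.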

Deserialization runs in the context of the training process, which typically holds privileged access to GPU resources, model weights, training datasets, and credentials for cloud storage or experiment tracking platforms. The high EPSS percentile reported for this CVE is consistent with how well understood and easily exploited pickle deserialization flaws are.

Root Cause

The root cause is the unconditional invocation of pickle.loads() on data received from peer processes in a distributed collective operation. The code treats inter-node communication as trusted, but a malicious or compromised peer, a network attacker positioned between nodes, or a poisoned training input pipeline can supply a crafted byte stream.

Attack Vector

An attacker who can inject data into the distributed training collective broadcasts a pickle payload whose __reduce__ returns a tuple invoking os.system, subprocess.Popen, or eval. When all_reduce_dict() deserializes the payload on each rank, the embedded callable executes immediately under the training user's identity. Exploitation requires no authentication when inter-node traffic is unprotected and no user interaction.
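A payload of this kind can be recognized without executing it, because the callable reference and the call itself appear as specific opcodes in the pickle stream. The following is a minimal sketch using the standard-library pickletools module; the helper name risky_opcodes, the opcode set, and the sample key names are assumptions made for illustration, not part of mmdetection.

```python
import os
import pickle
import pickletools

# Opcodes that import a global or call/construct an object while loading.
RISKY_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST",
    "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def risky_opcodes(data: bytes) -> set:
    """Return risky opcode names found in a pickle stream, without loading it."""
    seen = {opcode.name for opcode, _arg, _pos in pickletools.genops(data)}
    return seen & RISKY_OPCODES


class Probe:
    """Hypothetical, harmless stand-in for a crafted object."""

    def __reduce__(self):
        return (os.getpid, ())


plain = pickle.dumps(["loss_cls", "loss_bbox"])  # a plain list of strings
crafted = pickle.dumps(Probe())                  # carries a callable reference

print(sorted(risky_opcodes(plain)))
print(sorted(risky_opcodes(crafted)))
```

A plain list of strings needs none of these opcodes, so any hit in data that is supposed to be a list of dictionary keys is a strong signal. pickletools.genops() raises ValueError on malformed streams, which a caller would need to treat as a rejection.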

The vulnerability mechanism is described in the Huntr Bounty Report. The available advisory data references no public proof-of-concept code.

Detection Methods for CVE-2024-12044

Indicators of Compromise

Unexpected child processes such as sh, bash, python -c, or curl spawned by the mmdetection training process during all_reduce collective operations
Outbound network connections initiated by training workers to non-allowlisted hosts shortly after distributed synchronization phases
Unusual file writes under home directories, /tmp, or model checkpoint paths originating from the training process
Modifications to Python site-packages or training scripts on worker nodes without an authorized deployment event

Detection Strategies

Hunt for pickle.loads invocations on tensors or buffers received over torch.distributed primitives within mmdetection code paths
Alert on training processes executing shell interpreters or network clients, which are not part of normal training behavior
Inspect distributed training logs for deserialization exceptions or rank-specific crashes that may indicate failed exploitation attempts
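The first strategy above can be approximated with a static scan of Python sources. This is a rough sketch based on the standard-library ast module; the function name find_pickle_calls and the sample source are hypothetical, and the scan only catches the literal pickle.load / pickle.loads spelling (aliased imports such as "from pickle import loads" would need extra handling).

```python
import ast
from typing import List


def find_pickle_calls(source: str) -> List[int]:
    """Return sorted line numbers of pickle.load()/pickle.loads() calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr in {"load", "loads"}
            and isinstance(func.value, ast.Name)
            and func.value.id == "pickle"
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical source text standing in for a file under review.
SAMPLE = (
    "import json\n"
    "import pickle\n"
    "def tensor_to_obj(buf):\n"
    "    return pickle.loads(buf)\n"
    "def parse(text):\n"
    "    return json.loads(text)\n"
)

print(find_pickle_calls(SAMPLE))
```

Running the function over every .py file in an installed package or a downstream fork gives a short list of call sites to review by hand for whether their input can come from another rank.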

Monitoring Recommendations

Enable process lineage and command-line telemetry on all GPU training hosts and forward to a central analytics platform
Monitor egress traffic from training clusters against an allowlist of model registries, dataset stores, and experiment trackers
Capture file integrity events for the mmdetection installation directory and shared dataset volumes
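Where no file integrity tooling is deployed yet, the last recommendation can be approximated by comparing hash manifests taken at deployment time and afterwards. The sketch below uses only the standard library; the function names snapshot and changed_files are hypothetical, and a production setup would more likely rely on a dedicated integrity monitoring agent.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(root: str) -> Dict[str, str]:
    """Map each file under root (relative path) to its SHA-256 digest."""
    base = Path(root)
    digests = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            rel = path.relative_to(base).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A baseline taken right after an authorized deployment, stored off the worker, can then be compared with a fresh snapshot of the mmdetection installation directory after each training run.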

How to Mitigate CVE-2024-12044

Immediate Actions Required

Restrict distributed training traffic to isolated network segments and authenticate peers, for example with an encrypted overlay such as WireGuard or IPsec between training hosts, because NCCL does not itself encrypt or authenticate traffic between ranks
Run training workloads as unprivileged users in containers with read-only code mounts and no outbound internet access unless required
Audit any custom forks or downstream projects that import all_reduce_dict() and replace pickle with a safe serializer such as safetensors, msgpack, or JSON for dictionary synchronization
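As an illustration of the last action, dictionary keys can be exchanged as JSON bytes instead of a pickle stream. This is a minimal sketch under the assumption that only string keys need to cross the wire; encode_keys and decode_keys are hypothetical helpers, not mmdetection functions, and the numeric values would still travel as ordinary tensors.

```python
import json
from typing import Iterable, List


def encode_keys(keys: Iterable[str]) -> bytes:
    """Serialize dictionary keys to UTF-8 JSON for broadcast to other ranks."""
    key_list = list(keys)
    if not all(isinstance(key, str) for key in key_list):
        raise TypeError("only string keys are supported")
    return json.dumps(key_list).encode("utf-8")


def decode_keys(data: bytes) -> List[str]:
    """Parse received bytes and accept nothing but a list of strings."""
    keys = json.loads(data.decode("utf-8"))
    if not isinstance(keys, list) or not all(isinstance(k, str) for k in keys):
        raise ValueError("expected a JSON list of strings")
    return keys
```

JSON parsing only ever builds basic data types, so a malicious peer can at worst send unexpected keys or malformed input, both of which the receiver rejects or handles as data rather than code.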

Patch Information

The available advisory data does not reference a fixed version. Operators should consult the upstream open-mmlab/mmdetection repository and the Huntr Bounty Report for remediation status and apply the latest release once available. Until a patch is published, avoid running v3.3.0 in environments where peer nodes or network paths cannot be fully trusted.

Workarounds

Disable or avoid the all_reduce_dict() API and replace it with collective operations that exchange tensors only, never pickled Python objects
Place training nodes inside a dedicated VPC or VLAN with strict ingress and egress rules to prevent untrusted peers from joining the collective
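Apply Kubernetes Pod Security admission controls together with AppArmor or SELinux policies that block execution of shell binaries from the training process; seccomp cannot filter execve by path, so a seccomp profile can only deny execve outright where the workload allows it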

bash

# Example: restrict torch.distributed traffic to an isolated subnet
export MASTER_ADDR=10.42.0.1
export MASTER_PORT=29500
export NCCL_SOCKET_IFNAME=eth1   # private training NIC only
export GLOO_SOCKET_IFNAME=eth1
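# Allow egress to the private training subnet and reject all egress on the public NIC;
# add ACCEPT rules for dataset and registry endpoints before the REJECT if workers need them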
iptables -A OUTPUT -o eth1 -d 10.42.0.0/24 -j ACCEPT
iptables -A OUTPUT -o eth0 -j REJECT
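Where pickle cannot be removed from a code path right away, the type filtering that the vulnerable function lacks can be added as a stopgap. The sketch below follows the "restricting globals" pattern from the Python pickle documentation and refuses every global lookup; NoGlobalsUnpickler and safe_loads are hypothetical names, and this is not an upstream fix.

```python
import io
import os
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any module-level name."""

    def find_class(self, module, name):
        # Every callable or class reference in a pickle stream is resolved
        # through find_class, so rejecting here blocks __reduce__ payloads.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data: bytes):
    """Load a pickle that contains only built-in containers and scalars."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


class Probe:
    """Hypothetical, harmless stand-in for a crafted object."""

    def __reduce__(self):
        return (os.getpid, ())


keys = safe_loads(pickle.dumps(["loss_cls", "loss_bbox"]))
print(keys)

try:
    safe_loads(pickle.dumps(Probe()))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

This narrows the attack surface to plain lists, dicts, strings, and numbers, but it does not authenticate the sender, so network isolation and a non-pickle wire format remain the stronger options.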

