CVE-2025-2953 Overview
A denial-of-service vulnerability has been identified in PyTorch 2.6.0+cu124, affecting the torch.mkldnn_max_pool2d function. Classified as problematic, it can be exploited through local access to cause service disruption. The vulnerability has been publicly disclosed, though its real-world impact is still under debate. Notably, PyTorch's security policy explicitly warns users about the risks of loading unknown or untrusted models, which may have malicious effects.
Critical Impact
Local attackers with access to PyTorch environments can trigger a denial of service condition through manipulation of the torch.mkldnn_max_pool2d function, potentially disrupting machine learning workloads and model inference operations.
Affected Products
- PyTorch 2.6.0+cu124 (CUDA-enabled version)
- Linux Foundation PyTorch
Discovery Timeline
- 2025-03-30 - CVE-2025-2953 published to NVD
- 2025-04-22 - Last updated in NVD database
Technical Details for CVE-2025-2953
Vulnerability Analysis
This vulnerability resides in the torch.mkldnn_max_pool2d function within PyTorch's integration layer for the Intel Math Kernel Library for Deep Neural Networks (MKL-DNN, now oneDNN). The function performs 2D max pooling using optimized MKL-DNN kernels. The vulnerability is classified under CWE-404 (Improper Resource Shutdown or Release), indicating that the function fails to properly manage resources during certain operations, leading to a denial-of-service condition.
The attack requires local access to the system running PyTorch, meaning an attacker must already have the ability to execute code in the target environment. This could occur in shared computing environments, machine learning platforms with multi-tenant access, or scenarios where untrusted models or code are executed. The exploitation leads to availability impact without compromising confidentiality or integrity of the system.
It's important to note that the PyTorch project's security policy acknowledges that running untrusted models can have unpredictable and potentially harmful effects. The project considers scenarios involving loading arbitrary untrusted models as outside their standard threat model.
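For context, the affected operator is normally invoked on a tensor that has been converted to MKL-DNN layout. The sketch below shows a typical, benign call, guarded so it degrades gracefully when PyTorch or its MKL-DNN build is unavailable; the specific inputs that trigger the DoS are deliberately not reproduced here.

```python
def mkldnn_pool_demo():
    """Benign example call into the MKL-DNN max-pooling path.

    Returns ("ok", output_shape) when the path is available, or
    ("unavailable", None) when PyTorch / MKL-DNN support is missing.
    """
    try:
        import torch
        x = torch.randn(1, 3, 32, 32).to_mkldnn()  # convert to MKL-DNN layout
        y = torch.mkldnn_max_pool2d(
            x, kernel_size=[2, 2], stride=[2, 2],
            padding=[0, 0], dilation=[1, 1], ceil_mode=False)
        return "ok", tuple(y.to_dense().shape)
    except Exception:
        # PyTorch absent, or built without MKL-DNN support
        return "unavailable", None
```

With a 32x32 input and a 2x2 kernel at stride 2, the pooled output is 16x16; any environment where the call fails simply reports the path as unavailable.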
Root Cause
The root cause is improper resource shutdown or release (CWE-404) in the torch.mkldnn_max_pool2d function. When specific inputs or conditions are provided to this function, it fails to properly release or manage system resources, leading to resource exhaustion or a denial of service state. This issue is specifically tied to the MKL-DNN optimized pathway for max pooling operations on 2D inputs.
Attack Vector
The attack vector is local, requiring the attacker to have direct access to the system or the ability to execute arbitrary PyTorch code. The vulnerability can be exploited by:
- Crafting malicious input parameters to the torch.mkldnn_max_pool2d function
- Loading an untrusted model that internally calls this function with triggering inputs
- Executing Python code that invokes the vulnerable function with manipulated arguments
The vulnerability affects the availability of the PyTorch process, potentially causing crashes or resource exhaustion that disrupt machine learning workloads. It manifests when torch.mkldnn_max_pool2d processes certain inputs; for technical details, refer to the report on the PyTorch GitHub issue tracker.
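As a defensive measure, callers can validate pooling parameters before they reach native code. The helper below is a hypothetical pre-check (not part of PyTorch) that enforces the documented invariants of 2D max pooling: positive kernel, stride, and dilation, padding at most half the kernel size, and a non-empty output.

```python
def validate_pool2d_args(h, w, kernel, stride, padding=(0, 0), dilation=(1, 1)):
    """Reject pooling parameters that violate 2D max-pooling invariants.

    h, w: input spatial dimensions; kernel/stride/padding/dilation are
    (height, width) pairs. Returns the output (out_h, out_w) on success,
    raises ValueError otherwise.
    """
    out = []
    for size, k, s, p, d in zip((h, w), kernel, stride, padding, dilation):
        if k <= 0 or s <= 0 or d <= 0:
            raise ValueError("kernel, stride and dilation must be positive")
        if p < 0 or p > k // 2:
            raise ValueError("padding must be non-negative and at most half the kernel")
        effective_k = d * (k - 1) + 1  # dilated kernel extent
        o = (size + 2 * p - effective_k) // s + 1
        if o < 1:
            raise ValueError("pooling would produce an empty output")
        out.append(o)
    return tuple(out)
```

Running such a check before handing untrusted parameters to the native pooling path turns a potential crash into an ordinary Python exception.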
Detection Methods for CVE-2025-2953
Indicators of Compromise
- Unexpected crashes or hangs in PyTorch processes, particularly during max pooling operations
- Resource exhaustion symptoms such as memory spikes or CPU hangs associated with MKL-DNN operations
- Error logs indicating failures in torch.mkldnn_max_pool2d function calls
- Abnormal process termination in machine learning inference pipelines
Detection Strategies
- Monitor PyTorch application logs for errors or exceptions related to mkldnn_max_pool2d operations
- Implement resource monitoring for Python processes running PyTorch workloads to detect unusual memory or CPU patterns
- Deploy application-level health checks that verify PyTorch inference services remain responsive
- Use static analysis tools to identify code paths that invoke torch.mkldnn_max_pool2d with external or untrusted inputs
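One lightweight way to implement the resource-monitoring suggestion is to sample the process's own memory high-water mark and flag anomalous growth. A minimal sketch using only the standard library (Unix-only `resource` module; the threshold is illustrative):

```python
import resource
import sys

def rss_high_water_mb():
    """Peak resident set size of the current process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux but in bytes on macOS
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor

def memory_ok(threshold_mb=4096):
    """True while peak memory stays under the (illustrative) threshold."""
    return rss_high_water_mb() < threshold_mb
```

A health-check endpoint or watchdog thread can call `memory_ok()` periodically and restart or quarantine the worker when it starts failing.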
Monitoring Recommendations
- Set up alerting for PyTorch process crashes or unexpected restarts in production environments
- Monitor system resource utilization (CPU, memory) for processes using MKL-DNN operations
- Implement timeout mechanisms for machine learning inference calls to detect hanging operations
- Log and audit the sources of models being loaded in shared or multi-tenant ML environments
How to Mitigate CVE-2025-2953
Immediate Actions Required
- Review and restrict access to PyTorch execution environments, ensuring only trusted users can run arbitrary code
- Audit machine learning pipelines to identify and remove any untrusted or unverified models
- Implement input validation and sanitization for any external inputs that may reach PyTorch functions
- Consider isolating PyTorch workloads in containerized environments to limit blast radius of potential DoS attacks
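The isolation suggestion can be prototyped even without containers by running the workload in a child process with its own memory ceiling, so that a resource-exhaustion DoS kills only the worker. A minimal Unix-only sketch (the 512 MiB cap is illustrative):

```python
import multiprocessing
import resource

def _capped_worker(queue, alloc_bytes, limit_bytes):
    """Child process: cap its own address space, then attempt an allocation."""
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
    try:
        buf = bytearray(alloc_bytes)  # stand-in for a runaway native allocation
        queue.put("allocated")
    except MemoryError:
        queue.put("contained")

def run_isolated(alloc_bytes, limit_bytes=512 * 1024 * 1024):
    """Run the worker under a memory cap; the parent process is unaffected."""
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(
        target=_capped_worker, args=(queue, alloc_bytes, limit_bytes))
    proc.start()
    proc.join()
    return queue.get()
```

In production the same idea is usually expressed as container limits (e.g. memory and CPU quotas on the ML serving pods) rather than hand-rolled subprocesses.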
Patch Information
At the time of writing, no official patch has been released for this vulnerability. Users should monitor the PyTorch GitHub issue tracker for updates on remediation. The PyTorch project's security guidelines explicitly warn about the risks of running untrusted models, which should be considered when assessing this vulnerability's impact on your environment.
Workarounds
- Avoid loading untrusted or unverified PyTorch models in production environments, as recommended by PyTorch's security policy
- If possible, use alternative pooling implementations that don't rely on MKL-DNN for critical workloads until a patch is available
- Implement process isolation and resource limits for PyTorch workloads to contain potential DoS impacts
- Consider using non-MKL-DNN backends for max pooling operations where performance requirements allow
# Configuration example - disable the MKL-DNN backend as a workaround.
# The documented switch is torch.backends.mkldnn, set from Python:
import torch
torch.backends.mkldnn.enabled = False  # disable MKL-DNN globally for this process
# Alternative: disable it only around specific calls
with torch.backends.mkldnn.flags(enabled=False):
    ...  # run the pooling / inference code here
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.

