
CVE-2025-65891: OneFlow GPU Device DoS Vulnerability

CVE-2025-65891 is a denial of service flaw in OneFlow v0.9.0 caused by GPU device-ID validation issues. Attackers can crash systems using invalid device indices. This article covers technical details, impact, and mitigation.

CVE-2025-65891 Overview

A GPU device-ID validation flaw in OneFlow v0.9.0 allows attackers to trigger a Denial of Service (DoS) by invoking flow.cuda.get_device_properties() with an invalid or negative device index. The flaw stems from improper validation of the device index parameter in the CUDA device properties function and is classified as CWE-400 (Uncontrolled Resource Consumption).

Critical Impact

Where an application passes attacker-controlled device index values to the vulnerable function, attackers can remotely crash OneFlow-based machine learning applications, causing service disruption without requiring authentication.

Affected Products

  • OneFlow v0.9.0
  • OneFlow deep learning framework installations utilizing CUDA functionality
  • Applications and services built on the affected OneFlow version

Discovery Timeline

  • 2026-01-28 - CVE-2025-65891 published to NVD
  • 2026-01-29 - Last updated in NVD

Technical Details for CVE-2025-65891

Vulnerability Analysis

This vulnerability exists in the OneFlow deep learning framework's CUDA device property retrieval mechanism. When the flow.cuda.get_device_properties() function is called with an invalid or negative device index, the application fails to properly validate the input before attempting to access GPU device information. This lack of boundary checking allows an attacker to trigger application crashes or resource exhaustion, resulting in a Denial of Service condition.

The vulnerability is classified under CWE-400 (Uncontrolled Resource Consumption), indicating that improper handling of input values leads to resource exhaustion or system instability. The attack can be launched remotely without user interaction, authentication, or special privileges, making it particularly dangerous for exposed OneFlow deployments.

Root Cause

The root cause of this vulnerability is the absence of proper input validation for the device index parameter in the flow.cuda.get_device_properties() function. The function does not adequately check whether the provided device index falls within the valid range of available GPU devices before attempting to query device properties. When a negative value or an index exceeding the number of available devices is passed, the underlying CUDA operations fail catastrophically rather than returning an appropriate error, leading to application termination.

Attack Vector

The attack can be executed remotely over the network by any unauthenticated attacker who can invoke the vulnerable function. The attacker simply needs to call the flow.cuda.get_device_properties() function with a crafted negative or invalid device index value. This could occur through:

  • Direct API calls to exposed OneFlow services
  • Malicious inputs to machine learning inference endpoints
  • Crafted requests to applications that expose device enumeration functionality

The vulnerability requires no special privileges or user interaction, making it highly exploitable in environments where OneFlow services are network-accessible. Technical details and proof-of-concept information can be found in GitHub Issue #10661.
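One way to blunt the request paths listed above is to normalize and bounds-check the index at the service boundary, before any framework call is made. The following is a minimal sketch under stated assumptions: `parse_device_index` and `MAX_DIGITS` are hypothetical names, not part of OneFlow, and the caller is assumed to supply the number of visible GPUs (for example from flow.cuda.device_count()).

```python
MAX_DIGITS = 6  # hypothetical cap so absurdly long digit strings are rejected early


def parse_device_index(raw, device_count):
    """Return a validated GPU index parsed from an untrusted value, or raise ValueError.

    Hypothetical helper: `raw` is whatever arrived in the request (str or int);
    `device_count` is the number of visible GPUs, supplied by the caller.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool):
        raise ValueError("device index must be an integer")
    if isinstance(raw, int):
        index = raw
    elif isinstance(raw, str):
        text = raw.strip()
        # Accept plain ASCII decimal digits only: no sign, no underscores.
        if len(text) > MAX_DIGITS or not (text.isascii() and text.isdigit()):
            raise ValueError(f"device index is not a non-negative integer: {raw!r}")
        index = int(text)
    else:
        raise ValueError("device index must be an integer")
    # The bounds check the vulnerable function is reported to lack.
    if not 0 <= index < device_count:
        raise ValueError(f"device index {index} outside [0, {device_count})")
    return index
```

A request handler would call this helper first and translate a ValueError into a client error response, so malformed values never reach flow.cuda.get_device_properties().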

Detection Methods for CVE-2025-65891

Indicators of Compromise

  • Application crashes or unexpected termination of OneFlow-based services coinciding with CUDA device queries
  • Log entries showing invalid device index values being passed to flow.cuda.get_device_properties()
  • Repeated service restarts or watchdog triggers for OneFlow applications
  • Error messages related to CUDA device enumeration failures

Detection Strategies

  • Monitor application logs for calls to flow.cuda.get_device_properties() with negative or out-of-range index values
  • Implement input validation checks at API boundaries before device index values reach the vulnerable function
  • Deploy application-level monitoring to detect abnormal crash patterns in OneFlow processes
  • Set up alerts for repeated CUDA-related error conditions in production environments
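The log-monitoring strategy above could be approximated with a small scanner. This is a sketch only: it assumes the application itself writes a line of the form `get_device_properties(device=<int>)` for each call, which is a hypothetical format (OneFlow is not known to emit it), and `find_invalid_index_calls` is an illustrative name.

```python
import re

# Hypothetical log format written by the application, not by OneFlow:
#   "... get_device_properties(device=<int>) ..."
CALL_PATTERN = re.compile(r"get_device_properties\(device=(-?\d+)\)")


def find_invalid_index_calls(log_lines, device_count):
    """Return (line_number, index) pairs for logged indices outside [0, device_count)."""
    hits = []
    for line_number, line in enumerate(log_lines, start=1):
        for match in CALL_PATTERN.finditer(line):
            index = int(match.group(1))
            if not 0 <= index < device_count:
                hits.append((line_number, index))
    return hits
```

Each returned pair identifies a log line worth alerting on; the real pattern would need to be adapted to whatever the deployment actually logs.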

Monitoring Recommendations

  • Enable verbose logging for CUDA device operations in OneFlow deployments
  • Implement rate limiting on API endpoints that accept device index parameters
  • Monitor system stability metrics for services utilizing OneFlow's CUDA functionality
  • Review access logs for suspicious patterns of device enumeration requests
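The rate-limiting recommendation above can be illustrated with a per-client sliding window. This is a minimal in-memory sketch, not a production limiter: `SlidingWindowLimiter` and its parameters are hypothetical names, and a real deployment would more likely rely on the rate limiting offered by its gateway or web framework.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within the last `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._history = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id):
        now = self.clock()
        timestamps = self._history[client_id]
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False
        timestamps.append(now)
        return True
```

An endpoint that accepts a device index would call `allow(client_id)` before doing any work and reject the request when it returns False. The history in this sketch grows with the number of distinct clients, so a long-running service would also need to evict idle entries.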

How to Mitigate CVE-2025-65891

Immediate Actions Required

  • Implement input validation to reject negative or out-of-bounds device index values before passing them to flow.cuda.get_device_properties()
  • Restrict network access to OneFlow services that expose CUDA device enumeration functionality
  • Monitor for updated versions of OneFlow that address this vulnerability
  • Review application code for any direct usage of the vulnerable function

Patch Information

As of the last NVD update on 2026-01-29, this advisory does not identify a fixed release. Users should monitor the OneFlow GitHub Repository for security patches addressing this vulnerability. The issue is tracked in GitHub Issue #10661, which may contain additional remediation information.

Workarounds

  • Wrap calls to flow.cuda.get_device_properties() with validation logic that checks device index bounds against flow.cuda.device_count()
  • Implement a proxy layer that sanitizes device index inputs before they reach the OneFlow framework
  • Isolate OneFlow services from untrusted network access until a patch is available
  • Consider deploying OneFlow in containerized environments with restart policies to minimize DoS impact
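```python
# Input validation wrapper (sketch; safe_get_device_properties is a hypothetical name).
import oneflow as flow

def safe_get_device_properties(device_id):
    # Accept only plain ints; bool is a subclass of int, so reject it explicitly.
    if isinstance(device_id, bool) or not isinstance(device_id, int):
        raise TypeError("device_id must be an int")
    # Require 0 <= device_id < number of visible CUDA devices.
    if not 0 <= device_id < flow.cuda.device_count():
        raise ValueError(f"invalid CUDA device index: {device_id}")
    return flow.cuda.get_device_properties(device_id)
# Reject invalid indices at the application layer and rate limit device enumeration endpoints.
```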

Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.
