CVE-2025-65891 Overview
A GPU device-ID validation flaw in OneFlow v0.9.0 allows attackers to trigger a Denial of Service (DoS) by invoking flow.cuda.get_device_properties() with an invalid or negative device index. The function does not validate the device index it receives before querying CUDA device properties. The issue is classified as CWE-400 (Uncontrolled Resource Consumption).
Critical Impact
Attackers can remotely crash OneFlow-based machine learning applications by supplying malformed device index values, causing service disruption without requiring authentication.
Affected Products
- OneFlow v0.9.0
- OneFlow deep learning framework installations utilizing CUDA functionality
- Applications and services built on the affected OneFlow version
Discovery Timeline
- 2026-01-28 - CVE-2025-65891 published to NVD
- 2026-01-29 - Last updated in NVD database
Technical Details for CVE-2025-65891
Vulnerability Analysis
This vulnerability exists in the OneFlow deep learning framework's CUDA device property retrieval mechanism. When the flow.cuda.get_device_properties() function is called with an invalid or negative device index, the application fails to properly validate the input before attempting to access GPU device information. This lack of boundary checking allows an attacker to trigger application crashes or resource exhaustion, resulting in a Denial of Service condition.
The vulnerability is classified under CWE-400 (Uncontrolled Resource Consumption), indicating that improper handling of input values leads to resource exhaustion or system instability. The attack can be launched remotely without requiring user interaction or authentication privileges, making it particularly dangerous for exposed OneFlow deployments.
Root Cause
The root cause is missing input validation for the device index parameter of flow.cuda.get_device_properties(). The function does not check that the provided index falls within the valid range of available GPU devices before querying device properties. When a negative value, or an index equal to or greater than the number of available devices, is passed, the underlying CUDA operations fail catastrophically instead of returning an appropriate error, and the application terminates.
Attack Vector
The attack can be executed remotely over the network by any unauthenticated attacker who can invoke the vulnerable function. The attacker simply needs to call the flow.cuda.get_device_properties() function with a crafted negative or invalid device index value. This could occur through:
- Direct API calls to exposed OneFlow services
- Malicious inputs to machine learning inference endpoints
- Crafted requests to applications that expose device enumeration functionality
The vulnerability requires no special privileges or user interaction, making it highly exploitable in environments where OneFlow services are network-accessible. Technical details and proof-of-concept information can be found in GitHub Issue #10661.
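The data flow described above can be illustrated with a minimal sketch. The handler name, the "device" request field, and the recording stand-in are all hypothetical and are not taken from OneFlow or from the linked issue. The sketch only shows how a client-supplied index can reach the device query without any range check; it does not touch a GPU.

```python
from typing import Any, Callable, Dict, List


def handle_device_info_request(
    payload: Dict[str, Any],
    query_properties: Callable[[int], Any],
) -> Any:
    """Hypothetical handler that forwards a client-supplied index unchecked."""
    # The index comes straight from the request body; nothing constrains its range.
    device_index = int(payload["device"])
    # In a real service this callable would be flow.cuda.get_device_properties.
    return query_properties(device_index)


# Stand-in sink that records what it receives instead of querying CUDA.
received: List[int] = []


def recording_sink(index: int) -> str:
    received.append(index)
    return f"properties for device {index}"


handle_device_info_request({"device": "-1"}, recording_sink)
handle_device_info_request({"device": 9999}, recording_sink)
print(received)  # [-1, 9999]
```

Both out-of-range values arrive at the sink unchanged, which is the condition the advisory describes.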
Detection Methods for CVE-2025-65891
Indicators of Compromise
- Application crashes or unexpected termination of OneFlow-based services coinciding with CUDA device queries
- Log entries showing invalid device index values being passed to flow.cuda.get_device_properties()
- Repeated service restarts or watchdog triggers for OneFlow applications
- Error messages related to CUDA device enumeration failures
Detection Strategies
- Monitor application logs for calls to flow.cuda.get_device_properties() with negative or out-of-range index values
- Implement input validation checks at API boundaries before device index values reach the vulnerable function
- Deploy application-level monitoring to detect abnormal crash patterns in OneFlow processes
- Set up alerts for repeated CUDA-related error conditions in production environments
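As a rough sketch of the log-monitoring strategy, the following scans log lines for device indices that are negative or out of range. The log format is an assumption: it presumes the application itself logs each call as get_device_properties(<index>) or get_device_properties(device=<index>). OneFlow is not known to emit such lines by default, so the pattern would need adapting to whatever the deployment actually records.

```python
import re
from typing import Iterable, List, Tuple

# Assumed log shape: "... get_device_properties(device=-1)" or "... get_device_properties(3)".
CALL_PATTERN = re.compile(r"get_device_properties\(\s*(?:device\s*=\s*)?(-?\d+)")


def find_suspicious_calls(
    lines: Iterable[str], device_count: int
) -> List[Tuple[int, int]]:
    """Return (log line number, index) for every index outside [0, device_count)."""
    hits: List[Tuple[int, int]] = []
    for line_no, line in enumerate(lines, start=1):
        for match in CALL_PATTERN.finditer(line):
            index = int(match.group(1))
            if index < 0 or index >= device_count:
                hits.append((line_no, index))
    return hits


# Illustrative log lines, not real OneFlow output.
sample_log = [
    "INFO call flow.cuda.get_device_properties(device=0)",
    "INFO call flow.cuda.get_device_properties(device=-1)",
    "INFO call flow.cuda.get_device_properties(8)",
]
print(find_suspicious_calls(sample_log, device_count=2))  # [(2, -1), (3, 8)]
```

The device_count argument would be the number of GPUs visible on the monitored host.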
Monitoring Recommendations
- Enable verbose logging for CUDA device operations in OneFlow deployments
- Implement rate limiting on API endpoints that accept device index parameters
- Monitor system stability metrics for services utilizing OneFlow's CUDA functionality
- Review access logs for suspicious patterns of device enumeration requests
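The rate-limiting recommendation could be prototyped with a small in-process sliding-window limiter, sketched below. The class name, the per-client keying, and the limits are illustrative assumptions. A production deployment would more likely rely on the rate-limiting features of its API gateway or web framework.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        history = self._history[client_id]
        # Drop timestamps that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_requests:
            return False
        history.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)
print([limiter.allow("client-a") for _ in range(3)])  # [True, True, False]
```

A handler for a device-enumeration endpoint would call allow() with a client identifier before doing any work and reject the request when it returns False.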
How to Mitigate CVE-2025-65891
Immediate Actions Required
- Implement input validation to reject negative or out-of-bounds device index values before passing them to flow.cuda.get_device_properties()
- Restrict network access to OneFlow services that expose CUDA device enumeration functionality
- Monitor for updated versions of OneFlow that address this vulnerability
- Review application code for any direct usage of the vulnerable function
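For the code review step, one possible approach is to scan Python sources with the standard ast module and list every call whose function name is get_device_properties. This is a sketch under simple assumptions: it matches on the final attribute or bare name only, so it will miss calls made through an alias and may flag same-named functions from other libraries.

```python
import ast
from typing import List, Tuple

TARGET_NAME = "get_device_properties"


def find_calls(source: str, filename: str = "<string>") -> List[Tuple[int, str]]:
    """Return (line number, source line) for each call to TARGET_NAME."""
    tree = ast.parse(source, filename=filename)
    source_lines = source.splitlines()
    hits: List[Tuple[int, str]] = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Attribute):
            name = func.attr  # e.g. flow.cuda.get_device_properties(...)
        elif isinstance(func, ast.Name):
            name = func.id  # e.g. get_device_properties(...)
        else:
            name = None
        if name == TARGET_NAME:
            hits.append((node.lineno, source_lines[node.lineno - 1].strip()))
    return sorted(hits)


# Illustrative application code; it is only parsed, never executed.
SAMPLE_SOURCE = '''import oneflow as flow

def device_info(idx):
    return flow.cuda.get_device_properties(idx)

props = flow.cuda.get_device_properties(0)
'''

for line_no, text in find_calls(SAMPLE_SOURCE):
    print(line_no, text)
```

Each reported line is a candidate for the bounds check described under Workarounds.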
Patch Information
As of the last NVD update on 2026-01-29, this advisory does not identify a fixed release. Monitor the OneFlow GitHub Repository for security patches addressing this vulnerability. The issue is tracked in GitHub Issue #10661, which may contain additional remediation information.
Workarounds
- Wrap calls to flow.cuda.get_device_properties() with validation logic that checks device index bounds against flow.cuda.device_count()
- Implement a proxy layer that sanitizes device index inputs before they reach the OneFlow framework
- Isolate OneFlow services from untrusted network access until a patch is available
- Consider deploying OneFlow in containerized environments with restart policies to minimize DoS impact
# Configuration example - Input validation wrapper
# Validate device index before calling vulnerable function
# Example validation: Check if device_id >= 0 and device_id < cuda_device_count
# Reject requests with invalid device indices at the application layer
# Implement rate limiting on device enumeration endpoints
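The validation outlined above can be sketched as a small wrapper. The names validate_device_index, safe_get_device_properties, and InvalidDeviceIndexError are hypothetical. The device-count and property-query functions are passed in as callables so the sketch runs without a GPU or a OneFlow installation. It accepts only plain integers, which may be stricter than the framework itself requires.

```python
from typing import Any, Callable


class InvalidDeviceIndexError(ValueError):
    """Raised when a device index is outside the range of visible devices."""


def validate_device_index(device_id: Any, device_count: int) -> int:
    """Return device_id if it is an int in [0, device_count), else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(device_id, bool) or not isinstance(device_id, int):
        raise InvalidDeviceIndexError(
            f"device index must be an int, got {type(device_id).__name__}"
        )
    if not 0 <= device_id < device_count:
        raise InvalidDeviceIndexError(
            f"device index {device_id} is outside [0, {device_count})"
        )
    return device_id


def safe_get_device_properties(
    device_id: Any,
    count_devices: Callable[[], int],
    query_properties: Callable[[int], Any],
) -> Any:
    """Validate the index first, then delegate to the real query function."""
    index = validate_device_index(device_id, count_devices())
    return query_properties(index)


# Stand-ins for a host with two GPUs; no CUDA call is made here.
print(safe_get_device_properties(1, lambda: 2, lambda i: f"gpu{i}"))  # gpu1
```

In a OneFlow application, the two callables would presumably be flow.cuda.device_count and flow.cuda.get_device_properties, so that an invalid index raises a catchable Python exception at the application layer instead of reaching the vulnerable function.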

