CVE-2025-70999 Overview
OneFlow v0.9.0 contains an input validation flaw in flow.cuda.get_device_capability(). The function does not adequately validate the user-supplied device ID before it is processed by the CUDA component, so an attacker who can supply a crafted device ID can cause a Denial of Service (DoS) condition.
Critical Impact
Attackers can remotely trigger service disruption by exploiting improper input validation in GPU device handling, potentially causing system crashes or resource exhaustion in machine learning infrastructure.
Affected Products
- OneFlow v0.9.0
- Systems that call flow.cuda.get_device_capability()
- Machine learning pipelines dependent on OneFlow CUDA operations
Discovery Timeline
- 2026-01-28 - CVE-2025-70999 published to NVD
- 2026-01-29 - Last updated in NVD database
Technical Details for CVE-2025-70999
Vulnerability Analysis
This vulnerability is classified under CWE-400 (Uncontrolled Resource Consumption), indicating that the flaw allows attackers to consume excessive resources through malicious input. The flow.cuda.get_device_capability() function fails to properly validate device ID parameters before attempting to query GPU capabilities, allowing specially crafted values to trigger unexpected behavior.
The vulnerability can be exploited over the network without requiring authentication or user interaction, making it accessible to remote attackers. When a malicious device ID is passed to the function, it can cause the application to enter an error state, consume excessive resources, or crash entirely, resulting in denial of service for legitimate users.
Root Cause
The root cause lies in insufficient input validation within the flow.cuda.get_device_capability() function. The code does not properly verify that the provided device ID corresponds to a valid, accessible GPU device before attempting operations. This allows out-of-range or malformed device IDs to be processed, triggering error conditions that lead to service disruption.
According to GitHub Issue #10660, the vulnerability was identified through boundary testing of the device ID parameter, revealing that negative values, excessively large integers, or specially formatted inputs can bypass existing validation logic.
Attack Vector
The attack can be conducted remotely over the network without authentication or user interaction, provided the attacker can reach an interface that passes a caller-controlled device ID to flow.cuda.get_device_capability(). The attacker supplies a crafted device ID through that interface, so the barrier to entry is low wherever such a path exists.
When the function receives a device ID that does not correspond to a valid GPU or falls outside the expected range, it fails to handle the error gracefully, leading to resource exhaustion or an application crash. The original vulnerability report is GitHub Issue #10660, filed against the OneFlow repository.
Detection Methods for CVE-2025-70999
Indicators of Compromise
- Unexpected crashes or restarts of OneFlow-based applications
- Unusual error messages related to CUDA device queries in application logs
- Spike in failed GPU device capability requests
- Application memory or CPU usage anomalies during CUDA operations
Detection Strategies
- Monitor application logs for repeated errors from flow.cuda.get_device_capability() calls
- Implement input validation logging to capture malformed device ID attempts
- Set up alerts for abnormal patterns in CUDA API call failures
- Deploy network-level monitoring for suspicious requests targeting ML endpoints
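The logging-based strategies above can be sketched as a thin audit wrapper. This is a minimal illustration, not part of OneFlow: the logger name `device_id_audit` and the function `audited_query` are hypothetical, and the real capability query is passed in as a callable so that the sketch does not depend on a particular OneFlow build.

```python
import logging
from typing import Any, Callable

# Hypothetical logger name; route it to whatever log pipeline you already alert on.
logger = logging.getLogger("device_id_audit")


def audited_query(query: Callable[[Any], Any], device_id: Any, device_count: int) -> Any:
    """Call query(device_id), logging rejected or failing device IDs."""
    valid = (
        isinstance(device_id, int)
        and not isinstance(device_id, bool)  # bool is a subclass of int
        and 0 <= device_id < device_count
    )
    if not valid:
        # %r keeps the exact input (type and value) visible in the log line.
        logger.warning("rejected device_id=%r (device_count=%d)", device_id, device_count)
        raise ValueError("invalid device ID")
    try:
        return query(device_id)
    except Exception:
        # Records the traceback, then re-raises so callers still see the failure.
        logger.exception("capability query failed for device_id=%r", device_id)
        raise
```

In a OneFlow application, `query` would typically be flow.cuda.get_device_capability; a burst of "rejected device_id" warnings from one client is the pattern worth alerting on.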
Monitoring Recommendations
- Enable verbose logging for OneFlow CUDA operations in production environments
- Configure resource utilization alerts for systems running OneFlow workloads
- Implement rate limiting on APIs that expose CUDA functionality
- Regularly audit device ID parameters in application request logs
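As one possible shape for the rate-limiting recommendation above, the following is a minimal token-bucket sketch. The class name `TokenBucket` and its parameters are illustrative assumptions, the sketch is not thread-safe, and a production service would more likely rely on its API gateway or web framework for this.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` calls, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token and return True, or return False if none are left."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A request handler would keep one bucket per client and reject the request when allow() returns False, before any CUDA query is attempted.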
How to Mitigate CVE-2025-70999
Immediate Actions Required
- Review all code paths that call flow.cuda.get_device_capability() and implement input validation
- Restrict network access to OneFlow services where possible
- Implement input sanitization for any user-controllable device ID parameters
- Consider deploying a Web Application Firewall (WAF) to filter malicious requests
Patch Information
As of the last NVD update on 2026-01-29, no vendor advisory or patch has been published. Users should monitor the OneFlow GitHub repository for an official fix and check the OneFlow homepage for security announcements.
Workarounds
- Validate device IDs against the actual number of available GPUs before calling flow.cuda.get_device_capability()
- Wrap CUDA device queries in try/except handling to prevent crash propagation; this only helps when the failure surfaces as a Python exception rather than a process abort
- Limit access to OneFlow APIs to trusted networks only
- Consider running OneFlow services in isolated containers with resource limits to contain DoS impact
# Example validation workaround
# Before calling flow.cuda.get_device_capability(), verify device ID bounds
# Check available GPU count and validate input accordingly
# Implementation will vary based on deployment environment
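The commented outline above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an official fix: `InvalidDeviceIdError`, `validate_device_id`, and `safe_get_device_capability` are hypothetical names, and the GPU count and the capability query are passed in as arguments so the validation logic can be exercised on a machine without OneFlow or a GPU.

```python
from typing import Callable, Tuple


class InvalidDeviceIdError(ValueError):
    """Raised when a device ID is not a valid index into the visible GPUs."""


def validate_device_id(device_id: object, device_count: int) -> int:
    """Return device_id unchanged if it is an int in [0, device_count), else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(device_id, bool) or not isinstance(device_id, int):
        raise InvalidDeviceIdError(
            f"device ID must be an int, got {type(device_id).__name__}"
        )
    if not 0 <= device_id < device_count:
        raise InvalidDeviceIdError(
            f"device ID is outside the valid range [0, {device_count})"
        )
    return device_id


def safe_get_device_capability(
    device_id: object,
    device_count: int,
    query: Callable[[int], Tuple[int, int]],
) -> Tuple[int, int]:
    """Validate device_id first, then delegate to the real capability query."""
    return query(validate_device_id(device_id, device_count))


# Hypothetical wiring in a OneFlow application (assumes `import oneflow as flow`
# and that flow.cuda.device_count() is available in your build):
#   cap = safe_get_device_capability(
#       user_value, flow.cuda.device_count(), flow.cuda.get_device_capability
#   )
```

Rejecting negative, oversized, and non-integer values before the call means the vulnerable code path never sees them. When no GPU is visible (a count of 0), every device ID is rejected.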
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.

