CVE-2025-68793 Overview
CVE-2025-68793 is a Use-After-Free (UAF) vulnerability in the Linux kernel's AMD GPU driver (amdgpu) affecting the GPU recovery mechanism. The flaw exists due to a race condition between the scheduler timeout callback and the TDR (Timeout Detection and Recovery) work queue during GPU recovery operations. When the GPU recovery function calls drm_sched_stop() followed by drm_sched_start(), the TDR queue may free a job structure before the timeout callback completes, resulting in a UAF when accessing the pasid field.
Critical Impact
Local attackers may exploit this race condition to trigger memory corruption, potentially leading to privilege escalation, denial of service, or information disclosure on systems with AMD GPUs.
Affected Products
- Linux Kernel with AMD GPU driver (amdgpu)
- Systems utilizing AMD GPU hardware with affected kernel versions
- Workloads involving GPU recovery operations
Discovery Timeline
- 2026-01-13 - CVE-2025-68793 published to NVD
- 2026-01-13 - Last updated in NVD database
Technical Details for CVE-2025-68793
Vulnerability Analysis
This vulnerability is classified as a Use-After-Free resulting from a Time-of-Check Time-of-Use (TOCTOU) race condition. The issue occurs within the amdgpu_device_gpu_recover() function in the AMD GPU kernel driver. During GPU recovery, the driver performs a sequence of operations that includes stopping the scheduler via drm_sched_stop() and later restarting it with drm_sched_start().
The restart operation re-arms the TDR work queue, which may free job structures asynchronously. If the timeout callback is still executing and reads the job->pasid field after the TDR queue has already freed the job, the driver reads from deallocated memory. This is a classic UAF condition that could potentially be leveraged for memory corruption.
The KASAN (Kernel Address Sanitizer) trace captured in the vulnerability report shows a slab-use-after-free at amdgpu_device_gpu_recover+0x968/0x990, with a 4-byte read from freed memory at address ffff88b0ce3f794c.
Root Cause
The root cause is improper synchronization between the scheduler timeout callback and the TDR work queue in the GPU recovery path. The pasid (Process Address Space Identifier) field is read after the job structure may already have been freed by the TDR queue. The fix caches the pasid value early in the recovery process, before any operation that could result in the job being freed, so the later use of the pasid no longer dereferences the job.
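The ordering problem can be illustrated with a toy model. This is a minimal sketch in Python, not kernel code: the Job class, restart_scheduler(), recover_buggy(), and recover_fixed() are hypothetical names invented for illustration, and the "free" is simulated with a flag rather than real memory deallocation. It assumes the worst case, where the job is freed immediately after the scheduler restarts.

```python
class Job:
    """Toy stand-in for a scheduler job; NOT the kernel's job structure."""

    def __init__(self, pasid):
        self._pasid = pasid
        self.freed = False

    @property
    def pasid(self):
        # Simulate a use-after-free by refusing reads once the job is "freed".
        if self.freed:
            raise RuntimeError("use-after-free: pasid read after job was freed")
        return self._pasid


def restart_scheduler(job):
    """Models the worst case: the restarted TDR queue frees the job at once."""
    job.freed = True


def recover_buggy(job):
    """Reads pasid after the restart, when the job may already be gone."""
    restart_scheduler(job)
    return job.pasid


def recover_fixed(job):
    """Caches pasid while the job is still valid, then restarts."""
    pasid = job.pasid
    restart_scheduler(job)
    return pasid
```

In this model, recover_fixed(Job(42)) returns 42, while recover_buggy(Job(42)) raises the simulated use-after-free error. In the real driver the freed read does not raise; it silently returns whatever occupies the freed slab memory unless KASAN is enabled.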
Attack Vector
Exploitation of this vulnerability requires local access to a system with an AMD GPU. An attacker could potentially trigger this race condition by:
- Initiating workloads that cause GPU hangs requiring recovery
- Timing operations to hit the window between drm_sched_start() and the timeout callback completion
- Manipulating GPU workload scheduling to increase the likelihood of triggering the race condition
The vulnerability is triggered through the drm_sched_job_timedout workqueue handler in the gpu_sched module, which is invoked during GPU timeout recovery scenarios. The race window exists between the job being freed by the TDR queue and the access to job->pasid in the recovery function.
Detection Methods for CVE-2025-68793
Indicators of Compromise
- KASAN slab-use-after-free reports in kernel logs referencing amdgpu_device_gpu_recover
- Kernel oops or panic events originating from AMD GPU driver functions
- Unexpected system crashes during GPU-intensive workloads or recovery operations
- Workqueue errors related to amdgpu-reset-dev or drm_sched_job_timedout
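The first indicator above can be checked mechanically. The following is a minimal sketch, assuming kernel log text has already been collected (for example from dmesg or journalctl output) and that KASAN reports follow the usual "BUG: KASAN: ... in function+0xoffset" header form. The function name find_kasan_uaf() and the regular expression are illustrative assumptions, not part of any existing tool.

```python
import re

# Matches the header line of a KASAN use-after-free report and captures
# the function name that precedes the "+0x..." offset.
KASAN_UAF = re.compile(
    r"BUG: KASAN: (?:slab-)?use-after-free in (?P<func>[A-Za-z0-9_.]+)\+0x"
)


def find_kasan_uaf(log_text, function="amdgpu_device_gpu_recover"):
    """Return log lines reporting a KASAN use-after-free in `function`."""
    hits = []
    for line in log_text.splitlines():
        match = KASAN_UAF.search(line)
        if match and match.group("func") == function:
            hits.append(line.strip())
    return hits
```

A log line such as "BUG: KASAN: slab-use-after-free in amdgpu_device_gpu_recover+0x968/0x990 [amdgpu]" would be returned, while KASAN reports in unrelated functions are ignored. Production kernels are rarely built with KASAN, so an empty result does not show that the race never occurred.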
Detection Strategies
- Enable KASAN (Kernel Address Sanitizer) in development or testing environments to detect UAF conditions
- Monitor kernel logs for errors containing amdgpu_device_gpu_recover, amdgpu_job_timedout, or drm_sched_job_timedout
- Deploy kernel livepatch monitoring to track attempts to exploit race conditions in GPU drivers
- Implement system audit rules to log GPU recovery events and associated process context
Monitoring Recommendations
- Configure centralized logging to capture kernel messages with amdgpu or gpu_sched module references
- Set up alerts for KASAN reports or memory corruption warnings in production systems
- Monitor for unusual patterns of GPU recovery events that could indicate exploitation attempts
- Track kernel module loading events for amdgpu and gpu_sched to ensure patched versions are in use
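As a starting point for the module tracking item, a host inventory script could check whether the relevant modules are loaded at all. This is a rough sketch that parses text in the /proc/modules format, where the module name is the first whitespace-separated field on each line. The helper names loaded_modules() and gpu_modules_present() are hypothetical. The check only shows presence: it cannot tell whether the loaded module is patched, and drivers built into the kernel image do not appear in /proc/modules.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first field of each line is the module name.
            names.add(fields[0])
    return names


def gpu_modules_present(proc_modules_text, wanted=("amdgpu", "gpu_sched")):
    """Map each module name in `wanted` to whether it is currently loaded."""
    names = loaded_modules(proc_modules_text)
    return {name: name in names for name in wanted}
```

On a live system the input would typically be the contents of /proc/modules, read with open("/proc/modules").read(). Hosts where both entries are True are candidates for checking the running kernel version against the distribution's advisory.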
How to Mitigate CVE-2025-68793
Immediate Actions Required
- Update the Linux kernel to a version containing the fix commit
- Apply vendor-provided kernel patches for distributions using affected kernel versions
- Monitor systems for signs of exploitation while patches are being deployed
- Consider temporarily reducing GPU workloads that may trigger recovery scenarios on critical systems
Patch Information
The vulnerability has been resolved in the Linux kernel by caching the pasid value early in the recovery process, so the recovery function no longer reads from a job structure that may already have been freed. The fix is available in the stable kernel tree.
The stable patch was cherry-picked from commit 20880a3fd5dd7bca1a079534cf6596bda92e107d. System administrators should update to a kernel version containing the fix or apply backported patches from their Linux distribution.
Workarounds
- No complete workaround is available; applying the kernel patch is the recommended remediation
- Reducing GPU-intensive workloads may decrease the likelihood of triggering GPU recovery scenarios
- Monitoring and logging GPU recovery events can help detect potential exploitation attempts
- Consider implementing additional access controls on systems with AMD GPUs to limit local attack surface
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.

