CVE-2026-23213: Linux Kernel AMD PM DoS Vulnerability

CVE-2026-23213 Overview

A race condition vulnerability has been identified in the Linux kernel's AMD GPU driver (drm/amd/pm) that can lead to system instability during SMU (System Management Unit) Mode 1 reset operations. During the reset cycle, the ASIC becomes temporarily inaccessible via PCIe, but other driver components may still attempt to access MMIO (Memory-Mapped I/O) registers, resulting in incomplete PCIe transactions that can cause NMI (Non-Maskable Interrupt) panics or complete system hangs.

Critical Impact
Systems with AMD GPUs may experience kernel panics or complete system hangs when the GPU driver triggers a Mode 1 reset while other driver threads or interrupt handlers attempt concurrent MMIO register access.

Affected Products

Linux kernel with AMD GPU driver (drm/amd/pm module)
Systems with AMD graphics hardware utilizing SMU Mode 1 reset functionality
Linux distributions shipping affected kernel versions
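
As a rough first check of whether a host is in scope, one can test whether the amdgpu module is loaded. The sketch below is a minimal example, not a definitive test: the function name amdgpu_loaded is hypothetical, and it assumes the driver is built as a loadable module listed in /proc/modules (a driver compiled into the kernel would not appear there). A loaded module only narrows the scope; it does not show that the running kernel lacks the fix.

```bash
# Hypothetical helper: succeed if the amdgpu module appears in a modules list.
# The list path defaults to /proc/modules and is a parameter so the function
# can be exercised against a test file.
amdgpu_loaded() {
    local modules_file="${1:-/proc/modules}"
    # Each /proc/modules line starts with the module name followed by a space.
    grep -q '^amdgpu ' "$modules_file" 2>/dev/null
}

# Usage:
#   if amdgpu_loaded; then echo "amdgpu module is loaded"; fi
```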

Discovery Timeline

2026-02-18 - CVE-2026-23213 published to NVD
2026-02-18 - Last updated in NVD database

Technical Details for CVE-2026-23213

Vulnerability Analysis

This vulnerability stems from a race condition in the AMD GPU power management subsystem within the Linux kernel. When the driver initiates a Mode 1 reset of the System Management Unit, the ASIC (Application-Specific Integrated Circuit) undergoes a reset cycle during which it becomes temporarily unavailable on the PCIe bus.

The core issue is that during this reset window, other driver components—including interrupt handlers and concurrent driver threads—may still attempt to access MMIO registers. Since the hardware is offline, these PCIe transactions cannot complete successfully. The result is uncompleted PCIe transactions that the system's NMI watchdog or PCIe error handling mechanisms interpret as fatal errors, leading to kernel panics or complete system hangs.

The fix introduces a no_hw_access flag that is set immediately after triggering the reset, signaling to other driver components that they should skip any register access attempts. A memory barrier (smp_mb()) ensures this flag update is visible across all CPU cores before the driver enters its sleep/wait state for the reset to complete.

Root Cause

The root cause is a missing synchronization mechanism between the SMU Mode 1 reset path and other driver components that perform hardware register access. Without proper coordination, there is a timing window where:

The reset is initiated and the ASIC goes offline
Other threads or interrupt handlers attempt MMIO access
PCIe transactions fail because the target device is unreachable
The system responds to these failures with NMI panics or hangs

The absence of a globally visible flag and appropriate memory barriers allowed this race condition to manifest, particularly on multi-core systems where different CPU cores might be executing driver code simultaneously.

Attack Vector

This vulnerability is triggered during normal GPU driver operations rather than through external attack vectors. The race condition occurs internally within the driver when Mode 1 reset is invoked. While not directly exploitable for remote code execution, it can cause denial of service through system instability.

The vulnerability manifests when concurrent access patterns occur during the reset window. The fix addresses this by implementing proper synchronization through the no_hw_access flag combined with a memory barrier to ensure visibility across all processor cores. For implementation details, see the kernel commit listed under Immediate Actions Required.

Detection Methods for CVE-2026-23213

Indicators of Compromise

Kernel panic messages referencing NMI watchdog timeouts during GPU operations
System logs showing PCIe transaction errors related to AMD GPU device
Crash dumps indicating faults in drm/amd/pm or SMU-related kernel functions
Unexpected system hangs occurring during GPU workload transitions or power management events

Detection Strategies

Monitor kernel logs (dmesg, /var/log/kern.log) for NMI panic messages or PCIe errors associated with AMD GPU devices
Implement crash dump collection to capture kernel oops/panic events for post-mortem analysis
Review system stability logs for patterns of hangs correlating with GPU activity or power state changes
Use SentinelOne Singularity platform to detect anomalous kernel behavior and crash patterns
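
The log monitoring described above could be sketched as a small filter over kernel log text. This is an illustrative example only: the function name scan_kernel_log is hypothetical, and the search patterns are assumptions about what relevant lines might contain, not exact kernel message strings, which vary by kernel version and hardware.

```bash
# Hypothetical helper: read kernel log text on stdin and print lines that may
# relate to this issue. The patterns are illustrative assumptions, not exact
# kernel message strings.
scan_kernel_log() {
    # "|| true" keeps the function from failing when nothing matches.
    grep -E 'NMI|AER:|PCIe Bus Error|amdgpu.*(reset|SMU)' || true
}

# Usage (reading the kernel ring buffer may require root):
#   dmesg | scan_kernel_log
#   scan_kernel_log < /var/log/kern.log
```

Matches are only a starting point for triage; NMI and PCIe error lines can have many unrelated causes.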

Monitoring Recommendations

Enable kernel crash dump collection (kdump) to capture diagnostic information during system failures
Configure kernel logging to capture PCIe error events and driver debug messages
Implement system health monitoring to detect unexpected reboots or hangs in production environments
Deploy SentinelOne agents for real-time kernel integrity monitoring and threat detection
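
To confirm that crash dump collection is actually armed, one option is to read the kernel's kexec_crash_loaded flag, which reports 1 when a crash kernel is loaded. The following is a minimal sketch under that assumption; the function name crash_kernel_status is hypothetical, and kdump setup itself is distribution-specific and not covered here.

```bash
# Hypothetical helper: report whether a kdump crash kernel is loaded.
# The kernel exposes this as 0 or 1 in /sys/kernel/kexec_crash_loaded; the
# path is a parameter so the function can be exercised against a test file.
crash_kernel_status() {
    local flag_file="${1:-/sys/kernel/kexec_crash_loaded}"
    if [ ! -r "$flag_file" ]; then
        echo "unknown"        # file missing or unreadable
    elif [ "$(cat "$flag_file")" = "1" ]; then
        echo "loaded"         # crash kernel is in place
    else
        echo "not-loaded"     # a panic would not produce a dump
    fi
}

# Usage:
#   crash_kernel_status
```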

How to Mitigate CVE-2026-23213

Immediate Actions Required

Update to a patched Linux kernel version containing the fix (commit 7edb503fe4b6d67f47d8bb0dfafb8e699bb0f8a4)
Apply vendor-provided kernel updates from your Linux distribution
If updates cannot be applied immediately, consider reducing GPU workloads that may trigger Mode 1 resets
Review system stability and ensure crash dump mechanisms are in place to capture any occurrences

Patch Information

The vulnerability has been addressed through kernel patches that introduce the no_hw_access flag and appropriate memory barriers. The fix (commit 7edb503fe4b6d67f47d8bb0dfafb8e699bb0f8a4) is available through multiple stable kernel branches.

Organizations should apply kernel updates from their Linux distribution vendor that incorporate these fixes.

Workarounds

Apply vendor-provided kernel patches as the primary mitigation
If patching is not immediately possible, avoid operations that may trigger GPU Mode 1 resets
Implement system monitoring to detect and respond to crash events quickly
Consider temporarily using alternative graphics drivers if available and compatible with your workload

```bash
# Check current kernel version and update to patched version
uname -r

# For Debian/Ubuntu systems
sudo apt update && sudo apt upgrade linux-image-generic

# For RHEL/CentOS/Fedora systems
sudo dnf update kernel

# Reboot to apply the updated kernel
sudo reboot
```
