CVE-2026-43318: Linux Kernel Race Condition Vulnerability

CVE-2026-43318 Overview

CVE-2026-43318 is a synchronization defect in the Linux kernel's AMD GPU driver (drm/amdgpu). The flaw resides in amdgpu_dma_buf_move_notify, the callback invoked when a shared buffer object (BO) backing a DMA-buf is invalidated. The function assumes that page tables can be updated immediately, even though in-flight jobs may still reference the buffer's original location. In multi-GPU configurations without P2P PCI support, this race causes the importing GPU to access stale mappings and trigger page faults. The issue affects Linux kernel 7.0 release candidates rc1 through rc7 and the earlier branches that received the upstream fix commits.

Critical Impact
A local user running graphics workloads across multiple GPUs can trigger GPU page faults, leading to driver instability and denial of service on systems using the AMD GPU driver.

Affected Products

Linux kernel 7.0-rc1 through 7.0-rc7
Linux kernel stable branches prior to commits 3307459e, 82a7ea35, 89a9389a, and b18fc0ab
Systems using the amdgpu DRM driver, particularly multi-GPU configurations without P2P PCI support

Discovery Timeline

2026-05-08 - CVE-2026-43318 published to NVD
2026-05-15 - Last updated in NVD database

Technical Details for CVE-2026-43318

Vulnerability Analysis

The vulnerability is a race condition (classified by NVD as NVD-CWE-noinfo) in the AMD GPU driver's DMA-buf invalidation path. When a buffer object shared across GPUs is migrated, amdgpu_dma_buf_move_notify must notify importing devices so they can update their page tables. A prior change passed a reservation ticket to amdgpu_vm_handle_moved, which made it proceed as though immediate page table updates were safe. That assumption is invalid when jobs targeting the buffer are still executing on the exporting GPU.

The upstream commit message describes a concrete scenario. With glxgears rendering on GPU0 and Xorg compositing on GPU1, the kernel may update GPU1's page tables while a tiled-to-linear blit job is still running on GPU0. The blit operation then references a buffer mapping that has already been torn down, producing a GPU page fault.

Root Cause

The root cause is improper synchronization between DMA-buf move notifications and outstanding GPU work. The fix removes the assumption that holding the reservation ticket grants permission to update page tables immediately. Instead, the driver waits for in-flight jobs on the shared buffer to complete before allowing the importer to refresh its mappings.

Attack Vector

Exploitation requires local access with the ability to submit GPU work. An unprivileged user running graphics or compute workloads on a system with multiple AMD GPUs, or any configuration that exercises the cross-device DMA-buf import path, can trigger the race. The result is a high-impact availability issue: GPU page faults, driver hangs, or system-wide graphics stack failure. Confidentiality and integrity are not directly impacted.

No verified public proof-of-concept is available. The reproduction described in the kernel changelog uses standard graphics workloads (glxgears plus Xorg) on dual-GPU systems without P2P PCI support.

Detection Methods for CVE-2026-43318

Indicators of Compromise

Kernel log messages from the amdgpu driver reporting GPU page faults, VM faults, or ring timeouts on systems with shared DMA-buf workloads
Unexpected drm_sched job timeouts or GPU reset events correlated with multi-GPU rendering sessions
Repeated X server or Wayland compositor crashes on hosts running AMD graphics with secondary GPUs
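
The log indicators above can be counted with a small filter. The following is a minimal sketch, not a definitive detector: the function name count_amdgpu_faults is hypothetical, and the match patterns are assumptions based on the indicator descriptions, since the exact wording of amdgpu messages varies by kernel version and hardware.

```bash
# count_amdgpu_faults (hypothetical helper): read kernel log text on stdin and
# print the number of lines that look like amdgpu page faults, VM faults, or
# ring timeouts. The patterns are assumptions; tune them to your kernel's output.
count_amdgpu_faults() {
    # -E: extended regex, -i: case-insensitive, -c: print the match count.
    # grep exits non-zero when nothing matches (it still prints 0), so
    # "|| true" keeps the function usable in scripts that run under "set -e".
    grep -Eic 'amdgpu.*(page fault|vm fault|ring [^ ]+ timeout)' || true
}
```

A typical invocation pipes the kernel log into the function, for example journalctl -k --no-pager | count_amdgpu_faults or dmesg | count_amdgpu_faults. A non-zero count is only a prompt for investigation, because other amdgpu problems produce similar messages.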

Detection Strategies

Inventory running kernel versions across the Linux fleet and flag hosts on 7.0-rc1 through 7.0-rc7 or unpatched stable branches using the amdgpu driver
Monitor dmesg and journalctl -k for amdgpu VM fault entries and correlate with user-level graphics process activity
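Track GPU reset events through the kernel log rather than by reading /sys/kernel/debug/dri/*/amdgpu_gpu_recover; where debugfs is enabled, reading that file triggers a GPU recovery instead of reporting a counter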
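
For the kernel inventory step, a release string can be tested against the affected release-candidate range. This is a minimal sketch under stated assumptions: the function name is_affected_rc is hypothetical, and it assumes release strings of the form 7.0.0-rc3 or 7.0-rc3, optionally followed by a local suffix. It cannot judge stable branches, which have to be checked against the distribution's changelog for the fix commits.

```bash
# is_affected_rc (hypothetical helper): succeed (exit 0) when the given kernel
# release string is one of the 7.0 release candidates rc1 through rc7.
# Assumption: strings look like "7.0.0-rc3" or "7.0-rc3", optionally followed
# by a local suffix such as "+" or "-custom".
is_affected_rc() {
    # The trailing group rejects rc10 and above by requiring that the digit
    # after "rc" is followed by a non-digit or the end of the string.
    printf '%s\n' "$1" | grep -Eq '^7\.0(\.0)?-rc[1-7]([^0-9]|$)'
}
```

On a host it can be used as is_affected_rc "$(uname -r)" && echo "kernel is an affected 7.0 release candidate". A negative result does not prove the host is patched.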

Monitoring Recommendations

Forward kernel logs to a centralized logging or SIEM platform and alert on amdgpu page fault patterns
Baseline normal GPU job completion behavior so anomalous timeouts and resets are surfaced quickly
For workstations running multi-GPU graphics workloads, enable crash reporting to capture driver state at fault time

How to Mitigate CVE-2026-43318

Immediate Actions Required

Update the Linux kernel to a version containing the upstream fixes referenced by commits 3307459e, 82a7ea35, 89a9389a, and b18fc0ab
Apply distribution security updates for the kernel package on all hosts using AMD GPUs
Restrict local access on multi-user systems with AMD graphics until patches are deployed

Patch Information

The fix corrects synchronization in amdgpu_dma_buf_move_notify so that page table updates wait for outstanding GPU jobs to complete on the shared buffer object. The fixes are available upstream as kernel commits 3307459e, 82a7ea35, 89a9389a, and b18fc0ab.

Workarounds

Avoid cross-GPU DMA-buf sharing by binding graphics sessions to a single AMD GPU using DRI_PRIME configuration where practical
On systems with P2P PCI support between GPUs, ensure it is enabled in firmware to bypass the affected code path
Disable secondary GPUs on affected hosts until the patched kernel is deployed if the workload permits
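
Whether the multi-GPU workarounds are relevant depends on how many GPUs the amdgpu driver is actually driving. The sketch below counts them through sysfs. The function name count_amdgpu_cards is hypothetical, and it assumes the usual layout in which each /sys/class/drm/cardN/device/driver symlink points at the bound driver.

```bash
# count_amdgpu_cards (hypothetical helper): print how many DRM card nodes under
# the given directory (default /sys/class/drm) are bound to the amdgpu driver.
count_amdgpu_cards() {
    drm_dir=${1:-/sys/class/drm}
    count=0
    for card in "$drm_dir"/card[0-9]*; do
        case ${card##*/} in
            *-*) continue ;;   # skip connector entries such as card0-DP-1
        esac
        # device/driver is a symlink whose last path component names the driver;
        # cards without a bound driver are skipped.
        driver=$(readlink "$card/device/driver" 2>/dev/null) || continue
        if [ "${driver##*/}" = "amdgpu" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

A result of 2 or more suggests the host may exercise the cross-GPU path. Systems that pair an AMD GPU with a GPU from another vendor report 1 here, so this count is a rough screen, not proof either way.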

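```bash
# Show the running kernel release
uname -r

# Check whether the amdgpu module itself is loaded
# (anchoring at line start avoids matching modules that merely depend on it)
lsmod | grep -w '^amdgpu'

# After installing the patched kernel, reboot into it
sudo reboot
```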