CVE-2026-53207: Linux Kernel Race Condition Vulnerability

CVE-2026-53207 Overview

CVE-2026-53207 is a Linux kernel race condition in the memory-failure subsystem. The flaw resides in the get_huge_page_for_hwpoison() path (the wrapper in hugetlb.c and __get_huge_page_for_hwpoison() in mm/memory-failure.c) and triggers an AA (recursive) self-deadlock on the non-recursive hugetlb_lock spinlock. When two concurrent madvise(MADV_HWPOISON) calls against the same hugetlb folio race with an unmap operation, the folio_put() issued while hugetlb_lock is still held can drop the last reference. free_huge_folio() then attempts to re-acquire the same spinlock, hanging the CPU.

Critical Impact
Local callers able to issue madvise(MADV_HWPOISON), which requires CAP_SYS_ADMIN, can trigger a kernel spinlock self-deadlock, freezing the affected CPU and causing a denial of service on systems using hugetlb pages.

Affected Products

Linux kernel versions containing the __get_huge_page_for_hwpoison() code path prior to the fix commits
Distributions shipping affected stable kernel branches referenced by the upstream commits
Linux systems configured with hugetlb memory and CONFIG_MEMORY_FAILURE enabled

Discovery Timeline

2026-06-25 - CVE-2026-53207 published to NVD
2026-06-25 - Last updated in NVD database

Technical Details for CVE-2026-53207

Vulnerability Analysis

The vulnerability is a kernel race condition combined with a locking error in the hugetlb memory-failure path. The wrapper get_huge_page_for_hwpoison() in hugetlb.c acquires hugetlb_lock and then calls __get_huge_page_for_hwpoison(). Inside that function, the out: label invokes folio_put() to drop the GUP reference while the spinlock remains held.

Under normal conditions the refcount stays above zero, so folio_put() returns without freeing the folio. However, if a concurrent unmap releases the page table mapping reference at the same moment, folio_put() drops the refcount from 1 to 0. This invokes free_huge_folio(), which calls spin_lock_irqsave(&hugetlb_lock, flags) on the already-held non-recursive spinlock, producing an AA deadlock.

Root Cause

The root cause is improper lock scope. The cleanup path in __get_huge_page_for_hwpoison() releases a folio reference while the caller still holds hugetlb_lock. Because hugetlb_lock is a standard non-recursive spinlock, any code path reachable from within the critical section that may call free_huge_folio() will self-deadlock. The trigger requires two concurrent madvise(MADV_HWPOISON) invocations racing against an unmap on the same hugetlb folio, which makes this both a race condition and a deadlock defect.

Attack Vector

A local user with the ability to call madvise(MADV_HWPOISON) on a hugetlb-backed mapping can race two such calls against a concurrent unmap to hit the deadlock. MADV_HWPOISON historically requires CAP_SYS_ADMIN, limiting practical exploitation to privileged local contexts or constrained multi-tenant environments where the capability is granted. The result is a CPU hang and denial of service rather than memory corruption or code execution.

The upstream fix moves hugetlb_lock acquisition into get_huge_page_for_hwpoison() itself and places spin_unlock_irq() before the folio_put() at the out: label, ensuring the folio reference is always dropped outside the lock. See Linux Kernel Commit fc3ff42 and Linux Kernel Commit 77b73b5 for the canonical changes.

Detection Methods for CVE-2026-53207

Indicators of Compromise

Kernel soft-lockup or hard-lockup messages referencing hugetlb_lock, free_huge_folio, or __get_huge_page_for_hwpoison in dmesg or /var/log/messages.
lockdep warnings reporting recursive acquisition of hugetlb_lock on kernels built with CONFIG_PROVE_LOCKING.
One or more CPUs stuck at 100% in kernel mode with backtraces showing madvise → try_memory_failure_hugetlb → spin_lock_irqsave frames.
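
The indicators above can be checked mechanically. The following is a minimal sketch rather than a vetted detection rule: the function name scan_hwpoison_lockup is hypothetical, and the pattern list is simply the symbol names mentioned in this advisory. It filters kernel log text supplied on standard input, so it can be fed from dmesg, journalctl -k, or a saved log file.

```shell
# Print only the log lines that mention symbols named in this advisory.
# Input is read from stdin; nothing is collected by the function itself.
scan_hwpoison_lockup() {
    grep -E 'hugetlb_lock|free_huge_folio|__get_huge_page_for_hwpoison|try_memory_failure_hugetlb'
}

# Usage (reading the kernel ring buffer usually requires root):
#   dmesg | scan_hwpoison_lockup
#   scan_hwpoison_lockup < /var/log/messages
```

A match is only a hint that the affected code path appears in a backtrace. It should be read together with the surrounding lockup or lockdep report before drawing conclusions.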

Detection Strategies

Audit running kernel versions against the fixed stable commits listed in the references and flag any host still running an unpatched branch.
Enable lockdep and CONFIG_DEBUG_SPINLOCK in test environments to surface the AA deadlock before production exposure.
Monitor for repeated madvise(MADV_HWPOISON) syscalls from non-root or low-privilege contexts using auditd rules on syscall=madvise.

Monitoring Recommendations

Forward kernel ring buffer messages and soft-lockup events to a centralized log pipeline and alert on hugetlb_lock or BUG: spinlock strings.
Track CPU stall metrics and watchdog rcu_sched warnings on hosts that use hugetlb memory for databases, KVM, or DPDK workloads.
Inventory hosts with CONFIG_HUGETLBFS=y and CONFIG_MEMORY_FAILURE=y to prioritize patch deployment.
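
The inventory step can be scripted per host. This is a sketch under stated assumptions: check_hwpoison_config and its two output strings are hypothetical names, and the location of the kernel config file varies by distribution. The function reads a kernel config on standard input and reports whether both options are built in.

```shell
# Report whether a kernel config (read on stdin) enables both options.
# Prints "exposed-config" when both are =y, otherwise "not-exposed-config".
check_hwpoison_config() {
    awk '
        /^CONFIG_HUGETLBFS=y$/      { hugetlbfs = 1 }
        /^CONFIG_MEMORY_FAILURE=y$/ { memfail = 1 }
        END { print ((hugetlbfs && memfail) ? "exposed-config" : "not-exposed-config") }
    '
}

# Usage, assuming the distribution installs the config under /boot:
#   check_hwpoison_config < "/boot/config-$(uname -r)"
# or, on kernels built with CONFIG_IKCONFIG_PROC:
#   zcat /proc/config.gz | check_hwpoison_config
```

The result describes build configuration only. Whether the running kernel already carries the fix still has to be established from the vendor's changelog or the commits referenced in this advisory.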

How to Mitigate CVE-2026-53207

Immediate Actions Required

Apply the upstream stable kernel updates corresponding to the commits referenced in the advisory and reboot affected hosts.
Identify and prioritize systems that allocate hugetlb pages, including database, virtualization, and high-performance networking workloads.
Restrict CAP_SYS_ADMIN and the ability to issue madvise(MADV_HWPOISON) to trusted administrative accounts only.
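
To see which processes currently hold the capability, the effective capability mask in /proc/<pid>/status can be inspected. The sketch below is illustrative only: has_cap_sys_admin and list_cap_sys_admin_pids are hypothetical helper names. It relies on CAP_SYS_ADMIN being capability number 21, so the check tests bit 21 of the hexadecimal CapEff value.

```shell
# Succeed if a hex capability mask (the CapEff value from
# /proc/<pid>/status, without a 0x prefix) has CAP_SYS_ADMIN (bit 21) set.
has_cap_sys_admin() {
    [ $(( (0x$1 >> 21) & 1 )) -eq 1 ]
}

# Print the /proc entry of every process whose effective set
# includes CAP_SYS_ADMIN. Processes that exit mid-scan are skipped.
list_cap_sys_admin_pids() {
    for status in /proc/[0-9]*/status; do
        mask=$(awk '/^CapEff:/ { print $2 }' "$status" 2>/dev/null)
        [ -n "$mask" ] && has_cap_sys_admin "$mask" && echo "${status%/status}"
    done
    return 0
}

# Usage:
#   list_cap_sys_admin_pids
```

On most hosts this lists root-owned system services. The entries worth reviewing are container workloads or service accounts that were granted the capability without needing it.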

Patch Information

The fix is distributed across multiple stable branches. Reference patches include Linux Kernel Commit fc3ff42, Linux Kernel Commit 77b73b5, Linux Kernel Commit 3c2d42b, Linux Kernel Commit a33bfed, Linux Kernel Commit bf7ba8f, and Linux Kernel Commit dd77a83. The patch moves hugetlb_lock acquisition into get_huge_page_for_hwpoison() and unlocks before calling folio_put() at the out: label.

Workarounds

Disable hugetlb usage where feasible by setting vm.nr_hugepages=0 until the kernel is patched.
Limit madvise(MADV_HWPOISON) exposure by removing CAP_SYS_ADMIN from non-administrative containers and namespaces.
Avoid running test or fault-injection tooling that issues MADV_HWPOISON against hugetlb-backed memory on production hosts.

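```bash
# Verify running kernel and hugetlb configuration
uname -r
grep -E 'HugePages_Total|Hugepagesize' /proc/meminfo

# Temporarily shrink the persistent hugetlb pool pending patch
# (only free huge pages are released; pages in use stay allocated)
sysctl -w vm.nr_hugepages=0

# Audit madvise(MADV_HWPOISON) calls from non-root users
# (a2=100 matches the advice argument; MADV_HWPOISON is 100)
auditctl -a always,exit -F arch=b64 -S madvise -F a2=100 -F auid!=0 -k madv_hwpoison
```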