CVE-2025-68756: Linux Kernel Race Condition Vulnerability

CVE-2025-68756 Overview

A deadlock vulnerability has been identified in the Linux kernel's block multi-queue (blk-mq) subsystem. The flaw exists in the blk_mq_[un]quiesce_tagset() functions, which take the set->tag_list_lock mutex while quiescing or unquiescing the queues of a tagset. This creates a circular lock dependency that can hang the system when an NVMe command times out while a queue is being added to or removed from the same tagset.

The vulnerability was introduced when commit 98d81f0df70c modified the NVMe driver to quiesce the entire tagset instead of individual queues. This change exposed a lock ordering issue between the timeout handling path and the queue removal path, causing two threads to deadlock while waiting for each other to release resources.

Critical Impact
Systems using NVMe storage devices may experience complete system hangs when device timeouts occur during queue management operations, requiring a hard reboot to recover.

Affected Products

Linux Kernel (versions with blk-mq tagset sharing)
NVMe storage subsystem components
Systems using block multi-queue I/O scheduling

Discovery Timeline

2026-01-05 - CVE-2025-68756 published to NVD
2026-01-08 - Last updated in NVD database

Technical Details for CVE-2025-68756

Vulnerability Analysis

The deadlock condition arises from the interaction between two kernel code paths that both require access to shared resources but acquire locks in conflicting orders. The blk_mq_{add,del}_queue_tag_set() functions manage queue attachments to tagsets and must freeze queues before modifying the BLK_MQ_F_TAG_QUEUE_SHARED flag. These functions hold the set->tag_list_lock mutex during this operation.

At the same time, blk_mq_quiesce_tagset() walks the queues in set->tag_list while holding this same lock. When an NVMe command times out, the timeout handler calls nvme_dev_disable(), which invokes blk_mq_quiesce_tagset(). If another thread already holds set->tag_list_lock while removing a queue and waiting for a queue to freeze, the two threads deadlock: the freeze cannot complete until the timeout handler finishes, and the timeout handler cannot proceed until the lock is released.

The two conflicting stack traces demonstrate this circular dependency:

Thread A (timeout handler): nvme_timeout() → nvme_dev_disable() → blk_mq_quiesce_tagset() - waiting for set->tag_list_lock
Thread B (queue removal): nvme_ns_remove() → del_gendisk() → blk_mq_exit_queue() → blk_mq_update_tag_set_shared() → blk_mq_freeze_queue_wait() - holding set->tag_list_lock, waiting for queue freeze

Root Cause

The root cause is the improper synchronization mechanism used in blk_mq_[un]quiesce_tagset(). The functions use mutex-based locking (set->tag_list_lock) to protect the tag list traversal, but this creates a lock dependency that conflicts with the queue freeze operation. Since quiescing a queue does not require sleeping, the use of a mutex is unnecessarily restrictive and creates the conditions for deadlock.

The fix replaces the mutex-based synchronization with RCU (Read-Copy-Update), which is a lockless synchronization mechanism well-suited for read-mostly data structures. This change eliminates the lock ordering conflict by allowing the quiesce operation to traverse the queue list without holding any mutex.

Attack Vector

This is a denial-of-service vulnerability that arises under specific operational conditions rather than through external exploitation. The deadlock can be triggered when:

An NVMe device experiences a command timeout while queue management operations are in progress
A system administrator unbinds an NVMe device via sysfs while I/O operations are pending
Hot-removal of NVMe storage occurs during high I/O load with timeouts

While not remotely exploitable in the traditional sense, the vulnerability can cause complete system unavailability requiring manual intervention. In virtualized or containerized environments where NVMe devices are shared or passthrough is used, the impact could affect multiple workloads.

Detection Methods for CVE-2025-68756

Indicators of Compromise

System hangs with no kernel panic or crash dump generated
Processes stuck in uninterruptible sleep state (D state) related to NVMe or block I/O operations
Kernel stack traces in logs showing blk_mq_quiesce_tagset and blk_mq_freeze_queue_wait on different threads
NVMe device timeout messages followed by system unresponsiveness
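
The D-state indicator above can be checked on a system that still accepts commands. The following is a minimal sketch, not a definitive detector: the helper name filter_blocked_tasks is hypothetical, and it assumes the procps column layout produced by ps -eo pid,stat,wchan:32,comm.

```shell
# Hypothetical helper: reads `ps` output on stdin and keeps the header row
# plus every row whose STAT column (second field) starts with D,
# i.e. tasks in uninterruptible sleep.
filter_blocked_tasks() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Example invocation (assumes procps ps):
#   ps -eo pid,stat,wchan:32,comm | filter_blocked_tasks
```

A task in D state is not proof of this deadlock on its own; the WCHAN column only hints at where the task is sleeping, so a match is a prompt to capture full stack traces.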

Detection Strategies

Monitor for hung task warnings in kernel logs referencing blk_mq_* or nvme_* functions
On a system that still accepts commands, use echo w > /proc/sysrq-trigger (blocked tasks only) or echo t > /proc/sysrq-trigger (all tasks) to capture stack traces showing the deadlock pattern
Deploy kernel watchdog monitoring to detect and alert on system hangs
Review dmesg output for NVMe timeout messages correlating with system stability issues
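
The stack trace pattern described in this section can be screened for in saved kernel log text. This is a rough sketch under stated assumptions: the function name has_tagset_deadlock_signature is hypothetical, and the check only confirms that both function names appear somewhere in the input. It does not verify that they belong to two different blocked threads, so a match still needs manual review.

```shell
# Hypothetical helper: reads kernel log text on stdin and succeeds (exit 0)
# only when both sides of the reported deadlock are present:
# the quiesce path and the queue freeze wait path.
has_tagset_deadlock_signature() {
    _log=$(cat)
    printf '%s\n' "$_log" | grep -q 'blk_mq_quiesce_tagset' &&
        printf '%s\n' "$_log" | grep -q 'blk_mq_freeze_queue_wait'
}

# Example invocation:
#   dmesg | has_tagset_deadlock_signature && echo "possible tagset deadlock"
```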

Monitoring Recommendations

Implement automated kernel log analysis for deadlock signatures involving block layer functions
Configure hardware watchdog timers to automatically recover from system hangs
Monitor NVMe device health metrics and timeout rates through smartctl or nvme-cli tools
Set up alerting for unusual patterns of NVMe command timeouts in storage-intensive workloads
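
Timeout rates can be approximated from kernel logs. The sketch below counts timeout messages per NVMe controller. It assumes that timeout lines contain a "nvme nvmeN:" prefix and the word "timeout"; the exact message text varies between kernel versions, so treat the pattern as an assumption to adjust. The function name count_nvme_timeouts is hypothetical.

```shell
# Hypothetical helper: reads kernel log text on stdin and prints
# "<controller> <count>" for each controller with timeout messages.
count_nvme_timeouts() {
    awk '
        /nvme nvme[0-9]+:/ && /timeout/ {
            # Find the "nvmeN:" field, strip the colon, and count it.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^nvme[0-9]+:$/) {
                    sub(/:$/, "", $i)
                    n[$i]++
                    break
                }
        }
        END { for (c in n) print c, n[c] }
    ' | sort
}

# Example invocation:
#   dmesg | count_nvme_timeouts
```

Feeding the counts into an existing alerting pipeline is left to the reader; the threshold that counts as "unusual" depends on the workload.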

How to Mitigate CVE-2025-68756

Immediate Actions Required

Update the Linux kernel to a patched version containing the RCU-based fix
Avoid hot-removal or unbinding of NVMe devices during periods of high I/O activity until patched
Consider disabling automatic NVMe device unbinding in production environments
Implement hardware watchdog timers to enable automatic recovery if deadlock occurs

Patch Information

The vulnerability has been addressed through multiple kernel commits that replace the mutex-based synchronization with RCU in the affected functions. The fix updates blk_mq_[un]quiesce_tagset() to use RCU traversal and modifies blk_mq_{add,del}_queue_tag_set() to use RCU-safe list operations.

Relevant patch commits include:

Workarounds

Avoid triggering NVMe device unbind operations during periods of active I/O or when timeouts may occur
Configure longer NVMe command timeout values to reduce the likelihood of timeout-triggered deadlocks
Use SCSI-based storage instead of NVMe where the risk is unacceptable and patching is not immediately possible
Implement monitoring to detect early signs of deadlock and trigger graceful failover before complete system hang

bash

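# Print the running kernel version; compare it against your distribution's advisory for CVE-2025-68756
uname -r

# Search kernel logs for hung task reports and block layer functions
dmesg | grep -iE "hung_task|blocked for more than|deadlock|blk_mq"

# Have the hung task detector report tasks blocked for more than 60 seconds (requires root).
# This only controls logging; it does not configure a hardware watchdog or recover the system.
echo 60 > /proc/sys/kernel/hung_task_timeout_secs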
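
For the workaround of configuring longer NVMe command timeouts, it helps to know the current value first. The following sketch reads the nvme_core io_timeout module parameter from sysfs. The function name show_nvme_io_timeout is hypothetical, and the optional path argument exists only so the helper can be pointed at another file; the default path assumes the nvme_core module is loaded.

```shell
# Hypothetical helper: prints the NVMe I/O timeout in seconds.
# $1 (optional): path to the parameter file; defaults to the nvme_core sysfs entry.
show_nvme_io_timeout() {
    _param=${1:-/sys/module/nvme_core/parameters/io_timeout}
    if [ -r "$_param" ]; then
        printf 'nvme_core io_timeout: %s seconds\n' "$(cat "$_param")"
    else
        printf 'nvme_core io_timeout not available at %s\n' "$_param"
        return 1
    fi
}

# Example invocation:
#   show_nvme_io_timeout
```

The value is commonly raised with the nvme_core.io_timeout=<seconds> kernel boot parameter. A longer timeout only lowers the chance of hitting the deadlock window; it does not remove the lock ordering problem.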
