CVE-2026-23047 Overview
A logic error has been identified in the calc_target() function of the Linux kernel's libceph module. The function clears the t->paused flag when a request should resume, but never sets it when a request should be paused. This incomplete state management breaks linger requests (such as watch operations), which rely on accurate pause state tracking to be reestablished after paused → unpaused transitions.
Critical Impact
This vulnerability can leave an RBD (RADOS Block Device) image stuck with rbd_dev->watch_mutex held indefinitely. Affected block devices become unresponsive to unmap requests, and a system restart is required to recover.
Affected Products
- Linux kernel with Ceph/RBD support enabled
- Systems using libceph module for Ceph storage communication
- Environments utilizing RBD block device mappings
Discovery Timeline
- 2026-02-04 - CVE-2026-23047 published to NVD
- 2026-02-04 - Last updated in NVD database
Technical Details for CVE-2026-23047
Vulnerability Analysis
The vulnerability resides in the calc_target() function within the libceph kernel module. This function is responsible for determining the target OSD (Object Storage Daemon) for Ceph requests and managing the pause state of those requests. The core issue is an asymmetric implementation of pause state management.
While calc_target() correctly clears the t->paused flag when a request should transition from paused to active state, it never sets this flag when a request should be paused. Instead, the setting of t->paused is delegated to __submit_request(). This approach works for regular requests but creates a critical gap for linger requests.
Linger requests, such as the watch operations used by RBD, do not pass through __submit_request() in the same manner as regular requests. Consequently, when conditions require a linger request to be paused, nothing sets the lreq->t.paused flag. The watch is then never reestablished after a paused → unpaused transition, and the code waiting on it blocks.
Root Cause
The root cause is incomplete state machine implementation in calc_target(). The function has the necessary context to determine when a request should be paused (it already checks these conditions to clear the pause state), but the code path to set the pause flag was never implemented. This creates an asymmetric state transition where:
- Requests can transition from paused → active (flag cleared by calc_target())
- Requests cannot properly transition from active → paused for linger types (flag never set)
The dependency on __submit_request() for setting the pause flag is architectural debt that fails to account for the different handling of linger requests in the Ceph client subsystem.
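The asymmetry can be pictured with a toy shell model. This is only a sketch of the logic described above, not kernel code: the function name calc_target_before_fix and its two 0/1 arguments (the current flag and whether the pause conditions hold) are hypothetical and introduced purely for illustration.

```shell
# Toy model of the pre-fix pause handling described above (not kernel code).
# Usage: calc_target_before_fix <paused> <should_be_paused>   (each 0 or 1)
# Prints the value of the paused flag after the call.
calc_target_before_fix() {
    paused=$1
    should_be_paused=$2
    # Only the paused -> active transition is handled.
    if [ "$paused" -eq 1 ] && [ "$should_be_paused" -eq 0 ]; then
        paused=0
    fi
    # Note: no branch sets paused=1 when should_be_paused is 1.
    echo "$paused"
}

calc_target_before_fix 1 0   # paused request that may resume: prints 0
calc_target_before_fix 0 1   # active request that should pause: still prints 0
```

In this model the second call is the linger case: the pause condition holds, yet the flag stays unset because no other code path sets it.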
Attack Vector
This is a local denial of service condition that manifests under specific operational scenarios. The vulnerability triggers when:
- A watch request is established on an RBD device
- Network conditions or OSD state changes cause requests to be paused long enough for the unwatch request to time out
- A subsequent rewatch request enters what should be a paused state
- Because the pause flag is never set, the request fails to be placed on the need_resend_linger list
The attack vector is indirect. An attacker cannot exploit this for code execution, but one with partial access to the environment could potentially trigger the condition by manipulating network connectivity or Ceph cluster state. The result is a persistent denial of service where:
- The rbd_register_watch() function blocks indefinitely waiting for lreq->reg_commit_wait completion
- The rbd_dev->watch_mutex remains held
- Any attempt to unmap the RBD device (via rbd unmap) hangs in D (uninterruptible sleep) state
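The sequence above hinges on which requests are picked up when the pause is lifted. The following is a rough model of that step, assuming (as the description above states) that only requests marked paused end up queued for resend. The function name requests_to_resend and the two-column "name flag" input format are hypothetical.

```shell
# Toy model of the unpause step (not kernel code).
# Reads "name paused_flag" pairs on stdin and prints the names that would be
# queued for resend, assuming only requests marked paused are picked up.
requests_to_resend() {
    awk '$2 == 1 { print $1 }'
}

# The regular request had its flag set at submission time. The rewatch linger
# request never had it set, so it is skipped and its waiter is never woken.
printf '%s\n' 'regular_write 1' 'rewatch_linger 0' | requests_to_resend
```

Only regular_write is printed. In the terms used above, rewatch_linger never reaches the need_resend_linger list, so lreq->reg_commit_wait is never completed.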
Detection Methods for CVE-2026-23047
Indicators of Compromise
- Processes stuck in D (uninterruptible sleep) state when attempting rbd unmap operations
- Stale Ceph watch registrations that fail to reestablish after network disruptions
- Kernel log messages or stack traces that reference linger_reg_commit_wait(), or that report watch registration failures
- Hung rbd_reregister_watch() calls visible in kernel stack traces
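One way to look for the stack-trace indicators listed above is to scan /proc/<pid>/stack for the relevant symbols. The helper below is a minimal sketch: find_tasks_in_symbol is a hypothetical name, the optional second argument exists only so the function can be pointed at a test directory, and reading these files normally requires root.

```shell
# Print PIDs whose kernel stack mentions a given symbol.
# Usage: find_tasks_in_symbol <symbol> [proc_root]
# Unreadable or missing stack files are skipped silently.
find_tasks_in_symbol() {
    symbol=$1
    proc_root=${2:-/proc}
    for stack in "$proc_root"/[0-9]*/stack; do
        [ -r "$stack" ] || continue
        if grep -q -- "$symbol" "$stack" 2>/dev/null; then
            pid=${stack%/stack}      # strip the trailing /stack
            echo "${pid##*/}"        # keep only the PID component
        fi
    done
}

find_tasks_in_symbol linger_reg_commit_wait
find_tasks_in_symbol rbd_reregister_watch
```

A PID that keeps appearing across repeated runs is a stronger signal than a single hit, since a healthy watch registration also passes through these functions briefly.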
Detection Strategies
- Monitor for processes in uninterruptible sleep state related to RBD operations using ps aux | grep " D" combined with stack trace analysis
- Implement watchdog monitoring on Ceph watch registration operations with timeout alerts
- Use kernel tracing (ftrace or bpftrace) to monitor calc_target() function calls and pause flag state transitions
- Check for mutex contention on rbd_dev->watch_mutex using lock debugging tools
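The first strategy above can be made less noisy than a plain grep for " D" by matching on the process state column. This is a sketch under assumptions: filter_d_state_rbd is a hypothetical helper, and it expects "pid stat command" lines such as those printed by ps with the -o pid= -o stat= -o args= output options.

```shell
# Filter "pid stat command..." lines down to tasks in uninterruptible sleep
# (state starts with D) whose line mentions rbd or ceph. Prints matching PIDs.
filter_d_state_rbd() {
    awk '$2 ~ /^D/ && tolower($0) ~ /rbd|ceph/ { print $1 }'
}

# Empty headers (pid=, stat=, args=) suppress the header line.
ps -e -o pid= -o stat= -o args= | filter_d_state_rbd
```

Each reported PID still needs stack trace analysis, because brief D states also occur during normal I/O.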
Monitoring Recommendations
- Enable Ceph client debug logging to track watch establishment and timeout patterns
- Implement automated health checks for RBD device mappings to detect stuck operations
- Configure alerts for abnormally long watch registration times or repeated registration attempts
- Monitor kernel memory for potential resource leaks from stuck linger requests
How to Mitigate CVE-2026-23047
Immediate Actions Required
- Apply the kernel patches from the stable kernel tree addressing this vulnerability
- If immediate patching is not possible, plan maintenance windows to restart systems that show symptoms of stuck RBD mappings
- Review and potentially increase timeout values for watch operations to reduce the likelihood of triggering this condition
- Consider implementing redundant storage paths to minimize impact of individual RBD device issues
Patch Information
Multiple patches have been released to the Linux kernel stable branches to address this vulnerability. The fix modifies calc_target() to properly set t->paused when pause conditions are detected, ensuring consistent state management for both regular and linger requests.
Available patches:
- Kernel Patch Update 2b3329b3
- Kernel Patch Update 4d3399c5
- Kernel Patch Update 4ebc711b
- Kernel Patch Update 5647d42c
- Kernel Patch Update 5d0dc83c
- Kernel Patch Update 6f468f6f
- Kernel Patch Update c0fe2994
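The change described in the patch summary can be pictured with a toy shell model. It is not the actual patch: calc_target_after_fix and its two 0/1 arguments (the current flag and whether the pause conditions hold) are hypothetical names used only to show the flag tracking the pause condition in both directions.

```shell
# Toy model of the corrected behaviour described above (not the actual patch).
# Usage: calc_target_after_fix <paused> <should_be_paused>   (each 0 or 1)
# Prints the value of the paused flag after the call.
calc_target_after_fix() {
    paused=$1
    should_be_paused=$2
    # The flag now follows the pause condition in both directions.
    if [ "$paused" -ne "$should_be_paused" ]; then
        paused=$should_be_paused
    fi
    echo "$paused"
}

calc_target_after_fix 0 1   # prints 1: an active linger request is now marked paused
calc_target_after_fix 1 0   # prints 0: a paused request is still released as before
```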
Workarounds
- If a system becomes affected, the only reliable workaround is to restart the system to release the held mutex and clear stuck requests
- Implement proactive monitoring to detect early signs of watch registration issues and trigger preemptive maintenance
- In clustered environments, consider failing over workloads from affected nodes before the condition fully manifests
- Reduce frequency of pause/unpause transitions by ensuring stable network connectivity to Ceph clusters
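# Diagnostic commands (run as root on the affected client)
# List rbd/ceph-related processes in uninterruptible sleep (STAT starts with D)
ps -e -o pid= -o stat= -o args= | awk '$2 ~ /^D/ && /rbd|ceph/'
# View kernel stack traces for stuck "rbd unmap" processes (there may be several, or none)
for pid in $(pgrep -f "rbd unmap"); do
    echo "== PID $pid =="
    cat "/proc/$pid/stack"
done
# Inspect in-flight and linger requests of the kernel Ceph client (requires debugfs to be mounted)
cat /sys/kernel/debug/ceph/*/osdc
# Unmap a device through sysfs. The value written is the rbd device ID
# (1 here, i.e. /dev/rbd1), not an on/off flag. On an affected system this
# can hang just like rbd unmap, in which case a reboot is required.
echo 1 > /sys/bus/rbd/remove_single_major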


