CVE-2025-68790 Overview
A use-after-free vulnerability has been discovered in the Linux kernel's mlx5 network driver, specifically in the HCA_PORTS component handling during LAG (Link Aggregation Group) teardown. The vulnerability occurs when the hca_devcom_comp pointer is not cleared in the device's private data after unregistration, allowing a second pass through mlx5_unload_one() to attempt unregistration of an already freed component.
Critical Impact
This vulnerability can lead to kernel panics and system crashes, particularly on s390 architecture where PCI error recovery events routinely trigger multiple passes through the unload function.
Affected Products
- Linux kernel with mlx5_core network driver
- Systems using Mellanox/NVIDIA ConnectX network adapters
- s390 architecture systems with PCI error recovery enabled
Discovery Timeline
- 2026-01-13 - CVE-2025-68790 published to NVD
- 2026-01-13 - Last updated in NVD database
Technical Details for CVE-2025-68790
Vulnerability Analysis
This use-after-free vulnerability stems from improper state management in the mlx5 network driver during device unload operations. The Linux kernel's mlx5_core driver manages HCA (Host Channel Adapter) ports through a device communication component (hca_devcom_comp). During normal LAG teardown, this component is unregistered, but the pointer in the device's private data structure is not cleared.
On s390, PCI-level recovery events typically trigger two passes through mlx5_unload_one(): one via the poll_health() method and another through mlx5_pci_err_detected(), which is called back from the generic PCI error recovery mechanism. Because the pointer is never cleared after the first pass, the second pass attempts to unregister the already-freed component, resulting in a use-after-free condition.
The crash manifests with a failing address of 6b6b6b6b6b6b6000. The repeated 0x6b byte is the kernel's slab free-poison value (POISON_FREE), which is written over freed objects when slab poisoning is enabled, so the address is a clear indication of use-after-free behavior.
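A quick way to triage an address like the one quoted above is to test whether it begins with a run of 0x6b bytes. The following is a minimal sketch, not an official tool: the function name `looks_like_free_poison` is hypothetical, and the threshold of six leading 0x6b bytes is an assumption chosen to match the 6b6b6b6b6b6b6000 example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a hex address starts with at least six
# 0x6b bytes, as in the failing address quoted in this advisory.
looks_like_free_poison() {
    # Lowercase the hex digits and strip an optional 0x prefix.
    addr=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    addr=${addr#0x}
    case "$addr" in
        6b6b6b6b6b6b*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `looks_like_free_poison 6b6b6b6b6b6b6000 && echo "possible use-after-free"` prints the message, while an ordinary kernel address does not match.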
Root Cause
The root cause is a missing state cleanup operation in the LAG teardown code path. After calling the unregister function for the HCA_PORTS component, the hca_devcom_comp pointer remains set to the now-invalid address. The fix requires clearing this pointer to NULL immediately after unregistration to prevent subsequent accesses.
The problem is most visible on s390, whose PCI error recovery design legitimately invokes the unload path through more than one code path for a single event. Because the stale pointer is still non-NULL, the driver cannot distinguish between an initialized component requiring cleanup and an already-freed component.
Attack Vector
The vulnerability is triggered through PCI error recovery events on affected systems. Exploitation requires local system access or some other means of inducing PCI errors. The consequences include:
- Kernel panic due to invalid memory access
- System crash and potential denial of service
- Possible memory corruption if the freed memory is reallocated before the second unregister attempt
The call trace shows the crash occurring in __lock_acquire() when the driver attempts to acquire a lock on the already-freed device structure during mlx5_detach_device().
Detection Methods for CVE-2025-68790
Indicators of Compromise
- Kernel panic messages containing mlx5_unload_one or mlx5_pci_err_detected in the call trace
- Kernel oops or panic reports with failing addresses matching the pattern 6b6b6b6b6b6b6xxx (0x6b slab free-poison bytes)
- System crashes during PCI error recovery events on systems with Mellanox/NVIDIA network adapters
- Oops messages referencing mlx5_detach_device or mlx5_core module
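The indicators listed above can be searched for in saved kernel logs. The following is a minimal sketch under stated assumptions: the function name `scan_mlx5_indicators` is hypothetical, and the patterns are taken directly from the function names and address pattern mentioned in this advisory, so they may miss differently formatted traces or match unrelated lines.

```shell
#!/bin/sh
# Hypothetical filter: read kernel log text on stdin and print lines that
# mention the mlx5 functions named in this advisory or an address of the
# form 6b6b6b6b6b6b6xxx. Exit status is that of grep (0 if any line matched).
scan_mlx5_indicators() {
    grep -E 'mlx5_(unload_one|pci_err_detected|detach_device)|6b6b6b6b6b6b6[0-9a-f]{3}'
}
```

It can be fed from any log source, for example `dmesg | scan_mlx5_indicators` or `journalctl -k | scan_mlx5_indicators`. A match is only a hint that warrants a closer look at the full trace.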
Detection Strategies
- Enable KASAN (Kernel Address Sanitizer) to detect use-after-free conditions during development and testing
- Monitor kernel logs for mlx5_core module errors, particularly during PCI error recovery events
- Implement system monitoring for unexpected kernel panics on servers with ConnectX network adapters
- Use kernel debugging tools like lockdep to detect lock acquisition on invalid memory
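Whether a test kernel was built with the debugging features mentioned above can be checked from its configuration file. This is a minimal sketch: the function name `check_debug_options` is hypothetical, and the choice of CONFIG_KASAN, CONFIG_SLUB_DEBUG, and CONFIG_PROVE_LOCKING (the lockdep-based lock checker) as the options to report is an assumption about what is most relevant here.

```shell
#!/bin/sh
# Hypothetical helper: report whether selected debug options are set to "y"
# in the kernel config file given as the first argument.
check_debug_options() {
    config_file=$1
    for opt in CONFIG_KASAN CONFIG_SLUB_DEBUG CONFIG_PROVE_LOCKING; do
        # Match the exact option name followed by "=y" at line start.
        if grep -q "^${opt}=y" "$config_file"; then
            echo "$opt enabled"
        else
            echo "$opt not enabled"
        fi
    done
}
```

On many distributions the configuration is installed as `/boot/config-$(uname -r)`, so a typical call would be `check_debug_options "/boot/config-$(uname -r)"`; the location varies, so treat that path as an assumption. Note that CONFIG_SLUB_DEBUG only makes slab poisoning available; whether poisoning is active depends on how the kernel was booted.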
Monitoring Recommendations
- Configure kernel crash dump collection (kdump) to capture detailed crash analysis for mlx5-related panics
- Set up alerting on kernel oops and panic events, filtering for mlx5_core module involvement
- Monitor PCI error recovery events on s390 systems with mlx5 network devices
- Review system stability metrics on servers using LAG configurations with Mellanox adapters
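Crash dump collection only works if a crash kernel is actually loaded. The kernel exposes this state in `/sys/kernel/kexec_crash_loaded`, which reads `1` when a crash kernel is in place. The sketch below wraps that check; the function name `crash_kernel_loaded` is hypothetical, and the optional path argument exists only so the check can be exercised against a test file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kexec crash kernel is loaded.
# The optional argument overrides the sysfs path (useful for testing).
crash_kernel_loaded() {
    flag_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}
```

A monitoring job could run `crash_kernel_loaded || echo "kdump is not armed"` on hosts with mlx5 adapters so that a later panic does not go uncaptured.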
How to Mitigate CVE-2025-68790
Immediate Actions Required
- Update the Linux kernel to a version containing the fix commits
- If immediate patching is not possible, consider temporarily disabling LAG configurations on affected systems
- Prioritize patching on s390 architecture systems where the vulnerability is most easily triggered
- Test PCI error recovery on the patched kernel in a staging environment before rolling the update out to production
Patch Information
The vulnerability has been addressed in the Linux kernel through the following commits:
- Kernel commit 6a107cfe9c99a079e578a4c5eb70038101a3599f
- Kernel commit d2495f529d60e8e8c43e6ad524089c38b8be7bc4
The fix ensures that hca_devcom_comp is cleared in the device's private data immediately after unregistering it during LAG teardown, preventing subsequent passes through the unload function from attempting to access freed memory.
Workarounds
- Avoid triggering PCI error recovery events on systems that cannot be immediately patched
- On s390 systems, coordinate maintenance windows to reduce exposure during PCI-related operations
- Consider disabling Link Aggregation features temporarily on critical systems until patches can be applied
- Enable kernel crash dump analysis to collect diagnostic information if crashes occur before patching
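The following commands help confirm whether a system is potentially exposed:
# Verify the running kernel version and check whether mlx5 modules are loaded
uname -r
lsmod | grep mlx5
# List bonding interfaces (shows all bonds, not only those backed by mlx5 ports)
ip link show type bond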
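The bond listing above shows every bonding device regardless of which driver backs its members. To narrow this down, sysfs can be used to find interfaces bound to mlx5_core that are attached to an upper device. This is a minimal sketch: the function name `list_mlx5_upper_devices` is hypothetical, the optional argument exists only for testing, and the upper device reported may be a bond, a bridge, or another stacked device, so the output still needs manual review.

```shell
#!/bin/sh
# Hypothetical helper: for each network interface whose driver is mlx5_core
# and which has a "master" link in sysfs, print "<interface> -> <upper device>".
list_mlx5_upper_devices() {
    net_root=${1:-/sys/class/net}
    for dev in "$net_root"/*; do
        # Skip virtual devices that have no driver link.
        [ -L "$dev/device/driver" ] || continue
        driver=$(basename "$(readlink "$dev/device/driver")")
        [ "$driver" = "mlx5_core" ] || continue
        # The "master" symlink is present only when the interface is enslaved.
        if [ -L "$dev/master" ]; then
            echo "$(basename "$dev") -> $(basename "$(readlink "$dev/master")")"
        fi
    done
}
```

Running `list_mlx5_upper_devices` with no arguments inspects the live system. If it prints nothing, no mlx5 interface is currently enslaved, although that alone does not prove the driver's LAG support is inactive.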