CVE-2026-24160 Overview
CVE-2026-24160 affects NVIDIA TensorRT-LLM (TRT-LLM) on all supported platforms. The vulnerability stems from an unchecked return value that leads to a null pointer dereference [CWE-690]. A local attacker can trigger this condition to crash the TensorRT-LLM process, resulting in denial of service. Exploitation requires local access and user interaction, which limits the practical attack surface. The flaw does not affect confidentiality or integrity, but the availability impact is high because the affected inference runtime terminates abnormally. NVIDIA published an advisory and a corresponding fix through its support portal.
Critical Impact
Successful exploitation crashes the TensorRT-LLM runtime, disrupting large language model inference workloads on affected NVIDIA platforms.
Affected Products
- NVIDIA TensorRT-LLM (all platforms)
- nvidia:tensorrt_llm component across supported releases
- AI inference workloads built on NVIDIA TRT-LLM
Discovery Timeline
- 2026-05-20 - CVE-2026-24160 published to the National Vulnerability Database
- 2026-05-20 - Last updated in the NVD
Technical Details for CVE-2026-24160
Vulnerability Analysis
The defect resides in NVIDIA TensorRT-LLM, a library that optimizes and executes large language model inference on NVIDIA GPUs. The code path invokes a function that returns a pointer but does not validate the return value before dereferencing it. When the underlying call fails and returns a null pointer, the subsequent dereference triggers a segmentation fault. The result is an abrupt termination of the inference process.
The issue is classified as [CWE-690]: Unchecked Return Value to NULL Pointer Dereference. The attack vector is local, with low attack complexity and no privileges required. User interaction is needed to deliver the malformed input or model artifact. Confidentiality and integrity are unaffected, while the availability impact is high.
Root Cause
The root cause is missing validation logic between a function returning a pointer and the code that consumes that pointer. The caller assumes the allocation or lookup always succeeds. Under specific input conditions reachable by a local user, the call returns null, and the dereference faults the process.
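The pattern behind CWE-690 can be illustrated with a small Python analogue. This is a sketch only: the `Engine` class, the `_ENGINES` registry, and the function names are hypothetical and are not taken from the TensorRT-LLM code base, whose affected function is not named in the advisory. In Python the unchecked use raises `AttributeError`; in C or C++ the equivalent null dereference typically ends in a segmentation fault.

```python
from typing import Optional


class Engine:
    """Hypothetical stand-in for a loaded inference engine."""

    def __init__(self, name: str) -> None:
        self.name = name

    def run(self, prompt: str) -> str:
        return f"{self.name}: {prompt}"


# Hypothetical registry; a real runtime would perform an allocation or lookup here.
_ENGINES = {"demo": Engine("demo")}


def find_engine(name: str) -> Optional[Engine]:
    """Return the engine, or None when the lookup fails."""
    return _ENGINES.get(name)


def run_unchecked(name: str, prompt: str) -> str:
    # Flawed pattern: the return value is used without checking for None.
    return find_engine(name).run(prompt)


def run_checked(name: str, prompt: str) -> str:
    # Safer pattern: validate the return value and fail in a controlled way.
    engine = find_engine(name)
    if engine is None:
        raise LookupError(f"engine {name!r} is not available")
    return engine.run(prompt)
```

The checked variant turns an uncontrolled fault into an error the caller can handle, which is the general shape of a fix for this weakness class.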
Attack Vector
An attacker with local access loads a crafted model, configuration, or input into a TensorRT-LLM workflow. User interaction is required, meaning the victim must execute or load the attacker-supplied artifact within the TRT-LLM runtime. No elevated privileges are needed. The exploit yields denial of service rather than code execution. Because no proof-of-concept code or public exploit is available, no exploitation code example is included in this article. Refer to the NVIDIA Support Article for vendor-supplied technical detail.
Detection Methods for CVE-2026-24160
Indicators of Compromise
- Unexpected segmentation faults or SIGSEGV signals terminating TensorRT-LLM processes
- Core dumps generated by TRT-LLM inference workers immediately after loading a user-supplied model or input batch
- Repeated process restarts of the inference service correlated with specific input artifacts
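As a rough illustration of the first indicator, a launcher written in Python can tell whether a child process was ended by a signal. The sketch below assumes a POSIX host, where `subprocess` reports termination by signal N as a return code of -N. The helper names are hypothetical, and the command passed to `run_and_classify` is whatever starts the inference worker in a given deployment.

```python
import signal
import subprocess
from typing import List, Optional


def fatal_signal(returncode: int) -> Optional[str]:
    """Name the signal that ended a child process, or None for an ordinary exit."""
    if returncode >= 0:
        return None  # the process exited on its own, with or without an error code
    try:
        return signal.Signals(-returncode).name
    except ValueError:
        # Signal number not known to this platform's enum.
        return f"signal {-returncode}"


def run_and_classify(cmd: List[str]) -> Optional[str]:
    """Run a command to completion and report the fatal signal, if any."""
    completed = subprocess.run(cmd, check=False)
    return fatal_signal(completed.returncode)
```

A result of `"SIGSEGV"` from `run_and_classify` would be consistent with the crash described here, although it does not by itself prove that this particular flaw was triggered.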
Detection Strategies
- Monitor host telemetry for abnormal termination of TensorRT-LLM processes and correlate with the user account that initiated the workload
- Inspect logs from AI inference orchestration platforms for crash loops tied to specific model files or prompt payloads
- Audit local users with access to TRT-LLM execution environments and review recently loaded model artifacts
Monitoring Recommendations
- Enable crash reporting and core dump collection on hosts running TensorRT-LLM
- Forward process termination events and GPU runtime errors to a centralized logging or SIEM platform
- Alert on repeated failures of the same inference worker process within short time windows
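The last recommendation can be sketched as a sliding-window counter. The class name, the threshold of three crashes, and the five-minute window below are illustrative assumptions, not values from the NVIDIA advisory. Timestamps are plain seconds, and the caller is expected to supply them from its own crash events in increasing order.

```python
from collections import deque
from typing import Deque


class CrashLoopDetector:
    """Flag a worker that crashes too often within a sliding time window."""

    def __init__(self, max_crashes: int = 3, window_seconds: float = 300.0) -> None:
        self.max_crashes = max_crashes
        self.window_seconds = window_seconds
        self._crash_times: Deque[float] = deque()

    def record_crash(self, timestamp: float) -> bool:
        """Record a crash; return True when the alert threshold is reached."""
        self._crash_times.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop crashes that fell out of the window.
        while self._crash_times and self._crash_times[0] < cutoff:
            self._crash_times.popleft()
        return len(self._crash_times) >= self.max_crashes
```

One detector instance per worker keeps unrelated crashes from being counted together. The boolean result can drive an alert forwarded to the logging or SIEM platform.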
How to Mitigate CVE-2026-24160
Immediate Actions Required
- Apply the NVIDIA-supplied update for TensorRT-LLM referenced in the vendor advisory
- Restrict local access to systems running TRT-LLM to trusted users and service accounts only
- Validate the provenance of all model files and inference inputs before loading them into TRT-LLM
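One possible way to check provenance before loading is to compare each artifact against a known-good SHA-256 digest. This is a minimal sketch that assumes trusted digests are distributed through a separate channel. The function names are hypothetical and are not part of TensorRT-LLM.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted(path: Path, expected_sha256: str) -> bool:
    """Return True only when the file matches the expected digest."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

A loader would call `is_trusted` first and refuse to pass the file to the runtime when it returns `False`. A digest check confirms that the file is the expected one; it does not show that the expected file is itself safe.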
Patch Information
NVIDIA has published remediation guidance and patched builds. Consult the NVIDIA Support Article for affected versions and fixed releases. Additional references are available in the NVD entry and the CVE.org record for CVE-2026-24160.
Workarounds
- Limit which local users can submit models or inputs to TRT-LLM until patching is complete
- Run TensorRT-LLM workers under process supervision so that a crashed worker is restarted automatically and the failure does not spread to the rest of the service
- Isolate inference workloads in dedicated containers or namespaces to reduce blast radius from a crashed process
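The supervision workaround can be sketched as a bounded restart loop with exponential backoff. Production deployments would more likely rely on an existing supervisor such as systemd or a container orchestrator, so the code below is illustrative only. The function name, the restart limit, and the backoff values are assumptions.

```python
import subprocess
import time
from typing import List


def supervise(cmd: List[str], max_restarts: int = 3,
              backoff_seconds: float = 1.0) -> List[int]:
    """Run cmd, restarting after abnormal exits; return every observed exit code."""
    exit_codes: List[int] = []
    for attempt in range(max_restarts + 1):
        code = subprocess.run(cmd, check=False).returncode
        exit_codes.append(code)
        if code == 0:
            break  # clean exit, nothing to restart
        if attempt < max_restarts:
            # Wait longer after each failure so a crash loop does not spin.
            time.sleep(backoff_seconds * (2 ** attempt))
    return exit_codes
```

Capping the number of restarts matters here: if a specific artifact reliably crashes the worker, unlimited restarts would only reproduce the denial of service in a loop.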
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.


