CVE-2026-34159 Overview
A critical remote code execution vulnerability affects llama.cpp, a popular C/C++ inference library for large language models. In versions prior to b8492, the RPC backend's deserialize_tensor() function skips all bounds validation when a tensor's buffer field is set to 0. An unauthenticated attacker can send crafted GRAPH_COMPUTE messages to read and write arbitrary process memory and ultimately achieve remote code execution.
Critical Impact
Unauthenticated attackers with TCP access to the RPC server port can achieve full ASLR bypass and remote code execution by combining pointer leaks from ALLOC_BUFFER/BUFFER_GET_BASE with arbitrary memory read/write capabilities.
Affected Products
- llama.cpp versions prior to b8492
- llama.cpp RPC backend deployments with exposed TCP ports
- Any system running a vulnerable llama.cpp RPC server that is reachable over the network
Discovery Timeline
- 2026-04-01 - CVE-2026-34159 published to NVD
- 2026-04-01 - Last updated in NVD database
Technical Details for CVE-2026-34159
Vulnerability Analysis
This vulnerability is classified under CWE-119 (Improper Restriction of Operations within the Bounds of a Memory Buffer). The flaw exists in the RPC backend's tensor deserialization logic, which fails to properly validate tensor buffer pointers before performing memory operations. When a specially crafted message sets the tensor buffer field to 0 (null), the code path bypasses critical bounds checking, allowing arbitrary memory access.
The attack is particularly severe because it requires no authentication: any attacker with TCP connectivity to the RPC server port can exploit it. The ALLOC_BUFFER and BUFFER_GET_BASE operations return pointer values to the client, which lets an attacker learn where memory is mapped and defeat Address Space Layout Randomization (ASLR). With ASLR bypassed, the arbitrary read/write primitive is sufficient for full remote code execution.
Root Cause
The root cause is that deserialize_tensor() in the RPC backend treats a buffer field of 0 as nothing to validate: the bounds checks run only when the tensor references a buffer, and a resulting tensor with a null buffer pointer was not rejected afterward. An attacker-controlled data address in a crafted GRAPH_COMPUTE message therefore reaches memory operations without any range check.
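The pattern described above can be illustrated with a simplified model. This is a sketch only: WireTensor, Region, accept_flawed and accept_hardened are hypothetical names invented for illustration and are not the llama.cpp definitions. It shows how a check that runs only when a buffer handle is present accepts any data address once the handle is 0.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins; these are NOT the llama.cpp types.
struct WireTensor {
    std::uint64_t buffer; // remote buffer handle; 0 means "no buffer"
    std::uint64_t data;   // address the tensor data is claimed to live at
    std::uint64_t size;   // number of bytes the tensor claims to span
};

struct Region {
    std::uint64_t base; // start of the allocated buffer
    std::uint64_t size; // length of the allocated buffer in bytes
};

// Overflow-safe test that [data, data + size) lies inside the region.
inline bool in_region(const Region & r, std::uint64_t data, std::uint64_t size) {
    return data >= r.base && size <= r.size && data - r.base <= r.size - size;
}

// Flawed pattern: validation runs only when a buffer handle is present,
// so a tensor with buffer == 0 is accepted with an unchecked data address.
inline bool accept_flawed(const WireTensor & t, const Region & r) {
    if (t.buffer != 0) {
        return in_region(r, t.data, t.size);
    }
    return true; // no handle: the bounds check is silently skipped
}

// Hardened pattern: a missing handle is itself a validation failure.
inline bool accept_hardened(const WireTensor & t, const Region & r) {
    return t.buffer != 0 && in_region(r, t.data, t.size);
}
```

In this model, a tensor with buffer set to 0 passes accept_flawed regardless of its data and size fields, while accept_hardened rejects it before any range arithmetic is attempted.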
Attack Vector
The attack is network-based and requires no user interaction or privileges. An attacker must have TCP access to the RPC server port. The attack sequence involves:
- Establishing a TCP connection to the llama.cpp RPC server
- Sending ALLOC_BUFFER and BUFFER_GET_BASE messages to leak pointer information and bypass ASLR
- Crafting a malicious GRAPH_COMPUTE message with the tensor buffer field set to 0
- Exploiting the missing bounds validation to perform arbitrary memory read/write operations
- Achieving remote code execution by overwriting critical memory structures
The following patch was applied to address this vulnerability:
  const rpc_tensor * tensor = it_ptr->second;
  struct ggml_tensor * result = deserialize_tensor(ctx, tensor);
- if (result == nullptr) {
+ if (result == nullptr || result->buffer == nullptr) {
+     GGML_LOG_ERROR("[%s] invalid tensor: null %s (id=%" PRIu64 ")\n",
+                    __func__, result == nullptr ? "tensor" : "buffer", id);
      return nullptr;
  }
  tensor_map[id] = result;
Source: GitHub Commit
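To make the effect of the patch concrete, the following self-contained sketch mirrors only the shape of the patched call site. Buffer, Tensor and checked_insert are hypothetical stand-ins, and the pointer passed as `deserialized` plays the role of the deserialize_tensor() result; the real code uses ggml types and GGML_LOG_ERROR.

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>

// Hypothetical stand-ins that mirror only the shape of the patched code.
struct Buffer {};
struct Tensor {
    Buffer * buffer; // nullptr when the wire tensor carried buffer == 0
};

// Caches the tensor under `id` and returns it, or returns nullptr when the
// tensor is missing or has no backing buffer (the case the patch rejects).
inline Tensor * checked_insert(std::unordered_map<std::uint64_t, Tensor *> & tensor_map,
                               std::uint64_t id, Tensor * deserialized) {
    if (deserialized == nullptr || deserialized->buffer == nullptr) {
        std::fprintf(stderr, "invalid tensor: null %s (id=%llu)\n",
                     deserialized == nullptr ? "tensor" : "buffer",
                     static_cast<unsigned long long>(id));
        return nullptr;
    }
    tensor_map[id] = deserialized;
    return deserialized;
}
```

The point of rejecting before the map insertion is that a tensor without a backing buffer never becomes reachable by later graph operations in this model.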
Detection Methods for CVE-2026-34159
Indicators of Compromise
- Unusual TCP connections to llama.cpp RPC server ports from untrusted sources
- Abnormal GRAPH_COMPUTE messages containing tensors with null buffer fields
- Memory access violations or crashes in the llama.cpp process
- Unexpected process behavior or child process spawning from the RPC server
Detection Strategies
- Monitor network traffic for malformed RPC messages targeting llama.cpp services
- Implement application-level logging for ALLOC_BUFFER, BUFFER_GET_BASE, and GRAPH_COMPUTE operations
- Deploy memory integrity monitoring to detect unauthorized memory access patterns
- Use intrusion detection systems with rules that flag RPC traffic from untrusted sources, in particular GRAPH_COMPUTE messages whose tensors carry a buffer field of 0
Monitoring Recommendations
- Enable verbose logging in llama.cpp RPC backend to capture tensor deserialization errors
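- On patched builds (b8492 or later), monitor for the rejection message added by the fix, which contains "invalid tensor: null buffer"; its presence may indicate a blocked exploitation attempt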
- Track connection patterns to RPC server ports for reconnaissance activity
- Implement network segmentation to limit exposure of RPC services
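As one way to act on the log-monitoring recommendation, the sketch below scans captured log text for the rejection message that the patch introduces. The function name find_null_tensor_rejections is hypothetical, and the sketch assumes the server's log output has already been collected into a string; how logs are captured depends on the deployment. The marker matches both the "null tensor" and "null buffer" variants of the format string shown in the patch.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns every log line that contains the rejection message added by the
// patch. Only patched builds emit this message, so an empty result on an
// unpatched server says nothing about whether it was attacked.
inline std::vector<std::string> find_null_tensor_rejections(const std::string & log_text) {
    static const std::string marker = "invalid tensor: null ";
    std::vector<std::string> hits;
    std::istringstream log(log_text);
    std::string line;
    while (std::getline(log, line)) {
        if (line.find(marker) != std::string::npos) {
            hits.push_back(line);
        }
    }
    return hits;
}
```

Matched lines include the tensor id from the log format, which can help correlate repeated attempts from the same session.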
How to Mitigate CVE-2026-34159
Immediate Actions Required
- Upgrade llama.cpp to version b8492 or later immediately
- Restrict network access to RPC server ports using firewall rules
- Implement network segmentation to isolate LLM inference services
- Audit logs for any indicators of exploitation attempts
Patch Information
The vulnerability has been patched in llama.cpp version b8492. The fix adds an explicit null check on the buffer pointer of the tensor returned by deserialize_tensor(), so a tensor whose buffer field is 0 is rejected instead of bypassing bounds validation. For detailed patch information, refer to the GitHub Pull Request #20908 and the GitHub Security Advisory.
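When auditing a fleet, it can help to compare each deployment's build tag against the fixed version. The sketch below assumes build tags have the form "b" followed by a decimal number that increases with each release, as the "prior to b8492" wording suggests; parse_build_tag and is_patched_build are hypothetical helper names, not part of llama.cpp.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Parses tags of the assumed form "b<digits>" (for example "b8492").
// Returns the build number, or -1 if the tag does not match that form.
inline long parse_build_tag(const std::string & tag) {
    // The length cap (at most 9 digits) keeps the result within a 32-bit long.
    if (tag.size() < 2 || tag.size() > 10 || tag[0] != 'b') return -1;
    long n = 0;
    for (std::size_t i = 1; i < tag.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(tag[i]);
        if (!std::isdigit(c)) return -1;
        n = n * 10 + (c - '0');
    }
    return n;
}

// True only for tags that parse and are at or above the fixed build.
// Unparseable tags are treated as unpatched so they get reviewed by hand.
inline bool is_patched_build(const std::string & tag) {
    return parse_build_tag(tag) >= 8492;
}
```

Treating unknown tags as unpatched is a deliberately conservative choice: custom or locally built binaries may not carry a recognizable tag and should be checked against the commit directly.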
Workarounds
- Block external access to llama.cpp RPC server ports at the network perimeter
- Deploy RPC services only on localhost or trusted internal networks
- Use VPN or SSH tunneling for remote access to RPC services instead of direct exposure
- Implement additional network-level authentication in front of RPC services
# Example: Restrict RPC server port access using iptables
iptables -A INPUT -p tcp --dport <rpc_port> -s 127.0.0.1 -j ACCEPT
iptables -A INPUT -p tcp --dport <rpc_port> -s <trusted_network>/24 -j ACCEPT
iptables -A INPUT -p tcp --dport <rpc_port> -j DROP
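The same allowlist logic can also be enforced in a proxy or wrapper placed in front of the RPC service. The sketch below is a minimal illustration under stated assumptions: it handles IPv4 only, relies on the POSIX inet_pton call, and the helper names parse_ipv4, in_cidr and peer_allowed are hypothetical. It is not a feature of the llama.cpp RPC server itself.

```cpp
#include <arpa/inet.h>   // inet_pton, ntohl (POSIX)
#include <netinet/in.h>  // in_addr
#include <sys/socket.h>  // AF_INET
#include <cstdint>
#include <string>

// Parses dotted-quad IPv4 text into a host-order integer; false on bad input.
inline bool parse_ipv4(const std::string & text, std::uint32_t & out) {
    in_addr addr{};
    if (inet_pton(AF_INET, text.c_str(), &addr) != 1) return false;
    out = ntohl(addr.s_addr);
    return true;
}

// True when `ip` falls inside `network`/`prefix_len` (prefix_len in 0..32).
inline bool in_cidr(const std::string & ip, const std::string & network, int prefix_len) {
    if (prefix_len < 0 || prefix_len > 32) return false;
    std::uint32_t ip_v = 0;
    std::uint32_t net_v = 0;
    if (!parse_ipv4(ip, ip_v) || !parse_ipv4(network, net_v)) return false;
    // A shift by 32 would be undefined, so the /0 case is handled separately.
    const std::uint32_t mask =
        prefix_len == 0 ? 0u : ~std::uint32_t{0} << (32 - prefix_len);
    return (ip_v & mask) == (net_v & mask);
}

// Mirrors the three rules above: loopback address, trusted /24, else deny.
inline bool peer_allowed(const std::string & peer_ip, const std::string & trusted_net) {
    return peer_ip == "127.0.0.1" || in_cidr(peer_ip, trusted_net, 24);
}
```

As with the iptables rules, anything that fails to parse or match is denied by default, so malformed peer addresses never fall through to an accept.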
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.


