CVE-2026-21869 Overview
A critical out-of-bounds write vulnerability has been identified in llama.cpp, the popular LLM inference library written in C/C++. The vulnerability exists in the server's completion endpoints where the n_discard parameter is parsed directly from JSON input without validation to ensure non-negative values. When a negative value is supplied and the context fills up, the llama_memory_seq_rm and llama_memory_seq_add functions receive a reversed range and negative offset, causing deterministic out-of-bounds memory writes during token evaluation.
Critical Impact
This vulnerability enables deterministic memory corruption that can crash the llama.cpp server process or potentially enable remote code execution (RCE) against systems running vulnerable versions.
Affected Products
- llama.cpp commits up to and including 55d4206c8
- llama.cpp server instances with completion endpoints exposed
Discovery Timeline
- 2026-01-08 - CVE-2026-21869 published to NVD
- 2026-01-08 - Last updated in NVD database
Technical Details for CVE-2026-21869
Vulnerability Analysis
This vulnerability stems from insufficient input validation in llama.cpp's server completion endpoints. The n_discard parameter, which controls how many tokens should be discarded from the context during inference, is accepted directly from user-supplied JSON without bounds checking. The absence of validation for negative values creates a dangerous condition where memory operations can be manipulated by an attacker.
When a user submits a request with a negative n_discard value and the model's context buffer fills up during processing, the internal memory management functions llama_memory_seq_rm and llama_memory_seq_add are invoked with corrupted parameters. The negative value causes range calculations to invert, resulting in a negative offset being passed to the token evaluation loop. This triggers writes to memory locations outside the intended buffer boundaries.
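The inverted arithmetic can be seen in a minimal sketch. This is not llama.cpp source: the function below and the call shapes it models for llama_memory_seq_rm / llama_memory_seq_add are assumptions based on the behavior the advisory describes.

```python
def context_shift(n_keep: int, n_discard: int):
    """Illustrative model of the server's context-shift arithmetic
    (NOT the actual llama.cpp implementation)."""
    # Range handed to llama_memory_seq_rm: delete tokens in [p0, p1)
    p0, p1 = n_keep, n_keep + n_discard
    # Offset handed to llama_memory_seq_add: shift the surviving tokens
    delta = -n_discard
    return (p0, p1), delta

# Normal request: a well-formed range and a leftward shift.
print(context_shift(n_keep=4, n_discard=32))
# → ((4, 36), -32)

# Malicious request: n_discard = -32 inverts the removal range
# (p1 < p0); downstream position arithmetic then falls outside
# the intended buffer.
print(context_shift(n_keep=4, n_discard=-32))
# → ((4, -28), 32)
```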
The vulnerability is classified under CWE-787 (Out-of-bounds Write), which represents one of the most dangerous vulnerability categories due to its potential to corrupt critical data structures, crash applications, or enable arbitrary code execution.
Root Cause
The root cause is the complete absence of input validation for the n_discard parameter in the JSON parsing logic of the completion endpoint handlers. The parameter is expected to be a non-negative integer representing the number of tokens to discard, but the code fails to enforce this constraint before using the value in memory offset calculations.
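The missing check is straightforward to express. The sketch below is a hypothetical server-side validator, not the actual llama.cpp parsing code, showing the non-negative constraint the handlers omit.

```python
import json

def parse_n_discard(body: str, default: int = 0) -> int:
    """Hypothetical check: extract n_discard from a completion request
    and enforce the non-negative-integer constraint the server omits."""
    params = json.loads(body)
    n_discard = params.get("n_discard", default)
    # Reject non-integers (bool is an int subclass in Python, so exclude it)
    if not isinstance(n_discard, int) or isinstance(n_discard, bool):
        raise ValueError("n_discard must be an integer")
    if n_discard < 0:
        raise ValueError("n_discard must be non-negative")
    return n_discard

print(parse_n_discard('{"n_discard": 32}'))   # → 32
print(parse_n_discard('{"prompt": "hi"}'))    # → 0 (default)
```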
Attack Vector
An attacker can exploit this vulnerability remotely over the network by sending a crafted HTTP request to the llama.cpp server's completion endpoint. No authentication or special privileges are required; the only interaction needed is for the server to process the malicious request. The attacker constructs a JSON payload containing a negative n_discard value and submits it to the server. When the server processes this request and the context fills during inference, the out-of-bounds write occurs.
The vulnerability manifests in the memory management functions when processing token sequences. By supplying a negative n_discard value, the attacker causes the range parameters passed to llama_memory_seq_rm and llama_memory_seq_add to become inverted, and the offset calculation produces a negative value. This results in writes to memory addresses before the intended buffer, corrupting adjacent data structures. For technical details, refer to the GitHub Security Advisory.
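The shape of such a request body can be sketched as follows. Every field except n_discard is an illustrative placeholder; the write only fires once the context actually fills during inference.

```python
import json

# Illustrative request body for the completion endpoint. The prompt is
# a placeholder chosen to help fill the context; the negative n_discard
# is the value the server fails to reject.
payload = {
    "prompt": "A" * 4096,
    "n_discard": -64,
}
body = json.dumps(payload)
print(body[-20:])
```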
Detection Methods for CVE-2026-21869
Indicators of Compromise
- Unexpected crashes or segmentation faults in llama.cpp server processes
- HTTP requests to completion endpoints containing negative integer values for n_discard
- Anomalous memory access patterns or core dumps from llama.cpp processes
- Error logs indicating memory corruption or invalid memory operations
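The second indicator above can be checked retroactively. The sketch below scans captured request lines for negative n_discard values; the log format shown is hypothetical, so the regex would need adapting to whatever your proxy or server actually records.

```python
import re

# Matches a negative integer assigned to "n_discard" in a JSON body.
NEG_N_DISCARD = re.compile(r'"n_discard"\s*:\s*-\d+')

def find_suspicious(lines):
    """Return the log lines containing a negative n_discard value."""
    return [ln for ln in lines if NEG_N_DISCARD.search(ln)]

logs = [
    'POST /completion {"prompt":"hi","n_discard":16}',
    'POST /completion {"prompt":"hi","n_discard": -64}',
]
print(find_suspicious(logs))
```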
Detection Strategies
- Implement web application firewall (WAF) rules to inspect JSON payloads for negative n_discard values
- Monitor llama.cpp server logs for unusual request patterns or processing errors
- Deploy application-level input validation to reject requests with negative parameter values
- Use runtime memory protection tools to detect out-of-bounds memory access attempts
Monitoring Recommendations
- Enable verbose logging on llama.cpp server instances to capture request parameters
- Implement network traffic analysis to identify suspicious patterns targeting completion endpoints
- Deploy crash monitoring and automated alerting for llama.cpp processes
- Conduct regular log reviews for requests containing unexpected parameter values
How to Mitigate CVE-2026-21869
Immediate Actions Required
- Restrict network access to llama.cpp server instances to trusted sources only
- Implement input validation at the network perimeter to reject negative n_discard values
- Consider temporarily disabling public access to completion endpoints until a patch is available
- Deploy network segmentation to isolate AI inference servers from critical infrastructure
Patch Information
There is no official fix available at the time of publication. Organizations should monitor the llama.cpp GitHub repository for security updates and apply patches immediately when released. The vulnerability affects commits 55d4206c8 and prior.
Workarounds
- Implement a reverse proxy or API gateway with input validation to sanitize n_discard parameters
- Use network-level access controls to limit which clients can reach the completion endpoints
- Deploy Web Application Firewall rules to block requests with negative integer values in JSON payloads
- Run llama.cpp server processes in sandboxed environments to limit the impact of potential exploitation
# Example nginx configuration to block negative n_discard values
# Add to the server block protecting the llama.cpp endpoint.
# Caveat: nginx evaluates "if" during the rewrite phase, before the
# request body has been read, so $request_body is typically empty at
# this point. Reliable body inspection needs njs, OpenResty/Lua, or a
# WAF module; this snippet illustrates the intended rule only.
location /completion {
    # Reject requests whose body contains a negative n_discard
    if ($request_body ~* "\"n_discard\"\s*:\s*-[0-9]+") {
        return 400;
    }
    proxy_pass http://llama_backend;
}


