CVE-2026-21869 Overview
A critical out-of-bounds write vulnerability has been identified in llama.cpp, the popular LLM inference library written in C/C++. The vulnerability exists in the server's completion endpoints where the n_discard parameter is parsed directly from JSON input without validation to ensure non-negative values. When a negative value is supplied and the context fills up, the llama_memory_seq_rm and llama_memory_seq_add functions receive a reversed range and negative offset, causing deterministic out-of-bounds memory writes during token evaluation.
Critical Impact
This vulnerability enables deterministic memory corruption that can crash the llama.cpp server process or potentially enable remote code execution (RCE) against systems running vulnerable versions.
Affected Products
- llama.cpp commits up to and including 55d4206c8
- llama.cpp server instances with completion endpoints exposed
Discovery Timeline
- 2026-01-08 - CVE-2026-21869 published to NVD
- 2026-01-08 - Last updated in NVD database
Technical Details for CVE-2026-21869
Vulnerability Analysis
This vulnerability stems from insufficient input validation in llama.cpp's server completion endpoints. The n_discard parameter, which controls how many tokens should be discarded from the context during inference, is accepted directly from user-supplied JSON without bounds checking. The absence of validation for negative values creates a dangerous condition where memory operations can be manipulated by an attacker.
When a user submits a request with a negative n_discard value and the model's context buffer fills up during processing, the internal memory management functions llama_memory_seq_rm and llama_memory_seq_add are invoked with corrupted parameters. The negative value causes range calculations to invert, resulting in a negative offset being passed to the token evaluation loop. This triggers writes to memory locations outside the intended buffer boundaries.
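The inverted arithmetic can be seen in a minimal sketch. This is not llama.cpp source: the function below and the call shapes it models for llama_memory_seq_rm / llama_memory_seq_add are assumptions based on the behavior the advisory describes.

```python
def context_shift(n_keep: int, n_discard: int):
    """Illustrative model of the server's context-shift arithmetic
    (NOT the actual llama.cpp implementation)."""
    # Range handed to llama_memory_seq_rm: delete tokens in [p0, p1)
    p0, p1 = n_keep, n_keep + n_discard
    # Offset handed to llama_memory_seq_add: shift the surviving tokens
    delta = -n_discard
    return (p0, p1), delta

# Normal request: a well-formed range and a leftward shift.
print(context_shift(n_keep=4, n_discard=32))
# → ((4, 36), -32)

# Malicious request: n_discard = -32 inverts the removal range
# (p1 < p0); downstream position arithmetic then falls outside
# the intended buffer.
print(context_shift(n_keep=4, n_discard=-32))
# → ((4, -28), 32)
```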
The vulnerability is classified under CWE-787 (Out-of-bounds Write), which represents one of the most dangerous vulnerability categories due to its potential to corrupt critical data structures, crash applications, or enable arbitrary code execution.
Root Cause
The root cause is the complete absence of input validation for the n_discard parameter in the JSON parsing logic of the completion endpoint handlers. The parameter is expected to be a non-negative integer representing the number of tokens to discard, but the code fails to enforce this constraint before using the value in memory offset calculations.
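The missing check is straightforward to express. The sketch below is a hypothetical server-side validator, not the actual llama.cpp parsing code, showing the non-negative constraint the handlers omit.

```python
import json

def parse_n_discard(body: str, default: int = 0) -> int:
    """Hypothetical check: extract n_discard from a completion request
    and enforce the non-negative-integer constraint the server omits."""
    params = json.loads(body)
    n_discard = params.get("n_discard", default)
    # Reject non-integers (bool is an int subclass in Python, so exclude it)
    if not isinstance(n_discard, int) or isinstance(n_discard, bool):
        raise ValueError("n_discard must be an integer")
    if n_discard < 0:
        raise ValueError("n_discard must be non-negative")
    return n_discard

print(parse_n_discard('{"n_discard": 32}'))   # → 32
print(parse_n_discard('{"prompt": "hi"}'))    # → 0 (default)
```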
Attack Vector
An attacker can exploit this vulnerability remotely over the network by sending a crafted HTTP request to the llama.cpp server's completion endpoint. No authentication or special privileges are required; the only interaction needed is for the server to process the malicious request. The attacker constructs a JSON payload containing a negative n_discard value and submits it to the server. When the server processes this request and the context fills during inference, the out-of-bounds write occurs.
The vulnerability manifests in the memory management functions when processing token sequences. By supplying a negative n_discard value, the attacker causes the range parameters passed to llama_memory_seq_rm and llama_memory_seq_add to become inverted, and the offset calculation produces a negative value. This results in writes to memory addresses before the intended buffer, corrupting adjacent data structures. For technical details, refer to the GitHub Security Advisory.
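The shape of such a request body can be sketched as follows. Every field except n_discard is an illustrative placeholder; the write only fires once the context actually fills during inference.

```python
import json

# Illustrative request body for the completion endpoint. The prompt is
# a placeholder chosen to help fill the context; the negative n_discard
# is the value the server fails to reject.
payload = {
    "prompt": "A" * 4096,
    "n_discard": -64,
}
body = json.dumps(payload)
print(body[-20:])
```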
Detection Methods for CVE-2026-21869
Indicators of Compromise
- Unexpected crashes or segmentation faults in llama.cpp server processes
- HTTP requests to completion endpoints containing negative integer values for n_discard
- Anomalous memory access patterns or core dumps from llama.cpp processes
- Error logs indicating memory corruption or invalid memory operations
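The second indicator above can be checked retroactively. The sketch below scans captured request lines for negative n_discard values; the log format shown is hypothetical, so the regex would need adapting to whatever your proxy or server actually records.

```python
import re

# Matches a negative integer assigned to "n_discard" in a JSON body.
NEG_N_DISCARD = re.compile(r'"n_discard"\s*:\s*-\d+')

def find_suspicious(lines):
    """Return the log lines containing a negative n_discard value."""
    return [ln for ln in lines if NEG_N_DISCARD.search(ln)]

logs = [
    'POST /completion {"prompt":"hi","n_discard":16}',
    'POST /completion {"prompt":"hi","n_discard": -64}',
]
print(find_suspicious(logs))
```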
Detection Strategies
- Implement web application firewall (WAF) rules to inspect JSON payloads for negative n_discard values
- Monitor llama.cpp server logs for unusual request patterns or processing errors
- Deploy application-level input validation to reject requests with negative parameter values
- Use runtime memory protection tools to detect out-of-bounds memory access attempts
Monitoring Recommendations
- Enable verbose logging on llama.cpp server instances to capture request parameters
- Implement network traffic analysis to identify suspicious patterns targeting completion endpoints
- Deploy crash monitoring and automated alerting for llama.cpp processes
- Conduct regular log reviews for requests containing unexpected parameter values
How to Mitigate CVE-2026-21869
Immediate Actions Required
- Restrict network access to llama.cpp server instances to trusted sources only
- Implement input validation at the network perimeter to reject negative n_discard values
- Consider temporarily disabling public access to completion endpoints until a patch is available
- Deploy network segmentation to isolate AI inference servers from critical infrastructure
Patch Information
There is no official fix available at the time of publication. Organizations should monitor the llama.cpp GitHub repository for security updates and apply patches immediately when released. The vulnerability affects commits 55d4206c8 and prior.
Workarounds
- Implement a reverse proxy or API gateway with input validation to sanitize n_discard parameters
- Use network-level access controls to limit which clients can reach the completion endpoints
- Deploy Web Application Firewall rules to block requests with negative integer values in JSON payloads
- Run llama.cpp server processes in sandboxed environments to limit the impact of potential exploitation
# Example nginx configuration to block negative n_discard values
# Add to the server block protecting the llama.cpp endpoint.
# Caveat: nginx evaluates "if" during the rewrite phase, before the
# request body has been read, so $request_body is typically empty at
# this point. Reliable body inspection needs njs, OpenResty/Lua, or a
# WAF module; this snippet illustrates the intended rule only.
location /completion {
    # Reject requests whose body contains a negative n_discard
    if ($request_body ~* "\"n_discard\"\s*:\s*-[0-9]+") {
        return 400;
    }
    proxy_pass http://llama_backend;
}


