CVE-2024-42479: ggerganov llama.cpp RCE Vulnerability

CVE-2024-42479 Overview

CVE-2024-42479 is a critical arbitrary address write vulnerability in llama.cpp, a popular C/C++ library for LLM (Large Language Model) inference. The rpc_tensor structure carries an unsafe data pointer member that an attacker can use to write to arbitrary memory addresses. The flaw resides in the RPC (Remote Procedure Call) functionality of llama.cpp and is fixed in release b3561.

Critical Impact
This vulnerability allows attackers to perform arbitrary memory writes via a network-accessible attack vector without requiring authentication or user interaction. Successful exploitation could lead to complete system compromise, including remote code execution, data corruption, or denial of service.

Affected Products

ggerganov llama.cpp (all builds prior to release b3561)

Discovery Timeline

2024-08-12 - CVE-2024-42479 published to NVD
2024-08-15 - Last updated in NVD database

Technical Details for CVE-2024-42479

Vulnerability Analysis

This vulnerability is classified under CWE-123 (Write-what-where Condition) and CWE-787 (Out-of-Bounds Write). The flaw resides in the RPC tensor handling mechanism of llama.cpp, which is used to distribute LLM inference workloads across networked systems.

The rpc_tensor structure contains a data pointer member that is not validated when incoming RPC requests are processed. An attacker who can send RPC requests to a vulnerable llama.cpp instance can set this pointer to an arbitrary address. When the server later writes through the pointer, attacker-controlled data lands at that address, which gives the attacker a write primitive over any writable memory in the process.

This type of write-what-where vulnerability is particularly dangerous as it can be leveraged to overwrite critical data structures, function pointers, or security-sensitive memory regions, potentially leading to full remote code execution.

Root Cause

The root cause of this vulnerability is insufficient validation of the data pointer member within the rpc_tensor structure. When RPC tensor data is received from a remote source, the library fails to verify that the pointer references a valid, expected memory region before performing write operations. This allows maliciously crafted RPC messages to inject arbitrary pointer values that are subsequently dereferenced during tensor operations.
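The missing check can be illustrated with a minimal sketch. It assumes the server knows the base address and size of the buffer it allocated for the client. WireTensor, ServerBuffer, and write_is_in_bounds are hypothetical names chosen for illustration and do not reproduce the real llama.cpp structures or the actual patch.

```cpp
#include <cstdint>

// Hypothetical stand-in for a tensor descriptor received over RPC.
// Field names are illustrative, not the real rpc_tensor layout.
struct WireTensor {
    std::uint64_t data;    // address claimed by the remote peer
    std::uint64_t nbytes;  // number of bytes the peer wants written there
};

// Hypothetical description of the buffer the server allocated for this client.
struct ServerBuffer {
    std::uint64_t base;  // first valid address
    std::uint64_t size;  // length in bytes
};

// Returns true only if [t.data, t.data + t.nbytes) lies entirely inside buf.
// The comparisons are ordered so that no addition is needed and nothing can overflow.
bool write_is_in_bounds(const WireTensor& t, const ServerBuffer& buf) {
    if (t.data < buf.base) {
        return false;  // pointer starts before the buffer
    }
    const std::uint64_t offset = t.data - buf.base;
    if (offset > buf.size) {
        return false;  // pointer starts past the end of the buffer
    }
    return t.nbytes <= buf.size - offset;  // the write must also end inside the buffer
}
```

A server following this pattern would reject the request whenever write_is_in_bounds returns false, before any copy takes place. Subtracting rather than adding matters here: a naive check of data + nbytes against the buffer end can wrap around for very large peer-supplied values and pass incorrectly.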

Attack Vector

The vulnerability is exploitable over the network without requiring authentication or user interaction. An attacker needs network access to a llama.cpp instance running with RPC functionality enabled. By sending specially crafted RPC requests containing malicious tensor data with manipulated pointer values, the attacker can trigger arbitrary memory writes.

The attack flow involves:

Establishing a connection to the target llama.cpp RPC endpoint
Crafting an RPC request with a malicious rpc_tensor structure
Setting the data pointer to a target memory address
Triggering a write operation that uses the manipulated pointer
Achieving arbitrary code execution or system compromise

Technical details and the security patch can be found in the GitHub Security Advisory GHSA-wcr5-566p-9cwj.

Detection Methods for CVE-2024-42479

Indicators of Compromise

Unexpected network connections to llama.cpp RPC ports from untrusted sources
Abnormal memory access patterns or segmentation faults in llama.cpp processes
Process crashes or unexpected behavior during tensor operations
Suspicious RPC traffic patterns with malformed or oversized tensor data
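The first indicator above depends on deciding which peers count as trusted. One way to sketch that decision is an IPv4 allowlist check that could be applied to peer addresses taken from connection logs. The Cidr type, the helper names, and the allowlist contents are assumptions made for illustration; nothing here is part of llama.cpp.

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>
#include <vector>

// Parses dotted-quad IPv4 text into a host-order 32-bit value.
// Returns nullopt for malformed input or trailing characters.
std::optional<std::uint32_t> parse_ipv4(const std::string& text) {
    unsigned a = 0, b = 0, c = 0, d = 0;
    char trailing = 0;
    // Exactly four conversions must succeed; a fifth means trailing junk.
    if (std::sscanf(text.c_str(), "%u.%u.%u.%u%c", &a, &b, &c, &d, &trailing) != 4) {
        return std::nullopt;
    }
    if (a > 255 || b > 255 || c > 255 || d > 255) {
        return std::nullopt;
    }
    return (a << 24) | (b << 16) | (c << 8) | d;
}

// Hypothetical allowlist entry: network address (host order) plus prefix length.
struct Cidr {
    std::uint32_t network;
    unsigned prefix_len;
};

bool in_cidr(std::uint32_t addr, const Cidr& range) {
    if (range.prefix_len == 0) return true;   // /0 matches every address
    if (range.prefix_len > 32) return false;  // invalid entry never matches
    const std::uint32_t mask = ~std::uint32_t{0} << (32 - range.prefix_len);
    return (addr & mask) == (range.network & mask);
}

// Peers that cannot be parsed are treated as untrusted.
bool is_trusted_peer(const std::string& peer_ip, const std::vector<Cidr>& allowlist) {
    const auto addr = parse_ipv4(peer_ip);
    if (!addr) return false;
    for (const Cidr& range : allowlist) {
        if (in_cidr(*addr, range)) return true;
    }
    return false;
}
```

With an allowlist of 127.0.0.0/8 and 10.0.0.0/8, for example, a logged peer of 203.0.113.7 would be flagged as untrusted. The sketch covers IPv4 only; IPv6 peers would need separate handling.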

Detection Strategies

Monitor network traffic to llama.cpp RPC endpoints for anomalous request patterns
Implement application-level logging for RPC tensor operations to detect invalid pointer values
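Keep memory protection mechanisms such as ASLR and DEP enabled; they do not detect the attack themselves, but they make exploitation harder and failed attempts more likely to surface as crashes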
Use runtime memory corruption detection tools in development and staging environments

Monitoring Recommendations

Enable detailed logging for all RPC operations in llama.cpp deployments
Configure network intrusion detection systems to alert on suspicious traffic to ML inference endpoints
Implement process monitoring to detect unexpected crashes or memory violations in llama.cpp instances
Regularly audit network access controls for systems running llama.cpp with RPC enabled

How to Mitigate CVE-2024-42479

Immediate Actions Required

Update llama.cpp to release b3561 or later immediately
Restrict network access to llama.cpp RPC endpoints using firewall rules
If updating is not immediately possible, disable RPC functionality until the patch can be applied
Audit logs for any suspicious RPC activity that may indicate exploitation attempts
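When auditing many deployments, the update check can be automated by comparing each build tag against b3561. The sketch below assumes tags follow the "b" plus decimal number form used in this advisory; parse_build_tag and is_patched_build are hypothetical helpers, and forks or custom builds with other naming schemes would need their own check.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// First release that contains the fix, per this advisory.
constexpr unsigned long kFirstFixedBuild = 3561;

// Extracts N from a tag of the form "bN" (for example "b3561").
// Returns nullopt for any other shape, including overly long numbers.
std::optional<unsigned long> parse_build_tag(const std::string& tag) {
    if (tag.size() < 2 || tag.size() > 10 || tag[0] != 'b') {
        return std::nullopt;
    }
    unsigned long n = 0;
    for (std::size_t i = 1; i < tag.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(tag[i]))) {
            return std::nullopt;
        }
        n = n * 10 + static_cast<unsigned long>(tag[i] - '0');
    }
    return n;
}

// Unknown or unparsable tags are reported as unpatched (fail closed).
bool is_patched_build(const std::string& tag) {
    const auto n = parse_build_tag(tag);
    return n && *n >= kFirstFixedBuild;
}
```

Failing closed is a deliberate choice in this sketch: a deployment whose tag cannot be parsed is reported as unpatched so that it receives a manual review.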

Patch Information

The vulnerability has been fixed in llama.cpp commit b72942fac998672a79a1ae3c03b340f7e629980b. Users should update to this commit or any subsequent version that includes this fix. The patch implements proper validation of the data pointer member in the rpc_tensor structure before performing write operations.

For detailed patch information, see the GitHub Commit Changes.

Workarounds

Disable RPC functionality in llama.cpp if not required for your deployment
Implement network segmentation to isolate llama.cpp instances from untrusted networks
Place an authenticated tunnel or TCP proxy (for example a VPN or SSH port forward) in front of RPC endpoints, since the RPC service provides no authentication of its own
Deploy intrusion prevention systems to detect and block exploitation attempts

```bash
# Configuration example - Restrict RPC access via firewall
# Block external access to llama.cpp RPC port (example using iptables)
iptables -A INPUT -p tcp --dport 50052 -s 127.0.0.1 -j ACCEPT
iptables -A INPUT -p tcp --dport 50052 -j DROP
```