CVE-2024-42478 Overview
CVE-2024-42478 is a critical out-of-bounds read vulnerability in llama.cpp, a popular open-source project that provides LLM (large language model) inference in C/C++. The rpc_tensor structure contains an unsafe data pointer member that an attacker can set to read from arbitrary memory addresses, potentially leading to information disclosure or system compromise.
Critical Impact
This network-accessible vulnerability allows unauthenticated attackers to read arbitrary memory locations remotely, potentially exposing sensitive data such as model weights, user inputs, or other contents of the server process's memory.
Affected Products
- ggerganov llama.cpp builds prior to release b3561
- llama.cpp RPC server deployments exposed to untrusted networks
- Applications built using vulnerable llama.cpp libraries
Discovery Timeline
- 2024-08-12 - CVE-2024-42478 published to NVD
- 2024-08-15 - Last updated in NVD database
Technical Details for CVE-2024-42478
Vulnerability Analysis
This out-of-bounds read vulnerability (CWE-125) stems from improper handling of the data pointer member within the rpc_tensor structure used by llama.cpp's RPC (Remote Procedure Call) functionality. The RPC server, designed to enable distributed LLM inference, fails to properly validate memory addresses passed through the tensor data structure, allowing attackers to specify arbitrary memory locations for read operations.
The vulnerability is particularly concerning because the RPC server previously bound to all network interfaces (0.0.0.0) by default, making it reachable from any network that could route to the host. An attacker exploiting this flaw could read sensitive information from the server process's memory, including model data, configuration details, or other runtime information.
Root Cause
The root cause lies in insufficient validation of the data pointer within the rpc_tensor structure. When processing RPC requests, the server does not verify that the pointer references valid, authorized memory regions before performing read operations. Additionally, the default network binding configuration (0.0.0.0) exposed this vulnerable functionality to network-based attackers without authentication.
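The missing validation described above can be illustrated with a minimal sketch. It assumes the server knows the base address and size of each buffer it allocated, and it rejects any client-supplied range that does not fall entirely inside that buffer. The `wire_tensor` and `server_buffer` types and the `tensor_within_buffer` function are hypothetical, simplified stand-ins and are not llama.cpp's actual definitions.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the wire-format tensor.
// The real rpc_tensor in llama.cpp carries more fields.
struct wire_tensor {
    uint64_t data;   // client-supplied address of the tensor payload
    uint64_t nbytes; // client-supplied payload size in bytes
};

// Hypothetical description of a buffer the server itself allocated.
struct server_buffer {
    uint64_t base; // start address of the allocation
    uint64_t size; // length of the allocation in bytes
};

// Returns true only if [data, data + nbytes) lies entirely inside the buffer.
// The check uses subtraction instead of data + nbytes so it cannot overflow.
bool tensor_within_buffer(const wire_tensor & t, const server_buffer & buf) {
    if (t.data < buf.base) {
        return false; // starts before the buffer
    }
    const uint64_t offset = t.data - buf.base;
    if (offset > buf.size) {
        return false; // starts past the end of the buffer
    }
    return t.nbytes <= buf.size - offset; // must also end inside the buffer
}
```

A server following this pattern would call the check before any read or write that uses the client-supplied pointer and would reject the request when it returns false.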
Attack Vector
The attack can be executed remotely over the network without requiring authentication or user interaction. An attacker can craft malicious RPC requests containing specially constructed rpc_tensor structures with arbitrary memory addresses in the data pointer field. When the server processes these requests, it reads from the attacker-specified memory locations, potentially leaking sensitive data back to the attacker.
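The following excerpts from the fix commit show the default bind address being changed and host address validation being added.
// Security patch in examples/rpc/rpc-server.cpp - Merge commit from fork
 #include <stdio.h>

 struct rpc_server_params {
-    std::string host        = "0.0.0.0";
+    std::string host        = "127.0.0.1";
     int         port        = 50052;
     size_t      backend_mem = 0;
 };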
Source: GitHub Commit Update
// Security patch in ggml/src/ggml-rpc.cpp - Merge commit from fork
         fprintf(stderr, "Failed to set SO_REUSEADDR\n");
         return nullptr;
     }
+    if (inet_addr(host) == INADDR_NONE) {
+        fprintf(stderr, "Invalid host address: %s\n", host);
+        return nullptr;
+    }
     struct sockaddr_in serv_addr;
     serv_addr.sin_family = AF_INET;
     serv_addr.sin_addr.s_addr = inet_addr(host);
Source: GitHub Commit Update
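The host check in the patch relies on `inet_addr`, which signals failure with `INADDR_NONE`, the same value it returns for the valid address 255.255.255.255. As an illustration only, and not what the patch does, the same validation can be sketched with POSIX `inet_pton`, which reports failure through its return value. The function name `is_valid_ipv4` is hypothetical.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

// Returns true if `host` is a well-formed dotted-quad IPv4 address.
// inet_pton() returns 1 on success, so no address value doubles as an
// error code, unlike inet_addr() and INADDR_NONE.
bool is_valid_ipv4(const char * host) {
    if (host == nullptr) {
        return false;
    }
    in_addr addr{};
    return inet_pton(AF_INET, host, &addr) == 1;
}
```

Either approach rejects strings such as hostnames or malformed addresses before they reach the `bind` call. The sketch handles IPv4 only, matching the `AF_INET` socket used in the excerpt.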
Detection Methods for CVE-2024-42478
Indicators of Compromise
- Unexpected network connections to the llama.cpp RPC server port (default: 50052) from external IP addresses
- Anomalous RPC requests containing unusual or malformed tensor data structures
- Memory access violations or segmentation faults in llama.cpp processes
- Unexplained data exfiltration patterns from systems running llama.cpp RPC servers
Detection Strategies
- Monitor network traffic to port 50052 for connections originating from untrusted sources
- Implement network intrusion detection rules to identify malformed RPC tensor requests
- Use endpoint detection to monitor llama.cpp processes for abnormal memory access patterns
- Deploy application-level logging to capture and analyze incoming RPC requests
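The first strategy above depends on deciding which source addresses count as untrusted. The sketch below shows one possible classification and assumes that loopback and RFC 1918 private ranges are trusted and that everything else should raise an alert. That policy and the function name `is_external_peer` are assumptions for illustration; a real deployment would substitute its own trusted ranges.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cstdint>

// Returns true if the IPv4 source address should be treated as external.
// Assumed policy: loopback and RFC 1918 private ranges are trusted;
// unparsable input is treated as external so that it is never ignored.
bool is_external_peer(const char * ip) {
    in_addr addr{};
    if (ip == nullptr || inet_pton(AF_INET, ip, &addr) != 1) {
        return true;
    }
    const uint32_t a = ntohl(addr.s_addr);      // host byte order
    const bool loopback = (a >> 24) == 127;     // 127.0.0.0/8
    const bool net10    = (a >> 24) == 10;      // 10.0.0.0/8
    const bool net172   = (a >> 20) == 0xAC1;   // 172.16.0.0/12
    const bool net192   = (a >> 16) == 0xC0A8;  // 192.168.0.0/16
    return !(loopback || net10 || net172 || net192);
}
```

A log processor or connection hook could call this on the peer address of each connection to port 50052 and raise an alert whenever it returns true.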
Monitoring Recommendations
- Enable verbose logging on llama.cpp RPC servers to track all incoming requests
- Configure network monitoring to alert on any external access to RPC server ports
- Implement host-based intrusion detection to identify memory read anomalies
- Regularly audit firewall rules to ensure RPC services are not exposed to untrusted networks
How to Mitigate CVE-2024-42478
Immediate Actions Required
- Update llama.cpp to release b3561 or later immediately
- Verify that RPC servers are bound to 127.0.0.1 (localhost) rather than 0.0.0.0
- Implement network segmentation to isolate llama.cpp RPC services from untrusted networks
- Audit existing deployments for signs of exploitation
Patch Information
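The vulnerability has been fixed in llama.cpp commit b72942fac998672a79a1ae3c03b340f7e629980b. The excerpts shown in this advisory change the default RPC server binding from 0.0.0.0 to 127.0.0.1, limiting exposure to local connections only, and add host address validation. These two changes reduce network exposure rather than validate the data pointer themselves, so a server explicitly bound to a non-loopback address should still be treated as reachable by attackers until it runs a patched build. Organizations should update to the patched version immediately. For detailed information, refer to the GitHub Security Advisory GHSA-5vm9-p64x-gqw9.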
Workarounds
- Bind the RPC server to localhost (127.0.0.1) by explicitly setting the host parameter
- Deploy firewall rules to block external access to the RPC server port (50052)
- Use network segmentation to restrict RPC server access to trusted internal networks only
- Consider disabling the RPC functionality entirely if distributed inference is not required
# Configuration example
# Ensure RPC server binds only to localhost
./rpc-server --host 127.0.0.1 --port 50052
# Add firewall rule to block external access (iptables example)
iptables -A INPUT -p tcp --dport 50052 -s 127.0.0.1 -j ACCEPT
iptables -A INPUT -p tcp --dport 50052 -j DROP