CVE-2024-2029: Mudler LocalAI RCE Vulnerability

CVE-2024-2029 Overview

A command injection vulnerability exists in the TranscriptEndpoint of mudler/localai, specifically within the audioToWav function used for converting audio files to WAV format for transcription. The vulnerability arises due to the lack of sanitization of user-supplied filenames before passing them to ffmpeg via a shell command, allowing an attacker to execute arbitrary commands on the host system. Successful exploitation could lead to unauthorized access, data breaches, or other detrimental impacts, depending on the privileges of the process executing the code.

Critical Impact
This command injection vulnerability allows unauthenticated remote attackers to execute arbitrary system commands on hosts running vulnerable versions of LocalAI, potentially leading to complete system compromise.

Affected Products

mudler localai (all versions prior to patch commit 31a4c9c9d3abc58de2bdc5305419181c8b33eb1c)

Discovery Timeline

2024-04-10 - CVE CVE-2024-2029 published to NVD
2025-07-15 - Last updated in NVD database

Technical Details for CVE-2024-2029

Vulnerability Analysis

This command injection vulnerability (CWE-78) targets the audio transcription functionality in LocalAI. The vulnerable code path exists in the audioToWav function within backend/go/transcribe/transcript.go. When processing audio files for transcription, the application constructs a shell command string by directly concatenating user-supplied filename parameters without proper sanitization. This allows attackers to inject arbitrary shell commands through specially crafted filenames.

The vulnerability is particularly severe because it requires no authentication, can be exploited remotely over the network, and grants attackers the ability to execute commands with the same privileges as the LocalAI process. Organizations running LocalAI as a service exposed to untrusted networks face significant risk of complete system compromise.

Root Cause

The root cause of this vulnerability is improper input validation and the use of shell command execution with unsanitized user input. The original implementation used a sh() function that passed a formatted string directly to /bin/sh -c, which interprets special characters and command separators. By constructing filenames containing shell metacharacters such as ;, |, $(), or backticks, attackers can break out of the intended command context and execute arbitrary commands.

Attack Vector

The attack is network-based and targets the TranscriptEndpoint API. An attacker can craft a malicious HTTP request containing a specially crafted filename in the audio file upload. When the server processes this request and attempts to convert the audio file using ffmpeg, the injected commands are executed by the underlying shell.

For example, a filename like audio.mp3; malicious_command # would result in the shell executing both the intended ffmpeg command and the injected malicious command.

The following code shows the security patch that addresses this vulnerability:

 	"github.com/go-skynet/LocalAI/core/schema"
 )
 
-func sh(c string) (string, error) {
-	cmd := exec.Command("/bin/sh", "-c", c)
+func runCommand(command []string) (string, error) {
+	cmd := exec.Command(command[0], command[1:]...)
 	cmd.Env = os.Environ()
-	o, err := cmd.CombinedOutput()
-	return string(o), err
+	out, err := cmd.CombinedOutput()
+	return string(out), err
 }
 
-// AudioToWav converts audio to wav for transcribe. It bashes out to ffmpeg
+// AudioToWav converts audio to wav for transcribe.
 // TODO: use https://github.com/mccoyst/ogg?
 func audioToWav(src, dst string) error {
-	out, err := sh(fmt.Sprintf("ffmpeg -i %s -format s16le -ar 16000 -ac 1 -acodec pcm_s16le %s", src, dst))
+    command := []string{"ffmpeg", "-i", src, "-format", "s16le", "-ar", "16000", "-ac", "1", "-acodec", "pcm_s16le", dst}
+	out, err := runCommand(command)
 	if err != nil {
 		return fmt.Errorf("error: %w out: %s", err, out)
 	}
-
 	return nil
 }

Source: GitHub Commit

The fix replaces the shell-based command execution with direct process execution using exec.Command() with an argument array, preventing shell interpretation of special characters in filenames.

Detection Methods for CVE-2024-2029

Indicators of Compromise

Unusual or unexpected child processes spawned by the LocalAI service process
HTTP requests to the transcription endpoint containing suspicious filename patterns with shell metacharacters (;, |, $(), backticks)
Unexpected network connections originating from the LocalAI server
Anomalous file system modifications or new files in unexpected locations

Detection Strategies

Monitor process creation events for unexpected command execution by the LocalAI process, particularly shell invocations with suspicious arguments
Implement web application firewall (WAF) rules to detect and block requests containing shell injection patterns in file upload parameters
Enable comprehensive logging on the transcription API endpoint and alert on malformed filename parameters
Deploy endpoint detection and response (EDR) solutions to identify command injection exploitation attempts

Monitoring Recommendations

Configure alerts for process trees showing LocalAI spawning unexpected child processes such as shells or system utilities
Monitor API access logs for requests to /v1/audio/transcriptions or similar transcription endpoints with unusual payloads
Implement network traffic analysis to detect potential data exfiltration or reverse shell connections from LocalAI hosts
Review system audit logs for privilege escalation attempts following potential exploitation

How to Mitigate CVE-2024-2029

Immediate Actions Required

Update LocalAI to a version containing commit 31a4c9c9d3abc58de2bdc5305419181c8b33eb1c or later
If immediate patching is not possible, restrict network access to the LocalAI service to trusted networks only
Implement input validation at the network perimeter to reject requests with potentially malicious filename patterns
Review logs for evidence of exploitation attempts and investigate any suspicious activity

Patch Information

The vulnerability has been addressed in the official LocalAI repository. The fix is available in commit 31a4c9c9d3abc58de2bdc5305419181c8b33eb1c. Organizations should update to the latest version of LocalAI that includes this security fix. Detailed information about the patch can be found in the GitHub commit and the Huntr bounty report.

Workarounds

Place LocalAI behind a reverse proxy that validates and sanitizes incoming request parameters
Implement network segmentation to isolate LocalAI instances from sensitive internal resources
Run LocalAI with minimal system privileges using containerization or dedicated service accounts with restricted permissions
Disable or block access to the transcription endpoint if the functionality is not required

bash

# Example: Restrict LocalAI access using iptables
# Allow only trusted internal network (adjust IP range as needed)
iptables -A INPUT -p tcp --dport 8080 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 8080 -j DROP