CVE-2024-58340: LangChain ReDoS Vulnerability

CVE-2024-58340 Overview

CVE-2024-58340 is a Regular Expression Denial of Service (ReDoS) vulnerability affecting LangChain versions up to and including 0.3.1. The vulnerability exists in the MRKLOutputParser.parse() method located in libs/langchain/langchain/agents/mrkl/output_parser.py. When parsing tool actions from model output, the parser applies a backtracking-prone regular expression that can be exploited to cause excessive CPU consumption and denial-of-service conditions.

An attacker who can supply or influence the parsed text—for example, via prompt injection in downstream applications that pass LLM output directly into MRKLOutputParser.parse()—can trigger this vulnerability by providing a specially crafted payload. This results in significant parsing delays and can render affected applications unresponsive.

Critical Impact
A network-accessible ReDoS vulnerability in LangChain's agent output parsing can cause denial of service through CPU exhaustion, affecting AI/ML applications that rely on the MRKL agent architecture.

Affected Products

LangChain versions up to and including 0.3.1
Applications using MRKLOutputParser.parse() for LLM output processing
AI/ML pipelines that pass untrusted LLM output to LangChain's MRKL agent parser

Discovery Timeline

2026-01-12 - CVE-2024-58340 published to NVD
2026-01-13 - Last updated in NVD database

Technical Details for CVE-2024-58340

Vulnerability Analysis

This vulnerability is classified under CWE-1333 (Inefficient Regular Expression Complexity). The core issue lies in the regular expression pattern used by the MRKLOutputParser class to extract tool actions from LLM-generated text. The regex contains patterns that exhibit exponential backtracking behavior when processing carefully crafted input strings.

In LangChain's MRKL (Modular Reasoning, Knowledge and Language) agent architecture, the output parser is responsible for interpreting the structured responses from language models. When the parser encounters malformed or adversarial input designed to exploit the regex's backtracking behavior, the pattern matching algorithm enters a state of catastrophic backtracking.

The attack is particularly concerning in AI/ML applications because the input to MRKLOutputParser.parse() often originates from LLM output, which can be influenced through prompt injection techniques. This creates an indirect attack vector where an attacker manipulates the LLM's response to contain a ReDoS payload that is then processed by the vulnerable parser.

Root Cause

The root cause of CVE-2024-58340 is the use of a backtracking-prone regular expression in the MRKLOutputParser.parse() method. Regular expressions with nested quantifiers or overlapping alternatives can exhibit exponential time complexity when the regex engine repeatedly backtracks to find a match against specially crafted input strings. The vulnerable code path is triggered when parsing text that causes the regex to explore an exponentially growing number of possible matches before ultimately failing.
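To make the mechanism concrete, the following minimal Python sketch times a deliberately pathological pattern, `(a+)+$`, against inputs that almost match. The pattern is a textbook example chosen purely for illustration and is an assumption here; it is not the expression used by MRKLOutputParser, and the helper name `time_failed_match` is hypothetical.

```python
import re
import time

# Textbook catastrophic-backtracking pattern (illustrative assumption;
# NOT the regex used by MRKLOutputParser).
PATHOLOGICAL = re.compile(r"(a+)+$")


def time_failed_match(n: int) -> float:
    """Return the seconds spent failing to match 'a' * n followed by 'b'."""
    text = "a" * n + "b"  # the trailing 'b' makes the overall match fail
    start = time.perf_counter()
    result = PATHOLOGICAL.match(text)
    elapsed = time.perf_counter() - start
    # The engine reports "no match" only after trying the ways of
    # splitting the run of 'a' characters between the nested quantifiers.
    assert result is None
    return elapsed


if __name__ == "__main__":
    for n in (10, 14, 18):
        print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

On a backtracking engine such as CPython's `re`, the measured time grows rapidly with each added character, which is why a short payload can be enough to stall a worker.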

Attack Vector

The attack vector for this vulnerability is network-based and requires no authentication or user interaction. An attacker can exploit this vulnerability through the following methods:

Direct Input Manipulation: If an application accepts user-controlled input that is passed to MRKLOutputParser.parse(), an attacker can directly submit a ReDoS payload.
Prompt Injection: In more sophisticated scenarios, an attacker can inject prompts that cause the underlying LLM to generate output containing ReDoS payloads. When this output is subsequently parsed by the vulnerable method, it triggers the denial of service.
Indirect Influence: Any pathway that allows an attacker to influence the text being parsed—including through data sources consumed by the LLM—represents a potential attack vector.

The attack causes excessive CPU consumption on the server processing the request, effectively creating a denial-of-service condition that can impact application availability.

Detection Methods for CVE-2024-58340

Indicators of Compromise

Abnormal CPU utilization spikes correlating with LangChain agent parsing operations
Unusually long response times for requests that invoke MRKL agent functionality
Application logs showing timeouts or hangs in MRKLOutputParser.parse() calls
Patterns in input data containing repetitive character sequences typical of ReDoS payloads

Detection Strategies

Monitor CPU usage patterns on systems running LangChain applications, particularly during LLM output parsing
Implement application-level logging to track the duration of MRKLOutputParser.parse() calls
Use web application firewalls (WAF) configured to detect ReDoS payload patterns in request content
Deploy runtime application self-protection (RASP) solutions capable of detecting algorithmic complexity attacks
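As a rough sketch of the duration logging suggested above, the wrapper below times any parse callable and emits a warning when a call exceeds a threshold. The names `timed_parse` and `warn_after`, the logger name, and the one-second default are all assumptions made for illustration; in a real application the callable would be something like `MRKLOutputParser().parse`.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical logger name; use whatever the application already configures.
logger = logging.getLogger("parse_timing")


def timed_parse(parse: Callable[[str], T], text: str, warn_after: float = 1.0) -> T:
    """Call parse(text) and log a warning if it runs longer than warn_after seconds."""
    start = time.perf_counter()
    try:
        return parse(text)
    finally:
        # Runs whether parse() returned or raised, so slow failures are logged too.
        elapsed = time.perf_counter() - start
        if elapsed > warn_after:
            logger.warning(
                "slow parse: %.3fs for %d input chars", elapsed, len(text)
            )
```

Logging the input length alongside the duration makes it easier to tell a genuinely large response apart from a short input that took disproportionately long, which is the pattern a ReDoS payload would tend to produce.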

Monitoring Recommendations

Set up alerting thresholds for abnormal CPU consumption on application servers handling LangChain workloads
Configure timeout mechanisms for parsing operations to prevent indefinite resource consumption
Implement request rate limiting on endpoints that invoke MRKL agent parsing functionality
Enable distributed tracing to identify slow parsing operations across microservice architectures

How to Mitigate CVE-2024-58340

Immediate Actions Required

Upgrade LangChain to a version newer than 0.3.1 that contains the fix for this vulnerability
Implement input validation and sanitization before passing data to MRKLOutputParser.parse()
Configure parsing timeouts to limit the maximum time allowed for regex processing
Consider implementing regex complexity analysis in CI/CD pipelines to prevent similar issues

Patch Information

Organizations should upgrade to the latest version of LangChain that addresses this vulnerability. Refer to the LangChain GitHub Repository for release notes and the VulnCheck Advisory for detailed remediation guidance. The Huntr Bounty Listing contains additional technical details about the vulnerability disclosure.

Workarounds

Implement a timeout wrapper around calls to MRKLOutputParser.parse() to prevent indefinite CPU consumption
Validate and limit the length of input strings before parsing to reduce attack surface
Consider using alternative parsing strategies that do not rely on the vulnerable regex pattern
Deploy the application behind a reverse proxy configured with request timeout limits

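python

# Example: signal-based timeout around MRKLOutputParser.parse()
# to limit CPU time spent on a ReDoS payload.
# SIGALRM is Unix-only and must be used from the main thread.
import signal

from langchain.agents.mrkl.output_parser import MRKLOutputParser

class ParseTimeoutError(Exception):
    """Raised when parsing exceeds the allowed time."""

def timeout_handler(signum, frame):
    raise ParseTimeoutError("Parsing timeout exceeded")

def parse_with_timeout(text, seconds=5):
    """Parse text, aborting after `seconds` (5 by default)."""
    previous = signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(seconds)
    try:
        return MRKLOutputParser().parse(text)
    finally:
        signal.alarm(0)  # cancel the pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler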
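The input-length workaround listed above can be sketched in the same spirit. The guard below rejects oversized text before it reaches the parser; `guarded_parse`, `InputTooLongError`, and the 10,000-character limit are hypothetical names and values chosen for illustration, not part of LangChain.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical limit; tune it to the longest legitimate model output.
MAX_PARSE_CHARS = 10_000


class InputTooLongError(ValueError):
    """Raised when text exceeds the configured length limit."""


def guarded_parse(
    parse: Callable[[str], T], text: str, max_chars: int = MAX_PARSE_CHARS
) -> T:
    """Call parse(text) only if text is at most max_chars characters long."""
    if len(text) > max_chars:
        raise InputTooLongError(
            f"refusing to parse {len(text)} chars (limit {max_chars})"
        )
    return parse(text)
```

A length cap bounds the worst case but may not remove it, since a short payload can still be expensive for a badly behaved pattern, so it is best combined with a timeout rather than used alone.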