CVE-2025-1752: LlamaIndex DoS Vulnerability

CVE-2025-1752 Overview

CVE-2025-1752 is a Denial of Service (DoS) vulnerability in the KnowledgeBaseWebReader class of the run-llama/llama_index project. The flaw affects llama_index releases up to and including v0.12.15, which corresponds to the llama-index-readers-web integration at version 0.3.5 and below. The get_article_urls function declares a max_depth parameter but never enforces it during recursive crawling. An attacker who can supply a crafted knowledge base URL can drive the crawler past Python's recursion limit. The resulting unhandled RecursionError terminates the Python process hosting the application.

Critical Impact
Remote, unauthenticated attackers can crash any application using KnowledgeBaseWebReader by triggering uncontrolled recursion in the web crawler, resulting in a full availability loss.

Affected Products

LlamaIndex llama-index-readers-web versions up to and including 0.3.5
run-llama/llama_index releases up to and including v0.12.15
Applications embedding the KnowledgeBaseWebReader class for web knowledge base ingestion

Discovery Timeline

2025-05-10 - CVE-2025-1752 published to NVD
2025-10-15 - Last updated in NVD database

Technical Details for CVE-2025-1752

Vulnerability Analysis

The vulnerability lives in llama-index-integrations/readers/llama-index-readers-web/llama_index/readers/web/knowledge_base/base.py. The get_article_urls method recursively crawls a knowledge base site to enumerate article URLs. The signature exposes a max_depth parameter defaulted to 100, suggesting bounded traversal. In practice, no depth counter is passed between recursive calls, so the bound is never evaluated.

When the crawler encounters a site with many internal links or cyclical link structures, each discovered link triggers another recursive invocation. Python enforces a default recursion limit of 1000 frames. Once exceeded, the interpreter raises RecursionError and the host process terminates if the exception is unhandled. The weakness is classified as [CWE-674] Uncontrolled Recursion.
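The failure mode can be reproduced in miniature without a browser or network access. The sketch below is not the library's code: LINKS is a hypothetical two-page site whose pages link to each other, and get_article_urls is a toy stand-in that, like the vulnerable method, accepts max_depth but never checks it.

```python
from typing import Dict, List

# Hypothetical in-memory site with a link cycle: /a -> /b -> /a -> ...
LINKS: Dict[str, List[str]] = {"/a": ["/b"], "/b": ["/a"]}


def get_article_urls(current_url: str, max_depth: int = 100) -> List[str]:
    """Toy stand-in for the vulnerable pattern: max_depth is never evaluated."""
    urls = [current_url]
    for link in LINKS.get(current_url, []):
        # No depth value travels with the call, so nothing stops the descent.
        urls.extend(get_article_urls(link, max_depth))
    return urls


def crawl_crashes(start_url: str) -> bool:
    """Return True if crawling from start_url ends in RecursionError."""
    try:
        get_article_urls(start_url)
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    print("crawl hit the recursion limit:", crawl_crashes("/a"))
```

The example catches the exception only to report it. In an application that does not catch it, the same error propagates to the top of the process.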

Root Cause

The root cause is missing enforcement of the documented max_depth parameter. The function accepts max_depth but never compares it against the current call depth. Without a depth accumulator, recursion proceeds until Python's interpreter limit is hit.

Attack Vector

Exploitation requires no authentication and no user interaction. An attacker provides a malicious or attacker-controlled URL to an application that processes web content via KnowledgeBaseWebReader. The crawler follows links from that root and recurses uncontrollably. The Python worker crashes, denying service to any pipeline, agent, or retrieval system depending on that process.

python

# Patched signature from commit 3c65db2947271de3bd1927dc66a044da385de4da
    def get_article_urls(
-        self, browser: Any, root_url: str, current_url: str, max_depth: int = 100
+        self,
+        browser: Any,
+        root_url: str,
+        current_url: str,
+        max_depth: int = 100,
+        depth: int = 0,
    ) -> List[str]:
        """
        Recursively crawl through the knowledge base to find a list of articles.

Source: run-llama/llama_index commit 3c65db2

Detection Methods for CVE-2025-1752

Indicators of Compromise

Python process crashes with RecursionError: maximum recursion depth exceeded in application logs
Sudden termination of LlamaIndex ingestion workers shortly after a KnowledgeBaseWebReader job starts
Unusually deep outbound HTTP request chains originating from a single crawler session targeting one host
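A first-pass check for the log indicator above could look like the following sketch. It assumes logs are available as plain text lines. The function name and the sample lines are illustrative and do not come from any real deployment.

```python
from typing import Iterable

# Message text CPython attaches to RecursionError.
MARKER = "RecursionError: maximum recursion depth exceeded"


def count_recursion_errors(lines: Iterable[str]) -> int:
    """Count log lines that contain the RecursionError marker."""
    return sum(1 for line in lines if MARKER in line)


if __name__ == "__main__":
    # Illustrative sample, not real log output.
    sample = [
        "INFO starting ingestion job",
        "RecursionError: maximum recursion depth exceeded while calling a Python object",
        "INFO worker restarted",
    ]
    print(count_recursion_errors(sample))
```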

Detection Strategies

Inventory Python environments and identify installations of llama-index-readers-web at version 0.3.5 or below via pip list or SBOM tooling.
Search source repositories for imports of KnowledgeBaseWebReader to enumerate affected applications.
Alert on repeated worker restarts or unhandled RecursionError exceptions in application telemetry.
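The version inventory step can also be scripted from inside a Python environment. The sketch below uses the standard library's importlib.metadata to read the installed version and compares it against 0.3.6. The helper names are hypothetical. The version parser is deliberately simple: it reads only leading numeric components and ignores pre-release suffixes.

```python
from importlib import metadata
from typing import Optional, Tuple

PACKAGE = "llama-index-readers-web"
FIXED_VERSION = (0, 3, 6)  # first patched release per this advisory


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version string such as '0.3.5'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_vulnerable(version_text: str) -> bool:
    """Return True if the version sorts below the fixed release."""
    return parse_version(version_text) < FIXED_VERSION


def installed_status() -> Optional[bool]:
    """Return True/False for the installed package, or None if it is absent."""
    try:
        return is_vulnerable(metadata.version(PACKAGE))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(PACKAGE, "vulnerable:", installed_status())
```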

Monitoring Recommendations

Monitor outbound HTTP request volume from ingestion services and flag bursts to a single domain that indicate runaway crawling.
Track process resource exhaustion and abnormal exit codes for containers hosting LlamaIndex workers.
Log and review URLs submitted to web reader endpoints, especially those accepting external or user-supplied input.

How to Mitigate CVE-2025-1752

Immediate Actions Required

Upgrade llama-index-readers-web to version 0.3.6 or later, which adds the missing depth accumulator.
Restrict which URLs may be passed to KnowledgeBaseWebReader to a vetted allowlist of trusted domains.
Run LlamaIndex web ingestion in isolated worker processes with automatic restart and resource limits.
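One way to apply the allowlist recommendation is to validate each URL before it reaches the reader. The following is a minimal sketch using urllib.parse from the standard library. ALLOWED_HOSTS and is_allowed_url are hypothetical names, and the hosts are placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of knowledge base hosts; replace with vetted domains.
ALLOWED_HOSTS = {"support.example.com", "help.example.org"}


def is_allowed_url(url: str) -> bool:
    """Accept only http(s) URLs whose exact hostname is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. a bad IPv6 literal.
        return False
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    return host in ALLOWED_HOSTS


if __name__ == "__main__":
    for candidate in (
        "https://support.example.com/articles",
        "https://support.example.com.attacker.test/",
    ):
        print(candidate, is_allowed_url(candidate))
```

Matching on the exact hostname, rather than a substring or suffix, keeps lookalike domains such as the second candidate from passing.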

Patch Information

The fix is delivered in commit 3c65db2947271de3bd1927dc66a044da385de4da and released as llama-index-readers-web version 0.3.6. The patch introduces a new depth: int = 0 argument that is incremented on each recursive call and checked against max_depth. See the GitHub commit and the Huntr bounty report for full details.
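The depth accumulator pattern described above can be illustrated on a toy link graph. This is a sketch of the general technique, not the patched library code: LINKS is a hypothetical two-page cyclic site, and the exact comparison used here (depth >= max_depth) is an assumption, since the excerpted diff shows only the new signature.

```python
from typing import Dict, List

# Hypothetical in-memory site with a link cycle: /a -> /b -> /a -> ...
LINKS: Dict[str, List[str]] = {"/a": ["/b"], "/b": ["/a"]}


def get_article_urls(
    current_url: str, max_depth: int = 100, depth: int = 0
) -> List[str]:
    """Toy crawler that stops descending once depth reaches max_depth."""
    if depth >= max_depth:  # assumed form of the bound check
        return []
    urls = [current_url]
    for link in LINKS.get(current_url, []):
        # The accumulator is incremented on every recursive call.
        urls.extend(get_article_urls(link, max_depth, depth + 1))
    return urls


if __name__ == "__main__":
    print(get_article_urls("/a", max_depth=3))
```

Even on a cyclic site the traversal now terminates, because every recursive call moves depth one step closer to max_depth.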

Workarounds

Subclass KnowledgeBaseWebReader and override get_article_urls to enforce a manual depth counter before recursion proceeds.
Lowering the recursion limit with sys.setrecursionlimit is not recommended; instead, isolate crawls in subprocesses with timeouts and memory caps.
Validate and sanitize all user-supplied URLs and reject inputs that resolve to untrusted or high-link-density sites.
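Subprocess isolation with a timeout can be sketched with the standard library's subprocess module. The run_isolated helper is a hypothetical name. The child here is a throwaway script that recurses without bound, standing in for a crawl job. Memory caps are not covered by this sketch and would need OS-level or container-level limits.

```python
import subprocess
import sys
from typing import List, Optional


def run_isolated(child_args: List[str], timeout_s: float = 60.0) -> Optional[int]:
    """Run a child Python process; return its exit code, or None on timeout."""
    try:
        completed = subprocess.run(
            [sys.executable, *child_args],
            capture_output=True,
            timeout=timeout_s,  # the child is killed if it exceeds this
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for a crawl that recurses without bound.
    runaway = "def crawl():\n    return crawl()\ncrawl()"
    code = run_isolated(["-c", runaway], timeout_s=30.0)
    print("child exit code:", code, "(parent still running)")
```

Because the recursion happens in the child, an unhandled RecursionError ends only that process, and the parent can log the non-zero exit code and carry on.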

bash

# Upgrade to the patched release
pip install --upgrade 'llama-index-readers-web>=0.3.6'

# Verify the installed version
pip show llama-index-readers-web | grep -i version