CVE-2025-6209: LlamaIndex Path Traversal Vulnerability

CVE-2025-6209 Overview

CVE-2025-6209 is a path traversal vulnerability [CWE-29] in the run-llama/llama_index Python library. The flaw resides in the encode_image function within generic_utils.py and affects versions 0.12.27 through 0.12.40. An unauthenticated network attacker can manipulate the image_path parameter to read arbitrary files on the host filesystem. The issue stems from missing validation and sanitization of file path inputs, which allows traversal sequences like ../ to escape the intended directory. The vulnerability is fixed in version 0.12.41.

Critical Impact
Attackers can read arbitrary files on the server, including sensitive system files such as /etc/passwd, configuration files, and application secrets, by supplying crafted image_path values to applications built on llama_index.

Affected Products

llama_index versions 0.12.27 through 0.12.40
Applications embedding the vulnerable encode_image utility from llama_index.core
LLM workflows that pass user-controlled image paths to llama_index multimodal components

Discovery Timeline

2025-07-07 - CVE-2025-6209 published to NVD
2025-07-30 - Last updated in NVD database

Technical Details for CVE-2025-6209

Vulnerability Analysis

The encode_image helper in llama_index/core/generic_utils.py accepts a file path and returns the base64-encoded contents of the referenced image. The function passes the caller-supplied image_path directly to file system operations without enforcing a base directory or rejecting traversal sequences. When the image_path is influenced by untrusted input, an attacker can supply values such as ../../../../etc/passwd to coerce the function into reading files outside the intended image directory.
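The pattern described above can be illustrated with a minimal sketch. The function naive_encode_image below is a hypothetical stand-in written for this example, not the library's actual source, and the directory layout and file names are assumptions. It shows how a traversal sequence joined onto an image directory reaches a file outside that directory.

```python
import base64
import os
import tempfile


def naive_encode_image(image_path: str) -> str:
    """Hypothetical unvalidated encoder (illustrative, not the library source)."""
    # The caller-supplied path goes straight to open(): no base directory,
    # no canonicalization, and no check that the file is an image.
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


def demo() -> str:
    """Read a file outside the image directory through a traversal sequence."""
    with tempfile.TemporaryDirectory() as root:
        image_dir = os.path.join(root, "images")
        os.makedirs(image_dir)
        # A file that sits next to, not inside, the image directory.
        with open(os.path.join(root, "secret.txt"), "w") as f:
            f.write("api_key=example")
        # Attacker-influenced value that the application joins to image_dir.
        user_value = "../secret.txt"
        encoded = naive_encode_image(os.path.join(image_dir, user_value))
        return base64.b64decode(encoded).decode("utf-8")
```

Calling demo() returns the contents of secret.txt even though the application only intended to serve files under images/.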

Because llama_index is widely embedded in Retrieval-Augmented Generation (RAG) and multimodal LLM applications, the function can be reachable through HTTP endpoints, agent tools, or document ingestion pipelines that accept user-supplied references. The resulting file contents are returned to the caller or fed into downstream model prompts, where an attacker can exfiltrate them.

Root Cause

The root cause is a path traversal weakness, tracked as [CWE-29], a specific variant of improper limitation of a pathname to a restricted directory (CWE-22). The encode_image function does not canonicalize the input path, does not verify that the resolved path remains within an allowed directory, and does not validate that the target file is actually an image.

Attack Vector

Exploitation requires only network access to an application exposing llama_index image-handling functionality. No authentication or user interaction is required. The attacker supplies a crafted image_path containing directory traversal sequences, and the application returns or processes the contents of arbitrary readable files.

python

# Patch excerpt from llama-index-core/llama_index/core/schema.py
# Adds PIL-based image validation to ImageDocument path/url handling
from dataclasses_json import DataClassJsonMixin
from deprecated import deprecated
from typing_extensions import Self
from PIL import Image

from llama_index.core.bridge.pydantic import (
    AnyUrl,

Source: run-llama/llama_index commit cdeaab9

The patch introduces PIL.Image validation so that ImageDocument path and URL inputs are confirmed to be valid image data before they are processed, which limits an attacker's ability to point the loader at arbitrary non-image files.

Detection Methods for CVE-2025-6209

Indicators of Compromise

HTTP request parameters or JSON fields named image_path, image, or path containing traversal sequences such as ../, ..\, %2e%2e%2f, or absolute paths like /etc/, /root/, or C:\Windows\.
Application logs showing encode_image calls that resolve to files outside the configured image directory.
Outbound LLM responses or API responses containing contents of system files (/etc/passwd, .env, id_rsa, config.yaml).
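The request-level indicators above could be checked with a small helper along these lines. This is a sketch under assumptions: the function names, the field list, and the sensitive-prefix list are illustrative choices for this example, and a real deployment would tune them to its own traffic.

```python
import re
from urllib.parse import unquote

# Illustrative prefixes taken from the indicators listed above (assumption:
# compared case-insensitively so that C:\Windows\ also matches).
_SENSITIVE_PREFIXES = ("/etc/", "/root/", "c:\\windows\\")
# A ".." path segment delimited by / or \ or by the start/end of the value.
_TRAVERSAL = re.compile(r"(^|[/\\])\.\.([/\\]|$)")


def looks_like_traversal(value: str, max_decode_rounds: int = 3) -> bool:
    """Return True if a parameter value carries a traversal indicator."""
    decoded = value
    # Undo a few layers of URL encoding so %2e%2e%2f and %252e... are seen.
    for _ in range(max_decode_rounds):
        once = unquote(decoded)
        if once == decoded:
            break
        decoded = once
    lowered = decoded.lower()
    return bool(_TRAVERSAL.search(lowered)) or lowered.startswith(_SENSITIVE_PREFIXES)


def suspicious_fields(params: dict, names=("image_path", "image", "path")) -> list:
    """List the watched field names whose string values look suspicious."""
    return [
        name
        for name in names
        if isinstance(params.get(name), str) and looks_like_traversal(params[name])
    ]
```

A log pipeline might call suspicious_fields on each parsed request body and raise an alert when the returned list is non-empty.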

Detection Strategies

Inspect Python application logs and web server access logs for requests targeting llama_index endpoints with encoded or literal .. sequences in path-like parameters.
Instrument calls to encode_image to log resolved absolute paths and alert when the path falls outside an approved image directory.
Apply static analysis to identify code that passes externally sourced data into llama_index image loading utilities without canonicalization.
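One way to instrument image-encoding calls, as suggested above, is a thin wrapper that logs the resolved path and warns when it falls outside an approved directory. The sketch below is illustrative: audit_image_paths, is_inside, and the logger name are hypothetical names introduced here, and the wrapper accepts any callable whose first argument is a path rather than importing llama_index itself.

```python
import functools
import logging
import os

logger = logging.getLogger("image_path_audit")  # hypothetical logger name


def is_inside(allowed_dir: str, path: str) -> bool:
    """True if path, after resolving symlinks and '..', lies under allowed_dir."""
    root = os.path.realpath(allowed_dir)
    resolved = os.path.realpath(path)
    try:
        return os.path.commonpath([root, resolved]) == root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


def audit_image_paths(func, allowed_dir: str):
    """Wrap a path-taking function so every call logs its resolved path."""

    @functools.wraps(func)
    def wrapper(image_path, *args, **kwargs):
        resolved = os.path.realpath(image_path)
        if is_inside(allowed_dir, image_path):
            logger.info("image read: %s", resolved)
        else:
            logger.warning("image read outside %s: %s", allowed_dir, resolved)
        # Logging only: the original call still runs unchanged.
        return func(image_path, *args, **kwargs)

    return wrapper
```

An application would rebind its own reference to the encoder, for example encode_image = audit_image_paths(encode_image, "/srv/app/images"), where the directory is a placeholder. The wrapper observes rather than blocks, so it complements the mitigations but does not replace them.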

Monitoring Recommendations

Monitor file system access from the Python interpreter for reads of sensitive paths such as /etc/shadow, /proc/self/environ, SSH keys, and cloud credential files.
Track installed llama-index-core package versions across environments and alert on any version from 0.12.27 through 0.12.40 inclusive.
Correlate anomalous file reads with subsequent egress traffic from the application host to detect data exfiltration following path traversal.
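The version tracking described above can be sketched with the standard library alone. The helper names are assumptions made for this example, and the parser only reads the leading major.minor.patch numbers, so pre-release and post-release suffixes are ignored.

```python
import re
from importlib import metadata

# Affected range as stated in this advisory (inclusive on both ends).
FIRST_AFFECTED = (0, 12, 27)
LAST_AFFECTED = (0, 12, 40)


def parse_release(version: str) -> tuple:
    """Extract (major, minor, patch) from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version: str) -> bool:
    """True if the version falls in the affected range."""
    return FIRST_AFFECTED <= parse_release(version) <= LAST_AFFECTED


def installed_is_affected(package: str = "llama-index-core"):
    """Check the installed package; returns None if it is not installed."""
    try:
        return is_affected(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None
```

An inventory job could run installed_is_affected() in each environment and alert whenever it returns True.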

How to Mitigate CVE-2025-6209

Immediate Actions Required

Upgrade llama-index-core to version 0.12.41 or later across all environments using the affected package.
Audit application code for any path-handling routine that forwards user-controlled values to encode_image or ImageDocument and add server-side allow-list validation.
Restrict the filesystem permissions of the process running llama_index so it cannot read sensitive system files or credential stores.

Patch Information

The vulnerability is fixed in llama_index version 0.12.41. The fix is implemented in commit cdeaab91a204d1c3527f177dac37390327aef274, which adds PIL.Image validation to ImageDocument path and URL inputs. Additional context is available in the Huntr bounty submission.

Workarounds

Validate that all image_path inputs resolve, via os.path.realpath, to a location within an explicit allow-listed directory before invoking encode_image.
Reject any input containing .., null bytes, URL-encoded traversal sequences, or absolute paths before passing it to llama_index APIs.
Run llama_index workloads under a dedicated low-privilege service account with read access limited to the required image directory.
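The first two workarounds can be combined into a single pre-check that runs before any path reaches llama_index. This is a minimal sketch, not a definitive implementation: resolve_image_path, is_allowed, and UnsafeImagePath are hypothetical names, and the allow-listed directory is whatever the application designates.

```python
import os


class UnsafeImagePath(ValueError):
    """Raised when a user-supplied path fails validation."""


def resolve_image_path(user_value: str, allowed_dir: str) -> str:
    """Return the canonical path for user_value, or raise UnsafeImagePath."""
    if "\x00" in user_value:
        raise UnsafeImagePath("null byte in path")
    if os.path.isabs(user_value):
        raise UnsafeImagePath("absolute paths are not accepted")
    root = os.path.realpath(allowed_dir)
    # realpath collapses '..' segments and follows symlinks, so the
    # containment check below sees the location that would really be read.
    candidate = os.path.realpath(os.path.join(root, user_value))
    if os.path.commonpath([root, candidate]) != root:
        raise UnsafeImagePath("path escapes the allowed directory")
    return candidate


def is_allowed(user_value: str, allowed_dir: str) -> bool:
    """Boolean convenience wrapper around resolve_image_path."""
    try:
        resolve_image_path(user_value, allowed_dir)
    except UnsafeImagePath:
        return False
    return True
```

The value returned by resolve_image_path, rather than the raw user input, is what the application would then hand to the image-loading call. URL-encoded input should be decoded before this check, since the function compares filesystem paths as given.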

bash

# Upgrade to the patched release
pip install --upgrade 'llama-index-core>=0.12.41'

# Verify the installed version
python -c "import llama_index.core, importlib.metadata as m; print(m.version('llama-index-core'))"