CVE-2026-24770: Infiniflow RAGFlow RCE Vulnerability

CVE-2026-24770 Overview

CVE-2026-24770 is a Zip Slip path traversal vulnerability [CWE-22] in RAGFlow, an open-source Retrieval-Augmented Generation (RAG) engine developed by Infiniflow. The flaw resides in the MinerUParser class, which downloads and extracts ZIP archives from an external mineru_server_url. The _extract_zip_no_root extraction routine fails to sanitize filenames inside the archive. Attackers can craft malicious ZIP files containing path traversal sequences to overwrite arbitrary files on the server, leading to Remote Code Execution (RCE). The issue affects RAGFlow version 0.23.1 and possibly earlier releases.

Critical Impact
Unauthenticated attackers who control or can impersonate the configured mineru_server_url can achieve arbitrary file write and Remote Code Execution on RAGFlow servers by returning a malicious ZIP archive to the MinerU parser.

Affected Products

Infiniflow RAGFlow version 0.23.1
Earlier RAGFlow versions containing the MinerUParser class
Deployments configured with an external mineru_server_url

Discovery Timeline

2026-01-27 - CVE-2026-24770 published to the National Vulnerability Database (NVD)
2026-01-30 - Last updated in the NVD
Patch published in commit 64c75d558e4a17a4a48953b4c201526431d8338f and tracked in advisory GHSA-v7cf-w7gj-pgf4

Technical Details for CVE-2026-24770

Vulnerability Analysis

The vulnerability is a classic Zip Slip flaw. RAGFlow's MinerUParser integrates with an external MinerU service to process documents. The parser retrieves a ZIP archive from mineru_server_url and extracts the contents to a local working directory using the _extract_zip_no_root helper.

During extraction, the helper iterates over archive entries and writes each file to disk without validating the resolved output path. Entries containing relative path traversal sequences such as ../../etc/cron.d/payload escape the intended extraction directory. The attacker controls both the file contents and the destination path on the server file system.
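The path arithmetic behind the escape can be sketched in a few lines of Python. This is an illustration only, not RAGFlow's code: the extraction root /opt/ragflow/tmp/job1 and the function names resolve_entry and escapes_root are hypothetical, chosen to show what happens when an entry name is joined onto a directory without a containment check.

```python
import posixpath


def resolve_entry(extract_root: str, entry_name: str) -> str:
    """Join an entry name onto the root the naive way, then normalize it."""
    return posixpath.normpath(posixpath.join(extract_root, entry_name))


def escapes_root(extract_root: str, entry_name: str) -> bool:
    """Return True when the resolved entry path lies outside extract_root.

    extract_root is assumed to be an absolute POSIX path.
    """
    root = posixpath.normpath(extract_root)
    resolved = resolve_entry(root, entry_name)
    return posixpath.commonpath([root, resolved]) != root


ROOT = "/opt/ragflow/tmp/job1"  # hypothetical extraction directory

# Two "../" segments climb out of job1 and tmp.
print(resolve_entry(ROOT, "../../etc/cron.d/payload"))
# -> /opt/ragflow/etc/cron.d/payload

# Four "../" segments reach the file system root.
print(resolve_entry(ROOT, "../../../../etc/cron.d/payload"))
# -> /etc/cron.d/payload

print(escapes_root(ROOT, "images/page1.png"))          # -> False
print(escapes_root(ROOT, "../../etc/cron.d/payload"))  # -> True
```

The number of ../ segments an attacker needs depends on how deep the real extraction directory sits. Extra segments are harmless to the attacker because normalization stops at the file system root.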

Writing to attacker-chosen locations allows code execution through several documented techniques: dropping files into Python site-packages, overwriting RAGFlow configuration or source files, writing cron jobs, or replacing SSH authorized_keys. The CWE-22 classification reflects the improper limitation of a pathname to a restricted directory.

Root Cause

The root cause is missing filename sanitization inside _extract_zip_no_root. The function trusts the ZipInfo.filename field from the archive and concatenates it with the destination directory without canonicalizing the resulting path or verifying that it remains inside the target directory.

Attack Vector

The attack is exploitable over the network with no authentication or user interaction. An attacker who controls or can impersonate the configured mineru_server_url returns a crafted ZIP archive when RAGFlow requests parsed output. Conditions enabling exploitation include a malicious or compromised MinerU server, DNS or network-level redirection of the mineru_server_url endpoint, or any code path that allows user-supplied URLs to reach the parser. Successful exploitation yields arbitrary file write and, in practice, Remote Code Execution under the RAGFlow service account.

No verified public proof-of-concept code is referenced in the advisory, so exploitation details are described in prose only. Refer to the GitHub Security Advisory GHSA-v7cf-w7gj-pgf4 for vendor-supplied details.

Detection Methods for CVE-2026-24770

Indicators of Compromise

File system entries created outside the expected RAGFlow extraction directory, particularly under system paths such as /etc/, /root/.ssh/, or Python site-packages.
Unexpected outbound HTTP requests from RAGFlow to untrusted hosts, including cases where the hostname in the configured mineru_server_url resolves to an unfamiliar address.
ZIP archive entries observed in logs or on disk containing .. traversal sequences in their filenames.
New or modified cron jobs, systemd units, or startup scripts following document parsing activity.

Detection Strategies

Audit RAGFlow worker process activity for file writes outside its working directory using file integrity monitoring on the host.
Inspect captured ZIP archives received from the MinerU endpoint and flag any entry whose normalized path escapes the extraction root.
Monitor child process creation by RAGFlow services for shells, interpreters, or scheduled task utilities not used during normal operation.
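The archive inspection strategy above can be approximated with Python's standard zipfile module. The following is a minimal sketch, assuming the captured archive is available as a file path or file-like object. The function name find_traversal_entries is hypothetical and is not part of RAGFlow or MinerU.

```python
import posixpath
import zipfile


def find_traversal_entries(zip_source) -> list[str]:
    """Return entry names whose normalized path would leave the extraction root.

    zip_source may be a path or a binary file-like object.
    """
    suspicious = []
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # ZIP names use "/" separators, but a hostile archive may also
            # use backslashes, so treat both as separators.
            normalized = posixpath.normpath(name.replace("\\", "/"))
            if (
                normalized.startswith("/")      # absolute path
                or normalized == ".."           # the parent directory itself
                or normalized.startswith("../")  # climbs above the root
            ):
                suspicious.append(name)
    return suspicious
```

The function only reads the archive's central directory and never extracts anything, so it should be safe to run against untrusted samples. It checks names only. It does not look for symlink entries or oversized payloads.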

Monitoring Recommendations

Enable verbose logging on the MinerU integration to capture source URLs, archive hashes, and extraction targets.
Forward host process and file telemetry to a centralized data lake for correlation with parser activity.
Alert on RAGFlow service writes to sensitive directories or modifications to its own application files.

How to Mitigate CVE-2026-24770

Immediate Actions Required

Upgrade RAGFlow to a build that includes the Infiniflow patch commit 64c75d558e4a17a4a48953b4c201526431d8338f.
Restrict mineru_server_url to a trusted, internally controlled MinerU instance and block outbound traffic to untrusted hosts.
Run the RAGFlow service under a least-privileged account with no write access to system directories.
Audit existing RAGFlow hosts for unexpected files created since deployment of the affected version.
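The host audit in the last item can start from a simple modification-time sweep. The sketch below is a starting point under stated assumptions: the directory list and the cutoff date are placeholders to replace with local values, and files_modified_since is a hypothetical helper name. Modification times can be altered by an attacker who already has code execution, so treat an empty result as weak evidence only.

```python
import os
from datetime import datetime, timezone


def files_modified_since(roots, cutoff_epoch: float) -> list[str]:
    """List files under the given roots whose mtime is at or after cutoff_epoch."""
    hits = []
    for root in roots:
        # followlinks=False avoids wandering through symlinked directories.
        for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    info = os.lstat(path)
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                if info.st_mtime >= cutoff_epoch:
                    hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    # Assumption: replace with the date the affected version was deployed.
    deployed = datetime(2026, 1, 1, tzinfo=timezone.utc).timestamp()
    # Assumption: sensitive locations named in this advisory; adjust per host.
    sensitive = ["/etc/cron.d", "/etc/systemd/system", "/root/.ssh"]
    for path in files_modified_since(sensitive, deployed):
        print(path)
```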

Patch Information

The vendor fix is delivered in commit 64c75d558e4a17a4a48953b4c201526431d8338f. The patch updates _extract_zip_no_root to validate that each resolved extraction path remains within the target directory before writing. The corresponding advisory is GHSA-v7cf-w7gj-pgf4. Apply the patch or upgrade to a release that includes this commit.
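The kind of containment check described here can be sketched as follows. This is not the code from the vendor commit. It is a generic illustration of validating each resolved path before writing, and the function name extract_zip_checked is hypothetical. It assumes Python 3.9 or later for Path.is_relative_to.

```python
import shutil
import zipfile
from pathlib import Path


def extract_zip_checked(zip_source, dest_dir) -> list[Path]:
    """Extract an archive, refusing any entry that resolves outside dest_dir."""
    root = Path(dest_dir)
    root.mkdir(parents=True, exist_ok=True)
    root = root.resolve()
    written = []
    with zipfile.ZipFile(zip_source) as archive:
        entries = archive.infolist()
        # First pass: validate every entry so nothing is written if any is unsafe.
        for info in entries:
            target = (root / info.filename).resolve()
            if not target.is_relative_to(root):
                raise ValueError(f"unsafe path in archive: {info.filename!r}")
        # Second pass: write the validated entries.
        for info in entries:
            target = (root / info.filename).resolve()
            if info.is_dir():
                target.mkdir(parents=True, exist_ok=True)
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            with archive.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Validating the whole archive before writing anything is a design choice in this sketch: a single hostile entry causes the archive to be rejected outright rather than partially extracted.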

Workarounds

Disable the MinerU parser integration until the patch is applied.
Place RAGFlow behind a network policy that restricts outbound HTTP traffic to an allowlist containing only trusted MinerU endpoints.
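Mount the RAGFlow extraction directory on a separate volume with nodev, nosuid, and noexec options where supported. This limits what files written inside that directory can do, but it does not stop traversal writes that land outside it.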

bash

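# Example: restrict RAGFlow egress to a single trusted MinerU host.
# Replace <trusted-mineru-ip> with the address of that host, and adjust the
# "ragflow" user name to the account the service actually runs as.
iptables -A OUTPUT -m owner --uid-owner ragflow -d <trusted-mineru-ip> -p tcp --dport 443 -j ACCEPT
# Reject all other outbound traffic from that user. This also blocks DNS and
# loopback connections unless matching ACCEPT rules are added above this line.
iptables -A OUTPUT -m owner --uid-owner ragflow -j REJECT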
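
Restricting mineru_server_url to a trusted instance can also be checked at the application or deployment layer before any request is made. The following is a minimal sketch of such a check, not an existing RAGFlow setting: the host name mineru.internal.example, the TRUSTED_MINERU_HOSTS set, and the function is_trusted_mineru_url are all hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: the only MinerU host this deployment should ever contact.
TRUSTED_MINERU_HOSTS = frozenset({"mineru.internal.example"})


def is_trusted_mineru_url(url: str, trusted_hosts=TRUSTED_MINERU_HOSTS) -> bool:
    """Return True only for https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, for example an unterminated IPv6 literal
    if parts.scheme != "https":
        return False
    # .hostname is lowercased and excludes any userinfo or port component.
    return parts.hostname is not None and parts.hostname in trusted_hosts
```

A hostname allowlist does not protect against DNS or network-level redirection of a trusted name, so it complements the egress rules above and does not replace them.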