CVE-2024-12450 Overview
CVE-2024-12450 is a critical vulnerability in Infiniflow RAGFlow version 0.12.0. The web_crawl function in document_app.py fails to filter URL parameters supplied to its Chromium-based crawler. Attackers can abuse this to perform Full Read Server-Side Request Forgery (SSRF) against internal network resources and extract the rendered content through generated PDF files. The absence of protocol restrictions also permits the file:// scheme, enabling Arbitrary File Read on the server. The crawler additionally runs an outdated Chromium headless build with --no-sandbox, exposing the application to Remote Code Execution (RCE) through known V8 engine vulnerabilities. The issues are fixed in RAGFlow version 0.14.0.
Critical Impact
Unauthenticated network attackers can read internal services, exfiltrate local files, and achieve RCE on RAGFlow servers running version 0.12.0.
Affected Products
- Infiniflow RAGFlow 0.12.0
- Deployments using the web_crawl document ingestion feature
- Instances bundling the outdated Chromium headless component with --no-sandbox
Discovery Timeline
- 2025-03-20 - CVE-2024-12450 published to the National Vulnerability Database (NVD)
- 2025-04-04 - Last updated in NVD
Technical Details for CVE-2024-12450
Vulnerability Analysis
The web_crawl endpoint accepts a user-controlled URL and passes it to a headless Chromium instance that renders the page to PDF. The URL is validated by a regular expression that explicitly allows http, https, ftp, and file schemes. No allow-list, DNS resolution check, or private-range filtering is performed before the request is issued.
Because the rendered output is returned to the requester as a PDF, any HTTP response from an internal service is fully readable. Supplying a file:// URL causes Chromium to render local filesystem contents directly into the PDF, exposing configuration files, credentials, and source code on the host.
The Chromium binary is launched with --no-sandbox, removing the process isolation that normally contains renderer compromises. Combined with an unpatched V8 engine, a malicious page loaded through the SSRF primitive can chain a public V8 exploit to achieve code execution in the RAGFlow process context. This vulnerability is tracked under CWE-918: Server-Side Request Forgery.
Root Cause
The root cause is permissive URL validation in api/utils/web_utils.py. The original is_valid_url regex accepted the ftp and file schemes, and the function performed no host or address filtering. Running headless Chromium with --no-sandbox on an outdated build compounded the issue by removing exploit mitigations.
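The permissive behavior can be reproduced in isolation. The sketch below copies the pre-patch pattern from the is_valid_url diff shown in this advisory and runs it against a few illustrative inputs. The helper name original_is_valid_url, the sample list, and the port number are assumptions made for this example, not RAGFlow code.

```python
import re

# Pattern copied from the pre-patch line of the is_valid_url diff in this advisory.
ORIGINAL_PATTERN = re.compile(
    r"(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]"
)


def original_is_valid_url(url: str) -> bool:
    """Mirror the pre-patch check: match the pattern at the start of the string."""
    return bool(ORIGINAL_PATTERN.match(url))


# Illustrative inputs; hosts and the port are placeholders chosen for this sketch.
SAMPLE_URLS = [
    "https://example.com/article",
    "file:///etc/passwd",
    "ftp://example.com/pub/file.txt",
    "http://127.0.0.1:8080/",
    "http://169.254.169.254/",
]

if __name__ == "__main__":
    for url in SAMPLE_URLS:
        print(f"{url:40} accepted={original_is_valid_url(url)}")
```

Every sample is accepted, including the local file path and the loopback and metadata addresses, because the pattern constrains only the scheme and the character set of the remainder.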
Attack Vector
An unauthenticated remote attacker submits a crafted URL to the web_crawl function. Targeting http://169.254.169.254/, http://127.0.0.1:<port>/, or file:///etc/passwd yields the response content inside the returned PDF. Pointing the crawler at an attacker-controlled page that triggers a known Chromium V8 bug delivers RCE on the host.
# Security patch in api/utils/web_utils.py (commit 3faae0b)
 def is_valid_url(url: str) -> bool:
-    return bool(re.match(r"(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]", url))
+    return bool(re.match(r"(https?)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]", url))
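Source: GitHub Commit 3faae0b. The patch removes ftp and file from the allowed schemes, eliminating the Arbitrary File Read primitive. The regex change alone places no restriction on the destination host, so it should be paired with the egress controls described under mitigation. Operators should still upgrade to RAGFlow 0.14.0 to receive the bundled Chromium update and sandbox-related fixes.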
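As a complement to the scheme restriction, a destination check can reject URLs whose host resolves to a non-public address. The following is a minimal sketch under stated assumptions, not the RAGFlow implementation: the function names is_safe_crawl_target, resolve_host, and is_public_address are hypothetical, and only the regex is taken from the patch above.

```python
import ipaddress
import re
import socket
from urllib.parse import urlsplit

# Patched pattern copied from the diff above; everything else is illustrative.
PATCHED_PATTERN = re.compile(
    r"(https?)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]"
)


def resolve_host(host: str) -> list[str]:
    """Default resolver: return every address the host name resolves to."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_public_address(addr: str) -> bool:
    """Return True only for addresses outside private and special-use ranges."""
    ip = ipaddress.ip_address(addr.split("%", 1)[0])  # drop any IPv6 zone id
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def is_safe_crawl_target(url: str, resolver=resolve_host) -> bool:
    """Hypothetical pre-flight check: scheme filter plus destination filter."""
    if not PATCHED_PATTERN.match(url):
        return False
    host = urlsplit(url).hostname
    if not host:
        return False
    try:
        addresses = resolver(host)
    except OSError:  # includes socket.gaierror for unresolvable names
        return False
    # Reject if any resolved address is internal, or if nothing resolved.
    return bool(addresses) and all(is_public_address(a) for a in addresses)
```

A check like this runs before the browser issues its own request, so it does not cover HTTP redirects or a DNS answer that changes between the check and the fetch. Network-level egress filtering remains necessary.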
Detection Methods for CVE-2024-12450
Indicators of Compromise
- Outbound HTTP requests from the RAGFlow host to RFC1918 addresses, 127.0.0.1, or cloud metadata endpoints such as 169.254.169.254.
- PDF artifacts in the RAGFlow document store containing rendered internal admin panels or local file content.
- Application logs showing web_crawl invocations with URLs beginning with file:// or ftp://.
- Chromium child processes spawned with --no-sandbox followed by anomalous subprocesses (shells, network tools).
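The last indicator above can be checked against a process snapshot. The sketch below assumes process telemetry has already been exported as records with pid, ppid, and argv fields. That record shape, the list of suspicious child names, and the function names are assumptions for illustration and would need adapting to the telemetry source in use.

```python
import os

# Child executables treated as suspicious under Chromium (illustrative list).
SUSPICIOUS_CHILDREN = {"sh", "bash", "dash", "python", "python3", "curl", "wget", "nc"}


def exe_name(proc: dict) -> str:
    """Return the basename of the executable, or an empty string if unknown."""
    return os.path.basename(proc["argv"][0]) if proc["argv"] else ""


def is_unsandboxed_chromium(proc: dict) -> bool:
    """True for a Chrome or Chromium process started with --no-sandbox."""
    return "chrom" in exe_name(proc).lower() and "--no-sandbox" in proc["argv"]


def suspicious_chromium_children(processes: list[dict]) -> list[tuple]:
    """Return (chromium_pid, child_pid, child_exe) for each suspicious descendant."""
    by_pid = {proc["pid"]: proc for proc in processes}
    findings = []
    for proc in processes:
        if exe_name(proc) not in SUSPICIOUS_CHILDREN:
            continue
        seen = set()  # guards against malformed parent loops in the snapshot
        parent = by_pid.get(proc["ppid"])
        while parent is not None and parent["pid"] not in seen:
            seen.add(parent["pid"])
            if is_unsandboxed_chromium(parent):
                findings.append((parent["pid"], proc["pid"], exe_name(proc)))
                break
            parent = by_pid.get(parent["ppid"])
    return findings
```

Each suspicious process is reported once, paired with its nearest unsandboxed Chromium ancestor, so a shell spawned by a renderer is not double-counted against the parent browser process.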
Detection Strategies
- Inspect application logs for web_crawl requests where the url parameter targets non-public hosts or non-HTTP schemes.
- Monitor for process trees where the RAGFlow Python service spawns headless Chromium followed by /bin/sh, bash, or python children.
- Alert on egress traffic from the RAGFlow container to internal subnets or to instance metadata services.
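For retrospective hunts, URLs already extracted from application or proxy logs can be triaged offline. The sketch below assumes that extraction has been done elsewhere and only classifies each URL string. The function names and reason labels are hypothetical, and DNS names are deliberately not resolved.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def classify_crawl_url(url: str) -> Optional[str]:
    """Return a reason string if the URL looks like SSRF or file-read abuse."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        return "unparseable"
    if parts.scheme.lower() not in ("http", "https"):
        return "non-http-scheme"
    if host == "localhost" or host.endswith(".localhost"):
        return "internal-host"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return None  # a DNS name; resolving it is out of scope for this sketch
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        return "internal-address"
    return None


def hunt(urls: list[str]) -> list[tuple[str, str]]:
    """Return (url, reason) pairs for every URL that warrants review."""
    hits = []
    for url in urls:
        reason = classify_crawl_url(url)
        if reason is not None:
            hits.append((url, reason))
    return hits
```

Because host names are not resolved, a public-looking name that points at an internal address passes this triage. Correlating hits with the egress alerts described above helps close that gap.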
Monitoring Recommendations
- Capture and retain reverse proxy logs in front of RAGFlow, indexing the url query parameter for retrospective hunts.
- Forward container runtime telemetry to a centralized data lake to correlate Chromium activity with outbound network flows.
- Track the file integrity of api/utils/web_utils.py and the bundled Chromium binary to detect drift from the patched build.
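File integrity tracking can be as simple as comparing SHA-256 digests against a baseline recorded from the patched build. The sketch below is one possible approach. The function names are hypothetical, and the baseline mapping of path to expected digest is assumed to be captured and stored securely by the operator.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str) -> str:
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_drift(baseline: dict[str, str]) -> dict[str, str]:
    """Compare each path to its expected digest; report 'missing' or 'modified'."""
    drift = {}
    for path, expected in baseline.items():
        if not Path(path).is_file():
            drift[path] = "missing"
        elif sha256_of(path) != expected:
            drift[path] = "modified"
    return drift
```

An empty result means every tracked file still matches the baseline. Any entry should prompt a comparison against the upstream release artifacts.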
How to Mitigate CVE-2024-12450
Immediate Actions Required
- Upgrade Infiniflow RAGFlow to version 0.14.0 or later, which includes the patched is_valid_url function and an updated Chromium runtime.
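- Restrict network egress from the RAGFlow service so that the crawler can reach only public internet destinations, blocking RFC1918 ranges and cloud metadata endpoints at the firewall while allow-listing any internal backing services the deployment depends on.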
- Remove or rotate any credentials, API keys, or tokens stored on the RAGFlow host that may have been exposed through Arbitrary File Read.
- Audit historical web_crawl requests and generated PDFs for evidence of SSRF or local file disclosure.
Patch Information
The fix is delivered in RAGFlow 0.14.0 via commit 3faae0b2c2f8a26233ee1442ba04874b3406f6e9. The patch narrows accepted URL schemes to http and https, removing support for file and ftp. Refer to the GitHub commit and the Huntr bug bounty report for additional context.
Workarounds
- If an immediate upgrade is not possible, patch api/utils/web_utils.py locally so that is_valid_url accepts only http:// and https:// URLs, matching the upstream regex change.
- Place RAGFlow behind a forward proxy that rejects requests to private IP ranges and non-HTTP schemes.
- Disable the document ingestion web_crawl feature until the upgrade is applied.
- Run the RAGFlow container with a restrictive network policy and drop --no-sandbox from the Chromium launch arguments where feasible.
# Configuration example: block private and metadata destinations via iptables.
# The owner match assumes the service runs as a local user named ragflow; insert
# ACCEPT rules ahead of these for any internal backing services it must reach.
iptables -A OUTPUT -m owner --uid-owner ragflow -d 10.0.0.0/8 -j REJECT
iptables -A OUTPUT -m owner --uid-owner ragflow -d 172.16.0.0/12 -j REJECT
iptables -A OUTPUT -m owner --uid-owner ragflow -d 192.168.0.0/16 -j REJECT
iptables -A OUTPUT -m owner --uid-owner ragflow -d 169.254.169.254/32 -j REJECT
iptables -A OUTPUT -m owner --uid-owner ragflow -d 127.0.0.0/8 -j REJECT
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.


