CVE-2026-26019 Overview
CVE-2026-26019 is a Server-Side Request Forgery (SSRF) vulnerability in LangChain's @langchain/community package. The RecursiveUrlLoader class, a web crawler designed to recursively follow links from a starting URL, contains a flawed URL validation mechanism that can be exploited to access internal infrastructure and sensitive cloud metadata services.
The vulnerability stems from two distinct weaknesses: first, the preventOutside option relies on String.startsWith() for URL comparison, which fails to perform proper semantic URL validation. Second, the crawler performs no validation against private or reserved IP addresses, allowing requests to cloud metadata services, localhost, and RFC 1918 addresses.
Critical Impact
Attackers controlling content on a crawled page can redirect the crawler to access internal infrastructure, cloud metadata endpoints (such as AWS IMDSv1), or other sensitive network resources, potentially leading to credential theft or further network compromise.
Affected Products
- @langchain/community versions prior to 1.1.14
- LangChain JavaScript/TypeScript applications using RecursiveUrlLoader
Discovery Timeline
- February 11, 2026 - CVE-2026-26019 published to NVD
- February 12, 2026 - Last updated in NVD database
Technical Details for CVE-2026-26019
Vulnerability Analysis
The RecursiveUrlLoader class provides a convenient mechanism for LLM-powered applications to ingest web content recursively. The preventOutside option, enabled by default, is intended to prevent the crawler from leaving the original domain. However, the implementation used JavaScript's String.startsWith() method to compare URLs, which only performs a naive string prefix check rather than proper URL parsing and domain validation.
This means an attacker who controls content on a page being crawled can craft malicious links that share a string prefix with the legitimate target URL but actually point to attacker-controlled infrastructure. For example, if crawling https://example.com, a link to https://example.com.evil.com would pass the startsWith() check despite pointing to a completely different domain.
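The flawed pattern can be illustrated in a few lines of TypeScript (a simplified sketch of the vulnerable logic, not the library's actual source):

```typescript
// Simplified sketch of the vulnerable pattern: a naive string-prefix
// comparison stands in for a real origin check.
const base = "https://example.com";

function naiveScopeCheck(link: string): boolean {
  // Flawed: compares raw strings, not parsed URL origins
  return link.startsWith(base);
}

naiveScopeCheck("https://example.com/docs");       // true (intended)
naiveScopeCheck("https://example.com.evil.com/x"); // true (bypass!)
```

Because the check only requires a shared string prefix, any attacker-registered host beginning with the target's hostname slips through.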
The second vulnerability compounds this issue: the crawler had no allowlist/blocklist mechanism for IP addresses. Crawled pages could include links to http://169.254.169.254 (AWS metadata service), http://localhost, or any RFC 1918 private IP ranges, and the crawler would fetch them without restriction.
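A minimal pre-fetch filter for literal private/reserved IPv4 hosts might look like the following (an illustrative sketch, not the patched library code; note it does not resolve hostnames, so DNS-based bypasses, where an attacker's domain resolves to a private IP, would still need separate handling):

```typescript
// Reject URLs whose hostname is a literal private/reserved IPv4 address.
const RESERVED_V4: RegExp[] = [
  /^127\./,                      // loopback
  /^10\./,                       // RFC 1918 10.0.0.0/8
  /^172\.(1[6-9]|2\d|3[01])\./,  // RFC 1918 172.16.0.0/12
  /^192\.168\./,                 // RFC 1918 192.168.0.0/16
  /^169\.254\./,                 // link-local, incl. 169.254.169.254
];

function isReservedIpHost(urlString: string): boolean {
  const host = new URL(urlString).hostname;
  return host === "localhost" || RESERVED_V4.some((re) => re.test(host));
}

isReservedIpHost("http://169.254.169.254/latest/meta-data/"); // true
isReservedIpHost("https://example.com/");                     // false
```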
Root Cause
The root cause is improper URL validation leading to Server-Side Request Forgery (CWE-918) in the URL comparison logic. Using String.startsWith() for security-sensitive URL origin checking is fundamentally flawed because:
- It treats URLs as simple strings rather than structured data with distinct origin components
- It fails to normalize URLs before comparison (e.g., trailing slashes, port numbers)
- It cannot distinguish between domain boundaries (example.com vs example.com.attacker.com)
Additionally, the absence of IP address validation against private/reserved ranges allowed direct SSRF attacks targeting internal networks and cloud infrastructure.
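By contrast, parsing both URLs and comparing their origins closes the prefix bypass. A sketch of the correct approach (the function name here is illustrative, not necessarily the signature of the patched isSameOrigin()):

```typescript
// Compare parsed origins (scheme + host + port) instead of raw strings.
function isSameOriginSafe(link: string, base: string): boolean {
  try {
    return new URL(link).origin === new URL(base).origin;
  } catch {
    return false; // unparseable URLs are rejected outright
  }
}

isSameOriginSafe("https://example.com/docs", "https://example.com");       // true
isSameOriginSafe("https://example.com.evil.com/x", "https://example.com"); // false
```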
Attack Vector
An attacker can exploit this vulnerability by placing malicious links on any page that will be crawled by a LangChain application using RecursiveUrlLoader. The attack flow is:
1. Attacker identifies a LangChain application crawling a target domain (e.g., https://target.com)
2. Attacker places content on a page within the crawl scope containing links such as:
   - https://target.com.attacker-domain.com/steal-data (prefix bypass)
   - http://169.254.169.254/latest/meta-data/ (cloud metadata access)
   - http://192.168.1.1/admin (internal network access)
3. The crawler follows these links, believing they are within the allowed scope
4. Responses from internal services are processed and potentially exposed
The security patch introduces proper SSRF hardening through new utility functions:
import { JSDOM, VirtualConsole } from "jsdom";
import { Document } from "@langchain/core/documents";
import { AsyncCaller } from "@langchain/core/utils/async_caller";
+import { isSameOrigin, validateSafeUrl } from "@langchain/core/utils/ssrf";
import {
BaseDocumentLoader,
DocumentLoader,
Source: GitHub Commit Changes
The fix adds a new SSRF utility module to the core library:
export * as utils__json_patch from "../utils/json_patch.js";
export * as utils__json_schema from "../utils/json_schema.js";
export * as utils__math from "../utils/math.js";
+export * as utils__ssrf from "../utils/ssrf.js";
export * as utils__stream from "../utils/stream.js";
export * as utils__testing from "../utils/testing/index.js";
export * as utils__tiktoken from "../utils/tiktoken.js";
Source: GitHub Commit Changes
Detection Methods for CVE-2026-26019
Indicators of Compromise
- Unexpected outbound requests from LangChain applications to cloud metadata endpoints (e.g., 169.254.169.254)
- Network connections from web crawling processes to internal RFC 1918 IP ranges
- Requests to domains with suspicious prefix patterns matching legitimate crawl targets
- Unusual data exfiltration patterns from crawler processes to external domains
Detection Strategies
- Monitor DNS queries and HTTP requests from LangChain application servers for connections to metadata service IPs
- Implement network segmentation rules that alert on crawlers accessing internal network ranges
- Review application logs for RecursiveUrlLoader activity targeting unexpected URL patterns
- Route outbound crawler traffic through an egress proxy or filtering layer that can detect and block SSRF attack patterns (traditional WAFs inspect inbound traffic and will not catch outbound SSRF requests)
Monitoring Recommendations
- Enable detailed logging for all RecursiveUrlLoader instances including full URL paths
- Set up alerts for any outbound connections to RFC 1918 addresses or cloud metadata endpoints from application servers
- Monitor for dependencies on @langchain/community versions below 1.1.14 in package manifests
- Implement egress filtering and log all blocked outbound requests for forensic analysis
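As a quick triage aid, version strings pulled from package manifests can be checked against the fixed release (a helper sketch; the 1.1.14 threshold is taken from this advisory):

```typescript
// Flag @langchain/community versions below the fixed 1.1.14 release.
function isVulnerableVersion(version: string): boolean {
  const [maj = 0, min = 0, patch = 0] = version.split(".").map(Number);
  if (maj !== 1) return maj < 1;
  if (min !== 1) return min < 1;
  return patch < 14;
}

isVulnerableVersion("1.1.13"); // true
isVulnerableVersion("1.1.14"); // false
```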
How to Mitigate CVE-2026-26019
Immediate Actions Required
- Upgrade @langchain/community to version 1.1.14 or later immediately
- Audit all LangChain applications using RecursiveUrlLoader to identify vulnerable deployments
- Implement network-level controls to block outbound requests to cloud metadata services and private IP ranges
- Review crawler configurations to ensure minimum necessary scope for URL following
Patch Information
The vulnerability is fixed in @langchain/community version 1.1.14. The patch introduces the isSameOrigin() and validateSafeUrl() utility functions in @langchain/core/utils/ssrf that perform proper URL origin validation and block requests to private/reserved IP addresses.
Patch resources:
Workarounds
- If immediate patching is not possible, implement network egress controls to block requests to 169.254.169.254, localhost, and RFC 1918 ranges
- Consider using a web proxy with URL filtering for all outbound crawler requests
- Restrict the URLs that can be crawled to a strict allowlist of known-safe domains
- Disable RecursiveUrlLoader functionality entirely until patching is complete
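The allowlist workaround can be enforced before URLs ever reach the loader (a sketch; the origin set is an example, not a recommendation):

```typescript
// Only permit crawling of explicitly allowlisted origins.
const ALLOWED_ORIGINS = new Set<string>([
  "https://example.com",
  "https://docs.example.com",
]);

function isAllowedTarget(urlString: string): boolean {
  try {
    return ALLOWED_ORIGINS.has(new URL(urlString).origin);
  } catch {
    return false;
  }
}

isAllowedTarget("https://example.com/page"); // true
isAllowedTarget("http://169.254.169.254/");  // false
```

Filtering candidate links through such a gate before invoking the loader limits the blast radius even if the loader's own scope check is bypassed.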
# Network-level mitigation example: block metadata service and private-range
# access with iptables. Scope these rules to the crawler's service account
# where possible (e.g. with -m owner --uid-owner), since blanket RFC 1918
# blocks can break legitimate internal traffic.
iptables -A OUTPUT -d 169.254.169.254 -j DROP
iptables -A OUTPUT -d 10.0.0.0/8 -j DROP
iptables -A OUTPUT -d 172.16.0.0/12 -j DROP
iptables -A OUTPUT -d 192.168.0.0/16 -j DROP