CVE-2026-41066: lxml XXE Vulnerability in Python Library

CVE-2026-41066 Overview

CVE-2026-41066 is an XML External Entity (XXE) vulnerability in lxml, a widely used Python library for processing XML and HTML. Prior to version 6.1.0, using either of the two parsers in the default configuration (with resolve_entities=True) allows untrusted XML input to read local files from the system. An attacker who can supply crafted XML to such a parser can therefore read sensitive local files.

Critical Impact
Applications using lxml versions prior to 6.1.0 with default parser configurations are vulnerable to local file disclosure attacks via malicious XML input, potentially exposing sensitive system files and application data.

Affected Products

lxml versions prior to 6.1.0
Applications using lxml XML/HTML parsers with default resolve_entities=True configuration
Python applications processing untrusted XML input via lxml

Discovery Timeline

2026-04-24 - CVE-2026-41066 published to NVD
2026-04-27 - Last updated in NVD database

Technical Details for CVE-2026-41066

Vulnerability Analysis

This vulnerability falls under CWE-611 (Improper Restriction of XML External Entity Reference). The lxml library's XML parsers, when operating with default settings, automatically resolve external entities within XML documents. An attacker can therefore craft XML input containing external entity declarations that reference local file paths on the target system. When the parser processes such input, it resolves these entities by reading the specified files, which discloses the contents of any file the application process is permitted to read.

The vulnerability is particularly concerning because it affects the default configuration of lxml, meaning developers who have not explicitly hardened their parser settings are at risk. Applications that accept XML input from untrusted sources—such as web services, file upload handlers, or data import functions—are primary attack targets.

Root Cause

The root cause of this vulnerability lies in the default value of the resolve_entities parameter in lxml's XML parsers. By default, this parameter is set to True, which instructs the parser to resolve all entity references, including external entities. External entity resolution allows XML documents to reference and include content from external sources, including local file system paths. Without explicit configuration to disable or restrict this behavior, the parser will attempt to read and include the contents of any file path specified in an external entity declaration.

Attack Vector

The attack is reachable over the network wherever an application passes externally supplied XML to lxml. An attacker submits a crafted XML document containing a DOCTYPE declaration with an external entity that references a local file path (e.g., /etc/passwd on Linux systems or sensitive application configuration files). When the vulnerable lxml parser processes this document with default settings, it reads the specified file and incorporates its contents into the parsed XML structure.

The vulnerability mechanism works as follows: an attacker crafts an XML payload containing an external entity declaration pointing to a sensitive local file. The entity is then referenced within the XML document body. When lxml parses this document with resolve_entities=True (the default), the parser reads the target file and substitutes the entity reference with the file contents. The attacker can then extract this information from the application's response or behavior.
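
The mechanism can be exercised safely against a throwaway file. The following is a minimal sketch, not code from the advisory: it assumes lxml is installed, and the helper names (build_payload, parse_hardened, demo) and the marker string are hypothetical. It builds a payload of the shape described above, parses it with resolve_entities=False, and returns the serialized result so that a caller can confirm the file contents were not substituted.

```python
import os
import tempfile
from pathlib import Path


def build_payload(path):
    """Return an XML document whose external entity points at a local file."""
    uri = Path(path).as_uri()  # absolute path -> file:// URI
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE data [<!ENTITY xxe SYSTEM "{uri}">]>\n'
        "<data>&xxe;</data>"
    ).encode("utf-8")


def parse_hardened(xml_bytes):
    """Parse with entity resolution disabled and return the serialized tree."""
    # Imported lazily so this sketch can be loaded where lxml is absent.
    from lxml import etree

    parser = etree.XMLParser(resolve_entities=False, no_network=True)
    root = etree.fromstring(xml_bytes, parser)
    return etree.tostring(root)


def demo():
    """Point the payload at a harmless temporary file and parse it."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
        fh.write("MARKER-not-a-real-secret")
        path = fh.name
    try:
        return parse_hardened(build_payload(path))
    finally:
        os.unlink(path)
```

If the marker string ever appears in the value returned by demo(), the parser in use is resolving external entities and should be treated as exposed.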

For technical details on the exploitation mechanism and proof-of-concept examples, refer to the GitHub Security Advisory and the Launchpad Bug Report.

Detection Methods for CVE-2026-41066

Indicators of Compromise

Unusual file access patterns in application logs showing reads of sensitive system files like /etc/passwd, /etc/shadow, or application configuration files
XML payloads in application input containing DOCTYPE declarations with ENTITY references to local file paths
Application responses or error messages containing unexpected file contents or path references
Network traffic containing XML data with suspicious external entity declarations
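
As a rough illustration of the payload indicators above, the sketch below classifies a captured request body using only the Python standard library. The function name classify_xml and the label strings are hypothetical, and the patterns are a heuristic: they assume a byte-oriented, ASCII-compatible encoding and will miss payloads sent as UTF-16 or otherwise obfuscated.

```python
import re

# An ENTITY declaration that uses SYSTEM or PUBLIC refers to an external
# resource. XML keywords are case-sensitive, so no IGNORECASE flag is used.
EXTERNAL_ENTITY_RE = re.compile(
    rb"<!ENTITY\s+(?:%\s+)?[^\s>]+\s+(?:SYSTEM|PUBLIC)\s"
)
DOCTYPE_RE = re.compile(rb"<!DOCTYPE\b")


def classify_xml(payload: bytes) -> str:
    """Return a coarse label for logging or alerting on XML input."""
    if EXTERNAL_ENTITY_RE.search(payload):
        return "external-entity"
    if DOCTYPE_RE.search(payload):
        return "doctype"
    return "clean"
```

A result of "external-entity" is the stronger signal; "doctype" alone can be legitimate in some integrations and is better treated as something to review than as proof of an attack.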

Detection Strategies

Implement application-level logging to capture and analyze incoming XML payloads for DOCTYPE and ENTITY declarations
Deploy Web Application Firewall (WAF) rules to detect and block XML payloads containing external entity patterns
Monitor for file system access anomalies where the application process reads files outside its expected scope
Use static code analysis tools to identify lxml usage without explicit resolve_entities configuration
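
The static-analysis idea can be approximated with Python's ast module. This is a heuristic sketch rather than a complete checker: the set of constructor names in PARSER_FACTORIES is an assumption about which lxml entry points accept resolve_entities, the function name is hypothetical, and matching is by name only, so a same-named function from another library would also be reported. It does not detect calls such as etree.fromstring(data) that rely on an implicit default parser.

```python
import ast

# Assumed list of lxml callables that take a resolve_entities keyword.
PARSER_FACTORIES = {"XMLParser", "ETCompatXMLParser", "iterparse"}


def find_unconfigured_parsers(source, filename="<string>"):
    """Return sorted (line, name) pairs for parser calls that do not
    pass resolve_entities explicitly."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Attribute):
            name = func.attr          # e.g. etree.XMLParser(...)
        elif isinstance(func, ast.Name):
            name = func.id            # e.g. XMLParser(...)
        else:
            continue
        if name not in PARSER_FACTORIES:
            continue
        keywords = {kw.arg for kw in node.keywords}
        # kw.arg is None for **kwargs; the option may be hidden in there,
        # so such calls are skipped rather than reported.
        if "resolve_entities" not in keywords and None not in keywords:
            findings.append((node.lineno, name))
    return sorted(findings)
```

Running this over each .py file in a repository gives a first list of call sites to review by hand.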

Monitoring Recommendations

Enable verbose logging for XML parsing operations in production environments to capture potential exploitation attempts
Implement alerting for application processes accessing sensitive system files that are outside normal operational requirements
Regularly audit Python dependencies to identify vulnerable lxml versions in your software inventory
Monitor for outbound network connections that may indicate data exfiltration following successful XXE exploitation
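
For the dependency audit, a small standard-library check can report whether the lxml installed in the current environment predates 6.1.0. This is a sketch under stated assumptions: the helper names are hypothetical, and the version comparison looks only at the leading numeric release segment, so a pre-release such as 6.1.0b1 is treated the same as 6.1.0. A dedicated audit tool is the better choice for anything beyond a quick check.

```python
import re
from importlib import metadata

FIXED_RELEASE = (6, 1, 0)  # first fixed version according to the advisory


def release_tuple(version):
    """Parse the leading numeric part of a version, e.g. '5.3.0' -> (5, 3, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions such as '6.1' to three components.
    return parts + (0,) * (3 - len(parts))


def is_vulnerable(version):
    """True if the version is older than the fixed release."""
    return release_tuple(version) < FIXED_RELEASE


def installed_lxml_status():
    """Return (version, vulnerable) for the installed lxml, or None."""
    try:
        version = metadata.version("lxml")
    except metadata.PackageNotFoundError:
        return None
    return version, is_vulnerable(version)
```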

How to Mitigate CVE-2026-41066

Immediate Actions Required

Upgrade lxml to version 6.1.0 or later immediately across all environments
Audit all Python applications in your environment that use lxml for XML processing
Until the upgrade is deployed, explicitly pass resolve_entities='internal' or resolve_entities=False wherever the application creates an lxml parser
Review application logs for evidence of prior exploitation attempts using XXE payloads

Patch Information

The vulnerability is fixed in lxml version 6.1.0. Organizations should upgrade to this version or later to fully remediate the vulnerability. The fix ensures that external entity resolution is properly restricted by default, preventing unauthorized local file access. For detailed patch information and release notes, refer to the GitHub Security Advisory.

Workarounds

Explicitly set resolve_entities='internal' when creating lxml parsers to restrict entity resolution to internal entities only
Set resolve_entities=False to completely disable entity resolution if your application does not require this functionality
Implement input validation to reject XML documents containing DOCTYPE declarations before passing them to lxml parsers
Use XML schema validation as defense in depth to restrict accepted XML structures; because validation operates on a document the parser has already processed, it does not replace hardening the parser itself

bash

# Configuration example - Python code to safely configure lxml parser
# Set resolve_entities to 'internal' or False when creating parser
# parser = etree.XMLParser(resolve_entities='internal')
# or
# parser = etree.XMLParser(resolve_entities=False)
#
# Upgrade lxml to patched version (quote the requirement so the shell
# does not treat ">" as an output redirection):
pip install --upgrade "lxml>=6.1.0"
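
The workaround of rejecting documents that contain a DOCTYPE declaration can be sketched with the standard library's expat bindings, which report the declaration through a callback before any entity is used. The names DoctypeForbidden and reject_doctype are hypothetical. This pre-check is an extra layer and assumes expat and lxml agree on where a DOCTYPE begins; it is not a substitute for setting resolve_entities explicitly.

```python
import xml.parsers.expat as expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def reject_doctype(xml_bytes):
    """Return xml_bytes unchanged, or raise if a DOCTYPE is present.

    Malformed input raises xml.parsers.expat.ExpatError instead.
    """
    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called as soon as expat sees "<!DOCTYPE name ...".
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)  # True marks the final chunk of input
    return xml_bytes
```

A caller would pass request bodies through reject_doctype() first and hand only the accepted bytes to the lxml parser.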
