CVE-2026-41066 Overview
CVE-2026-41066 is an XML External Entity (XXE) vulnerability in lxml, a widely used Python library for processing XML and HTML. Prior to version 6.1.0, using either of the two parsers in the default configuration (with resolve_entities=True) allows untrusted XML input to read local files from the system. An attacker who can supply XML to such a parser can therefore disclose sensitive local file contents through a crafted payload.
Critical Impact
Applications using lxml versions prior to 6.1.0 with default parser configurations are vulnerable to local file disclosure attacks via malicious XML input, potentially exposing sensitive system files and application data.
Affected Products
- lxml versions prior to 6.1.0
- Applications using lxml XML/HTML parsers with default resolve_entities=True configuration
- Python applications processing untrusted XML input via lxml
Discovery Timeline
- 2026-04-24 - CVE-2026-41066 published to NVD
- 2026-04-27 - Last updated in NVD database
Technical Details for CVE-2026-41066
Vulnerability Analysis
This vulnerability falls under CWE-611 (Improper Restriction of XML External Entity Reference). The lxml library's XML parsers, when operating with default settings, automatically resolve external entities within XML documents. This behavior allows an attacker to craft malicious XML input containing external entity declarations that reference local file paths on the target system. When the parser processes such input, it resolves these entities by reading the contents of the specified files, effectively enabling arbitrary local file disclosure.
The vulnerability is particularly concerning because it affects the default configuration of lxml, meaning developers who have not explicitly hardened their parser settings are at risk. Applications that accept XML input from untrusted sources—such as web services, file upload handlers, or data import functions—are primary attack targets.
Root Cause
The root cause of this vulnerability lies in the default value of the resolve_entities parameter in lxml's XML parsers. By default, this parameter is set to True, which instructs the parser to resolve all entity references, including external entities. External entity resolution allows XML documents to reference and include content from external sources, including local file system paths. Without explicit configuration to disable or restrict this behavior, the parser will attempt to read and include the contents of any file path specified in an external entity declaration.
Attack Vector
The attack is reachable over the network wherever an application accepts XML from untrusted sources and parses it with lxml. An attacker submits a crafted XML document whose DOCTYPE declaration defines an external entity pointing to a local file path (e.g., /etc/passwd on Linux systems or a sensitive application configuration file) and then references that entity in the document body.
When lxml parses this document with resolve_entities=True (the default), the parser reads the target file and substitutes its contents for the entity reference in the parsed XML structure. The attacker can then recover this information from the application's response or behavior.
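The payload shape described above can be illustrated with a harmless sample. The following is a minimal sketch using only the Python standard library: it does not parse or resolve anything, it only lists which external targets a document declares. The entity name, the sample path, and the helper name external_entity_targets are illustrative assumptions, not taken from the advisory, and the regular expression is a rough approximation of the XML grammar rather than a full DTD parser.

```python
import re

# Harmless illustration of the payload shape; the entity name "xxe" and the
# target path are examples only.
SAMPLE_PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>
"""

# Approximate match for entity declarations that carry a SYSTEM or PUBLIC
# external identifier (general or parameter entities).
_EXTERNAL_ENTITY = re.compile(
    r"""<!ENTITY\s+(?:%\s+)?(?P<name>[^\s%]+)\s+
        (?:SYSTEM|PUBLIC\s+(?:"[^"]*"|'[^']*'))\s+
        (?P<quote>["'])(?P<target>.*?)(?P=quote)""",
    re.VERBOSE,
)


def external_entity_targets(xml_text: str) -> list[tuple[str, str]]:
    """Return (entity name, system identifier) pairs declared in xml_text."""
    return [
        (match.group("name"), match.group("target"))
        for match in _EXTERNAL_ENTITY.finditer(xml_text)
    ]


if __name__ == "__main__":
    for name, target in external_entity_targets(SAMPLE_PAYLOAD):
        print(f"entity {name!r} would be resolved from {target!r}")
```

A helper like this is suited to triaging captured payloads or log entries; it is not a substitute for hardening the parser itself.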
For technical details on the exploitation mechanism and proof-of-concept examples, refer to the GitHub Security Advisory and the Launchpad Bug Report.
Detection Methods for CVE-2026-41066
Indicators of Compromise
- Unusual file access patterns in application logs showing reads of sensitive system files like /etc/passwd, /etc/shadow, or application configuration files
- XML payloads in application input containing DOCTYPE declarations with ENTITY references to local file paths
- Application responses or error messages containing unexpected file contents or path references
- Network traffic containing XML data with suspicious external entity declarations
Detection Strategies
- Implement application-level logging to capture and analyze incoming XML payloads for DOCTYPE and ENTITY declarations
- Deploy Web Application Firewall (WAF) rules to detect and block XML payloads containing external entity patterns
- Monitor for file system access anomalies where the application process reads files outside its expected scope
- Use static code analysis tools to identify lxml usage without explicit resolve_entities configuration
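As one possible form of the static check mentioned in the last item, the sketch below walks a Python source file with the standard ast module and reports parser constructions that do not pass resolve_entities explicitly. The set of call names is an assumption chosen for illustration and should be adjusted to your code base; matching is by name only, so calls to identically named functions from other libraries (for example the standard library's xml.etree) will also be reported.

```python
import ast

# Call names treated as lxml parser entry points. This set is an assumption
# for illustration; extend or trim it to match how your code builds parsers.
PARSER_CALLS = {"XMLParser", "ETCompatXMLParser", "iterparse"}


def find_unhardened_parser_calls(source: str, filename: str = "<string>") -> list[tuple[int, str]]:
    """Return (line number, call name) for parser calls lacking resolve_entities."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both `etree.XMLParser(...)` and a bare `XMLParser(...)`.
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        has_keyword = any(kw.arg == "resolve_entities" for kw in node.keywords)
        if name in PARSER_CALLS and not has_keyword:
            findings.append((node.lineno, name))
    return sorted(findings)


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for lineno, name in find_unhardened_parser_calls(handle.read(), path):
                print(f"{path}:{lineno}: {name}() without explicit resolve_entities")
```

The check is deliberately conservative: a call that forwards options through **kwargs is still flagged, since the keyword cannot be confirmed statically.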
Monitoring Recommendations
- Enable verbose logging for XML parsing operations in production environments to capture potential exploitation attempts
- Implement alerting for application processes accessing sensitive system files that are outside normal operational requirements
- Regularly audit Python dependencies to identify vulnerable lxml versions in your software inventory
- Monitor for outbound network connections that may indicate data exfiltration following successful XXE exploitation
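The dependency audit recommended above can be approximated per environment with the standard library alone. This is a minimal sketch, assuming that 6.1.0 is the first fixed release as stated in this advisory; the function names are hypothetical, and the version comparison looks only at the leading numeric release segment, so pre-release suffixes such as rc1 are ignored.

```python
import re
from importlib import metadata

# First fixed release according to the advisory text.
FIXED_VERSION = (6, 1, 0)


def release_tuple(version: str) -> tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '6.0.2' -> (6, 0, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_affected(version: str) -> bool:
    """Return True if version is numerically below the fixed release."""
    release = release_tuple(version)
    # Pad short versions such as '6.1' so they compare as (6, 1, 0).
    padded = release + (0,) * (len(FIXED_VERSION) - len(release))
    return padded < FIXED_VERSION


def installed_lxml_status() -> str:
    """Describe the lxml version found in the current environment, if any."""
    try:
        version = metadata.version("lxml")
    except metadata.PackageNotFoundError:
        return "lxml is not installed in this environment"
    state = "affected, upgrade needed" if is_affected(version) else "at or above the fixed release"
    return f"lxml {version}: {state}"


if __name__ == "__main__":
    print(installed_lxml_status())
```

Running the script inside each virtual environment or container image gives a quick inventory; a dedicated dependency scanner remains the more complete option.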
How to Mitigate CVE-2026-41066
Immediate Actions Required
- Upgrade lxml to version 6.1.0 or later immediately across all environments
- Audit all Python applications in your environment that use lxml for XML processing
- As an interim measure, update code that creates lxml parsers to explicitly pass resolve_entities='internal' or resolve_entities=False
- Review application logs for evidence of prior exploitation attempts using XXE payloads
Patch Information
The vulnerability is fixed in lxml version 6.1.0. Organizations should upgrade to this version or later to fully remediate the vulnerability. The fix ensures that external entity resolution is properly restricted by default, preventing unauthorized local file access. For detailed patch information and release notes, refer to the GitHub Security Advisory.
Workarounds
- Explicitly set resolve_entities='internal' when creating lxml parsers to restrict entity resolution to internal entities only
- Set resolve_entities=False to disable entity resolution completely if your application does not require it; entity references then remain unexpanded in the parsed tree
- Implement input validation to reject XML documents containing DOCTYPE declarations before passing them to lxml parsers
- Use XML schema validation to restrict accepted XML structures and prevent malicious payloads from reaching the parser
# Configuration example - Python code to safely configure lxml parser
# Set resolve_entities to 'internal' or False when creating parser
# parser = etree.XMLParser(resolve_entities='internal')
# or
# parser = etree.XMLParser(resolve_entities=False)
#
# Upgrade lxml to patched version:
# (quote the requirement so the shell does not treat ">" as a redirection)
pip install --upgrade "lxml>=6.1.0"
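The input-validation workaround listed above (rejecting documents that contain a DOCTYPE declaration before they reach lxml) could look roughly like the following. This is a sketch under stated assumptions, not a complete filter: the names DoctypeNotAllowed and reject_doctype are hypothetical, the check operates on raw bytes, and it refuses UTF-16/UTF-32 input outright because the ASCII marker would not appear as contiguous bytes in those encodings. Other unusual encodings are not covered, so this belongs alongside a hardened parser configuration, not in place of it.

```python
class DoctypeNotAllowed(ValueError):
    """Raised when an XML document is refused by the pre-parse check."""


# Byte-order marks for UTF-16 and UTF-32. The UTF-32 little-endian mark
# begins with the UTF-16 little-endian one, so it is covered as well.
_WIDE_BOMS = (b"\xff\xfe", b"\xfe\xff", b"\x00\x00\xfe\xff")


def reject_doctype(xml_bytes: bytes) -> bytes:
    """Return xml_bytes unchanged, or raise if it may carry a DOCTYPE."""
    # Refuse wide encodings (with or without a BOM) instead of decoding them.
    if xml_bytes.startswith(_WIDE_BOMS) or b"\x00" in xml_bytes:
        raise DoctypeNotAllowed("only ASCII-compatible encodings are accepted")
    # XML keywords are case-sensitive, so an exact match is sufficient here.
    if b"<!DOCTYPE" in xml_bytes:
        raise DoctypeNotAllowed("DOCTYPE declarations are not accepted")
    return xml_bytes


if __name__ == "__main__":
    print(reject_doctype(b"<data>ok</data>"))
```

The check is conservative by design: a document that merely mentions the marker inside a comment or CDATA section is also refused. The returned bytes would then be handed to a parser created with an explicit resolve_entities setting, as in the configuration example above.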
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.


