CVE-2023-26119: HtmlUnit RCE Vulnerability via XSLT

CVE-2023-26119 Overview

CVE-2023-26119 is a Remote Code Execution (RCE) vulnerability affecting HtmlUnit, a popular headless browser library for Java. All versions of the net.sourceforge.htmlunit:htmlunit package before 3.0.0 are vulnerable to RCE via XSLT processing when browsing an attacker-controlled webpage. By serving malicious XSLT content, an attacker can execute arbitrary code on any system that runs a vulnerable HtmlUnit version.

Critical Impact
Attackers can achieve full remote code execution by luring applications using HtmlUnit to process maliciously crafted XSLT content, potentially leading to complete system compromise.

Affected Products

HtmlUnit versions prior to 3.0.0
Applications that depend on net.sourceforge.htmlunit:htmlunit
Java applications leveraging HtmlUnit for web scraping or headless browser automation

Discovery Timeline

2023-04-03 - CVE-2023-26119 published to NVD
2024-11-21 - Last updated in NVD database

Technical Details for CVE-2023-26119

Vulnerability Analysis

This vulnerability exists in HtmlUnit's XSLT (Extensible Stylesheet Language Transformations) processor implementation. The core issue stems from the XSLT processor not having secure processing features enabled, which allows malicious XSLT stylesheets to execute arbitrary Java code through extension functions. When HtmlUnit processes a webpage containing malicious XSLT content, the attacker can leverage XSLT's extensibility features to invoke arbitrary Java methods, effectively achieving code execution within the context of the application.

The vulnerability is classified under CWE-94 (Improper Control of Generation of Code / Code Injection), as attackers can inject and execute code through the XSLT processing mechanism. This attack requires no authentication or user interaction beyond the target application visiting a malicious webpage controlled by the attacker.

Root Cause

The root cause is the absence of the FEATURE_SECURE_PROCESSING flag in HtmlUnit's XSLT processor configuration. Without this security feature enabled, the XSLT processor permits dangerous operations including:

Execution of arbitrary Java extension functions
Access to external resources and file systems
Invocation of runtime commands through Java reflection

The secure processing feature is designed specifically to prevent such dangerous operations by restricting the XSLT processor's capabilities to safe transformations only.
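
As an illustration of the flag described above, the following is a minimal sketch of how a JAXP TransformerFactory can be hardened. It is not HtmlUnit's actual patch: the class name HardenedXslt and its methods are hypothetical, and the two ACCESS_EXTERNAL_* attributes are an additional precaution that some XSLT implementations may not recognise.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, for illustration only.
final class HardenedXslt {

    /** Builds a TransformerFactory with secure processing switched on. */
    static TransformerFactory newHardenedFactory() throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // The flag named in the advisory: asks the processor to restrict
        // risky features such as Java extension functions.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        try {
            // Extra precaution (assumption: the implementation supports the
            // JAXP 1.5 access attributes): forbid external DTDs and stylesheets.
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        } catch (IllegalArgumentException e) {
            // The implementation does not know these attributes; the
            // secure-processing feature set above still applies.
        }
        return factory;
    }

    /** Applies a stylesheet to an XML document using the hardened factory. */
    static String transform(String xslt, String xml) throws TransformerException {
        Transformer transformer = newHardenedFactory()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

Ordinary transformations keep working with this configuration. How a stylesheet that calls Java extension functions is rejected (for example, at compile time or at transform time) depends on the XSLT implementation in use.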

Attack Vector

The attack vector is network-based and requires no privileges or user interaction. An attacker hosts a malicious webpage containing crafted XSLT content. When a vulnerable HtmlUnit application browses or processes this page, the XSLT processor parses the malicious stylesheet and executes the embedded code. This makes the vulnerability particularly dangerous for web scraping applications, automated testing frameworks, and any service that uses HtmlUnit to render or process external web content.

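The security patch enables FEATURE_SECURE_PROCESSING for the XSLT processor. The excerpt below covers only part of the import section of the patched file, including javax.xml.XMLConstants, the class that defines FEATURE_SECURE_PROCESSING: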

java

import java.util.HashMap;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Result;

Source: HtmlUnit GitHub Commit

The changelog entry documenting the fix:

text

 
     <body>
         <release version="2.70.0" date="Febuary xx, 2023" description="Bugfixes">
+            <action type="fix" dev="rbri">
+                Enable FEATURE_SECURE_PROCESSING for the XSLT processor.
+            </action>
             <action type="add" dev="rbri">
                 Document disabling of website certificate check in the FAQ.
             </action>

Source: HtmlUnit GitHub Commit

Detection Methods for CVE-2023-26119

Indicators of Compromise

Unexpected outbound network connections from Java applications using HtmlUnit
Suspicious XSLT processing activity in application logs
Unusual Java process spawning or command execution traced to HtmlUnit components
Malicious XSLT stylesheets containing Java extension function calls in processed web content
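
The last indicator can be approximated with a simple content check. The sketch below is a heuristic only, assuming that malicious stylesheets bind a namespace prefix to one of the URIs conventionally used for Java extension functions in Xalan-style processors. The class name XsltExtensionScanner is hypothetical, and the pattern can be evaded or can flag legitimate stylesheets.

```java
import java.util.regex.Pattern;

// Hypothetical heuristic scanner, for illustration only.
final class XsltExtensionScanner {

    // Matches a namespace declaration whose URI starts with a prefix commonly
    // associated with Java extension functions (an assumed, non-exhaustive list).
    private static final Pattern JAVA_EXTENSION_NS = Pattern.compile(
            "xmlns:[\\w.-]+\\s*=\\s*[\"']"
                    + "(?:http://xml\\.apache\\.org/(?:xalan|xslt)/java|xalan://|java:)",
            Pattern.CASE_INSENSITIVE);

    /** Returns true if the content declares a Java-extension style namespace. */
    static boolean looksSuspicious(String content) {
        return content != null && JAVA_EXTENSION_NS.matcher(content).find();
    }
}
```

A check like this is best treated as a logging or alerting signal on fetched content, not as a substitute for upgrading.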

Detection Strategies

Monitor for HtmlUnit dependency versions below 3.0.0 in Maven/Gradle dependency trees using software composition analysis (SCA) tools
Implement application logging to track XSLT transformations and flag suspicious extension function usage
Deploy network monitoring to detect connections to known malicious domains from HtmlUnit-based applications
Use runtime application self-protection (RASP) solutions to detect code injection attempts
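
Where an SCA tool is not available, the version test from the first strategy can be scripted. The following is a minimal sketch, assuming the version string has already been extracted from the dependency tree. The class name HtmlUnitVersionCheck is hypothetical, and pre-release qualifiers such as -SNAPSHOT are ignored.

```java
// Hypothetical helper, for illustration only.
final class HtmlUnitVersionCheck {

    /**
     * Returns true when the given HtmlUnit version is older than 3.0.0.
     * Because the fixed release is 3.0.0, comparing the major version
     * number is sufficient.
     */
    static boolean isAffected(String version) {
        String major = version.trim().split("[.-]")[0];
        try {
            return Integer.parseInt(major) < 3;
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Unrecognised version: " + version, e);
        }
    }
}
```

For example, isAffected("2.70.0") returns true and isAffected("3.0.0") returns false.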

Monitoring Recommendations

Enable detailed logging for XSLTProcessor class activities within HtmlUnit applications
Monitor Java process behavior for signs of code injection such as unexpected subprocess creation
Implement allowlisting for domains that HtmlUnit applications are permitted to access
Review application dependencies regularly with automated vulnerability scanning tools

How to Mitigate CVE-2023-26119

Immediate Actions Required

Upgrade HtmlUnit to version 3.0.0 or later immediately
Audit all applications using HtmlUnit to identify vulnerable deployments
Restrict network access for HtmlUnit-based applications to trusted domains only
Consider implementing input validation for URLs processed by HtmlUnit
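
The domain restriction and URL validation listed above could look roughly like the sketch below. It assumes the application checks each URL before handing it to HtmlUnit. The class name UrlAllowlist is hypothetical and the check uses only the JDK, not any HtmlUnit API.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class UrlAllowlist {

    private final Set<String> allowedHosts = new HashSet<>();

    UrlAllowlist(Set<String> hosts) {
        for (String host : hosts) {
            allowedHosts.add(host.toLowerCase(Locale.ROOT));
        }
    }

    /** Accepts only http(s) URLs whose host is exactly on the allowlist. */
    boolean isAllowed(String url) {
        try {
            URI uri = URI.create(url);
            String scheme = uri.getScheme();
            String host = uri.getHost();
            if (scheme == null || host == null) {
                return false;
            }
            boolean web = scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https");
            return web && allowedHosts.contains(host.toLowerCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            // Malformed input is rejected instead of being passed on.
            return false;
        }
    }
}
```

A check on the initial URL alone does not cover redirects or resources that the page loads itself, such as a stylesheet hosted on another domain, so network-level controls remain useful alongside it.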

Patch Information

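The vulnerability has been addressed in HtmlUnit version 3.0.0 and later releases. The fix enables FEATURE_SECURE_PROCESSING for the XSLT processor, which restricts dangerous XSLT operations. Organizations should update their Maven or Gradle dependencies to use the patched version. Note that from 3.0.0 onward the artifact is published as org.htmlunit:htmlunit and the Java packages move from com.gargoylesoftware.htmlunit to org.htmlunit, so the upgrade involves changing the dependency coordinates and imports, not only the version number. For detailed patch information, refer to the HtmlUnit security commit and the Snyk vulnerability advisory.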

Workarounds

If immediate upgrade is not possible, implement strict URL allowlisting to prevent HtmlUnit from accessing untrusted domains
Deploy network-level controls to block access to external or unknown websites from systems running vulnerable HtmlUnit versions
Consider isolating HtmlUnit applications in sandboxed environments with limited permissions
Filter or reject XSLT content (for example, responses that reference a stylesheet through an xml-stylesheet processing instruction) before HtmlUnit processes it

bash

# Maven dependency update example
# From 3.0.0 on, HtmlUnit is published under the org.htmlunit groupId.
# Update pom.xml to use the patched version:
# <dependency>
#     <groupId>org.htmlunit</groupId>
#     <artifactId>htmlunit</artifactId>
#     <version>3.0.0</version>
# </dependency>

# Verify current HtmlUnit version in project (also lists transitive dependencies)
mvn dependency:tree | grep -i htmlunit