CVE-2026-8177: XML::LibXML Perl DoS Vulnerability

CVE-2026-8177 Overview

CVE-2026-8177 is an out-of-bounds read vulnerability in XML::LibXML for Perl, affecting versions through 2.0210. The parser reads past the end of the input string when it processes an XML node name that contains a truncated UTF-8 byte sequence. A node name that ends in the middle of a multi-byte UTF-8 sequence causes the parser to read into adjacent heap memory. Any Perl process that passes attacker-controlled strings to DOM node-name methods can reach this code path through the module's standard API, with no special configuration required. The likely consequence is a process crash, resulting in denial of service. The flaw is classified under CWE-125 (Out-of-bounds Read).

Critical Impact
Remote attackers can crash Perl applications that parse untrusted XML node names by submitting truncated UTF-8 sequences, causing denial of service.

Affected Products

XML::LibXML for Perl, versions through 2.0210
Any Perl application invoking DOM node-name methods on attacker-controlled input
CPAN distributions bundling vulnerable XML::LibXML releases

Discovery Timeline

2026-05-10 - CVE-2026-8177 published to NVD
2026-05-12 - Last updated in NVD database

Technical Details for CVE-2026-8177

Vulnerability Analysis

The vulnerability resides in the domParseChar function within dom.c. This function decodes UTF-8 sequences from XML node-name buffers without verifying that continuation bytes are present and well-formed. UTF-8 encodes a code point across one to four bytes, and multi-byte sequences require continuation bytes that match the bit pattern 10xxxxxx. When a node name terminates mid-sequence, domParseChar continues reading the expected number of bytes anyway. The parser dereferences memory beyond the input boundary, returning whatever resides in adjacent heap allocations. Crashes occur when the read crosses an unmapped page boundary.
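The size of the over-read described above can be worked out from the lead byte alone. The following is a minimal sketch, not code from XML::LibXML: the helper names utf8_expected_len and utf8_overread are hypothetical, and the buffer length is passed explicitly for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes a UTF-8 sequence should occupy,
 * judged from its lead byte alone. Returns 0 for a byte that cannot start
 * a sequence (a stray continuation byte, or 0xF8..0xFF). */
size_t utf8_expected_len(unsigned char lead)
{
    if (lead < 0x80) return 1;            /* 0xxxxxxx */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;
}

/* Hypothetical helper: how many bytes of the sequence starting at offset
 * pos would lie outside a buffer of size bytes. A non-zero result is the
 * over-read performed by a decoder that trusts the lead byte. */
size_t utf8_overread(const unsigned char *buf, size_t size, size_t pos)
{
    size_t need;

    if (pos >= size) return 0;
    need = utf8_expected_len(buf[pos]);
    if (need == 0) return 0;
    return (pos + need > size) ? pos + need - size : 0;
}
```

For a three-byte name consisting of 'a' followed by 0xE2 0x82, the sequence starting at offset 1 announces three bytes but only two are present, so a decoder that trusts the lead byte reads one byte past the buffer.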

Root Cause

The root cause is missing validation of UTF-8 continuation bytes in domParseChar. The function trusts the length implied by the leading byte and reads 2, 3, or 4 bytes without bounds or bit-pattern checks. Truncated input is silently accepted instead of being rejected with an error.

Attack Vector

The vulnerable code path is reachable over the network. Any service that hands user-controlled strings to XML::LibXML DOM node-name methods exposes it. Examples include web applications parsing XML payloads, SOAP endpoints, and feed processors. No authentication is required.

         if ((c & 0xe0) == 0xe0) {
             if ((c & 0xf0) == 0xf0) {
                 /* 4-byte code */
+                if ((cur[1] & 0xC0) != 0x80 ||
+                    (cur[2] & 0xC0) != 0x80 ||
+                    (cur[3] & 0xC0) != 0x80)
+                {
+                    *len = -1;
+                    return(0);
+                }
                 *len = 4;
                 val = (cur[0] & 0x7) << 18;
                 val |= (cur[1] & 0x3f) << 12;
                 val |= (cur[2] & 0x3f) << 6;
                 val |= cur[3] & 0x3f;
             } else {
                 /* 3-byte code */
+                if ((cur[1] & 0xC0) != 0x80 ||
+                    (cur[2] & 0xC0) != 0x80)
+                {
+                    *len = -1;
+                    return(0);
+                }
                 *len = 3;
                 val = (cur[0] & 0xf) << 12;
                 val |= (cur[1] & 0x3f) << 6;
                 val |= cur[2] & 0x3f;
             }

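Source: GitHub Commit Patch. The patch adds explicit checks that each continuation byte has the bit pattern 10xxxxxx, that is, (byte & 0xC0) == 0x80. When a check fails, the function sets *len = -1 and returns 0 instead of decoding the sequence. Because the || operator short-circuits, evaluation stops at the first byte that fails the test. Assuming the node name is NUL-terminated, a truncated sequence fails no later than the terminator, so the bytes beyond it are never read.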
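The effect of the added checks can be reproduced in a standalone decoder. The sketch below is illustrative and is not the upstream source: the function name decode_utf8_checked is hypothetical, the lead-byte tests are written differently from those in dom.c, and the input is assumed to be NUL-terminated. It keeps the patch's convention of returning 0 with *len set to -1 on malformed input.

```c
#include <stddef.h>

/* Illustrative decoder, not the XML::LibXML source. cur must point at a
 * NUL-terminated byte string. Returns the code point and stores the
 * sequence length in *len, or returns 0 with *len = -1 on malformed input. */
int decode_utf8_checked(const unsigned char *cur, int *len)
{
    unsigned int c = cur[0];
    unsigned int val;
    int need, i;

    if (c < 0x80)                { *len = 1; return (int)c; }
    else if ((c & 0xE0) == 0xC0) { need = 2; val = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0) { need = 3; val = c & 0x0F; }
    else if ((c & 0xF8) == 0xF0) { need = 4; val = c & 0x07; }
    else                         { *len = -1; return 0; }

    for (i = 1; i < need; i++) {
        /* Stop at the first byte that is not 10xxxxxx. A NUL terminator
         * fails this test, so the loop never reads past the end. */
        if ((cur[i] & 0xC0) != 0x80) { *len = -1; return 0; }
        val = (val << 6) | (cur[i] & 0x3F);
    }
    *len = need;
    return (int)val;
}
```

Given the bytes 0xE2 0x82 0xAC followed by a terminator, this returns 0x20AC with *len equal to 3. Given only 0xE2 0x82 and a terminator, it returns 0 with *len equal to -1. Like the patch excerpt, the sketch checks structure only and does not reject overlong encodings.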

Detection Methods for CVE-2026-8177

Indicators of Compromise

Unexpected crashes or segmentation faults in Perl processes that load XML::LibXML
Web server or worker restarts correlated with inbound XML payloads containing non-ASCII bytes
XML inputs whose node names end partway through a multi-byte UTF-8 sequence, that is, with a lead byte in the range 0xC0–0xF7 followed by fewer continuation bytes than the lead byte announces
Stack traces referencing domParseChar in dom.c

Detection Strategies

Inspect XML traffic for malformed UTF-8 sequences in element and attribute names using a WAF or XML schema validator
Enable core dump collection on Perl workers and review for faults inside XML::LibXML.so
Audit application code paths that pass untrusted input to DOM node-name methods such as createElement, setNodeName, or getElementsByTagName

Monitoring Recommendations

Track abnormal process termination rates for Perl services handling XML
Alert on log entries indicating XML parser failures combined with client-supplied data
Monitor CPAN package inventory for XML::LibXML versions at or below 2.0210

How to Mitigate CVE-2026-8177

Immediate Actions Required

Upgrade XML::LibXML to a patched release above 2.0210 once published on CPAN
Apply the upstream fix from GitHub Pull Request #149 to local builds if a CPAN release is not yet available
Identify all Perl services that process untrusted XML and prioritize them for patching
Restart Perl worker processes after deploying the updated module

Patch Information

The fix is committed in GitHub Commit 15652bd and tracked in GitHub Issue #146. The patch validates UTF-8 continuation bytes in domParseChar before reading them. Coordinated disclosure was published on the OpenWall oss-security list.

Workarounds

Validate or normalize UTF-8 in all XML inputs before passing them to XML::LibXML DOM methods
Reject XML payloads whose element or attribute names contain bytes outside the printable ASCII range when feasible
Place an XML schema validator or WAF in front of services that accept attacker-controlled XML
Restrict the size and character set of inputs forwarded to DOM node-name APIs
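The first workaround, validating UTF-8 before it reaches the DOM methods, amounts to a structural scan of each name. The following is a minimal sketch of such a pre-filter, assuming it runs in a C or XS layer in front of the module. The function name utf8_is_complete is hypothetical, and the check covers structure only; it does not reject overlong encodings or surrogates.

```c
#include <stddef.h>

/* Hypothetical pre-filter: returns 1 if buf[0..size) consists only of
 * structurally complete UTF-8 sequences, 0 otherwise. */
int utf8_is_complete(const unsigned char *buf, size_t size)
{
    size_t i = 0;

    while (i < size) {
        unsigned char c = buf[i];
        size_t need, k;

        if (c < 0x80)                need = 1;
        else if ((c & 0xE0) == 0xC0) need = 2;
        else if ((c & 0xF0) == 0xE0) need = 3;
        else if ((c & 0xF8) == 0xF0) need = 4;
        else return 0;                  /* stray continuation or invalid lead */

        if (need > size - i) return 0;  /* sequence runs past the buffer */

        for (k = 1; k < need; k++)
            if ((buf[i + k] & 0xC0) != 0x80) return 0;
        i += need;
    }
    return 1;
}
```

A name that fails this check can be rejected before any DOM node-name method is called. Unlike the patched decoder, the scan takes an explicit length and therefore does not depend on a terminator being present.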

bash

# Upgrade XML::LibXML from CPAN once a fixed release is available
cpanm XML::LibXML

# Or apply the upstream patch directly to a local checkout
curl -L https://github.com/cpan-authors/XML-LibXML/commit/15652bd905a6c9dda59a81b14d4766adbbae2ea8.patch \
  | git apply -
perl Makefile.PL && make && make test && sudo make install