
CVE-2026-8829: Perl HTML::Entities Info Disclosure Flaw

CVE-2026-8829 is an information disclosure vulnerability in Perl HTML::Entities that allows heap memory disclosure through freed memory access. This article covers technical details, affected versions, and mitigations.

CVE-2026-8829 Overview

CVE-2026-8829 is a use-after-free vulnerability [CWE-416] in the HTML::Entities Perl module shipped with the HTML-Parser distribution. Versions before 3.84 contain a flaw in the XS routine that implements _decode_entities, where a cached pointer into a hash value SV becomes invalid after a buffer reallocation. Decoding attacker-controlled HTML entities can read freed heap memory and copy adjacent contents into the destination scalar. The defect affects Perl applications and web frameworks that process untrusted HTML through HTML::Entities::decode_entities or _decode_entities.

Critical Impact

Heap memory disclosure during HTML entity decoding may leak adjacent process memory into application-visible strings, potentially exposing secrets handled by the same Perl interpreter.

Affected Products

  • HTML::Entities (HTML-Parser distribution) versions before 3.84
  • Perl applications that load the vulnerable HTML-Parser XS module
  • Downstream distributions packaging libhtml-parser-perl before 3.84

Discovery Timeline

  • 2026-06-04 - CVE-2026-8829 published to NVD
  • 2026-06-04 - Issue discussed on the OpenWall oss-security list
  • 2026-06-04 - Last updated in NVD database

Technical Details for CVE-2026-8829

Vulnerability Analysis

The vulnerability lives in util.c of the HTML-Parser distribution, inside the XS implementation backing HTML::Entities::_decode_entities. The routine caches a char *repl pointer obtained from the PV buffer of an entity-value scalar returned by hv_fetch on the entity2char hash. When the caller passes an input SV that is the same SV stored as a value in entity2char, and that value embeds its own key as an entity reference, decoding triggers a recursive expansion path. A subsequent call to grow_gap() reallocates the underlying PV buffer of that SV. The original allocation is freed, but repl still points into it. The copy loop then reads repl_len bytes from the freed allocation and writes them into the destination SV.

Root Cause

The root cause is a stale pointer lifetime assumption. The XS code assumes the repl pointer remains valid across operations that may resize the same SV's storage. Perl's SV machinery is free to relocate the PV buffer when grow_gap() adjusts the in-place decode region, invalidating any cached interior pointer.
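The hazard can be illustrated outside of Perl with a minimal sketch. The `buf_t` type and the `buf_reserve` and `buf_append_self` functions below are hypothetical stand-ins for an SV's PV storage and its growth path, not code from HTML-Parser. The sketch shows one general way to stay safe when the source bytes live in the buffer being grown: track the source as an offset and re-derive the pointer only after the reallocation has happened.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer standing in for an SV's PV storage. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} buf_t;

/* Ensure capacity for at least `need` bytes. realloc may move `data`,
 * so every pointer previously derived from b->data becomes stale. */
static int buf_reserve(buf_t *b, size_t need)
{
    char *p;
    if (need <= b->cap)
        return 0;
    p = realloc(b->data, need);
    if (p == NULL)
        return -1;
    b->data = p;
    b->cap  = need;
    return 0;
}

/* Append n bytes of the buffer's own contents, starting at offset `off`.
 * The source is held as an offset, never as a cached pointer, and the
 * address is computed only after buf_reserve has returned. */
static int buf_append_self(buf_t *b, size_t off, size_t n)
{
    assert(off <= b->len && n <= b->len - off);
    if (n == 0)
        return 0;
    if (buf_reserve(b, b->len + n) != 0)
        return -1;
    /* Source [off, off+n) and destination [len, len+n) cannot overlap. */
    memcpy(b->data + b->len, b->data + off, n);
    b->len += n;
    return 0;
}
```

Caching `b->data + off` in a local variable before calling `buf_reserve` would reproduce the class of bug described above: the copy would read from an allocation that `realloc` may already have released.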

Attack Vector

Exploitation requires the decode routine to operate on a scalar that is itself a value in the entity2char table, with that value containing an entity reference to its own key. An attacker who can influence input reaching such a code path supplies entity references that trigger the recursive expansion, and the reallocation during decoding causes the freed buffer to be read back. The disclosed bytes are written into the output string and may surface in rendered HTML, log output, or downstream serialization, depending on the application.

```diff
    char *repl;
    STRLEN repl_len;
+   char *repl_allocated = 0;
    char buf[UTF8_MAXLEN];
    int repl_utf8;
    int high_surrogate = 0;
```

Source: GitHub Commit Patch

The patch introduces a repl_allocated ownership variable so the routine can hold its own copy of the replacement bytes rather than aliasing into a buffer it does not control.
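The ownership idea can be sketched in plain C. This is an illustration of the pattern only, not the upstream change: the function `append_replacement` and its parameters are hypothetical, and only the variable name `repl_allocated` is borrowed from the patch excerpt. The function takes a private copy of the replacement bytes before the destination is allowed to grow, so the copy stays valid even when `repl` points into the destination buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical decode step: *dst is a heap buffer of *dst_cap bytes that
 * currently holds *dst_len bytes. `repl` may point into *dst itself. */
static int append_replacement(char **dst, size_t *dst_len, size_t *dst_cap,
                              const char *repl, size_t repl_len)
{
    char *repl_allocated;   /* owned copy of the replacement bytes */
    int   rc = -1;

    assert(dst != NULL && *dst != NULL && repl != NULL);

    /* Copy first, so a later reallocation cannot invalidate the source. */
    repl_allocated = malloc(repl_len ? repl_len : 1);
    if (repl_allocated == NULL)
        return -1;
    memcpy(repl_allocated, repl, repl_len);

    if (*dst_len + repl_len > *dst_cap) {
        size_t new_cap = *dst_len + repl_len;
        char  *p = realloc(*dst, new_cap);   /* may move the old block */
        if (p == NULL)
            goto done;
        *dst     = p;
        *dst_cap = new_cap;
    }

    /* Read from the owned copy, never from the possibly stale `repl`. */
    memcpy(*dst + *dst_len, repl_allocated, repl_len);
    *dst_len += repl_len;
    rc = 0;

done:
    free(repl_allocated);   /* released on every path */
    return rc;
}
```

The trade-off is one extra allocation and copy per aliased replacement in exchange for a source pointer whose lifetime the routine fully controls.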

Detection Methods for CVE-2026-8829

Indicators of Compromise

  • Unexpected non-printable or binary bytes appearing in fields produced by HTML::Entities::decode_entities output.
  • Crashes or ASAN heap-use-after-free reports inside Parser.so (the HTML-Parser XS object that also backs HTML::Entities) when fuzzing entity decoding.
  • Anomalous strings in application logs or HTTP responses containing fragments that resemble in-process secrets.
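The first indicator above can be checked mechanically. The sketch below is a hypothetical helper, `count_suspicious_bytes`, that counts bytes rarely seen in legitimately decoded HTML text: NUL, C0 control characters other than tab, line feed, and carriage return, and DEL. The choice of byte classes is an assumption for illustration and should be tuned to the fields being inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Count bytes that are unusual in decoded HTML text.
 * Tab, LF and CR are treated as ordinary whitespace. */
static size_t count_suspicious_bytes(const unsigned char *s, size_t len)
{
    size_t n = 0;
    size_t i;

    assert(s != NULL || len == 0);
    for (i = 0; i < len; i++) {
        unsigned char c = s[i];
        if (c == '\t' || c == '\n' || c == '\r')
            continue;
        if (c < 0x20 || c == 0x7F)
            n++;
    }
    return n;
}
```

A nonzero count is only a hint: binary bytes in a decoded field can have benign causes, so a hit should prompt review of the code path rather than be treated as proof of exploitation.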

Detection Strategies

  • Inventory installed HTML-Parser versions across hosts using perl -MHTML::Parser -e 'print $HTML::Parser::VERSION' and flag any below 3.84.
  • Run the upstream regression test t/entities.t shipped in the fix commit against locally built modules to confirm patch presence.
  • Build the module with AddressSanitizer in pre-production to catch use-after-free conditions during HTML processing.

Monitoring Recommendations

  • Monitor Perl worker processes for abnormal terminations and segmentation faults correlated with HTML processing endpoints.
  • Alert on outbound responses or log entries containing high-entropy or non-UTF-8 byte sequences from fields known to pass through entity decoding.
  • Track package manager events that downgrade libhtml-parser-perl or HTML-Parser to versions earlier than 3.84.
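For the non-UTF-8 alerting recommendation, a strict well-formedness check is one possible building block. The function `is_valid_utf8` below is a hypothetical helper written for this article, not part of any monitoring product. It rejects truncated sequences, stray continuation bytes, overlong encodings, surrogate code points, and values above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if s[0..len) is well-formed UTF-8, otherwise 0. */
static int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);
    while (i < len) {
        unsigned char c = s[i];
        size_t need, k;          /* number of continuation bytes expected */
        unsigned long cp, min;   /* decoded code point and its minimum value */

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;           /* stray continuation or invalid lead byte */

        if (len - i - 1 < need)
            return 0;            /* sequence truncated at end of input */
        for (k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;            /* overlong, out of range, or surrogate */
        i += need + 1;
    }
    return 1;
}
```

This check only applies to fields that are expected to contain UTF-8. Fields carrying Latin-1 or raw bytes by design would need a different test.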

How to Mitigate CVE-2026-8829

Immediate Actions Required

  • Upgrade HTML-Parser to version 3.84 or later on all systems running Perl workloads that decode HTML entities.
  • Rebuild and redeploy any application bundles, containers, or PAR archives that vendor the older XS module.
  • Audit application code paths that feed untrusted HTML into decode_entities and confirm post-upgrade behavior with the new test in t/entities.t.

Patch Information

The fix is committed upstream in the libwww-perl/HTML-Parser repository and is included in HTML-Parser 3.84. Review the GitHub Pull Request and the GitHub Commit Patch for the source change. Background discussion is available on the OpenWall OSS-Security Discussion.

Workarounds

  • Avoid calling HTML::Entities::_decode_entities directly with SVs that may alias entries in the entity2char hash.
  • Pre-validate or strip HTML entity references from untrusted input before invoking decode routines on shared scalars.
  • Where upgrade is not yet possible, copy input strings with my $copy = "$input" before decoding to break SV identity with hash values.

```bash
# Install the fixed release via cpanm (this pins exactly 3.84;
# plain `cpanm HTML::Parser` installs the latest release instead)
cpanm HTML::Parser@3.84

# Debian/Ubuntu
apt-get update && apt-get install --only-upgrade libhtml-parser-perl

# Verify installed version
perl -MHTML::Parser -e 'print "HTML::Parser $HTML::Parser::VERSION\n"'
```

Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.
