CVE-2026-33347 Overview
CVE-2026-33347 is an allowlist bypass vulnerability in league/commonmark, a popular PHP Markdown parser. The DomainFilteringAdapter in the Embed extension contains a flaw in its domain-matching regex that fails to properly assert hostname boundaries. This allows attackers to bypass domain allowlists by crafting malicious domains that include allowed domains as substrings.
Critical Impact
Attackers can bypass security controls intended to restrict embedded content to trusted domains, potentially enabling Cross-Site Scripting (XSS) attacks or loading malicious content from attacker-controlled servers.
Affected Products
- league/commonmark versions 2.3.0 to 2.8.1
- Applications using the Embed extension with DomainFilteringAdapter
- PHP applications relying on domain allowlist filtering for embedded content
Discovery Timeline
- 2026-03-24 - CVE-2026-33347 published to NVD
- 2026-03-25 - Last updated in NVD database
Technical Details for CVE-2026-33347
Vulnerability Analysis
This vulnerability exists within the DomainFilteringAdapter component of the league/commonmark Embed extension. The core issue is an improperly constructed regular expression used to validate domains against an allowlist. When checking whether an embedded URL's domain matches an allowed domain, the regex does not assert a hostname boundary after the allowed domain, so the match can succeed even when the hostname continues with further labels.
As a result, an attacker can craft a domain like youtube.com.evil.com that passes validation when youtube.com is on the allowlist. The regex matches the substring youtube.com within the malicious domain without verifying that the hostname ends there. This defeats the allowlist: an attacker-controlled domain that embeds an allowed domain in this way bypasses the security check.
The vulnerability is classified as CWE-79 (Cross-Site Scripting) because bypassing the domain allowlist can enable the injection of malicious embedded content, which may execute scripts or load resources from untrusted sources in the context of the victim's browser.
Root Cause
The root cause is a missing hostname boundary assertion in the domain-matching regex within the DomainFilteringAdapter. The regular expression does not anchor the match, so it cannot tell whether the allowed domain is where the hostname ends or merely a substring of a longer hostname. This is a common regex construction error: a literal string is matched without considering how it might appear as part of a larger string.
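The effect of a missing boundary can be illustrated with a minimal Python sketch. The two patterns below are illustrative assumptions, not the regex used by league/commonmark; the allowlist and the helper name `is_allowed` are hypothetical, and the patterns are applied to a bare hostname rather than a full URL.

```python
import re

# Hypothetical allowlist, for illustration only.
ALLOWED = ["youtube.com", "vimeo.com"]
alternatives = "|".join(re.escape(domain) for domain in ALLOWED)

# Flawed: optional subdomain labels, then an allowed domain, but nothing
# asserts that the hostname ends after the allowed domain.
FLAWED = re.compile(r"^(?:[^.]+\.)*(?:" + alternatives + r")")

# Bounded: same pattern, but the allowed domain must be followed by the
# end of the hostname (\Z), so trailing labels are rejected.
BOUNDED = re.compile(r"^(?:[^.]+\.)*(?:" + alternatives + r")\Z")


def is_allowed(pattern, hostname):
    """Return True if the hostname matches the given allowlist pattern."""
    return pattern.search(hostname.lower()) is not None


if __name__ == "__main__":
    for host in ("youtube.com", "www.youtube.com", "youtube.com.evil.com"):
        print(host, is_allowed(FLAWED, host), is_allowed(BOUNDED, host))
```

With the flawed pattern, youtube.com.evil.com is accepted because the match stops after youtube.com and ignores the remaining labels. The bounded pattern rejects it while still accepting the exact domain and its subdomains.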
Attack Vector
The attack is network-based and requires no authentication or user interaction. A typical exploitation sequence looks like this:
- The attacker identifies that a target application uses league/commonmark with the Embed extension
- The attacker determines which domains are on the allowlist (e.g., youtube.com, vimeo.com)
- The attacker registers or controls a domain that begins with an allowed domain (e.g., youtube.com.attacker.com)
- The attacker submits Markdown content with an embed URL pointing to that domain
- The DomainFilteringAdapter incorrectly validates the malicious domain as allowed
- Content from the attacker's server is embedded and rendered to users
This bypass could lead to the delivery of malicious scripts, phishing content, or other harmful resources to users who trust the application's domain filtering.
Detection Methods for CVE-2026-33347
Indicators of Compromise
- Embedded content URLs containing allowed domains as substrings followed by additional domain components (e.g., youtube.com.malicious.com)
- Unexpected outbound requests to domains that appear similar to allowlisted domains
- User reports of suspicious embedded content or unexpected behavior in Markdown-rendered pages
Detection Strategies
- Implement regex-based log analysis to identify embed URLs where an allowed domain appears as a non-terminal substring
- Monitor web application logs for embed requests to domains not explicitly in the allowlist
- Deploy Content Security Policy (CSP) reporting to detect attempts to load content from unexpected origins
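The log analysis strategy above could be sketched roughly as follows. This is a minimal Python example under stated assumptions: the allowlist, the log line format, and the function names are all hypothetical, and URLs are assumed to appear unencoded in the logs (percent-encoded URLs would need decoding first).

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the domains your application permits.
ALLOWED = ("youtube.com", "vimeo.com")

# Rough URL extractor for free-form log lines.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def is_exact_or_subdomain(host):
    """True if host is an allowed domain or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in ALLOWED)


def is_lookalike(host):
    """True if an allowed domain appears in host followed by further labels."""
    if is_exact_or_subdomain(host):
        return False
    return any(host.startswith(d + ".") or f".{d}." in host for d in ALLOWED)


def suspicious_hosts(line):
    """Return lookalike hostnames found in a single log line."""
    hits = []
    for url in URL_RE.findall(line):
        try:
            host = (urlsplit(url).hostname or "").lower().rstrip(".")
        except ValueError:
            continue  # malformed URL; skip it in this sketch
        if host and is_lookalike(host):
            hits.append(host)
    return hits
```

For example, calling `suspicious_hosts` on a line that contains https://youtube.com.malicious.com/watch would report that hostname, while a line containing https://www.youtube.com/watch would report nothing.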
Monitoring Recommendations
- Enable verbose logging for the Embed extension to capture all domain validation decisions
- Set up alerts for embed requests to domains that contain but don't exactly match allowlisted domains
- Regularly audit embedded content in user-generated Markdown for suspicious URLs
How to Mitigate CVE-2026-33347
Immediate Actions Required
- Upgrade league/commonmark to version 2.8.2 or later immediately
- Audit existing content for potentially malicious embed URLs that may have bypassed the allowlist
- Review and validate that CSP headers are properly configured to provide defense-in-depth
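As one way to review CSP configuration, the sketch below parses a Content-Security-Policy header value and reports whether framed content is restricted. It assumes embeds are rendered as iframes, which may not hold for every provider, and the function names are hypothetical. It relies on the standard CSP fallback order, where frame-src falls back to child-src and then to default-src.

```python
def parse_csp(header_value):
    """Split a Content-Security-Policy header value into {directive: [sources]}."""
    policy = {}
    for part in header_value.split(";"):
        tokens = part.split()
        if tokens:
            # The first occurrence of a directive wins; later duplicates are ignored.
            policy.setdefault(tokens[0].lower(), tokens[1:])
    return policy


def effective_frame_sources(policy):
    """Return the source list that governs frames, or None if unrestricted."""
    for name in ("frame-src", "child-src", "default-src"):
        if name in policy:
            return policy[name]
    return None


def frame_policy_findings(header_value):
    """Return a list of human-readable problems with the frame policy."""
    sources = effective_frame_sources(parse_csp(header_value))
    if sources is None:
        return ["no frame-src, child-src, or default-src: framing is unrestricted"]
    findings = []
    for src in sources:
        # A bare wildcard or scheme-only source allows frames from any host.
        if src in ("*", "http:", "https:"):
            findings.append(f"overly broad source: {src}")
    return findings
```

An empty result means the policy names explicit frame sources; it does not confirm that those sources are the right ones for your application.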
Patch Information
The vulnerability has been patched in league/commonmark version 2.8.2. The fix adds proper hostname boundary assertions to the domain-matching regex, ensuring that only exact domain matches or valid subdomains are permitted through the allowlist.
For detailed patch information, see the GitHub Security Advisory GHSA-hh8v-hgvp-g3f5 and the commit that addresses this issue.
Workarounds
- If immediate upgrade is not possible, disable the Embed extension entirely until the patch can be applied
- Implement additional server-side validation of embed URLs using strict domain matching before passing content to the Markdown parser
- Deploy restrictive Content Security Policy headers to limit which domains can serve embedded content, providing defense-in-depth
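The strict domain matching workaround could look something like the following Python sketch, run on embed URLs before the Markdown reaches the parser. The allowlist and the function name `is_embed_url_allowed` are hypothetical; the example is meant to show the matching rule, not to mirror the patched library code.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the domains your application permits.
ALLOWED_DOMAINS = frozenset({"youtube.com", "vimeo.com"})


def is_embed_url_allowed(url, allowed=ALLOWED_DOMAINS):
    """Accept only http(s) URLs whose host is an allowed domain or a subdomain of one."""
    try:
        parts = urlsplit(url.strip())
        host = parts.hostname
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ("http", "https") or not host:
        return False
    host = host.lower().rstrip(".")
    # Compare against the parsed hostname, never against the raw URL string.
    return any(host == d or host.endswith("." + d) for d in allowed)
```

Parsing the URL and comparing the hostname by equality or dot-prefixed suffix avoids the boundary problem entirely, since no regex has to decide where the hostname ends. A URL such as https://youtube.com.attacker.com/watch is rejected because its hostname neither equals nor ends with a dot followed by an allowed domain.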
# Update league/commonmark via Composer (quoted so the shell does not interpret ^)
composer require "league/commonmark:^2.8.2"
# Verify the installed version
composer show league/commonmark | grep versions

