CVE-2022-24836 Overview
CVE-2022-24836 is a Regular Expression Denial of Service (ReDoS) vulnerability affecting Nokogiri, the widely used open source XML and HTML parsing library for Ruby. Versions prior to v1.13.4 contain an inefficient regular expression in the HTML encoding detection mechanism that backtracks excessively when processing specially crafted HTML documents. This algorithmic complexity flaw allows remote attackers to cause a denial of service by supplying input that triggers the excessive backtracking.
Critical Impact
Applications using vulnerable Nokogiri versions can be rendered unresponsive through network-accessible input, potentially affecting web applications, API services, and backend systems that process untrusted HTML content.
Affected Products
- Nokogiri versions prior to 1.13.4 (Ruby gem)
- Fedora 34, 35, and 36
- Debian Linux 9.0 and 10.0
- Apple macOS (various versions)
Discovery Timeline
- April 11, 2022 - CVE-2022-24836 published to NVD
- November 21, 2024 - Last updated in NVD database
Technical Details for CVE-2022-24836
Vulnerability Analysis
This vulnerability is a Regular Expression Denial of Service (ReDoS) flaw in the encoding detection performed by HTML4::EncodingReader. The vulnerable pattern, which matches an XML declaration at the beginning of an HTML document, places two quantifiers that can match the same characters next to each other. Given carefully crafted input, the regex engine retries a rapidly growing number of ways to split those characters between the two quantifiers before the match finally fails.
The vulnerable code path is triggered when Nokogiri attempts to auto-detect the encoding of an HTML document by scanning for an XML declaration (<?xml ... ?>). The inefficient pattern allowed attackers to supply input that forced the regex engine into computationally expensive backtracking operations, effectively exhausting CPU resources and blocking application threads.
Root Cause
The root cause is the regular expression used by the detect_encoding method in lib/nokogiri/html4/document.rb. The original pattern, /\A(<\?xml[ \t\r\n]+[^>]*>)/, places [ \t\r\n]+ directly before [^>]*. Because every whitespace character also matches [^>], a run of whitespace can be divided between the two quantifiers in many ways. When the input contains a long whitespace run after <?xml and no closing >, a backtracking engine tries each division before giving up, so matching time grows superlinearly (roughly quadratically) with the length of the run.
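The fix simplified the regex by changing [ \t\r\n]+ to [ \t\r\n], removing the + quantifier that allowed unbounded repetition of whitespace characters before the negated character class. The patched pattern still accepts the same declarations, because any additional whitespace is consumed by [^>]*, but there is now only one way to divide the input between the two parts.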
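The difference in work can be approximated with a simplified counting model. This is an illustration only, not Ruby's actual regex engine: it assumes an input of <?xml followed by n whitespace characters with no closing >, and counts how many positions a naive backtracking matcher would test for > before failing. The function name closing_bracket_attempts is hypothetical.

```ruby
# Simplified model of a naive backtracking matcher (not Ruby's real engine).
# Assumed input shape: "<?xml" followed by n whitespace characters and no ">".
# Returns how many positions are tested for the closing ">" before failing.
def closing_bracket_attempts(n, patched:)
  # Patched pattern: the whitespace class consumes exactly one character.
  # Original pattern: `+` lets it consume any length from n down to 1.
  whitespace_lengths = patched ? [1] : n.downto(1)

  # After k whitespace characters, `[^>]*` can stop after any of the
  # remaining n - k characters (or consume none), which gives n - k + 1
  # places where ">" is tested.
  whitespace_lengths.sum { |k| n - k + 1 }
end

[10, 100, 1_000].each do |n|
  original = closing_bracket_attempts(n, patched: false)
  patched  = closing_bracket_attempts(n, patched: true)
  puts "n=#{n}: original=#{original}, patched=#{patched}"
end
```

Under this model the original pattern performs n(n+1)/2 tests and the patched pattern performs n, which is why long whitespace runs are disproportionately expensive for the original.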
Attack Vector
The vulnerability is exploitable over the network without authentication. An attacker can submit malicious HTML content to any application endpoint that processes HTML using Nokogiri. Attack scenarios include:
- Web applications accepting user-supplied HTML for sanitization or transformation
- API endpoints processing HTML payloads
- Web scrapers or crawlers parsing untrusted external content
- Email processing systems handling HTML email bodies
# Security patch in lib/nokogiri/html4/document.rb - fix(perf): HTML4::EncodingReader detection
end
def self.detect_encoding(chunk)
- (m = chunk.match(/\A(<\?xml[ \t\r\n]+[^>]*>)/)) &&
+ (m = chunk.match(/\A(<\?xml[ \t\r\n][^>]*>)/)) &&
(return Nokogiri.XML(m[1]).encoding)
if Nokogiri.jruby?
Source: GitHub Commit for Nokogiri
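As a quick sanity check, the two patterns from the patch can be compared directly on a handful of inputs. The sketch below uses only Ruby's built-in Regexp and does not load Nokogiri; the sample strings are made up for illustration. It checks matching behavior only, not timing, because newer Ruby versions include regex engine optimizations that may hide the slowdown.

```ruby
# Patterns copied from the patch shown above.
ORIGINAL_PATTERN = /\A(<\?xml[ \t\r\n]+[^>]*>)/
PATCHED_PATTERN  = /\A(<\?xml[ \t\r\n][^>]*>)/

# Illustrative inputs (assumptions, not taken from the advisory).
samples = [
  %(<?xml version="1.0" encoding="UTF-8"?><html></html>),
  %(<?xml \t\r\n version="1.0"?><html></html>),
  "<?xml>",                 # no whitespace after "xml"
  "<html></html>",          # no XML declaration at all
  "<?xml" + (" " * 2_000)   # whitespace run with no closing ">"
]

# For each sample, capture group 1 from both patterns (nil when no match).
results = samples.map do |chunk|
  [ORIGINAL_PATTERN.match(chunk)&.[](1), PATCHED_PATTERN.match(chunk)&.[](1)]
end

results.each_with_index do |(original, patched), i|
  puts "sample #{i}: #{original == patched ? 'same result' : 'DIFFERENT result'}"
end
```

Both patterns should agree on every sample, which is consistent with the patch changing only how much work is done, not which declarations are recognized.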
Detection Methods for CVE-2022-24836
Indicators of Compromise
- Abnormally high CPU utilization on application servers processing HTML content
- Unusually long response times for web requests involving HTML parsing
- Thread pool exhaustion in Ruby web applications
- Application log entries showing timeout errors during HTML processing operations
Detection Strategies
- Monitor application performance metrics for sudden spikes in CPU usage during HTML processing operations
- Implement request timeout monitoring to detect processing delays exceeding normal thresholds
- Audit Ruby Gemfile.lock files across your environment to identify Nokogiri versions below 1.13.4
- Deploy dependency scanning tools to continuously monitor for vulnerable library versions
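The Gemfile.lock audit can be sketched in a few lines of Ruby. This is a minimal example, assuming the usual lockfile layout in which resolved gems appear as four-space-indented "name (version)" lines; the helper name vulnerable_nokogiri_versions and the sample lockfile are hypothetical, and a dedicated dependency scanner remains the more robust option.

```ruby
require "rubygems" # provides Gem::Version (loaded by default in modern Ruby)

FIXED_VERSION = Gem::Version.new("1.13.4")

# Returns the nokogiri versions pinned in the given Gemfile.lock text that
# predate the fix. Platform suffixes such as "-x86_64-linux" are ignored.
def vulnerable_nokogiri_versions(lockfile_text)
  lockfile_text
    .scan(/^ {4}nokogiri \(([^)\s-]+)(?:-[^)]*)?\)$/)
    .flatten
    .uniq
    .select { |v| Gem::Version.correct?(v) && Gem::Version.new(v) < FIXED_VERSION }
end

# Made-up lockfile excerpt for illustration.
sample = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:
      loofah (2.18.0)
        nokogiri (>= 1.5.9)
      nokogiri (1.13.3-x86_64-linux)
      racc (1.6.0)
LOCK

puts vulnerable_nokogiri_versions(sample).inspect
```

To cover a whole checkout, the same helper could be applied to each file returned by Dir.glob("**/Gemfile.lock").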
Monitoring Recommendations
- Configure alerting thresholds for CPU utilization on servers running Ruby applications with Nokogiri dependencies
- Implement request duration monitoring with alerting for outliers in HTML processing endpoints
- Enable detailed logging for HTML parsing operations to capture forensic evidence of exploitation attempts
- Monitor application thread pools for signs of thread starvation
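Request duration monitoring around HTML processing could look like the following sketch. The threshold, the helper name measure_html_processing, and the log format are assumptions; a real deployment would more likely report to its existing metrics or APM system. Note that the warning is emitted only after the block returns, so this measures slow operations but does not interrupt them.

```ruby
SLOW_PARSE_SECONDS = 0.5 # hypothetical alerting threshold

# Runs the block, measures wall-clock time with a monotonic clock, and writes
# a warning to `logger` when the duration exceeds `threshold`.
# Returns [block_result, elapsed_seconds].
def measure_html_processing(label, threshold: SLOW_PARSE_SECONDS, logger: $stderr)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  if elapsed > threshold
    logger.puts("slow HTML processing: #{label} took #{elapsed.round(3)}s")
  end
  [result, elapsed]
end

# Usage: wrap whatever call parses the untrusted HTML.
result, elapsed = measure_html_processing("comment preview") { "<p>hi</p>".upcase }
puts "result=#{result} elapsed=#{elapsed.round(6)}s"
```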
How to Mitigate CVE-2022-24836
Immediate Actions Required
- Upgrade Nokogiri to version 1.13.4 or later immediately using bundle update nokogiri
- Audit all Ruby applications in your environment for Nokogiri dependencies
- Implement request timeouts on endpoints that process untrusted HTML content
- Consider implementing input size limits for HTML processing endpoints as a defense-in-depth measure
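The timeout and input size measures can be combined in a small wrapper, sketched below with Ruby's standard Timeout module. The limits, the HtmlInputRejected class, and the with_parse_guards helper are hypothetical names chosen for this example, and the wrapper limits the impact of slow parsing rather than fixing the vulnerable pattern.

```ruby
require "timeout"

MAX_HTML_BYTES = 1_000_000   # hypothetical size limit
PARSE_TIMEOUT_SECONDS = 2    # hypothetical time budget

class HtmlInputRejected < StandardError; end

# Rejects oversized input up front, then runs the block (the actual HTML
# processing) under a time limit. Both failures surface as HtmlInputRejected.
def with_parse_guards(html, max_bytes: MAX_HTML_BYTES, timeout: PARSE_TIMEOUT_SECONDS)
  if html.bytesize > max_bytes
    raise HtmlInputRejected, "input too large (#{html.bytesize} bytes)"
  end
  Timeout.timeout(timeout) { yield html }
rescue Timeout::Error
  raise HtmlInputRejected, "processing exceeded #{timeout}s"
end

# Usage: the block stands in for the call that parses untrusted HTML.
length = with_parse_guards("<p>hi</p>") { |html| html.length }
puts "processed #{length} characters"
```

Rejecting on size first keeps the cheap check ahead of the expensive one; the timeout then bounds whatever work remains.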
Patch Information
The vulnerability has been addressed in Nokogiri version 1.13.4 and later. The fix is available in the GitHub Commit for Nokogiri. The patch modifies the problematic regular expression pattern to eliminate the catastrophic backtracking condition by removing the unnecessary + quantifier.
For distribution-specific patches, refer to the following advisories:
- Debian LTS Announcement May 2022
- Gentoo GLSA 202208-29
- Apple Support Article HT213532 for macOS updates
Workarounds
- No known workaround removes the vulnerable code path; upgrading to Nokogiri >= 1.13.4 is the only complete fix. The measures below only limit the impact until the upgrade is done
- Implement strict request timeouts as a temporary mitigation to limit the impact of exploitation attempts
- Consider rate limiting on endpoints that process HTML to reduce the impact of potential attacks
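# Configuration example - Update Nokogiri in your Ruby project
# Set the minimum safe version on the nokogiri entry in your Gemfile.
# Edit the existing entry rather than appending a second one, since Bundler
# rejects duplicate declarations with different requirements:
#   gem "nokogiri", ">= 1.13.4"
bundle update nokogiri
# Verify the version resolved in the bundle
bundle info nokogiri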
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.

