CVE-2024-12720 Overview
CVE-2024-12720 is a Regular Expression Denial of Service (ReDoS) vulnerability in the huggingface/transformers library. The flaw resides in tokenization_nougat_fast.py, specifically inside the post_process_single() function. A regular expression in this function takes exponential time on specially crafted input because of catastrophic backtracking. Attackers can submit malicious tokenizer input to force excessive CPU consumption and cause application downtime. The issue affects v4.46.3 and earlier releases of the library. The vulnerability is classified as CWE-1333 (Inefficient Regular Expression Complexity).
Critical Impact
Remote, unauthenticated attackers can exhaust CPU resources and cause denial of service in any application that feeds untrusted text through the Nougat fast tokenizer.
Affected Products
- Hugging Face Transformers v4.46.3 (latest at time of disclosure)
- Prior versions of huggingface/transformers containing tokenization_nougat_fast.py
- Downstream applications and services that invoke the Nougat tokenizer on user-supplied input
Discovery Timeline
- 2025-03-20 - CVE-2024-12720 published to the National Vulnerability Database
- 2025-08-01 - Last updated in the NVD database
Technical Details for CVE-2024-12720
Vulnerability Analysis
The Nougat tokenizer ships with a post-processing routine that normalizes model output before returning text to the caller. Inside post_process_single(), a regular expression scans the input to clean up formatting artifacts. The pattern contains ambiguous quantifiers that overlap on the same input characters. When the engine encounters input designed to maximize alternative match paths, it explores them exhaustively before failing. This catastrophic backtracking causes processing time to grow exponentially with input length.
A single request containing a few kilobytes of crafted text can pin a CPU core for minutes. Repeated requests starve worker threads and degrade or stop service for legitimate users. The attack does not require authentication and works over the network whenever the affected tokenizer is exposed through an inference API or web service.
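The growth described above can be observed with a small timing sketch. The pattern below is a textbook ambiguous expression chosen purely for illustration; it is an assumption and not the expression used in tokenization_nougat_fast.py, and the helper name time_failed_match is hypothetical.

```python
import re
import time

# Illustrative pattern only: NOT the expression from tokenization_nougat_fast.py.
# The inner "a+" and the outer "(...)+" can split a run of 'a' characters in
# many different ways, which is what makes a failed match expensive.
AMBIGUOUS = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return the seconds spent failing to match n 'a' characters plus '!'."""
    text = "a" * n + "!"
    start = time.perf_counter()
    result = AMBIGUOUS.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing '!' guarantees the match fails
    return elapsed


if __name__ == "__main__":
    # Keep n small: each extra character roughly doubles the work.
    for n in (10, 14, 18, 20):
        print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

Exact timings depend on the interpreter and hardware, so none are quoted here. The point is the trend: the input grows by a handful of characters while the time to reject it grows far faster.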
Root Cause
The root cause is an inefficient regular expression in tokenization_nougat_fast.py whose overlapping repetition operators let the regex engine try many redundant ways of matching the same characters. Python's re module uses a backtracking engine, so every one of those alternatives may be explored before a match fails, and the pattern has no atomic grouping or possessive quantifiers to cut that search short. Nothing before the regex call bounds the length or structure of the text being processed.
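As a rough illustration of how removing overlap changes the cost, the sketch below compares a nested repetition with a flat one that accepts the same strings. Both patterns are hypothetical examples and neither is taken from the transformers source; the actual fix should be read from the upstream commit.

```python
import re

# Hypothetical patterns for illustration; neither comes from transformers.
NESTED = re.compile(r"(?:a+)+$")  # inner and outer repetition overlap
FLAT = re.compile(r"a+$")         # single quantifier, same accepted strings


def same_verdict(text):
    """Check that both patterns agree on whether text matches from the start."""
    return bool(NESTED.match(text)) == bool(FLAT.match(text))


if __name__ == "__main__":
    # Short inputs only, so the nested pattern stays cheap.
    for sample in ("aaa", "", "aab", "b", "aaaa!"):
        print(repr(sample), same_verdict(sample))

    # The flat pattern rejects a long non-matching input quickly, because
    # there is only one way to split the run of 'a' characters.
    print(FLAT.match("a" * 5000 + "!"))
```

Python 3.11 and later also accept atomic groups such as (?>...) and possessive quantifiers such as a++, which are another way to stop the engine from revisiting characters it has already consumed. Rewriting the pattern so that no two quantifiers compete for the same characters works on older interpreters as well.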
Attack Vector
The attack vector is network-based and requires no privileges or user interaction. An attacker submits crafted text to any endpoint that routes data through the Nougat fast tokenizer's post_process_single() function. This includes hosted inference APIs, document conversion pipelines, and notebooks that process untrusted user content. The vulnerability does not affect confidentiality or integrity, but it has a high availability impact through CPU exhaustion. See the Huntr bounty report for additional technical context.
Detection Methods for CVE-2024-12720
Indicators of Compromise
- Sustained high CPU usage on Python worker processes running transformers without a corresponding rise in throughput
- Request timeouts or stalled responses from inference endpoints that invoke the Nougat tokenizer
- Stack traces or profiler samples showing extended time inside re module functions called from tokenization_nougat_fast.py
- Inbound requests containing repetitive or pathological character sequences targeting tokenizer endpoints
Detection Strategies
- Profile tokenizer call durations and alert when post_process_single() exceeds a baseline threshold
- Apply request payload size limits and reject inputs that exceed expected document lengths
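- Run regex-heavy calls under a fixed time budget and record any call aborted for exceeding it (Python's built-in re module has no timeout parameter, so the budget has to be enforced from outside the call)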
- Log and correlate spikes in CPU utilization with HTTP request payloads to identify abusive clients
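The duration-profiling strategy in the list above could be sketched as a decorator that logs a warning when a wrapped call runs longer than a threshold. The decorator name alert_if_slow, the logger name, and the threshold value are all assumptions made for this example; they are not part of transformers.

```python
import functools
import logging
import time

# Hypothetical logger name; route it to whatever alerting pipeline is in use.
logger = logging.getLogger("tokenizer.latency")


def alert_if_slow(threshold_s):
    """Log a warning whenever the wrapped callable exceeds threshold_s seconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                if elapsed > threshold_s:
                    logger.warning(
                        "%s took %.3fs (threshold %.3fs)",
                        func.__name__, elapsed, threshold_s,
                    )
        return wrapper
    return decorator


# Usage sketch: wrap whatever function calls the tokenizer's post-processing.
@alert_if_slow(threshold_s=0.5)
def post_process(text):
    # Placeholder body; a real service would invoke the tokenizer here.
    return text.strip()
```

A wrapper like this only reports after the call returns, so it helps with detection and baselining. It does not stop a match that is already running.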
Monitoring Recommendations
- Track per-endpoint p95 and p99 latency for any service that loads huggingface/transformers
- Monitor process-level CPU saturation on inference workers and trigger alerts on sustained 100% utilization
- Capture the transformers package version across the fleet and flag hosts pinned to versions at or below v4.46.3
- Inspect web application firewall logs for repeated submissions of long or unusual character sequences to tokenizer-backed endpoints
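For the version inventory item above, a minimal check might compare the installed version string against the last release this advisory lists as affected. The helper names are hypothetical, and the parser deliberately ignores pre-release and local version suffixes, which is a simplification.

```python
import re
from importlib import metadata

# Last release named as affected in this advisory.
LAST_AFFECTED = (4, 46, 3)


def parse_release(version):
    """Extract the leading numeric components, e.g. '4.47.0.dev0' -> (4, 47, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.lstrip("v"))
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_flagged(version):
    """Return True when version is at or below the last affected release."""
    return parse_release(version) <= LAST_AFFECTED


def installed_is_flagged(package="transformers"):
    """Check the locally installed package; None means it is not installed."""
    try:
        return is_flagged(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None
```

Running installed_is_flagged() on each host, or feeding is_flagged() with versions collected by an inventory tool, gives a simple yes/no signal to aggregate across the fleet.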
How to Mitigate CVE-2024-12720
Immediate Actions Required
- Upgrade huggingface/transformers to a release that includes the upstream fix commit deac971c469bcbb182c2e52da0b82fb3bf54cccf
- Enforce strict input length limits on any endpoint that calls the Nougat fast tokenizer
- Place rate limiting and request timeouts in front of tokenizer-backed inference services
- Audit internal pipelines for use of tokenization_nougat_fast and isolate them from untrusted input until patched
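The input length limit from the list above can be enforced in application code before any text reaches the tokenizer. The sketch below is one possible shape; the limit value, the exception class, and the function name are assumptions to be tuned per deployment.

```python
# Hypothetical limit; choose a value based on the documents the service expects.
MAX_INPUT_CHARS = 20_000


class InputTooLarge(ValueError):
    """Raised when text exceeds the configured length limit."""


def check_length(text, limit=MAX_INPUT_CHARS):
    """Return text unchanged, or raise InputTooLarge if it is over the limit."""
    if len(text) > limit:
        raise InputTooLarge(
            f"input has {len(text)} characters, limit is {limit}"
        )
    return text
```

Because the vulnerable pattern is described as exponential, a length cap on its own is unlikely to be sufficient: a short crafted input may still be expensive. It is best treated as one layer alongside upgrading, rate limiting, and timeouts.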
Patch Information
The maintainers fixed the regex in commit deac971c469bcbb182c2e52da0b82fb3bf54cccf. Review the GitHub commit for the exact regex replacement applied to post_process_single(). Upgrade to a transformers release that incorporates this commit and redeploy any container images or virtual environments that bundle the library.
Workarounds
- Cap request body size at the reverse proxy or API gateway to prevent large inputs from reaching the tokenizer
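- Run calls to post_process_single() in a separate process with a hard time limit and terminate the process when the limit is exceeded; a Python thread cannot be forcibly stopped while it is inside a regex match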
- Disable the Nougat fast tokenizer path and route traffic through an alternative tokenizer if feasible
- Pre-filter input to reject character patterns known to trigger backtracking before invoking the tokenizer
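The process-isolation workaround above might look roughly like the following sketch, which runs the risky step in a child interpreter and kills it when a time budget is exceeded. The child program here applies an illustrative ambiguous regex as a stand-in for the tokenizer call; that regex, the CHILD_CODE constant, and the function name run_with_time_budget are assumptions, not code from transformers.

```python
import subprocess
import sys

# Hypothetical child program: reads text on stdin, applies a regex, and writes
# the result to stdout. A real deployment would call the tokenizer here.
CHILD_CODE = r"""
import re, sys
text = sys.stdin.read()
sys.stdout.write(re.sub(r"(a+)+$", "", text))
"""


def run_with_time_budget(text, timeout_s=2.0):
    """Process text in a child interpreter; return None if the budget is exceeded."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD_CODE],
            input=text,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # subprocess.run kills the child on timeout
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        raise RuntimeError(result.stderr)
    return result.stdout
```

Starting a fresh interpreter per request is expensive, and loading a tokenizer in the child would add more overhead, so a production setup would more likely keep a pool of worker processes and replace any worker that overruns. The sketch only shows the core idea: the time limit is enforced from outside the process doing the matching.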
# Upgrade transformers to a patched release
pip install --upgrade "transformers>4.46.3"
# Verify the installed version
python -c "import transformers; print(transformers.__version__)"

