CVE-2025-6638 Overview
CVE-2025-6638 is a Regular Expression Denial of Service (ReDoS) vulnerability in the Hugging Face Transformers library. The flaw resides in the remove_language_code() method of the MarianTokenizer class. The affected release, 4.52.4, uses an inefficient regular expression that exhibits catastrophic backtracking when processing crafted input strings containing malformed language code patterns. An attacker who can submit input to a service that tokenizes text with MarianTokenizer can trigger excessive CPU consumption, leading to denial of service. The issue was resolved in Transformers 4.53.0 by removing the regular expression entirely. The vulnerability is tracked under [CWE-1333: Inefficient Regular Expression Complexity].
Critical Impact
Network-reachable inference endpoints using MarianTokenizer can be rendered unavailable by a single crafted input string that exhausts CPU resources.
Affected Products
- Hugging Face Transformers 4.52.4
- Python applications importing MarianTokenizer from transformers.models.marian
- Inference services and pipelines invoking Marian translation models
Discovery Timeline
- 2025-09-12 - CVE-2025-6638 published to NVD
- 2025-10-21 - Last updated in NVD database
Technical Details for CVE-2025-6638
Vulnerability Analysis
The defect lives in src/transformers/models/marian/tokenization_marian.py. The MarianTokenizer.remove_language_code() method applied a regular expression to strip language code prefixes such as >>fr<< from input text before tokenization. The pattern's structure permitted catastrophic backtracking: when an attacker supplies a string that nearly matches the language code shape but contains repeating ambiguous characters, the regex engine explores an exponential number of match paths. CPU time therefore grows exponentially with the length of the ambiguous run, blocking the worker thread executing tokenization. Because tokenization runs synchronously before model inference in most translation pipelines, a single malicious request can stall the entire serving process.
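The snippet below demonstrates the failure mode in isolation. The pattern is illustrative, not the one shipped in tokenization_marian.py: it has the same nested-quantifier ambiguity, run against near-miss inputs shaped like Marian's >>fr<< markers.
# Illustrative ReDoS demo -- NOT the exact pattern from tokenization_marian.py
import re
import time

pattern = re.compile(r">>(\w+)*<<")  # (\w+)* is the ambiguous, backtracking-prone part

for n in range(18, 26, 2):
    payload = ">>" + "a" * n + "!"   # almost matches, but the trailing "!" forces failure
    start = time.perf_counter()
    pattern.match(payload)           # the engine retries ~2**(n-1) partitions of the "a" run
    print(f"n={n:2d}  {time.perf_counter() - start:.3f}s")
Each additional pair of characters roughly quadruples the runtime, which is the signature of catastrophic backtracking.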
Root Cause
The root cause is inefficient regular expression complexity [CWE-1333]. The regex used to detect and remove language code markers contained overlapping quantifiers that produced ambiguous matches against partially malformed inputs. The maintainers' fix abandons regex matching altogether and replaces it with deterministic string operations, eliminating the backtracking surface.
Attack Vector
Exploitation requires only the ability to supply input text to a service that calls MarianTokenizer. No authentication or user interaction is needed. Common attack surfaces include public translation APIs, chatbots, document processors, and any HTTP endpoint that forwards user input into a Marian-based pipeline. Confidentiality and integrity are not impacted; the effect is sustained CPU exhaustion on the host running tokenization.
# Patch excerpt: src/transformers/models/marian/tokenization_marian.py
from shutil import copyfile
from typing import Any, Optional, Union
-import regex as re
import sentencepiece
from ...tokenization_utils import PreTrainedTokenizer
Source: Hugging Face Transformers commit 47c34fb
The patch removes the regex import and rewrites remove_language_code() using non-regex string parsing, eliminating backtracking risk.
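As a rough illustration (not the actual patched code), a deterministic replacement can scan for a leading >>...<< marker with plain string operations, which costs linear time and cannot backtrack:
# Minimal sketch of the non-regex approach; the real method's signature and
# behavior in transformers 4.53.0 may differ.
def remove_language_code(text: str) -> tuple[list[str], str]:
    if text.startswith(">>"):
        end = text.find("<<", 2)    # first closing marker after the opener
        if end != -1:
            return [text[: end + 2]], text[end + 2 :].lstrip()
    return [], text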
Detection Methods for CVE-2025-6638
Indicators of Compromise
- Sustained 100% CPU utilization on Python worker processes loading transformers.models.marian.tokenization_marian.
- Request latency spikes or timeouts on translation endpoints correlating with single inbound requests.
- Application logs showing tokenization calls that never return for specific inputs containing repeated > or < characters.
Detection Strategies
- Inventory Python environments for transformers==4.52.4 using pip list or SBOM tooling, and flag instances importing MarianTokenizer.
- Inspect application traces for unusually long execution time inside MarianTokenizer.remove_language_code frames.
- Enforce input-length caps and character-class filtering at the API gateway, and alert when payloads deviate from expected language code formats.
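A minimal version check along these lines can run in CI or at service startup; it uses the standard library plus packaging, which is assumed to be available (it ships with modern pip environments).
# Flag a vulnerable transformers install (CVE-2025-6638 affects 4.52.4)
from importlib.metadata import PackageNotFoundError, version
from packaging.version import Version

try:
    installed = Version(version("transformers"))
except PackageNotFoundError:
    installed = None

if installed and installed < Version("4.53.0"):
    print(f"VULNERABLE to CVE-2025-6638: transformers {installed} < 4.53.0")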
Monitoring Recommendations
- Track per-request CPU time on inference workers and trigger alerts when tokenization exceeds a defined threshold (for example, 500 ms).
- Monitor process restarts, OOM events, and worker pool saturation on services hosting Marian models.
- Forward web application firewall logs to a SIEM such as Singularity Data Lake to correlate anomalous payload patterns with backend resource exhaustion.
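A hedged sketch of the per-request CPU check suggested above, assuming a tokenizer object supplied by the surrounding service. A true ReDoS hang never returns, so this after-the-fact measurement complements, rather than replaces, the hard limits under Workarounds below.
# Alert when a single tokenization call exceeds the 500 ms CPU budget
import logging
import time

CPU_BUDGET_S = 0.5

def timed_tokenize(tokenizer, text: str):
    start = time.process_time()     # CPU time, not wall-clock
    encoding = tokenizer(text)
    spent = time.process_time() - start
    if spent > CPU_BUDGET_S:
        logging.warning("slow tokenization: %.3fs CPU for %d-char input", spent, len(text))
    return encoding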
How to Mitigate CVE-2025-6638
Immediate Actions Required
- Upgrade transformers to version 4.53.0 or later in all production, staging, and development environments.
- Audit dependency manifests (requirements.txt, pyproject.toml, Pipfile.lock) and rebuild container images that pin vulnerable versions.
- Restart inference services after upgrade to ensure the patched tokenizer is loaded.
Patch Information
The fix is contained in commit 47c34fba5c303576560cb29767efb452ff12b8be, released as part of Hugging Face Transformers 4.53.0. The maintainers replaced regex-based language code stripping with plain string operations. Reference: Hugging Face Transformers commit 47c34fb and the Huntr bounty disclosure.
Workarounds
- Enforce a strict maximum input length (for example, 2,048 characters) before passing text to MarianTokenizer.
- Validate or strip language code prefixes (>>xx<<) at the application layer using a deterministic parser before invoking the tokenizer.
- Run tokenization in a worker process with a hard CPU time limit (resource.setrlimit(RLIMIT_CPU, ...)) so malicious inputs cannot block the main service; a minimal sketch follows this list.
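The sketch below illustrates the worker-process workaround. It assumes a Unix host (the resource module is not available on Windows) and uses an example Marian checkpoint; a production service would load the tokenizer once per worker rather than per call.
# Run tokenization under a hard CPU cap; SIGXCPU kills the worker if exceeded
import resource
from multiprocessing import Process, Queue

CPU_LIMIT_S = 2  # illustrative per-call CPU budget

def _worker(text: str, out: Queue) -> None:
    resource.setrlimit(resource.RLIMIT_CPU, (CPU_LIMIT_S, CPU_LIMIT_S))
    from transformers import MarianTokenizer
    tok = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr")  # example model
    out.put(tok(text)["input_ids"])

def safe_tokenize(text: str, wall_timeout: float = 10.0):
    out: Queue = Queue()
    proc = Process(target=_worker, args=(text, out))
    proc.start()
    proc.join(wall_timeout)
    if proc.is_alive():              # wall-clock backstop
        proc.terminate()
        proc.join()
    if out.empty():
        raise TimeoutError("tokenization exceeded its CPU/time budget")
    return out.get()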
# Upgrade to the patched release
pip install --upgrade "transformers>=4.53.0"
# Verify installed version
python -c "import transformers; print(transformers.__version__)"