CVE-2025-6638 Overview
CVE-2025-6638 is a Regular Expression Denial of Service (ReDoS) vulnerability in the Hugging Face Transformers library. The flaw resides in the remove_language_code() method of the MarianTokenizer class. The affected release, 4.52.4, uses an inefficient regular expression that exhibits catastrophic backtracking when processing crafted input strings containing malformed language code patterns. An attacker who can submit input to a service that tokenizes text with MarianTokenizer can trigger excessive CPU consumption, leading to denial of service. The issue was resolved in Transformers 4.53.0 by removing the regular expression entirely. The vulnerability is tracked under [CWE-1333: Inefficient Regular Expression Complexity].
Critical Impact
Network-reachable inference endpoints using MarianTokenizer can be rendered unavailable by a single crafted input string that exhausts CPU resources.
Affected Products
- Hugging Face Transformers 4.52.4
- Python applications importing MarianTokenizer from transformers.models.marian
- Inference services and pipelines invoking Marian translation models
Discovery Timeline
- 2025-09-12 - CVE-2025-6638 published to NVD
- 2025-10-21 - Last updated in NVD database
Technical Details for CVE-2025-6638
Vulnerability Analysis
The defect lives in src/transformers/models/marian/tokenization_marian.py. The MarianTokenizer.remove_language_code() method applied a regular expression to strip language code prefixes such as >>fr<< from input text before tokenization. The pattern's structure permitted catastrophic backtracking: when an attacker supplies a string that nearly matches the language code shape but contains repeating ambiguous characters, the regex engine explores an exponential number of match paths. CPU time therefore grows exponentially with the length of the ambiguous run, blocking the worker thread executing tokenization. Because tokenization runs synchronously before model inference in most translation pipelines, a single malicious request can stall the entire serving process.
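The snippet below demonstrates the failure mode in isolation. The pattern is illustrative, not the one shipped in tokenization_marian.py: it has the same nested-quantifier ambiguity, run against near-miss inputs shaped like Marian's >>fr<< markers.
# Illustrative ReDoS demo -- NOT the exact pattern from tokenization_marian.py
import re
import time

pattern = re.compile(r">>(\w+)*<<")  # (\w+)* is the ambiguous, backtracking-prone part

for n in range(18, 26, 2):
    payload = ">>" + "a" * n + "!"   # almost matches, but the trailing "!" forces failure
    start = time.perf_counter()
    pattern.match(payload)           # the engine retries ~2**(n-1) partitions of the "a" run
    print(f"n={n:2d}  {time.perf_counter() - start:.3f}s")
Each additional pair of characters roughly quadruples the runtime, which is the signature of catastrophic backtracking.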
Root Cause
The root cause is inefficient regular expression complexity [CWE-1333]. The regex used to detect and remove language code markers contained overlapping quantifiers that produced ambiguous matches against partially malformed inputs. The maintainers' fix abandons regex matching altogether and replaces it with deterministic string operations, eliminating the backtracking surface.
Attack Vector
Exploitation requires only the ability to supply input text to a service that calls MarianTokenizer. No authentication or user interaction is needed. Common attack surfaces include public translation APIs, chatbots, document processors, and any HTTP endpoint that forwards user input into a Marian-based pipeline. Confidentiality and integrity are not impacted; the effect is sustained CPU exhaustion on the host running tokenization.
# Patch excerpt: src/transformers/models/marian/tokenization_marian.py
from shutil import copyfile
from typing import Any, Optional, Union
-import regex as re
import sentencepiece
from ...tokenization_utils import PreTrainedTokenizer
Source: Hugging Face Transformers commit 47c34fb
The patch removes the regex import and rewrites remove_language_code() using non-regex string parsing, eliminating backtracking risk.
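As a rough illustration (not the actual patched code), a deterministic replacement can scan for a leading >>...<< marker with plain string operations, which costs linear time and cannot backtrack:
# Minimal sketch of the non-regex approach; the real method's signature and
# behavior in transformers 4.53.0 may differ.
def remove_language_code(text: str) -> tuple[list[str], str]:
    if text.startswith(">>"):
        end = text.find("<<", 2)    # first closing marker after the opener
        if end != -1:
            return [text[: end + 2]], text[end + 2 :].lstrip()
    return [], text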
Detection Methods for CVE-2025-6638
Indicators of Compromise
- Sustained 100% CPU utilization on Python worker processes loading transformers.models.marian.tokenization_marian.
- Request latency spikes or timeouts on translation endpoints correlating with single inbound requests.
- Application logs showing tokenization calls that never return for specific inputs containing repeated > or < characters.
Detection Strategies
- Inventory Python environments for transformers==4.52.4 using pip list or SBOM tooling, and flag instances importing MarianTokenizer.
- Inspect application traces for unusually long execution time inside MarianTokenizer.remove_language_code frames.
- Enforce input-length caps and character-class filtering at the API gateway, and alert when payloads deviate from expected language code formats.
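A minimal version check along these lines can run in CI or at service startup; it uses the standard library plus packaging, which is assumed to be available (it ships with modern pip environments).
# Flag a vulnerable transformers install (CVE-2025-6638 affects 4.52.4)
from importlib.metadata import PackageNotFoundError, version
from packaging.version import Version

try:
    installed = Version(version("transformers"))
except PackageNotFoundError:
    installed = None

if installed and installed < Version("4.53.0"):
    print(f"VULNERABLE to CVE-2025-6638: transformers {installed} < 4.53.0")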
Monitoring Recommendations
- Track per-request CPU time on inference workers and trigger alerts when tokenization exceeds a defined threshold (for example, 500 ms).
- Monitor process restarts, OOM events, and worker pool saturation on services hosting Marian models.
- Forward web application firewall logs to a SIEM such as Singularity Data Lake to correlate anomalous payload patterns with backend resource exhaustion.
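A hedged sketch of the per-request CPU check suggested above, assuming a tokenizer object supplied by the surrounding service. A true ReDoS hang never returns, so this after-the-fact measurement complements, rather than replaces, the hard limits under Workarounds below.
# Alert when a single tokenization call exceeds the 500 ms CPU budget
import logging
import time

CPU_BUDGET_S = 0.5

def timed_tokenize(tokenizer, text: str):
    start = time.process_time()     # CPU time, not wall-clock
    encoding = tokenizer(text)
    spent = time.process_time() - start
    if spent > CPU_BUDGET_S:
        logging.warning("slow tokenization: %.3fs CPU for %d-char input", spent, len(text))
    return encoding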
How to Mitigate CVE-2025-6638
Immediate Actions Required
- Upgrade transformers to version 4.53.0 or later in all production, staging, and development environments.
- Audit dependency manifests (requirements.txt, pyproject.toml, Pipfile.lock) and rebuild container images that pin vulnerable versions.
- Restart inference services after upgrade to ensure the patched tokenizer is loaded.
Patch Information
The fix is contained in commit 47c34fba5c303576560cb29767efb452ff12b8be, released as part of Hugging Face Transformers 4.53.0. The maintainers replaced regex-based language code stripping with plain string operations. Reference: Hugging Face Transformers commit 47c34fb and the Huntr bounty disclosure.
Workarounds
- Enforce a strict maximum input length (for example, 2,048 characters) before passing text to MarianTokenizer.
- Validate or strip language code prefixes (>>xx<<) at the application layer using a deterministic parser before invoking the tokenizer.
- Run tokenization in a worker process with a hard CPU time limit (resource.setrlimit(RLIMIT_CPU, ...)) so malicious inputs cannot block the main service; a minimal sketch follows this list.
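The sketch below illustrates the worker-process workaround. It assumes a Unix host (the resource module is not available on Windows) and uses an example Marian checkpoint; a production service would load the tokenizer once per worker rather than per call.
# Run tokenization under a hard CPU cap; SIGXCPU kills the worker if exceeded
import resource
from multiprocessing import Process, Queue

CPU_LIMIT_S = 2  # illustrative per-call CPU budget

def _worker(text: str, out: Queue) -> None:
    resource.setrlimit(resource.RLIMIT_CPU, (CPU_LIMIT_S, CPU_LIMIT_S))
    from transformers import MarianTokenizer
    tok = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr")  # example model
    out.put(tok(text)["input_ids"])

def safe_tokenize(text: str, wall_timeout: float = 10.0):
    out: Queue = Queue()
    proc = Process(target=_worker, args=(text, out))
    proc.start()
    proc.join(wall_timeout)
    if proc.is_alive():              # wall-clock backstop
        proc.terminate()
        proc.join()
    if out.empty():
        raise TimeoutError("tokenization exceeded its CPU/time budget")
    return out.get()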
# Upgrade to the patched release
pip install --upgrade "transformers>=4.53.0"
# Verify installed version
python -c "import transformers; print(transformers.__version__)"