CVE-2024-6232: Python CPython ReDoS Vulnerability

CVE-2024-6232 Overview

CVE-2024-6232 is a Regular Expression Denial of Service (ReDoS) vulnerability in CPython's tarfile module. Regular expressions used during tarfile.TarFile header parsing allow excessive backtracking when they process specially crafted tar archives. An attacker who supplies such an archive can keep the regex engine busy for far longer than the input size warrants, hanging applications that process untrusted tar archives.

Critical Impact
Applications processing untrusted tar archives are vulnerable to denial of service attacks through CPU exhaustion caused by malicious tar file headers triggering catastrophic regex backtracking.

Affected Products

Python CPython versions prior to patched releases
Python 3.13.0 alpha0 through alpha6
Python 3.13.0 beta1 through beta4 and rc1
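
A quick way to triage an interpreter against the list above is to compare `sys.version_info` with the first fixed release on each branch. The sketch below is illustrative only: the `FIRST_FIXED` table and the helper name `is_probably_patched` are assumptions that do not come from this advisory, and the version numbers should be confirmed against the upstream release notes before anyone relies on them.

```python
import sys

# ASSUMPTION: first fixed release per branch, recalled from upstream release
# notes rather than stated in this advisory. Verify before relying on it.
FIRST_FIXED = {
    (3, 8): (3, 8, 20),
    (3, 9): (3, 9, 20),
    (3, 10): (3, 10, 15),
    (3, 11): (3, 11, 10),
    (3, 12): (3, 12, 6),
}

# Ordering of pre-release levels as reported by sys.version_info.releaselevel.
LEVEL_RANK = {"alpha": 0, "beta": 1, "candidate": 2, "final": 3}


def is_probably_patched(info=sys.version_info):
    """Return True if the version tuple looks newer than the affected range."""
    branch = (info[0], info[1])
    if branch > (3, 13):
        return True
    if branch == (3, 13):
        # The affected list ends at 3.13.0rc1, so rc2 and later count as fixed.
        return (info[2], LEVEL_RANK[info[3]], info[4]) >= (0, LEVEL_RANK["candidate"], 2)
    fixed = FIRST_FIXED.get(branch)
    if fixed is None:
        # Branch not in the table (for example, end-of-life): treat as unpatched.
        return False
    return tuple(info[:3]) >= fixed


if __name__ == "__main__":
    print("probably patched:", is_probably_patched())
```

Distribution packages sometimes backport security fixes without changing the version number, so a False result here means "check with the vendor" rather than "definitely vulnerable".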

Discovery Timeline

September 3, 2024 - CVE-2024-6232 published to NVD
November 3, 2025 - Last updated in NVD database

Technical Details for CVE-2024-6232

Vulnerability Analysis

This vulnerability is classified under CWE-1333 (Inefficient Regular Expression Complexity). The tarfile module in CPython uses regular expressions to parse tar archive headers. When these regular expressions encounter specially crafted input, they can exhibit catastrophic backtracking: the regex engine explores a rapidly growing number of possible match paths before determining that no match exists.

The vulnerability can be exploited remotely without authentication by providing a malicious tar archive to any Python application that processes tar files from untrusted sources. This includes web applications accepting tar uploads, backup utilities, package managers, and data processing pipelines. The attack results in availability impact through CPU resource exhaustion, potentially causing complete service disruption.

Root Cause

The root cause lies in the inefficient regular expression patterns used within the tarfile.TarFile header parsing logic. These patterns allow excessive backtracking when presented with adversarial input strings. Patterns of this kind typically contain unbounded or overlapping quantifiers, so a failed match forces the engine to retry many alternative ways of splitting the same input, and the work grows much faster than the input length.
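
To make the backtracking behavior concrete, the sketch below times a failing match against a generic textbook pattern with nested quantifiers. The pattern `(a+)+$` is an illustration chosen for this example and is not the expression used by tarfile; the helper name `time_failed_match` is likewise hypothetical.

```python
import re
import time

# Generic illustrative pattern with nested quantifiers.
# ASSUMPTION: this is NOT the expression used in tarfile; it only shows the effect.
NESTED = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Time a match attempt that must fail on an input of n 'a' characters."""
    subject = "a" * n + "b"  # the trailing "b" guarantees that no match exists
    start = time.perf_counter()
    result = NESTED.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed


if __name__ == "__main__":
    # Each extra character roughly doubles the work for this particular pattern.
    for n in (10, 14, 18, 20):
        print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

The input grows by only a few bytes between runs while the running time grows many times over, which is the property an attacker relies on.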

Attack Vector

The attack is network-exploitable with low complexity. An attacker creates a tar archive with specially crafted header fields designed to maximize regex backtracking. When a vulnerable Python application opens and parses this archive using tarfile.TarFile, the header parsing routine invokes the vulnerable regex patterns against the malicious headers.

The attack requires no privileges and no user interaction beyond the target application processing the malicious file. The impact is limited to availability: there is no confidentiality or integrity breach, but the denial of service can be severe, potentially consuming all available CPU resources on the affected system.

Detection Methods for CVE-2024-6232

Indicators of Compromise

Unusual CPU spikes correlated with tar file processing operations
Python processes consuming excessive CPU time during archive extraction
Application timeouts or hangs when processing tar files from external sources
Abnormally large or malformed tar files appearing in upload directories

Detection Strategies

Monitor Python application processes for sustained high CPU utilization during file processing operations
Implement application-level timeouts for tar file parsing operations to detect and terminate stuck processes
Review application logs for patterns indicating stalled or unresponsive archive processing
Deploy file integrity monitoring to identify anomalous tar files being submitted to processing pipelines

Monitoring Recommendations

Configure resource usage alerts for Python processes exceeding normal CPU thresholds
Implement watchdog timers around tarfile operations to detect and log excessive processing times
Monitor application response times for services that accept tar file uploads
Track and analyze tar file processing metrics to establish baselines and detect anomalies
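
One way to collect the timing data described above is a small context manager that records wall-clock and CPU time around each tarfile call and logs a warning when a threshold is exceeded. This is a minimal sketch: the logger name `tar_watch`, the helper `timed_tar_operation`, and the 5-second default are assumptions made for illustration. It only measures and reports after the operation has finished; it cannot interrupt a call that is stuck.

```python
import logging
import time
from contextlib import contextmanager

log = logging.getLogger("tar_watch")  # hypothetical logger name


@contextmanager
def timed_tar_operation(label, warn_after_s=5.0):
    """Record wall and CPU time for the enclosed block; warn if it ran long."""
    stats = {}
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield stats
    finally:
        stats["wall_s"] = time.perf_counter() - wall_start
        stats["cpu_s"] = time.process_time() - cpu_start
        if stats["wall_s"] > warn_after_s:
            log.warning(
                "tar operation %s took %.2fs wall, %.2fs CPU",
                label, stats["wall_s"], stats["cpu_s"],
            )
```

A caller would wrap the parsing step, for example `with timed_tar_operation("upload-123") as stats: names = tarfile.open(path).getnames()`, and ship `stats` to whatever metrics system establishes the baseline.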

How to Mitigate CVE-2024-6232

Immediate Actions Required

Upgrade Python to the latest patched version for your release branch
Implement input validation and file size limits for tar archives from untrusted sources
Add timeout mechanisms around tar file parsing operations to prevent resource exhaustion
Consider sandboxing or isolating tar file processing in resource-constrained environments
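
The input validation and size limit mentioned above could start with a cheap pre-check that runs before tarfile ever sees the file. The sketch below is an assumption-laden illustration: the 100 MiB cap and the helper name `passes_cheap_tar_checks` are hypothetical, and the magic-string test applies only to uncompressed archives in ustar, pax, or GNU format. Passing this check does not prove an archive is safe to parse; it only rejects oversized and obviously non-tar input early.

```python
import os

BLOCK_SIZE = 512                        # tar headers are 512-byte blocks
MAX_ARCHIVE_BYTES = 100 * 1024 * 1024   # hypothetical cap; tune per application


def passes_cheap_tar_checks(path, max_bytes=MAX_ARCHIVE_BYTES):
    """Reject files that are too large or that do not start with a tar header."""
    size = os.path.getsize(path)
    if size < BLOCK_SIZE or size > max_bytes:
        return False
    with open(path, "rb") as fh:
        header = fh.read(BLOCK_SIZE)
    if len(header) < BLOCK_SIZE:
        return False
    # ustar, pax, and GNU headers carry the magic "ustar" at byte offset 257.
    return header[257:262] == b"ustar"
```

Compressed archives (for example .tar.gz) fail the magic test by design, so an application that accepts them would need to apply the size cap alone or inspect the decompressed stream instead.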

Patch Information

The Python Software Foundation has released patches across multiple CPython branches. The fix addresses the inefficient regular expression patterns in the tarfile module. For the individual commits and additional information, refer to the Python Security Announcement, GitHub Issue #121285, and GitHub Pull Request #121286.

Workarounds

Implement processing timeouts for all tarfile operations to limit the duration of potential ReDoS attacks
Validate tar file structure and header sizes before full parsing using lightweight checks
Process untrusted tar archives in isolated environments with strict CPU and memory limits
Consider using alternative tar parsing libraries that are not affected by this vulnerability for high-risk applications
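
A processing timeout can also be enforced from Python by parsing the archive in a child interpreter and killing it if it runs too long. The sketch below assumes a hypothetical helper named `list_tar_members` and a 10-second default; it lists member names only and never extracts anything. Running the parse in a separate process means a stuck regex match cannot block the parent.

```python
import json
import subprocess
import sys

# Script run in the child interpreter: list member names, never extract.
_CHILD_SCRIPT = (
    "import json, sys, tarfile\n"
    "with tarfile.open(sys.argv[1], 'r') as tar:\n"
    "    print(json.dumps(tar.getnames()))\n"
)


def list_tar_members(path, timeout_s=10.0):
    """Return member names, or None if parsing fails or exceeds timeout_s."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", _CHILD_SCRIPT, path],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # subprocess.run kills the child on expiry
            check=True,
        )
    except subprocess.TimeoutExpired:
        return None  # likely stuck; treat the archive as hostile
    except subprocess.CalledProcessError:
        return None  # unreadable or malformed archive
    return json.loads(result.stdout)
```

Returning None for both failure modes keeps the caller simple; an application that needs to tell a timeout apart from a malformed file could raise distinct exceptions instead.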

bash

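# Example: resource-limited tar processing with a timeout (Unix only)
# `timeout` enforces a 30-second wall-clock limit on the whole process.
timeout 30s python3 -c "
import resource
import tarfile

# Cap CPU time at 30 seconds; the kernel signals the process once the limit is exceeded
resource.setrlimit(resource.RLIMIT_CPU, (30, 30))

# Parse and extract the archive under both limits.
# filter='data' rejects unsafe members; it needs Python 3.12+ or a release with the extraction-filter backport.
with tarfile.open('archive.tar', 'r') as tar:
    tar.extractall(path='/tmp/extract', filter='data')
"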