CVE-2025-4330: Python Tarfile Path Traversal Vulnerability

CVE-2025-4330 Overview

CVE-2025-4330 is a path traversal vulnerability in Python's tarfile module that allows extraction filters to be bypassed. When extracting untrusted tar archives using TarFile.extractall() or TarFile.extract() with the filter= parameter set to "data" or "tar", an attacker can craft malicious archives containing symlinks that point outside the intended destination directory. This vulnerability also permits unauthorized modification of certain file metadata.

Critical Impact
Attackers can escape the destination directory during tar extraction, potentially overwriting arbitrary files on the system and compromising application integrity. In Python 3.14 and later, the default value of filter= changed from "no filtering" to "data", so applications that rely on this default behavior are also affected.

Affected Products

Python tarfile module (versions prior to patched releases)
Python 3.14+ with default extraction filter behavior
Applications using TarFile.extractall() or TarFile.extract() with "data" or "tar" filters

Discovery Timeline

June 3, 2025 - CVE-2025-4330 published to NVD
June 5, 2025 - Last updated in NVD database

Technical Details for CVE-2025-4330

Vulnerability Analysis

This vulnerability exists in Python's tarfile module, specifically in how extraction filters handle symbolic link targets. The extraction filters "data" and "tar" were designed to provide security boundaries during archive extraction, but the implementation contained a flaw that allowed these filters to be circumvented.

When processing tar archives containing symbolic links, the vulnerable code failed to properly normalize and validate link targets before extraction. This allowed attackers to create symlinks pointing to locations outside the designated extraction directory, effectively escaping the intended sandbox. Additionally, the vulnerability enabled modification of file metadata that should have been protected by the extraction filter.

The issue is classified under CWE-22 (Improper Limitation of a Pathname to a Restricted Directory), commonly known as path traversal. The vulnerability is exploitable over the network by delivering malicious tar archives to applications that process them with the affected extraction methods.

Root Cause

The root cause lies in insufficient validation of symbolic link targets during the tar extraction process. The extraction filter mechanism did not properly resolve and check symlink paths against the destination directory boundary before creating the links. The os.path.realpath() function's behavior when encountering errors (such as non-existent paths or permission issues) contributed to the bypass, as it would append remaining path components unchanged rather than strictly validating them.

Attack Vector

An attacker can exploit this vulnerability by crafting a malicious tar archive containing specially constructed symbolic links. When a victim application extracts this archive using the vulnerable tarfile methods with "data" or "tar" filters, the symlinks are created pointing outside the extraction directory. This can lead to:

Arbitrary file read/write through symlink redirection
Configuration file manipulation
Application compromise via overwriting executable files or libraries
Potential for further system compromise depending on application privileges

The following excerpt from the documentation changes in the fix shows the new LinkFallbackError exception, which is raised when emulating a link would extract an archive member that the filter rejects:

text

    Raised to refuse extracting a symbolic link pointing outside the destination
    directory.
 
+.. exception:: LinkFallbackError
+
+   Raised to refuse emulating a link (hard or symbolic) by extracting another
+   archive member, when that member would be rejected by the filter location.
+   The exception that was raised to reject the replacement member is available
+   as :attr:`!BaseException.__context__`.
+
+   .. versionadded:: next
+
 
 The following constants are available at the module level:

Source: GitHub CPython Commit

The fix also includes enhanced os.path.realpath() behavior with a new ALLOW_MISSING mode:

text

-   If a path doesn't exist or a symlink loop is encountered, and *strict* is
-   ``True``, :exc:`OSError` is raised. If *strict* is ``False`` these errors
-   are ignored, and so the result might be missing or otherwise inaccessible.
+   By default, the path is evaluated up to the first component that does not
+   exist, is a symlink loop, or whose evaluation raises :exc:`OSError`.
+   All such components are appended unchanged to the existing part of the path.
+
+   Some errors that are handled this way include "access denied", "not a
+   directory", or "bad argument to internal function". Thus, the
+   resulting path may be missing or inaccessible, may still contain
+   links or loops, and may traverse non-directories.
+
+   This behavior can be modified by keyword arguments:
+
+   If *strict* is ``True``, the first error encountered when evaluating the path is
+   re-raised.
+   In particular, :exc:`FileNotFoundError` is raised if *path* does not exist,
+   or another :exc:`OSError` if it is otherwise inaccessible.
+
+   If *strict* is :py:data:`os.path.ALLOW_MISSING`, errors other than
+   :exc:`FileNotFoundError` are re-raised (as with ``strict=True``).
+   Thus, the returned path will not contain any symbolic links, but the named
+   file and some of its parent directories may be missing.

Source: GitHub CPython Commit

Detection Methods for CVE-2025-4330

Indicators of Compromise

Unexpected symbolic links created outside application directories after tar extraction operations
File modifications in system directories that correlate with tar archive processing events
Application logs showing tarfile extraction activities with suspicious archive names or sources
Presence of tar archives containing symlinks with ../ sequences or absolute paths targeting sensitive locations
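The first indicator above can be checked after the fact with a small audit of an extraction directory. This is a minimal sketch under stated assumptions: the function name find_escaping_symlinks is hypothetical, a POSIX file system is assumed, and the check only reports symlinks that currently resolve outside the given root.

```python
import os


def find_escaping_symlinks(root):
    """Yield (link_path, resolved_target) for symlinks under *root* that resolve outside it."""
    root_real = os.path.realpath(root)
    # followlinks=False: symlinked directories are listed but never descended into.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            # commonpath compares whole path components, unlike a string prefix test.
            if os.path.commonpath([root_real, target]) != root_real:
                yield path, target
```

Running it against directories that receive extracted archives, for example list(find_escaping_symlinks("/srv/uploads/extracted")) with a path of your own, gives a quick list of links worth investigating.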

Detection Strategies

Monitor Python application logs for tarfile.extractall() and tarfile.extract() operations processing external or untrusted archives
Implement file integrity monitoring on critical system directories and configuration files
Deploy application-level logging to capture tar extraction events including source archive paths and extraction destinations
Use SentinelOne's behavioral AI to detect suspicious file system operations following archive extraction activities

Monitoring Recommendations

Enable enhanced logging for applications that process user-supplied tar archives
Configure alerts for symlink creation events in unexpected directories
Monitor for LinkFallbackError exceptions in Python applications after patching; these may indicate blocked exploitation attempts
Review application code for usage of tarfile module with filter="data" or filter="tar" parameters
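A code review for extraction calls can be partly automated. The sketch below is a heuristic, not a complete audit: the function name find_tar_extract_calls is hypothetical, and it matches any method named extract or extractall, so calls on other archive types (for example zipfile) will also be reported.

```python
import ast


def find_tar_extract_calls(source, filename="<string>"):
    """Return (line, method, filter_expr) for each .extract()/.extractall() call in *source*."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr not in ("extract", "extractall"):
            continue
        filter_expr = None  # None means the call passes no explicit filter= keyword
        for kw in node.keywords:
            if kw.arg == "filter":
                filter_expr = ast.unparse(kw.value)
        findings.append((node.lineno, node.func.attr, filter_expr))
    return sorted(findings, key=lambda item: item[0])
```

Calls reported with filter_expr set to None rely on the interpreter's default filter, which differs between Python versions and deserves a closer look.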

How to Mitigate CVE-2025-4330

Immediate Actions Required

Upgrade Python installations to the latest patched versions that include the security fixes
Audit all applications using Python's tarfile module to identify vulnerable extraction patterns
Avoid processing untrusted tar archives until patches are applied
Consider implementing additional sandboxing for archive extraction operations
Review the Python Security Announcement for version-specific guidance

Patch Information

Multiple patches have been released to address this vulnerability across different Python versions. The fixes include adding a new LinkFallbackError exception to properly refuse dangerous link emulation operations and enhancing os.path.realpath() with a new strict=ALLOW_MISSING mode for proper symlink normalization. Refer to the GitHub CPython Issue #135034 and the associated pull request for complete technical details of the fixes.
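One rough way to tell whether a given interpreter carries the fix is to look for the two names the patch introduces. This is a heuristic sketch only: the function name fix_markers is hypothetical, and it assumes that patched builds expose tarfile.LinkFallbackError and os.path.ALLOW_MISSING as shown in the diff excerpts; vendor-patched builds may differ.

```python
import os.path
import sys
import tarfile


def fix_markers():
    """Report whether the names introduced by the fix are present in this interpreter."""
    return {
        "python_version": sys.version.split()[0],
        "tarfile.LinkFallbackError": hasattr(tarfile, "LinkFallbackError"),
        "os.path.ALLOW_MISSING": hasattr(os.path, "ALLOW_MISSING"),
    }


for name, value in fix_markers().items():
    print(f"{name}: {value}")
```

If either marker is False, treat the interpreter as unpatched until the installed version has been checked against the official advisory.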

Workarounds

Implement pre-extraction validation of tar archive contents to detect and reject archives containing suspicious symlinks
Use a custom extraction filter that explicitly validates all link targets against allowed paths before extraction
Extract archives to isolated temporary directories and validate contents before moving to final destinations
Run archive extraction operations in sandboxed environments with restricted file system access
Consider using alternative archive libraries with stronger security controls if Python cannot be immediately updated

bash

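# Example: validate link targets in a tar archive before extraction.
# This is a lexical check only; it does not replace upgrading Python.
# archive.tar and /path/to/extract/dir are placeholders.
python3 -c "
import os
import sys
import tarfile

def validate_archive(archive_path, dest_dir):
    dest = os.path.realpath(dest_dir)
    with tarfile.open(archive_path, 'r:*') as tar:
        for member in tar.getmembers():
            if not (member.issym() or member.islnk()):
                continue
            # Symlink targets are relative to the directory of the link;
            # hard link targets are relative to the archive root.
            base = os.path.dirname(member.name) if member.issym() else ''
            target = os.path.normpath(os.path.join(dest, base, member.linkname))
            # commonpath avoids the prefix bug of startswith (/dest vs /dest-evil)
            if os.path.commonpath([dest, target]) != dest:
                print(f'UNSAFE: {member.name} -> {member.linkname}')
                return False
    return True

# Exit status 0 means no unsafe link was found
sys.exit(0 if validate_archive(sys.argv[1], sys.argv[2]) else 1)
" archive.tar /path/to/extract/dir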
