CVE-2025-4330 Overview
CVE-2025-4330 is a path traversal vulnerability in Python's tarfile module that allows extraction filters to be bypassed. When extracting untrusted tar archives using TarFile.extractall() or TarFile.extract() with the filter= parameter set to "data" or "tar", an attacker can craft malicious archives containing symlinks that point outside the intended destination directory. This vulnerability also permits unauthorized modification of certain file metadata.
Critical Impact
Attackers can escape the destination directory during tar extraction, potentially overwriting arbitrary files on the system and compromising application integrity. Applications running Python 3.14 or later that rely on the default extraction behavior are also affected, because the default value of filter= changed from "no filtering" to "data".
Affected Products
- Python tarfile module (versions prior to patched releases)
- Python 3.14+ with default extraction filter behavior
- Applications using TarFile.extractall() or TarFile.extract() with "data" or "tar" filters
Discovery Timeline
- June 3, 2025 - CVE-2025-4330 published to NVD
- June 5, 2025 - Last updated in NVD database
Technical Details for CVE-2025-4330
Vulnerability Analysis
This vulnerability exists in Python's tarfile module, specifically in how extraction filters handle symbolic link targets. The extraction filters "data" and "tar" were designed to provide security boundaries during archive extraction, but the implementation contained a flaw that allowed these filters to be circumvented.
When processing tar archives containing symbolic links, the vulnerable code failed to properly normalize and validate link targets before extraction. This allowed attackers to create symlinks pointing to locations outside the designated extraction directory, effectively escaping the intended sandbox. Additionally, the vulnerability enabled modification of file metadata that should have been protected by the extraction filter.
The issue is classified under CWE-22 (Improper Limitation of a Pathname to a Restricted Directory), commonly known as path traversal. The vulnerability is exploitable over the network by delivering malicious tar archives to applications that process them with the affected extraction methods.
Root Cause
The root cause lies in insufficient validation of symbolic link targets during the tar extraction process. The extraction filter mechanism did not properly resolve and check symlink paths against the destination directory boundary before creating the links. The os.path.realpath() function's behavior when encountering errors (such as non-existent paths or permission issues) contributed to the bypass, as it would append remaining path components unchanged rather than strictly validating them.
Attack Vector
An attacker can exploit this vulnerability by crafting a malicious tar archive containing specially constructed symbolic links. When a victim application extracts this archive using the vulnerable tarfile methods with "data" or "tar" filters, the symlinks are created pointing outside the extraction directory. This can lead to:
- Arbitrary file read/write through symlink redirection
- Configuration file manipulation
- Application compromise via overwriting executable files or libraries
- Potential for further system compromise depending on application privileges
The following excerpt from the documentation changes in the fix shows the new exception the Python maintainers added to reject unsafe link fallbacks:
Raised to refuse extracting a symbolic link pointing outside the destination
directory.
+.. exception:: LinkFallbackError
+
+ Raised to refuse emulating a link (hard or symbolic) by extracting another
+ archive member, when that member would be rejected by the filter location.
+ The exception that was raised to reject the replacement member is available
+ as :attr:`!BaseException.__context__`.
+
+ .. versionadded:: next
+
The following constants are available at the module level:
Source: GitHub CPython Commit
The fix also includes enhanced os.path.realpath() behavior with a new ALLOW_MISSING mode:
- If a path doesn't exist or a symlink loop is encountered, and *strict* is
- ``True``, :exc:`OSError` is raised. If *strict* is ``False`` these errors
- are ignored, and so the result might be missing or otherwise inaccessible.
+ By default, the path is evaluated up to the first component that does not
+ exist, is a symlink loop, or whose evaluation raises :exc:`OSError`.
+ All such components are appended unchanged to the existing part of the path.
+
+ Some errors that are handled this way include "access denied", "not a
+ directory", or "bad argument to internal function". Thus, the
+ resulting path may be missing or inaccessible, may still contain
+ links or loops, and may traverse non-directories.
+
+ This behavior can be modified by keyword arguments:
+
+ If *strict* is ``True``, the first error encountered when evaluating the path is
+ re-raised.
+ In particular, :exc:`FileNotFoundError` is raised if *path* does not exist,
+ or another :exc:`OSError` if it is otherwise inaccessible.
+
+ If *strict* is :py:data:`os.path.ALLOW_MISSING`, errors other than
+ :exc:`FileNotFoundError` are re-raised (as with ``strict=True``).
+ Thus, the returned path will not contain any symbolic links, but the named
+ file and some of its parent directories may be missing.
Source: GitHub CPython Commit
Detection Methods for CVE-2025-4330
Indicators of Compromise
- Unexpected symbolic links created outside application directories after tar extraction operations
- File modifications in system directories that correlate with tar archive processing events
- Application logs showing tarfile extraction activities with suspicious archive names or sources
- Presence of tar archives containing symlinks with ../ sequences or absolute paths targeting sensitive locations
Detection Strategies
- Monitor Python application logs for tarfile.extractall() and tarfile.extract() operations processing external or untrusted archives
- Implement file integrity monitoring on critical system directories and configuration files
- Deploy application-level logging to capture tar extraction events including source archive paths and extraction destinations
- Use SentinelOne's behavioral AI to detect suspicious file system operations following archive extraction activities
Monitoring Recommendations
- Enable enhanced logging for applications that process user-supplied tar archives
- Configure alerts for symlink creation events in unexpected directories
- Monitor for LinkFallbackError exceptions in Python applications after patching, since these may indicate blocked exploitation attempts
- Review application code for usage of tarfile module with filter="data" or filter="tar" parameters
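The code review step can be partly automated. The following is a heuristic sketch, assuming Python 3.9 or later (for ast.unparse): it reports every method call named extract or extractall together with its filter= argument, if any. The function name find_tar_extract_calls is hypothetical, and because the match is by method name only, it will also flag unrelated objects such as zipfile archives.

```python
import ast


def find_tar_extract_calls(source, filename="<string>"):
    """Return (line, method, filter_expr) for .extract()/.extractall() calls."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr not in ("extract", "extractall"):
            continue
        # filter_expr is the source text of the filter= keyword, or None if absent.
        filter_expr = next(
            (ast.unparse(kw.value) for kw in node.keywords if kw.arg == "filter"),
            None,
        )
        hits.append((node.lineno, node.func.attr, filter_expr))
    hits.sort(key=lambda hit: hit[0])
    return hits
```

Calls reported with no filter= argument deserve attention on Python 3.14 or later, where the default is "data".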
How to Mitigate CVE-2025-4330
Immediate Actions Required
- Upgrade Python installations to the latest patched versions that include the security fixes
- Audit all applications using Python's tarfile module to identify vulnerable extraction patterns
- Avoid processing untrusted tar archives until patches are applied
- Consider implementing additional sandboxing for archive extraction operations
- Review the Python Security Announcement for version-specific guidance
Patch Information
Multiple patches have been released to address this vulnerability across different Python versions. The fixes include adding a new LinkFallbackError exception to properly refuse dangerous link emulation operations and enhancing os.path.realpath() with a new strict=ALLOW_MISSING mode for proper symlink normalization. Refer to the GitHub CPython Issue #135034 and the associated pull request for complete technical details of the fixes.
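Because the fix introduces two new names, their presence gives a quick hint about whether a given interpreter carries it. This is a heuristic sketch only: the function name has_cve_2025_4330_fix is hypothetical, and vendor backports may differ, so the interpreter version and the vendor's advisory remain the authoritative check.

```python
import os.path
import tarfile


def has_cve_2025_4330_fix():
    """Heuristic: True if both names introduced by the fix are present."""
    return hasattr(tarfile, "LinkFallbackError") and hasattr(os.path, "ALLOW_MISSING")


print("fix markers present:", has_cve_2025_4330_fix())
```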
Workarounds
- Implement pre-extraction validation of tar archive contents to detect and reject archives containing suspicious symlinks
- Use a custom extraction filter that explicitly validates all link targets against allowed paths before extraction
- Extract archives to isolated temporary directories and validate contents before moving to final destinations
- Run archive extraction operations in sandboxed environments with restricted file system access
- Consider using alternative archive libraries with stronger security controls if Python cannot be immediately updated
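# Example: validate tar contents before extraction
# (archive.tar and ./extract_dir are placeholder arguments)
python3 -c "
import os
import sys
import tarfile

def validate_archive(archive_path, dest_dir):
    # Lexical check only: it does not resolve symlinks that already exist on disk.
    dest = os.path.abspath(dest_dir)
    with tarfile.open(archive_path, 'r:*') as tar:
        for member in tar.getmembers():
            if not (member.issym() or member.islnk()):
                continue
            # Symlink targets are relative to the link's own directory;
            # hard link targets are relative to the archive root.
            base = os.path.dirname(member.name) if member.issym() else ''
            target = os.path.normpath(os.path.join(dest, base, member.linkname))
            # Compare whole path components, not string prefixes.
            if os.path.commonpath([dest, target]) != dest:
                print(f'UNSAFE: {member.name} -> {member.linkname}')
                return False
    return True

# Exit status 0 means no link target left the destination directory.
sys.exit(0 if validate_archive(sys.argv[1], sys.argv[2]) else 1)
" archive.tar ./extract_dir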