CVE-2025-4138 Overview
CVE-2025-4138 is a path traversal vulnerability in Python's tarfile module that allows extraction filters to be bypassed. This security flaw enables symbolic link targets to point outside the intended destination directory and permits modification of certain file metadata, potentially leading to unauthorized file access or system compromise.
The vulnerability affects applications using the tarfile module to extract untrusted tar archives via TarFile.extractall() or TarFile.extract() when the filter= parameter is set to "data" or "tar". Notably, Python 3.14 and later versions changed the default filter value from "no filtering" to "data", meaning applications relying on this new default behavior are also affected.
Critical Impact
Attackers can craft malicious tar archives that bypass extraction filters, enabling symlink-based path traversal to read sensitive files outside the extraction directory or potentially overwrite critical system files.
Affected Products
- Python tarfile module with filter="data" parameter
- Python tarfile module with filter="tar" parameter
- Python 3.14+ using default filter behavior
Discovery Timeline
- 2025-06-03 - CVE-2025-4138 published to NVD
- 2025-06-05 - Last updated in NVD database
Technical Details for CVE-2025-4138
Vulnerability Analysis
This vulnerability is classified under CWE-22 (Improper Limitation of a Pathname to a Restricted Directory), commonly known as path traversal. The flaw exists in how the tarfile module handles symbolic link normalization when extraction filters are applied.
When processing tar archives containing symbolic links, the filter mechanism intended to restrict extraction to the destination directory can be circumvented. This allows symlink targets to escape the intended extraction boundary, potentially enabling attackers to access or modify files anywhere on the filesystem where the application has permissions.
The vulnerability is particularly concerning because the "data" and "tar" filters were specifically designed to provide security against such attacks. Applications that adopted these filters expecting protection are now vulnerable.
Root Cause
The root cause lies in insufficient normalization of symbolic link targets within the tarfile extraction filter logic. The os.path.realpath() function's handling of path components that don't exist or encounter errors during evaluation allowed paths to be constructed that bypass the destination directory restriction.
The original implementation would append problematic path components unchanged to the resolved portion of the path when errors occurred, rather than properly validating and normalizing the complete path. This behavior could be exploited through carefully crafted symlink chains or paths that trigger specific error conditions during resolution.
Attack Vector
An attacker can exploit this vulnerability by crafting a malicious tar archive containing symbolic links with specially constructed target paths. When the victim application extracts this archive using the vulnerable tarfile methods with filter="data" or filter="tar", the symlinks can point outside the extraction directory.
The attacker only needs a way to deliver the malicious archive to a vulnerable application, typically over the network. No authentication or user interaction is required once the application extracts the archive.
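To make the archive structure concrete, the sketch below builds a small in-memory tar file containing one regular file and one symbolic link entry, then lists the link targets. The member names and the helpers `build_demo_archive` and `list_links` are hypothetical. The plain `..` target used here is the kind of entry the filters are meant to reject; the specially constructed targets that bypass the filters are not reproduced.

```python
import io
import tarfile


def build_demo_archive():
    """Build a tiny in-memory tar with one regular file and one symlink entry."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"hello\n"
        info = tarfile.TarInfo(name="docs/readme.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

        link = tarfile.TarInfo(name="docs/link")
        link.type = tarfile.SYMTYPE
        # Relative target that climbs above the archive root.
        link.linkname = "../../outside.txt"
        tar.addfile(link)
    buf.seek(0)
    return buf


def list_links(fileobj):
    """Return (name, linkname) for every symlink or hard link entry."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        return [(m.name, m.linkname) for m in tar.getmembers()
                if m.issym() or m.islnk()]


if __name__ == "__main__":
    print(list_links(build_demo_archive()))
```

Nothing is written to disk here; the archive is only created and inspected in memory.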
// Excerpt from the security patch: updated os.path.realpath() documentation describing the non-strict behavior and the new ALLOW_MISSING mode
links encountered in the path (if they are supported by the operating
system).
- If a path doesn't exist or a symlink loop is encountered, and *strict* is
- ``True``, :exc:`OSError` is raised. If *strict* is ``False``, the path is
- resolved as far as possible and any remainder is appended without checking
- whether it exists.
+ By default, the path is evaluated up to the first component that does not
+ exist, is a symlink loop, or whose evaluation raises :exc:`OSError`.
+ All such components are appended unchanged to the existing part of the path.
+
+ Some errors that are handled this way include "access denied", "not a
+ directory", or "bad argument to internal function". Thus, the
+ resulting path may be missing or inaccessible, may still contain
+ links or loops, and may traverse non-directories.
+
+ This behavior can be modified by keyword arguments:
+
+ If *strict* is ``True``, the first error encountered when evaluating the path is
+ re-raised.
+ In particular, :exc:`FileNotFoundError` is raised if *path* does not exist,
+ or another :exc:`OSError` if it is otherwise inaccessible.
+
+ If *strict* is :py:data:`os.path.ALLOW_MISSING`, errors other than
+ :exc:`FileNotFoundError` are re-raised (as with ``strict=True``).
+ Thus, the returned path will not contain any symbolic links, but the named
+ file and some of its parent directories may be missing.
Source: GitHub CPython Commit Update
Detection Methods for CVE-2025-4138
Indicators of Compromise
- Unexpected symbolic links created during tar extraction pointing to paths outside the extraction directory
- File access or modification attempts to sensitive system files (e.g., /etc/passwd, /etc/shadow, configuration files) following tar extraction operations
- Python processes accessing files in unexpected locations after processing tar archives
- Audit logs showing tarfile operations followed by suspicious file system activity
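The first indicator above can be checked after the fact by walking an extraction directory and reporting symbolic links that resolve outside it. This is a minimal sketch assuming a POSIX filesystem; `find_escaping_symlinks` is a hypothetical helper, and because it relies on `os.path.realpath()` itself, it should be treated as a triage aid rather than a guarantee.

```python
import os


def find_escaping_symlinks(extract_dir):
    """Return (path, raw_target) for symlinks under extract_dir that resolve outside it."""
    root = os.path.realpath(extract_dir)
    escaping = []
    # followlinks=False: symlinked directories are listed but never descended into.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            # commonpath compares whole components, unlike a string prefix test.
            if os.path.commonpath([root, target]) != root:
                escaping.append((path, os.readlink(path)))
    return escaping
```

Running it against a directory that received untrusted archives, for example `find_escaping_symlinks("/srv/uploads/extracted")` (a placeholder path), returns an empty list when no link leaves the directory.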
Detection Strategies
- Monitor Python applications for calls to TarFile.extractall() or TarFile.extract() with untrusted input sources
- Implement file integrity monitoring on sensitive system files and directories
- Deploy application-level logging to track tar archive extraction operations and resulting file system changes
- Use SentinelOne's behavioral AI to detect anomalous file access patterns following archive extraction
Monitoring Recommendations
- Enable audit logging for file system operations in directories where tar extraction occurs
- Configure alerts for symbolic link creation events that target paths outside expected directories
- Monitor network traffic for tar file downloads to systems running vulnerable Python versions
- Review application logs for tarfile.OutsideDestinationError or the new tarfile.LinkFallbackError exceptions
How to Mitigate CVE-2025-4138
Immediate Actions Required
- Audit all Python applications for usage of TarFile.extractall() or TarFile.extract() with the filter= parameter set to "data" or "tar"
- Update Python installations to a release that includes the fix
- Avoid processing untrusted tar archives until patches are applied
- Implement additional validation of extracted file paths before allowing file system operations
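The audit step above can be partly automated with a static scan. The following sketch uses the standard `ast` module to list `.extract()` and `.extractall()` calls in a source string together with any `filter=` argument; `find_tar_extract_calls` is a hypothetical helper, and it will also match unrelated objects that happen to expose methods with these names (for example `zipfile`), so results need manual review.

```python
import ast


def find_tar_extract_calls(source, filename="<string>"):
    """Return (line, method, filter_repr) for .extract()/.extractall() calls."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr not in {"extract", "extractall"}:
            continue
        filter_repr = None  # None means no filter= keyword was passed
        for kw in node.keywords:
            if kw.arg == "filter":
                filter_repr = ast.unparse(kw.value)
        findings.append((node.lineno, node.func.attr, filter_repr))
    return sorted(findings, key=lambda item: item[0])
```

Calls reported with `'data'`, `'tar'`, or no filter at all (on Python 3.14 and later) are the ones this advisory concerns.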
Patch Information
The Python Security Team has released patches addressing this vulnerability. Multiple commits have been made to the CPython repository to fix the underlying issues:
- GitHub CPython Issue Report - Original issue tracking
- GitHub CPython Pull Request - Security fix implementation
- Python Security Announcement Thread - Official security announcement
The patches improve path normalization by adding a new os.path.realpath() mode, strict=os.path.ALLOW_MISSING, and add a new LinkFallbackError exception to properly handle link emulation security.
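Since the patch adds new public names, their presence can serve as a rough indicator that an interpreter carries the fix. This is a heuristic sketch only: `has_tarfile_link_fix` is a hypothetical helper, and the interpreter's version and the distributor's changelog remain the authoritative source.

```python
import os.path
import tarfile


def has_tarfile_link_fix():
    """Heuristic: True if this interpreter exposes the names the patch adds."""
    return (hasattr(os.path, "ALLOW_MISSING")
            and hasattr(tarfile, "LinkFallbackError"))


if __name__ == "__main__":
    print("patched" if has_tarfile_link_fix() else "possibly unpatched")
```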
Workarounds
- Implement a custom filter function that performs strict path validation before extraction
- Use filter="fully_trusted" only when extracting archives from completely trusted sources, since it honors all archive metadata and performs no safety checks
- Wrap tar extraction in a sandboxed environment with restricted file system access
- Manually verify archive contents using TarFile.getmembers() before extraction to identify suspicious symlinks
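# Configuration example - Verify tar archive contents before extraction
# archive.tar and /path/to/dest are placeholder arguments
python3 -c "
import os
import sys
import tarfile

def find_suspicious_links(tar_path, dest_dir):
    # Lexical check only: report links whose target would land outside dest_dir
    dest = os.path.abspath(dest_dir)
    suspicious = []
    with tarfile.open(tar_path, 'r:*') as tar:
        for member in tar.getmembers():
            if member.issym():
                # Symlink targets are relative to the directory containing the link
                base = os.path.join(dest, os.path.dirname(member.name))
            elif member.islnk():
                # Hard link targets are relative to the archive root
                base = dest
            else:
                continue
            target = os.path.normpath(os.path.join(base, member.linkname))
            # commonpath avoids the prefix bug of startswith (/tmp/out vs /tmp/out-evil)
            if os.path.commonpath([dest, target]) != dest:
                suspicious.append((member.name, member.linkname))
    return suspicious

for name, linkname in find_suspicious_links(sys.argv[1], sys.argv[2]):
    print(f'WARNING: Suspicious link detected: {name} -> {linkname}')
# Only extract after validation, and apply the patch first:
# tar.extractall(dest_dir, filter='data')
" archive.tar /path/to/dest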
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.


