CVE-2025-4138: Python tarfile Path Traversal Vulnerability

CVE-2025-4138 Overview

CVE-2025-4138 is a path traversal vulnerability in Python's tarfile module that allows extraction filters to be bypassed. This security flaw enables symbolic link targets to point outside the intended destination directory and permits modification of certain file metadata, potentially leading to unauthorized file access or system compromise.

The vulnerability affects applications using the tarfile module to extract untrusted tar archives via TarFile.extractall() or TarFile.extract() when the filter= parameter is set to "data" or "tar". Notably, Python 3.14 and later versions changed the default filter value from "no filtering" to "data", meaning applications relying on this new default behavior are also affected.

Critical Impact
Attackers can craft malicious tar archives that bypass extraction filters, enabling symlink-based path traversal to read sensitive files outside the extraction directory or potentially overwrite critical system files.

Affected Products

Python tarfile module with filter="data" parameter
Python tarfile module with filter="tar" parameter
Python 3.14+ using default filter behavior

Discovery Timeline

2025-06-03 - CVE-2025-4138 published to NVD
2025-06-05 - Last updated in NVD database

Technical Details for CVE-2025-4138

Vulnerability Analysis

This vulnerability is classified under CWE-22 (Improper Limitation of a Pathname to a Restricted Directory), commonly known as path traversal. The flaw exists in how the tarfile module handles symbolic link normalization when extraction filters are applied.

When processing tar archives containing symbolic links, the filter mechanism intended to restrict extraction to the destination directory can be circumvented. This allows symlink targets to escape the intended extraction boundary, potentially enabling attackers to access or modify files anywhere on the filesystem where the application has permissions.

The vulnerability is particularly concerning because the "data" and "tar" filters were specifically designed to provide security against such attacks. Applications that adopted these filters expecting protection are now vulnerable.

Root Cause

The root cause lies in insufficient normalization of symbolic link targets within the tarfile extraction filter logic. When os.path.realpath() reached a path component that did not exist or whose evaluation raised an error, it stopped resolving at that point, which allowed paths to be constructed that bypass the destination directory restriction.

The original implementation would append problematic path components unchanged to the resolved portion of the path when errors occurred, rather than properly validating and normalizing the complete path. This behavior could be exploited through carefully crafted symlink chains or paths that trigger specific error conditions during resolution.
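The non-strict behavior described above can be observed directly. This is a small sketch, assuming Python 3.10 or later for the strict keyword; resolve_non_strict is a hypothetical wrapper name used only here.

```python
import os
import tempfile


def resolve_non_strict(path: str) -> str:
    """Resolve a path the default (non-strict) way; unresolved components are kept."""
    return os.path.realpath(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.realpath(tmp)
        missing = os.path.join(base, "does-not-exist", "child")
        # No error is raised even though nothing below base exists:
        # the missing components are appended to the resolved prefix unchanged.
        print(resolve_non_strict(missing) == missing)
        try:
            os.path.realpath(missing, strict=True)
        except FileNotFoundError as exc:
            print("strict=True raised:", type(exc).__name__)
```

A validation routine that trusts the non-strict result therefore cannot assume that the returned path is free of unresolved components.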

Attack Vector

An attacker can exploit this vulnerability by crafting a malicious tar archive containing symbolic links with specially constructed target paths. When the victim application extracts this archive using the vulnerable tarfile methods with filter="data" or filter="tar", the symlinks can point outside the extraction directory.

The attack requires network access to deliver the malicious archive to a vulnerable application, but no authentication or user interaction is necessary for exploitation once the archive is processed.

text

// Excerpt from the security patch: updated documentation for os.path.realpath()
    links encountered in the path (if they are supported by the operating
    system).
 
-   If a path doesn't exist or a symlink loop is encountered, and *strict* is
-   ``True``, :exc:`OSError` is raised. If *strict* is ``False``, the path is
-   resolved as far as possible and any remainder is appended without checking
-   whether it exists.
+   By default, the path is evaluated up to the first component that does not
+   exist, is a symlink loop, or whose evaluation raises :exc:`OSError`.
+   All such components are appended unchanged to the existing part of the path.
+
+   Some errors that are handled this way include "access denied", "not a
+   directory", or "bad argument to internal function". Thus, the
+   resulting path may be missing or inaccessible, may still contain
+   links or loops, and may traverse non-directories.
+
+   This behavior can be modified by keyword arguments:
+
+   If *strict* is ``True``, the first error encountered when evaluating the path is
+   re-raised.
+   In particular, :exc:`FileNotFoundError` is raised if *path* does not exist,
+   or another :exc:`OSError` if it is otherwise inaccessible.
+
+   If *strict* is :py:data:`os.path.ALLOW_MISSING`, errors other than
+   :exc:`FileNotFoundError` are re-raised (as with ``strict=True``).
+   Thus, the returned path will not contain any symbolic links, but the named
+   file and some of its parent directories may be missing.

Source: GitHub CPython Commit Update

Detection Methods for CVE-2025-4138

Indicators of Compromise

Unexpected symbolic links created during tar extraction pointing to paths outside the extraction directory
File access or modification attempts to sensitive system files (e.g., /etc/passwd, /etc/shadow, configuration files) following tar extraction operations
Python processes accessing files in unexpected locations after processing tar archives
Audit logs showing tarfile operations followed by suspicious file system activity
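The first indicator above can be checked after the fact with a short scan of an extraction directory. This is a sketch under the assumption that the directory is examined on the same host and filesystem view where extraction happened; find_escaping_symlinks is a hypothetical helper name.

```python
import os


def _is_within(root: str, target: str) -> bool:
    """Return True if target is root itself or lies below it."""
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised for paths on different drives (Windows); treat as outside.
        return False


def find_escaping_symlinks(root: str) -> list[tuple[str, str]]:
    """Return (link_path, raw_target) for symlinks under root that resolve outside it."""
    root_real = os.path.realpath(root)
    findings = []
    for dirpath, dirnames, filenames in os.walk(root_real, followlinks=False):
        # Symlinks to directories are listed in dirnames, all others in filenames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) and not _is_within(root_real, os.path.realpath(path)):
                findings.append((path, os.readlink(path)))
    return findings
```

A non-empty result does not prove compromise, since some archives legitimately ship outward-pointing links, but each entry is worth reviewing.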

Detection Strategies

Monitor Python applications for calls to TarFile.extractall() or TarFile.extract() with untrusted input sources
Implement file integrity monitoring on sensitive system files and directories
Deploy application-level logging to track tar archive extraction operations and resulting file system changes
Use SentinelOne's behavioral AI to detect anomalous file access patterns following archive extraction

Monitoring Recommendations

Enable audit logging for file system operations in directories where tar extraction occurs
Configure alerts for symbolic link creation events that target paths outside expected directories
Monitor network traffic for tar file downloads to systems running vulnerable Python versions
Review application logs for tarfile.OutsideDestinationError or the new tarfile.LinkFallbackError exceptions

How to Mitigate CVE-2025-4138

Immediate Actions Required

Audit all Python applications for usage of TarFile.extractall() or TarFile.extract() with the filter= parameter set to "data" or "tar"
Update Python installations to patched versions as they become available
Avoid processing untrusted tar archives until patches are applied
Implement additional validation of extracted file paths before allowing file system operations
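The audit step in the first item can be partly automated with the standard ast module. The sketch below matches calls by attribute name only, so it will also report extract() or extractall() on other objects such as zipfile archives; treat hits as candidates for manual review. The function name find_tar_extract_calls is hypothetical.

```python
import ast


def find_tar_extract_calls(source: str, filename: str = "<string>") -> list[tuple[int, str, str]]:
    """Return (line, method, filter) for every .extract()/.extractall() call in source."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr not in {"extract", "extractall"}:
            continue
        filter_kw = next((kw for kw in node.keywords if kw.arg == "filter"), None)
        if filter_kw is None:
            filter_desc = "<default>"   # relies on the interpreter's default filter
        elif isinstance(filter_kw.value, ast.Constant):
            filter_desc = repr(filter_kw.value.value)
        else:
            filter_desc = "<dynamic>"   # filter chosen at runtime; inspect manually
        findings.append((node.lineno, node.func.attr, filter_desc))
    return sorted(findings)


if __name__ == "__main__":
    sample = 'import tarfile\nt = tarfile.open("a.tar")\nt.extractall("out", filter="data")\n'
    for line, method, filt in find_tar_extract_calls(sample):
        print(f"line {line}: {method}() filter={filt}")
```

Calls reported with filter 'data', 'tar', or, on Python 3.14 and later, the default are the ones this advisory concerns.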

Patch Information

The Python Security Team has released patches addressing this vulnerability. Multiple commits have been made to the CPython repository to fix the underlying issues:

GitHub CPython Issue Report - Original issue tracking
GitHub CPython Pull Request - Security fix implementation
Python Security Announcement Thread - Official security announcement

The patches improve path normalization by extending os.path.realpath() with a new strict=os.path.ALLOW_MISSING option, and they add a new tarfile.LinkFallbackError exception, raised when a link would have to be emulated by extracting a member that the filter rejects.
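Because the patch adds new public names, their presence can serve as a rough indicator that an interpreter carries the fix. This is a heuristic sketch only: vendor builds may backport changes differently, so the result should be confirmed against the distributor's advisory. The function name patch_markers is hypothetical.

```python
import os.path
import sys
import tarfile


def patch_markers() -> dict[str, bool]:
    """Report whether the names introduced by the fix are present in this interpreter."""
    return {
        "os.path.ALLOW_MISSING": hasattr(os.path, "ALLOW_MISSING"),
        "tarfile.LinkFallbackError": hasattr(tarfile, "LinkFallbackError"),
    }


if __name__ == "__main__":
    print(sys.version)
    for name, present in patch_markers().items():
        print(f"{name}: {'present' if present else 'missing'}")
```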

Workarounds

Implement a custom filter function that performs strict path validation before extraction
Use filter="fully_trusted" only when extracting archives from completely trusted sources
Wrap tar extraction in a sandboxed environment with restricted file system access
Manually verify archive contents using TarFile.getmembers() before extraction to identify suspicious symlinks

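bash

# Example - list suspicious links in a tar archive before extraction
# Usage: bash check_tar.sh ARCHIVE DEST_DIR  (check_tar.sh is a placeholder name)
python3 -c "
import os
import sys
import tarfile

def find_suspicious_links(tar_path, dest_dir):
    # Lexical check only: it does not resolve symlink chains inside the archive
    dest = os.path.abspath(dest_dir)
    suspicious = []
    with tarfile.open(tar_path, 'r:*') as tar:
        for member in tar.getmembers():
            if member.issym():
                # Symlink targets are relative to the directory containing the link
                base = os.path.join(dest, os.path.dirname(member.name))
            elif member.islnk():
                # Hard link targets are relative to the archive root
                base = dest
            else:
                continue
            target = os.path.normpath(os.path.join(base, member.linkname))
            # commonpath avoids the prefix pitfall of startswith (/srv/out vs /srv/out2)
            if os.path.commonpath([dest, target]) != dest:
                suspicious.append((member.name, member.linkname))
    return suspicious

for name, linkname in find_suspicious_links(sys.argv[1], sys.argv[2]):
    print(f'WARNING: Suspicious link detected: {name} -> {linkname}')
# Extract only after validation, and only on a patched Python:
# tar.extractall(dest_dir, filter='data')
" "$1" "$2"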
