CVE-2025-64712 Overview
CVE-2025-64712 is a critical path traversal vulnerability affecting the Unstructured library, an open-source Python framework used for ingesting and pre-processing images and text documents including PDFs, HTML, Word docs, and many more file formats. Prior to version 0.18.18, a path traversal vulnerability in the partition_msg function allows an attacker to write or overwrite arbitrary files on the filesystem when processing malicious MSG files with attachments.
Critical Impact
Attackers can achieve arbitrary file write or overwrite capabilities on the filesystem by crafting malicious MSG files with specially constructed attachment filenames, potentially leading to remote code execution or system compromise.
Affected Products
- Unstructured library versions prior to 0.18.18
- Python applications using the partition_msg function for MSG file processing
- Document processing pipelines handling untrusted MSG attachments
Discovery Timeline
- 2026-02-04 - CVE-2025-64712 published to NVD
- 2026-02-27 - Last updated in NVD database
Technical Details for CVE-2025-64712
Vulnerability Analysis
This vulnerability stems from insufficient input validation in the MSG file attachment processing functionality within the Unstructured library. The partition_msg function fails to properly sanitize attachment filenames extracted from MSG files before using them in file system operations. When processing an MSG file, the library extracts attachment metadata including filenames, which are then used to determine where attachments are written on disk.
The lack of proper filename sanitization means that maliciously crafted filenames containing path traversal sequences (such as ../ for Unix systems or ..\ for Windows) are not neutralized. This allows an attacker to escape the intended directory and write files to arbitrary locations on the filesystem where the application has write permissions.
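To make the escape concrete, the following minimal sketch shows how joining an unsanitized attachment name onto an output directory can resolve outside of it. The directory /tmp/attachments and the helper name resolve_write_path are hypothetical and chosen only for illustration; this is not the library's actual write logic.

```python
import posixpath


def resolve_write_path(output_dir: str, attachment_name: str) -> str:
    """Join an attachment name onto a directory the way naive code would.

    posixpath is used so the result is identical on every platform.
    """
    return posixpath.normpath(posixpath.join(output_dir, attachment_name))


# A benign name stays inside the (hypothetical) output directory.
benign = resolve_write_path("/tmp/attachments", "report.pdf")

# A crafted name climbs out of it through "../" sequences.
crafted = resolve_write_path("/tmp/attachments", "../../etc/cron.d/malicious")

print(benign)
print(crafted)
```

The second path no longer starts with the output directory, which is exactly the condition the attack relies on.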
Root Cause
The root cause is a classic CWE-22 (Path Traversal) vulnerability where the file_name property of MSG attachments was directly returned without any sanitization. The original implementation simply returned self._attachment.file_name or "unknown" without removing path components, null bytes, or other potentially dangerous characters that could be used to traverse directories.
Attack Vector
An attacker can exploit this vulnerability by crafting a malicious MSG file containing attachments with specially crafted filenames. The attack is network-accessible, requiring no authentication or user interaction beyond the victim's application processing the malicious MSG file.
The attack scenario involves:
- Creating an MSG file with an attachment whose filename contains path traversal sequences (e.g., ../../../etc/cron.d/malicious)
- Delivering this MSG file to a target system running the vulnerable Unstructured library
- When the application processes the MSG file using partition_msg, the attachment is written to an attacker-controlled path on the filesystem
The fix in version 0.18.18 changes the file_name property as follows (diff excerpt):
         """The original name of the attached file, no path.
         This value is 'unknown' if it is not present in the MSG file (not expected).
+        The filename is sanitized to prevent path traversal attacks.
         """
-        return self._attachment.file_name or "unknown"
+        raw_filename = self._attachment.file_name or "unknown"
+
+        # Sanitize the filename to prevent path traversal attacks
+        # Remove any path components for both Unix and Windows paths
+        # Use both separators to handle cross-platform attacks
+        safe_filename = os.path.basename(raw_filename.replace("\\", "/"))
+
+        # Remove null bytes and other control characters
+        safe_filename = safe_filename.replace("\0", "")
+
+        # If the filename becomes empty after sanitization, use a default
+        if not safe_filename or safe_filename in (".", ".."):
+            safe_filename = "unknown"
+
+        return safe_filename
     @lazyproperty
     def _attachment_last_modified(self) -> str | None:
Source: GitHub Commit Update
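For readers who want to experiment with the patched behavior outside the library, the sanitization steps from the diff can be restated as a standalone function. The function name sanitize_attachment_filename is hypothetical; only the steps themselves are taken from the patch.

```python
import os
from typing import Optional


def sanitize_attachment_filename(raw_filename: Optional[str]) -> str:
    """Standalone restatement of the sanitization steps shown in the patch."""
    name = raw_filename or "unknown"

    # Normalize Windows separators so basename() strips both path styles.
    name = os.path.basename(name.replace("\\", "/"))

    # Drop null bytes.
    name = name.replace("\0", "")

    # Fall back to a default for empty or directory-like results.
    if not name or name in (".", ".."):
        name = "unknown"
    return name


for candidate in ["report.pdf", "../../etc/passwd", "..\\..\\Windows\\evil.dll", ".."]:
    print(repr(candidate), "->", repr(sanitize_attachment_filename(candidate)))
```

Replacing backslashes before calling os.path.basename matters because on POSIX systems a backslash is an ordinary filename character, so a Windows-style path would otherwise pass through intact.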
Detection Methods for CVE-2025-64712
Indicators of Compromise
- Unexpected files appearing in sensitive directories such as /etc/, /var/, or application configuration paths
- MSG files containing attachments with suspicious filenames including ../ or ..\ sequences
- File system events showing write operations to paths outside expected document processing directories
- Log entries indicating MSG file processing followed by unexpected file creation events
Detection Strategies
- Implement file integrity monitoring on critical system directories and application paths
- Deploy runtime application security monitoring to detect path traversal attempts during file operations
- Analyze MSG files at ingestion points for attachments with path traversal patterns in filenames
- Monitor Python application logs for unusual file write operations during document processing
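As a rough illustration of screening attachment names at an ingestion point, the sketch below flags names that contain path separators, null bytes, or bare directory references. It assumes the names have already been extracted by whatever MSG parser the pipeline uses; the function name is_suspicious_attachment_name and the sample list are hypothetical.

```python
def is_suspicious_attachment_name(name: str) -> bool:
    """Flag names that a legitimate 'file name, no path' value should never contain."""
    if "\0" in name:
        return True
    # Any separator, Unix or Windows style, means the value carries a path.
    if "/" in name or "\\" in name:
        return True
    # Bare directory references are not valid file names.
    return name in (".", "..")


# Hypothetical attachment names pulled from an incoming MSG file.
names = ["invoice.pdf", "../../etc/cron.d/malicious", "..\\evil.exe", "notes..txt"]
flagged = [n for n in names if is_suspicious_attachment_name(n)]
print(flagged)
```

The check is deliberately strict: it rejects every separator rather than only "../" sequences, since an attachment's original file name is not expected to include any path component.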
Monitoring Recommendations
- Configure alerts for file creation or modification in sensitive system directories during MSG processing workflows
- Implement behavioral analysis to baseline normal file write patterns for document processing applications
- Deploy endpoint detection and response (EDR) solutions capable of correlating MSG file processing with suspicious filesystem activity
- Enable audit logging for all file system operations in environments processing untrusted documents
How to Mitigate CVE-2025-64712
Immediate Actions Required
- Upgrade the Unstructured library to version 0.18.18 or later immediately
- Audit existing deployments to identify systems running vulnerable versions using pip show unstructured
- Review file systems for evidence of exploitation, particularly unexpected files in sensitive directories
- Implement network-level controls to quarantine MSG files pending library updates
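The version audit can also be scripted from inside each Python environment. The following is a minimal sketch using the standard library's importlib.metadata; the helper names are hypothetical, and the parser intentionally ignores pre-release suffixes (a stricter comparison would need a dedicated version-parsing package).

```python
import re
from importlib import metadata
from typing import Optional, Tuple

PATCHED_VERSION = (0, 18, 18)


def parse_release(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release segment of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_vulnerable(version: str) -> bool:
    """Return True if the release segment is older than the patched release."""
    return parse_release(version) < PATCHED_VERSION


def installed_unstructured_version() -> Optional[str]:
    """Return the installed version of unstructured, or None if it is absent."""
    try:
        return metadata.version("unstructured")
    except metadata.PackageNotFoundError:
        return None


found = installed_unstructured_version()
if found is None:
    print("unstructured is not installed in this environment")
else:
    print(found, "VULNERABLE" if is_vulnerable(found) else "patched")
```

Run the script with the interpreter of each environment under review, since it only inspects the environment it executes in.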
Patch Information
The vulnerability has been patched in version 0.18.18 of the Unstructured library. The fix sanitizes attachment filenames in four steps: it replaces backslashes with forward slashes so that Windows-style paths are handled, applies os.path.basename() to keep only the final filename component, strips null bytes, and falls back to "unknown" when the result is empty or is "." or "..". See the GitHub Security Advisory for complete details.
Workarounds
- If immediate patching is not possible, implement input validation at the application layer to reject MSG files with suspicious attachment filenames
- Process MSG files in isolated sandbox environments with restricted filesystem permissions
- Configure application-level controls to limit write operations to specific directories only
- Deploy web application firewalls or content filters to scan incoming MSG files for path traversal patterns in attachment metadata
# Upgrade to patched version (quote the requirement so the shell does not treat ">" as a redirect)
pip install --upgrade "unstructured>=0.18.18"
# Verify installed version
pip show unstructured | grep Version
# Check for vulnerable installations in virtual environments (the version appears in each directory name)
find /opt -type d -name "unstructured-*.dist-info"
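The workaround of limiting writes to a specific directory can be approximated in application code with a containment check before any attachment is saved. This is a sketch under stated assumptions, not part of the Unstructured API: safe_attachment_path is a hypothetical helper, and the check follows the path semantics of the operating system it runs on.

```python
from pathlib import Path


def safe_attachment_path(base_dir: str, filename: str) -> Path:
    """Return the resolved target path, or raise if it is not strictly inside base_dir."""
    base = Path(base_dir).resolve()
    target = (base / filename).resolve()

    # An empty or "." filename would resolve to the directory itself.
    if target == base:
        raise ValueError(f"not a file name: {filename!r}")

    # relative_to() raises ValueError when target is not located under base.
    try:
        target.relative_to(base)
    except ValueError:
        raise ValueError(f"refusing to write outside {base}: {filename!r}") from None
    return target
```

A caller would wrap each attachment write in this check and treat the ValueError as a signal to quarantine the MSG file. Resolving both paths first means that symlinks and "../" sequences are collapsed before the comparison is made.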
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.

