CVE-2023-42503 Overview
CVE-2023-42503 is an Improper Input Validation and Uncontrolled Resource Consumption vulnerability affecting Apache Commons Compress, a widely used Java library for working with archive formats. The vulnerability exists in the TAR parsing functionality, specifically in how the library handles file time fields carried in PAX extended headers. A malicious actor can craft a TAR file whose time fields contain numbers with extremely long fractions or exponent notation. Parsing such a value triggers an algorithmic complexity issue in Java's BigDecimal class, resulting in a denial of service through CPU exhaustion.
Critical Impact
Maliciously crafted TAR files can cause applications using Apache Commons Compress to become unresponsive for hours, leading to complete denial of service via CPU resource exhaustion.
Affected Products
- Apache Commons Compress versions 1.22 to 1.23.x (prior to 1.24.0)
- Applications using CompressorStreamFactory class with auto-detection of file types
- Applications using TarArchiveInputStream and TarFile classes to parse TAR files
Discovery Timeline
- 2023-09-14 - CVE-2023-42503 published to NVD
- 2025-02-13 - Last updated in NVD database
Technical Details for CVE-2023-42503
Vulnerability Analysis
The vulnerability was introduced in Apache Commons Compress version 1.22 when support was added for file modification times with higher precision (tracked as COMPRESS-612). The PAX extended header format for these timestamps consists of two numbers separated by a period, representing seconds and subsecond precision (e.g., "1647221103.5998539"). The impacted fields include atime, ctime, mtime, and LIBARCHIVE.creationtime.
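To make the "seconds.fraction" layout concrete, the following minimal sketch splits a well-formed value into whole seconds and nanoseconds using plain string operations. The class and method names are hypothetical and this is not the library's parsing code. It assumes a non-negative value made up only of digits and at most one period.

```java
import java.time.Instant;

final class PaxTimeExample {
    /**
     * Splits a well-formed, non-negative "seconds.fraction" value into an Instant.
     * Illustration only: no validation beyond what Long.parseLong enforces.
     */
    static Instant parse(String value) {
        int dot = value.indexOf('.');
        String secondsPart = dot < 0 ? value : value.substring(0, dot);
        String fractionPart = dot < 0 ? "" : value.substring(dot + 1);
        long seconds = Long.parseLong(secondsPart);
        // Right-pad with zeros, then keep nine digits: nanosecond resolution.
        String nanosDigits = (fractionPart + "000000000").substring(0, 9);
        long nanos = Long.parseLong(nanosDigits);
        return Instant.ofEpochSecond(seconds, nanos);
    }

    public static void main(String[] args) {
        Instant t = parse("1647221103.5998539");
        // prints: 1647221103 s, 599853900 ns
        System.out.println(t.getEpochSecond() + " s, " + t.getNano() + " ns");
    }
}
```

A legitimate value is therefore short: roughly ten digits of seconds plus at most nine meaningful fractional digits.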
The core issue lies in the lack of input validation prior to parsing these header values. When the library encounters a timestamp value, it passes it directly to Java's BigDecimal class for parsing. This class has a well-documented algorithmic complexity vulnerability (tracked as JDK-6560193) that causes severe performance degradation when processing numbers with very long fractional parts or extreme exponent notation.
An attacker can exploit this by placing a number with an extremely long fraction (300,000+ digits) or using exponent notation (such as "9e9999999") within a file modification time header. When an application attempts to parse such a malicious TAR file, the operation that should complete in seconds instead takes hours, effectively causing a denial of service.
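The exponent form is dangerous because a handful of characters denotes an enormous number. The sketch below, with hypothetical class and method names, inspects the BigDecimal representation of "9e9999999" without performing any arithmetic on it, so it is safe to run. It only shows the size of the number that later operations would have to deal with.

```java
import java.math.BigDecimal;

final class ExponentPayloadDemo {
    /** Number of digits in the integer part if the value were written out in plain notation. */
    static long expandedDigitCount(BigDecimal value) {
        // precision() counts digits of the unscaled value; a negative scale appends zeros.
        return (long) value.precision() - value.scale();
    }

    public static void main(String[] args) {
        // Constructing the value is cheap: BigDecimal stores the unscaled value 9 and a scale.
        BigDecimal payload = new BigDecimal("9e9999999");
        // prints: unscaled=9 scale=-9999999
        System.out.println("unscaled=" + payload.unscaledValue() + " scale=" + payload.scale());
        // prints: digits if expanded: 10000000
        System.out.println("digits if expanded: " + expandedDigitCount(payload));
        // Deliberately not calling toBigInteger() or similar: that would have to
        // materialize a ten-million-digit integer.
    }
}
```

A nine-character header value thus stands for a ten-million-digit integer, which is why a simple cap on string length alone does not defeat this variant of the attack.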
This vulnerability is similar to CVE-2012-2098, an earlier algorithmic complexity issue in Apache Commons Compress that affected its bzip2 compression code.
Root Cause
The root cause is the absence of input validation on PAX extended header values before they are processed by the BigDecimal parser. The library trusts that incoming timestamp values conform to reasonable specifications and passes them directly to mathematical parsing functions that have known algorithmic complexity issues with specially crafted inputs. The BigDecimal class performs arithmetic operations that scale poorly with the size of the numbers being processed, making it susceptible to resource exhaustion attacks.
Attack Vector
The attack vector is local: exploitation succeeds only when a user or application on the target system processes a maliciously crafted TAR file. The attacker creates a TAR archive with manipulated PAX extended headers containing timestamp values designed to trigger the algorithmic complexity vulnerability. When the target application attempts to extract or inspect the archive using a vulnerable version of Apache Commons Compress, the parsing operation consumes excessive CPU cycles for extended periods.
The attack is particularly dangerous in scenarios where applications automatically process uploaded archives, scan directories for new files, or handle TAR files from untrusted sources. Since the malicious payload is embedded in metadata headers rather than file content, traditional content inspection may not detect the threat.
Detection Methods for CVE-2023-42503
Indicators of Compromise
- Unusual CPU utilization spikes when processing TAR archive files
- Application threads stuck in BigDecimal parsing operations for extended periods
- TAR files containing PAX extended headers with abnormally long numeric values in timestamp fields
- Thread dumps showing threads that stay RUNNABLE inside java.math.BigDecimal frames during archive processing (CPU-bound threads do not appear as BLOCKED)
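The indicator about abnormal PAX timestamp values can be checked without invoking Apache Commons Compress at all. The following is a rough sketch under several stated assumptions: the archive is already buffered in memory, header checksums are not verified, GNU base-256 size fields are not handled (they are simply reported as suspicious), and the accepted value shape (up to 19 integer and 19 fractional digits) is a hypothetical policy. All class and method names are illustrative. `paxOnlyArchive` is only a fixture for the demo and does not produce an archive that real tools would accept.

```java
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.regex.Pattern;

final class PaxTimeHeaderScanner {
    private static final int BLOCK = 512;
    private static final Set<String> TIME_KEYS =
            Set.of("atime", "ctime", "mtime", "LIBARCHIVE.creationtime");
    // Hypothetical policy: optional sign, 1-19 integer digits, optional 1-19 fraction digits.
    private static final Pattern SANE_TIME = Pattern.compile("-?\\d{1,19}(\\.\\d{1,19})?");

    /** Returns true if a PAX time field looks abnormal or the archive cannot be parsed. */
    static boolean isSuspicious(byte[] tar) {
        try {
            int pos = 0;
            while (pos + BLOCK <= tar.length && !isZeroBlock(tar, pos)) {
                // Entry size: octal ASCII at offset 124; type flag: one byte at offset 156.
                String sizeField = new String(tar, pos + 124, 12, StandardCharsets.US_ASCII).trim();
                int size = sizeField.isEmpty() ? 0 : Math.toIntExact(Long.parseLong(sizeField, 8));
                char type = (char) tar[pos + 156];
                int bodyStart = pos + BLOCK;
                if (size < 0 || size > tar.length - bodyStart) return true; // bad or truncated entry
                // 'x' and 'g' mark per-entry and global PAX extended headers.
                if ((type == 'x' || type == 'g') && recordsSuspicious(tar, bodyStart, size)) return true;
                pos = bodyStart + (size + BLOCK - 1) / BLOCK * BLOCK; // bodies are padded to 512 bytes
            }
            return false;
        } catch (NumberFormatException | ArithmeticException e) {
            return true; // unparseable size field or record length
        }
    }

    /** Walks PAX records of the form "<len> <key>=<value>\n", where len covers the whole record. */
    private static boolean recordsSuspicious(byte[] tar, int start, int size) {
        int pos = start;
        int end = start + size;
        while (pos < end) {
            int space = pos;
            while (space < end && tar[space] != ' ') space++;
            if (space == end) return true; // no length prefix
            int len = Integer.parseInt(new String(tar, pos, space - pos, StandardCharsets.US_ASCII));
            if (len <= space - pos + 2 || len > end - pos) return true; // malformed record length
            String kv = new String(tar, space + 1, len - (space - pos) - 2, StandardCharsets.UTF_8);
            int eq = kv.indexOf('=');
            if (eq < 0) return true;
            if (TIME_KEYS.contains(kv.substring(0, eq))
                    && !SANE_TIME.matcher(kv.substring(eq + 1)).matches()) return true;
            pos += len;
        }
        return false;
    }

    private static boolean isZeroBlock(byte[] tar, int pos) {
        for (int i = pos; i < pos + BLOCK; i++) {
            if (tar[i] != 0) return false;
        }
        return true;
    }

    /** Demo fixture: one PAX header entry holding a single ASCII record, no checksum. */
    static byte[] paxOnlyArchive(String key, String value) {
        String kv = " " + key + "=" + value + "\n";
        int total = kv.length();
        // The record length counts its own digits, so iterate until it is stable.
        while (total != kv.length() + String.valueOf(total).length()) {
            total = kv.length() + String.valueOf(total).length();
        }
        byte[] body = (total + kv).getBytes(StandardCharsets.US_ASCII);
        int padded = (body.length + BLOCK - 1) / BLOCK * BLOCK;
        byte[] out = new byte[BLOCK + padded + 2 * BLOCK]; // header + body + end-of-archive blocks
        byte[] sizeField = String.format("%011o", body.length).getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(sizeField, 0, out, 124, sizeField.length);
        out[156] = 'x';
        System.arraycopy(body, 0, out, BLOCK, body.length);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(isSuspicious(paxOnlyArchive("mtime", "1647221103.5998539")));
        System.out.println(isSuspicious(paxOnlyArchive("mtime", "9e9999999")));
    }
}
```

Because the check is a bounded regular-expression match on the raw header text, it never converts the value to a number. A production scanner would more likely stream the archive instead of buffering it and would need to handle the TAR variants this sketch ignores.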
Detection Strategies
- Implement monitoring for Java applications that process archive files, alerting on sustained high CPU usage during TAR operations
- Use software composition analysis (SCA) tools to identify applications using Apache Commons Compress versions 1.22 through 1.23.x
- Review application logs for parsing errors or timeouts related to TAR file processing
- Deploy runtime application self-protection (RASP) solutions to detect abnormal resource consumption patterns
Monitoring Recommendations
- Configure alerting thresholds for CPU utilization when archive processing services are active
- Monitor thread states in Java applications, flagging threads stuck in arithmetic parsing for extended durations
- Track file processing times and alert when TAR operations exceed normal baselines
- Implement queue depth monitoring for file processing pipelines to detect processing slowdowns
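The thread-state recommendation above can be prototyped inside the JVM with the standard `java.lang.management` API. The sketch below is one possible approach, with hypothetical class and method names: it takes a snapshot of all live threads and reports those that are RUNNABLE with a `java.math.BigDecimal` or `java.math.BigInteger` frame on their stack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;

final class BigDecimalThreadWatch {
    /** True if the thread is RUNNABLE and some stack frame belongs to java.math arithmetic. */
    static boolean isBusyInBigMath(Thread.State state, StackTraceElement[] stack) {
        if (state != Thread.State.RUNNABLE) {
            return false; // a CPU-bound parse shows up as RUNNABLE, not BLOCKED or WAITING
        }
        for (StackTraceElement frame : stack) {
            String cls = frame.getClassName();
            if (cls.equals("java.math.BigDecimal") || cls.equals("java.math.BigInteger")) {
                return true;
            }
        }
        return false;
    }

    /** Names of live threads that are executing java.math arithmetic at this instant. */
    static List<String> snapshot() {
        List<String> hits = new ArrayList<>();
        // dumpAllThreads(false, false): full stack traces, no lock information.
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
            if (isBusyInBigMath(info.getThreadState(), info.getStackTrace())) {
                hits.add(info.getThreadName());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("Threads in java.math right now: " + snapshot());
    }
}
```

A single snapshot proves little, since legitimate code also uses BigDecimal briefly. A monitor built on this idea would presumably sample at intervals and alert only when the same thread appears in several consecutive snapshots.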
How to Mitigate CVE-2023-42503
Immediate Actions Required
- Upgrade Apache Commons Compress to version 1.24.0 or later immediately across all affected applications
- Audit all Java applications and dependencies to identify usage of vulnerable Apache Commons Compress versions
- Implement file validation at ingestion points to reject TAR files from untrusted sources until patching is complete
- Consider temporarily disabling automatic archive processing features in applications handling files from external sources
Patch Information
Apache has addressed this vulnerability in Apache Commons Compress version 1.24.0. The fix implements proper input validation for PAX extended header values before they are processed, preventing the algorithmic complexity attack from being triggered. Users are strongly recommended to upgrade to version 1.24.0 or later to remediate this vulnerability.
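The general shape of such a fix is to check the header text against a strict pattern before any BigDecimal object is created. The sketch below illustrates that idea only. It is not taken from the library's source, and the class name, the pattern, and the 19-digit limits are assumptions chosen for the example.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.regex.Pattern;

final class ValidatedPaxTime {
    // Assumed bounds: optional sign, 1-19 integer digits, optional 1-19 fraction digits.
    // No exponent notation is accepted.
    private static final Pattern TIME = Pattern.compile("-?\\d{1,19}(?:\\.\\d{1,19})?");

    /**
     * Parses a PAX time value, rejecting anything that does not match the pattern.
     * Values that pass the pattern but fall outside Instant's range still raise
     * java.time.DateTimeException.
     */
    static Instant parse(String value) {
        if (!TIME.matcher(value).matches()) {
            // Report only the length: the offending value may be hundreds of kilobytes long.
            throw new IllegalArgumentException(
                    "Invalid PAX time value (length " + value.length() + ")");
        }
        // The input is now known to be short, so BigDecimal arithmetic is cheap.
        BigDecimal epochSeconds = new BigDecimal(value);
        long seconds = epochSeconds.longValue();
        long nanos = epochSeconds.remainder(BigDecimal.ONE).movePointRight(9).longValue();
        return Instant.ofEpochSecond(seconds, nanos);
    }

    public static void main(String[] args) {
        System.out.println(parse("1647221103.5998539"));
        try {
            parse("9e9999999");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The important design point is ordering: the pattern match runs in time proportional to the input length and happens before the expensive numeric code is reachable.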
For additional information, refer to the Apache Mailing List Thread and the NetApp Security Advisory NTAP-20231020-0003.
Workarounds
- Implement pre-processing validation of TAR files to inspect PAX header values for abnormally long numeric strings before passing to Apache Commons Compress
- Apply resource limits (CPU time quotas, thread timeouts) to archive processing operations to prevent runaway resource consumption
- Isolate archive processing in sandboxed environments with strict resource constraints
- Restrict archive processing to files from trusted, verified sources only until the library can be upgraded
# Maven dependency update example
# Update pom.xml to use patched version
mvn versions:use-dep-version -Dincludes=org.apache.commons:commons-compress -DdepVersion=1.24.0
# Verify the update
mvn dependency:tree | grep commons-compress
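For the thread-timeout workaround listed above, a minimal sketch using the standard `java.util.concurrent` API might look as follows. The class and method names are hypothetical, and the caller is assumed to wrap its own archive-reading code in the `Callable`. Note the limitation stated in the comments: a timeout stops the caller from waiting, but it does not stop a CPU-bound parse.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedArchiveTask {
    /**
     * Runs the task on a dedicated daemon thread and waits at most the given limit.
     * Returns an empty Optional on timeout or interruption (a null result is also
     * reported as empty in this sketch).
     */
    static <T> Optional<T> runWithTimeout(Callable<T> task, Duration limit) {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "archive-worker");
            t.setDaemon(true); // a runaway parse must not keep the JVM alive
            return t;
        });
        Future<T> future = worker.submit(task);
        try {
            return Optional.ofNullable(future.get(limit.toMillis(), TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // cancel(true) only requests an interrupt; pure arithmetic never checks for it,
            // so the worker thread may keep consuming CPU until the computation ends.
            future.cancel(true);
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new IllegalStateException("Archive task failed", e.getCause());
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Optional<String> result = runWithTimeout(() -> "parsed", Duration.ofSeconds(30));
        System.out.println(result.orElse("timed out"));
    }
}
```

Because the abandoned thread can keep running, this approach limits request latency rather than CPU use. Running archive processing in a separate process or container with a CPU quota, as the sandboxing workaround suggests, is the more robust way to reclaim the resources.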


