CVE-2022-25168 Overview
CVE-2022-25168 is a critical command injection vulnerability in Apache Hadoop's FileUtil.unTar(File, File) API. The API does not escape the input file name before passing it to a shell, so an attacker who controls that name can inject and execute arbitrary commands. The flaw is reachable through several paths: YARN localization in Hadoop 2.x, which enables remote code execution; the InMemoryAliasMap.completeBootstrapTransfer function in Hadoop 3.3, which is only run by local users; and Apache Spark's SQL ADD ARCHIVE command.
Critical Impact
This command injection vulnerability enables remote code execution through unescaped file names passed to a shell, potentially allowing complete system compromise in affected Apache Hadoop and Spark deployments.
Affected Products
- Apache Hadoop 2.x versions prior to 2.10.2
- Apache Hadoop 3.0.x through 3.2.x versions prior to 3.2.4
- Apache Hadoop 3.3.x versions prior to 3.3.3
Discovery Timeline
- August 4, 2022 - CVE-2022-25168 published to NVD
- November 21, 2024 - Last updated in NVD database
Technical Details for CVE-2022-25168
Vulnerability Analysis
This vulnerability is classified as CWE-78 (Improper Neutralization of Special Elements used in an OS Command), commonly known as OS Command Injection. The root issue lies in the FileUtil.unTar() method which processes tar archive files without properly sanitizing the input file names before executing shell commands.
When processing archive files, the vulnerable function constructs shell commands by directly concatenating user-controlled file names. An attacker can craft malicious file names containing shell metacharacters (such as semicolons, backticks, or pipe characters) to break out of the intended command context and execute arbitrary commands with the privileges of the Hadoop process.
The attack surface varies with the deployment context. In Hadoop 2.x, YARN localization uses the vulnerable function to unpack container resources, so a remote attacker can achieve code execution by submitting malicious job resources. In Hadoop 3.3, InMemoryAliasMap.completeBootstrapTransfer also calls it, but that function is only executed by local users, which limits the attack surface. Apache Spark's ADD ARCHIVE SQL command uses the function as well; because ADD ARCHIVE already allows adding binaries to the classpath, running shell commands through it gives the caller little that they could not already do.
Root Cause
The root cause is insufficient input validation and improper neutralization of special characters in file names before shell command execution. The FileUtil.unTar() method directly passes file name parameters to shell commands without escaping shell metacharacters, violating secure coding practices for handling external input in command construction.
Attack Vector
The attack is network-accessible with low complexity, requiring no privileges or user interaction. A typical exploitation sequence is:
- Craft a tar archive whose own file name contains a shell command injection payload
- Submit the archive to a vulnerable Hadoop cluster (e.g., as a YARN job resource in Hadoop 2.x)
- When FileUtil.unTar() processes the archive, the injected commands execute with the privileges of the Hadoop service account
The vulnerability allows injection through file names containing shell metacharacters. For example, a file named test;whoami;.tar would cause the whoami command to execute when the archive is processed. More sophisticated payloads could establish reverse shells or download and execute additional malware.
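The effect of an unescaped name can be sketched in a few lines of POSIX shell. This is an illustration only, not Hadoop's implementation: the helper names shell_quote and run_with_name are hypothetical, printf stands in for the real tar invocation, and a benign name (x.tar; echo INJECTED) replaces a real payload so the demo is harmless.

```shell
#!/bin/sh
# Illustration only: hypothetical helpers, not Hadoop code.

# Wrap a value in single quotes (escaping embedded single quotes) so a
# POSIX shell parses it as one literal word.
shell_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Build a command line by string concatenation and hand it to a shell,
# mirroring the unsafe pattern. $1 is pasted into the command text as-is.
run_with_name() {
  sh -c "printf 'extracting: %s\n' $1"
}

name='x.tar; echo INJECTED'

# Unescaped: the shell sees two commands, so INJECTED prints on its own line.
run_with_name "$name"

# Quoted: the whole name is a single argument and nothing extra runs.
run_with_name "$(shell_quote "$name")"
```

Quoting is only one possible defense; passing the file name as a separate argument to the process, without going through a shell at all, avoids the parsing problem entirely.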
Detection Methods for CVE-2022-25168
Indicators of Compromise
- Unexpected shell processes spawned by Java processes running Hadoop or Spark services
- Unusual network connections originating from Hadoop NodeManager or YARN container processes
- Archive files with suspicious names containing shell metacharacters (;, |, `, $()) in YARN localization directories
- Anomalous command execution patterns in system audit logs correlating with YARN job submissions
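As a rough aid for the third indicator above, the following sketch lists files whose names contain common shell metacharacters. The function name find_suspicious_names and the example path are hypothetical, and the character set is an assumption that is not exhaustive.

```shell
#!/bin/sh
# Sketch: list files whose names contain common shell metacharacters.
# The set of characters checked here is an assumption, not a complete list.
find_suspicious_names() {
  find "$1" -type f \( \
    -name '*;*' -o -name '*|*' -o -name '*&*' \
    -o -name '*`*' -o -name '*$(*' \
  \) -print
}

# Example (hypothetical path; point it at the directories configured in
# yarn.nodemanager.local-dirs on each NodeManager):
# find_suspicious_names /hadoop/yarn/local
```

A hit is not proof of compromise, since legitimate files can contain these characters, but it is a reasonable starting point for review.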
Detection Strategies
- Monitor process execution chains for shell commands spawned as children of Java processes in Hadoop deployments
- Implement file integrity monitoring on Hadoop configuration and binary directories
- Deploy YARA rules to detect malicious archive files with command injection patterns in file names
- Enable audit logging for YARN ResourceManager and NodeManager to track job submissions and resource localization events
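The first strategy above, watching for shells spawned by Java processes, might be prototyped as follows before investing in dedicated tooling. The function name shells_under_java is hypothetical, and the sketch assumes a Linux-style ps where the command name of a JVM is reported as java.

```shell
#!/bin/sh
# Sketch: read "PID PPID COMM" lines on stdin and print the PIDs of
# sh/bash/dash processes whose direct parent is a process named java.
shells_under_java() {
  awk '
    { comm[$1] = $3; ppid[$1] = $2 }
    END {
      for (p in comm)
        if (comm[p] ~ /^(sh|bash|dash)$/ &&
            (ppid[p] in comm) && comm[ppid[p]] == "java")
          print p
    }'
}
```

On a live node it could be fed with `ps -eo pid=,ppid=,comm= | shells_under_java`. Hadoop launches shells during normal operation (for example, when starting containers), so the output needs to be compared against a baseline rather than treated as an alert by itself.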
Monitoring Recommendations
- Configure centralized logging for all Hadoop cluster nodes with real-time alerting on suspicious command patterns
- Implement network segmentation monitoring to detect unexpected outbound connections from Hadoop worker nodes
- Enable Java security manager logging to capture unauthorized shell execution attempts
- Deploy endpoint detection and response (EDR) solutions like SentinelOne on all Hadoop cluster nodes for behavioral analysis
How to Mitigate CVE-2022-25168
Immediate Actions Required
- Upgrade Apache Hadoop to version 2.10.2, 3.2.4, 3.3.3, or later, all of which include the fix in HADOOP-18136
- For Apache Spark deployments, upgrade to version 3.1.4, 3.2.2, 3.3.0, or later, all of which include SPARK-38305
- Audit existing YARN jobs and archives for potentially malicious file names
- Restrict network access to YARN ResourceManager and NodeManager ports using firewall rules
Patch Information
Apache has released security patches addressing this vulnerability. Users should upgrade to the following fixed versions:
- Apache Hadoop: Version 2.10.2, 3.2.4, 3.3.3, or higher (includes HADOOP-18136)
- Apache Spark: Version 3.1.4, 3.2.2, 3.3.0, or higher (includes SPARK-38305 - "Check existence of file before untarring/zipping")
For detailed information, refer to the Apache Mailing List Thread and the NetApp Security Advisory NTAP-20220915-0007.
Workarounds
- Implement strict input validation on all file names processed by Hadoop services at the application level
- Deploy application-level firewalls or API gateways to filter requests containing shell metacharacters in file names
- Restrict YARN job submission privileges to trusted users only through Hadoop ACLs and Kerberos authentication
- Consider running Hadoop services in containerized environments with restricted system call capabilities using seccomp profiles
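The file name validation suggested in the first workaround could take the form of an allowlist check applied before an archive is handed to the cluster. The sketch below is one possible shape, not a Hadoop feature: the function name is_safe_archive_name and the permitted character set (letters, digits, dot, underscore, hyphen) are assumptions to adapt to local naming rules.

```shell
#!/bin/sh
# Sketch: accept only names made of letters, digits, '.', '_' and '-'.
# The allowlist is an assumption; widen it deliberately if needed.
is_safe_archive_name() {
  case "$1" in
    '' | -* ) return 1 ;;            # empty, or could be read as an option
    *[!A-Za-z0-9._-]* ) return 1 ;;  # any character outside the allowlist
    * ) return 0 ;;
  esac
}

# Example: reject an upload whose name fails the check.
# is_safe_archive_name "$upload_name" || { echo "rejected" >&2; exit 1; }
```

An allowlist is generally easier to reason about than a list of forbidden characters, because anything not explicitly permitted is rejected.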
# Verify Apache Hadoop version to confirm patched status
hadoop version
# Check for vulnerable versions and upgrade if necessary
# For Hadoop 2.x, upgrade to at least 2.10.2
# For Hadoop 3.2.x, upgrade to at least 3.2.4
# For Hadoop 3.3.x, upgrade to at least 3.3.3
# Restrict YARN ResourceManager access via iptables
iptables -A INPUT -p tcp --dport 8088 -s <trusted_network> -j ACCEPT
iptables -A INPUT -p tcp --dport 8088 -j DROP
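The version comparison described in the comments above can be automated with a small helper. This is a sketch under stated assumptions: the function name is_patched is hypothetical, it understands only Apache release numbers of the form X.Y.Z, and vendor distributions that backport fixes under their own numbering are not handled.

```shell
#!/bin/sh
# Sketch: return 0 if an Apache Hadoop release number includes the fix,
# 1 if it does not, 2 if the string cannot be parsed.
# Releases below 2.x are reported as not patched (conservative assumption).
is_patched() {
  ver=${1%%-*}                       # drop suffixes such as -SNAPSHOT
  case "$ver" in
    [0-9]*.[0-9]*.[0-9]*) ;;
    *) return 2 ;;
  esac
  major=${ver%%.*}; rest=${ver#*.}
  minor=${rest%%.*}; rest=${rest#*.}
  patch=${rest%%.*}
  case "$major$minor$patch" in
    *[!0-9]*) return 2 ;;
  esac
  if [ "$major" -gt 3 ]; then return 0; fi
  if [ "$major" -eq 3 ]; then
    if [ "$minor" -gt 3 ]; then return 0; fi
    if [ "$minor" -eq 3 ] && [ "$patch" -ge 3 ]; then return 0; fi
    if [ "$minor" -eq 2 ] && [ "$patch" -ge 4 ]; then return 0; fi
    return 1
  fi
  if [ "$major" -eq 2 ]; then
    if [ "$minor" -gt 10 ]; then return 0; fi
    if [ "$minor" -eq 10 ] && [ "$patch" -ge 2 ]; then return 0; fi
  fi
  return 1
}

# Example: the first line of `hadoop version` has the form "Hadoop X.Y.Z".
# installed=$(hadoop version | awk 'NR==1 {print $2}')
# is_patched "$installed" && echo "patched" || echo "upgrade needed"
```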


