CVE-2025-66516 Overview
CVE-2025-66516 is a critical XML External Entity (XXE) injection vulnerability affecting multiple Apache Tika modules: tika-core (versions 1.13-3.2.1), tika-pdf-module (versions 2.0.0-3.2.1; published on Maven as tika-parser-pdf-module), and tika-parsers (versions 1.13-1.28.5). It allows remote attackers to perform XXE injection through specially crafted XFA (XML Forms Architecture) files embedded within PDF documents.
This CVE expands upon the previously reported CVE-2025-54988 by clarifying the broader scope of affected packages. Notably, the actual vulnerability and its fix reside in tika-core, meaning organizations that only upgraded tika-parser-pdf-module without upgrading tika-core to version 3.2.2 or later remain vulnerable.
Critical Impact
This XXE vulnerability enables attackers to exfiltrate sensitive data, perform server-side request forgery (SSRF), and potentially achieve remote code execution on systems processing malicious PDF files through Apache Tika.
Affected Products
- Apache Tika tika-core versions 1.13 through 3.2.1
- Apache Tika tika-pdf-module (Maven artifact tika-parser-pdf-module) versions 2.0.0 through 3.2.1
- Apache Tika tika-parsers versions 1.13 through 1.28.5
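The ranges above can be turned into a quick inventory check. The following is a minimal sketch in Python, not an official tool: the function names and the AFFECTED table are illustrative, the ranges are copied from the list above, and version qualifiers such as -SNAPSHOT are simply ignored.

```python
import re

# Inclusive affected ranges, copied from the "Affected Products" list.
# Both spellings of the PDF module name are accepted for convenience.
AFFECTED = {
    "tika-core": ((1, 13, 0), (3, 2, 1)),
    "tika-pdf-module": ((2, 0, 0), (3, 2, 1)),
    "tika-parser-pdf-module": ((2, 0, 0), (3, 2, 1)),
    "tika-parsers": ((1, 13, 0), (1, 28, 5)),
}


def parse_version(text):
    """Turn '3.2.1' into (3, 2, 1); a qualifier such as '-SNAPSHOT' is dropped."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"not a dotted numeric version: {text!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions so that '1.13' compares as (1, 13, 0).
    return parts + (0,) * (3 - len(parts))


def is_affected(artifact, version):
    """Return True if the artifact/version pair falls inside a listed range."""
    bounds = AFFECTED.get(artifact)
    if bounds is None:
        return False  # artifact is not named in the advisory
    low, high = bounds
    return low <= parse_version(version) <= high


if __name__ == "__main__":
    for artifact, version in [("tika-core", "3.2.1"), ("tika-core", "3.2.2")]:
        print(artifact, version, "affected" if is_affected(artifact, version) else "not affected")
```

Because qualifiers are dropped, a build such as 3.2.2-SNAPSHOT is treated as 3.2.2; review pre-release builds by hand.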
Discovery Timeline
- 2025-12-04 - CVE-2025-66516 published to NVD
- 2025-12-30 - Last updated in NVD
Technical Details for CVE-2025-66516
Vulnerability Analysis
The vulnerability exists in how Apache Tika processes PDF files containing XFA (XML Forms Architecture) content. When parsing these PDF files, Tika's XML processing components fail to properly restrict the resolution of external entities within the XFA data. This allows attackers to craft malicious PDF documents that, when processed by Tika, can trigger XXE attacks.
The issue is particularly concerning because it affects the core parsing functionality in tika-core, not just the PDF-specific modules. This architectural detail means that even applications that updated their PDF parsing components but retained older versions of tika-core remain vulnerable to exploitation.
Root Cause
The root cause is CWE-611: Improper Restriction of XML External Entity Reference. The XML parser used within tika-core for processing XFA content within PDF files does not properly disable external entity resolution. When processing XFA data from a PDF, the parser will resolve external entities, allowing attackers to reference arbitrary external resources including local files and remote URLs.
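To illustrate the class of defect, the sketch below shows one common defensive pattern: refusing any DOCTYPE declaration at parse time, so no entity (internal or external) can be declared at all. It is written in Python with the standard-library expat bindings purely as an illustration. It is not Tika's Java code and not the actual patch; parse_without_dtd and ForbiddenXmlConstruct are hypothetical names.

```python
import xml.parsers.expat


class ForbiddenXmlConstruct(ValueError):
    """Raised when the input contains a DOCTYPE declaration."""


def parse_without_dtd(xml_bytes):
    """Parse XML and return the element names seen, rejecting any DOCTYPE."""
    names = []
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Entities can only be declared inside a DTD, so stopping here
        # prevents both internal and external entity declarations.
        raise ForbiddenXmlConstruct(f"DOCTYPE declaration not allowed: {name}")

    def on_start(name, attrs):
        names.append(name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
    return names


if __name__ == "__main__":
    hostile = (
        b'<?xml version="1.0"?>'
        b'<!DOCTYPE xdp [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
        b"<xdp>&xxe;</xdp>"
    )
    try:
        parse_without_dtd(hostile)
    except ForbiddenXmlConstruct as exc:
        print("rejected:", exc)
```

Rejecting the DOCTYPE outright is stricter than disabling external entity resolution alone. It is a reasonable choice for inputs such as form data, which rarely need a DTD, but it will also refuse legitimate documents that carry one.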
Attack Vector
The vulnerability is exploitable over the network and requires no authentication or user interaction. A typical attack proceeds as follows:
- The attacker creates a malicious PDF file containing crafted XFA content with external entity declarations
- The attacker submits this PDF to any application or service that uses a vulnerable version of Apache Tika for document processing
- When Tika parses the PDF, the XML parser resolves the declared entities, which can lead to data exfiltration, SSRF, or denial of service
The vulnerability is particularly dangerous in document processing pipelines, search indexing systems, content management platforms, and any web application that accepts user-uploaded PDF files and processes them with Apache Tika.
Detection Methods for CVE-2025-66516
Indicators of Compromise
- Unusual outbound network connections from Tika processing servers to external hosts
- Error logs showing XML parsing failures with references to external DTDs or entities
- Unexpected file access attempts on systems running Tika-based document processing
- Anomalous DNS queries originating from document processing infrastructure
Detection Strategies
- Monitor for PDF files containing suspicious XFA content with external entity declarations
- Implement network-level detection for outbound connections from document processing systems to unexpected destinations
- Review application logs for XML parsing errors related to external entity resolution
- Deploy file inspection rules to identify PDFs with embedded XFA payloads containing DOCTYPE declarations
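The file inspection idea can be sketched as a simple heuristic scanner. The Python example below is an illustration only and not a PDF parser: scan_pdf, ScanResult, and the size cap are hypothetical. It looks for the /XFA key in the raw bytes and for DOCTYPE or ENTITY declarations in the raw file and in any stream that inflates with zlib. It will miss content hidden behind other filters, object streams, encryption, or escaped name syntax.

```python
import re
import zlib
from dataclasses import dataclass

# Heuristic patterns; a real PDF parser would be more reliable.
_STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
_DTD_RE = re.compile(rb"<!(?:DOCTYPE|ENTITY)\b", re.IGNORECASE)
_MAX_INFLATED = 8 * 1024 * 1024  # cap per-stream output to blunt decompression bombs


@dataclass
class ScanResult:
    has_xfa: bool  # the file mentions an /XFA entry
    has_dtd: bool  # a DOCTYPE or ENTITY declaration was found


def _inflated_streams(pdf_bytes):
    """Yield the decompressed body of every stream that is valid zlib data."""
    for match in _STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the end-of-line bytes before 'endstream'.
            yield zlib.decompressobj().decompress(match.group(1), _MAX_INFLATED)
        except zlib.error:
            continue  # not a Flate stream, or corrupt; covered by the raw search


def scan_pdf(pdf_bytes):
    """Return heuristic indicators for XFA content carrying a DTD."""
    has_xfa = b"/XFA" in pdf_bytes
    has_dtd = bool(_DTD_RE.search(pdf_bytes)) or any(
        _DTD_RE.search(body) for body in _inflated_streams(pdf_bytes)
    )
    return ScanResult(has_xfa=has_xfa, has_dtd=has_dtd)
```

A file for which both flags are set is a strong candidate for quarantine and manual review. A file with only has_xfa set is an ordinary XFA form, which may still be worth holding back until patched versions are deployed.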
Monitoring Recommendations
- Enable verbose logging on Apache Tika instances to capture XML parsing events
- Monitor egress traffic from systems running document processing workloads
- Implement file integrity monitoring on sensitive directories accessible by Tika processes
- Set up alerting for unusual resource access patterns during document parsing operations
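As one way to approach egress monitoring, the sketch below flags outbound connections from a document processing host whose destination falls outside an allowlist of networks. It is a minimal Python illustration under stated assumptions: the (destination IP, port) record shape and the function name unexpected_destinations are hypothetical, and real connection data would come from whatever flow logs or firewall logs the environment already produces.

```python
import ipaddress


def unexpected_destinations(connections, allowed_networks):
    """Return the (ip, port) pairs whose IP is outside every allowed network.

    connections: iterable of (destination_ip, destination_port) tuples.
    allowed_networks: iterable of CIDR strings, e.g. "10.0.0.0/8".
    """
    allowed = [ipaddress.ip_network(cidr) for cidr in allowed_networks]
    flagged = []
    for dest_ip, dest_port in connections:
        address = ipaddress.ip_address(dest_ip)
        # Membership is False when the IP versions differ, so mixed
        # IPv4/IPv6 allowlists are handled without extra checks.
        if not any(address in network for network in allowed):
            flagged.append((dest_ip, dest_port))
    return flagged


if __name__ == "__main__":
    observed = [("10.0.0.5", 5432), ("203.0.113.7", 80)]
    print(unexpected_destinations(observed, ["10.0.0.0/8"]))
```

Document parsers seldom need to initiate connections at all, so a small allowlist (or an empty one) tends to produce few false positives.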
How to Mitigate CVE-2025-66516
Immediate Actions Required
- Upgrade tika-core to version 3.2.2 or later immediately
- Verify that all Tika modules in use (tika-core, tika-parser-pdf-module, tika-parsers) resolve to consistent, patched versions, including transitive dependencies
- Audit any applications or services that process user-uploaded PDF files using Apache Tika
- Implement network segmentation to limit outbound connectivity from document processing systems
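The consistency check can be partly automated. The Python sketch below reads resolved dependency coordinates and reports two conditions: a tika-core older than 3.2.2, and Tika modules resolving to mixed versions. It assumes input in the groupId:artifactId:type:version:scope form (optionally with a classifier before the version), as printed by Maven's dependency:list goal. The function name audit_tika_dependencies is hypothetical.

```python
import re

FIXED_CORE = (3, 2, 2)  # first tika-core version containing the fix
_COORD_RE = re.compile(r"org\.apache\.tika:\S+")


def _version_tuple(text):
    """Leading numeric part of a version as a tuple; () if there is none."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(p) for p in match.group(0).split(".")) if match else ()


def audit_tika_dependencies(dependency_list_output):
    """Return human-readable findings for the given dependency listing."""
    versions = {}
    for coordinate in _COORD_RE.findall(dependency_list_output):
        fields = coordinate.split(":")
        if len(fields) == 5:    # group:artifact:type:version:scope
            artifact, version = fields[1], fields[3]
        elif len(fields) == 6:  # group:artifact:type:classifier:version:scope
            artifact, version = fields[1], fields[4]
        else:
            continue            # unrecognized shape; skip rather than guess
        versions[artifact] = version

    findings = []
    core = versions.get("tika-core")
    # An unparseable version yields (), which compares as older and is flagged.
    if core is not None and _version_tuple(core) < FIXED_CORE:
        findings.append(f"tika-core {core} is older than 3.2.2")
    if len(set(versions.values())) > 1:
        listing = ", ".join(f"{a}={v}" for a, v in sorted(versions.items()))
        findings.append(f"Tika modules resolve to mixed versions: {listing}")
    return findings


if __name__ == "__main__":
    sample = (
        "[INFO]    org.apache.tika:tika-core:jar:3.2.1:compile\n"
        "[INFO]    org.apache.tika:tika-parser-pdf-module:jar:3.2.2:compile\n"
    )
    for finding in audit_tika_dependencies(sample):
        print(finding)
```

The sample input reproduces the pitfall this CVE highlights: the PDF module was upgraded while tika-core was left on a vulnerable version.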
Patch Information
The vulnerability is fixed in Apache Tika tika-core version 3.2.2 and later. Review the Apache mailing list discussion for detailed patch information and upgrade guidance. Upgrading tika-core itself is essential: upgrading only the PDF module without also updating tika-core does not remediate the vulnerability. For additional context, the CVE-2025-54988 record documents the initial disclosure of this vulnerability.
Workarounds
- Configure XML parsers to disable external entity resolution at the application level if immediate patching is not possible
- Implement input validation to reject PDF files containing XFA content until patches can be applied
- Deploy web application firewalls (WAF) with rules to detect XXE payloads in uploaded files
- Isolate document processing services in sandboxed environments with restricted network access


