CVE-2025-46762: Apache Parquet RCE Vulnerability

CVE-2025-46762 Overview

CVE-2025-46762 is an arbitrary code execution vulnerability in the parquet-avro module of Apache Parquet. The flaw lies in the module's schema parsing functionality and lets an attacker execute arbitrary code when an application processes a specially crafted Parquet file. Version 1.15.1 introduced a fix that restricts untrusted packages, but its default list of trusted packages still allows malicious classes from those packages to be executed, so systems running 1.15.1 with the default configuration remain exploitable.

This vulnerability specifically impacts applications using the "specific" or "reflect" models for reading Parquet files. The "generic" model remains unaffected by this issue.

Critical Impact
Attackers can achieve arbitrary code execution on vulnerable systems by exploiting the schema parsing functionality in Apache Parquet's parquet-avro module when processing malicious Parquet files.

Affected Products

Apache Parquet versions 1.15.0 and earlier
Apache Parquet version 1.15.1 (with default configuration)
Applications using the parquet-avro module with the "specific" or "reflect" models
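
The affected-version rules above can be sketched as a small check. This is a minimal illustration, not an official tool: the class and method names are hypothetical, version strings are assumed to be purely numeric (no suffixes such as -SNAPSHOT), and any non-empty or unset property value on 1.15.1 is conservatively treated as affected. Exposure still depends on the application reading files with the "specific" or "reflect" models.

```java
/** Hypothetical helper that encodes the affected-version rules listed above. */
final class AffectedVersionCheck {

    /** Compares dotted numeric versions such as "1.15.1"; returns <0, 0, or >0. */
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            // Missing components are treated as 0, so "1.15" equals "1.15.0".
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /**
     * @param parquetVersion       parquet-avro version in use, e.g. "1.15.1"
     * @param serializablePackages value of org.apache.parquet.avro.SERIALIZABLE_PACKAGES,
     *                             or null when the property is not set
     */
    static boolean isLikelyAffected(String parquetVersion, String serializablePackages) {
        int cmp = compareVersions(parquetVersion, "1.15.1");
        if (cmp < 0) {
            return true; // 1.15.0 and earlier
        }
        if (cmp == 0) {
            // 1.15.1 is treated as safe only when the property is explicitly empty.
            return serializablePackages == null || !serializablePackages.isEmpty();
        }
        return false; // 1.15.2 and later
    }

    public static void main(String[] args) {
        System.out.println(isLikelyAffected("1.15.0", null)); // true
        System.out.println(isLikelyAffected("1.15.1", null)); // true
        System.out.println(isLikelyAffected("1.15.1", ""));   // false
        System.out.println(isLikelyAffected("1.15.2", null)); // false
    }
}
```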

Discovery Timeline

2025-05-06 - CVE-2025-46762 published to NVD
2025-09-02 - Last updated in NVD database

Technical Details for CVE-2025-46762

Vulnerability Analysis

This vulnerability falls under the category of insecure deserialization and arbitrary code execution. The root issue lies in how the parquet-avro module processes schema definitions during file parsing operations. When an application reads Parquet files using either the "specific" or "reflect" data models, the schema parsing mechanism can be manipulated to instantiate and execute arbitrary Java classes.

The initial fix in version 1.15.1 attempted to address this by introducing package restrictions for untrusted classes. However, the default whitelist of trusted packages remained overly permissive, allowing attackers to leverage classes within those trusted packages to achieve code execution. This represents a classic case where security controls were implemented but configured with insufficient restrictions by default.
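
To make the allowlist problem concrete, the following sketch shows how a package-prefix check behaves with a permissive list versus an empty one. It is a hypothetical illustration and not parquet-avro's actual implementation: the class name, the matching rule (a package or any of its subpackages), and the sample package list are all assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of a package allowlist check; not parquet-avro's code. */
final class PackageAllowlistSketch {

    /** Splits a comma-separated package list; an empty string yields an empty list. */
    static List<String> parsePackages(String property) {
        return Arrays.stream(property.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    /** Returns true when className belongs to a trusted package or one of its subpackages. */
    static boolean isTrusted(String className, List<String> trustedPackages) {
        for (String pkg : trustedPackages) {
            if (className.startsWith(pkg + ".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed example of a permissive list; the real default may differ.
        List<String> permissive = parsePackages("java.lang,java.io,java.net");
        List<String> empty = parsePackages("");

        // Any class inside a trusted package passes, whether or not it is safe to instantiate.
        System.out.println(isTrusted("java.net.URL", permissive)); // true
        // With an empty list nothing passes.
        System.out.println(isTrusted("java.net.URL", empty));      // false
    }
}
```

The point of the sketch is that an allowlist is only as strong as its entries: trusting a whole package trusts every class in it.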

Because the attack is delivered through a file, a remote attacker can potentially exploit it wherever a vulnerable application ingests attacker-supplied Parquet files, particularly in data processing pipelines that accept external data.

Root Cause

The vulnerability is classified as CWE-73 (External Control of File Name or Path). Here the externally controlled input is the schema embedded in a Parquet file, which influences which Java classes are loaded and instantiated. The parquet-avro module does not sufficiently validate or restrict the classes that can be instantiated during schema parsing, so an attacker can place malicious class references in the file's schema and have them executed when the file is parsed.

Attack Vector

The attack requires network access to deliver a malicious Parquet file to a vulnerable application. The exploitation scenario involves:

An attacker crafts a malicious Parquet file containing a specially constructed schema definition
The schema includes references to classes that, when instantiated, execute arbitrary code
A vulnerable application using the parquet-avro module with "specific" or "reflect" models processes the file
During schema parsing, the malicious classes are loaded and instantiated, executing the attacker's code

The vulnerability does not affect applications using the "generic" model, as this model handles schema parsing differently and does not instantiate arbitrary classes.

In some scenarios the attack requires user interaction and high privileges, but a successful exploit can compromise the confidentiality, integrity, and availability of the affected system. For detailed technical information, refer to the Apache Thread Discussion.

Detection Methods for CVE-2025-46762

Indicators of Compromise

Unusual class loading activities during Parquet file processing operations
Unexpected network connections or process spawning from applications using parquet-avro
Anomalous Java deserialization patterns in application logs
Evidence of Parquet files with unusually complex or obfuscated schema definitions

Detection Strategies

Monitor application logs for errors or warnings related to schema parsing in parquet-avro components
Implement file integrity monitoring on incoming Parquet files before processing
Deploy runtime application self-protection (RASP) solutions to detect deserialization attacks
Use SentinelOne's behavioral AI to identify anomalous code execution patterns originating from data processing applications

Monitoring Recommendations

Enable verbose logging for parquet-avro schema parsing operations
Alert on changes to the org.apache.parquet.avro.SERIALIZABLE_PACKAGES system property
Monitor for unusual class instantiation patterns in Java applications processing Parquet files
Track and baseline network behavior of data processing pipelines to identify anomalies
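
One way to approach the system-property alerting above is to record the property's state at startup and compare against it periodically. The sketch below uses only the JDK; the class and method names are hypothetical, and how the check is scheduled and where warnings are routed are left to the application.

```java
import java.util.logging.Logger;

/** Hypothetical monitor for the SERIALIZABLE_PACKAGES system property. */
final class SerializablePackagesMonitor {
    static final String PROPERTY = "org.apache.parquet.avro.SERIALIZABLE_PACKAGES";
    private static final Logger LOG =
            Logger.getLogger(SerializablePackagesMonitor.class.getName());

    /** Maps a raw property value to "unset", "empty", or "set:<value>". */
    static String describe(String value) {
        if (value == null) {
            return "unset";
        }
        return value.isEmpty() ? "empty" : "set:" + value;
    }

    /** Returns true and logs a warning when the current state differs from the baseline. */
    static boolean changedSince(String baseline) {
        String current = describe(System.getProperty(PROPERTY));
        if (!current.equals(baseline)) {
            LOG.warning(PROPERTY + " changed from [" + baseline + "] to [" + current + "]");
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Record the state once at startup, then call changedSince(...) from a scheduled task.
        String baseline = describe(System.getProperty(PROPERTY));
        System.out.println("baseline: " + baseline);
        System.out.println("changed: " + changedSince(baseline));
    }
}
```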

How to Mitigate CVE-2025-46762

Immediate Actions Required

Upgrade Apache Parquet to version 1.15.2 or later immediately
For systems running version 1.15.1, set the system property org.apache.parquet.avro.SERIALIZABLE_PACKAGES to an empty string
Audit applications to identify usage of "specific" or "reflect" models in parquet-avro
Restrict network access to systems processing untrusted Parquet files
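
As a starting point for the audit step above, a source tree can be scanned for identifiers that commonly accompany the "specific" and "reflect" models. This is a heuristic sketch only: the class name is hypothetical, the identifier list (SpecificData, ReflectData, SpecificDataSupplier, ReflectDataSupplier) is an assumption about how the models are typically selected, files are assumed to be UTF-8, and an empty result does not prove the models are unused, since they may be selected through configuration or defaults.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical heuristic scanner for "specific"/"reflect" model usage in Java sources. */
final class ParquetAvroModelAudit {
    // Matches SpecificData, ReflectData, SpecificDataSupplier, and ReflectDataSupplier.
    static final Pattern SUSPECT =
            Pattern.compile("\\b(?:Specific|Reflect)Data(?:Supplier)?\\b");

    static boolean lineLooksRelevant(String line) {
        return SUSPECT.matcher(line).find();
    }

    /** Walks root and returns "file:line: text" entries for every matching line. */
    static List<String> scan(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(p -> p.toString().endsWith(".java"))
                    .collect(Collectors.toList());
        }
        List<String> hits = new ArrayList<>();
        for (Path file : files) {
            List<String> lines = Files.readAllLines(file);
            for (int i = 0; i < lines.size(); i++) {
                if (lineLooksRelevant(lines.get(i))) {
                    hits.add(file + ":" + (i + 1) + ": " + lines.get(i).trim());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Scans the directory given as the first argument, or the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        scan(root).forEach(System.out::println);
    }
}
```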

Patch Information

Apache has released version 1.15.2, which fully addresses this vulnerability. Users who cannot immediately upgrade to 1.15.2 can apply a workaround on version 1.15.1 by setting the org.apache.parquet.avro.SERIALIZABLE_PACKAGES system property to an empty string. Either approach is sufficient to remediate the vulnerability.

For additional context, refer to the Openwall OSS Security Update.

Workarounds

Set the JVM system property org.apache.parquet.avro.SERIALIZABLE_PACKAGES to an empty string on version 1.15.1
Switch to using the "generic" model if application requirements permit, as it is not affected by this vulnerability
Implement strict input validation on Parquet files from untrusted sources
Deploy network segmentation to isolate data processing systems from untrusted networks

```bash
# Configuration example for Java applications on version 1.15.1
# Add this JVM argument when launching your application
java -Dorg.apache.parquet.avro.SERIALIZABLE_PACKAGES="" -jar your-application.jar
```
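
Where the launch command cannot be changed, the same property can in principle be set from code. The sketch below is hedged: the class and method names are hypothetical, and it assumes parquet-avro reads the property when its classes initialize, so the call must run before any parquet-avro class is touched. The JVM argument is the more robust option because it does not depend on initialization order.

```java
/** Hypothetical startup hook that applies the 1.15.1 workaround programmatically. */
final class ParquetAvroHardening {
    static final String PROPERTY = "org.apache.parquet.avro.SERIALIZABLE_PACKAGES";

    /** Sets the property to an empty string and returns the previous value (null if unset). */
    static String enforceEmptySerializablePackages() {
        return System.setProperty(PROPERTY, "");
    }

    public static void main(String[] args) {
        // Call this first, before any code path that loads parquet-avro classes.
        String previous = enforceEmptySerializablePackages();
        System.out.println("previous value: " + previous);
        System.out.println("now empty: " + System.getProperty(PROPERTY).isEmpty());
        // ... continue with normal application startup here.
    }
}
```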