CVE-2026-42027 Overview
CVE-2026-42027 is an arbitrary class instantiation vulnerability in the ExtensionLoader.instantiateExtension(Class, String) method of Apache OpenNLP. The flaw allows a crafted model archive to trigger the static initializer of any class on the application classpath during model loading. The class name is read from the manifest.properties entry of the model archive and passed to Class.forName() before the type-safety check executes. This issue is tracked under CWE-470: Use of Externally-Controlled Input to Select Classes or Code. Affected versions are Apache OpenNLP 2.x releases before 2.5.9 and 3.x milestone releases before 3.0.0-M3.
Critical Impact
Attackers supplying untrusted model archives can force execution of arbitrary class static initializers, enabling JNDI lookups, outbound network I/O, or filesystem operations during model load.
Affected Products
- Apache OpenNLP versions before 2.5.9
- Apache OpenNLP 3.0.0-M1 and 3.0.0-M2
- Applications loading OpenNLP model archives from untrusted or third-party sources
Discovery Timeline
- 2026-05-04 - CVE-2026-42027 published to NVD
- 2026-05-06 - Last updated in NVD database
Technical Details for CVE-2026-42027
Vulnerability Analysis
The ExtensionLoader.instantiateExtension(Class, String) method resolves a fully qualified class name read from the model archive's manifest.properties file. It calls Class.forName() with default initialization semantics, then invokes the no-argument constructor. Apache OpenNLP performs an isAssignableFrom check to verify that the loaded class is a subtype of BaseToolFactory (for factory= entries) or ArtifactSerializer (for serializer-class-* entries). That check, however, runs only after Class.forName() has already loaded and initialized the class.
Root Cause
Class.forName() executes the target class's static initializer before returning the Class object. Since the type validation occurs after initialization, any class on the classpath can have its static block executed, regardless of whether it ultimately passes the subtype check. The trust boundary between manifest-supplied identifiers and JVM class loading was never enforced.
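The ordering problem described above can be reproduced with plain JDK classes. The following is a minimal sketch, not OpenNLP code: ForNameInitDemo, Extension, Gadget, and QuietGadget are hypothetical names invented for illustration. It shows that the one-argument Class.forName() runs a static initializer even when the class later fails an isAssignableFrom check, while the three-argument overload with initialize=false does not.

```java
import java.util.ArrayList;
import java.util.List;

public class ForNameInitDemo {
    // Records side effects so the order of events can be observed.
    public static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical stand-in for the expected extension type.
    public interface Extension {}

    // Hypothetical stand-ins for unrelated classpath classes with static side effects.
    public static class Gadget {
        static { EVENTS.add("Gadget initialized"); }
    }

    public static class QuietGadget {
        static { EVENTS.add("QuietGadget initialized"); }
    }

    // Vulnerable ordering: load and initialize first, validate the type second.
    public static boolean loadThenCheck(String className) throws ClassNotFoundException {
        Class<?> loaded = Class.forName(className); // runs the static initializer
        return Extension.class.isAssignableFrom(loaded); // too late to prevent it
    }

    // Same check, but the class is loaded without being initialized.
    public static boolean checkWithoutInit(String className) throws ClassNotFoundException {
        ClassLoader loader = ForNameInitDemo.class.getClassLoader();
        Class<?> loaded = Class.forName(className, false, loader);
        return Extension.class.isAssignableFrom(loaded);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(loadThenCheck(Gadget.class.getName()));
        System.out.println(checkWithoutInit(QuietGadget.class.getName()));
        System.out.println(EVENTS);
    }
}
```

Both checks return false, yet only Gadget leaves an entry in EVENTS. The initialize=false overload is shown purely to illustrate JVM semantics; according to the patch notes in this advisory, the actual fix is a package-prefix allowlist consulted before loading.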
Attack Vector
An attacker crafts a model archive whose manifest.properties references a classpath-resident class with a side-effecting static initializer. When a victim application loads the archive, the JVM initializes the named class and runs its static block. Useful gadget classes include those that perform JNDI lookups, open outbound network sockets, or read or write files during initialization.
A secondary vector targets deployments that ship legitimate BaseToolFactory or ArtifactSerializer subclasses with side-effecting no-argument constructors: a malicious manifest names such a class to force its constructor to run during model load.
Exploitation does not by itself yield remote code execution; the impact is bounded by the gadget classes already present on the classpath. The attack surface nevertheless grows as community model repositories and Hugging Face-style sharing make loading untrusted model files routine.
Detection Methods for CVE-2026-42027
Indicators of Compromise
- Model archives sourced from origins outside organizational control, particularly those with manifest.properties entries referencing classes outside the opennlp.* package namespace.
- Unexpected outbound JNDI, LDAP, or DNS traffic from JVM processes immediately following an OpenNLP model load operation.
- Java processes spawning child processes or accessing sensitive filesystem paths during model loading.
Detection Strategies
- Inspect model archives statically and extract manifest.properties to enumerate factory= and serializer-class-* values for non-opennlp.* class references.
- Audit application classpaths for classes whose static initializers perform JNDI, network, or filesystem operations and correlate against OpenNLP usage.
- Monitor JVM class loading telemetry via -verbose:class or JFR events to detect unexpected class initialization during model load workflows.
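The static inspection step in the first strategy above could be sketched roughly as follows. This is an illustrative helper, not an official tool: ManifestClassAudit and its methods are hypothetical names, and the sketch assumes that the model archive is a ZIP file with manifest.properties at its root in standard java.util.Properties format.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ManifestClassAudit {
    // Returns "key=value" for each factory or serializer-class-* entry
    // whose class name lies outside the opennlp. namespace.
    public static List<String> suspiciousEntries(Properties manifest) {
        List<String> findings = new ArrayList<>();
        for (String key : manifest.stringPropertyNames()) {
            boolean classKey = key.equals("factory") || key.startsWith("serializer-class-");
            String value = manifest.getProperty(key).trim();
            if (classKey && !value.startsWith("opennlp.")) {
                findings.add(key + "=" + value);
            }
        }
        Collections.sort(findings); // property iteration order is unspecified
        return findings;
    }

    // Reads manifest.properties from the archive without loading any model class.
    // Assumption: the manifest sits at the archive root.
    public static List<String> scanArchive(String path) throws IOException {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry("manifest.properties");
            if (entry == null) {
                throw new IOException("no manifest.properties in " + path);
            }
            Properties manifest = new Properties();
            try (InputStream in = zip.getInputStream(entry)) {
                manifest.load(in);
            }
            return suspiciousEntries(manifest);
        }
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            System.out.println(path + " -> " + scanArchive(path));
        }
    }
}
```

Because the scan only parses the properties file, it never triggers class loading and is safe to run against untrusted archives. An empty result means every class reference stays inside opennlp.*; it does not prove the archive is otherwise benign.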
Monitoring Recommendations
- Alert on any OpenNLP-hosting JVM process making outbound LDAP, RMI, or DNS requests not associated with normal application traffic.
- Track file integrity on directories where models are stored and flag the introduction of new or modified .zip or .bin model archives.
- Log and review the value of the OPENNLP_EXT_ALLOWED_PACKAGES system property at process startup to confirm a restrictive allowlist is in effect.
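As one possible way to implement the last recommendation, an application could log the allowlist property when it starts. The sketch below assumes the property name and comma-separated format stated in this advisory; AllowlistStartupLog and parsePrefixes are hypothetical names, and the parsing here is for logging only and may differ from how OpenNLP itself interprets the value.

```java
import java.util.ArrayList;
import java.util.List;

public class AllowlistStartupLog {
    // Property name as given in the advisory text.
    public static final String PROPERTY = "OPENNLP_EXT_ALLOWED_PACKAGES";

    // Splits a comma-separated value into trimmed, non-empty prefixes.
    public static List<String> parsePrefixes(String raw) {
        List<String> prefixes = new ArrayList<>();
        if (raw == null) {
            return prefixes;
        }
        for (String part : raw.split(",")) {
            String prefix = part.trim();
            if (!prefix.isEmpty()) {
                prefixes.add(prefix);
            }
        }
        return prefixes;
    }

    public static void main(String[] args) {
        List<String> prefixes = parsePrefixes(System.getProperty(PROPERTY));
        if (prefixes.isEmpty()) {
            System.out.println(PROPERTY + " is not set");
        } else {
            System.out.println(PROPERTY + " = " + prefixes);
        }
    }
}
```

In a real service the two println calls would go to the application's logger so that the effective allowlist is captured alongside other startup diagnostics.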
How to Mitigate CVE-2026-42027
Immediate Actions Required
- Upgrade Apache OpenNLP 2.x deployments to 2.5.9 and 3.x deployments to 3.0.0-M3.
- Restrict model file ingestion to trusted, signed sources and reject archives originating from public model repositories without review.
- Audit the application classpath for classes containing side-effecting static initializers or no-argument constructors and remove unused dependencies.
Patch Information
The patched releases introduce a package-prefix allowlist that is consulted before Class.forName() is invoked, so the static initializer of a disallowed class never executes. Classes under the opennlp. prefix remain permitted by default. Deployments referencing factories or serializers outside opennlp.* must opt those packages in explicitly, either programmatically via ExtensionLoader.registerAllowedPackage(String) before the first model load, or by setting the OPENNLP_EXT_ALLOWED_PACKAGES system property to a comma-separated list of allowed package prefixes. See the Apache Security Mailing List Thread and the OpenWall OSS Security Discussion for vendor guidance.
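The check-before-load ordering described above can be illustrated with a small standalone sketch. This is not the OpenNLP implementation: AllowlistedLoader, allowPackage, isAllowed, and instantiate are hypothetical names, and normalizing each prefix to end with a dot is a design choice of this sketch rather than documented OpenNLP behavior.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

public class AllowlistedLoader {
    // Default mirrors the advisory: only the opennlp. prefix is allowed initially.
    private static final Set<String> ALLOWED = new CopyOnWriteArraySet<>(List.of("opennlp."));

    // Adds a package prefix; a trailing dot is appended so that
    // "com.example.models" does not also match "com.example.modelsx".
    public static void allowPackage(String prefix) {
        ALLOWED.add(prefix.endsWith(".") ? prefix : prefix + ".");
    }

    public static boolean isAllowed(String className) {
        for (String prefix : ALLOWED) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static <T> T instantiate(Class<T> expected, String className)
            throws ReflectiveOperationException {
        // 1. Allowlist check on the raw string, before any class loading.
        if (!isAllowed(className)) {
            throw new SecurityException("package not allowlisted: " + className);
        }
        // 2. Only now load and initialize the class.
        Class<?> loaded = Class.forName(className);
        // 3. Type check before the constructor runs.
        if (!expected.isAssignableFrom(loaded)) {
            throw new ClassCastException(className + " is not a " + expected.getName());
        }
        return expected.cast(loaded.getDeclaredConstructor().newInstance());
    }

    public static void main(String[] args) throws Exception {
        try {
            instantiate(Object.class, "javax.naming.InitialContext");
        } catch (SecurityException e) {
            System.out.println("rejected before loading: " + e.getMessage());
        }
    }
}
```

The key property is that a disallowed name is rejected while it is still just a string, so the JVM never loads or initializes the class it refers to.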
Workarounds
- Source all model files from trusted, internally vetted origins and apply cryptographic signing to detect tampering.
- Remove or replace classpath dependencies that perform JNDI lookups, network requests, or filesystem operations inside static initializers or no-arg constructors.
- Run OpenNLP-hosting JVMs behind restrictive egress firewall policies that block outbound LDAP, RMI, and arbitrary DNS, limiting what a triggered gadget class can accomplish.
# Configuration example: restrict ExtensionLoader to specific package prefixes
java -DOPENNLP_EXT_ALLOWED_PACKAGES=opennlp.,com.example.models \
-jar your-application.jar
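For the first workaround, a lightweight integrity gate can run before any model file is handed to OpenNLP. The sketch below compares a file's SHA-256 digest against a pinned value; it is a simpler stand-in for full cryptographic signing, and ModelDigestCheck and its method names are hypothetical. The pinned digest is assumed to come from a trusted source such as an internal release manifest.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

public class ModelDigestCheck {
    // Streams the file through SHA-256 and returns the lowercase hex digest.
    public static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // True only if the file's digest equals the pinned value (case-insensitive hex).
    public static boolean matchesPinnedDigest(Path file, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        byte[] actual = sha256Hex(file).getBytes(StandardCharsets.US_ASCII);
        byte[] expected = expectedHex.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(actual, expected);
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ModelDigestCheck <model-file> <expected-sha256-hex>
        boolean ok = matchesPinnedDigest(Path.of(args[0]), args[1]);
        System.out.println(ok ? "digest matches" : "digest mismatch, do not load");
    }
}
```

A digest pin detects tampering only if the pinned value itself is distributed over a trusted channel; where that cannot be guaranteed, a signature scheme with managed keys is the stronger option.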