CVE-2024-39928: Apache Linkis Token Security Vulnerability

CVE-2024-39928 Overview

CVE-2024-39928 affects Apache Linkis versions up to and including 1.5.0. The vulnerability resides in the Spark EngineConn component, where the token used when starting Py4j is generated with Apache Commons Lang's RandomStringUtils. This utility is not cryptographically secure and produces predictable values that an attacker can recover. The weakness is classified as CWE-326: Inadequate Encryption Strength. Apache addressed the issue in Linkis 1.6.0. The flaw is exploitable over the network without authentication or user interaction, and it affects the confidentiality of data exchanged between Linkis and the Py4j gateway.

Critical Impact
An unauthenticated network attacker can predict or brute-force the Py4j token generated by RandomStringUtils and gain access to the Spark EngineConn Py4j gateway, exposing sensitive computation data.

Affected Products

Apache Linkis 1.5.0 and earlier
Spark EngineConn component within Apache Linkis
Deployments relying on Py4j token authentication in Linkis

Discovery Timeline

2024-09-25 - CVE-2024-39928 published to NVD
2025-05-16 - Last updated in NVD database

Technical Details for CVE-2024-39928

Vulnerability Analysis

Apache Linkis is a computation middleware that brokers requests between upper-layer applications and underlying engines such as Spark, Hive, and Presto. The Spark EngineConn module starts a Py4j gateway to bridge Python and JVM processes. Py4j relies on a shared token to authenticate clients connecting to the gateway. In affected versions, Linkis generates this token using org.apache.commons.lang.RandomStringUtils (or its Commons Lang3 equivalent), which draws its randomness from java.util.Random. java.util.Random is a 48-bit linear congruential generator whose output is fully determined by its internal state, so the resulting token becomes guessable once that state is recovered.

The issue falls under the broader category of insecure random number generation. Tokens used for authentication or session binding must be derived from a cryptographically secure source such as java.security.SecureRandom. The fix in 1.6.0 replaces the weak generator with a secure alternative.
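The requirement above can be sketched in Java as follows. This is a minimal illustration, not the actual Linkis 1.6.0 patch: the class name SecureTokenSketch, the method newToken, and the 32-byte token length are assumptions chosen for the example.

```java
import java.security.SecureRandom;
import java.util.Base64;

/** Hypothetical helper for illustration; not taken from the Linkis code base. */
final class SecureTokenSketch {
    // SecureRandom is thread-safe and seeds itself from the platform's entropy source.
    private static final SecureRandom RNG = new SecureRandom();

    /** Returns a URL-safe token built from numBytes bytes of CSPRNG output. */
    static String newToken(int numBytes) {
        byte[] raw = new byte[numBytes];
        RNG.nextBytes(raw);
        // Base64 URL encoding keeps the token printable without reducing its entropy.
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    public static void main(String[] args) {
        // 32 bytes (256 bits) is an assumed length, not a value from the advisory.
        System.out.println(newToken(32));
    }
}
```

Encoding raw bytes, rather than picking characters one at a time, avoids any bias from mapping random integers onto an alphabet.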

Root Cause

The root cause is the use of RandomStringUtils from Apache Commons Lang to produce a security-sensitive token. This class relies on a non-cryptographic pseudo-random number generator. An attacker who observes a small number of outputs, or who knows the seeding strategy, can reconstruct the generator's internal state and predict subsequent tokens.
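The determinism behind this weakness can be shown with a short Java sketch. It is a simplified stand-in, not the RandomStringUtils implementation and not an exploit: the class PredictableTokenDemo, the method weakToken, and the seed value are hypothetical, and the "attacker" is simply handed the same seed to show that identical state yields identical tokens.

```java
import java.util.Random;

/** Hypothetical stand-in for a token generator backed by java.util.Random. */
final class PredictableTokenDemo {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    /** Builds an alphanumeric string from a non-cryptographic PRNG. */
    static String weakToken(Random rng, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(rng.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long seed = 42L; // assumed: the state an attacker has recovered or guessed
        String victim = weakToken(new Random(seed), 16);
        String attacker = weakToken(new Random(seed), 16);
        // Same seed, same sequence: the two tokens are identical.
        System.out.println(victim.equals(attacker)); // prints true
    }
}
```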

Attack Vector

The attack is performed over the network against the Py4j gateway exposed by the Spark EngineConn. An attacker who can reach the gateway port attempts to authenticate using predicted token values. Once a valid token is recovered, the attacker interacts with the Py4j gateway and reads data processed by the Spark engine. Because the gateway exposes JVM objects to clients, the attacker can also query objects held by the Linkis Spark session.

No verified public proof-of-concept code is available. Refer to the Apache mailing list thread and the Openwall oss-security post for the official advisory text.

Detection Methods for CVE-2024-39928

Indicators of Compromise

Unexpected inbound connections to the Py4j gateway port opened by Spark EngineConn processes.
Repeated authentication attempts against the Py4j gateway from a single source within a short window, indicating token brute-forcing.
Linkis Spark EngineConn sessions originating from clients that do not match the deployed Linkis web or scheduler hosts.
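The repeated-attempt indicator above can be turned into a simple sliding-window check. The sketch below assumes authentication failures have already been parsed into a source address and a millisecond timestamp, arriving in time order per source. The class AuthFailureWindow, its parameters, and the thresholds are hypothetical, and no particular Linkis or Py4j log format is implied.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical detector: flags a source with too many failures inside a time window. */
final class AuthFailureWindow {
    private final long windowMillis;
    private final int threshold;
    private final Map<String, Deque<Long>> failuresBySource = new HashMap<>();

    AuthFailureWindow(long windowMillis, int threshold) {
        this.windowMillis = windowMillis;
        this.threshold = threshold;
    }

    /** Records one failure; returns true once the source reaches the threshold within the window. */
    boolean recordFailure(String sourceIp, long timestampMillis) {
        Deque<Long> times = failuresBySource.computeIfAbsent(sourceIp, k -> new ArrayDeque<>());
        times.addLast(timestampMillis);
        // Discard failures that have aged out of the sliding window.
        while (!times.isEmpty() && timestampMillis - times.peekFirst() > windowMillis) {
            times.removeFirst();
        }
        return times.size() >= threshold;
    }

    public static void main(String[] args) {
        // Assumed policy: 3 failures within 10 seconds is suspicious.
        AuthFailureWindow detector = new AuthFailureWindow(10_000L, 3);
        detector.recordFailure("203.0.113.9", 0L);
        detector.recordFailure("203.0.113.9", 1_000L);
        System.out.println(detector.recordFailure("203.0.113.9", 2_000L));
    }
}
```

The window length and threshold should be tuned to the deployment. Legitimate clients normally hold the correct token, so even a low threshold is unlikely to be noisy.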

Detection Strategies

Inventory all Linkis deployments and identify any instance running version 1.5.0 or earlier, for example from the version of the installed Linkis distribution or from the deployment manifest.
Monitor Linkis and Spark EngineConn logs for Py4j authentication failures and unexpected client IP addresses.
Inspect network flow records for connections to Py4j ports from outside the trusted Linkis subnet.
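For the inventory step, the version comparison can be automated once version strings have been collected. The following is a sketch under assumptions: the class LinkisVersionCheck and the method isAffected are hypothetical, and the input is assumed to be a dotted numeric version with an optional suffix after a hyphen.

```java
/** Hypothetical helper: decides whether a Linkis version string predates the 1.6.0 fix. */
final class LinkisVersionCheck {
    private static final int[] FIXED = {1, 6, 0};

    /** Returns true for versions below 1.6.0; throws NumberFormatException on non-numeric parts. */
    static boolean isAffected(String version) {
        // Drop any suffix such as "-SNAPSHOT"; pre-release builds of 1.6.0 need manual review.
        String core = version.trim().split("-", 2)[0];
        String[] parts = core.split("\\.");
        for (int i = 0; i < FIXED.length; i++) {
            // Missing components are treated as 0, so "1.6" compares equal to "1.6.0".
            int part = i < parts.length ? Integer.parseInt(parts[i]) : 0;
            if (part != FIXED[i]) {
                return part < FIXED[i];
            }
        }
        return false; // exactly 1.6.0
    }

    public static void main(String[] args) {
        System.out.println(isAffected("1.5.0")); // prints true
        System.out.println(isAffected("1.6.0")); // prints false
    }
}
```

Components are compared numerically rather than as strings, so a version such as 1.10.0 is correctly treated as later than 1.6.0.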

Monitoring Recommendations

Forward Linkis, Spark, and Py4j gateway logs to a centralized logging platform and alert on authentication anomalies.
Track process creation events for Spark EngineConn workers and correlate their listening ports with expected network policy.
Review firewall and security group rules quarterly to confirm that Py4j gateway ports are not exposed beyond the Linkis cluster.
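When correlating observed connections with network policy, a small helper can test whether a client address falls inside the trusted Linkis subnet. This is a sketch for IPv4 only. The class TrustedSubnet is hypothetical, and the 10.0.0.0/24 range is a placeholder, not a value from the advisory.

```java
/** Hypothetical IPv4 CIDR matcher for checking client addresses against a trusted range. */
final class TrustedSubnet {
    private final int network;
    private final int mask;

    TrustedSubnet(String cidr) {
        String[] parts = cidr.split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected address/prefix: " + cidr);
        }
        int prefix = Integer.parseInt(parts[1]);
        if (prefix < 0 || prefix > 32) {
            throw new IllegalArgumentException("Invalid prefix length: " + prefix);
        }
        // Java masks shift counts to 5 bits, so a /0 prefix needs an explicit zero mask.
        this.mask = prefix == 0 ? 0 : -1 << (32 - prefix);
        this.network = toInt(parts[0]) & mask;
    }

    /** Returns true when the address lies inside the configured subnet. */
    boolean contains(String ipv4) {
        return (toInt(ipv4) & mask) == network;
    }

    /** Packs a dotted-quad address into a 32-bit integer without any DNS lookup. */
    private static int toInt(String ipv4) {
        String[] octets = ipv4.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String octet : octets) {
            int o = Integer.parseInt(octet);
            if (o < 0 || o > 255) {
                throw new IllegalArgumentException("Octet out of range: " + octet);
            }
            value = (value << 8) | o;
        }
        return value;
    }

    public static void main(String[] args) {
        TrustedSubnet trusted = new TrustedSubnet("10.0.0.0/24"); // placeholder subnet
        System.out.println(trusted.contains("10.0.0.17"));   // prints true
        System.out.println(trusted.contains("198.51.100.4")); // prints false
    }
}
```

Any connection to a Py4j port whose source fails this check is a candidate for alerting or for a tighter firewall rule.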

How to Mitigate CVE-2024-39928

Immediate Actions Required

Upgrade Apache Linkis to version 1.6.0 or later, which replaces the weak random string generator with a cryptographically secure implementation.
Restrict network access to Spark EngineConn Py4j ports using firewall rules, security groups, or Kubernetes network policies so only Linkis components can connect.
Rotate any tokens, credentials, and secrets that may have been exposed through a compromised Py4j gateway.
Audit Linkis access logs for the period prior to patching to identify suspicious gateway sessions.

Patch Information

Apache Linkis 1.6.0 fixes this issue. Download the patched release from the official Apache Linkis distribution channels and follow the project's upgrade procedure. See the Apache mailing list thread for the vendor advisory.

Workarounds

If immediate upgrade is not possible, isolate Linkis deployments inside a private network segment with no external reachability to Py4j gateway ports.
Place Linkis behind a reverse proxy or service mesh that enforces mutual TLS between components.
Disable the Spark EngineConn in environments where it is not required until the upgrade can be completed.

```bash
# Configuration example: restrict Py4j gateway access with iptables
# Replace 10.0.0.0/24 with the trusted Linkis subnet and 25333 with the actual Py4j port
iptables -A INPUT -p tcp --dport 25333 -s 10.0.0.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 25333 -j DROP
```