CVE-2026-21728: Tempo DoS Memory Allocation Vulnerability

CVE-2026-21728 Overview

CVE-2026-21728 is a resource exhaustion vulnerability affecting Grafana Tempo, a high-scale distributed tracing backend. It allows unauthenticated remote attackers to cause denial-of-service conditions by submitting queries with excessively large limit values, which trigger memory allocations large enough to impact service availability.

This vulnerability is classified as CWE-400 (Uncontrolled Resource Consumption), where the application fails to properly limit the amount of resources that can be consumed by a single operation. The impact severity depends on the deployment strategy, with monolithic deployments being more susceptible to complete service disruption.

Critical Impact
Attackers can exhaust server memory through malicious queries, potentially causing service outages for distributed tracing infrastructure.

Affected Products

Grafana Tempo (versions not specified in advisory)

Discovery Timeline

2026-04-24 - CVE-2026-21728 published to NVD
2026-04-24 - Last updated in NVD database

Technical Details for CVE-2026-21728

Vulnerability Analysis

This denial-of-service vulnerability stems from improper resource management in Grafana Tempo's query processing logic. When a user submits a trace search query, Tempo allocates memory proportional to the specified result limit parameter. Because this limit value is not validated or bounds-checked, an attacker can specify arbitrarily large limits that cause the service to allocate excessive amounts of memory.

The vulnerability is particularly concerning in environments where Tempo is deployed as a monolithic service, as the memory exhaustion can bring down the entire tracing infrastructure. In microservices or distributed deployments, the impact may be limited to specific query components, though cascading failures remain possible.

Root Cause

The root cause is the absence of a maximum result limit constraint in Tempo's search configuration. The search query handler accepts user-supplied limit values without enforcing an upper boundary, allowing memory allocation requests that exceed available system resources. This represents a classic uncontrolled resource consumption pattern where the application trusts user input to define resource allocation parameters.
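The missing upper bound can be illustrated with a minimal sketch of limit clamping. This is not Tempo's code: the function name, the default of 20, and the choice to clamp rather than reject oversized values are assumptions made for illustration. Only the cap of 262144 comes from the advisory's recommendation.

```python
# Hypothetical sketch of bounding a user-supplied result limit.
# None of these names come from Tempo's source code.

DEFAULT_LIMIT = 20         # assumed default when the client sends no limit
MAX_RESULT_LIMIT = 262144  # 2^18, the cap recommended in the advisory


def effective_limit(requested, default=DEFAULT_LIMIT, max_limit=MAX_RESULT_LIMIT):
    """Return the result limit the server should actually honor."""
    # Missing or non-positive limits fall back to the default.
    if requested is None or requested <= 0:
        return default
    # In this sketch a max_limit of 0 means "no cap configured",
    # which is the vulnerable situation described above.
    if max_limit > 0 and requested > max_limit:
        return max_limit
    return requested
```

With the cap in place, a request for 2^32 results is reduced to 262144 before any allocation is sized from it. With `max_limit=0` the same request passes through unchanged.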

Attack Vector

The attack can be executed remotely over the network without authentication. An attacker sends specially crafted trace search queries to the Tempo API endpoint with extremely large limit values. Each malicious query forces the server to allocate memory for the requested number of results, even before any actual trace data is retrieved.

The attack is straightforward to execute and requires minimal technical sophistication. Since Tempo is typically exposed for trace ingestion and querying, the attack surface is often accessible from within the network or even externally in some deployments.

The vulnerability manifests in the query processing pipeline at the point where result limits are applied: the system allocates memory structures sized for the requested number of results. Without a configured max_result_limit, an attacker can specify values such as 2^32 or higher and exhaust available memory. For technical implementation details, refer to the Grafana Security Advisory CVE-2026-21728.
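A rough calculation shows why limits of this size matter. The sketch below assumes, purely for illustration, that the server reserves one 8-byte slot (for example, a 64-bit pointer) per requested result. The real per-result cost in Tempo is not stated in the advisory and may differ.

```python
# Back-of-the-envelope estimate of memory reserved for a given limit.
# The 8-byte slot size is an assumption, not a figure from the advisory.

def prealloc_bytes(limit, bytes_per_slot=8):
    """Bytes reserved if one fixed-size slot is allocated per requested result."""
    return limit * bytes_per_slot


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / 2**20


for limit in (2**18, 2**32):
    print(f"limit={limit}: {to_mib(prealloc_bytes(limit)):.0f} MiB")
```

Under that assumption, the recommended cap of 2^18 reserves 2^21 bytes (2 MiB), while a limit of 2^32 reserves 2^35 bytes (32 GiB) from a single request.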

Detection Methods for CVE-2026-21728

Indicators of Compromise

Unusual spikes in memory consumption on Tempo service instances
Tempo service crashes or out-of-memory (OOM) kills in container logs
High volume of search queries with abnormally large limit parameters
Service unavailability or degraded performance in distributed tracing infrastructure

Detection Strategies

Monitor Tempo query logs for search requests with limit values exceeding normal operational thresholds
Implement alerting on memory utilization metrics for Tempo pods/containers approaching resource limits
Review application logs for OOM killer events or memory allocation failures
Deploy network-level monitoring to identify unusual query patterns against Tempo endpoints
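The log-based strategy above can be sketched as a small scanner that extracts the limit parameter from search requests. The log line format is an assumption (any line that contains the request path and query string), and the function name is hypothetical. The threshold of 262144 matches the value recommended in the advisory.

```python
import re
from urllib.parse import parse_qs, urlsplit

LIMIT_THRESHOLD = 262144  # 2^18, the cap recommended in the advisory

# Assumes each log line contains the request path with its query string,
# e.g. 'GET /api/search?q=...&limit=100 HTTP/1.1'.
SEARCH_URL = re.compile(r'(/api/search[^\s"]*)')


def oversized_limits(log_lines, threshold=LIMIT_THRESHOLD):
    """Return (limit, line) pairs for search requests whose limit exceeds threshold."""
    flagged = []
    for line in log_lines:
        match = SEARCH_URL.search(line)
        if match is None:
            continue
        params = parse_qs(urlsplit(match.group(1)).query)
        for raw in params.get("limit", []):
            try:
                value = int(raw)
            except ValueError:
                continue  # non-numeric limit values are ignored in this sketch
            if value > threshold:
                flagged.append((value, line))
    return flagged
```

A script like this could run over aggregated access logs on a schedule. In practice the pattern would need adjusting to the actual log format in use.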

Monitoring Recommendations

Configure Prometheus alerts for Tempo memory usage exceeding 80% of allocated resources
Set up log aggregation rules to flag queries with limit parameters above 262144
Implement request rate limiting at the API gateway level for Tempo search endpoints
Enable distributed tracing observability for the tracing infrastructure itself to identify anomalous query patterns
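The rate-limiting recommendation above is normally implemented in the gateway itself, but the underlying idea can be sketched as a token bucket. This is a generic illustration, not a Tempo or gateway API. The class name and parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock  # injectable so the bucket can be tested deterministically
        self.last = clock()

    def allow(self):
        """Consume one token if available and report whether the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A gateway would typically keep one bucket per client identity (for example, per source address or API key) and reject search requests when `allow()` returns False. Rate limiting alone does not bound the cost of a single oversized query, so it complements the result limit rather than replacing it.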

How to Mitigate CVE-2026-21728

Immediate Actions Required

Configure max_result_limit in Tempo's search configuration immediately
Set the recommended value of 262144 (2^18) or lower based on operational requirements
Review and apply resource limits (memory limits) on Tempo containers/pods
Consider implementing query timeout configurations to prevent long-running malicious queries

Patch Information

The recommended mitigation involves configuration changes rather than a software patch. Administrators should update their Tempo configuration to include the max_result_limit parameter in the search configuration block. This setting enforces an upper bound on the number of results that can be requested in a single query, preventing memory exhaustion attacks.

For detailed guidance, consult the Grafana Security Advisory CVE-2026-21728.

Workarounds

Set max_result_limit to 262144 (2^18) in the search configuration as recommended by Grafana
Deploy Tempo in a distributed architecture to isolate query processing from other components
Implement Kubernetes resource limits to prevent a single pod from exhausting node resources
Use API gateway rate limiting to throttle search query frequency from individual clients

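yaml

# Example Tempo configuration with max_result_limit
# Add to your tempo.yaml or tempo-config ConfigMap.
# In current Tempo releases the search block is nested under query_frontend;
# confirm the exact path in the configuration reference for your version.

query_frontend:
  search:
    max_result_limit: 262144  # Recommended limit of 2^18

# For Kubernetes deployments, also set resource limits in the pod spec
# (not in tempo.yaml):
# resources:
#   limits:
#     memory: "4Gi"
#   requests:
#     memory: "2Gi"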