CVE-2026-7482 Overview
CVE-2026-7482 is a heap out-of-bounds read vulnerability in Ollama versions prior to 0.17.1. The flaw resides in the GGUF model loader, specifically in fs/ggml/gguf.go and the WriteTo() function in server/quantization.go. An attacker submits a crafted GGUF file to the unauthenticated /api/create endpoint with tensor offset and size values that exceed the file's actual length. During quantization, the server reads past the allocated heap buffer. Leaked memory may contain environment variables, API keys, system prompts, and conversation data from concurrent users. Attackers can exfiltrate the leaked contents by pushing the resulting model artifact to an attacker-controlled registry through /api/push [CWE-125].
Critical Impact
Unauthenticated network attackers can read sensitive heap memory, including API keys, system prompts, and other users' conversation data, then exfiltrate it through the model push API.
Affected Products
- Ollama versions prior to 0.17.1
- Deployments exposing the API via OLLAMA_HOST=0.0.0.0
- Any Ollama instance reachable on the network without an authentication proxy
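To triage a fleet quickly, version strings reported by ollama --version or the GET /api/version endpoint can be compared against the fixed release. The sketch below is a minimal comparison helper, assuming plain major.minor.patch strings; the function name and parsing rules are ours, not part of Ollama.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isVulnerable reports whether an Ollama version string falls before the
// fixed release 0.17.1. Simplified sketch: assumes "major.minor.patch",
// tolerates a leading "v" and ignores pre-release suffixes like "-rc1".
func isVulnerable(version string) bool {
	fixed := [3]int{0, 17, 1}
	parts := strings.SplitN(strings.TrimPrefix(version, "v"), ".", 3)
	for i := 0; i < 3; i++ {
		have := 0
		if i < len(parts) {
			numeric := strings.SplitN(parts[i], "-", 2)[0]
			if n, err := strconv.Atoi(numeric); err == nil {
				have = n
			}
		}
		if have != fixed[i] {
			return have < fixed[i]
		}
	}
	return false // exactly 0.17.1 is patched
}

func main() {
	for _, v := range []string{"0.16.3", "0.17.0", "0.17.1", "0.18.0"} {
		fmt.Printf("%s vulnerable: %v\n", v, isVulnerable(v))
	}
}
```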
Discovery Timeline
- 2026-05-04 - CVE-2026-7482 published to NVD
- 2026-05-05 - Last updated in NVD database
Technical Details for CVE-2026-7482
Vulnerability Analysis
The vulnerability exists in Ollama's GGUF model loading and quantization pipeline. GGUF (GPT-Generated Unified Format) files declare tensor metadata, including offsets and sizes, that the loader uses to map tensor data into memory. The pre-patch code in fs/ggml/gguf.go trusted the declared offset and size without comparing them against the actual file length. When WriteTo() in server/quantization.go later read tensor bytes for quantization, the read extended beyond the mapped file region into adjacent heap memory.
Because Ollama processes multiple concurrent requests in the same process, the leaked heap region can contain residual data from other users, including system prompts, conversation history, environment variables, and API keys held in memory by the inference runtime.
Root Cause
The loader did not validate that tensorOffset + tensor.Offset + tensor.Size() stays within the file bounds. The quantization path additionally failed to verify that the buffer returned for a tensor matched the expected size declared in the model header. Either gap is sufficient to trigger an out-of-bounds read.
Attack Vector
An unauthenticated attacker with network access to the Ollama API uploads a malicious GGUF file via /api/create. The server quantizes the model and reads beyond the file boundary into heap memory. The attacker then calls /api/push to send the resulting artifact, now containing leaked memory bytes baked into tensor data, to a registry they control. Neither endpoint requires authentication in the upstream distribution.
// Security patch in fs/ggml/gguf.go - ensure tensor size is valid
padding := ggufPadding(offset, int64(alignment))
llm.tensorOffset = uint64(offset + padding)
// get file size to validate tensor bounds
fileSize, err := rs.Seek(0, io.SeekEnd)
if err != nil {
    return fmt.Errorf("failed to determine file size: %w", err)
}
if _, err := rs.Seek(offset, io.SeekStart); err != nil {
    return fmt.Errorf("failed to seek back after size check: %w", err)
}
for _, tensor := range llm.tensors {
    tensorEnd := llm.tensorOffset + tensor.Offset + tensor.Size()
    if tensorEnd > uint64(fileSize) {
        return fmt.Errorf("tensor %q offset+size (%d) exceeds file size (%d)", tensor.Name, tensorEnd, fileSize)
    }
}
Source: GitHub Commit 88d57d0
The companion fix in server/quantization.go rejects buffers smaller than the declared tensor size, preventing the quantizer from operating on truncated reads:
// Security patch in server/quantization.go
if uint64(len(data)) < q.from.Size() {
    return 0, fmt.Errorf("tensor %s data size %d is less than expected %d from shape %v", q.from.Name, len(data), q.from.Size(), q.from.Shape)
}
Source: GitHub Commit 88d57d0
Detection Methods for CVE-2026-7482
Indicators of Compromise
- POST requests to /api/create carrying GGUF payloads from unexpected source IPs
- Outbound /api/push calls referencing registries outside the organization's allowlist
- Ollama process error logs containing messages such as "failed to get current offset" or unexplained tensor read errors
- Ollama instances bound to 0.0.0.0 reachable from the public internet
Detection Strategies
- Inspect HTTP request bodies to /api/create for GGUF magic bytes followed by oversized tensor offset or size fields
- Alert on any successful /api/push operation targeting a registry hostname that is not in an approved list
- Correlate /api/create and /api/push calls from the same client within a short window, which matches the leak-and-exfiltrate pattern
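The first strategy above can be implemented by matching the GGUF magic bytes ("GGUF") at the start of a request body before handing the payload to deeper analysis. The sketch below is a defensive helper for a WAF or log pipeline, not Ollama code; the function names and the version-field offset follow the public GGUF header layout (4-byte magic followed by a little-endian uint32 version).

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// looksLikeGGUF reports whether a request body starts with the GGUF magic
// bytes. A fuller inspection would go on to parse declared tensor offsets
// and sizes and compare them against Content-Length.
func looksLikeGGUF(body []byte) bool {
	return len(body) >= 4 && bytes.Equal(body[:4], []byte("GGUF"))
}

// ggufVersion extracts the format version that follows the magic,
// a little-endian uint32 in the GGUF header.
func ggufVersion(body []byte) (uint32, bool) {
	if !looksLikeGGUF(body) || len(body) < 8 {
		return 0, false
	}
	return binary.LittleEndian.Uint32(body[4:8]), true
}

func main() {
	sample := append([]byte("GGUF"), 3, 0, 0, 0) // magic + version 3
	fmt.Println(looksLikeGGUF(sample))           // true
	if v, ok := ggufVersion(sample); ok {
		fmt.Println(v) // 3
	}
	fmt.Println(looksLikeGGUF([]byte("not a model"))) // false
}
```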
Monitoring Recommendations
- Capture and retain Ollama HTTP access logs for forensic review of model upload activity
- Monitor egress traffic from inference hosts to identify pushes to unknown registries
- Track environment variables and secrets accessible to the Ollama process and rotate any that may have been resident in memory during exposure
How to Mitigate CVE-2026-7482
Immediate Actions Required
- Upgrade Ollama to version 0.17.1 or later on every host running the service
- Restrict the Ollama listener to 127.0.0.1 or a private interface unless external access is required
- Place an authenticating reverse proxy in front of /api/create and /api/push if remote access is needed
- Rotate API keys, tokens, and secrets that were present in the Ollama process environment
Patch Information
The fix shipped in Ollama v0.17.1 and was merged through pull request #14406. The patch adds file-size validation in the GGUF loader and a tensor data length check in the quantizer. Review the commit details for the full diff.
Workarounds
- Block inbound access to TCP port 11434 at the host firewall and network edge
- Disable /api/create and /api/push at a reverse proxy layer until patched
- Run Ollama in an isolated container with no access to host secrets or production credentials
- Require authentication and request allowlisting through an API gateway for any exposed instance
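The proxy-layer workarounds can be combined in a small gateway that refuses the two vulnerable endpoints outright and requires a bearer token for everything else. The sketch below is one possible deployment choice, not part of Ollama; the path list, the shared-token scheme, and the ports are assumptions to adapt to your environment.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"os"
	"strings"
)

// blocked lists the endpoints to refuse until the host is patched.
func blocked(path string) bool {
	return strings.HasPrefix(path, "/api/create") || strings.HasPrefix(path, "/api/push")
}

func main() {
	upstream, err := url.Parse("http://127.0.0.1:11434") // local Ollama listener
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	token := os.Getenv("PROXY_TOKEN") // pre-shared secret for all callers

	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if blocked(r.URL.Path) {
			http.Error(w, "endpoint disabled pending CVE-2026-7482 patch", http.StatusForbidden)
			return
		}
		if r.Header.Get("Authorization") != "Bearer "+token {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		proxy.ServeHTTP(w, r)
	})

	// Only bind the port when explicitly requested, so the sketch can be
	// compiled and its policy exercised without starting a server.
	if os.Getenv("RUN_PROXY") == "1" {
		log.Fatal(http.ListenAndServe(":8443", handler))
	}
	fmt.Println(blocked("/api/create"), blocked("/api/generate")) // true false
}
```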
# Configuration example: bind Ollama to loopback only
export OLLAMA_HOST=127.0.0.1:11434
systemctl restart ollama
# Verify the listener is not exposed externally
ss -tlnp | grep 11434