CVE-2026-22691: pypdf Library DoS Vulnerability

CVE-2026-22691 Overview

CVE-2026-22691 is a Denial of Service vulnerability affecting pypdf, a free and open-source pure-Python PDF library. Prior to version 6.6.0, pypdf is susceptible to resource exhaustion when processing PDF files containing malformed startxref entries. An attacker can craft a malicious PDF document that causes extended runtimes when the library attempts to rebuild the cross-reference table, particularly when the file contains excessive whitespace characters. This vulnerability only affects the non-strict reading mode of pypdf.

Critical Impact
Maliciously crafted PDF files can cause prolonged processing times, leading to resource exhaustion and potential denial of service in applications using pypdf in non-strict mode.

Affected Products

pypdf versions prior to 6.6.0
Applications using pypdf in non-strict reading mode
Python-based PDF processing pipelines leveraging pypdf

Discovery Timeline

2026-01-10 - CVE-2026-22691 published to NVD
2026-01-13 - Last updated in NVD database

Technical Details for CVE-2026-22691

Vulnerability Analysis

This vulnerability is classified under CWE-400 (Uncontrolled Resource Consumption). The issue occurs during the cross-reference table rebuilding process in pypdf when operating in non-strict reading mode. When the library encounters a PDF file with an invalid or malformed startxref entry, it attempts to parse and reconstruct the cross-reference table. If the PDF contains large amounts of whitespace characters, this parsing operation can become computationally expensive, consuming excessive CPU time and potentially causing the application to become unresponsive.

The startxref keyword in a PDF file indicates the byte offset where the cross-reference table begins. When this value is malformed or points to an invalid location, pypdf's non-strict mode attempts to recover by searching through the document to locate valid cross-reference information. This recovery mechanism becomes problematic when the document is crafted with excessive whitespace, causing the parser to spend significant time iterating through meaningless content.
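To make the role of startxref concrete, the following minimal sketch reads the declared offset from the end of a file and checks whether it lands on something that looks like cross-reference data. It is an illustration only and not pypdf's implementation; the helper names read_startxref and offset_looks_valid, and the 1024-byte tail window, are assumptions made for this example.

```python
import re


def read_startxref(data: bytes):
    """Return the offset declared by the last startxref keyword, or None."""
    # Only inspect the tail of the file, where the keyword normally sits.
    tail = data[-1024:]
    matches = list(re.finditer(rb"startxref\s+(\d+)", tail))
    if not matches:
        return None
    return int(matches[-1].group(1))


def offset_looks_valid(data: bytes, offset) -> bool:
    """Check that the offset points at a classic xref table or an xref stream object."""
    if offset is None or offset >= len(data):
        return False
    chunk = data[offset:offset + 32]
    # A classic table starts with "xref"; an xref stream starts with "N G obj".
    return chunk.startswith(b"xref") or re.match(rb"\d+\s+\d+\s+obj", chunk) is not None
```

A file whose declared offset fails this kind of check is one that a lenient reader would have to recover from by scanning the document, which is the situation described above.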

Root Cause

The root cause lies in the insufficient bounds checking and optimization during the cross-reference table rebuilding process. When pypdf operates in non-strict mode, it provides lenient parsing to handle slightly malformed PDF documents. However, this leniency creates an algorithmic complexity vulnerability where documents with large whitespace regions combined with invalid startxref entries trigger worst-case parsing scenarios. The library lacks adequate safeguards to limit processing time or detect patterns indicative of malicious documents.

Attack Vector

The attack is network-accessible, requiring no authentication or user interaction beyond the victim application processing a malicious PDF file. A typical attack proceeds as follows:

Creating a PDF file with a deliberately malformed startxref entry pointing to an invalid location
Inserting large quantities of whitespace characters throughout the document
Delivering this malicious PDF to an application using pypdf in non-strict reading mode
The application attempts to parse the PDF, triggering the inefficient cross-reference rebuilding process
The application experiences extended processing times, potentially leading to service degradation or denial

Applications that accept PDF uploads from untrusted sources, automated PDF processing pipelines, and web services that parse user-submitted PDFs are particularly at risk.

Detection Methods for CVE-2026-22691

Indicators of Compromise

Unusually high CPU utilization during PDF processing operations
Extended processing times for PDF file parsing tasks
Application timeouts or unresponsiveness when handling specific PDF files
PDF files with abnormally large file sizes relative to their content
Log entries indicating failures or timeouts in PDF parsing functions
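As a rough way to spot files that are large relative to their content, the sketch below measures how much of a file consists of PDF whitespace bytes and how long the longest whitespace run is. The function names are hypothetical, and any threshold applied to the results would have to be tuned locally; compressed streams can legitimately contain these byte values, so this is a heuristic and not a detector for this CVE.

```python
# The six whitespace byte values defined for PDF syntax: NUL, TAB, LF, FF, CR, SPACE.
PDF_WHITESPACE = bytes([0x00, 0x09, 0x0A, 0x0C, 0x0D, 0x20])


def whitespace_ratio(data: bytes) -> float:
    """Fraction of bytes in the file that are PDF whitespace."""
    if not data:
        return 0.0
    # translate() with a delete set removes all whitespace bytes in one pass.
    stripped = data.translate(None, PDF_WHITESPACE)
    return 1.0 - len(stripped) / len(data)


def longest_whitespace_run(data: bytes) -> int:
    """Length of the longest consecutive run of PDF whitespace bytes."""
    longest = current = 0
    for byte in data:
        if byte in PDF_WHITESPACE:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest
```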

Detection Strategies

Monitor pypdf version in application dependencies and flag any version below 6.6.0
Implement processing time thresholds for PDF parsing operations and alert on exceeding limits
Deploy file size and complexity analysis on incoming PDF files before processing
Use static analysis tools to identify pypdf usage patterns, particularly non-strict mode configurations
Review application logs for recurring PDF parsing failures or performance anomalies
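One way to locate non-strict usage in a codebase is a small static scan. The sketch below uses Python's ast module to list PdfReader calls that do not pass strict=True as a keyword argument. The function name find_non_strict_readers is an assumption for this example, and the scan is deliberately simple: it matches on the call name only and treats a positionally passed strict argument as unverified.

```python
import ast


def find_non_strict_readers(source: str):
    """Return line numbers of PdfReader(...) calls without a literal strict=True keyword."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both PdfReader(...) and pypdf.PdfReader(...).
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name != "PdfReader":
            continue
        strict_true = any(
            kw.arg == "strict"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not strict_true:
            # Positional or computed strict values are reported too, for manual review.
            findings.append(node.lineno)
    return sorted(findings)
```

Running it over each .py file in a repository and reviewing the reported lines gives a starting list of call sites to check.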

Monitoring Recommendations

Establish baseline metrics for PDF processing times and monitor for significant deviations
Implement resource usage monitoring (CPU, memory) on systems running pypdf-based applications
Configure alerting for PDF processing tasks that exceed expected duration thresholds
Track the ratio of PDF file size to processing time as an anomaly detection metric
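The size-to-time ratio can be collected with a small wrapper around the parsing call. The following sketch is one possible shape, assuming a hypothetical timed_parse context manager and an arbitrary default threshold of one second per MiB; real thresholds should come from the baseline metrics mentioned above.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_parse(file_size_bytes, records, max_seconds_per_mib=1.0):
    """Time the wrapped block and append a record flagging slow parses."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        # Floor the size at 1 KiB so tiny files do not produce huge ratios.
        mib = max(file_size_bytes / (1024 * 1024), 1 / 1024)
        ratio = elapsed / mib
        records.append({
            "seconds": elapsed,
            "seconds_per_mib": ratio,
            "suspicious": ratio > max_seconds_per_mib,
        })
```

In use, the PDF parsing call goes inside the with block, and the collected records can be forwarded to whatever metrics or alerting system is already in place.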

How to Mitigate CVE-2026-22691

Immediate Actions Required

Upgrade pypdf to version 6.6.0 or later immediately
Review applications for non-strict mode usage and consider switching to strict mode where feasible
Implement processing timeouts for all PDF parsing operations as a defense-in-depth measure
Validate and sanitize PDF files from untrusted sources before processing
Consider implementing file size limits for uploaded PDF documents
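The size limit and basic validation steps can be combined into a cheap pre-check that runs before any PDF library touches the data. The sketch below is an assumption-laden example: the precheck_pdf name, the 20 MiB limit, and the 1024-byte windows for the header and end-of-file marker are illustrative choices, and passing this check does not mean a file is safe.

```python
MAX_PDF_BYTES = 20 * 1024 * 1024  # hypothetical limit; choose one that fits your workload


def precheck_pdf(data: bytes, max_bytes: int = MAX_PDF_BYTES):
    """Return (ok, reason) from inexpensive checks done before parsing."""
    if len(data) > max_bytes:
        return False, "file too large"
    # The header is expected near the start of the file.
    if b"%PDF-" not in data[:1024]:
        return False, "missing %PDF- header"
    # The end-of-file marker is expected near the end of the file.
    if b"%%EOF" not in data[-1024:]:
        return False, "missing %%EOF marker"
    return True, "ok"
```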

Patch Information

The vulnerability has been addressed in pypdf version 6.6.0. The fix was implemented in commit 294165726b646bb7799be1cc787f593f2fdbcf45 and is tracked in GitHub Pull Request #3594. Users should update to the patched version by running pip install --upgrade "pypdf>=6.6.0" (the quotes prevent the shell from treating >= as an output redirection). Additional details are available in the GitHub Security Advisory GHSA-4f6g-68pf-7vhv and the release notes for version 6.6.0.
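A deployment or CI step can also verify the installed version programmatically. The sketch below uses only the standard library; the helper names are hypothetical, and the version parsing is intentionally simplistic (it compares the first three numeric components and ignores pre-release suffixes).

```python
import re
from importlib import metadata

PATCHED = (6, 6, 0)  # first fixed release, per the advisory


def parse_version(text):
    """Turn a version string such as '6.5.0' into a three-part integer tuple."""
    numbers = re.findall(r"\d+", text.split("+")[0])[:3]
    parts = [int(n) for n in numbers]
    return tuple(parts + [0] * (3 - len(parts)))


def is_vulnerable(version_text):
    """True if the given pypdf version is older than the patched release."""
    return parse_version(version_text) < PATCHED


def installed_pypdf_is_vulnerable():
    """Check the installed pypdf; returns None when pypdf is not installed."""
    try:
        return is_vulnerable(metadata.version("pypdf"))
    except metadata.PackageNotFoundError:
        return None
```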

Workarounds

Enable strict mode when parsing PDF files by setting strict=True in pypdf reader initialization
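Implement processing timeouts, for example with Python's signal module (Unix, main thread only) or by running the parse in a worker process that can be terminated; a thread-based timeout can stop waiting but cannot interrupt the parse itself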
Pre-validate PDF structure using alternative tools before passing to pypdf
Deploy input file size restrictions to limit exposure to large malicious documents
Isolate PDF processing in separate worker processes with resource limits
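Strict mode, a timeout, and process isolation can be combined in a single helper. The sketch below runs the parse in a child Python interpreter through subprocess.run, which kills the child when the timeout expires. The names run_isolated and count_pages_safely and the 10-second default are assumptions for this example; PdfReader(path, strict=True) is the strict-mode reader initialization mentioned above. Operating-system resource limits are not shown here.

```python
import subprocess
import sys

# Child program: parse in strict mode and print the page count.
CHILD_CODE = (
    "import sys\n"
    "from pypdf import PdfReader\n"
    "reader = PdfReader(sys.argv[1], strict=True)\n"
    "print(len(reader.pages))\n"
)


def run_isolated(code, args=(), timeout_seconds=10.0):
    """Run Python code in a child interpreter; return (status, stdout)."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code, *args],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed if this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    status = "ok" if result.returncode == 0 else "error"
    return status, result.stdout.strip()


def count_pages_safely(path, timeout_seconds=10.0):
    """Count pages of an untrusted PDF without blocking the calling process."""
    return run_isolated(CHILD_CODE, [path], timeout_seconds)
```

A "timeout" or "error" status leaves the calling application responsive, so the file can be rejected or quarantined instead of stalling a request handler.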

bash

# Upgrade pypdf to patched version (quotes keep the shell from treating >= as a redirect)
pip install --upgrade "pypdf>=6.6.0"

# Verify installed version
pip show pypdf | grep Version

# Alternative: require the patched version or later in requirements.txt
echo "pypdf>=6.6.0" >> requirements.txt

