CVE-2026-24009: Docling Core PyYAML RCE Vulnerability

CVE-2026-24009 Overview

CVE-2026-24009 is an Insecure Deserialization vulnerability affecting Docling Core, a Python library that defines core data types and transformations for the Docling document processing application. The vulnerability exposes applications to remote code execution through PyYAML's unsafe deserialization mechanism when processing untrusted YAML data via the DoclingDocument.load_from_yaml() method.

This vulnerability is a downstream exposure of CVE-2020-14343, a known PyYAML deserialization flaw. Applications using docling-core versions 2.21.0 through 2.48.3 are vulnerable if they also use PyYAML versions prior to 5.4 and process untrusted YAML input.

Critical Impact
Successful exploitation allows remote attackers to execute arbitrary code on systems processing malicious YAML documents, potentially leading to complete system compromise, data exfiltration, or lateral movement within affected environments.

Affected Products

Docling Core versions 2.21.0 through 2.48.3
Environments running PyYAML versions prior to 5.4
Applications invoking docling_core.types.doc.DoclingDocument.load_from_yaml() with untrusted input
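
The conditions above combine into a single check. The following is a minimal sketch and not part of docling-core: the helper names parse_version and is_affected are hypothetical, and the parser assumes plain dotted numeric version strings (pre-release suffixes such as "5.4b1" are not handled). Untrusted input reaching load_from_yaml() is the third condition and cannot be checked from version numbers.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '2.48.3' into (2, 48, 3)."""
    return tuple(int(part) for part in text.split("."))


def is_affected(docling_core_version, pyyaml_version):
    """Return True when both version conditions from the advisory hold."""
    docling = parse_version(docling_core_version)
    pyyaml = parse_version(pyyaml_version)
    in_vulnerable_range = (2, 21, 0) <= docling <= (2, 48, 3)
    pyyaml_before_fix = pyyaml < (5, 4)
    return in_vulnerable_range and pyyaml_before_fix


print(is_affected("2.48.3", "5.3.1"))  # vulnerable docling-core, old PyYAML
print(is_affected("2.48.4", "5.3.1"))  # patched docling-core
print(is_affected("2.30.0", "6.0.1"))  # vulnerable docling-core, fixed PyYAML
```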

Discovery Timeline

2026-01-22 - CVE-2026-24009 published to NVD
2026-01-22 - Last updated in NVD database

Technical Details for CVE-2026-24009

Vulnerability Analysis

The vulnerability stems from the use of PyYAML's yaml.FullLoader class when deserializing YAML documents in the load_from_yaml() method. FullLoader is more restrictive than the unrestricted yaml.Loader (equivalent in behavior to yaml.UnsafeLoader), but in PyYAML versions prior to 5.4 it still allows certain Python-specific YAML tags to instantiate arbitrary Python objects. Attackers can abuse this by crafting YAML payloads whose object constructors execute system commands or load malicious modules.

The attack requires the target application to process attacker-controlled YAML content, making document upload functionality, API endpoints accepting YAML, or file processing pipelines potential attack surfaces. The network-based attack vector combined with no authentication requirements significantly increases the risk exposure for public-facing applications.

Root Cause

The root cause is the selection of a YAML loader (yaml.FullLoader) that, on PyYAML versions prior to 5.4, permits arbitrary object deserialization. PyYAML's deserialization mechanism supports YAML tags that can instantiate Python objects, including those that execute code during construction. FullLoader was intended to provide some safety by restricting certain dangerous operations, but on those PyYAML versions it remains vulnerable to crafted payloads that exploit the object types it still permits (CVE-2020-14343).

The CWE-502 (Deserialization of Untrusted Data) classification accurately captures this vulnerability class, where untrusted input is deserialized without adequate restrictions on what objects can be instantiated.

Attack Vector

Attackers can exploit this vulnerability by submitting malicious YAML content to any application endpoint that processes YAML using the vulnerable DoclingDocument.load_from_yaml() method. The attack payload typically embeds Python object constructors using YAML's !!python/object tag syntax or related constructs that trigger code execution during the deserialization phase.

Because the vulnerability is reachable over the network, any application that exposes YAML processing, whether through document upload forms, REST APIs, or file import features, may be susceptible to remote exploitation without authentication.

The following code shows the security patch that addresses this vulnerability by switching from yaml.FullLoader to yaml.SafeLoader:

python

         if isinstance(filename, str):
             filename = Path(filename)
         with open(filename, encoding="utf-8") as f:
-            data = yaml.load(f, Loader=yaml.FullLoader)
+            data = yaml.load(f, Loader=yaml.SafeLoader)
         return DoclingDocument.model_validate(data)

     def export_to_dict(

Source: GitHub Commit
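
To illustrate the effect of the loader change, the sketch below parses two inputs with yaml.SafeLoader. It assumes PyYAML is installed; load_untrusted is a hypothetical wrapper and not a docling-core function. The tagged sample uses !!python/name, which only references an existing object, so the example does not run any command.

```python
import yaml

# Plain data that any loader accepts.
PLAIN = "name: example\npages: 3\n"

# A Python-specific tag that SafeLoader has no constructor for.
TAGGED = "value: !!python/name:os.getcwd\n"


def load_untrusted(text):
    """Parse YAML with SafeLoader and reject input carrying unsupported tags."""
    try:
        return yaml.load(text, Loader=yaml.SafeLoader)
    except yaml.constructor.ConstructorError as exc:
        raise ValueError(f"rejected YAML: {exc.problem}") from exc


print(load_untrusted(PLAIN))

try:
    load_untrusted(TAGGED)
except ValueError as exc:
    print(exc)
```

With SafeLoader, only standard YAML types (mappings, sequences, strings, numbers, and similar) are constructed, so the tagged input fails before any Python object is created.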

Detection Methods for CVE-2026-24009

Indicators of Compromise

YAML files containing !!python/object, !!python/module, or similar Python-specific YAML tags
Unexpected process spawning or command execution originating from Python processes handling document operations
Network connections to external hosts initiated by Docling-based applications during YAML processing
Error logs indicating failed deserialization attempts with Python object instantiation
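
The first indicator above can be approximated with a text scan. This is a heuristic sketch using only the standard library; the pattern and the function name find_python_tags are assumptions for illustration. YAML allows tag handles to be redefined with %TAG directives, so a scan like this may miss obfuscated payloads and is not a substitute for a safe loader.

```python
import re

# Matches "!!python/", custom handles such as "!p!python/", and the
# fully qualified "tag:yaml.org,2002:python/" prefix.
PYTHON_TAG = re.compile(r"(?:![\w-]*!|tag:yaml\.org,2002:)python/")


def find_python_tags(yaml_text):
    """Return (line_number, stripped_line) pairs that mention a Python tag."""
    hits = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        if PYTHON_TAG.search(line):
            hits.append((number, line.strip()))
    return hits


sample = "title: report\npayload: !!python/object/apply:os.system ['id']\n"
for number, line in find_python_tags(sample):
    print(f"line {number}: {line}")
```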

Detection Strategies

Implement input validation rules to reject YAML content containing Python object constructor tags before processing
Monitor application logs for deserialization errors or unexpected object instantiation attempts
Deploy web application firewall (WAF) rules to detect and block YAML payloads with embedded Python object syntax
Use static code analysis to identify usage of yaml.FullLoader or yaml.Loader in codebases
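
The static analysis strategy can be sketched with Python's ast module. The function below is a hypothetical, name-based check: it flags calls named load or load_all that pass Loader, FullLoader, or UnsafeLoader, plus the full_load and unsafe_load helpers. It does not resolve imports or aliases, so results need manual review.

```python
import ast

UNSAFE_LOADERS = {"Loader", "FullLoader", "UnsafeLoader"}
UNSAFE_FUNCTIONS = {"full_load", "full_load_all", "unsafe_load", "unsafe_load_all"}


def _name_of(node):
    """Return the trailing identifier of a Name or Attribute node, else None."""
    if isinstance(node, ast.Attribute):
        return node.attr
    if isinstance(node, ast.Name):
        return node.id
    return None


def find_unsafe_yaml_calls(source):
    """Return sorted (line_number, description) pairs for suspicious calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = _name_of(node.func)
        if func in UNSAFE_FUNCTIONS:
            findings.append((node.lineno, func))
        elif func in {"load", "load_all"}:
            # The loader may be passed by keyword or as the second positional argument.
            loaders = [_name_of(kw.value) for kw in node.keywords if kw.arg == "Loader"]
            loaders += [_name_of(arg) for arg in node.args[1:2]]
            for loader in loaders:
                if loader in UNSAFE_LOADERS:
                    findings.append((node.lineno, f"{func}(Loader={loader})"))
    return sorted(findings)


sample_source = (
    "import yaml\n"
    "a = yaml.load(f, Loader=yaml.FullLoader)\n"
    "b = yaml.load(f, Loader=yaml.SafeLoader)\n"
    "c = yaml.full_load(f)\n"
    "d = yaml.safe_load(f)\n"
)
for lineno, description in find_unsafe_yaml_calls(sample_source):
    print(f"line {lineno}: {description}")
```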

Monitoring Recommendations

Enable verbose logging for document processing components to capture YAML parsing events
Implement runtime application self-protection (RASP) to detect and block object instantiation during deserialization
Monitor system call patterns for anomalous behavior following YAML file processing operations
Set up alerts for any process execution originating from document processing workflows

How to Mitigate CVE-2026-24009

Immediate Actions Required

Upgrade docling-core to version 2.48.4 or later immediately
If immediate upgrade is not possible, ensure PyYAML is updated to version 5.4 or greater as an interim mitigation
Audit application code to identify all instances where DoclingDocument.load_from_yaml() processes external input
Implement input validation to sanitize or reject YAML content from untrusted sources

Patch Information

The vulnerability has been patched in docling-core version 2.48.4. The fix replaces the use of yaml.FullLoader with yaml.SafeLoader in the YAML deserialization logic, ensuring that arbitrary Python objects cannot be instantiated during document loading. For detailed information, refer to the GitHub Security Advisory and the release notes for v2.48.4.

Workarounds

Upgrade the PyYAML dependency to version 5.4 or later if docling-core cannot be immediately upgraded
Restrict YAML document processing to trusted sources only until patches can be applied
Implement network segmentation to limit the blast radius of potential compromise
Deploy application-layer controls to reject YAML files containing Python object tags before they reach vulnerable code paths

bash

# Upgrade docling-core to patched version
# (quotes keep the shell from treating ">" as an output redirect)
pip install --upgrade "docling-core>=2.48.4"

# Alternative: Upgrade PyYAML as interim mitigation
pip install --upgrade "pyyaml>=5.4"

# Verify installed versions
pip show docling-core pyyaml | grep -E "^(Name|Version):"
