CVE-2026-33230 Overview
A reflected Cross-Site Scripting (XSS) vulnerability has been discovered in NLTK (Natural Language Toolkit), a widely used suite of open source Python modules supporting research and development in Natural Language Processing. The vulnerability exists in the nltk.app.wordnet_app module, specifically within the lookup_... route handler, where attacker-controlled input is reflected into HTML responses without output encoding.
Critical Impact
Users running the local WordNet Browser server are vulnerable to arbitrary script execution in their browser, potentially leading to session hijacking, credential theft, or malicious actions performed in the context of the application.
Affected Products
- NLTK versions 3.9.3 and prior
- nltk.app.wordnet_app module
- WordNet Browser local server component
Discovery Timeline
- 2026-03-20 - CVE-2026-33230 published to NVD
- 2026-03-23 - Last updated in NVD database
Technical Details for CVE-2026-33230
Vulnerability Analysis
This reflected XSS vulnerability occurs in the WordNet Browser application, a component of NLTK that provides a local web-based interface for exploring the WordNet lexical database. The lookup_... route is designed to accept word queries from users and display dictionary definitions. However, when a searched word is not found in the dictionary, the application reflects the user-supplied input directly into the HTML response without applying proper output encoding.
The vulnerable code path triggers when a user submits a lookup request containing malicious JavaScript or HTML content. Since the word parameter is interpolated directly into the error message using Python's string formatting (%s), an attacker can craft a URL that injects arbitrary code into the rendered page.
Root Cause
The root cause is missing output encoding in the wordnet_app.py module. When generating the error message for words not found in the dictionary, the code directly interpolates user input into an HTML string without calling html.escape() to sanitize special characters. This allows attackers to break out of the text context and inject script tags or event handlers.
Attack Vector
An attacker can exploit this vulnerability by crafting a malicious URL that carries a JavaScript payload in the lookup route. When a victim opens this link while the local WordNet Browser server is running, the script executes in the victim's browser with access to the application origin.
# Patch diff: vulnerable line removed (-), fixed lines added (+)
# Source: https://github.com/nltk/nltk/commit/1c3f799607eeb088cab2491dcf806ae83c29ad8f
 except KeyError:
     pass
 if not body:
-    body = "The word or words '%s' were not found in the dictionary." % word
+    body = "The word or words '%s' were not found in the dictionary." % html.escape(
+        word
+    )
 return body, word
The fix applies html.escape() to the user-supplied word parameter before including it in the HTML response, preventing XSS payload execution.
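The effect of the patch can be reproduced outside NLTK. The following is a minimal sketch, not NLTK's actual handler: it reuses the message template from the diff above, while the function names and the payload are hypothetical and chosen only for illustration.

```python
import html

# Same message template as in the patch diff above.
TEMPLATE = "The word or words '%s' were not found in the dictionary."


def not_found_body_unescaped(word: str) -> str:
    """Mimic the pre-patch behaviour: interpolate the word as-is."""
    return TEMPLATE % word


def not_found_body_escaped(word: str) -> str:
    """Mimic the post-patch behaviour: HTML-escape the word first."""
    return TEMPLATE % html.escape(word)


if __name__ == "__main__":
    payload = "<script>alert(1)</script>"  # illustrative payload (assumption)
    print(not_found_body_unescaped(payload))  # markup survives intact
    print(not_found_body_escaped(payload))    # markup becomes inert text
```

With escaping applied, the angle brackets are emitted as `&lt;` and `&gt;`, so the browser renders the input as text instead of parsing it as a script element.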
Detection Methods for CVE-2026-33230
Indicators of Compromise
- Unusual URL patterns in web server logs containing <script>, javascript:, or HTML event handlers in the lookup route
- Requests to the WordNet Browser containing encoded payloads such as %3Cscript%3E or %22onload%3D
- Client-side security tools reporting XSS attempts originating from the NLTK WordNet Browser application
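The indicators above can be turned into a simple log check. The sketch below is a heuristic under stated assumptions: the function name, the marker list, and the example request targets are hypothetical, and the pattern is neither exhaustive nor free of false positives.

```python
import re
from urllib.parse import unquote

# Heuristic markers drawn from the indicators listed above; not exhaustive.
SUSPICIOUS = re.compile(
    r"<\s*script|javascript:|\bon(?:load|error|click|mouseover|focus)\s*=",
    re.IGNORECASE,
)


def looks_like_xss(request_target: str) -> bool:
    """Return True if the URL-decoded request target contains a known marker."""
    decoded = unquote(request_target)  # turns %3Cscript%3E into <script>
    return bool(SUSPICIOUS.search(decoded))


if __name__ == "__main__":
    # Hypothetical request targets, as they might appear in an access log.
    for target in ("/some/path?word=dog", "/some/path?word=%3Cscript%3Ealert(1)%3C/script%3E"):
        print(target, looks_like_xss(target))
```

Decoding only once is a deliberate simplification; payloads that are double-encoded would need an additional decoding pass.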
Detection Strategies
- Monitor HTTP access logs for requests containing suspicious characters like <, >, ", or ' in URL parameters
- Implement Web Application Firewall (WAF) rules to detect and block reflected XSS patterns targeting the WordNet Browser
- Deploy browser-based XSS detection extensions to identify malicious script injection attempts
Monitoring Recommendations
- Enable Content Security Policy (CSP) headers to restrict script execution sources and mitigate XSS impact
- Configure logging to capture full request URLs for forensic analysis of potential attack attempts
- Review network traffic for outbound connections initiated by scripts in the WordNet Browser context
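As an illustration of the CSP recommendation, the sketch below shows one way a Python `http.server` handler could attach a Content-Security-Policy header to every response. It is a generic example, not NLTK's handler: the class name, helper name, and policy values are assumptions, and a policy this strict may break pages that rely on inline scripts or styles.

```python
from http.server import BaseHTTPRequestHandler


def build_csp(directives: dict[str, list[str]]) -> str:
    """Join CSP directives into a single header value."""
    return "; ".join(f"{name} {' '.join(values)}" for name, values in directives.items())


# Example policy (assumption): block all scripts, allow same-origin styles.
POLICY = build_csp({"default-src": ["'none'"], "style-src": ["'self'"]})


class CSPHandler(BaseHTTPRequestHandler):
    """Hypothetical base handler; subclasses add do_GET and inherit the header."""

    def end_headers(self) -> None:
        # Append the policy just before the header section is closed.
        self.send_header("Content-Security-Policy", POLICY)
        super().end_headers()
```

Because `script-src` falls back to `default-src`, a `default-src 'none'` policy prevents injected inline scripts from running even if an unescaped payload reaches the page.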
How to Mitigate CVE-2026-33230
Immediate Actions Required
- Upgrade NLTK to a version containing commit 1c3f799607eeb088cab2491dcf806ae83c29ad8f or later
- Avoid exposing the WordNet Browser server to untrusted networks or users
- Restrict access to the local WordNet Browser server to localhost only until patched
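To check whether an installed NLTK falls in the range listed under Affected Products, a version comparison along the following lines may help. This is a sketch: the helper names are hypothetical, the only threshold used is the "3.9.3 and prior" statement above, and pre-release suffixes are ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, version

LAST_AFFECTED = (3, 9, 3)  # from "NLTK versions 3.9.3 and prior"


def parse_release(version_string: str) -> tuple:
    """Extract leading numeric components, e.g. '3.9.3' -> (3, 9, 3)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognised version: {version_string!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    return parts + (0,) * (3 - len(parts))  # pad '3.9' to (3, 9, 0)


def is_listed_as_affected(version_string: str) -> bool:
    """True if the version is at or below the last affected release."""
    return parse_release(version_string) <= LAST_AFFECTED


if __name__ == "__main__":
    try:
        installed = version("nltk")
        print(installed, "affected" if is_listed_as_affected(installed) else "not listed as affected")
    except PackageNotFoundError:
        print("nltk is not installed")
```

A result of "not listed as affected" only reflects the version range stated in this advisory; confirming that the installed release contains the fixing commit remains the authoritative check.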
Patch Information
The NLTK development team has addressed this vulnerability through commits that add proper HTML escaping to user-supplied input. The fix applies html.escape() to the word parameter before including it in error messages. Users should update to the latest version of NLTK to receive this security fix.
Workarounds
- Disable the WordNet Browser application (nltk.app.wordnet_app) if not actively required
- Run the WordNet Browser server only on localhost and avoid accessing it through links from untrusted sources
- Implement a reverse proxy with XSS filtering capabilities in front of the WordNet Browser server
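# Example - start the WordNet Browser on a chosen local port
# Note: nltk.app.wordnet_app exposes wnb(port=...); it does not appear to accept a host argument,
# so confirm the bind address in your installed version and block remote access to the port with a host firewall
python -c "from nltk.app.wordnet_app import wnb; wnb(port=8000)"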