Description
NLTK version 3.9.4 is vulnerable to a path traversal attack due to an incomplete fix for GitHub Issue #3504. The `_UNSAFE_NO_PROTOCOL_RE` regex in `nltk/data.py` checks for literal `../` sequences but fails to account for percent-encoded traversal sequences such as `..%2f`. The `url2pathname()` function decodes these sequences after the validation step, allowing an attacker to bypass the protection. This vulnerability enables an attacker to read arbitrary files accessible to the Python process by controlling the resource name parameter passed to `nltk.data.load()` or `nltk.data.find()`. The issue affects applications that rely on NLTK for resource loading, including NLP web applications, Jupyter notebooks, and CLI tools. The default `pathsec.ENFORCE=False` setting exacerbates the impact by not blocking the file read at the `open()` stage.
Published: 2026-06-30
Score: 7.5 High
EPSS: n/a
KEV: No
Impact: n/a
Action: n/a
AI Analysis

Impact

The vulnerability is a path traversal flaw in NLTK's data loading functions. Percent‑encoded traversal sequences such as ..%2f bypass the existing check and allow an attacker to read any file that the Python process can access. If an application passes a user‑controlled resource name to nltk.data.load() or nltk.data.find(), the attacker can read arbitrary files, exposing sensitive data.
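
The validate-then-decode ordering described above can be illustrated with a small sketch. It does not use NLTK's real code: `LITERAL_TRAVERSAL_RE` is a hypothetical stand-in for a check that only matches literal `../`, and `urllib.parse.unquote` stands in for the later percent-decoding step.

```python
import re
from urllib.parse import unquote

# Hypothetical stand-in for a check that only looks for a literal "../".
# This is NOT the actual _UNSAFE_NO_PROTOCOL_RE pattern from nltk/data.py.
LITERAL_TRAVERSAL_RE = re.compile(r"(^|/)\.\./")


def naive_is_safe(resource_name: str) -> bool:
    """Validate the raw string, before any percent-decoding has happened."""
    return LITERAL_TRAVERSAL_RE.search(resource_name) is None


def decode_step(resource_name: str) -> str:
    """Stand-in for the decoding that runs after validation."""
    return unquote(resource_name)


payload = "corpora/..%2f..%2f..%2fetc/passwd"
decoded = decode_step(payload)

print(naive_is_safe(payload))   # the encoded form passes the literal check
print(decoded)                  # the decoded form contains real "../" sequences
print(naive_is_safe(decoded))   # the same check would have rejected this form
```

Running the check on the decoded string rather than the raw one is what closes the gap in this toy model.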

Affected Systems

NLTK 3.9.4 is affected. Any software that imports NLTK to load resources, such as NLP web applications, Jupyter notebooks, and command‑line tools, is vulnerable if it relies on the default pathsec.ENFORCE setting of False. Other NLTK versions are not known to be impacted.

Risk and Exploitability

The CVSS score of 7.5 indicates high severity. EPSS data is not available, but the flaw is exploitable whenever an attacker can supply a resource name, which is typical of web services and local scripts that do not sanitize input. The vulnerability is not listed in the CISA KEV catalog. Because pathsec.ENFORCE defaults to False, the open() call still succeeds once the traversal check has been bypassed, which increases the likelihood of exploitation. The flaw allows files to be read but does not provide code execution or privilege escalation.
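
As a rough cross-check of the 7.5 score, the sketch below recomputes the CVSS v3.0 base score from the vector recorded for this entry (CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N). The function name is illustrative, and the sketch assumes Scope: Unchanged, so it is not a general CVSS calculator.

```python
import math

# CVSS v3.0 metric weights (the PR values are those for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def base_score_scope_unchanged(vector: str) -> float:
    """Compute a CVSS v3.0 base score for a vector with S:U."""
    # Skip the leading "CVSS:3.0" element, then split "AV:N" style pairs.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    # Round up to one decimal place, as the specification requires.
    return math.ceil(min(impact + exploitability, 10) * 10) / 10


score = base_score_scope_unchanged("CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N")
print(score)  # 7.5
```

The only non-zero impact term is confidentiality (C:H), which matches the statement that the flaw exposes file contents without affecting integrity or availability.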

Generated by OpenCVE AI on June 30, 2026 at 02:20 UTC.

Remediation

No vendor fix or workaround currently provided.

OpenCVE Recommended Actions

  • Upgrade NLTK to the latest release that includes a fix for the traversal check (e.g., version 3.9.5 or later).
  • Configure NLTK to enforce path validation by setting pathsec.ENFORCE=True, which blocks open calls when validation fails.
  • Validate or sanitize any resource name passed to nltk.data.load() or nltk.data.find() before calling the function, rejecting percent‑encoded sequences and ensuring the path does not contain ".." components.
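
The input-validation action in the list above could be sketched as follows. `is_safe_resource_name` is a hypothetical helper, not part of NLTK, and treating backslashes as separators is an extra assumption for Windows-style paths. It covers only the two checks named in the recommendation and is not a complete defense.

```python
def is_safe_resource_name(name: str) -> bool:
    """Return True only if `name` has no percent-encoding and no '..' component.

    Hypothetical helper for illustration; it is not an NLTK API.
    """
    # Reject any percent sign, so encoded forms such as "..%2f" never reach
    # a later decoding step.
    if "%" in name:
        return False
    # Assumption: treat backslashes as separators as well as forward slashes.
    components = name.replace("\\", "/").split("/")
    return ".." not in components


for candidate in (
    "tokenizers/punkt/english.pickle",
    "corpora/..%2f..%2fetc/passwd",
    "corpora/../secret.txt",
):
    print(candidate, is_safe_resource_name(candidate))
```

An application would call such a helper on the user-supplied name and refuse the request before passing anything to nltk.data.load() or nltk.data.find().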

Advisories

No advisories yet.

History

Tue, 30 Jun 2026 01:15:00 +0000

Values added (no values removed):
  • Description: same text as the Description at the top of this entry
  • Title: Path Traversal via Percent-Encoding in nltk.data.find() and nltk.data.load()
  • Weaknesses: CWE-22
  • References: (no values shown)
  • Metrics (cvssV3_0): {'score': 7.5, 'vector': 'CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N'}


MITRE

Status: PUBLISHED

Assigner: @huntr_ai

Published:

Updated: 2026-06-30T00:14:35.370Z

Reserved: 2026-06-15T06:24:30.096Z

Link: CVE-2026-12243

Vulnrichment

No data.

NVD

No data.

Redhat

No data.

OpenCVE Enrichment

Updated: 2026-06-30T02:30:05Z

Weaknesses
  • CWE-22

    Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')