Impact
The NLTK downloader does not validate the subdir and id fields in remote XML index files, so an attacker can supply path traversal sequences such as ../ in those fields. When a user downloads corpus data, these sequences can create arbitrary directories, create arbitrary files, or overwrite existing files outside the intended data directory. The result can be malicious code execution or data corruption. This is a path traversal flaw (CWE-22: Improper Limitation of a Pathname to a Restricted Directory).
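The kind of check whose absence is described above can be sketched as follows. This is a minimal illustration, not NLTK's actual code or its fix: the function name is_safe_member and the base directory /tmp/nltk_data are hypothetical, and the sketch assumes that subdir and id are joined onto a data directory to form the destination path.

```python
import os


def is_safe_member(base_dir: str, subdir: str, pkg_id: str) -> bool:
    """Return True only if base_dir/subdir/pkg_id resolves strictly inside base_dir.

    Hypothetical helper for illustration; not part of the NLTK API.
    """
    base = os.path.realpath(base_dir)
    # realpath collapses "../" segments and resolves symlinks before comparing.
    # An absolute subdir or pkg_id makes os.path.join discard the base, which
    # the containment check below also rejects.
    target = os.path.realpath(os.path.join(base, subdir, pkg_id))
    try:
        inside = os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        inside = False
    # Reject a target that collapses back to the base directory itself.
    return inside and target != base


# Hypothetical index entries: one benign, one carrying traversal sequences.
print(is_safe_member("/tmp/nltk_data", "corpora", "brown"))
print(is_safe_member("/tmp/nltk_data", "../../etc", "passwd"))
```

Resolving the full path and comparing it against the base directory catches traversal in either field, which a simple search for the string "../" in one field would not.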
Affected Systems
Any installation of NLTK version 3.9.3 or earlier that uses the downloader feature to fetch corpora from remote XML indices is affected. The issue exists across all platforms where Python applications invoke the NLTK downloader to obtain data sets.
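To check whether a given environment falls in the affected range, a rough sketch like the one below may help. It assumes only what is stated above (3.9.3 is the last affected version) and uses a simplified numeric comparison rather than full PEP 440 parsing. The helper names is_affected and installed_nltk_version are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple


def is_affected(version: str, last_affected: Tuple[int, int, int] = (3, 9, 3)) -> bool:
    """Return True if version is at or below last_affected.

    Simplified: compares only the leading digits of the first three
    dot-separated components, so "3.9.3rc1" is treated as 3.9.3.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts) <= last_affected


def installed_nltk_version() -> Optional[str]:
    """Return the installed NLTK version string, or None if NLTK is absent."""
    try:
        return metadata.version("nltk")
    except metadata.PackageNotFoundError:
        return None


found = installed_nltk_version()
if found is None:
    print("NLTK is not installed")
else:
    print(found, "affected" if is_affected(found) else "not in the affected range")
```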
Risk and Exploitability
The vulnerability carries a CVSS score of 8.1 (high severity), but its EPSS score is below 1%, suggesting a low likelihood of exploitation in the wild. It is not listed in the CISA KEV catalog. The description implies a remote attack vector that depends on victim action: an adversary must host a malicious XML index, and a victim must direct the NLTK downloader to retrieve data from that server. Only then can the traversal sequences create or overwrite files on the victim's system.