CVE-2026-40682 - Vulnerability Details

- Apache OpenNLP: XXE via Dictionary Parsing in DictionaryEntryPersistor

Description

XML External Entity (XXE) via Unsanitized Dictionary Parsing in Apache OpenNLP DictionaryEntryPersistor

Versions Affected: before 2.5.9, before 3.0.0-M3

Description: The DictionaryEntryPersistor class initializes a static SAXParserFactory at class-load time without enabling FEATURE_SECURE_PROCESSING or disabling DTD processing. When create(InputStream, EntryInserter) is invoked, the only feature set on the XMLReader is namespace support — external entity resolution and DOCTYPE declarations remain fully enabled. An attacker who can supply a crafted dictionary file (e.g., a stop-word list or domain dictionary) containing a malicious DOCTYPE declaration can trigger local file disclosure via file:// entity references or server-side request forgery via http:// entity references during SAX parsing, before the application processes a single dictionary entry. This is inconsistent with the project's own XmlUtil.createSaxParser() helper, which correctly sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl and is used by all other XML parsing paths in the codebase. The public Dictionary(InputStream) constructor delegates directly to this method and is the documented API for loading user-supplied dictionaries, making untrusted input a realistic scenario.
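
The difference between a namespace-only SAX configuration and one that also sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl can be sketched as follows. This is an illustration, not the OpenNLP source: the class name SaxHardeningDemo and its methods are hypothetical, and the disallow-doctype-decl feature URI is assumed to be recognized by the JAXP implementation in use (it is by the Xerces-based parser bundled with the JDK). The sample document uses only an internal entity, so nothing is read from disk or the network.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo class; not part of OpenNLP. */
final class SaxHardeningDemo {
    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    /** Builds a reader with namespace support only, or with the two hardening features added. */
    static XMLReader newReader(boolean hardened)
            throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        if (hardened) {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Any DOCTYPE declaration becomes a fatal parse error.
            factory.setFeature(DISALLOW_DOCTYPE, true);
        }
        return factory.newSAXParser().getXMLReader();
    }

    /** Returns true if the document parses without a SAX error. */
    static boolean accepts(boolean hardened, String xml) {
        XMLReader reader;
        try {
            reader = newReader(hardened);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("could not configure XML parser", e);
        }
        DefaultHandler handler = new DefaultHandler(); // rethrows fatal errors
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Internal entity only: harmless, but it proves DOCTYPE processing is active.
        String withDoctype = "<!DOCTYPE dictionary [<!ENTITY w \"the\">]>"
                + "<dictionary><entry><token>&w;</token></entry></dictionary>";
        System.out.println("namespace-only reader accepts DOCTYPE: " + accepts(false, withDoctype));
        System.out.println("hardened reader accepts DOCTYPE: " + accepts(true, withDoctype));
    }
}
```

With the JDK's built-in parser, the namespace-only reader is expected to accept the document and the hardened reader to reject it at the DOCTYPE declaration, before any entity is resolved.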

Mitigation: 2.x users should upgrade to 2.5.9. 3.x users should upgrade to 3.0.0-M3. Users who cannot upgrade immediately should ensure that all dictionary files are sourced from trusted origins and should consider wrapping the Dictionary(InputStream) constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser.
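
For deployments that cannot upgrade yet, the DOCTYPE-rejecting wrapper suggested above might look like the sketch below. It buffers the dictionary bytes, parses them once with a reader that forbids DOCTYPE declarations, and only then hands the same bytes on. The class DoctypeGuard and its methods are hypothetical names, and the feature URI is assumed to be supported by the JAXP implementation in use. Parsing rather than scanning the bytes for the text "<!DOCTYPE" is a deliberate choice here, since a text scan can be evaded with alternative encodings; the trade-off is that malformed XML is rejected as well.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper that pre-validates dictionary XML; not part of OpenNLP. */
final class DoctypeGuard {

    /** Returns true if the bytes parse as XML under a reader that forbids DOCTYPE. */
    static boolean isAccepted(byte[] xml) {
        XMLReader reader;
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            reader = factory.newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("could not configure XML parser", e);
        }
        DefaultHandler handler = new DefaultHandler(); // rethrows fatal errors
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        try {
            reader.parse(new InputSource(new ByteArrayInputStream(xml)));
            return true;
        } catch (SAXException | IOException e) {
            // DOCTYPE present, malformed XML, or undecodable bytes: reject.
            return false;
        }
    }

    /** Returns the same bytes if they pass validation, otherwise throws. */
    static byte[] requireNoDoctype(byte[] xml) {
        if (!isAccepted(xml)) {
            throw new IllegalArgumentException("dictionary XML rejected (DOCTYPE or malformed)");
        }
        return xml;
    }

    public static void main(String[] args) {
        byte[] plain = "<dictionary><entry><token>the</token></entry></dictionary>"
                .getBytes(StandardCharsets.UTF_8);
        byte[] checked = requireNoDoctype(plain);
        // The validated bytes would then be passed to the constructor named in the advisory:
        //   new Dictionary(new ByteArrayInputStream(checked));
        System.out.println("validated " + checked.length + " bytes");
    }
}
```

A caller would typically read the untrusted stream fully (for example with InputStream.readAllBytes()), pass the result through requireNoDoctype, and wrap the returned array in a ByteArrayInputStream. Buffering the whole file is acceptable for typical dictionaries but should be size-limited for untrusted uploads.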

Published: 2026-05-04

Score: 9.1 Critical

EPSS: < 1% Very Low

KEV: No

Analysis

Impact

Apache OpenNLP's DictionaryEntryPersistor initializes a SAXParserFactory that allows DOCTYPE declarations and external entity resolution, creating an XXE vulnerability (CWE-611). When the public Dictionary(InputStream) constructor processes a user-supplied dictionary file, an attacker can craft a malicious DOCTYPE to read local files via file:// references or trigger server-side request forgery via http:// references. Both happen during SAX parsing, before any dictionary entry is processed. The primary impact is loss of confidentiality (disclosure of sensitive files or internal resources); the CVSS v3.1 vector also rates integrity impact as High.

Affected Systems

The vulnerability affects Apache OpenNLP, a library maintained by the Apache Software Foundation, in all releases before 2.5.9 and in 3.0.0 milestone releases before 3.0.0-M3 (the CPE data lists 3.0.0-M1 and 3.0.0-M2). Deployments that run these releases and load dictionaries through the public API are at risk.

Risk and Exploitability

The EPSS score is below 1% (0.00403), the vulnerability is not listed in the CISA KEV catalog, and the SSVC data records no known exploitation. The flaw is nonetheless straightforward to exploit wherever dictionary sources are not restricted to trusted origins: the attacker only needs to supply a crafted dictionary file, which is realistic for deployments that load user-supplied dictionaries. The CVSS v3.1 base score of 9.1 is rated Critical, reflecting a network-reachable attack that requires no privileges or user interaction and can lead to significant confidentiality compromise.

Default status is the baseline for the product; each version can override it (e.g. patched versions marked unaffected).

Vendor	Product	Default status
Apache Software Foundation	Apache OpenNLP	unaffected

Version	Status	Constraints
`0`	affected	< 2.5.9
`3.0`	affected	< 3.0.0-M3

Configuration 1

OR	cpe:2.3:a:apache:opennlp:*:*:*:*:*:*:*:*
	cpe:2.3:a:apache:opennlp:3.0.0:m1:*:*:*:*:*:*
	cpe:2.3:a:apache:opennlp:3.0.0:m2:*:*:*:*:*:*

Vendor	Product	Confidence
Apache	Opennlp	95%

Version	Status	Scheme	Platform
`[0,2.5.9)`	affected	semver	—
`[3.0,3.0.0-M3)`	affected	semver	—

Remediation

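The vendor fix is to upgrade: 2.5.9 for the 2.x line and 3.0.0-M3 for the 3.x line. For deployments that cannot upgrade immediately, the advisory's workaround is to load dictionaries only from trusted origins and to reject XML containing a DOCTYPE declaration before it reaches the parser.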

OpenCVE Recommended Actions

Upgrade Apache OpenNLP to release 2.5.9 or later (for 2.x) or to 3.0.0‑M3 or later (for 3.x).
Validate that all dictionary files come from trusted, authenticated sources before they reach the parsing component; consider refusing to load files from unknown or external locations.
Implement a pre‑validation wrapper around the Dictionary(InputStream) constructor that rejects any XML containing a DOCTYPE declaration, preventing the parser from processing external entities.

Generated by OpenCVE AI on May 5, 2026 at 17:54 UTC.

Advisories

Source	ID	Title
Github GHSA	GHSA-4v8g-86x5-3vrc	Apache OpenNLP DictionaryEntryPersistor Vulnerable to XML External Entity (XXE) via Unsanitized Dictionary Parsing

CVSS v3.1 (no CVSS v4.0, v3.0, or v2 score is available)

Metric	Value
Attack Vector	Network
Attack Complexity	Low
Privileges Required	None
User Interaction	None
Scope	Unchanged
Confidentiality Impact	High
Integrity Impact	High
Availability Impact	None
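
The 9.1 base score follows from the CVSS v3.1 vector recorded in the History section (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N). The arithmetic can be sketched as below. The numeric metric weights and the Roundup rule are taken from the CVSS v3.1 specification rather than from this page, the sketch covers only the Scope Unchanged case, and the class and method names are illustrative.

```java
/** Illustrative CVSS v3.1 base-score arithmetic for Scope: Unchanged vectors. */
final class CvssBaseScore {

    /** CVSS v3.1 "Roundup": smallest number with one decimal place that is >= value. */
    static double roundUp(double value) {
        long scaled = Math.round(value * 100000);
        if (scaled % 10000 == 0) {
            return scaled / 100000.0;
        }
        return (Math.floor(scaled / 10000.0) + 1) / 10.0;
    }

    /** Base score from numeric metric weights, Scope Unchanged only. */
    static double baseScoreScopeUnchanged(double av, double ac, double pr, double ui,
                                          double c, double i, double a) {
        double iss = 1 - (1 - c) * (1 - i) * (1 - a);     // impact sub-score
        double impact = 6.42 * iss;
        double exploitability = 8.22 * av * ac * pr * ui;
        if (impact <= 0) {
            return 0.0;
        }
        return roundUp(Math.min(impact + exploitability, 10.0));
    }

    public static void main(String[] args) {
        // Weights: AV:N=0.85, AC:L=0.77, PR:N=0.85, UI:N=0.85, C:H=0.56, I:H=0.56, A:N=0.0
        double score = baseScoreScopeUnchanged(0.85, 0.77, 0.85, 0.85, 0.56, 0.56, 0.0);
        System.out.println(score); // 5.177 (impact) + 3.887 (exploitability) = 9.064, rounded up to 9.1
    }
}
```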

This CVE is not in the KEV list.

The EPSS score is 0.00403.

SSVC

Decision point	Value
Exploitation	none
Automatable	yes
Technical Impact	total

References

Link	Providers
http://www.openwall.com/lists/oss-security/2026/05/01/19
https://lists.apache.org/thread/r6jpt0qr9nj67gqhppqg7jxf8vsbo0w6
https://nvd.nist.gov/vuln/detail/CVE-2026-40682
https://www.cve.org/CVERecord?id=CVE-2026-40682

History

Tue, 26 May 2026 12:15:00 +0000

Type	Values Removed	Values Added
References		https://nvd.nist.gov/vuln/detail/CVE-2026-40682 https://www.cve.org/CVERecord?id=CVE-2026-40682
Metrics	threat_severity `None`	threat_severity `Important`

Wed, 06 May 2026 18:15:00 +0000

Type	Values Removed	Values Added
CPEs		cpe:2.3:a:apache:opennlp:*:*:*:*:*:*:*:* cpe:2.3:a:apache:opennlp:3.0.0:m1:*:*:*:*:*:* cpe:2.3:a:apache:opennlp:3.0.0:m2:*:*:*:*:*:*

Tue, 05 May 2026 17:30:00 +0000

Type	Values Removed	Values Added
Metrics		ssvc `{'options': {'Automatable': 'yes', 'Exploitation': 'none', 'Technical Impact': 'total'}, 'version': '2.0.3'}`

Tue, 05 May 2026 16:30:00 +0000

Type	Values Removed	Values Added
Metrics		cvssV3_1 `{'score': 9.1, 'vector': 'CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N'}`

Mon, 04 May 2026 19:15:00 +0000

Type	Values Removed	Values Added
First Time appeared		Apache Apache opennlp
Vendors & Products		Apache Apache opennlp

Mon, 04 May 2026 18:30:00 +0000

Type	Values Removed	Values Added
References		http://www.openwall.com/lists/oss-security/2026/05/01/19

Mon, 04 May 2026 17:15:00 +0000

Type	Values Removed	Values Added
Description		Initial description added (text identical to the Description section at the top of this page).
Title		Apache OpenNLP: XXE via Dictionary Parsing in DictionaryEntryPersistor
Weaknesses		CWE-611
References		https://lists.apache.org/thread/r6jpt0qr9nj67gqhppqg7jxf8vsbo0w6

MITRE

Status: PUBLISHED

Assigner: apache

Published: 2026-05-04T16:55:55.834Z

Updated: 2026-05-05T15:02:14.483Z

Reserved: 2026-04-14T17:21:09.189Z

Link: CVE-2026-40682

Vulnrichment

Updated: 2026-05-04T17:36:52.681Z

NVD

Status: Analyzed

Published: 2026-05-04T17:16:23.657

Modified: 2026-06-17T10:45:33.903

Link: CVE-2026-40682

Redhat

Severity: Important

Public Date: 2026-05-04T16:55:55Z

Links: CVE-2026-40682 - Bugzilla

OpenCVE Enrichment

Updated: 2026-05-05T18:00:13Z

Weaknesses

CWE-611
Improper Restriction of XML External Entity Reference
