CVE-2026-42440 - Vulnerability Details

- Apache OpenNLP: OOM DoS via Unbounded Array Allocation in AbstractModelReader

Description

OOM Denial of Service via Unbounded Array Allocation in Apache OpenNLP AbstractModelReader

Versions Affected:

* before 1.9.5
* before 2.5.9
* before 3.0.0-M3

Description:

The AbstractModelReader methods getOutcomes(), getOutcomePatterns(), and getPredicates() each read a 32-bit signed integer count field from a binary model stream and pass that value directly to an array allocation (new String[numOutcomes], new int[numOCTypes][], new String[NUM_PREDS]) without validating that the value is non-negative or within a reasonable bound. The count is therefore fully attacker-controlled when the model file originates from an untrusted source.

A crafted .bin model file in which any of these count fields is set to Integer.MAX_VALUE (or any value large enough to exhaust the available heap) triggers an OutOfMemoryError at the array allocation itself, before the corresponding label or pattern data is consumed from the stream. The error occurs very early in deserialization: for a GIS model, getOutcomes() is reached after only the model-type string, the correction constant, and the correction parameter have been read. The attacker therefore pays no meaningful size cost to weaponize a payload, and a single small file can crash a JVM that loads it. Any code path that deserializes a .bin model is affected, including direct use of GenericModelReader and any higher-level component that delegates to it during model load.
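The read-then-allocate pattern described above can be illustrated with a minimal sketch. This is not the OpenNLP source: the class `UncheckedCountDemo` and its methods are hypothetical, and the count field is assumed to be a big-endian 32-bit integer as read by `java.io.DataInputStream`. The demo uses a negative count so that the failure is cheap to observe; on a typical JVM a count of Integer.MAX_VALUE would fail at the same line with an OutOfMemoryError instead.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

final class UncheckedCountDemo {

    // Hypothetical reader: trusts the count field and allocates immediately.
    static String[] readLabelsUnchecked(DataInputStream in) throws IOException {
        int count = in.readInt();            // 32-bit signed value taken from the file
        String[] labels = new String[count]; // allocated before any label is read
        for (int i = 0; i < count; i++) {
            labels[i] = in.readUTF();
        }
        return labels;
    }

    // Builds a stream that holds only a count field and no label data.
    static DataInputStream streamWithCount(int count) {
        byte[] bytes = ByteBuffer.allocate(4).putInt(count).array();
        return new DataInputStream(new ByteArrayInputStream(bytes));
    }

    // Returns "ok" on success, or the simple class name of the allocation failure.
    static String outcomeFor(int count) {
        try {
            readLabelsUnchecked(streamWithCount(count));
            return "ok";
        } catch (NegativeArraySizeException e) {
            return e.getClass().getSimpleName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(outcomeFor(0));   // empty array, nothing else to read
        System.out.println(outcomeFor(-1));  // fails at the allocation itself
    }
}
```

In both failure modes the four bytes of the count field are all the input the reader needs before it attempts the allocation, which is why the payload can stay small.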

The practical impact is denial of service against processes that load model files from untrusted or semi-trusted origins.

Mitigation:

* 2.x users should upgrade to 2.5.9.

* 3.x users should upgrade to 3.0.0-M3.

Note: The fix introduces an upper bound on each of the three count fields, checked before array allocation; counts that are negative or exceed the bound cause an IllegalArgumentException to be thrown and the read to fail fast with no large allocation. The default bound is 10,000,000, which is well above the entry counts of legitimate OpenNLP models but far below any value that would threaten heap exhaustion. Deployments that legitimately need to load models with more entries than the default can raise the limit at JVM startup by setting the OPENNLP_MAX_ENTRIES system property to the desired positive integer (e.g. -DOPENNLP_MAX_ENTRIES=50000000); invalid or non-positive values fall back to the default.
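The behaviour described in the note can be sketched as follows. This is an illustration under stated assumptions, not the actual OpenNLP patch: the class `EntryLimit` and the methods `resolveMaxEntries` and `checkCount` are hypothetical names, while the default of 10,000,000, the OPENNLP_MAX_ENTRIES property name, the fallback rule, and the IllegalArgumentException come from the advisory text.

```java
final class EntryLimit {

    // Default bound quoted in the advisory.
    static final int DEFAULT_MAX_ENTRIES = 10_000_000;

    // Parses a raw property value; invalid or non-positive input falls back to the default.
    static int resolveMaxEntries(String raw) {
        if (raw == null) {
            return DEFAULT_MAX_ENTRIES;
        }
        try {
            int parsed = Integer.parseInt(raw.trim());
            return parsed > 0 ? parsed : DEFAULT_MAX_ENTRIES;
        } catch (NumberFormatException e) {
            return DEFAULT_MAX_ENTRIES;
        }
    }

    // Rejects negative or oversized counts before any array is allocated.
    static int checkCount(int count, int maxEntries, String field) {
        if (count < 0 || count > maxEntries) {
            throw new IllegalArgumentException(
                "Invalid " + field + " count " + count + " (limit " + maxEntries + ")");
        }
        return count;
    }

    public static void main(String[] args) {
        int max = resolveMaxEntries(System.getProperty("OPENNLP_MAX_ENTRIES"));
        // A plausible count passes the check and only then sizes the array.
        String[] outcomes = new String[checkCount(3, max, "outcome")];
        System.out.println("allocated " + outcomes.length + " outcome slots");
        try {
            checkCount(-1, max, "predicate");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Validating the count and only then using it as an array size keeps the failure cheap: a bad file costs one exception rather than a multi-gigabyte allocation attempt.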

Users who cannot upgrade immediately should treat all .bin model files as untrusted input unless their provenance is verified, and should avoid loading models supplied by end users or fetched from third-party repositories without integrity checks.
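One way to apply an integrity check before a model is handed to OpenNLP is to compare the file's SHA-256 digest against a value pinned from a trusted source. The sketch below assumes such a pinned digest is available; the class `ModelIntegrity` and its methods are hypothetical and use only the standard `java.security.MessageDigest` API.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

final class ModelIntegrity {

    // Hex-encodes the SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    // True only if the bytes hash to the pinned digest (constant-time comparison).
    static boolean matchesPinnedDigest(byte[] modelBytes, String expectedHex) {
        byte[] actual = sha256Hex(modelBytes).getBytes(StandardCharsets.US_ASCII);
        byte[] expected = expectedHex.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(actual, expected);
    }

    // Usage: java ModelIntegrity <model-file> <expected-sha256-hex>
    public static void main(String[] args) throws Exception {
        byte[] modelBytes = Files.readAllBytes(Paths.get(args[0]));
        if (!matchesPinnedDigest(modelBytes, args[1])) {
            throw new SecurityException("Model digest mismatch; refusing to load " + args[0]);
        }
        System.out.println("Digest verified; model may be passed to the loader.");
    }
}
```

A digest check only establishes that the file is the one that was vetted. It does not make an unvetted model safe, so it complements the upgrade rather than replacing it.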

Published: 2026-05-04

Score: 7.5 High

EPSS: < 1% Very Low

KEV: No

Analysis

Impact

AbstractModelReader in Apache OpenNLP sizes the arrays for outcomes, outcome patterns, and predicates from a 32-bit signed integer read from a binary .bin model file, without validating the value. Because the count is attacker-controlled, a malicious model can set it to Integer.MAX_VALUE or another very large number, so that the array allocation fails with an OutOfMemoryError during deserialization. This is an allocation of resources without limits (CWE-770) driven by an excessive size value (CWE-789), and it results in a denial-of-service condition when a process loads a malicious model from an untrusted source.

Affected Systems

The vulnerability affects Apache OpenNLP versions before 1.9.5, 2.x versions before 2.5.9, and 3.x versions before 3.0.0-M3. Any component that loads a .bin model file, such as GenericModelReader or a higher-level utility that delegates to it, is impacted if it processes models from untrusted or semi-trusted origins.

Risk and Exploitability

The risk is high in environments that accept user-supplied or third-party model files, because an attacker can trigger a crash with a single, lightweight file. The CVSS v3.1 base score is 7.5 (High), with a network attack vector, low attack complexity, and no privileges or user interaction required; the SSVC assessment rates the issue as automatable. The EPSS score of under 1% indicates a very low but non-zero likelihood of exploitation. The vulnerability is not listed in CISA's KEV catalog, but that absence does not rule out a denial-of-service attack. In practice, exposure depends on whether the application accepts model files over a network or from untrusted users.

Default status is the baseline for the product; each version can override it (e.g. patched versions marked unaffected).

Vendor	Product	Default status
Apache Software Foundation	Apache OpenNLP	unaffected

Version	Status	Constraints
`2.0`	affected	< 2.5.9
`3.0.0-M1`	affected	< 3.0.0-M3
`0`	affected	< 1.9.5

Configuration 1

OR	cpe:2.3:a:apache:opennlp::::::::
	cpe:2.3:a:apache:opennlp:3.0.0:m1::::::
	cpe:2.3:a:apache:opennlp:3.0.0:m2::::::

Vendor	Product	Confidence
Apache	Opennlp	95%

Version	Status	Scheme	Platform
`[0,2.5.9)`	affected	semver	—
`[3.0,3.0.0-M3)`	affected	semver	—

Remediation

A vendor fix is available: per the Apache advisory quoted in the Description, 2.x users should upgrade to 2.5.9 and 3.x users to 3.0.0-M3.

OpenCVE Recommended Actions

Upgrade to Apache OpenNLP 2.5.9 (2.x line) or 3.0.0-M3 (3.x line), or a newer release, so that count values are bounded before array allocation.
Do not load .bin model files that originate from untrusted sources; verify the provenance or integrity of the file before deserialization.
If larger model entry counts are required, set the OPENNLP_MAX_ENTRIES JVM property to a safe value (e.g., -DOPENNLP_MAX_ENTRIES=50000000) while ensuring the value is positive and within system limits.

Generated by OpenCVE AI on June 29, 2026 at 21:28 UTC.

Advisories

Source	ID	Title
Github GHSA	GHSA-659w-93r5-9j6m	Apache OpenNLP AbstractModelReader has an OOM Denial of Service via Unbounded Array Allocation

CVSS v3.1: 7.5 High (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H)

Metric	Value
Attack Vector	Network
Attack Complexity	Low
Privileges Required	None
User Interaction	None
Scope	Unchanged
Confidentiality Impact	None
Integrity Impact	None
Availability Impact	High

No CVSS v4.0, v3.0, or v2 scores.
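The 7.5 base score can be reproduced from the metric values listed above, which correspond to the vector CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H recorded in this CVE's history. The sketch below applies the CVSS v3.1 base-score equations for the Scope Unchanged case only. The class and method names are hypothetical; the numeric weights are the ones the CVSS v3.1 specification assigns to these metric values.

```java
final class CvssBaseScore {

    // CVSS v3.1 "Roundup": smallest value with one decimal place that is >= the input,
    // computed on scaled integers to avoid floating-point artefacts.
    static double roundUp(double value) {
        long scaled = Math.round(value * 100000);
        if (scaled % 10000 == 0) {
            return scaled / 100000.0;
        }
        return (Math.floor(scaled / 10000.0) + 1) / 10.0;
    }

    // Base score for Scope: Unchanged, given the numeric weight of each metric.
    static double baseScoreScopeUnchanged(double av, double ac, double pr, double ui,
                                          double c, double i, double a) {
        double iss = 1 - (1 - c) * (1 - i) * (1 - a);     // impact sub-score
        double impact = 6.42 * iss;
        double exploitability = 8.22 * av * ac * pr * ui;
        if (impact <= 0) {
            return 0.0;
        }
        return roundUp(Math.min(impact + exploitability, 10.0));
    }

    public static void main(String[] args) {
        // AV:N = 0.85, AC:L = 0.77, PR:N = 0.85, UI:N = 0.85, C:N = 0, I:N = 0, A:H = 0.56
        double score = baseScoreScopeUnchanged(0.85, 0.77, 0.85, 0.85, 0.0, 0.0, 0.56);
        System.out.println(score); // prints 7.5
    }
}
```

With only Availability affected, the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 3.89; their sum of roughly 7.48 rounds up to 7.5.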

This CVE is not in the KEV list.

The EPSS score is 0.00627.

Exploitation none

Automatable yes

Technical Impact partial

References

Link	Providers
http://www.openwall.com/lists/oss-security/2026/05/01/21
https://lists.apache.org/thread/s8xlkx1gqbxfsq48py5h6jphjvgqp1jo
https://nvd.nist.gov/vuln/detail/CVE-2026-42440
https://www.cve.org/CVERecord?id=CVE-2026-42440

History

Mon, 29 Jun 2026 20:15:00 +0000

Type	Values Removed	Values Added
Description	Earlier description text, in which Versions Affected lists only "before 2.5.9" and "before 3.0.0-M3".	Current description text (as given in the Description section), which adds "before 1.9.5" to Versions Affected and is otherwise unchanged.

Tue, 26 May 2026 12:15:00 +0000

Type	Values Removed	Values Added
Weaknesses		CWE-770
References		https://nvd.nist.gov/vuln/detail/CVE-2026-42440 https://www.cve.org/CVERecord?id=CVE-2026-42440
Metrics	threat_severity `None`	threat_severity `Important`

Wed, 06 May 2026 18:15:00 +0000

Type	Values Removed	Values Added
CPEs		cpe:2.3:a:apache:opennlp:::::::: cpe:2.3:a:apache:opennlp:3.0.0:m1:::::: cpe:2.3:a:apache:opennlp:3.0.0:m2::::::

Tue, 05 May 2026 16:15:00 +0000

Type	Values Removed	Values Added
Metrics		cvssV3_1 `{'score': 7.5, 'vector': 'CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H'}` ssvc `{'options': {'Automatable': 'yes', 'Exploitation': 'none', 'Technical Impact': 'partial'}, 'version': '2.0.3'}`

Mon, 04 May 2026 19:30:00 +0000

Type	Values Removed	Values Added
First Time appeared		Apache Apache opennlp
Vendors & Products		Apache Apache opennlp

Mon, 04 May 2026 18:30:00 +0000

Type	Values Removed	Values Added
References		http://www.openwall.com/lists/oss-security/2026/05/01/21

Mon, 04 May 2026 17:15:00 +0000

Type	Values Removed	Values Added
Description		Initial description text: identical to the text in the Description section, except that Versions Affected lists only "before 2.5.9" and "before 3.0.0-M3".
Title		Apache OpenNLP: OOM DoS via Unbounded Array Allocation in AbstractModelReader
Weaknesses		CWE-789
References		https://lists.apache.org/thread/s8xlkx1gqbxfsq48py5h6jphjvgqp1jo

MITRE

Status: PUBLISHED

Assigner: apache

Published: 2026-05-04T16:40:32.503Z

Updated: 2026-07-15T00:57:36.565Z

Reserved: 2026-04-27T12:43:14.347Z

Link: CVE-2026-42440

Vulnrichment

Updated: 2026-05-04T17:37:00.275Z

NVD

Status: Analyzed

Published: 2026-05-04T17:16:26.147

Modified: 2026-06-17T10:47:50.823

Link: CVE-2026-42440

Redhat

Severity: Important

Public Date: 2026-05-04T16:40:32Z

Links: CVE-2026-42440 - Bugzilla

OpenCVE Enrichment

Updated: 2026-06-29T21:30:03Z

Weaknesses

CWE-770
Allocation of Resources Without Limits or Throttling
CWE-789
Memory Allocation with Excessive Size Value
