Impact
The flaw affects SentencePiece releases earlier than 0.2.1 and is triggered when the library loads a model file that was not produced through the normal training workflow. An improper bounds check during model file parsing leads to an invalid memory access, which can crash the application or corrupt execution flow. An attacker who can supply a malicious model file may be able to disrupt service availability and, depending on the environment, create an opportunity for more serious compromise.
Affected Systems
Google SentencePiece, all versions earlier than 0.2.1. The vulnerability affects any deployment that loads third-party model files, for example machine-learning pipelines that rely on SentencePiece for tokenization.
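As a rough way to check whether a Python environment falls in the affected range, the sketch below compares a version string against the fixed release, 0.2.1. The helper names (`release_tuple`, `is_affected`, `installed_sentencepiece_is_affected`) are hypothetical and written for this illustration; the sketch assumes the package is installed under the distribution name `sentencepiece` and only looks at the numeric release segment of the version.

```python
import re
from importlib import metadata

# First release that contains the fix, per the advisory text.
FIXED_RELEASE = (0, 2, 1)


def release_tuple(version: str) -> tuple:
    """Extract up to three leading numeric components, padding with zeros.

    Assumption: suffixes such as "rc1" or ".post1" are ignored, so a
    pre-release of 0.2.1 is treated the same as 0.2.1 itself.
    """
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version: str) -> bool:
    """Return True when the version is earlier than the fixed release."""
    return release_tuple(version) < FIXED_RELEASE


def installed_sentencepiece_is_affected():
    """Check the installed package; return None if it is not installed."""
    try:
        return is_affected(metadata.version("sentencepiece"))
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_sentencepiece_is_affected()` in a deployment image gives a quick signal, but it does not cover copies of the library that are vendored or statically linked into other software.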
Risk and Exploitability
The CVSS score of 8.5 places this vulnerability in the high-severity range. An EPSS score below 1% indicates that the likelihood of exploitation is currently very low, and the vulnerability is not listed in the CISA KEV catalog. Exploitation requires the attacker to supply a malformed model file, so the attack vector is inferred to be local, or remote where the application accepts model files from external sources. Validating model files or running SentencePiece in a sandboxed environment reduces exposure. No public exploits are known, so the risk is presently largely theoretical and confined to environments that run vulnerable versions.
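One possible form of model file validation is to load only files whose SHA-256 digest appears on an allowlist of models the operator produced or reviewed. The sketch below is an illustration under that assumption, not a mitigation described by the advisory; `sha256_of_file` and `is_trusted_model` are hypothetical helper names. Note that this does not detect malformed content, it only restricts loading to known files.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large model files are not held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted_model(path: str, allowed_digests) -> bool:
    """Return True only when the file's digest is on the allowlist.

    Assumption: allowed_digests is a collection of lowercase hex digests
    maintained by the operator for models they trained or vetted.
    """
    return sha256_of_file(path) in allowed_digests
```

An application would call `is_trusted_model` before handing the path to the tokenizer and refuse to load anything that fails the check. Upgrading to 0.2.1 or later remains the direct fix.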