CVE 5.9 MEDIUM

vLLM: Downmix Implementation Differences as Attack Vectors Against Audio AI Models (CVE-2026-34760)

5.9 / 10
MEDIUM
CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L
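
As a cross-check on the score above, the following minimal sketch recomputes the base score from the vector string using the published CVSS v3.1 base-score equations. It only handles Scope: Unchanged, which is what this vector specifies; the function and variable names are illustrative and not taken from any particular CVSS library.

```python
import math

# Metric weights from the CVSS v3.1 base-score equations.
# PR values here are the Scope: Unchanged variants.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    # Drop the "CVSS:3.1" prefix, then map metric name -> value.
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: table[parts[name]] for name, table in WEIGHTS.items()}

    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L")
print(score)
```

With this vector, the impact sub-score is about 4.22 and the exploitability sub-score about 1.62, so the sum of roughly 5.84 rounds up to the listed 5.9.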

Description

vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before version 0.18.0, multi-channel audio is downmixed to mono through Librosa, whose to_mono defaults to numpy.mean (an unweighted average of all channels), whereas the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. This discrepancy means the audio heard by humans (e.g., through headphones or regular speakers) can differ from the audio processed by AI models that run inference on audio loaded via Librosa, such as vLLM and Transformers. This issue has been patched in version 0.18.0.
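
The discrepancy can be illustrated with a small NumPy sketch. It does not reproduce vLLM's code or the 0.18.0 patch. The channel order (L, R, C, LFE, Ls, Rs), the weight vector, and the helper names are assumptions made for illustration; the weights are only in the spirit of a BS.775-style mono downmix and should be checked against the standard before any real use. The unweighted path mirrors the numpy.mean behavior described above.

```python
import numpy as np

# Assumed 5.1 channel order: L, R, C, LFE, Ls, Rs (hypothetical layout).
# Illustrative weights only; the LFE channel is dropped in this sketch.
ASSUMED_WEIGHTS = np.array([0.7071, 0.7071, 1.0, 0.0, 0.5, 0.5])


def mean_downmix(audio: np.ndarray) -> np.ndarray:
    """Unweighted average over channels: (channels, samples) -> (samples,)."""
    return np.mean(audio, axis=0)


def weighted_downmix(audio: np.ndarray,
                     weights: np.ndarray = ASSUMED_WEIGHTS) -> np.ndarray:
    """Weighted sum over channels using the assumed coefficients."""
    return weights @ audio


sr = 16_000
t = np.arange(sr) / sr  # one second of samples
audio = np.zeros((6, sr))
audio[2] = 0.5 * np.sin(2 * np.pi * 440 * t)   # centre: benign content
audio[3] = 0.5 * np.sin(2 * np.pi * 1000 * t)  # LFE slot: hidden content

model_view = mean_downmix(audio)         # what a mean-based pipeline sees
listener_view = weighted_downmix(audio)  # what a weighted downmix produces

# Fraction of the hidden channel that survives in each mono signal,
# measured by projecting the mono signal onto the hidden channel.
hidden = audio[3]
energy = np.dot(hidden, hidden)
leak_model = float(np.dot(model_view, hidden) / energy)
leak_listener = float(np.dot(listener_view, hidden) / energy)

print(f"hidden share, mean downmix:     {leak_model:.4f}")
print(f"hidden share, weighted downmix: {leak_listener:.4f}")
```

Under these assumptions, the mean-based downmix keeps one sixth of the hidden channel while the weighted downmix keeps none of it, so content placed in a channel that a listener's downmix suppresses can still reach the model.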

Basic Information

ID CVE-2026-34760
Source GitHub_M
Published Apr 2, 2026 at 18:59
Modified Apr 3, 2026 at 14:42

Affected Product

Vendor vllm-project
Product vllm
Version >= 0.5.5, < 0.18.0
Affected Versions vllm-project vllm >= 0.5.5, < 0.18.0

CWE Classification

References
