PACKETSTORM

PHP 8.5.7 levenshtein() Signed-Integer Overflow (PACKETSTORM:223963)


Basic Information

ID PACKETSTORM:223963
Published Jun 22, 2026 at 00:00

Affected Product

Affected Versions

# PHP 8.5.7 `levenshtein()` signed-integer overflow

**Author:** Khashayar Fereidani
**Disclosure Date:** 2026-06-18
**Advisory:** https://fereidani.com/php-857-levenshtein-signed-integer-overflow
**Contact:** https://fereidani.com/contact

## Description

The `levenshtein()` function calculates the Levenshtein distance
between two strings, optionally accepting custom costs for insertion,
replacement, and deletion operations. In PHP 8.5.7, the implementation
lacks proper bounds checking for these cost parameters. When
exceptionally large values (such as `PHP_INT_MAX`) are provided, the
arithmetic operations within the `reference_levdist()` function in
`ext/standard/levenshtein.c` result in a signed-integer overflow. This
triggers undefined behavior in C and causes the function to return a
negative distance, which is mathematically invalid.
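
The wraparound can be reproduced outside PHP with a small model of the two-row recurrence. This is a minimal sketch, not the engine source: `model_levdist`, `wrap_add`, and `wrap_mul` are hypothetical names, `zend_long` is assumed to be 64 bits wide, and the overflow is modelled with unsigned arithmetic so that the demo itself has defined behaviour (the wrapped result is what two's-complement builds typically produce).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Wrapping add: unsigned arithmetic is defined to wrap, so this models the
 * usual two's-complement outcome of the overflowing signed add. */
static int64_t wrap_add(int64_t a, int64_t b)
{
    return (int64_t)((uint64_t)a + (uint64_t)b);
}

/* Wrapping multiply of a length by a cost. */
static int64_t wrap_mul(uint64_t n, int64_t cost)
{
    return (int64_t)(n * (uint64_t)cost);
}

/* Hypothetical model of the recurrence described in the advisory. */
static int64_t model_levdist(const char *s1, const char *s2,
                             int64_t cost_ins, int64_t cost_rep,
                             int64_t cost_del)
{
    size_t len1 = strlen(s1), len2 = strlen(s2);
    if (len1 == 0) return wrap_mul(len2, cost_ins);
    if (len2 == 0) return wrap_mul(len1, cost_del);

    int64_t *p1 = malloc((len2 + 1) * sizeof *p1);
    int64_t *p2 = malloc((len2 + 1) * sizeof *p2);
    if (!p1 || !p2) abort();

    /* First row: cost of inserting the first i2 characters of s2. */
    for (size_t i2 = 0; i2 <= len2; i2++) {
        p1[i2] = wrap_mul(i2, cost_ins);
    }

    for (size_t i1 = 0; i1 < len1; i1++) {
        p2[0] = wrap_add(p1[0], cost_del);
        for (size_t i2 = 0; i2 < len2; i2++) {
            int64_t c0 = wrap_add(p1[i2], s1[i1] == s2[i2] ? 0 : cost_rep);
            int64_t c1 = wrap_add(p1[i2 + 1], cost_del); /* may wrap */
            int64_t c2 = wrap_add(p2[i2], cost_ins);     /* may wrap */
            if (c1 < c0) c0 = c1;
            if (c2 < c0) c0 = c2;
            p2[i2 + 1] = c0; /* a wrapped negative value wins the minimum */
        }
        int64_t *tmp = p1; p1 = p2; p2 = tmp;
    }

    int64_t result = p1[len2];
    free(p1);
    free(p2);
    return result;
}
```

With all three costs set to `INT64_MAX`, the model returns -2 for `("a", "b")` and -4 for `("a", "abc")`, matching the values reported for PHP. The wrapped sum is negative, so it always beats the legitimate candidates in the minimum and propagates to the final result.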

## Proof of concept

```php
<?php
/*
 * levenshtein() signed-integer overflow
 * File: ext/standard/levenshtein.c, reference_levdist(), lines 47, 50, 53-58
 *
 * The user-supplied costs (cost_ins / cost_rep / cost_del, all zend_long) are
 * added with NO overflow check, e.g.:
 *   p1[i2] = i2 * cost_ins;       // line 47
 *   p2[0] = p1[0] + cost_del;     // line 50
 *   c1 = p1[i2 + 1] + cost_del;   // line 54 <-- PHP_INT_MAX + PHP_INT_MAX
 *   c2 = p2[i2] + cost_ins;       // line 58
 *
 * Result: signed overflow (undefined behaviour in C) producing a
 * NEGATIVE edit distance, a value that is mathematically impossible.
 */
var_dump(levenshtein('a', 'b', PHP_INT_MAX, PHP_INT_MAX, PHP_INT_MAX));   // int(-2) (should be PHP_INT_MAX)
var_dump(levenshtein('a', 'abc', PHP_INT_MAX, PHP_INT_MAX, PHP_INT_MAX)); // int(-4)
var_dump(levenshtein('a', 'b', PHP_INT_MAX, 0, PHP_INT_MAX));             // int(-2)
echo "All three distances are negative => signed overflow (expected >= 0).\n";
```

## Impact

The primary risk associated with this vulnerability is an application
logic flaw. Applications that rely on the `levenshtein()` function to
determine string similarity or calculate distance metrics might fail
to handle negative returns properly (for instance, treating a negative
number as `< threshold`). This can result in unexpected behavior,
incorrect data processing, or bypasses in business logic. Since it
involves integer overflow producing a negative result rather than a
memory corruption issue, the scope is generally limited to logic
disruption rather than arbitrary code execution.
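
The threshold bypass described above can be sketched in a few lines. The example is written in C for consistency with the engine-level code in this advisory; both function names are hypothetical, and the same comparison applies to a PHP caller that tests the return value of `levenshtein()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive check: any negative distance passes, because -2 < threshold
 * holds for every positive threshold. */
static bool is_similar_naive(int64_t distance, int64_t threshold)
{
    return distance < threshold;
}

/* Guarded check: a negative distance is treated as an invalid result
 * and never counts as a match. */
static bool is_similar_guarded(int64_t distance, int64_t threshold)
{
    return distance >= 0 && distance < threshold;
}
```

With a threshold of 3, the naive check accepts a wrapped distance of -2 as a match, while the guarded check rejects it. Such a guard is only a caller-side mitigation and does not remove the overflow itself.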

## Solution

To address this issue, bounds checking should be implemented either on
the cost parameters at the start of the function or during the
intermediate calculations. Safe arithmetic macros provided by the Zend
Engine can detect an overflow before the addition is performed:

```c
// Example: Adding overflow safeguards in ext/standard/levenshtein.c
if (UNEXPECTED(ZEND_SIGNED_ADD_OVERFLOWS(p1[i2 + 1], cost_del))) {
    php_error_docref(NULL, E_WARNING,
        "Levenshtein distance calculation caused an integer overflow");
    // Handle error, e.g., return -1 or cap
}
```

An alternative, proactive measure is to validate `cost_ins`,
`cost_rep`, and `cost_del` before computing the distance, rejecting
values whose product with the string lengths could exceed
`ZEND_LONG_MAX`.
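
One way to express such an up-front check is sketched below. It is not part of PHP: the function name `levenshtein_costs_are_safe` is hypothetical, `int64_t` stands in for `zend_long`, and the bound is deliberately conservative. With non-negative costs, every table entry is at most `len1 * cost_del + len2 * cost_ins`, and each intermediate sum adds one more cost, so requiring `max_cost * (len1 + len2 + 1) <= INT64_MAX` rules out overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when no intermediate value of the recurrence can overflow
 * a signed 64-bit integer. Negative costs are rejected outright, since
 * they would make the distance bound used here invalid. */
static bool levenshtein_costs_are_safe(size_t len1, size_t len2,
                                       int64_t cost_ins, int64_t cost_rep,
                                       int64_t cost_del)
{
    if (cost_ins < 0 || cost_rep < 0 || cost_del < 0) {
        return false;
    }

    int64_t max_cost = cost_ins;
    if (cost_rep > max_cost) max_cost = cost_rep;
    if (cost_del > max_cost) max_cost = cost_del;
    if (max_cost == 0) {
        return true; /* all costs are zero: every value stays at 0 */
    }

    /* steps = len1 + len2 + 1, guarded against unsigned wraparound. */
    uint64_t a = len1, b = len2;
    if (a > UINT64_MAX - b || a + b == UINT64_MAX) {
        return false;
    }
    uint64_t steps = a + b + 1;

    /* steps * max_cost <= INT64_MAX, tested by division to avoid overflow. */
    return steps <= (uint64_t)INT64_MAX / (uint64_t)max_cost;
}
```

A caller would run this check before entering the main loop and raise an error when it fails. The check can reject some inputs that would not actually overflow, which trades a little generality for a test that costs constant time.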
