mb_detect_encoding does not return the first matching encoding anymore #8279

Closed
@come-nc

Description

The following code:

<?php

$str = '/dav/files/admin/%C3%BC%C3%B6%C3%A4%C3%B6%C3%A4%C3%BC%C3%B6%C3%A4%C3%BB%C5%B7%C3%AE';
$rawstr = rawurldecode($str);

var_dump(
    mb_detect_encoding($rawstr, ['UTF-8', 'ISO-8859-1']),
    mb_detect_encoding($rawstr, ['ISO-8859-1', 'UTF-8']),
    mb_check_encoding($rawstr, 'ISO-8859-1'),
    mb_check_encoding($rawstr, 'UTF-8'),
);

https://3v4l.org/kqHre

Resulted in this output:

string(10) "ISO-8859-1"
string(10) "ISO-8859-1"
bool(true)
bool(true)

But I expected this output instead:

string(5) "UTF-8"
string(10) "ISO-8859-1"
bool(true)
bool(true)

The behavior of mb_detect_encoding seems to have changed in PHP 8.1; it is not clear whether this is intentional.
The documentation of mb_detect_encoding suggests that it returns the first matching encoding, which it does up to and including PHP 8.0.
With 8.1, however, it returns ISO-8859-1 even though mb_check_encoding returns true for both UTF-8 and ISO-8859-1.
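
The "first matching encoding" behavior described above can be approximated without mb_detect_encoding by checking each candidate in order with mb_check_encoding. This is only a minimal sketch: first_valid_encoding is a hypothetical helper name, not part of mbstring, and the string|false return type assumes PHP 8.0 or later.

```php
<?php

/**
 * Hypothetical helper (not part of mbstring): return the first encoding
 * in $candidates for which $str is a valid byte sequence, or false if
 * none of them match.
 */
function first_valid_encoding(string $str, array $candidates): string|false
{
    foreach ($candidates as $encoding) {
        // mb_check_encoding only validates the bytes; it applies no heuristics.
        if (mb_check_encoding($str, $encoding)) {
            return $encoding;
        }
    }
    return false;
}

$str = '/dav/files/admin/%C3%BC%C3%B6%C3%A4%C3%B6%C3%A4%C3%BC%C3%B6%C3%A4%C3%BB%C5%B7%C3%AE';
$rawstr = rawurldecode($str);

var_dump(
    first_valid_encoding($rawstr, ['UTF-8', 'ISO-8859-1']),
    first_valid_encoding($rawstr, ['ISO-8859-1', 'UTF-8']),
);
```

Because mb_check_encoding reports true for both encodings on this string, the result depends only on the order of the candidate list, which matches the expected output listed above.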

PHP Version

8.1

Operating System

No response
