[RFC] Add a locale for grapheme case-insensitive functions #18792

Add locale for grapheme_levenshtein function
youkidearitai committed Jun 16, 2025
commit 8f2e5440d22bd194f42822d92c8c536472fba5d1
5 changes: 4 additions & 1 deletion ext/intl/grapheme/grapheme_string.c
@@ -934,6 +934,8 @@ PHP_FUNCTION(grapheme_levenshtein)
zend_long cost_ins = 1;
zend_long cost_rep = 1;
zend_long cost_del = 1;
char *locale = "";
size_t locale_len = 0;

ZEND_PARSE_PARAMETERS_START(2, 5)
Z_PARAM_STR(string1)
@@ -942,6 +944,7 @@ PHP_FUNCTION(grapheme_levenshtein)
Z_PARAM_LONG(cost_ins)
Z_PARAM_LONG(cost_rep)
Z_PARAM_LONG(cost_del)
Z_PARAM_STRING_OR_NULL(locale, locale_len)
ZEND_PARSE_PARAMETERS_END();

if (cost_ins <= 0 || cost_ins > UINT_MAX / 4) {
@@ -1058,7 +1061,7 @@ PHP_FUNCTION(grapheme_levenshtein)
RETVAL_FALSE;
goto out_bi2;
}
- UCollator *collator = ucol_open("", &ustatus);
+ UCollator *collator = ucol_open(locale, &ustatus);
Member

If the locale parameter is passed in as an actual NULL, wouldn't this fail?

Contributor Author

Yes, that's right. I should change it to match the signature in the RFC. Thanks.
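
A minimal sketch of one possible way to guard the call, not the author's actual fix: with `Z_PARAM_STRING_OR_NULL`, passing PHP `null` leaves `locale` as a NULL pointer, whereas the code before this change always passed `""` to `ucol_open()`. ICU's documentation for `ucol_open` describes a NULL locale as selecting the default locale and `""` as selecting the root collator, so the two are not interchangeable. The helper name `locale_or_root` is hypothetical, and the sketch assumes that `null` should keep the previous root-collator behaviour.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): map a NULL locale pointer,
 * as left by Z_PARAM_STRING_OR_NULL when PHP null is passed, to the empty
 * string that was passed to ucol_open() before this change.
 * Assumption: null should keep meaning "root collator".
 */
static const char *locale_or_root(const char *locale)
{
    return locale != NULL ? locale : "";
}

/*
 * Intended call site, shown as a comment because it needs the ICU headers:
 *
 *     UCollator *collator = ucol_open(locale_or_root(locale), &ustatus);
 */
```

Changing the stub to a non-nullable `string $locale = ""`, as the reply above suggests, would make such a guard unnecessary.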

if (U_FAILURE(ustatus)) {
intl_error_set_code(NULL, ustatus);

2 changes: 1 addition & 1 deletion ext/intl/php_intl.stub.php
@@ -447,7 +447,7 @@ function grapheme_stristr(string $haystack, string $needle, bool $beforeNeedle =

function grapheme_str_split(string $string, int $length = 1): array|false {}

- function grapheme_levenshtein(string $string1, string $string2, int $insertion_cost = 1, int $replacement_cost = 1, int $deletion_cost = 1): int|false {}
+ function grapheme_levenshtein(string $string1, string $string2, int $insertion_cost = 1, int $replacement_cost = 1, int $deletion_cost = 1, ?string $locale = null): int|false {}

/** @param int $next */
function grapheme_extract(string $haystack, int $size, int $type = GRAPHEME_EXTR_COUNT, int $offset = 0, &$next = null): string|false {}
3 changes: 2 additions & 1 deletion ext/intl/php_intl_arginfo.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 7 additions & 0 deletions ext/intl/tests/grapheme_levenshtein.phpt
@@ -80,6 +80,10 @@ try {
} catch (ValueError $e) {
echo $e->getMessage() . PHP_EOL;
}

echo '--- Locale string ---' . \PHP_EOL;
var_dump(grapheme_stripos("i", "\u{0130}", 0, "tr_TR"));
var_dump(grapheme_stripos("i", "\u{0130}", 0, "en_US"));
?>
--EXPECTF--
--- Equal ---
@@ -126,3 +130,6 @@ int(0)
grapheme_levenshtein(): Argument #3 ($insertion_cost) must be greater than 0 and less than or equal to %d
grapheme_levenshtein(): Argument #4 ($replacement_cost) must be greater than 0 and less than or equal to %d
grapheme_levenshtein(): Argument #5 ($deletion_cost) must be greater than 0 and less than or equal to %d
--- Locale string ---
int(0)
bool(false)