Any-Latin; Latin-ASCII in replace_non_ascii() #71

Open
dustinstoltz wants to merge 1 commit into
trinker:master from
dustinstoltz:fix-replace-non-ascii-transliteration
@dustinstoltz

… scripts

Fixes #64

Previously, `replace_non_ascii()` used `stri_trans_general(x, 'latin-ascii')`, which only transliterates Latin-script characters. Text in non-Latin scripts (Cyrillic, CJK, Devanagari, etc.) was either left as byte sequences or stripped entirely by `remove.nonconverted`.

The function now uses `'Any-Latin; Latin-ASCII'`, which first transliterates any script to Latin and then Latin to ASCII. This is backwards compatible, since `Any-Latin` is a no-op for input that is already Latin.
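
A minimal sketch of the difference between the two transliterator IDs, calling `stringi::stri_trans_general()` directly rather than going through `replace_non_ascii()`. The sample strings and the variable names `x`, `old_out`, and `new_out` are illustrative assumptions, not taken from the package or its tests:

```r
library(stringi)

# Illustrative inputs (assumed for this sketch), written with \u escapes
# so the script does not depend on the source file's encoding:
#   "café" (Latin), "Москва" (Cyrillic), "北京" (CJK)
x <- c("caf\u00e9",
       "\u041c\u043e\u0441\u043a\u0432\u0430",
       "\u5317\u4eac")

# Old transliterator ID: only Latin-script characters are mapped to ASCII.
old_out <- stri_trans_general(x, "latin-ascii")

# New compound ID: any script -> Latin first, then Latin -> ASCII.
new_out <- stri_trans_general(x, "Any-Latin; Latin-ASCII")

# Check, element by element, which results are pure ASCII.
stri_enc_isascii(old_out)
stri_enc_isascii(new_out)
```

Comparing the two `stri_enc_isascii()` results shows which elements each ID leaves non-ASCII. The first element, already Latin, should come out the same under both IDs, which is the backwards-compatibility point made above.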