Unicode Transcriptions

Script Name Text Image

Arabic (Arabic) يونِكود

Arabic (Persian) یونی‌کُد

Armenian Յունիկօդ

Bengali য়ূনিকোড

Bopomofo ㄊㄨㄥ˅ ㄧˋ ㄇㄚ˅

ㄨㄢˋ ㄍㄨㄛˊ ㄇㄚ˅

Canadian Aboriginal ᔫᗂᑰᑦ

Cherokee ᏳᏂᎪᏛ

Cyrillic (Russian) Юникод

Deseret (English) ???????

Devanagari (Hindi) यूनिकोड

Ethiopic ዩኒኮድ

Esperanto Unikodo

Georgian უნიკოდი

Greek Γιούνικοντ

Gujarati યૂનિકોડ

Gurmukhi ਯੂਨਿਕੋਡ

Han (Chinese) 统一码

統一碼

万国码

萬國碼

Hangul 유니코드

Hebrew יוניקוד

Hebrew (pointed) יוּנִיקוׁד

Hebrew (Yiddish) יוניקאָד

Hiragana (Japanese) ゆにこおど

Katakana (Japanese) ユニコード

Kannada ಯೂನಿಕೋಡ್

Khmer យូនីគោដ

Latin Unicode Unicode

Latin (IPA) ˈjunɪˌkoːd

Latin (Am. Dict.) Ūnĭcōde̽

Malayalam യൂനികോഡ്

Ogham ᚔᚒᚅᚔᚉᚑᚇ

Oriya ୟୂନିକୋଡ

Runic (Anglo-Saxon) ᛡᚢᚾᛁᚳᚩᛞ

Sinhala යණනිකෞද්

Syriac ܝܘܢܝܩܘܕ

Tamil யூனிகோட்

Telugu యూనికోడ్

Thai ยูนืโคด

Tibetan (Dzongkha) ཨུ་ནི་ཀོཌྲ།

Notes:

There are different ways to transcribe the word “Unicode”, depending on the language and script. Some scripts are customarily used by only one language; others are used by many. The goal here is to collect at least one transcription for each script, in a language customarily written in that script, and to add more languages where possible. If the transcription is the same for multiple languages in a script, a single representative language is used.

Still missing are transcriptions for Braille, Buginese, Buhid, Coptic, Cypriot, Esperanto, Glagolitic, Gothic, Hanunoo, Kannada, Kharoshthi, Lao, Limbu, Linear B, Mongolian, Myanmar, Old Italic, Old Persian, Osmanya, New Tai Lue, Shavian, Syloti Nagri, Tagalog, Tagbanwa, Tai Le, Thaana, Tifinagh, Ugaritic, and Yi. We would welcome information for these, and corrections to the above. Please follow the directions below and send them using the Unicode reporting form.

Supplying Missing Items

Most Latin-script languages will keep the spelling and change the pronunciation. For any that would not, it would be good to have the alternate spelling.
For non-Latin scripts the goal is to match the English pronunciation, not the spelling. The IPA above (a phonemic transcription) should be matched as closely as possible without sounding affected in the target language.
Text is best supplied either as UTF-8 text or as code points in hexadecimal HTML numeric character references, for example either of the following:

"Юникод"
"&#x042E;&#x043D;&#x0438;&#x043A;&#x043E;&#x0434;"
Note: for supplementary characters, there should be one hex number per code point, not two surrogates:

&#x10000; not &#xD800;&#xDC00;

Please include a GIF image. It should be 96 x 24 pixels, with the text centered, in black on white (plus grays if smoothed).
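
As an illustration of the hex HTML form requested above, the following is a minimal Python sketch, not part of the original submission directions. The function name `to_hex_ncr` is hypothetical. It relies on the fact that iterating over a Python 3 string yields whole code points, so a supplementary character produces one hex number rather than two surrogates.

```python
import html


def to_hex_ncr(text: str) -> str:
    """Return text as HTML hex numeric character references.

    One reference is emitted per code point. Python 3 strings are
    sequences of code points, so supplementary characters are never
    split into a surrogate pair here.
    """
    return "".join(f"&#x{ord(ch):04X};" for ch in text)


if __name__ == "__main__":
    cyrillic = "Юникод"
    encoded = to_hex_ncr(cyrillic)
    print(encoded)  # &#x042E;&#x043D;&#x0438;&#x043A;&#x043E;&#x0434;

    # U+10000 is a supplementary character: a single reference results.
    print(to_hex_ncr("\U00010000"))  # &#x10000;

    # Round trip: decoding the references gives back the original text.
    assert html.unescape(encoded) == cyrillic
```

The four-digit minimum padding simply mirrors the style of the example above; it is a presentation choice, and shorter or longer hex numbers are equally valid references.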

Other Comments

Because some browsers won't handle the text, both text and GIF image are supplied. If you can’t read the text columns, see Display Problems.
The Chinese versions (including Bopomofo) are translations, not transcriptions.
There are other "translations" of Unicode that may be in use, such as the Vietnamese "Thống Nhất Mã".
For sample pages in different languages on the Unicode site, see What is Unicode?
Americans are not generally used to IPA, and find a variety of different systems in their dictionaries. The system used in the Latin (Am. Dict.) row leaves the base letters as they are, and uses diacritics to indicate pronunciation.

Contributions

This page was originally produced by Mark Davis, drawing on contributions or comments from: Dixon Au, Joe Becker, Maurice Bauhahn, Abel Cheung, Peter Constable, Michael Everson, Christopher John Fynn, Michael Kaplan, George Kiraz, Abdul Malik, Siva Nataraja, Roozbeh Pournader, Jonathan Rosenne, Jungshik Shin.
