Get Unicode Character Value in JavaScript
Here are several ways to get the Unicode value of a character in JavaScript.
1. Using charCodeAt() to Get Unicode Values
This code defines a string s containing the character "A". It then uses the charCodeAt() method to get the UTF-16 code unit of the first character (index 0) and logs the result, which is 65. For "A" the code unit is the same as its Unicode code point.
let s = "A";
let uniCode = s.charCodeAt(0);
console.log(uniCode);
Output
65
In this example
- The variable s is assigned the string value "A", which is a single character.
- The method charCodeAt(0) is called on the string s. It retrieves the value of the character at index 0, which is "A" in this case.
- The value of "A" (which is 65) is stored in uniCode and logged to the console using console.log().
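The following is a small sketch of two edge cases of charCodeAt() that the example above does not show: the index defaults to 0 when omitted, and an index past the end of the string yields NaN. The variable names defaultIndex and outOfRange are only illustrative.

```javascript
// Sketch: charCodeAt() with an omitted index and with an out-of-range index.
const s = "A";

const defaultIndex = s.charCodeAt();  // index defaults to 0
const outOfRange = s.charCodeAt(5);   // there is no character at index 5

console.log(defaultIndex);              // 65
console.log(Number.isNaN(outOfRange));  // true
```

Checking for NaN before using the result is a reasonable guard when the index comes from user input.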
2. Using charCodeAt() to Work with Multiple Characters
This code iterates through the string "Hello" and for each character, it prints both the character itself and its Unicode value. The charCodeAt() method is used to retrieve the Unicode value of each character.
let s = "Hello";
for (let i = 0; i < s.length; i++) {
    // Print each character alongside its UTF-16 code unit
    console.log(`Character: ${s[i]}, Unicode: ${s.charCodeAt(i)}`);
}
Output
Character: H, Unicode: 72
Character: e, Unicode: 101
Character: l, Unicode: 108
Character: l, Unicode: 108
Character: o, Unicode: 111
In this example
- The for loop iterates through each character in the string s ("Hello"). The loop runs from index 0 to s.length - 1, so it processes every character in the string.
- Inside the loop, s[i] retrieves the character at the current index i. The method charCodeAt(i) is then used to get the Unicode value of the character at the same index.
- For each character, the code logs a string containing both the character and its Unicode value in the format: Character: <char>, Unicode: <value>.
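Unicode values are often written in hexadecimal as U+XXXX. As an illustration, the loop above could be adapted to print that notation. The helper name toUPlus is hypothetical and not part of any JavaScript API.

```javascript
// Hypothetical helper: format a UTF-16 code unit as U+XXXX (uppercase hex, at least 4 digits).
function toUPlus(codeUnit) {
  return "U+" + codeUnit.toString(16).toUpperCase().padStart(4, "0");
}

const word = "Hello";
const labels = [];
for (let i = 0; i < word.length; i++) {
  labels.push(`${word[i]}: ${toUPlus(word.charCodeAt(i))}`);
}

console.log(labels.join(", "));
// H: U+0048, e: U+0065, l: U+006C, l: U+006C, o: U+006F
```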
3. Using fromCharCode() to Get a Character from a Unicode Value
This code converts a given Unicode value (65) into its corresponding character using the String.fromCharCode() method. It then logs the character ("A") to the console.
let uniCode = 65;
let char = String.fromCharCode(uniCode);
console.log(char);
Output
A
In this example
- The variable uniCode is set to 65, which is the Unicode value for the character "A".
- The method String.fromCharCode(uniCode) is used to convert the Unicode value (65) into the corresponding character, which in this case is "A".
- The result of the conversion is stored in the char variable and logged to the console, outputting "A".
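String.fromCharCode() also accepts several values at once and joins the resulting characters into one string. A minimal sketch, using an illustrative codes array:

```javascript
// Sketch: build a string from several UTF-16 code units in one call.
const codes = [72, 101, 108, 108, 111];
const word = String.fromCharCode(...codes);

console.log(word); // Hello
```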
4. Using codePointAt()
This code retrieves the Unicode code point of the emoji "😊" using the codePointAt() method. It then logs the code point (128522, or U+1F60A) to the console.
let emoji = "😊";
let uniCode = emoji.codePointAt(0);
console.log(uniCode);
Output
128522
In this example
- The variable emoji is assigned a string containing the emoji "😊".
- The method codePointAt(0) returns the full Unicode code point that starts at index 0. Unlike charCodeAt(), it handles characters such as emojis whose code point is greater than 65535 and which are stored as a surrogate pair.
- The resulting code point (128522) is stored in the uniCode variable and logged to the console.
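To see why codePointAt() matters for such characters, the sketch below compares it with charCodeAt() on the same emoji. The string is written with the escape \u{1F60A} only so that the snippet survives copy and paste.

```javascript
// Sketch: one emoji occupies two UTF-16 code units (a surrogate pair).
const emoji = "\u{1F60A}"; // smiling face, code point 128522

console.log(emoji.length);          // 2
console.log(emoji.charCodeAt(0));   // 55357 (high surrogate, 0xD83D)
console.log(emoji.charCodeAt(1));   // 56842 (low surrogate, 0xDE0A)
console.log(emoji.codePointAt(0));  // 128522
console.log(emoji.codePointAt(1));  // 56842 (index 1 is in the middle of the pair)
```

Calling codePointAt() at an index that points into the middle of a pair returns only the low surrogate, so the index still has to be chosen with care.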
5. Using the TextEncoder Class
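The TextEncoder class converts a string to its corresponding UTF-8 bytes. For ASCII characters (code points 0 to 127) the single byte produced equals the Unicode value; other characters are encoded as two to four bytes that do not equal the code point.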
const encoder = new TextEncoder();
const encoded = encoder.encode('A');
console.log(encoded[0]);
Output
65
In this example
- The TextEncoder object is part of the Web APIs and converts a string into a sequence of bytes, always encoded as UTF-8. This is useful when you need to work with raw binary data or communicate over networks.
- encoder.encode('A') takes the string 'A' and converts it into a sequence of bytes. The result is a Uint8Array in which each element is one UTF-8 encoded byte of the string.
- encoded[0] accesses the first byte of the encoded sequence. Since 'A' is an ASCII character, UTF-8 encodes it as a single byte equal to its Unicode value, so the first byte in the array is 65.
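For a non-ASCII character the bytes and the code point differ, which the following sketch illustrates with "é" (U+00E9). It assumes an environment where TextEncoder is available as a global, such as a modern browser or a recent Node.js release.

```javascript
// Sketch: UTF-8 bytes are not the same as the Unicode code point for non-ASCII text.
const encoder = new TextEncoder();
const eAcute = "\u00E9"; // é

const bytes = Array.from(encoder.encode(eAcute));

console.log(bytes);                  // [ 195, 169 ]
console.log(eAcute.codePointAt(0));  // 233
```

If the goal is the Unicode value rather than the encoded bytes, codePointAt() is the more direct tool.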
6. Using Array.from() method
Array.from() converts a string into an array of characters, even handling surrogate pairs correctly.
const chars = Array.from('A😊B');
chars.forEach(char => {
    console.log(`${char}: ${char.codePointAt(0)}`);
});
Output
A: 65
😊: 128522
B: 66
In this example
- Array.from('A😊B') converts the string 'A😊B' into an array of individual characters. The emoji, which is stored as a surrogate pair, stays together as one array element.
- forEach() iterates over the array, processing one character at a time.
- codePointAt(0) is called on each character and returns its Unicode code point as an integer.
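The difference from index-based splitting can be checked by comparing lengths. This is a small sketch; the string is written with the escape \u{1F60A} for the emoji.

```javascript
// Sketch: Array.from() splits by code point, split("") splits by UTF-16 code unit.
const text = "A\u{1F60A}B";

console.log(text.length);              // 4 (the emoji counts as two code units)
console.log(text.split("").length);    // 4 (the surrogate pair is broken apart)
console.log(Array.from(text).length);  // 3 (the emoji stays whole)
```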
7. Using String.fromCharCode()
This code converts a Unicode value back to a character. Using String.fromCharCode(), it turns the value 65 into the character 'A'.
const uniCode = 65;
const char = String.fromCharCode(uniCode);
console.log(char);
Output
A
In this example
- The value 65 is the Unicode value of the character 'A'.
- The String.fromCharCode() method converts the given Unicode value (65) into its corresponding character ('A').
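String.fromCharCode() works on 16-bit code units, so a value above 65535 is truncated. The sketch below contrasts it with String.fromCodePoint(), which accepts full code points; the variable names are illustrative.

```javascript
// Sketch: fromCharCode() keeps only the low 16 bits, fromCodePoint() keeps the full value.
const codePoint = 128522; // U+1F60A

const truncated = String.fromCharCode(codePoint);
const full = String.fromCodePoint(codePoint);

console.log(truncated.charCodeAt(0));  // 62986 (128522 modulo 65536)
console.log(full.codePointAt(0));      // 128522
console.log(full.length);              // 2 (stored as a surrogate pair)
```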
8. Using Spread Operator with codePointAt()
This code demonstrates how to handle a string that mixes ordinary characters with one stored as a surrogate pair. It converts each character, including the musical symbol "𝄞" (U+1D11E), into its Unicode code point using codePointAt().
const s = "Hello𝄞World";
const uniCode = [...s].map(char => char.codePointAt(0));
console.log(uniCode);
Output
[ 72, 101, 108, 108, 111, 119070, 87, 111, 114, 108, 100 ]
In this example
- The spread operator [...s] splits the string s into an array of individual characters and keeps surrogate pairs such as 𝄞 together.
- The map() function applies codePointAt(0) to each character, returning its Unicode code point (e.g., 𝄞 becomes 119070).
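As a quick check that no information is lost, the array of code points can be turned back into the original string with String.fromCodePoint(). A minimal sketch, with the symbol written as the escape \u{1D11E}:

```javascript
// Sketch: round trip from string to code points and back.
const original = "Hello\u{1D11E}World";

const codePoints = [...original].map(ch => ch.codePointAt(0));
const rebuilt = String.fromCodePoint(...codePoints);

console.log(codePoints.length);     // 11
console.log(rebuilt === original);  // true
```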