Questions tagged [character-encoding]

Questions that deal with the various representations of characters and character sets, such as ASCII, UTF-8, and EBCDIC. Often encountered when moving files between operating systems that encode line endings with carriage returns and/or line feed characters.

2 votes
1 answer
110 views

At work, on Ubuntu 22.04.1, I want to use the apt_auth.conf ability of apt to make it easier to get packages from an Artifactory. I've written my artifactory.conf file into /etc/apt/apt.conf.d this way: ...
Marc Le Bihan
7 votes
1 answer
281 views

There seems to be limited/inconsistent support for unusual but legal characters in zsh (and sh, bash) shell variable names on mac. Is there any way to fix this for full or better support? Perhaps this ...
owengall • 173
3 votes
2 answers
186 views

The Issue: I've been parsing a file with sed, trying to tease out the desired data. This has worked fine for most lines in the file, but there appear to be some embedded special characters that are ...
Gandalf • 33
1 vote
1 answer
121 views

I'm opening an SSH session from Fedora to Raspberry Pi OS. Accented and special characters are replaced with question marks. Preferably I would like to learn to solve this without changing the server'...
Cutter • 71
2 votes
1 answer
114 views

Today I connected to a long-running process in tmux over ssh for work, to find that the pane the process was running in seems to have started using the wrong character encoding for its output, leading ...
Patronics • 125
6 votes
1 answer
399 views

I have a file СМП бваг™вга† The first three letters are proper Cyrillic and the remaining part is mojibake. "Mojibake is the garbled or gibberish text that is the result of text being decoded ...
jsx97 • 1,387
0 votes
1 answer
94 views

It is my understanding that the LANG and LC_CTYPE environment variables define the encoding used by shell commands when writing to stdout. However, after executing LANG=de_DE.iso88591 LC_CTYPE=de_DE....
userAcgJllhSe
0 votes
2 answers
270 views

Is it advisable to have a Byte Order Mark (BOM) in UTF-8 text files on Linux, or not? Is it correct to say that byte order (even for multi-byte characters) is already strictly defined/fixed in the UTF-8 standard? ...
strider • 113
0 votes
0 answers
114 views

Looking for an advanced CLI tool/code to determine a text's code page/language (besides enca). Goal: automate as much as possible the conversion of hundreds/thousands of 8-bit text files (including non-ASCII ...
strider • 113
0 votes
0 answers
35 views

I'm a bit confused about a behavior of the file -i command. I searched for a while and gave up, since I don't have sufficient knowledge of encodings or of the Linux file command (to stay concise ...
ollie314 • 101
-2 votes
1 answer
91 views

Wrong encoding: 1 00:01:27,879 --> 00:01:31,216 No i dupa. Koniec z darmowym wi-fi. 2 00:01:33,009 --> 00:01:34,972 - Ki-jung! - No? 3 00:01:35,219 --> 00:01:39,183 Kobieta z góry ...
jirafey
2 votes
1 answer
180 views

I have a huge number of files spread across a large directory structure that have hidden control characters in their names. ls lists them as, e.g.: '614.7-4-F1-00-090-007-RozvadØ'$'\302\237'' RP1-...
atapaka • 675
1 vote
1 answer
166 views

I'd like to use my old VT420 terminal as system console. Adding RS232 ports and setting up serial-getty are not a problem, but: For years, almost all Linux distros have been using UTF-8 as the ...
Neppomuk • 364
11 votes
3 answers
2k views

I would like to include a couple of non-ASCII characters in my POSIX shell script comments. Note this is in no way a duplicate of e.g. "Which character encodings are supported by posix?" as ...
Vlastimil Burián
0 votes
1 answer
164 views

Sorry if this is a repeat or basic question, but it is hard to search for a ™. I'm writing a script to remove weird characters from file names. How come the trademark symbol ™ matches [^a-z]? $ ...
codywohlers
