Force encode from US-ASCII to UTF-8 (iconv)
Question
I'm trying to transcode a bunch of files from US-ASCII to UTF-8.
For that, I'm using iconv:
iconv -f US-ASCII -t UTF-8 file.php > file-utf8.php
My original files are US-ASCII encoded, so the conversion has no visible effect. Apparently that is because ASCII is a subset of UTF-8...
There's no need for the text file to appear otherwise until non-ASCII characters are introduced
True. If I introduce a non-ASCII character in the file and save it, say with Eclipse, the file encoding (charset) is switched to UTF-8.
In my case, I'd like to force iconv to transcode the files to UTF-8 anyway, whether or not they contain non-ASCII characters.
Note: The reason is that my PHP code (non-ASCII files...) deals with some non-ASCII strings, and those strings are not interpreted correctly (French):
Il était une fois... l'homme série animée mythique d'Albert Barillé (Procidis), 1ère
...
- US-ASCII is a subset of UTF-8 (see Ned's answer below), meaning that US-ASCII files are actually encoded in UTF-8
- My problem came from somewhere else
Recommended answer
ASCII is a subset of UTF-8, so all ASCII files are already UTF-8 encoded. The bytes in the ASCII file and the bytes that would result from "encoding it to UTF-8" would be exactly the same bytes. There's no difference between them, so there's no need to do anything.
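The claim that the bytes come out identical can be checked directly. The following is a minimal sketch, assuming a POSIX shell with iconv, mktemp and cmp available; the file names ascii.txt and utf8.txt are made up for illustration.

```shell
# Create a pure-ASCII sample file in a scratch directory (hypothetical names).
tmpdir=$(mktemp -d)
printf 'Hello, world\n' > "$tmpdir/ascii.txt"

# "Convert" it from US-ASCII to UTF-8, as in the question.
iconv -f US-ASCII -t UTF-8 "$tmpdir/ascii.txt" > "$tmpdir/utf8.txt"

# cmp -s is silent and exits 0 only when both files are byte-for-byte identical.
if cmp -s "$tmpdir/ascii.txt" "$tmpdir/utf8.txt"; then
    same=yes
else
    same=no
fi
echo "identical: $same"
```

If the input really is pure ASCII, the two files should compare equal, which is why tools keep reporting the converted file as US-ASCII.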
It looks like your problem is that the files are not actually ASCII. You need to determine what encoding they are using, and transcode them properly.
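As a rough sketch of that workflow: the example below assumes the files turn out to be ISO-8859-1, which is only a guess for illustration (the real source encoding has to be determined for the actual files). It builds a small ISO-8859-1 sample, asks file(1) for a hint, then transcodes with iconv using the detected source encoding instead of US-ASCII. The file names are hypothetical.

```shell
tmpdir=$(mktemp -d)

# Write "été" in ISO-8859-1: byte 0xE9 (octal 351) is "é" in that encoding.
printf '\351t\351\n' > "$tmpdir/latin1.txt"

# file(1) only guesses from the byte content, so treat its answer as a hint.
file "$tmpdir/latin1.txt" || true

# Transcode from the (assumed) real source encoding to UTF-8.
iconv -f ISO-8859-1 -t UTF-8 "$tmpdir/latin1.txt" > "$tmpdir/utf8.txt"

# Dump the result as hex: "é" should now be the two-byte sequence c3 a9.
hex=$(od -An -tx1 "$tmpdir/utf8.txt" | tr -d ' \n')
echo "$hex"
```

Each 0xE9 byte becomes c3 a9 in the output, so here the conversion genuinely changes the file, unlike the US-ASCII case.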