Force encode from US-ASCII to UTF-8 (iconv)


Problem description

I'm trying to transcode a bunch of files from US-ASCII to UTF-8.

For that, I'm using iconv:

iconv -f US-ASCII -t UTF-8 file.php > file-utf8.php

My original files are US-ASCII encoded, so the conversion changes nothing. Apparently this is because ASCII is a subset of UTF-8...

To quote:

There's no need for the textfile to appear otherwise until non-ASCII characters are introduced

True. If I introduce a non-ASCII character in the file and save it, say with Eclipse, the file encoding (charset) is switched to UTF-8.

In my case, I'd like to force iconv to transcode the files to UTF-8 anyway, whether or not they contain non-ASCII characters.

Note: The reason is that my PHP code (non-ASCII files...) is dealing with some non-ASCII strings, which causes the strings not to be interpreted correctly (French):

Il était une fois... l'homme série animée mythique d'Albert

Barillé (Procidis), 1ère

...

  • US ASCII -- is -- a subset of UTF-8 (see Ned's answer below)
  • Meaning that US ASCII files are actually encoded in UTF-8
  • My problem came from somewhere else

Recommended answer

ASCII is a subset of UTF-8, so all ASCII files are already UTF-8 encoded. The bytes in the ASCII file and the bytes that would result from "encoding it to UTF-8" are exactly the same bytes. There's no difference between them, so there's no need to do anything.
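The point about identical bytes can be checked directly. The sketch below is only an illustration: it builds a small pure-ASCII sample file (the file names are made up here, not taken from the question), runs the same iconv command as in the question, and compares input and output with cmp.

```shell
# Create a scratch directory and a small pure-ASCII sample file.
# The file names are hypothetical, chosen only for this illustration.
tmpdir=$(mktemp -d)
printf 'Hello, world\n' > "$tmpdir/ascii.txt"

# Transcode it with the same command form used in the question.
iconv -f US-ASCII -t UTF-8 "$tmpdir/ascii.txt" > "$tmpdir/ascii-utf8.txt"

# cmp -s exits with status 0 only when both files are byte-for-byte identical.
if cmp -s "$tmpdir/ascii.txt" "$tmpdir/ascii-utf8.txt"; then
    result="identical"
else
    result="different"
fi
echo "$result"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

If the two files compare as identical, iconv did run; there was simply nothing for it to change.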

It looks like your problem is that the files are not actually ASCII. You need to determine what encoding they are using, and transcode them properly.
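One possible way to follow that advice is sketched below. It assumes, purely for illustration, that the real encoding turns out to be ISO-8859-1 (a common case for French text, but the question does not confirm it), and it uses a made-up sample file. The helper name is_valid_utf8 is also an invention of this sketch. It counts the bytes outside the 7-bit ASCII range, uses a UTF-8 to UTF-8 pass through iconv as a validity check, and then transcodes from the assumed source encoding.

```shell
# Sketch only: the file names and the ISO-8859-1 guess are assumptions.
tmpdir=$(mktemp -d)

# Build a sample file in which "é" is the single byte 0xE9 (octal 351),
# which is how ISO-8859-1 encodes it.
printf 'Il \351tait une fois\n' > "$tmpdir/sample.php"

# Count bytes outside the 7-bit ASCII range; zero would mean pure ASCII.
non_ascii_count=$(LC_ALL=C tr -d '\000-\177' < "$tmpdir/sample.php" | wc -c)
non_ascii_count=$((non_ascii_count))   # drop padding that some wc versions add

# Hypothetical helper: iconv exits non-zero when its input is not valid in
# the declared source encoding, so a UTF-8 to UTF-8 pass acts as a validator.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

if is_valid_utf8 "$tmpdir/sample.php"; then before="valid"; else before="invalid"; fi

# Transcode from the assumed real encoding to UTF-8.
iconv -f ISO-8859-1 -t UTF-8 "$tmpdir/sample.php" > "$tmpdir/sample-utf8.php"

if is_valid_utf8 "$tmpdir/sample-utf8.php"; then after="valid"; else after="invalid"; fi

echo "non-ASCII bytes: $non_ascii_count, before: $before, after: $after"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

A non-zero byte count means the file is not really ASCII, whatever a tool reports. Guessing the source encoding wrongly still produces valid UTF-8 but with the wrong characters, so the converted text should be checked by eye.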
