Best way to convert text files between character sets?

Question

What is the fastest, easiest tool or method to convert text files between character sets?

Specifically, I need to convert from UTF-8 to ISO-8859-15 and vice versa.

Anything goes: one-liners in your favorite scripting language, command-line tools or other utilities for your OS, web sites, etc.

On Linux/UNIX/OS X/cygwin:

  • GNU iconv, suggested by Troels Arvin, is best used as a filter. It seems to be universally available. Example:

$ iconv -f UTF-8 -t ISO-8859-15 in.txt > out.txt

As Ben pointed out, there is also an online converter that uses iconv.
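Since the question also invites scripting-language answers, here is a minimal Python sketch of the same UTF-8 to ISO-8859-15 conversion. The function names `reencode` and `convert_file` are made up for illustration, and the sketch reads the whole file into memory, which is fine for ordinary text files but not for very large ones.

```python
def reencode(data: bytes, from_enc: str = "utf-8", to_enc: str = "iso-8859-15") -> bytes:
    """Decode bytes from one charset and encode them in another.

    Raises UnicodeDecodeError / UnicodeEncodeError if the input is not valid
    in from_enc or contains characters that to_enc cannot represent.
    """
    return data.decode(from_enc).encode(to_enc)


def convert_file(src_path: str, dst_path: str,
                 from_enc: str = "utf-8", to_enc: str = "iso-8859-15") -> None:
    """Rough equivalent of: iconv -f FROM -t TO src_path > dst_path"""
    # Binary mode keeps line endings exactly as they are in the source file.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(reencode(data, from_enc, to_enc))
```

Swapping the two encoding arguments gives the reverse direction, e.g. `convert_file("in.txt", "out.txt", "iso-8859-15", "utf-8")`.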

  • GNU recode (manual), suggested by Cheekysoft, will convert one or several files in place. Example:

$ recode UTF8..ISO-8859-15 in.txt

This one uses shorter aliases:

$ recode utf8..l9 in.txt
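For comparison, in-place conversion of several files can be sketched in Python as below. This is not how recode itself is implemented; `recode_in_place` is a hypothetical helper, and writing to a temporary file before replacing the original is a design choice made here so that a failed conversion leaves the original untouched.

```python
import os
import shutil
import tempfile


def recode_in_place(paths, from_enc="utf-8", to_enc="iso-8859-15"):
    """Re-encode each file in paths, overwriting it with the converted bytes."""
    for path in paths:
        with open(path, "rb") as f:
            # Any decode/encode error is raised here, before anything is written.
            converted = f.read().decode(from_enc).encode(to_enc)

        # Write next to the original so os.replace stays on one filesystem.
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as out:
                out.write(converted)
            shutil.copymode(path, tmp_path)  # keep the original permission bits
            os.replace(tmp_path, path)       # swap the new file into place
        except BaseException:
            os.unlink(tmp_path)
            raise
```

Called as `recode_in_place(["a.txt", "b.txt"])`, it plays the role of `recode utf8..l9 a.txt b.txt` for the plain charset case, without any of recode's surfaces.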

Recode also supports surfaces, which can be used to convert between different line ending types and encodings:

Convert newlines from LF (Unix) to CR-LF (DOS):

$ recode ../CR-LF in.txt

Base64-encode a file:

$ recode ../Base64 in.txt

You can also combine them.

Convert a Base64-encoded UTF-8 file with Unix line endings to a Base64-encoded Latin-1 file with DOS line endings:

$ recode utf8/Base64..l1/CR-LF/Base64 file.txt
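To make the layering in that last command explicit, here is a rough Python sketch of the same chain of steps: strip the Base64 surface, change the charset, change the line endings, and reapply Base64. The name `utf8_b64_to_latin1_crlf_b64` is invented for this example, and the exact line wrapping of the Base64 output may differ from what recode produces.

```python
import base64


def utf8_b64_to_latin1_crlf_b64(data: bytes) -> bytes:
    """Base64(UTF-8 text, LF endings) -> Base64(Latin-1 text, CR-LF endings)."""
    # 1. Remove the Base64 surface to get the raw UTF-8 bytes, then decode them.
    text = base64.b64decode(data).decode("utf-8")
    # 2. Normalise to LF first so any existing CR-LF pairs are not doubled,
    #    then switch every line ending to CR-LF.
    text = text.replace("\r\n", "\n").replace("\n", "\r\n")
    # 3. Encode as Latin-1 (ISO-8859-1) and reapply Base64, wrapped in lines.
    return base64.encodebytes(text.encode("latin-1"))
```

Characters that Latin-1 cannot represent make step 3 raise `UnicodeEncodeError` instead of being silently replaced.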

On Windows with PowerShell (Jay Bazuzi):

  • PS C:\> gc -en utf8 in.txt | Out-File -en ascii out.txt

(No ISO-8859-15 support though; it says that supported charsets are unicode, utf7, utf8, utf32, ascii, bigendianunicode, default, and oem.)

Do you mean ISO-8859-1 support? Using the "String" encoding does this, e.g. for the reverse direction:

gc -en string in.txt | Out-File -en utf8 out.txt

Note: The possible enumeration values are "Unknown, String, Unicode, Byte, BigEndianUnicode, UTF8, UTF7, Ascii".
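The exchange above turns on ISO-8859-1 versus ISO-8859-15, which are close but not interchangeable. The Python sketch below lists the byte values where the two charsets disagree. It says nothing about what PowerShell's "String" encoding does; it only illustrates why the distinction matters for the original question. `latin1_vs_latin9` is a name made up for this example.

```python
def latin1_vs_latin9():
    """Map each byte value that decodes differently to (Latin-1 char, Latin-9 char)."""
    diffs = {}
    for value in range(256):
        raw = bytes([value])
        as_latin1 = raw.decode("iso-8859-1")
        as_latin9 = raw.decode("iso-8859-15")
        if as_latin1 != as_latin9:
            diffs[value] = (as_latin1, as_latin9)
    return diffs


if __name__ == "__main__":
    for value, (l1, l9) in sorted(latin1_vs_latin9().items()):
        print(f"0x{value:02X}: ISO-8859-1 {l1!r}  ISO-8859-15 {l9!r}")
```

Only eight byte values differ, the best known being 0xA4, which is the currency sign in ISO-8859-1 and the euro sign in ISO-8859-15. Text that avoids those eight positions converts identically under either charset.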

  • CsCvt - Kalytta's Character Set Converter is another great command line based conversion tool for Windows.

Recommended answer

Stand-alone utility approach

iconv -f ISO-8859-1 -t UTF-8 in.txt > out.txt

-f ENCODING  the encoding of the input
-t ENCODING  the encoding of the output

You don't have to specify either of these arguments. Each one defaults to the encoding of your current locale, which is usually UTF-8.
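The same "fall back to the locale" behaviour can be imitated in a script. The Python sketch below is only an analogy, not a description of iconv's internals: `resolve_encodings` and `iconv_like` are hypothetical names, and `locale.getpreferredencoding(False)` is used here as a stand-in for "the encoding of the current locale".

```python
import locale


def resolve_encodings(from_enc=None, to_enc=None):
    """Fill in whichever encoding was omitted with the locale's preferred one."""
    default = locale.getpreferredencoding(False)
    return (from_enc or default, to_enc or default)


def iconv_like(data: bytes, from_enc=None, to_enc=None) -> bytes:
    """Re-encode data, treating a missing -f or -t as the locale encoding."""
    src, dst = resolve_encodings(from_enc, to_enc)
    return data.decode(src).encode(dst)
```

Because the default depends on the machine the script runs on, passing both encodings explicitly is the safer habit for anything that has to behave the same everywhere.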
