Convert ISO-8859-1 to UTF-8 using Groovy


Question

I need to convert an ISO-8859-1 file to UTF-8 encoding without losing any content information.

I have a file which looks like this:

<?xml version="1.0" encoding="ISO-8859-1" ?> 
<HelloEncodingWorld>Üöäüßßß Test!!!</HelloEncodingWorld>

Now I want to encode it as UTF-8. I tried the following:

f=new File('c:/temp/myiso88591.xml').getText('ISO-8859-1')
ts=new String(f.getBytes("UTF-8"), "UTF-8")
g=new File('c:/temp/myutf8.xml').write(ts)

That didn't work due to String incompatibilities. Then I read something about byte stream readers/writers, StreamingMarkupBuilder, and others.

Then I tried:

f=new File('c:/temp/myiso88591.xml').getText('ISO-8859-1')
mb = new groovy.xml.StreamingMarkupBuilder()
mb.encoding = "UTF-8"

new OutputStreamWriter(new FileOutputStream('c:/temp/myutf8.xml'),'utf-8') << mb.bind {
    mkp.xmlDeclaration()
    out << f
}

This was not at all what I wanted.

I just want to read the content of an XML file with an ISO-8859-1 reader and then put it into a new (old) file... why is this so complicated? :-/

The result should simply be the following, with the file really encoded in UTF-8:

<?xml version="1.0" encoding="UTF-8" ?> 
<HelloEncodingWorld>Üöäüßßß Test!!!</HelloEncodingWorld>

Thanks for any answers. Cheers

Recommended answer

def f=new File('c:/data/myiso88591.xml').getText('ISO-8859-1')
new File('c:/data/myutf8.xml').write(f,'utf-8')

(I just gave it a try, it works :-)
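Note that the two lines above re-encode the bytes but leave the XML declaration untouched, so the output would still announce `encoding="ISO-8859-1"`. The following is a minimal sketch of one way to also update the declaration, assuming it is spelled exactly as in the question. The helper name `convertToUtf8` and the temporary files are hypothetical and only there to make the example self-contained.

```groovy
// Hypothetical helper, not part of the original answer
def convertToUtf8(File source, File target) {
    // Decode the source bytes as ISO-8859-1 into a String
    def text = source.getText('ISO-8859-1')
    // Assumption: the declaration contains exactly encoding="ISO-8859-1"
    def updated = text.replaceFirst(/encoding="ISO-8859-1"/, 'encoding="UTF-8"')
    // Encode the String as UTF-8 when writing it out
    target.write(updated, 'UTF-8')
}

// Temporary files stand in for the paths used in the question
def source = File.createTempFile('myiso88591', '.xml')
def target = File.createTempFile('myutf8', '.xml')

// Unicode escapes keep this script independent of its own source encoding
def umlauts = '\u00DC\u00F6\u00E4\u00FC\u00DF'
source.write('<?xml version="1.0" encoding="ISO-8859-1" ?>\n' +
        '<HelloEncodingWorld>' + umlauts + ' Test!!!</HelloEncodingWorld>', 'ISO-8859-1')

convertToUtf8(source, target)

def result = target.getText('UTF-8')
println result
```

A plain string replacement is the simplest option here. For declarations with different quoting or spacing, a more tolerant pattern or a real XML serializer would be a safer choice.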

Same as in Java: the libraries do the conversion for you. As deceze said: when you specify an encoding while reading, the bytes are converted to an internal format (UTF-16, AFAIK). When you specify another encoding when you write the string, it is converted to that encoding.
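To illustrate the point about the internal format, here is a small sketch showing that one in-memory string yields different byte sequences depending on the charset used to encode it. The variable names are illustrative only, and the sample text is a shortened version of the one in the question.

```groovy
import java.nio.charset.StandardCharsets

// The umlauts from the question, written as escapes to avoid source-encoding issues
def text = '\u00DC\u00F6\u00E4\u00FC\u00DF'

// Encoding the same String with two charsets gives different byte sequences
byte[] isoBytes  = text.getBytes(StandardCharsets.ISO_8859_1)
byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8)

// Decoding with the matching charset recovers the original String
def fromIso  = new String(isoBytes, StandardCharsets.ISO_8859_1)
def fromUtf8 = new String(utf8Bytes, StandardCharsets.UTF_8)

println "ISO-8859-1 bytes: ${isoBytes.length}, UTF-8 bytes: ${utf8Bytes.length}"
```

This also suggests why the first attempt in the question had no effect: `new String(f.getBytes("UTF-8"), "UTF-8")` encodes and immediately decodes with the same charset, which returns an equal string, and `write(ts)` without a charset argument falls back to the platform default encoding, as far as I know.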

But if you work with XML, you shouldn't have to worry about the encoding anyway, because the XML parser will take care of it. It reads the first characters, <?xml, and determines the basic encoding from them. After that, it is able to read the encoding information from your XML declaration and use it.
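As a rough sketch of that behaviour, the example here hands the JDK's DOM parser raw ISO-8859-1 bytes, not an already decoded string, so the parser has to pick up the encoding from the declaration itself. The in-memory byte array stands in for the file from the question and is an assumption made to keep the example self-contained.

```groovy
import javax.xml.parsers.DocumentBuilderFactory

// Build the question's document and encode it as ISO-8859-1 bytes
def umlauts = '\u00DC\u00F6\u00E4\u00FC\u00DF'
def xml = '<?xml version="1.0" encoding="ISO-8859-1" ?>\n' +
        '<HelloEncodingWorld>' + umlauts + ' Test!!!</HelloEncodingWorld>'
byte[] isoBytes = xml.getBytes('ISO-8859-1')

// Parse from a byte stream so the parser decodes according to the declaration
def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
def doc = builder.parse(new ByteArrayInputStream(isoBytes))

// The element text comes back as a normal String
def content = doc.documentElement.textContent
println content
```

This only works when the parser receives bytes. If the content is first read into a string with the wrong charset, the declaration can no longer correct the damage.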
