Convert ISO-8859-1 to UTF-8 using Groovy
Problem description
I need to convert an ISO-8859-1 file to UTF-8 encoding without losing any content information.
I have a file that looks like this:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<HelloEncodingWorld>Üöäüßßß Test!!!</HelloEncodingWorld>
Now I want to encode it as UTF-8. I tried the following:
f=new File('c:/temp/myiso88591.xml').getText('ISO-8859-1')
ts=new String(f.getBytes("UTF-8"), "UTF-8")
g=new File('c:/temp/myutf8.xml').write(ts)
That didn't work due to String incompatibilities. Then I read something about byte stream readers/writers, StreamingMarkupBuilder, and so on.
Then I tried:
f=new File('c:/temp/myiso88591.xml').getText('ISO-8859-1')
mb = new groovy.xml.StreamingMarkupBuilder()
mb.encoding = "UTF-8"
new OutputStreamWriter(new FileOutputStream('c:/temp/myutf8.xml'),'utf-8') << mb.bind {
mkp.xmlDeclaration()
out << f
}
This was not at all what I wanted.
I just want to read the content of an XML file with an ISO-8859-1 reader and then put it into a new (old) file. Why is this so complicated? :-/
The result should simply be a file that is really encoded in UTF-8:
<?xml version="1.0" encoding="UTF-8" ?>
<HelloEncodingWorld>Üöäüßßß Test!!!</HelloEncodingWorld>
Thanks for any answers.
Cheers
Recommended answer
def f=new File('c:/data/myiso88591.xml').getText('ISO-8859-1')
new File('c:/data/myutf8.xml').write(f,'utf-8')
(I just gave it a try, it works :-)
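One thing to watch: the two lines above copy the text unchanged, so the XML declaration in the new file would still say encoding="ISO-8859-1" even though the bytes are now UTF-8. Here is a minimal sketch that also rewrites the declaration. The helper name convertXmlToUtf8 and the temporary files are my own assumptions for illustration, not part of the original answer, and the regex only handles a simple declaration like the one in the question.

```groovy
// Hypothetical helper (not from the original answer): re-encode an XML file
// as UTF-8 and update the encoding named in its XML declaration.
def convertXmlToUtf8(File source, File target, String sourceEncoding = 'ISO-8859-1') {
    // Decode the bytes using the encoding the file was really saved in
    String text = source.getText(sourceEncoding)
    // Replace the encoding value in the declaration (first match only)
    text = text.replaceFirst(/(<\?xml[^>]*encoding=["'])[^"']*/, '$1UTF-8')
    // Encode the characters as UTF-8 when writing
    target.write(text, 'UTF-8')
}

// Temporary files stand in for c:/data/myiso88591.xml and c:/data/myutf8.xml
def src = File.createTempFile('myiso88591', '.xml')
def dst = File.createTempFile('myutf8', '.xml')

// The umlauts are written as Unicode escapes so the sketch does not depend
// on the encoding this script itself is saved in
src.write('<?xml version="1.0" encoding="ISO-8859-1" ?>\n' +
          '<HelloEncodingWorld>\u00DC\u00F6\u00E4\u00FC\u00DF Test!!!</HelloEncodingWorld>',
          'ISO-8859-1')

convertXmlToUtf8(src, dst)
println dst.getText('UTF-8')
```

For anything more complex than a plain declaration, running the file through a real XML parser and serializer is probably the safer route.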
It's the same as in Java: the libraries do the conversion for you. As deceze said: when you specify an encoding while reading, the bytes are decoded into the internal string format (UTF-16, AFAIK). When you specify another encoding while writing the string, it is encoded into that encoding.
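To make that a bit more concrete, here is a small sketch (my own illustration, not from the original thread) showing that a String has no file encoding of its own. Only the bytes produced when you encode it differ.

```groovy
// "Üöäüß" written as Unicode escapes so the result does not depend on
// the encoding this script itself is saved in
String text = '\u00DC\u00F6\u00E4\u00FC\u00DF'

byte[] latin1 = text.getBytes('ISO-8859-1')  // one byte per character for these letters
byte[] utf8   = text.getBytes('UTF-8')       // two bytes per character for these letters

println latin1.length  // 5
println utf8.length    // 10

// Decoding each byte array with its matching charset gives back the same String
assert new String(latin1, 'ISO-8859-1') == text
assert new String(utf8, 'UTF-8') == text

// Encoding as UTF-8 and immediately decoding as UTF-8 changes nothing
assert new String(text.getBytes('UTF-8'), 'UTF-8') == text
```

As far as I can tell, this is also why the first attempt in the question had no effect: the getBytes/new String round trip is a no-op, and write(ts) without a charset argument falls back to the platform default encoding.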
But if you work with XML, you shouldn't have to worry about the encoding anyway, because the XML parser takes care of it. It reads the first characters, <?xml, and determines the basic encoding from them. After that, it can read the encoding information from your XML header and use it.
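A rough sketch of that behaviour, assuming the standard JDK DOM parser (javax.xml.parsers) rather than any particular Groovy XML class: the parser gets raw bytes and works out the encoding from the declaration by itself. The sample bytes are built in memory here purely for illustration.

```groovy
import javax.xml.parsers.DocumentBuilderFactory

// Build ISO-8859-1 bytes in memory; the umlauts are Unicode escapes so the
// sketch does not depend on the encoding this script itself is saved in
String xml = '<?xml version="1.0" encoding="ISO-8859-1" ?>\n' +
             '<HelloEncodingWorld>\u00DC\u00F6\u00E4\u00FC\u00DF Test!!!</HelloEncodingWorld>'
byte[] rawBytes = xml.getBytes('ISO-8859-1')

// Hand the parser an InputStream (bytes), not a Reader, so that the parser
// itself decides how to decode them
def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
def doc = builder.parse(new ByteArrayInputStream(rawBytes))

println doc.xmlEncoding                   // encoding named in the declaration
println doc.documentElement.textContent   // element text, already decoded
```

Note that this only works when the declaration is truthful. If the file says ISO-8859-1 but actually contains UTF-8 bytes, the parser will decode it wrongly.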