Unmarshal an ISO-8859-1 XML input in Go

Problem description

When your XML input isn't encoded in UTF-8, the Unmarshal function of the xml package seems to require a CharsetReader.

Where do you find such a thing?

Solution

Expanding on @anschel-schaffer-cohen's suggestion and @mjibson's comment, using the go-charset package as mentioned above allows you to use these three lines

decoder := xml.NewDecoder(reader)
decoder.CharsetReader = charset.NewReader
err = decoder.Decode(&parsed)

to achieve the required result. Just remember to let charset know where its data files are by setting

charset.CharsetDir = ".../src/code.google.com/p/go-charset/datafiles"

at some point when the app starts up.

EDIT

Instead of the above (charset.CharsetDir = etc.), it's more sensible to just import the data files. They are treated as an embedded resource:

import (
    "code.google.com/p/go-charset/charset"
    _ "code.google.com/p/go-charset/data"
    ...
)

go install will just do its thing, and this also avoids the deployment headache (where/how do I get data files relative to the executing app?).

Using import with an underscore (a blank import) just runs the package's init() func, which loads the required stuff into memory.
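The same blank-import pattern exists in the standard library, which makes it easy to try without go-charset installed. As an analogy only (this says nothing about go-charset's internals), crypto/sha256 registers itself with the crypto package from its init(), so importing it for side effects is enough to make the hash available; sha256Registered is a hypothetical helper name.

```go
package main

import (
	"crypto"
	_ "crypto/sha256" // blank import: only the package's init() side effect is wanted
	"fmt"
)

// sha256Registered reports whether crypto/sha256's init() has registered
// the hash with the crypto package.
func sha256Registered() bool {
	return crypto.SHA256.Available()
}

func main() {
	fmt.Println("SHA-256 registered:", sha256Registered())
}
```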
