XML encoding issue

Question

I want to know whether there is a quick way to check that an XML document is correctly encoded in UTF-8 and does not contain any characters that are not allowed in a UTF-8 encoded XML document.

<?xml version="1.0" encoding="utf-8"?>

Thanks in advance, George

EDIT 1: Here is the content of my XML file, in both text form and binary form.

http://tinypic.com/view.php?pic=2r2akvr&s=5

I tried checking the file with tools such as xmlstarlet. The verdict is correct (the file is invalid because it contains data outside the valid UTF-8 range), but the error message looks wrong: in the file content linked above, there is no character whose value is 0xDFDD. Any ideas?

BTW: I can send the XML file to anyone, but I did not find a way to upload it here as an attachment. If anyone needs the file for analysis, please feel free to let me know.

D:\xmlstarlet-1.0.1-win32\xmlstarlet-1.0.1>xml val a.xml
a.xml:2: parser error : Char 0xDFDD out of allowed range
<URL>student=1砜濏磦</URL>
              ^
a.xml:2: parser error : Char 0xDFDD out of allowed range
<URL>student=1砜濏磦</URL>
              ^
a.xml:2: parser error : internal error
<URL>student=1砜濏磦</URL>
              ^
a.xml:2: parser error : Extra content at the end of the document
<URL>student=1砜濏磦</URL>
              ^
a.xml - invalid

EDIT 2: I also tried the libxml tool to check the validity of the XML file, but it fails with an error on startup. Here is a screenshot. Any ideas?

http://tinypic.com/view.php?pic=2ildjpe&s=5

The OS is Windows Server 2003 x64.

Recommended answer

libxml2 can do this. It is available as a library (to integrate into your own programs) or through the command-line tool xmllint. Here is an example with xmllint:

[Proper file] 
% head test.xml
<?xml version="1.0" encoding="utf-8"?>
<café>Ils s'étaient ...

% xmllint --noout test.xml
% 

[One byte in a multibyte character removed]
% xmllint --noout test.xml
test.xml:2: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xC3 0x74 0x61 0x69
<café>Ils s'Ãtaient ...
             ^
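
If a check without external tools is useful alongside xmllint, the same two conditions from the question can be sketched in Python. This is a minimal sketch, not a description of what libxml2 does internally: the function name check_xml_bytes is hypothetical, and it assumes the XML 1.0 character set (tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, U+10000 to U+10FFFF). It checks only the encoding and the character range, not whether the document is well-formed.

```python
import re

# Anything outside the XML 1.0 "Char" production is not allowed in a document.
_INVALID_XML_CHAR = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def check_xml_bytes(data: bytes) -> list:
    """Return a list of problem descriptions; an empty list means none found.

    Hypothetical helper for illustration only.
    """
    try:
        # Strict decoding rejects truncated sequences, overlong forms,
        # and byte sequences that would decode to surrogate code points.
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Show the offending byte and the three that follow it.
        window = data[exc.start:exc.start + 4]
        shown = " ".join(f"0x{b:02X}" for b in window)
        return [f"invalid UTF-8 at byte offset {exc.start}: {shown}"]

    # The bytes are valid UTF-8; now look for code points XML 1.0 forbids.
    return [
        f"character U+{ord(m.group()):04X} not allowed in XML 1.0 "
        f"at index {m.start()}"
        for m in _INVALID_XML_CHAR.finditer(text)
    ]


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for problem in check_xml_bytes(handle.read()):
            print(problem)
```

For the damaged test.xml above, the reported bytes should be the same four that xmllint prints (0xC3 0x74 0x61 0x69). A strict decoder also rejects byte sequences that would decode to surrogate code points such as U+DFDD, which XML does not allow.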
