What is an XML BOM and how do I detect it?


Question

What exactly is the BOM in an ANSI XML document, and should it be removed? Should an XML document be in UTF-8 instead? Can anyone point me to a Java method that detects the BOM? The BOM consists of the bytes EF BB BF.

Recommended answer

For an ANSI XML file, the BOM should actually be removed. If you want to use UTF-8, you don't really need it; it is only needed for UTF-16 and UTF-32.

The Byte-Order-Mark (or BOM) is a special marker added at the very beginning of a Unicode file encoded in UTF-8, UTF-16 or UTF-32. It is used to indicate whether the file uses big-endian or little-endian byte order. The BOM is mandatory for UTF-16 and UTF-32, but it is optional for UTF-8.

(Source: https://www.opentag.com/xfaq_enc.htm#enc_bom)
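
To make the byte sequences concrete, here is a small sketch (not part of the original answer) that encodes the BOM code point U+FEFF with a few of Java's standard charsets and prints the resulting bytes. The class name `BomBytes` and the helper `hex` are made up for illustration. The EF BB BF sequence from the question is simply U+FEFF encoded as UTF-8, where it acts as a signature only, since UTF-8 has no byte-order variants.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class BomBytes {
    // Encodes U+FEFF with the given charset and returns the bytes as hex.
    static String hex(Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : "\uFEFF".getBytes(cs)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("UTF-8:    " + hex(StandardCharsets.UTF_8));    // EF BB BF
        System.out.println("UTF-16BE: " + hex(StandardCharsets.UTF_16BE)); // FE FF
        System.out.println("UTF-16LE: " + hex(StandardCharsets.UTF_16LE)); // FF FE
    }
}
```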

Regarding the question of how to detect this in Java, see the answers to this question: Java: How to determine the correct charset encoding of a stream

Basically, just read the first few bytes yourself and then check whether they match a BOM.
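
A minimal sketch of that approach, assuming Java 9 or later (for `InputStream.readNBytes`) and a stream that supports `mark`/`reset`. The class `BomDetector` and its methods are hypothetical names, not an API from the linked question. The sketch peeks at up to four bytes and compares them with the well-known BOM byte sequences.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class BomDetector {
    // Returns the encoding implied by a leading BOM, or null if none is found.
    static String detect(byte[] head) {
        // UTF-32 marks are checked first: FF FE 00 00 also begins with the
        // UTF-16LE mark FF FE. (UTF-16LE text whose first character after
        // the BOM is U+0000 would be misreported; that case is ignored here.)
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        return null;
    }

    // Peeks at the first bytes of the stream without consuming them.
    static String detect(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(4);
        byte[] head = new byte[4];
        int n = in.readNBytes(head, 0, 4); // may return fewer than 4 at end of stream
        in.reset();
        return detect(Arrays.copyOf(head, n));
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

For a file, wrapping the `FileInputStream` in a `BufferedInputStream` provides the `mark`/`reset` support this sketch relies on. If a BOM is reported, the caller can skip that many bytes before handing the stream to a parser.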
