PHP output XML produces parse error "&rsquo;"
Question
Is there a function I can use to process any string so that it won't cause XML parsing problems? I have a PHP script that outputs an XML file with content obtained from forms.
The thing is, apart from the usual string checks on a PHP form, some of the user text causes XML parsing errors. I'm facing "&rsquo;" in particular. This is the error I'm getting: Entity 'rsquo' not defined
Does anyone have any experience in encoding text for XML output?
Thanks!
Some clarification: I'm outputting content from forms into an XML file, which is subsequently parsed by JavaScript.
I process all form input with: htmlentities(trim($_POST['content']), ENT_QUOTES, 'UTF-8');
When I want to output this content into an XML file, how should I encode it so that it won't throw XML parsing errors?
So far the following 2 solutions work:
1) echo '
2) echo '
Are the above 2 solutions safe? Which is better?
Thanks, and sorry for not providing this information earlier.
Recommended answer
You are approaching this the wrong way: don't look for a parser that doesn't give you errors. Instead, make sure the XML you output is well-formed.
How did you get &rsquo; from the user? If he literally typed it in, you are not processing the input correctly: for example, you should escape & to &amp;. If it is you who put the entity there (perhaps in place of some apostrophe), either define it in the DTD (<!ENTITY rsquo "&#x2019;">) or write it using numeric notation (&#8217;), because almost all of the named entities belong to HTML. XML defines only a few basic ones (&amp;, &lt;, &gt;, &quot; and &apos;), as Gumbo pointed out.
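As a rough illustration of the two options above, here is a minimal sketch. The helper names xml_escape() and named_to_numeric() are made up for this example and are not part of PHP or of the original post; the entity map only covers &rsquo; and would need extending for other HTML entities.

```php
<?php
// Hypothetical helper: escape raw user text for XML output.
// ENT_XML1 limits the output to entities that XML itself predefines.
function xml_escape(string $raw): string
{
    return htmlspecialchars($raw, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Hypothetical helper: rewrite HTML-only named entities that you inserted
// yourself as numeric references, which any XML parser understands.
function named_to_numeric(string $text): string
{
    // Assumption: only &rsquo; is handled here; add more pairs as needed.
    return strtr($text, ['&rsquo;' => '&#8217;']);
}

// Case 1: the user literally typed "&rsquo;". Escaping the ampersand means
// the XML parser never sees an undefined entity.
echo xml_escape('Tom &rsquo; Jerry'), "\n";   // Tom &amp;rsquo; Jerry

// Case 2: your own code produced "&rsquo;". Use the numeric form instead.
echo named_to_numeric('it&rsquo;s'), "\n";    // it&#8217;s
```

Note that a real ’ character (U+2019) needs no entity at all if the XML document is declared and served as UTF-8.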
Following up on the additions to the question:
- In #1, the content is escaped in such a way that if the user types in ]]> <°)))><, you have a problem.
- In #2, you are encoding and then decoding, which gives you back the original value of $content. The decoding should not be necessary (unless you expect users to post values like &amp; that should be interpreted as &).
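To make the #1 problem concrete: the snippet for #1 is cut off above, so this sketch assumes it wraps $content in a CDATA section, which is what the ]]> remark suggests. A CDATA section ends at the first ]]>, so user text containing that sequence closes it early. One common workaround is to split the sequence across two CDATA sections. The helper name xml_cdata() is hypothetical.

```php
<?php
// Hypothetical helper: wrap text in CDATA without letting a "]]>" inside
// the text terminate the section prematurely.
function xml_cdata(string $raw): string
{
    // Each "]]>" is split so that "]]" ends one section and ">" starts the next.
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $raw);
    return '<![CDATA[' . $safe . ']]>';
}

// The input from the answer above, which would break a naive CDATA wrapper.
echo '<content>' . xml_cdata(']]> <°)))><') . '</content>', "\n";
// <content><![CDATA[]]]]><![CDATA[> <°)))><]]></content>
```

Escaping with htmlspecialchars() avoids this special case entirely, which is one reason it tends to be the simpler choice.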
If you use htmlspecialchars() with ENT_QUOTES, it should be OK, but see how Drupal does it.