PHP XML output produces parse error "&rsquo;"


Question

Is there a function I can run any string through to make sure it won't cause XML parsing problems? I have a PHP script that outputs an XML file whose content comes from forms.

Apart from the usual string checks on a PHP form, some of the user text causes XML parsing errors. I'm facing "&rsquo;" in particular. This is the error I'm getting: Entity 'rsquo' not defined

Does anyone have experience encoding text for XML output?

Thanks!

Some clarification: I'm outputting content from forms into an XML file, which is subsequently parsed by JavaScript.

I process all form input with: htmlentities(trim($_POST['content']), ENT_QUOTES, 'UTF-8');

When I output this content into an XML file, how should I encode it so that it won't trigger XML parsing errors?

So far the following 2 solutions work:

1) echo '<content><![CDATA['.$content.']]></content>';

2) echo '<content>'.htmlspecialchars(html_entity_decode($content, ENT_QUOTES, 'UTF-8'), ENT_QUOTES, 'UTF-8').'</content>'."\n";

Are the above 2 solutions safe? Which is better?

Thanks, and sorry for not providing this information earlier.

Recommended answer

You're approaching this the wrong way: don't look for a parser that doesn't give you errors. Instead, produce well-formed XML.

How did you get &rsquo; from the user? If they typed it in literally, you are not processing the input correctly: for example, you should escape & as &amp;. If it was you who put the entity there (perhaps in place of an apostrophe), either define it in a DTD (<!ENTITY rsquo "&#x2019;">) or write it as a numeric character reference (&#x2019;), because almost all named entities belong to HTML. XML itself predefines only a few basic ones (&amp;, &lt;, &gt;, &quot; and &apos;), as Gumbo pointed out.
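To illustrate where &rsquo; comes from, here is a minimal sketch. It assumes the input is UTF-8 and reaches the script as a raw right single quotation mark (U+2019); the sample string and variable names are made up for the example.

```php
<?php
// Hypothetical user input containing a right single quotation mark (U+2019).
$input = "It\u{2019}s here";

// htmlentities() maps U+2019 to the HTML-only named entity &rsquo;,
// which an XML parser rejects unless a DTD defines it.
$htmlEncoded = htmlentities($input, ENT_QUOTES, 'UTF-8');

// A numeric character reference needs no DTD, so it is valid in any XML document.
$numeric = str_replace('&rsquo;', '&#x2019;', $htmlEncoded);

echo $htmlEncoded, "\n";
echo $numeric, "\n";
```

The str_replace() call only handles this one entity; it is shown to make the named versus numeric distinction concrete, not as a general fix.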

Edit, based on the additions to the question:

  • In #1, you wrap the content in a way that breaks if the user types ]]> <°)))><, because ]]> ends the CDATA section early.
  • In #2, you are encoding and then decoding, which just gives back the original value of $content. The decoding should not be necessary (as long as you don't expect users to post values like &amp; that should be interpreted as &).
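If you do want to keep the CDATA approach from #1, one common workaround is to split every ]]> in the text across two CDATA sections. The sketch below assumes that approach; cdataSection() is a hypothetical helper name, not a built-in PHP function.

```php
<?php
// Hypothetical helper: wrap text in a CDATA section, splitting any "]]>"
// so that the "]]" ends one section and the ">" starts the next one.
function cdataSection(string $text): string
{
    return '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $text) . ']]>';
}

// The input from the bullet above no longer terminates the section early.
echo '<content>' . cdataSection('fish: ]]> <°)))><') . '</content>', "\n";
```

A parser concatenates adjacent CDATA sections, so the element's text content comes back as the original string.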

If you use htmlspecialchars() with ENT_QUOTES, it should be OK, but see how Drupal does it.
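A minimal sketch of that suggestion follows. It assumes $raw holds plain UTF-8 text that has not already been passed through htmlentities(); xmlEscape() is a hypothetical helper name, and the ENT_XML1 flag (available since PHP 5.4) is an addition to the answer's ENT_QUOTES.

```php
<?php
// Hypothetical helper: escape plain UTF-8 text for use as XML element
// content or inside a quoted attribute value.
function xmlEscape(string $text): string
{
    // ENT_QUOTES escapes both quote characters; ENT_XML1 selects the XML
    // entity set, so a single quote becomes &apos;.
    return htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Made-up sample input; U+2019 passes through as a literal UTF-8 character.
$raw = "Tom & Jerry\u{2019}s <show>";
echo '<content>' . xmlEscape($raw) . '</content>', "\n";
```

Note that this sketch does not strip control characters that XML 1.0 forbids, and htmlspecialchars() returns an empty string on invalid UTF-8 unless ENT_SUBSTITUTE or ENT_IGNORE is set.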
