Illegal character in XML feed?

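Problem description

I have created a WordPress/WooCommerce plugin that generates an XML file from our products.

But some rows contain illegal characters.

error on line 15622 at column 22: Input is not proper UTF-8, indicate encoding !
Bytes: 0x03 0xC3 0xB6 0x73

How can I solve this so that the XML is parsed correctly?

XML FEED FILE

The generating code looks something like this: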

$dom = new DOMDocument('1.0', 'UTF-8');

// create root element
$root = $dom->createElement("termeklista");
$dom->appendChild($root);
$dom->formatOutput = true;

Then a while loop fills in the data. The issue is in the description tag.

// DESCRIPTION
$description = $dom->createElement("leiras");
$producta->appendChild($description);
// create CDATA section
$cdata = $dom->createCDATASection("
".$loop->post->post_excerpt."
");
$description->appendChild($cdata);

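I have tried iconv, utf8_encode, and a custom function to replace the wrong characters, but I cannot figure out what the issue is.

The WooCommerce product post excerpt does not have any illegal characters in it.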

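Solution

0x03 (aka ^C aka ETX aka end of text) is not an allowed character in XML 1.0:

[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

Therefore your data is not XML, and any conformant XML processor must report an error such as the one you received. The remaining bytes in the message, 0xC3 0xB6 0x73, are the valid UTF-8 encoding of "ös", so the encoding itself is fine; only the control character is the problem, which is why re-encoding with iconv or utf8_encode does not help.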
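To see which excerpts carry such characters, one option is to scan the text against the Char production before it reaches the DOM. The following is a minimal sketch, not part of the original answer: `findIllegalXmlChars` is a hypothetical helper name, and it assumes the input is meant to be UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): lists every character
// that falls outside the XML 1.0 Char production quoted above.
function findIllegalXmlChars(string $text): array
{
    // Negated character class built from production [2]; the /u modifier
    // makes PCRE treat both pattern and subject as UTF-8.
    $pattern = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';

    $result = preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);
    if ($result === false) {
        // With /u, matching fails when the subject is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $found = [];
    foreach ($matches[0] as [$char, $offset]) {
        // Offsets are byte offsets into $text; hex shows the raw bytes.
        $found[] = ['offset' => $offset, 'hex' => bin2hex($char)];
    }
    return $found;
}
```

Calling it on the four bytes from the error message, `findIllegalXmlChars("\x03\xC3\xB6s")`, should report a single hit at offset 0 with hex `03`, while the `ö` and `s` that follow pass the check.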

You must repair the data before using it with any XML library: treat it as text, not XML, and remove the illegal characters, either manually or automatically.
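One way to automate that repair is to strip everything outside the Char production just before the text is handed to createCDATASection. This is a sketch under stated assumptions rather than the answer's own code: `stripIllegalXmlChars` is a hypothetical helper, the `$excerpt` string stands in for `$loop->post->post_excerpt`, and the description element is attached directly to the root instead of the question's `$producta` element.

```php
<?php
// Hypothetical helper (not from the original post): removes every character
// that the XML 1.0 Char production does not allow.
function stripIllegalXmlChars(string $text): string
{
    $pattern = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';
    $clean = preg_replace($pattern, '', $text);
    if ($clean === null) {
        // preg_replace returns null on error, e.g. invalid UTF-8 with /u.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $clean;
}

$dom = new DOMDocument('1.0', 'UTF-8');
$root = $dom->createElement('termeklista');
$dom->appendChild($root);

// Assumed sample data standing in for $loop->post->post_excerpt:
// "köszönöm" with a stray 0x03 byte after the first letter.
$excerpt = "k\x03\xC3\xB6sz\xC3\xB6n\xC3\xB6m";

$description = $dom->createElement('leiras');
$root->appendChild($description);

// Clean the text first, then wrap it in the CDATA section.
$cdata = $dom->createCDATASection(stripIllegalXmlChars($excerpt));
$description->appendChild($cdata);

$xml = $dom->saveXML();
```

Cleaning at the point of use keeps the stored excerpts untouched; if the stray control characters should be removed permanently, the same function could be applied when the product data is saved instead.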

