Encoding Error in PHP Generated XML File

Question

I have generated an XML file in PHP using the DOMDocument class; the data was pulled from a MySQL database. A lot of the data contains HTML markup, but I've wrapped all of it in CDATA sections.

At first the file had a lot of encoding errors, but running everything through utf8_encode() before putting it into the file seems to have fixed all the errors except one.

Here is the error I have right now:

    error on line 5113 at column 450: Input is not proper UTF-8, indicate encoding !
    Bytes: 0x14 0x31 0x30 0x30

I found some posts on here with similar errors, but none have solved my problem, or they only suggest using utf8_encode(). Here is the section that seems to be triggering the error:

    ...quiet portable package. ]]></Summary><Features><![CDATA[The EF4500iSE was designed for maximum fuel...

The error seems to be between CDATA[ and The, although I can't see any characters there, and that piece looks the same as every other CDATA block in the file. If I remove the entire Features element and its contents, the file loads fine.

Here is a link to the file: http://test.hhdev.hothousemarketing.com/库存.xml

Recommended answer

The problem ended up being a stray non-printable character (the 0x14 byte reported in the error message) inside the CDATA section, as pointed out by Colin in the comments on the question.
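Because the offending character is invisible in an editor, it can help to scan the generated text for it before writing the file. The following is a minimal sketch, not the method used in this answer: `findIllegalXmlChars` is a hypothetical helper name and the sample string is made up to mirror the bytes in the error message. It only looks for the C0 control characters that XML 1.0 disallows (everything below 0x20 except tab, line feed, and carriage return). Such bytes are plain ASCII, so utf8_encode() leaves them untouched, which would explain why that call alone did not clear this last error.

```php
<?php
// Hypothetical helper: report where XML 1.0-illegal control bytes occur.
function findIllegalXmlChars(string $text): array
{
    $hits = [];
    // XML 1.0 permits only 0x09, 0x0A and 0x0D from the C0 control range.
    $pattern = '/[\x00-\x08\x0B\x0C\x0E-\x1F]/';
    if (preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE)) {
        foreach ($matches[0] as [$char, $offset]) {
            // $offset is a byte offset into $text.
            $hits[] = ['offset' => $offset, 'byte' => sprintf('0x%02X', ord($char))];
        }
    }
    return $hits;
}

// Made-up sample: a 0x14 byte followed by "100", as in the error message.
$sample = "<![CDATA[\x14100]]>";
print_r(findIllegalXmlChars($sample));
```

Running a check like this on each database field before it is added to the document would show which row carries the bad byte.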

I was in a rush to solve this, so I used a brute-force method: in addition to utf8_encode(), I ran everything through a regex replacement that strips every character outside the range \x20-\x7F:

    $output = preg_replace('/[^(\x20-\x7F)]*/','', $output);

I found this here: http://www.stemkoski.com/php-remove-non-ascii-characters-from-a-string/
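The brute-force pattern also deletes tabs, newlines, and every non-ASCII character, including accented letters that are valid UTF-8. A narrower variant, sketched below under the assumption that the only problem is control characters that XML 1.0 forbids, removes just those bytes. `stripIllegalXmlChars` is a hypothetical name, and the sketch does not cover the other code points XML 1.0 excludes (such as U+FFFE and U+FFFF).

```php
<?php
// Hypothetical helper: drop only the control bytes XML 1.0 forbids,
// leaving tabs, newlines and multi-byte UTF-8 text untouched.
function stripIllegalXmlChars(string $text): string
{
    $clean = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $text);
    // preg_replace returns null on a regex error; fall back to the input.
    return $clean ?? $text;
}

// Made-up sample: valid UTF-8 text plus a stray 0x14 byte.
$dirty = "caf\u{00E9} \x14100";
echo stripIllegalXmlChars($dirty), "\n";
```

The pattern runs byte-wise (no `u` modifier), which is safe here because bytes below 0x20 never appear inside a multi-byte UTF-8 sequence.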

Thanks to Colin and Francis for their contributions.
