Encoding Error in PHP Generated XML File
Question
I have generated an XML file in PHP using the DOMDocument class; the data was pulled from a MySQL database. A lot of the data contains HTML markup, but I've wrapped all of it in CDATA sections.
At first the file had a lot of encoding errors, but running everything through utf8_encode() before writing it to the file seems to have fixed all of them except one.
Here is the error I have right now:
error on line 5113 at column 450: Input is not proper UTF-8, indicate encoding !
Bytes: 0x14 0x31 0x30 0x30
I found some posts on here with similar errors, but they either did not solve my problem or just suggest using utf8_encode(). Here is the section that seems to be triggering the error:
...quiet portable package. ]]></Summary><Features><![CDATA[The EF4500iSE was designed for maximum fuel...
The error seems to be between CDATA[ and The, although I can't see any characters there, and that piece looks the same as every other CDATA block in the file. If I remove the entire Features element and its contents, the file loads fine.
Here is a link to the file: http://test.hhdev.hothousemarketing.com/库存.xml
Recommended answer
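The problem ended up being a stray character outside the printable ASCII range inside the CDATA section, as pointed out by Colin in the comments on the question. The error message points at byte 0x14, a control character that XML 1.0 does not allow, which is also why nothing was visible in the text.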
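For anyone hitting the same error, here is a minimal sketch of how such an invisible byte could be located before the XML is written. The helper name findInvalidXmlBytes and the sample string are hypothetical and not from the original post. It only looks for control bytes below 0x20 other than tab, line feed and carriage return, which are the ones XML 1.0 rejects in that range.

```php
<?php
// Hypothetical helper (not from the original post): reports each control
// byte that XML 1.0 forbids below 0x20, i.e. everything except tab (0x09),
// line feed (0x0A) and carriage return (0x0D).
function findInvalidXmlBytes(string $text): array
{
    $found = [];
    // No /u flag: the pattern works on raw bytes. UTF-8 multi-byte
    // sequences only use bytes >= 0x80, so they are never matched.
    preg_match_all('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', $text, $matches, PREG_OFFSET_CAPTURE);
    foreach ($matches[0] as [$byte, $offset]) {
        $found[] = [
            'offset' => $offset,                      // byte offset in $text
            'hex'    => sprintf('0x%02X', ord($byte)) // e.g. "0x14"
        ];
    }
    return $found;
}

// Made-up sample containing the same kind of byte as in the error message.
$sample = "<![CDATA[\x14100 hours of runtime]]>";
print_r(findInvalidXmlBytes($sample));
```

Running this over each database field before adding it to the document should show which record carries the bad byte, so the data can be fixed at the source if that is preferable.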
I was in a rush to solve this, so I used a brute-force method: in addition to utf8_encode(), I ran everything through a regex replacement. I used: $output = preg_replace('/[^(\x20-\x7F)]*/','', $output); I found it here: http://www.stemkoski.com/php-remove-non-ascii-characters-from-a-string/
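Note that the regex above removes every byte outside 0x20-0x7F, so it also drops tabs, newlines, and any accented or other non-ASCII text. If that is too aggressive, a narrower variant could strip only the control characters that XML 1.0 forbids. The sketch below is an assumption about what would suffice here, not the method used in the answer; stripInvalidXmlChars is a hypothetical name.

```php
<?php
// Hypothetical helper: removes only the control bytes below 0x20 that
// XML 1.0 forbids, leaving tab (0x09), LF (0x0A), CR (0x0D) and all
// multi-byte UTF-8 characters untouched.
function stripInvalidXmlChars(string $text): string
{
    // preg_replace() returns null on a PCRE error; fall back to the input.
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $text) ?? $text;
}

// Made-up sample: the 0x14 byte is dropped, the accented character stays.
$output = "caf\u{00E9}\x14 100 hours\n";
$output = stripInvalidXmlChars($output);
echo $output;
```

This does not cover the other code points XML 1.0 excludes (such as U+FFFE and U+FFFF), and it assumes the input is already valid UTF-8.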
Thanks to Colin and Francis for their contributions.