Parsing XML With Single Quotes?

Problem description

I am currently running into a problem where an element is coming back from my XML file with a single quote in it. This is causing xml_parse to break it up into multiple chunks. For example, "Get Wired, You're Hired!" is interpreted as 'Get Wired, You' being one object, the single quote being a second, and 're Hired!' being a third.

What I have tried to do is:

while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, htmlentities($data, ENT_QUOTES), feof($fp))) {
        break;
    }
}

But that keeps breaking. If I use a str_replace in place of htmlentities, it runs without issue, but it fails with htmlentities.
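
One plausible reason for the difference: htmlentities escapes every special character in the chunk, including the angle brackets of the tags themselves, so the parser no longer sees any markup, whereas a str_replace on the quote alone leaves the tags intact. A minimal sketch of that effect, assuming a made-up <title> element as sample input (the element name and the '&#39;' replacement are illustrative, not taken from the original file):

```php
<?php
// Hypothetical sample chunk; the element name is an assumption for illustration.
$data = "<title>Get Wired, You're Hired!</title>";

// htmlentities() with ENT_QUOTES escapes the markup along with the quote.
$escaped = htmlentities($data, ENT_QUOTES);
echo $escaped, "\n";

// The escaped chunk no longer contains a root element, so xml_parse() reports failure.
$parser = xml_parser_create();
$ok = xml_parse($parser, $escaped, true);
echo $ok ? "parsed\n" : "failed: " . xml_error_string(xml_get_error_code($parser)) . "\n";

// Replacing only the quote leaves the tags intact, so parsing succeeds.
$parser2 = xml_parser_create();
$ok2 = xml_parse($parser2, str_replace("'", "&#39;", $data), true);
echo $ok2 ? "parsed\n" : "failed\n";
```

If this is what is happening, escaping would have to be applied to the text content only, never to a whole chunk of the file.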

Any ideas?

Update: As per JimmyJ's response below, I have attempted the following solution with no luck (FYI there is a response or two above the linked post that update the code that is linked directly):

function XMLEntities($string)
{
    $string = preg_replace('/[^\x09\x0A\x0D\x20-\x7F]/e', '_privateXMLEntities("$0")', $string);
    return $string;
}

function _privateXMLEntities($num)
{
    $chars = array(
        39  => '&#39;',
        128 => '&#8364;',
        130 => '&#8218;',
        131 => '&#402;',
        132 => '&#8222;',
        133 => '&#8230;',
        134 => '&#8224;',
        135 => '&#8225;',
        136 => '&#710;',
        137 => '&#8240;',
        138 => '&#352;',
        139 => '&#8249;',
        140 => '&#338;',
        142 => '&#381;',
        145 => '&#8216;',
        146 => '&#8217;',
        147 => '&#8220;',
        148 => '&#8221;',
        149 => '&#8226;',
        150 => '&#8211;',
        151 => '&#8212;',
        152 => '&#732;',
        153 => '&#8482;',
        154 => '&#353;',
        155 => '&#8250;',
        156 => '&#339;',
        158 => '&#382;',
        159 => '&#376;');
    $num = ord($num);
    return (($num > 127 && $num < 160) ? $chars[$num] : "&#".$num.";" );
}

if (!xml_parse($xml_parser, XMLEntities($data), feof($fp))) {
    break;
}

Update: As per tom's question below, magic quotes is/was indeed turned off.

Solution: What I have ended up doing to solve the problem is the following:

After collecting the data for each individual item/post/etc, I store that data to an array that I use later for output, then clear the local variables used during collection. I added in a step that checks if data is already present, and if it is, I concatenate it to the end, rather than overwriting it.

So, if I end up with three chunks (as above, let's stick with 'Get Wired, You're Hired!'), I will go from doing

$x = 'Get Wired, You';
$x = "'";
$x = 're Hired!';

to doing

$x = 'Get Wired, You' . "'" . 're Hired!';

This isn't the optimal solution, but appears to be working.
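
The concatenation step described above can be sketched roughly as follows. This is a minimal illustration, not the original code: the sample document, the element names, and the $current and $titles variables are all assumptions. The idea is that the character data handler may be called several times for one text node, so it appends to a buffer that is only read when the element closes.

```php
<?php
// Hypothetical sample document; element names are assumptions for illustration.
$xml = "<items><title>Get Wired, You're Hired!</title></items>";

$current = '';      // text buffer for the element currently being read
$titles  = array(); // collected results

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    function ($p, $name, $attrs) use (&$current) {
        $current = '';               // an element opens: reset the buffer
    },
    function ($p, $name) use (&$current, &$titles) {
        if ($name === 'TITLE') {     // names are upper-cased by default (case folding)
            $titles[] = $current;    // the element closes: store the full text
        }
        $current = '';
    }
);
xml_set_character_data_handler(
    $parser,
    function ($p, $text) use (&$current) {
        $current .= $text;           // append each piece instead of overwriting
    }
);

xml_parse($parser, $xml, true);
print_r($titles);
```

Because the buffer is appended to, the stored title is the complete string no matter how many pieces the parser delivers.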

Recommended answer

Why don't you use something like simplexml_load_file to parse your file easily?
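
A minimal sketch of that approach, assuming a made-up document with <item> and <title> elements (the structure of the real file is not shown in the question). It uses simplexml_load_string so the example is self-contained; simplexml_load_file takes a path instead and returns the same kind of object.

```php
<?php
// Hypothetical sample document; element names are assumptions for illustration.
$xml = "<items><item><title>Get Wired, You're Hired!</title></item></items>";

// simplexml_load_string() returns false when the input cannot be parsed.
$doc = simplexml_load_string($xml);
if ($doc === false) {
    die("Could not parse XML\n");
}

$titles = array();
foreach ($doc->item as $item) {
    $titles[] = (string) $item->title; // cast the node to get its text
}
print_r($titles);
```

SimpleXML builds the whole document in memory and hands back each text node in one piece, so no chunk handling is needed; for very large files a streaming parser may still be preferable.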
