simplexml_load_string() fails with a parser error


Problem description

I'm trying to load and parse a Google Weather API response (the Chinese-language response).

Here is the API call.

// This code fails with the following error
$xml = simplexml_load_file('http://www.google.com/ig/api?weather=11791&hl=zh-CN');

( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 1: parser error : Input is not proper UTF-8, indicate encoding ! Bytes: 0xB6 0xE0 0xD4 0xC6 in C:\htdocs\weather.php on line 11

Why does loading this response fail?

How do I encode/decode the response so that simplexml loads it properly?

Edit: Here is the code and output.

<?php
$googleData = file_get_contents('http://www.google.com/ig/api?weather=11102&hl=zh-CN');
$xml = simplexml_load_string($googleData);

( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 1: parser error : Input is not proper UTF-8, indicate encoding ! Bytes: 0xB6 0xE0 0xD4 0xC6 in C:\htdocs\test4.php on line 3
Call Stack
   Time    Memory  Function                                Location
1  0.0020  314264  {main}( )                               ..\test4.php:0
2  0.1535  317520  simplexml_load_string ( string(1364) )  ..\test4.php:3

( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: t_system data="SI"/>

( ! ) Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in C:\htdocs\test4.php on line 3
Call Stack
   Time    Memory  Function                                Location
1  0.0020  314264  {main}( )                               ..\test4.php:0
2  0.1535  317520  simplexml_load_string ( string(1364) )  ..\test4.php:3

Solution

The problem is that SimpleXML does not look at the HTTP headers to determine the document's character encoding. It simply assumes the document is UTF-8, even though Google's server advertises the response as

Content-Type: text/xml; charset=GB2312
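To make the mismatch concrete, here is a small sketch that takes the four bytes quoted in the warning and checks them both ways. It assumes the iconv extension is available; the variable names are only illustrative and do not come from the original question.

```php
<?php
// The four bytes quoted in the libxml warning.
$bytes = "\xB6\xE0\xD4\xC6";

// preg_match() with the /u modifier returns false when the subject is not valid UTF-8.
$validBefore = @preg_match('//u', $bytes) === 1;

// Reinterpret the same bytes as GB2312, the charset named in the Content-Type header.
$converted  = iconv('GB2312', 'UTF-8', $bytes);
$validAfter = preg_match('//u', $converted) === 1;

var_dump($validBefore); // bool(false)
var_dump($validAfter);  // bool(true)
echo $converted, "\n";  // 多云
```

Read as GB2312, the bytes are two ordinary Chinese characters. Read as UTF-8, they are an invalid sequence, which is what the parser complains about.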

You can write a function that inspects that header through the magic variable $http_response_header, which PHP's HTTP stream wrapper populates in the local scope after the request, and converts the response accordingly. Something like this:

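function sxe($url)
{
    // Fetch the raw body; the HTTP stream wrapper fills $http_response_header in this scope.
    $xml = file_get_contents($url);
    if ($xml === false)
    {
        // The request failed, so there is nothing to parse.
        return false;
    }

    foreach ($http_response_header as $header)
    {
        // Pick out the charset advertised in the Content-Type header.
        if (preg_match('#^Content-Type: text/xml; charset=(.*)#i', $header, $m))
        {
            $charset = strtolower(trim($m[1]));
            switch ($charset)
            {
                case 'utf-8':
                    // already UTF-8, nothing to convert
                    break;

                case 'iso-8859-1':
                    // utf8_encode() is deprecated as of PHP 8.2;
                    // iconv('ISO-8859-1', 'UTF-8', $xml) is equivalent
                    $xml = utf8_encode($xml);
                    break;

                default:
                    // any other charset (GB2312 here) goes through iconv
                    $xml = iconv($charset, 'utf-8', $xml);
            }
            break;
        }
    }

    return simplexml_load_string($xml);
}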
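The conversion step can also be separated from the download so that it can be tried without a network call. The following is only a sketch under assumptions: `toUtf8()` is a hypothetical helper name, the header list and XML body are made-up sample data rather than real Google output, and the regular expression is a slightly more lenient variant that accepts any media type before the charset.

```php
<?php
// Hypothetical helper: convert a response body to UTF-8 based on its headers.
function toUtf8(string $body, array $headers): string
{
    foreach ($headers as $header)
    {
        // Accept any media type; stop the charset at a quote, ';' or whitespace.
        if (preg_match('#^Content-Type:.*?charset="?([^";\s]+)#i', $header, $m))
        {
            $charset = strtoupper($m[1]);
            if ($charset === 'UTF-8')
            {
                return $body;
            }

            // iconv() returns false on failure; fall back to the untouched body.
            $converted = @iconv($charset, 'UTF-8', $body);
            return $converted === false ? $body : $converted;
        }
    }

    // No charset advertised: leave the body as it is.
    return $body;
}

// Sample data standing in for a real response (an assumption, not actual API output).
$headers = ['HTTP/1.1 200 OK', 'Content-Type: text/xml; charset=GB2312'];
$body    = "<weather><condition data=\"\xB6\xE0\xD4\xC6\"/></weather>";

$xml = simplexml_load_string(toUtf8($body, $headers));
echo $xml->condition['data'], "\n";
```

With a live request, the array passed as `$headers` would be `$http_response_header` as filled in by `file_get_contents()`.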
