Remove weird characters in XML output


Problem description

I'm trying to remove weird characters from the XML output. Here are the code and the output:

It seems there are encoding issues. I have tried adding […]; the script converts iCal to XML:

http://flourishhosting.co.uk/test.php

    <xml version="1.0" encoding="UTF-8">
            <html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
            <body style="font-family:Arial;font-size:12pt;background-color:#EEEEEE">
            <xsl:for-each select="VCALENDAR">
              <div style="background-color:teal;color:white;padding:4px">
                <span style="font-weight:bold"><xsl:value-of select="URL"/> - </span>
                <xsl:value-of select="DTSTART"/>
                </div>
              <div style="margin-left:20px;margin-bottom:1em;font-size:10pt">
                <p>
                <xsl:value-of select="SUMMARY"/>
                <span style="font-style:italic"> (<xsl:value-of select="calories"/> calories per serving)</span>
                </p>
              </div>
            </xsl:for-each>
            <?php

            function iCalendarToXML($icalendarData) {

                // Detecting line endings
                if (strpos($icalendarData,"\r\n")) $lb = "\r\n";
                elseif (strpos($icalendarData,"\n")) $lb = "\n";
                else $lb = "\r\n";

                // Splitting up items per line
                $lines = explode($lb,$icalendarData);

                // Properties can be folded over 2 lines. In this case the second
                // line will be preceeded by a space or tab.
                $lines2 = array();
                foreach($lines as $line) {

                    if ($line[0]==" " || $line[0]=="\t") {
                        $lines2[count($lines2)-1].=substr($line,1);
                        continue;
                    }

                    $lines2[]=$line;

                }

                $xml = '<?xml version="1.0"?>' . "\n";

                $spaces = 0;
                foreach($lines2 as $line) {

                    $matches = array();
                    // This matches PROPERTYNAME;ATTRIBUTES:VALUE
                    if (preg_match('/^([^:^;]*)(?:;([^:]*))?:(.*)$/',$line,$matches)) {
                        $propertyName = strtoupper($matches[1]);
                        $attributes = $matches[2];
                        $value = $matches[3];

                        // If the line was in the format BEGIN:COMPONENT or END:COMPONENT, we need to special case it.
                        if ($propertyName == 'BEGIN') {
                            $xml.=str_repeat(" ",$spaces);
                            $xml.='<' . strtoupper($value) . ">\n";
                            $spaces+=2;
                            continue;
                        } elseif ($propertyName == 'END') {
                            $spaces-=2;
                            $xml.=str_repeat(" ",$spaces);
                            $xml.='</' . strtoupper($value) . ">\n";
                            continue;
                        }

                        $xml.=str_repeat(" ",$spaces);
                        $xml.='<' . $propertyName;
                        if ($attributes) {
                            // There can be multiple attributes
                            $attributes = explode(';',$attributes);
                            foreach($attributes as $att) {

                                list($attName,$attValue) = explode('=',$att,2);
                                $xml.=' ' . $attName . '="' . htmlspecialchars($attValue) . '"';

                            }
                        }

                        $xml.='>'. htmlspecialchars($value) . '</' . $propertyName . ">\n";

                    }

                }

                return $xml;

            }
            // read in the artist from the form
            $a = urlencode($_GET["VEVENT"]);
            $var = htmlentities($var,ENT_QUOTES, "Windows-1252");
            $connection = curl_init();


            // Specify the URL to connect to
            curl_setopt($connection, CURLOPT_URL, "http://mosaic-church.onthecity.org/plaza/events/ical_feed");


            // This option ensures that the HTTP response is *returned* from curl_exec(),
            // (see below) rather than being output to screen.  
            curl_setopt($connection,CURLOPT_RETURNTRANSFER,1);

            // Do not include the HTTP header in the response.
            curl_setopt($connection,CURLOPT_HEADER, 0);

            // Actually connect to the remote URL. The response is 
            // returned from curl_exec() and placed in $response.
            $response = curl_exec($connection);

            $xml_output = iCalendarToXML($response);
            echo "XML Output <pre>".$xml_output."</pre>";

            // Close the connection.
            curl_close($connection);

            //parse code:
            $xml = simplexml_load_string($xml_output);
            for($index=0; $index < count($xml->VEVENT); $index++)
            {
              echo $xml->VEVENT[$index]-> SUMMARY . "<br />";
              echo $xml->VEVENT[$index]-> DESCRIPTION . "<br />";
            }
            ?>
            </body>
            </html>


Recommended answer

Your HTML document is incomplete. You are missing the entire head tag, which should contain a Content-Type meta tag that specifies the content type and the encoding. Right now the browser has to guess the encoding, and it guesses wrong.

A correct HTML document has a head tag with a title tag; that's where you add the meta tag:

<head>
  <title>Page name</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
</head>
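
As a rough illustration of the advice above, the following PHP sketch builds a complete page whose head carries the charset meta tag, and also sends the same charset in the HTTP header. The helper name `renderPage` and the sample text are made up for this example and do not come from the original post; the assumption is that the feed data is already UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): wraps body markup in a
// complete HTML document whose head declares UTF-8.
function renderPage(string $title, string $bodyHtml): string
{
    // Escape the title as UTF-8 so multibyte characters survive intact.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . "<html>\n"
        . "<head>\n"
        . "  <title>{$safeTitle}</title>\n"
        . "  <meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"/>\n"
        . "</head>\n"
        . "<body>\n{$bodyHtml}\n</body>\n"
        . "</html>\n";
}

// When served over HTTP, declare the same charset in the response header,
// so the browser does not have to guess before it reaches the meta tag.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Made-up sample text; assumed to be UTF-8, like the rest of this file.
$summary = htmlspecialchars("Café & Crèche", ENT_QUOTES, 'UTF-8');
echo renderPage('Events', "<p>{$summary}</p>");
```

Declaring the charset in both places is a belt-and-braces choice: the header covers the HTTP response, and the meta tag covers the case where the page is saved and opened from disk.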
