Replace <text:s/> with whitespace

Published: 2021/10/2 18:49:24 · Tags: php, xml, xmlreader

Problem description

I am trying to parse this XML:

<text:p >Lorem<text:s/>ipsum.</text:p>

I'm using XMLReader for this, and nearly everything works as I need it to. The <text:s/> element causes trouble, though. Because I want to strip all formatting tags (e.g. bold), I use expand()->textContent to get just the text:

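$reader = new XMLReader();
if (!$reader->open("content.xml")) {
    // The original snippet was cut off here; stopping on failure is an assumption.
    die("Could not open content.xml");
}
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name === 'text:p') {
        echo utf8_decode($reader->expand()->textContent);
    }
}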

In this case I get 'Loremipsum.' instead of 'Lorem ipsum.'. How can I replace every <text:s/> with a whitespace?

Update: this is how I did it: preg_replace("/<\\/?text:s(\\s+.*?>|>)/", " ", utf8_decode($reader->readInnerXML()))

Update:

If I use DOMDocument for parsing, how do I have to change the syntax?

$reader = new DOMDocument();
$reader->load("zip://folder/".$file.".odt#content.xml");

while ($reader->read()){ 
    if ($reader->nodeType == XMLREADER::ELEMENT && $reader->name === 'text:h') { 
        if ($reader->getAttribute('text:outline-level')=="2") $html .= '<h2>'.$reader->expand()->textContent.'</h2>';
    }
    elseif ($reader->nodeType == XMLREADER::ELEMENT && $reader->name === 'text:p') { 
        if ($reader->getAttribute('text:style-name')=="Standard") {
            $str = $reader->readInnerXML(); 
            // replace text:s-elements with " " at this point
        }
    }
}

Recommended answer

You don't want to output the <text:p> elements as a whole. Instead, output just the text nodes, and output each <text:s> element as a single space:

$reader = new XMLReader();
$result = $reader->open("content.xml");
if (!$result) {
    throw new UnexpectedValueException('Could not open XML file for reading.');
}

while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name === 'text:s') {
        echo " "; // SPACE
    }
    if ($reader->nodeType == XMLReader::TEXT) {
        // XMLReader exposes the text of the current node through ->value
        echo $reader->value;
    }
}

So this is more a problem with the processing logic than a technical issue with XMLReader.
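For the DOMDocument variant asked about in the update, one possible approach is sketched below. It is not part of the original answer. It assumes the standard ODF namespace URIs, uses an inline sample string instead of the real content.xml, and the function name paragraphTexts is hypothetical. It also honours the text:c attribute, which in ODF gives the number of spaces a <text:s/> stands for (default 1).

```php
<?php
// Namespace URI of the ODF "text:" prefix.
const TEXT_NS = 'urn:oasis:names:tc:opendocument:xmlns:text:1.0';

// Hypothetical sample input standing in for content.xml.
$xml = '<office:text xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
     . ' xmlns:text="' . TEXT_NS . '">'
     . '<text:p>Lorem<text:s/>ipsum.</text:p>'
     . '<text:p>dolor<text:s text:c="2"/>sit</text:p>'
     . '</office:text>';

// Returns the plain text of every <text:p>, with <text:s/> turned into spaces.
function paragraphTexts(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('text', TEXT_NS);

    // Copy the node list to an array first, because the loop modifies the tree.
    foreach (iterator_to_array($xpath->query('//text:s')) as $space) {
        // A missing text:c attribute yields '', which casts to 0, so fall back to 1.
        $count = max(1, (int) $space->getAttributeNS(TEXT_NS, 'c'));
        $spaces = $doc->createTextNode(str_repeat(' ', $count));
        $space->parentNode->replaceChild($spaces, $space);
    }

    $paragraphs = [];
    foreach ($xpath->query('//text:p') as $p) {
        // textContent drops any remaining formatting elements and keeps their text.
        $paragraphs[] = $p->textContent;
    }
    return $paragraphs;
}

print_r(paragraphTexts($xml));
```

With a real document, $doc->load() on the content.xml path would take the place of loadXML(); the rest stays the same.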

Some notes on the character encoding, which I've left out of my example:

The conversion to Latin-1 that you do (utf8_decode) should normally not be necessary if you deliver the output as UTF-8. See Character encodings.

If Latin-1 is required for your target output, you most likely don't need to handle it at that point in the code; see ob_iconv_handler.
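If Latin-1 really is required, a minimal sketch of doing the conversion once, at the output boundary, could look like this. It assumes the mbstring extension is available, and the helper name toLatin1 is hypothetical. It uses mb_convert_encoding rather than utf8_decode, which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: convert UTF-8 text to Latin-1 (ISO-8859-1) in one place.
function toLatin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

// Work in UTF-8 throughout; "\u{E9}" is the UTF-8 encoded character é.
$text = "Caf\u{E9} ipsum.";

// Convert only at the point where the text leaves the program.
echo toLatin1($text);
```

Keeping all parsing and string handling in UTF-8 and converting only at the end means the XMLReader loop itself needs no encoding calls.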
