Keeping file offsets while parsing HTML with the DOM?

Posted: 2017/6/25 4:56:31 | Tags: php, dom, html-parsing

Question

I want to modify <img src=""> attributes in not-too-malformed HTML (WordPress posts). I know I can take the simple way and use regexes, but I'm afraid people in blue furry suits will come haunt me in my sleep.

If I use the DOM parser to read the HTML and modify the <img> tags, I'm afraid I can't reconstruct the post exactly as it was (with only my modification), because the DOM parser will probably do too much cleanup and may remove essential data. A SAX parser probably cannot handle invalid XML, so that will not work either.

So, is there a middle way: a DOM parser that knows where each element started in the source, so that I can do string replacements or something similar from there? I know some nodes in the DOM tree will not exist in the source document (some bizarre formatting will probably trigger this), but does that mean it is always impossible? I see that a DOMNode::getLineNo() function was added in PHP 5.3, but I'm using 5.2.x.
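
To make the "middle way" above concrete, here is a minimal sketch of one possible reading of it, not a tested solution. It assumes the DOM is used only to discover which src values exist, while the actual edit is a plain string replacement on the original markup, so nothing else in the post is re-serialized. The names rewrite_img_src and to_cdn and the example URLs are hypothetical. Known limits of this sketch: it replaces every occurrence of a matching URL (including one that also appears in an <a href>), and it does not handle character-set issues with non-ASCII URLs.

```php
<?php
// Hypothetical helper: change <img src> values while leaving the rest of
// the markup byte-for-byte untouched. The DOM only tells us which src
// values exist; the edit itself is a string replacement on the original.
function rewrite_img_src($html, $callback)
{
    // Collect parse warnings instead of printing them for sloppy markup.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $replacements = array();
    foreach ($doc->getElementsByTagName('img') as $img) {
        $old = $img->getAttribute('src');
        if ($old === '') {
            continue; // no src attribute, nothing to rewrite
        }
        $new = call_user_func($callback, $old);
        // The DOM returns the entity-decoded value; the source may contain
        // either the raw or the entity-encoded form, so map both.
        $replacements[$old] = $new;
        $replacements[htmlspecialchars($old)] = htmlspecialchars($new);
    }

    if (count($replacements) === 0) {
        return $html; // no images found, return the post unchanged
    }
    // strtr() tries the longest keys first and never rescans its own output.
    return strtr($html, $replacements);
}

// Hypothetical callback; a named function keeps this usable on PHP 5.2,
// which has no closures.
function to_cdn($src)
{
    return str_replace('http://example.com/', 'http://cdn.example.com/', $src);
}

// Deliberately sloppy markup: unclosed <p>, unquoted alt attribute.
$post = '<p>Unclosed paragraph<img src="http://example.com/a.png" alt=x>';
echo rewrite_img_src($post, 'to_cdn');
```

Because the result is built from the original string rather than from saveHTML(), the unclosed <p> and the unquoted alt attribute should survive as written. On PHP 5.3 and later, DOMNode::getLineNo() could in principle narrow each replacement to the line the element was parsed from, which is closer to the offset-based approach asked about.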
