PHP: parse HTML: extract script tags from body and inject before </body>?

Views: 216 | Published: 2017/6/25 4:13:28 | Tags: php dom html-content-extraction


Problem description

I don't care what the library is, but I need a way to extract <script> elements from the <body> of a page (as a string). I then want to insert the extracted <script>s just before </body>.

Ideally, I'd like to sort the extracted <script>s into two types:

1) External (those that have the src attribute)
2) Embedded (those with code between <script> and </script>)
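
The two-way split described above could be sketched with a regular expression, as below. This is only an illustration, not code from the original post: the function name classifyScripts and the sample markup are hypothetical, and the pattern assumes no attribute value contains a ">" character.

```php
<?php
// Hypothetical helper: split the <script> elements found in $html into
// external (opening tag has a src attribute) and embedded (inline code).
function classifyScripts(string $html): array
{
    $external = [];
    $embedded = [];
    // Group 1 captures the attribute text of each opening <script ...> tag.
    preg_match_all('#<script\b([^>]*)>.*?</script>#is', $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        // Requiring whitespace before "src" avoids matching e.g. data-src.
        if (preg_match('#\ssrc\s*=#i', $m[1])) {
            $external[] = $m[0];
        } else {
            $embedded[] = $m[0];
        }
    }
    return ['external' => $external, 'embedded' => $embedded];
}

// Sample input (assumed for illustration only).
$body = '<body><script src="a.js"></script><p>Hi</p><script>alert(1);</script></body>';
$scripts = classifyScripts($body);
```

Each array holds the complete tag as a string, so either group can be concatenated and re-inserted before </body> in whatever order is wanted.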

So far I've tried phpDOM, Simple HTML DOM and Ganon.

I've had no luck with any of them (I can find links and remove/print them, but fail with scripts every time!).

Alternative to
https://stackoverflow.com/questions/23414887/php-simple-html-dom-strip-scripts-and-append-to-bottom-of-body
(Sorry to repost, but it's been 24 hours of trying and failing, using alternative libs, failing more, etc.)

Based on the lovely RegEx answer from @alreadycoded.com, I managed to botch together the following:

$output = "<html><head></head><body><!-- Your stuff --></body></html>";
$content = '';
$js = '';

// 1) Grab <body>
preg_match_all('#(<body[^>]*>.*?<\/body>)#ims', $output, $body);
$content = implode('',$body[0]);

// 2) Find <script>s in <body>
preg_match_all('#<script(.*?)<\/script>#is', $content, $matches);
foreach ($matches[0] as $value) {
    $js .= '<!-- Moved from [body] --> '.$value;
}

// 3) Remove <script>s from <body>
$content2 = preg_replace('#<script(.*?)<\/script>#is', '<!-- Moved to [/body] -->', $content); 

// 4) Add <script>s to bottom of <body>
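// A callback keeps any "$1" or "\1" inside $js literal instead of treating it as a backreference
$content2 = preg_replace_callback('#<body(.*?)</body>#is', function ($m) use ($js) {
    return '<body'.$m[1].$js.'</body>';
}, $content2);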

// 5) Replace <body> with new <body>
$output = str_replace($content, $content2, $output);

Which does the job, and isn't that slow (a fraction of a second).

Shame none of the DOM stuff was working (or I wasn't up to wading through naffed objects and manipulating them).
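
For comparison, a DOM-based version might look roughly like the sketch below. It uses PHP's bundled DOM extension (DOMDocument) rather than any of the libraries named above, and the function name moveScriptsToBodyEnd and the sample page are hypothetical. Note that saveHTML() re-serializes the whole document, so the output can differ from the input in small ways (for example, a DOCTYPE is added when none was present).

```php
<?php
// Sketch: move every <script> found inside <body> to the end of <body>.
function moveScriptsToBodyEnd(string $html): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html; // nothing to do without a <body>
    }

    // Snapshot the live node list before moving anything.
    $scripts = iterator_to_array($body->getElementsByTagName('script'), false);
    foreach ($scripts as $script) {
        // appendChild() on an already attached node moves it, preserving order.
        $body->appendChild($script);
    }
    return $doc->saveHTML();
}

// Sample input (assumed for illustration only).
$page = '<html><head></head><body><script src="a.js"></script><p>Hi</p><script>var x = 1;</script></body></html>';
$result = moveScriptsToBodyEnd($page);
```

Inside the loop, $script->hasAttribute('src') would distinguish external scripts from embedded ones if the two groups need separate handling.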
