PHP script that can extract text between multiple title tags of a certain website?


Question

Hello, I found a few scripts and tried a few, but nothing really works for me. The best one I found was able to extract the title of the page, but there are many title tags on the page and it extracted only the first one. I need it to extract all titles. This is the code:

<?php
$text = file_get_contents("http://www.example.com");
if (preg_match('~<title[^>]*>(.*?)</title>~si', $text, $body)){
    echo $body[1];
}

?> 

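Recommended answer

Try this solution, which uses preg_match_all to collect every match instead of only the first one: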

$text = file_get_contents("http://www.example.com");
preg_match_all('/<title>.*?<\/title>/is', $text, $matches);
foreach($matches[0] as $m)
{
    echo htmlentities($m)."<br />";
}

For example:

// input text
$text = <<<EOT
<title>Lorem ipsum dolor</title>
sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua.
Ut enim <title>ad minim</title> veniam,
quis nostrud exercitation ullamco laboris nisi ut
aliquip <title>ex ea</title> commodo consequat.
EOT;

// solution
preg_match_all('/<title>(.+?)<\/title>/is', $text, $matches);
foreach($matches[0] as $m)
{
    echo htmlentities($m)."<br />";
}

Output:

<title>Lorem ipsum dolor</title>
<title>ad minim</title>
<title>ex ea</title>
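
The loop above prints each whole match, tags included. If only the text between the tags is needed, as the question asks, a minimal sketch could read the first capture group instead. The helper name extractTitleTexts and the sample text are assumptions made for illustration, and the pattern borrows the [^>]* part from the question's regex so that title tags carrying attributes still match:

```php
<?php
// Hypothetical helper (not from the original answer): returns the text
// found between every <title ...> and </title> pair in $html.
function extractTitleTexts(string $html): array
{
    // \b stops the tag name at "title"; [^>]* tolerates attributes;
    // (.*?) lazily captures the inner text; the s flag lets . span
    // newlines and the i flag ignores case.
    preg_match_all('~<title\b[^>]*>(.*?)</title>~si', $html, $matches);

    // $matches[0] holds the full matches, $matches[1] the captured text.
    return array_map('trim', $matches[1]);
}

// Sample input, assumed for illustration.
$text = <<<EOT
<title>Lorem ipsum dolor</title>
sit amet <TITLE lang="la">ad minim</TITLE> veniam,
aliquip <title>ex ea</title> commodo.
EOT;

foreach (extractTitleTexts($text) as $title) {
    echo htmlentities($title), "\n";
}
```

Regular expressions are fine for quick jobs like this, but they can misfire on comments, scripts, or unusual markup, so treat this as a sketch for simple pages.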

POST UPDATED (to reflect the changes in the question).

For example, suppose you want to load an "a.html" file like this one:

<html>
<body>
Lorem ipsum dolor
<a title="Ravellavegas.com Analysis" href="http://somewebsite.com/" />
sit amet, consectetur adipisicing elit, sed do eiusmod tempor
<a title="Articlesiteslist.com Analysis" href="http://someanotherwebsite.com/" />
incididunt ut labore et dolore magna aliqua.
</body>
</html>

Then you can write the script as follows:

<?php

$dom = new DOMDocument();
$dom->load('a.html');

foreach ($dom->getElementsByTagName('a') as $tag) {
    echo $tag->getAttribute('title').'<br/>';
}

?>

This will output:

Ravellavegas.com Analysis
Articlesiteslist.com Analysis
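
Note that DOMDocument::load() parses its input as XML, so it works for the well-formed sample above but may fail on real-world pages. A possible variant, sketched under the assumption that the markup arrives as a string (for instance from file_get_contents), uses DOMDocument::loadHTML(), which is more forgiving. The helper name extractLinkTitles and the sample markup are hypothetical:

```php
<?php
// Hypothetical helper (not from the original answer): collects the
// title attribute of every <a> element found in an HTML string.
function extractLinkTitles(string $html): array
{
    // Collect parser warnings internally instead of printing them,
    // since real-world HTML is rarely well formed.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = [];
    foreach ($dom->getElementsByTagName('a') as $tag) {
        // Skip links that have no title attribute at all.
        if ($tag->hasAttribute('title')) {
            $titles[] = $tag->getAttribute('title');
        }
    }
    return $titles;
}

// Sample markup, assumed for illustration; note the unclosed <p>.
$html = <<<EOT
<html>
<body>
Lorem ipsum dolor
<a title="Ravellavegas.com Analysis" href="http://somewebsite.com/">first</a>
<p>unclosed paragraph
<a title="Articlesiteslist.com Analysis" href="http://someanotherwebsite.com/">second</a>
<a href="http://example.com/">no title</a>
</body>
</html>
EOT;

foreach (extractLinkTitles($html) as $title) {
    echo $title, "\n";
}
```

Returning an array rather than echoing inside the helper keeps the extraction separate from the output, so the caller decides how to display or escape the titles.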
