Get title of website via link
Problem description
Notice how Google News has sources on the bottom of each article excerpt.
The Guardian - ABC News - Reuters - Bloomberg
I'm trying to imitate that.
For example, upon submitting the URL http://www.washingtontimes.com/news/2010/dec/3/debt-panel-fails-test-vote/
I want to return The Washington Times
How is this possible with PHP?
Solution
My answer expands on @AI W's suggestion to use the title of the page. The code below implements that idea.
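<?php
// Fetch a page and return the text of its <title> element, or null on failure.
function get_title($url) {
    $str = @file_get_contents($url); // returns false if the request fails
    if ($str === false || strlen($str) === 0) {
        return null;
    }
    $str = trim(preg_replace('/\s+/', ' ', $str)); // supports line breaks inside <title>
    // Case-insensitive, non-greedy match; also tolerates attributes on <title>
    if (preg_match('/<title\b[^>]*>(.*?)<\/title>/i', $str, $title)) {
        return trim($title[1]);
    }
    return null; // no <title> element found
}

// Example:
echo get_title("http://www.washingtontimes.com/");
?>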
OUTPUT
Washington Times - Politics, Breaking News, US and World News
As you can see, this is not exactly what Google displays, which leads me to believe that they take the URL's hostname and match it against their own list of source names.
http://www.washingtontimes.com/ => The Washington Times
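The hostname-lookup idea can be sketched as follows. This is only an illustration of the guess above, not what Google actually does: the function name get_source_name, the SOURCE_NAMES table, and its single entry are hypothetical, and a real list would have to be maintained by hand. It assumes PHP 7.1 or later.

```php
<?php
// Hypothetical lookup table: bare hostname => display name (illustrative entry only).
const SOURCE_NAMES = [
    'washingtontimes.com' => 'The Washington Times',
];

// Map a URL to a source name via its hostname; returns null if the URL has no host.
function get_source_name(string $url, array $names = SOURCE_NAMES): ?string
{
    $host = parse_url($url, PHP_URL_HOST); // null when the URL has no host part
    if (!is_string($host) || $host === '') {
        return null;
    }
    $host = strtolower($host);
    // Drop a leading "www." so both forms map to the same key
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }
    // Assumption: fall back to the bare hostname when the site is not in the list
    return $names[$host] ?? $host;
}

echo get_source_name('http://www.washingtontimes.com/news/2010/dec/3/debt-panel-fails-test-vote/'), PHP_EOL;
```

With the table above, the example prints The Washington Times. For sites missing from the list, the hostname fallback could be swapped for the get_title() approach.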