Fetching metadata from a URL

Problem description

I have used the Jsoup library to fetch metadata from a URL.

// Imports needed: org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.select.Elements
Document doc = Jsoup.connect("http://www.google.com").get();
String keywords = doc.select("meta[name=keywords]").first().attr("content");
System.out.println("Meta keyword : " + keywords);
String description = doc.select("meta[name=description]").get(0).attr("content");
Elements images = doc.select("img[src~=(?i)\\.(png|jpe?g|gif)]");

String src = images.get(0).attr("src");
System.out.println("Meta description : " + description);
System.out.println("Meta image URL : " + src);

But I want to do it on the client side using JavaScript.

Solution

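You can't do it purely on the client because of the cross-origin issue: the browser's same-origin policy stops a script from reading a page served by another origin. You need a server-side script to get the content of the page.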
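As a rough illustration of what such a server-side script might do, here is a minimal Node.js sketch. It assumes Node.js 18+ (for the global `fetch`), and the function names `parseAttributes`, `findMeta`, `extractMetadata` and `fetchMetadata` are hypothetical helpers made up for this example. Regex-based extraction is a simplification; a real HTML parser would be more robust.

```javascript
// Sketch only: all function names here are hypothetical, and regex-based
// extraction is a simplification of what a real HTML parser would do.

// Parse the attributes of a single tag string into a plain object.
function parseAttributes(tag) {
  const attrs = {};
  const re = /([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/g;
  let m;
  while ((m = re.exec(tag)) !== null) {
    const value = m[2] !== undefined ? m[2] : m[3] !== undefined ? m[3] : m[4];
    attrs[m[1].toLowerCase()] = value;
  }
  return attrs;
}

// Return the content of <meta name="..."> or null when it is absent.
function findMeta(html, name) {
  const tags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of tags) {
    const attrs = parseAttributes(tag);
    if ((attrs.name || '').toLowerCase() === name && attrs.content !== undefined) {
      return attrs.content;
    }
  }
  return null;
}

// Collect keywords, description and the first image URL from an HTML string.
function extractMetadata(html) {
  const img = html.match(/<img\b[^>]*>/i);
  return {
    keywords: findMeta(html, 'keywords'),
    description: findMeta(html, 'description'),
    image: img ? (parseAttributes(img[0]).src || null) : null,
  };
}

// Fetch a page on the server, where the browser's cross-origin restriction
// does not apply, and return its metadata.
async function fetchMetadata(pageUrl) {
  const response = await fetch(pageUrl);
  if (!response.ok) {
    throw new Error('Request failed with status ' + response.status);
  }
  return extractMetadata(await response.text());
}
```

A server endpoint could call `fetchMetadata(url)` and send the resulting object back to the browser as JSON.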

Alternatively, you can use YQL. In this case, YQL acts as the proxy.

For example:

// When the button is clicked, ask YQL to fetch and parse the page for us.
$('button').click(function () {
  var query = 'select * from html where url="' + $('input').val() + '" and xpath="*"';
  var url = 'https://query.yahooapis.com/v1/public/yql?q=' + encodeURIComponent(query);

  $.get(url, function (data) {
    var html = $(data).find('html');
    // Use .text() rather than .html() so remote content is not injected as markup.
    $('#kw').text(html.find('meta[name=keywords]').attr('content') || 'no keywords found');
    $('#des').text(html.find('meta[name=description]').attr('content') || 'no description found');
    $('#img').text(html.find('img').attr('src') || 'no image found');
  });
});

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>

<input type="text" placeholder="Type URL here" value="http://www.html5rocks.com/en/tutorials/cors/" />
<button>Get Meta Data</button>

<pre>
  <div>Meta Keyword: <div id="kw"></div></div>
  <div>Description: <div id="des"></div></div>
  <div>image: <div id="img"></div></div>
</pre>
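If you run your own server-side script instead of a third-party proxy, the client side could look roughly like the sketch below. It assumes a hypothetical same-origin endpoint `/metadata` that takes a `url` query parameter and returns JSON of the form `{ "keywords": ..., "description": ..., "image": ... }`. The endpoint, its response shape, and the function names are assumptions for illustration only. The element ids `kw`, `des` and `img` are the ones used in the markup above.

```javascript
// Hypothetical endpoint on your own origin; it is assumed to return JSON such as
// { "keywords": "...", "description": "...", "image": "..." }.
const METADATA_ENDPOINT = '/metadata';

// Build the request URL, encoding the target page URL as a query parameter.
function buildMetadataUrl(endpoint, pageUrl) {
  return endpoint + '?url=' + encodeURIComponent(pageUrl);
}

// Map the (assumed) JSON response to the strings shown on the page.
function toDisplayText(meta) {
  return {
    kw: meta.keywords || 'no keywords found',
    des: meta.description || 'no description found',
    img: meta.image || 'no image found',
  };
}

// Browser-only wiring: request the metadata from the same origin, then fill
// the elements with ids kw, des and img using textContent.
async function showMetadata(pageUrl) {
  const response = await fetch(buildMetadataUrl(METADATA_ENDPOINT, pageUrl));
  const text = toDisplayText(await response.json());
  for (const id of Object.keys(text)) {
    document.getElementById(id).textContent = text[id];
  }
}
```

Because the request goes to the page's own origin, the browser does not block it; the cross-origin fetch happens on the server.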
