How can you search Google programmatically with a Java API?


Question

Does anyone know if and how it is possible to search Google programmatically, and in particular whether there is a Java API for it?

Recommended answer

Some facts:

  1. Google offers a public search web service API which returns JSON: http://ajax.googleapis.com/ajax/services/search/web (see its documentation).

  2. Java offers java.net.URL and java.net.URLConnection to fire and handle HTTP requests.

  3. JSON can be converted in Java to a full-fledged Javabean object using an arbitrary Java JSON API. One of the best is Google Gson.

Now do the math:

import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLEncoder;
import com.google.gson.Gson;

public class GoogleSearch {

    public static void main(String[] args) throws Exception {
        String google = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=";
        String search = "stackoverflow";
        String charset = "UTF-8";

        // Encode the query, fetch the JSON response and map it onto the Javabean.
        URL url = new URL(google + URLEncoder.encode(search, charset));
        Reader reader = new InputStreamReader(url.openStream(), charset);
        GoogleResults results = new Gson().fromJson(reader, GoogleResults.class);

        // Show title and URL of 1st result.
        System.out.println(results.getResponseData().getResults().get(0).getTitle());
        System.out.println(results.getResponseData().getResults().get(0).getUrl());
    }
}
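
The URL built above is only valid for arbitrary queries because the search term goes through URLEncoder.encode. As a minimal sketch using only the standard library, this shows what that encoding does to a multi-word query. The class name SearchUrlDemo and the method buildSearchUrl are illustrative names, not part of any Google API:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SearchUrlDemo {

    // Appends the form-encoded query to a base URL that already ends in "q=".
    static String buildSearchUrl(String baseUrl, String query) throws UnsupportedEncodingException {
        return baseUrl + URLEncoder.encode(query, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String base = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=";
        // Spaces become "+" and "&" becomes "%26", so the query cannot break the URL.
        System.out.println(buildSearchUrl(base, "stack overflow & java"));
    }
}
```

Without the encoding step, the "&" in the query would be read as the start of a new request parameter.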

With this Javabean class representing the most important JSON data as returned by Google (it actually returns more data, but it's left up to you as an exercise to expand this Javabean code accordingly):

import java.util.List;

public class GoogleResults {

    private ResponseData responseData;
    public ResponseData getResponseData() { return responseData; }
    public void setResponseData(ResponseData responseData) { this.responseData = responseData; }
    public String toString() { return "ResponseData[" + responseData + "]"; }

    static class ResponseData {
        private List<Result> results;
        public List<Result> getResults() { return results; }
        public void setResults(List<Result> results) { this.results = results; }
        public String toString() { return "Results[" + results + "]"; }
    }

    static class Result {
        private String url;
        private String title;
        public String getUrl() { return url; }
        public String getTitle() { return title; }
        public void setUrl(String url) { this.url = url; }
        public void setTitle(String title) { this.title = title; }
        public String toString() { return "Result[url:" + url +",title:" + title + "]"; }
    }

}



See also:

  • How to fire and handle HTTP requests using java.net.URLConnection
  • How to convert JSON to Java

Update: since November 2010 (2 months after the above answer), the public search web service has been deprecated (the last day on which the service was offered was September 29, 2014). Your best bet is now to query http://www.google.com/search directly with an honest user agent and then parse the result using an HTML parser. If you omit the user agent, you get a 403 back. If you lie in the user agent and simulate a web browser (e.g. Chrome or Firefox), you get a much larger HTML response back, which wastes bandwidth and performance.
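
If you would rather stay with plain java.net classes for the request itself, the user agent can be set as an ordinary request header. This is a rough sketch under that assumption. The class name UserAgentDemo, the method prepare, and the bot identity string are placeholders:

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

class UserAgentDemo {

    // Prepares (but does not send) a GET request carrying an explicit User-Agent header.
    static HttpURLConnection prepare(String query, String userAgent) throws Exception {
        URL url = new URL("http://www.google.com/search?q=" + URLEncoder.encode(query, "UTF-8"));
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestProperty("User-Agent", userAgent);
        return connection;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder bot identity; replace it with your own name and homepage.
        HttpURLConnection connection = prepare("stackoverflow", "ExampleBot 1.0 (+http://example.com/bot)");
        // getResponseCode() is the call that actually sends the request.
        System.out.println("HTTP status: " + connection.getResponseCode());
    }
}
```

The header has to be set before the request is sent, which is why prepare returns the connection without reading from it.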

Here's a kickoff example using Jsoup as the HTML parser:

import java.net.URLDecoder;
import java.net.URLEncoder;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class GoogleSearchScraper {

    public static void main(String[] args) throws Exception {
        String google = "http://www.google.com/search?q=";
        String search = "stackoverflow";
        String charset = "UTF-8";
        String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!

        // The selector depends on Google's result markup and may need adjusting if that markup changes.
        Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().select(".g>.r>a");

        for (Element link : links) {
            String title = link.text();
            String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
            url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, url.indexOf('&')), "UTF-8");

            if (!url.startsWith("http")) {
                continue; // Ads/news/etc.
            }

            System.out.println("Title: " + title);
            System.out.println("URL: " + url);
        }
    }
}
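
The substring call in the loop assumes that every link contains both '=' and '&'. As a hedged alternative, here is a standard-library sketch that reads the q parameter explicitly and returns null when it is absent. The class name RedirectUrlDemo, the method extractTarget, and the sample link (including its made-up ei value) are illustrative only:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class RedirectUrlDemo {

    // Returns the decoded value of the "q" parameter, or null when the link has none.
    static String extractTarget(String href) throws UnsupportedEncodingException {
        int queryStart = href.indexOf('?');
        if (queryStart < 0) {
            return null;
        }
        for (String pair : href.substring(queryStart + 1).split("&")) {
            if (pair.startsWith("q=")) {
                return URLDecoder.decode(pair.substring(2), "UTF-8");
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Sample link in the redirect format described above; the "ei" value is made up.
        String href = "http://www.google.com/url?q=http%3A%2F%2Fstackoverflow.com%2F&sa=U&ei=abc";
        System.out.println(extractTarget(href)); // prints "http://stackoverflow.com/"
    }
}
```

Inside the loop, a null result could be skipped in the same way as the links that do not start with "http".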
      
