How can you search Google programmatically with a Java API?


Question

Does anyone know whether and how it is possible to search Google programmatically, and in particular whether there is a Java API for it?

Recommended answer

Some facts:

  1. Google offers a public search web service API which returns JSON: http://ajax.googleapis.com/ajax/services/search/web (see its documentation).

  2. Java offers java.net.URL and java.net.URLConnection to fire and handle HTTP requests.

  3. JSON can be converted in Java to a full-fledged Javabean object using any Java JSON API. One of the best is Google Gson.

Now do the math:

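import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLEncoder;
import com.google.gson.Gson;

public class GoogleSearch { // Illustrative wrapper class so that main() compiles.
    public static void main(String[] args) throws Exception {
        String google = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=";
        String search = "stackoverflow";
        String charset = "UTF-8";

        // URL-encode the search term before appending it to the query string.
        URL url = new URL(google + URLEncoder.encode(search, charset));
        try (Reader reader = new InputStreamReader(url.openStream(), charset)) {
            GoogleResults results = new Gson().fromJson(reader, GoogleResults.class);
            // Show title and URL of 1st result.
            System.out.println(results.getResponseData().getResults().get(0).getTitle());
            System.out.println(results.getResponseData().getResults().get(0).getUrl());
        }
    }
}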
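The snippet above passes the search term through URLEncoder before appending it to the query string. As a minimal sketch of why that matters, the helper below isolates that one step; the class name SearchUrls and the method name build are hypothetical and not part of any Google or Java API, and the Charset overload of URLEncoder.encode assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class SearchUrls {
    private SearchUrls() {}

    /** Appends the URL-encoded search term to a base URL that already ends in "q=". */
    static String build(String baseUrl, String search) {
        // Spaces become '+', and reserved characters such as '&' and '+' are percent-encoded,
        // so the search term cannot be mistaken for additional query parameters.
        return baseUrl + URLEncoder.encode(search, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String google = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=";
        System.out.println(build(google, "stackoverflow"));
        System.out.println(build(google, "C++ & Java"));
    }
}
```

Without the encoding step, a term such as "C++ & Java" would be split at the ampersand and the part after it would be read as a separate parameter.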

The GoogleResults class used above is a Javabean representing the most important JSON data returned by Google (it actually returns more data; expanding the Javabean accordingly is left as an exercise):

import java.util.List;

public class GoogleResults {

    private ResponseData responseData;
    public ResponseData getResponseData() { return responseData; }
    public void setResponseData(ResponseData responseData) { this.responseData = responseData; }
    public String toString() { return "ResponseData[" + responseData + "]"; }

    static class ResponseData {
        private List<Result> results;
        public List<Result> getResults() { return results; }
        public void setResults(List<Result> results) { this.results = results; }
        public String toString() { return "Results[" + results + "]"; }
    }

    static class Result {
        private String url;
        private String title;
        public String getUrl() { return url; }
        public String getTitle() { return title; }
        public void setUrl(String url) { this.url = url; }
        public void setTitle(String title) { this.title = title; }
        public String toString() { return "Result[url:" + url +",title:" + title + "]"; }
    }

}

See also:

  • How to fire and handle HTTP requests using java.net.URLConnection
  • How to convert JSON to Java

Update: since November 2010 (two months after the above answer), the public search web service has been deprecated (the last day on which the service was offered was September 29, 2014). Your best bet is now to query http://www.google.com/search directly with an honest user agent and then parse the result using an HTML parser. If you omit the user agent, you get a 403 back. If you lie in the user agent and simulate a web browser (e.g. Chrome or Firefox), you get a much larger HTML response back, which is a waste of bandwidth and performance.
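For the user-agent part alone, no HTML parser is needed. The following is a rough sketch using only java.net; the class name HonestUserAgent, the method name prepare, and the bot name and homepage are placeholders, not anything prescribed by Google. It only prepares the request: nothing is sent until a method such as getResponseCode() or getInputStream() is called on the returned connection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for illustration only.
final class HonestUserAgent {
    private HonestUserAgent() {}

    /** Creates a not-yet-connected HTTP connection that identifies itself with the given user agent. */
    static HttpURLConnection prepare(String url, String userAgent) {
        try {
            // openConnection() only creates the connection object; it does not contact the server.
            HttpURLConnection connection = (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setRequestProperty("User-Agent", userAgent);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder bot name and homepage; replace them with your own.
        HttpURLConnection connection = prepare(
                "http://www.google.com/search?q=stackoverflow",
                "ExampleBot 1.0 (+http://example.com/bot)");
        System.out.println(connection.getRequestProperty("User-Agent"));
    }
}
```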

Here's a kickoff example using Jsoup as the HTML parser:

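import java.net.URLDecoder;
import java.net.URLEncoder;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class GoogleScraper { // Illustrative wrapper class so that the snippet compiles.
    public static void main(String[] args) throws Exception {
        String google = "http://www.google.com/search?q=";
        String search = "stackoverflow";
        String charset = "UTF-8";
        String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!

        // Fetch the result page and select the result links; the selector depends on Google's HTML markup.
        Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().select(".g>.r>a");

        for (Element link : links) {
            String title = link.text();
            String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
            url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, url.indexOf('&')), charset);

            if (!url.startsWith("http")) {
                continue; // Ads/news/etc.
            }

            System.out.println("Title: " + title);
            System.out.println("URL: " + url);
        }
    }
}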
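The substring arithmetic in the loop assumes that every link contains both an '=' and a following '&', as in the "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>" format noted in the comment. A slightly more defensive variant of that single step might look like the sketch below; the class name RedirectLinks and the method name target are hypothetical, and the Charset overload of URLDecoder.decode assumes Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper for illustration only.
final class RedirectLinks {
    private RedirectLinks() {}

    /**
     * Extracts and decodes the text between the first '=' and the next '&' of a link,
     * returning empty when the link lacks that shape or the result is not an http(s) URL.
     */
    static Optional<String> target(String href) {
        int start = href.indexOf('=') + 1;   // 0 when there is no '=' at all
        int end = href.indexOf('&', start);  // -1 when no '&' follows
        if (start == 0 || end < 0) {
            return Optional.empty();
        }
        String decoded = URLDecoder.decode(href.substring(start, end), StandardCharsets.UTF_8);
        return decoded.startsWith("http") ? Optional.of(decoded) : Optional.empty();
    }

    public static void main(String[] args) {
        String href = "http://www.google.com/url?q=http%3A%2F%2Fstackoverflow.com%2F&sa=U&ei=abc";
        System.out.println(target(href).orElse("(skipped)"));
        System.out.println(target("http://www.google.com/preferences").orElse("(skipped)"));
    }
}
```

Returning an Optional keeps the "skip ads/news/etc." decision in the calling loop while avoiding a StringIndexOutOfBoundsException on links that do not match the expected format.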
