How do I get Google search results from UrlFetch in Google Apps Script


Problem description

I have been trying the following code:

var response = UrlFetchApp.fetch("https://www.google.com/#q=this+is+a+test");
var contentText = response.getContentText();
Logger.log(contentText);
var thisdoc = DocumentApp.getActiveDocument().getBody();
thisdoc.setText(contentText);
Logger.log(contentText.indexOf("About"));

But it only seems to return the header and an empty body, with none of the search results. At a minimum I should be able to see the "About xxx results" line that a browser shows at the top of the page, but it doesn't appear in the text, and indexOf doesn't return a non-negative position. I'm wondering whether the search results are populated after the page loads, meaning the body tag really would be empty, and if so, is there a workaround?

Edit: No, it doesn't break the TOS. This is a GAFE app (which is a business app), and for business accounts they offer both free and premium models of access to their API.

Solution

Google provides an API for authorized searches, so don't fuss with scraping web pages.

For example, you can use the Custom Search API with UrlFetchApp.fetch().

From the script editor, go to Resources -> Developer's Console Project... -> View Developer's Console. Create a new key for Public API access. Follow the instructions from the Custom Search API docs to create a Custom search engine. Enter the key and ID into the script where indicated. (More details below.)

This example script will return an object containing the results of a successful search; you can navigate the object to pull out whatever info you want.

/**
 * Use Google's customsearch API to perform a search query.
 * See https://developers.google.com/custom-search/json-api/v1/using_rest.
 *
 * @param {string} query   Search query to perform, e.g. "test"
 *
 * @returns {object}       See response data structure at
 *                         https://developers.google.com/custom-search/json-api/v1/reference/cse/list#response
 */
function searchFor( query ) {

  // Base URL to access customsearch
  var urlTemplate = "https://www.googleapis.com/customsearch/v1?key=%KEY%&cx=%CX%&q=%Q%";

  // Script-specific credentials & search engine
  var ApiKey = "--get from developer's console--";
  var searchEngineID = "--get from developer's console--";

  // Build custom url
  var url = urlTemplate
    .replace("%KEY%", encodeURIComponent(ApiKey))
    .replace("%CX%", encodeURIComponent(searchEngineID))
    .replace("%Q%", encodeURIComponent(query));

  var params = {
    muteHttpExceptions: true
  };

  // Perform search
  Logger.log( UrlFetchApp.getRequest(url, params) );  // Log query to be sent
  var response = UrlFetchApp.fetch(url, params);
  var respCode = response.getResponseCode();

  if (respCode !== 200) {
    throw new Error("Error " + respCode + " " + response.getContentText());
  }
  else {
    // Successful search, log & return results
    var result = JSON.parse(response.getContentText());
    Logger.log( "Obtained %s search results in %s seconds.",
               result.searchInformation.formattedTotalResults,
               result.searchInformation.formattedSearchTime);
    return result;
  }
}

Example log output:

[15-05-04 18:26:35:958 EDT] {
  "headers": {
    "X-Forwarded-For": "216.191.234.70"
  },
  "useIntranet": false,
  "followRedirects": true,
  "payload": "",
  "method": "get",
  "contentType": "application/x-www-form-urlencoded",
  "validateHttpsCertificates": true,
  "url": "https://www.googleapis.com/customsearch/v1?key=--redacted--&cx=--redacted--&q=test"
}
[15-05-04 18:26:36:812 EDT] Obtained 132,000,000 search results in 0.74 seconds.
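To illustrate what "navigating the object" can look like, here is a minimal sketch that pulls the title, link and snippet out of each hit and writes them into the active document, which is what the question was trying to do. The helper names summarizeResults, formatResults and writeResultsToDoc are made up for this example. It assumes the response carries its hits in an items array with title, link and snippet fields, as described in the response reference linked from the script above, and that items may be missing when nothing matched.

```javascript
/**
 * Reduce a Custom Search API response to an array of {title, link, snippet}.
 * Assumption: a search with no hits may omit `items`, so default to [].
 */
function summarizeResults(result) {
  var items = (result && result.items) || [];
  return items.map(function (item) {
    return { title: item.title, link: item.link, snippet: item.snippet };
  });
}

/**
 * Format summarized results as plain text, one numbered entry per hit,
 * with the link on an indented second line.
 */
function formatResults(summaries) {
  return summaries.map(function (s, i) {
    return (i + 1) + ". " + s.title + "\n   " + s.link;
  }).join("\n");
}

/**
 * Apps Script only: run a search and write the hits into the active document.
 * Relies on searchFor() from the script above.
 */
function writeResultsToDoc(query) {
  var text = formatResults(summarizeResults(searchFor(query)));
  DocumentApp.getActiveDocument().getBody().setText(text);
}
```

Calling writeResultsToDoc("this is a test") from a script bound to a document should replace the document body with the numbered list. The first two helpers are plain JavaScript, so they can be tried against a hand-built object without making any API calls.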

 


Identify your application to Google with API key

(excerpted from Google's documentation.)

  1. Go to the Google Developers Console.

  2. Select a project, or create a new one.

  3. In the sidebar on the left, expand APIs & auth. Next, click APIs. In the list of APIs, make sure the status is ON for the Custom Search API.

    . . .

  4. In the sidebar on the left, select Credentials.

    Create your application's API key by clicking Create new Key under Public API access. For Google Script use, create a Browser key.

  5. Once the Key for browser applications is created, copy the API key into your code.

Create a custom search engine

Follow the instructions in the Custom Search API docs. Once you've created your custom search engine, copy the Search engine ID into your code.
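If you would rather not paste the key and ID directly into the source, one possible alternative is to keep them in the script's properties and read them at run time. The sketch below is only an illustration: the function name loadSearchCredentials and the property names CUSTOM_SEARCH_API_KEY and CUSTOM_SEARCH_ENGINE_ID are hypothetical, and props can be any object with a getProperty(name) method, which in Apps Script would be PropertiesService.getScriptProperties().

```javascript
/**
 * Read the API key and search engine ID from a property store.
 * `props` is any object exposing getProperty(name); in Apps Script, pass
 * PropertiesService.getScriptProperties().
 * The property names used here are hypothetical; pick your own.
 */
function loadSearchCredentials(props) {
  var apiKey = props.getProperty("CUSTOM_SEARCH_API_KEY");
  var searchEngineId = props.getProperty("CUSTOM_SEARCH_ENGINE_ID");

  // getProperty() yields null for a missing key, so fail early and clearly.
  if (!apiKey || !searchEngineId) {
    throw new Error(
      "Set CUSTOM_SEARCH_API_KEY and CUSTOM_SEARCH_ENGINE_ID in the script properties first."
    );
  }
  return { apiKey: apiKey, searchEngineId: searchEngineId };
}
```

Inside searchFor() the two hard-coded strings could then be replaced with something like var creds = loadSearchCredentials(PropertiesService.getScriptProperties()); followed by creds.apiKey and creds.searchEngineId when building the URL.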
