Specifying a Date Range in Google Custom Search API


Question

It's trivial to search a certain website for a set of keywords within a specific date range: in the Google search box you enter

desired-keywords site:desired-website

and then pick the date range from the Tools menu.

For example, searching www.cnn.com for "arab spring" between 1 Jan 2011 and 31 Dec 2013:

As the second picture shows, there are about 773 results. The search URI looks like this:

The date range shows up in the cd_min and cd_max fields of the tbs parameter (which appears in the URI whenever the Tools menu is used).

I would like to get the same functionality programmatically, using Google's Custom Search API client for Python.

I defined a custom search engine:

Then I tried different suggestions I found on the web and on Stack Overflow:

This post about date range search using the Google Custom Search API refers to here and suggests using the 'sort' parameter for this (sort = 'date:r:yyyymmdd:yyyymmdd'). It did not work: "totalResults" is "44900".
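The sort value quoted above follows the pattern date:r:yyyymmdd:yyyymmdd. A minimal sketch of building that string from Python date objects, so that both bounds are zero-padded and in order. The helper name build_date_range_sort is hypothetical and not part of any Google library; it only formats the string and sends no request:

```python
from datetime import date


def build_date_range_sort(start: date, end: date) -> str:
    """Return a 'date:r:yyyymmdd:yyyymmdd' string for the `sort` parameter.

    Hypothetical helper for illustration only; it formats the value and
    does not call the Custom Search API.
    """
    if start > end:
        raise ValueError("start must not be after end")
    # date objects accept strftime codes inside a format spec.
    return f"date:r:{start:%Y%m%d}:{end:%Y%m%d}"


print(build_date_range_sort(date(2011, 1, 1), date(2013, 12, 31)))
# date:r:20110101:20131231
```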

This post suggests using the date restrict field, which does not work either.
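One possible reason, offered as an assumption rather than a confirmed diagnosis: as far as I know, the API's dateRestrict parameter takes a relative window such as d[number], w[number], m[number] or y[number] counted back from the present, so it can say "the last N days" but cannot put a fixed end date on the range. A sketch of the closest value it can express, using a hypothetical helper name:

```python
from datetime import date


def date_restrict_since(start: date, today: date) -> str:
    """Return a 'd<N>' value covering the days from `start` up to `today`.

    Hypothetical helper. It assumes dateRestrict accepts 'd[number]'
    meaning "the past N days"; note there is no way to set an end date.
    """
    days = (today - start).days
    if days < 1:
        raise ValueError("start must be before today")
    return f"d{days}"


# `today` is passed in explicitly so the result is reproducible.
print(date_restrict_since(date(2011, 1, 1), today=date(2011, 1, 31)))
# d30
```

A window built this way would also include everything published after 31 Dec 2013, which is why it cannot reproduce the range picked in the Tools menu.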

Well! Any working solution?

Recommended answer

I might be late, but for other people searching for a solution, you can try this:

from googleapiclient.discovery import build

my_api_key = "YOUR_API_KEY"
my_cse_id = "YOUR_CSE_ID"

def google_results_count(query):
    # Build a client for the Custom Search API.
    service = build("customsearch", "v1",
                    developerKey=my_api_key)
    # sort="date:r:<start>:<end>" limits results to the given yyyymmdd range.
    result = service.cse().list(q=query, cx=my_cse_id, sort="date:r:20110101:20131231").execute()
    # totalResults comes back as a string.
    return result["searchInformation"]["totalResults"]

print(google_results_count('arab spring site:www.cnn.com'))

This code returns roughly 1,500 or more results.
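Since "totalResults" arrives as a string, comparing counts is easier after converting it. A small sketch of pulling the count and the returned hits out of a response dictionary; the function name summarize_response is hypothetical, and the sample dictionary below is made up purely to show the shape (it is not real API output):

```python
def summarize_response(response: dict) -> tuple[int, list[tuple[str, str]]]:
    """Return (total count as int, list of (title, link)) from a response.

    Hypothetical helper; it assumes the response has the
    'searchInformation' / 'totalResults' keys used in the answer and an
    optional 'items' list whose entries carry 'title' and 'link'.
    """
    total = int(response["searchInformation"]["totalResults"])
    # 'items' is absent when a query has no hits, so default to empty.
    hits = [(item["title"], item["link"]) for item in response.get("items", [])]
    return total, hits


# Made-up response for illustration only.
sample = {
    "searchInformation": {"totalResults": "1500"},
    "items": [{"title": "Example title", "link": "https://example.com/a"}],
}
total, hits = summarize_response(sample)
print(total, hits)
```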

That is still far from the web results; Google has an explanation why.

Also, if you haven't set up your CSE to search the entire web, here's a guide on how to set it up.

P.S. If you still want the results/data from the web version, you can scrape them with BeautifulSoup or another library.
