Specifying a Date Range in Google Custom Search API
Question
It's trivial to search for a set of keywords on a certain website within a specific date range: in the Google search box you enter
desired-keywords site:desired-website
then pick the date range from the Tools menu.
For example, searching for "arab spring" on www.cnn.com between 1 Jan 2011 and 31 Dec 2013:
As you can see in the second picture, there are about 773 results. The search URI looks like this:
The date range can be seen in the cd_min and cd_max fields of the tbs parameter (which appears in the URI whenever the Tools menu is used).
I would like to get the same functionality programmatically using Google's Custom Search API client for Python.
I defined a custom search engine:
Then I tried different suggestions I found on the web and on Stack Overflow:
This post about date range search using the Google Custom Search API (which refers to here) suggests using the 'sort' parameter to do the job (sort = 'date:r:yyyymmdd:yyyymmdd'). It did not work: "totalResults" is "44900".
This post suggests using the date restrict field, which does not work either.
Well! Is there any working solution?
Recommended answer
I might be late, but for other people searching for a solution, you can try this:
from googleapiclient.discovery import build

my_api_key = "YOUR_API_KEY"
my_cse_id = "YOUR_CSE_ID"

def google_results_count(query):
    # Build a client for the Custom Search API (v1).
    service = build("customsearch", "v1", developerKey=my_api_key)
    # Restrict results to 2011-01-01 through 2013-12-31 via the sort parameter.
    result = service.cse().list(
        q=query, cx=my_cse_id, sort="date:r:20110101:20131231"
    ).execute()
    # totalResults comes back as a string, e.g. "44900".
    return result["searchInformation"]["totalResults"]

print(google_results_count('arab spring site:www.cnn.com'))
This code will return 1,500 or more results.
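The sort value in the code above follows the pattern date:r:yyyymmdd:yyyymmdd mentioned in the question. As a minimal sketch, the string could be built from date objects instead of being typed by hand. The helper name date_range_sort is hypothetical and not part of the Google client library; it only formats the string and makes no API call.

```python
from datetime import date


def date_range_sort(start, end):
    """Return a sort value of the form date:r:yyyymmdd:yyyymmdd.

    Hypothetical helper for illustration; it only builds the string.
    """
    if start > end:
        raise ValueError("start date must not be after end date")
    # date objects accept strftime-style format specs in str.format
    return "date:r:{:%Y%m%d}:{:%Y%m%d}".format(start, end)


print(date_range_sort(date(2011, 1, 1), date(2013, 12, 31)))
```

The returned string could then be passed as the sort argument to service.cse().list(...) in place of the hard-coded value.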
That is still far from the web result count; Google has an explanation why.
Also, if you haven't set up your CSE to search the entire web, here's a guide on how to set it up.
P.S. If you still want the results/data from the web version, you can scrape them with BeautifulSoup or another library.