GitHub search result limit

Question

I need to run a very large search on GitHub to gather statistics for my thesis.

For example, I need to explore a large number of Android projects on GitHub, but the site limits search results to 1000 (e.g. https://github.com/search?l=java&q=onCreate&ref=searchresults&type=Code&utf8=%E2%9C%93). I also tried the Java GitHub API, using the library org.eclipse.egit.github.core.client.GitHubClient and the method GitHubClient.searchRepositories(), but the number of results is limited there as well.

Does anyone know how to get all the results?

Recommended answer

The Search API returns up to 1000 results per query (in total, across all pages), as documented here:

https://developer.github.com/v3/search/#about-the-search-api

However, there's a neat trick you can use to fetch more than 1000 results when running a repository search: split the search into segments by the date on which the repositories were created. For example, you could first search for repositories created in the first week of October 2013, then the second week, then September, and so on.
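As a rough illustration of the segmenting idea, the sketch below builds one query string per week using the created:START..END date-range qualifier from GitHub's search syntax. The base query "android language:java" and the helper names created_segments and segment_query are assumptions made up for this example, not part of the original answer.

```python
from datetime import date, timedelta


def created_segments(start, end, days=7):
    """Yield (first_day, last_day) pairs that cover start..end inclusive."""
    one_day = timedelta(days=1)
    current = start
    while current <= end:
        # Each segment spans `days` days, clipped to the overall end date.
        last = min(current + timedelta(days=days) - one_day, end)
        yield current, last
        current = last + one_day


def segment_query(base_query, first_day, last_day):
    """Append a created: date-range qualifier to a search query."""
    return f"{base_query} created:{first_day.isoformat()}..{last_day.isoformat()}"


# Hypothetical base query; replace it with the search you actually need.
queries = [
    segment_query("android language:java", first, last)
    for first, last in created_segments(date(2013, 10, 1), date(2013, 10, 14))
]
```

Here queries holds two strings, one for 2013-10-01..2013-10-07 and one for 2013-10-08..2013-10-14, each of which can be submitted as a separate search.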

Because each search is restricted to a narrow period, it will probably match fewer than 1000 results, so you can retrieve all of them. If a period still reports more than 1000 results, narrow it further until every segment fits under the limit.

https://help.github.com/articles/searching-repositories/#search-based-on-when-a-repository-was-created-or-last-updated
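The narrowing step can be sketched as a recursive split: if a date range reports more results than the cap, cut it in half and check each half again. This is only a minimal sketch. The count_results argument is a hypothetical callable that you would implement yourself (for example by reading the total count reported by a search request), and fake_count below is invented test data, not real GitHub numbers.

```python
from datetime import date, timedelta

RESULT_CAP = 1000  # per-query limit described in the answer


def split_until_under_cap(first_day, last_day, count_results):
    """Return date ranges whose reported result count fits under the cap.

    count_results(first_day, last_day) -> int is a hypothetical callable
    supplied by the caller. A single day that still exceeds the cap is
    returned as-is, because this sketch does not split below one day.
    """
    total = count_results(first_day, last_day)
    if total <= RESULT_CAP or first_day == last_day:
        return [(first_day, last_day)]
    # Split the range into two halves and narrow each one separately.
    middle = first_day + timedelta(days=(last_day - first_day).days // 2)
    return (
        split_until_under_cap(first_day, middle, count_results)
        + split_until_under_cap(middle + timedelta(days=1), last_day, count_results)
    )


def fake_count(first_day, last_day):
    """Stand-in counter for the demo: pretends 200 repositories per day."""
    return ((last_day - first_day).days + 1) * 200


ranges = split_until_under_cap(date(2013, 10, 1), date(2013, 10, 8), fake_count)
```

With the fake counter, the eight-day range (1600 pretend results) is split into two four-day ranges of 800 results each, both of which fit under the cap.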

You should be able to automate this via the API.
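One possible way to automate it, sketched with only the Python standard library, is to call the repository search endpoint of the GitHub REST API directly and page through each date-restricted query. The function names, the User-Agent string, and the stopping rule are assumptions for illustration. The sketch sends unauthenticated requests and ignores rate limiting, which a real script would need to handle.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.github.com/search/repositories"
PER_PAGE = 100   # largest page size the API accepts
MAX_PAGES = 10   # 10 pages of 100 items reach the 1000-result limit


def build_search_url(query, page=1, per_page=PER_PAGE):
    """Build the request URL for one page of a repository search."""
    params = urllib.parse.urlencode({"q": query, "per_page": per_page, "page": page})
    return f"{SEARCH_URL}?{params}"


def fetch_page(query, page=1):
    """Fetch one page of results and return the decoded JSON response."""
    request = urllib.request.Request(
        build_search_url(query, page),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "search-segment-sketch",  # placeholder name
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all(query):
    """Collect every item for a query that is expected to fit under the limit."""
    items = []
    for page in range(1, MAX_PAGES + 1):
        data = fetch_page(query, page)
        items.extend(data["items"])
        # Stop when a page comes back empty or everything reported is collected.
        if not data["items"] or len(items) >= data["total_count"]:
            break
    return items
```

Calling fetch_all("android created:2013-10-01..2013-10-07") would then gather the repositories for a single one-week segment, and repeating the call for each segment builds up the full result set.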
