Get the top 10 JavaScript/open-source repositories ranked by stars using the GitHub GraphQL API

Question

I would like to get the top 10 JavaScript/open-source repositories ranked by stars (and some related information) using the GitHub GraphQL API in a Python project. I have this query so far:

query{
  search(type: REPOSITORY, query: "language:javascript", first:10) {
    userCount
    edges {
      node {
        ... on Repository {
          name
          url
          stargazers {
            totalCount
          }
          owner{
            login
          }
        }
      }
    }
  }
}

The problem is that it does not always return the same result: each query returns 10 seemingly random repositories ordered by star count, rather than the absolute top 10.

On top of that, I'd like to get only the ones that are open source.

I use the query

query{
licenses{name}
}

to get a list of licenses, but I don't know whether this list is exhaustive (it seems to be missing some licenses, such as MIT). According to the documentation, it should:

Return a list of known open source licenses.

How can I get an exhaustive list of licenses and add it to my main query above to make my search more precise?

I can't seem to find clear answers, as the documentation for the GitHub GraphQL API is scarce and quite vague.

Thanks.

Recommended answer

I got a partial explanation from GitHub Support of why the results are inconsistent: search queries are cut off by a timeout when they run for too long.

Some queries are computationally expensive for our search infrastructure to execute. To keep search fast for everyone, we limit how long any individual query can run. In rare situations when a query exceeds the time limit, search returns all matches that were found prior to the timeout and informs you that a timeout occurred.

Reaching a timeout does not necessarily mean that search results are incomplete. It just means that the query was discontinued before it searched through all possible data.

Our team wrote about this here:

https://help.github.com/articles/troubleshooting-search-queries/#potential-timeouts

Given this reality, these timeouts may cause inconsistencies while paging through the results. We see how this could be improved in future iterations of search, so we've let our team know so they're aware, though we can't make any promises on specific changes.

As suggested by GitHub Support, adding a star threshold to the search, as in query: "language:javascript stars:>1600", consistently returns the top 10 repositories ordered by stars. The value 1600 is roughly the minimum star count of the top 3000 repositories; the threshold just needs to be high enough to narrow the search.
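Since the question mentions a Python project, here is a minimal sketch of how the narrowed search could be sent from Python using only the standard library. The helper names (build_payload, fetch_top_repos), the use of GraphQL variables, and the optional license_key parameter are assumptions made for illustration and are not part of the support reply. The license_key option relies on the license: search qualifier from GitHub's repository search syntax; whether it covers every license you care about is not confirmed here.

```python
import json
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# Same fields as the query in the question, with the search string and
# page size passed in as GraphQL variables.
SEARCH_QUERY = """
query($searchQuery: String!, $first: Int!) {
  search(type: REPOSITORY, query: $searchQuery, first: $first) {
    edges {
      node {
        ... on Repository {
          name
          url
          stargazers { totalCount }
          owner { login }
        }
      }
    }
  }
}
"""


def build_payload(language="javascript", min_stars=1600, first=10, license_key=None):
    """Build the JSON body for the search (hypothetical helper)."""
    search_query = f"language:{language} stars:>{min_stars}"
    if license_key is not None:
        # Assumption: narrow further with the `license:` search qualifier.
        search_query += f" license:{license_key}"
    return {
        "query": SEARCH_QUERY,
        "variables": {"searchQuery": search_query, "first": first},
    }


def fetch_top_repos(token, **kwargs):
    """POST the query to the GraphQL endpoint and return the repository nodes."""
    data = json.dumps(build_payload(**kwargs)).encode("utf-8")
    request = urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=data,
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return [edge["node"] for edge in body["data"]["search"]["edges"]]
```

Calling fetch_top_repos(token) with a personal access token would then return a list of dictionaries with the name, url, stargazers, and owner fields requested above. The min_stars threshold is kept as a parameter because, as noted above, the right value depends on how many repositories currently exceed it.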
