Querying all commits in a single repository with the GitHub GraphQL API v4


Problem description

I'm trying to query all of the commits to a specified repository on GitHub via GitHub's GraphQL API v4.

I only want to pull the dates they were committed at, in order to estimate the total time that was contributed to that repository (something along the lines of git-hours).

Here's my initial query (note: you can try running it in the Explorer):

{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        history {
          nodes {
            committedDate
          }
        }
      }
    }
  }
}

Unfortunately, it returns only the latest 100 commits because of the API's resource limitations:

Node Limit

To pass schema validation, all GraphQL API v4 calls must meet these standards:

  • Clients must supply a first or last argument on any connection.
  • Values of first and last must be within 1-100.
  • Individual calls cannot request more than 500,000 total nodes.

Since I'm not supplying a first or last argument, the API assumes I'm querying for history(first: 100), and I can't query more than 100 nodes in a single connection.

However, since the total node limit is much higher (500,000), I should be able to query commits in groups of 100 until I have all of them.

I was able to query the latest 200 commits using this query:

{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        total: history {
          totalCount
        }
        first100: history(first: 100) {
          edges {
            cursor
            node {
              committedDate
            }
          }
        }
        second100: history(after: "700f17be6752a13a8ead86458e343d2d637ee3ee 99") {
          edges {
            cursor
            node {
              committedDate
            }
          }
        }
      }
    }
  }
}

However, I had to manually enter the cursor string that I'm passing to the second connection: second100: history(after: "cursor-string") {}.

How can I run this connection recursively until I have the committedDate of every commit in the repository?

Recommended answer

Although there could be a way of recursively querying all commits on a repo, I couldn't find a working solution.

My requirement was:

I only want to pull the dates they were committed at, in order to estimate the total time that was contributed to that repository (something along the lines of git-hours).

Since I couldn't query the full commit history, I had to assume that the time contributed over the latest 100 commits is the same as over any other 100 commits.

So I queried:

  • the commit history's totalCount
  • the committedDate of the latest 100 commits

{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        history {
          totalCount
          nodes {
            committedDate
          }
        }
      }
    }
  }
}

Run today, the query returns:

{
  "data": {
    "repository": {
      "object": {
        "history": {
          "totalCount": 10807,
          "nodes": [
            {
              "committedDate": "2019-04-04T01:15:33Z"
            },
            {
              "committedDate": "2019-04-03T22:07:09Z"
            },
            {
              "committedDate": "2019-04-03T20:21:27Z"
            },
            // 97 other committed dates
          ]
        }
      }
    }
  }
}

Estimating total contributed time

I estimated the time contributed in the latest 100 commits using an algorithm similar to the one explained in git-hours's README.
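
The answer does not show its estimation code, so the following is only a minimal sketch of a git-hours-style heuristic, not the author's implementation. It assumes that gaps between consecutive commits shorter than a threshold count as work time, and that every longer gap (and the very first commit) adds a fixed allowance. The function name `estimateHours` and both 120-minute defaults are assumptions chosen for illustration.

```javascript
// Hypothetical helper: estimate hours worked from ISO 8601 commit dates.
// Both thresholds are illustrative assumptions, not values from the answer.
function estimateHours(
  committedDates,
  maxCommitDiffMinutes = 120,
  firstCommitAdditionMinutes = 120
) {
  if (committedDates.length === 0) return 0;

  // Sort timestamps (milliseconds) oldest first.
  const times = committedDates
    .map((date) => Date.parse(date))
    .sort((a, b) => a - b);

  // Allowance for work done before the first commit.
  let minutes = firstCommitAdditionMinutes;

  for (let i = 1; i < times.length; i++) {
    const diffMinutes = (times[i] - times[i - 1]) / 60000;
    // Short gap: same session, count the gap itself.
    // Long gap: new session, count the fixed allowance instead.
    minutes +=
      diffMinutes < maxCommitDiffMinutes
        ? diffMinutes
        : firstCommitAdditionMinutes;
  }
  return minutes / 60;
}

// Usage: 120 (first commit) + 60 (short gap) + 120 (long gap) = 300 minutes.
console.log(
  estimateHours([
    "2019-01-01T10:00:00Z",
    "2019-01-01T11:00:00Z",
    "2019-01-01T15:00:00Z",
  ])
); // 5
```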

Then I scaled it to totalCount:

const timeContributedTotal = timeContributedLatest100 * totalCount / 100;
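
The same scaling can be written as a small function that also covers repositories with fewer than 100 commits, where the sample size is smaller than 100. This is a sketch under the answer's own assumption that every commit costs about the same time as the sampled ones; `scaleToTotal` and its parameter names are hypothetical.

```javascript
// Hypothetical helper: extrapolate hours measured on a sample of commits
// to the whole history, assuming a constant average time per commit.
function scaleToTotal(hoursInSample, sampleSize, totalCount) {
  if (sampleSize === 0) return 0; // nothing sampled, nothing to extrapolate
  return (hoursInSample * totalCount) / sampleSize;
}

// Usage: 50 hours over 100 sampled commits, 1000 commits in total.
console.log(scaleToTotal(50, 100, 1000)); // 500
```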

I estimated that 13152 hours had been put into Twitter's Bootstrap as of today, whereas git-hours estimated 9959 hours 7 months ago. That doesn't sound too bad.

As for React, I get a total of 15097 hours, or 629 days.

The estimate is very rough, but it's as close as I could get to what I needed. Feel free to comment or answer if you see any possible improvement.
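
One possible improvement is to page through the whole history instead of extrapolating. GitHub's connections expose a pageInfo object with hasNextPage and endCursor, so the cursor does not have to be typed in by hand: each response supplies the value to pass as after in the next request. The sketch below is an illustration, not the answer author's method. `fetchPage` is a hypothetical function supplied by the caller; it is assumed to send the query with the given variables to the GraphQL endpoint and to resolve to the history object of the response.

```javascript
// One page of up to 100 commits; $cursor is null for the first page.
const HISTORY_QUERY = `
  query ($owner: String!, $name: String!, $cursor: String) {
    repository(owner: $owner, name: $name) {
      object(expression: "master") {
        ... on Commit {
          history(first: 100, after: $cursor) {
            pageInfo { hasNextPage endCursor }
            nodes { committedDate }
          }
        }
      }
    }
  }
`;

// fetchPage(query, variables) is a hypothetical, caller-supplied async
// function that resolves to the `history` object of the response.
async function collectCommittedDates(fetchPage, owner, name) {
  const dates = [];
  let cursor = null;
  let hasNextPage = true;

  while (hasNextPage) {
    const history = await fetchPage(HISTORY_QUERY, { owner, name, cursor });
    for (const node of history.nodes) {
      dates.push(node.committedDate);
    }
    // The server tells us whether to continue and where to resume.
    hasNextPage = history.pageInfo.hasNextPage;
    cursor = history.pageInfo.endCursor;
  }
  return dates;
}
```

With 100 commits per page, a history of 10807 commits would take 109 sequential requests, so rate limits and run time are worth keeping in mind before choosing this over the extrapolation.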
