Querying all commits in a single repository with the GitHub GraphQL API v4


Question

I'm trying to query all of the commits to a specified repository on GitHub via GitHub's GraphQL API v4.

I only want to pull the dates they were committed at, in order to estimate the total time contributed to that repository (something along the lines of git-hours).

Here's my initial query (note: you can try running it in the Explorer):

{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        history {
          nodes {
            committedDate
          }
        }
      }
    }
  }
}

Unfortunately, it returns only the latest 100 commits because of the API's resource limitations:

Node Limit

To pass schema validation, all GraphQL API v4 calls must meet these standards:

  • Clients must supply a first or last argument on any connection.
  • Values of first and last must be within 1-100.
  • Individual calls cannot request more than 500,000 total nodes.

So, since I'm not supplying a first or last argument, the API assumes I'm querying for history(first: 100). And I can't query more than 100 nodes in a single connection.

However, since the total node limit is much higher (500,000), I should be able to query commits in groups of 100 until I have all of them.

I was able to query the latest 200 commits using this query:

{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        total: history {
          totalCount
        }
        first100: history(first: 100) {
          edges {
            cursor
            node {
              committedDate
            }
          }
        }
        second100: history(after: "700f17be6752a13a8ead86458e343d2d637ee3ee 99") {
          edges {
            cursor
            node {
              committedDate
            }
          }
        }
      }
    }
  }
}

However, I had to manually enter the cursor string that I'm passing in the second connection: second100: history(after: "cursor-string") {}.

How can I run this connection recursively until I have queried the committedDate of every commit in the repository?

Recommended answer

Although there could be a way of recursively querying all commits on a repo, I couldn't find a working solution.

My requirement was:

I only want to pull the dates they were committed at, in order to estimate the total time contributed to that repository (something along the lines of git-hours).

Since I couldn't query the full commit history, I had to assume that the time contributed over the latest 100 commits is representative of any 100 commits.

I therefore queried the API for:

  • The totalCount of the commit history
  • The committedDate of the latest 100 commits

{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        history {
          totalCount
          nodes {
            committedDate
          }
        }
      }
    }
  }
}

Run today, the query returns:

{
  "data": {
    "repository": {
      "object": {
        "history": {
          "totalCount": 10807,
          "nodes": [
            {
              "committedDate": "2019-04-04T01:15:33Z"
            },
            {
              "committedDate": "2019-04-03T22:07:09Z"
            },
            {
              "committedDate": "2019-04-03T20:21:27Z"
            },
            // 97 other committed dates
          ]
        }
      }
    }
  }
}

Estimating total contributed time

I estimated the time contributed in the latest 100 commits using an algorithm similar to the one explained in git-hours's README.
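To make that step concrete, here is a minimal sketch of a session-based estimate in the spirit of git-hours. It is not git-hours's actual code: the function name estimateHours is hypothetical, the two 120-minute thresholds are assumed defaults, and the sketch ignores authors entirely, treating all commits as one stream.

```javascript
// Rough estimate of hours worked from a list of ISO 8601 commit dates.
// Assumptions (not taken from the original answer):
//   - commits closer together than maxCommitDiffMinutes belong to one session,
//     and the gap between them counts as time worked;
//   - a larger gap starts a new session, whose first commit is credited
//     with firstCommitAddMinutes of work.
function estimateHours(committedDates, maxCommitDiffMinutes = 120, firstCommitAddMinutes = 120) {
  if (committedDates.length === 0) return 0;

  // Sort oldest first so consecutive differences are non-negative.
  const times = committedDates.map((d) => Date.parse(d)).sort((a, b) => a - b);

  // The very first commit opens the first session.
  let minutes = firstCommitAddMinutes;
  for (let i = 1; i < times.length; i++) {
    const diffMinutes = (times[i] - times[i - 1]) / 60000;
    minutes += diffMinutes < maxCommitDiffMinutes ? diffMinutes : firstCommitAddMinutes;
  }
  return minutes / 60;
}

// The three dates shown in the response above: one 105.7-minute gap (same
// session) and one gap of over three hours (new session).
console.log(
  estimateHours([
    "2019-04-04T01:15:33Z",
    "2019-04-03T22:07:09Z",
    "2019-04-03T20:21:27Z",
  ]).toFixed(2)
); // prints "5.76"
```

Feeding it the 100 committedDate values from the query would give the timeContributedLatest100 figure used in the next step.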

I then scaled it to the totalCount:

const timeContributedTotal = timeContributedLatest100 * totalCount / 100;
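The hard-coded 100 only holds when the query really returned 100 commits. As a small sketch under that observation, the scaling could use the actual sample size instead; scaleToTotal is a hypothetical helper name, not something from the original answer.

```javascript
// Scale an estimate made on a sample of commits up to the whole history.
// sampleSize is the number of commits the estimate was computed from
// (nodes.length in the response), which is below 100 for small repositories.
function scaleToTotal(hoursInSample, sampleSize, totalCount) {
  if (sampleSize <= 0) return 0; // nothing sampled, nothing to scale
  return (hoursInSample * totalCount) / sampleSize;
}
```

With sampleSize equal to 100 this reduces to the one-line formula above.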

I estimated that 13152 hours had been put into Twitter's Bootstrap as of today, whereas git-hours estimated 9959 hours 7 months ago. Doesn't sound too bad.

As for React, I get a total of 15097 hours, or 629 days.

The estimate is very rough, but it's as close as I could get to what I needed. Feel free to comment or answer if you see any possible improvement.
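One possible improvement, offered as an untested sketch rather than a confirmed solution: instead of recursing inside a single query, the client could loop, asking each page for pageInfo { hasNextPage endCursor } and passing endCursor as the after argument of the next request. The names fetchAllCommitDates, makeGitHubRunQuery and runQuery are hypothetical, and the fetch-based helper assumes Node 18+ and a personal access token.

```javascript
// Query for one page of up to 100 commits, starting after $cursor.
const HISTORY_QUERY = `
  query ($owner: String!, $name: String!, $branch: String!, $cursor: String) {
    repository(owner: $owner, name: $name) {
      object(expression: $branch) {
        ... on Commit {
          history(first: 100, after: $cursor) {
            pageInfo { hasNextPage endCursor }
            nodes { committedDate }
          }
        }
      }
    }
  }`;

// Collect every committedDate by following endCursor from page to page.
// runQuery(query, variables) is supplied by the caller and must resolve to
// the `data` object of the GraphQL response.
async function fetchAllCommitDates(runQuery, owner, name, branch) {
  const dates = [];
  let cursor = null; // null asks for the first (newest) page
  while (true) {
    const data = await runQuery(HISTORY_QUERY, { owner, name, branch, cursor });
    const history = data.repository.object.history;
    for (const node of history.nodes) dates.push(node.committedDate);
    if (!history.pageInfo.hasNextPage) break;
    cursor = history.pageInfo.endCursor;
  }
  return dates;
}

// One possible runQuery, assuming Node 18+ (global fetch) and a token
// created by the caller.
function makeGitHubRunQuery(token) {
  return async (query, variables) => {
    const res = await fetch("https://api.github.com/graphql", {
      method: "POST",
      headers: {
        Authorization: `bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ query, variables }),
    });
    const body = await res.json();
    if (body.errors) throw new Error(JSON.stringify(body.errors));
    return body.data;
  };
}
```

A call such as fetchAllCommitDates(makeGitHubRunQuery(token), "facebook", "react", "master") would then issue one request per 100 commits, so about 109 sequential requests for the 10807 commits counted above. Each request stays within the 100-node connection limit, but the API's rate limits still apply.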
