Querying all commits in a single repository with the GitHub GraphQL API v4
Problem description
I'm trying to query all of the commits to a specified repository on GitHub via GitHub's GraphQL API v4.
I only want to pull the dates they were committed at, in order to estimate the total time that was contributed to that repository (something along the lines of git-hours).
Here's my initial query (note: you can try running it in the Explorer):
{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        history {
          nodes {
            committedDate
          }
        }
      }
    }
  }
}
Unfortunately it returns only the latest 100 commits, because of the API's resource limitations:
Node Limit
To pass schema validation, all GraphQL API v4 calls must meet these standards:
- Clients must supply a first or last argument on any connection.
- Values of first and last must be within 1-100.
- Individual calls cannot request more than 500,000 total nodes.
So since I'm not supplying a first or last argument, the API assumes I'm querying for history(first: 100), and I can't query more than 100 nodes in a single connection.
However, since the total node limit is much higher (500,000), I should be able to query commits in groups of 100 until I have all of them.
I was able to query the latest 200 commits using this query:
{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        total: history {
          totalCount
        }
        first100: history(first: 100) {
          edges {
            cursor
            node {
              committedDate
            }
          }
        }
        second100: history(after: "700f17be6752a13a8ead86458e343d2d637ee3ee 99") {
          edges {
            cursor
            node {
              committedDate
            }
          }
        }
      }
    }
  }
}
However, I had to manually enter the cursor string that I'm passing in the second connection: second100: history(after: "cursor-string") {}.
How can I recursively run this connection until I have queried the committedDate of every commit in the repository?
Recommended answer
Although there could be a way of recursively querying all commits on a repo, I couldn't find a working solution.
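For reference, one way such a loop might be written is with cursor pagination: request pageInfo { hasNextPage endCursor } on the history connection and pass endCursor back as the after argument until hasNextPage is false. This is only a sketch and is not the approach used in the rest of this answer. The names fetchAllCommitDates and runQuery are made up for illustration; runQuery is assumed to be a function you provide that sends a query plus variables to the GraphQL endpoint with your token and resolves to the data object of the response.

```javascript
// Sketch only. `runQuery(query, variables)` is a hypothetical helper supplied by
// the caller; it is assumed to resolve to the `data` object of the GraphQL response.
const HISTORY_QUERY = `
  query ($owner: String!, $name: String!, $branch: String!, $cursor: String) {
    repository(owner: $owner, name: $name) {
      object(expression: $branch) {
        ... on Commit {
          history(first: 100, after: $cursor) {
            pageInfo { hasNextPage endCursor }
            nodes { committedDate }
          }
        }
      }
    }
  }`;

// Collects committedDate for every commit reachable from `branch`,
// one page of up to 100 commits per request.
async function fetchAllCommitDates(runQuery, owner, name, branch = "master") {
  const dates = [];
  let cursor = null; // a null cursor requests the first page
  for (;;) {
    const data = await runQuery(HISTORY_QUERY, { owner, name, branch, cursor });
    const history = data.repository.object.history;
    for (const node of history.nodes) dates.push(node.committedDate);
    if (!history.pageInfo.hasNextPage) return dates;
    cursor = history.pageInfo.endCursor; // continue after the last commit seen
  }
}
```

Each page costs one request, so a large repository needs many calls and the API rate limit may become the practical constraint.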
My needs were:
I only want to pull the dates they were committed at, in order to estimate the total time that was contributed to that repository (something along the lines of git-hours)
Since I couldn't query the full commit history, I had to assume that the time contributed over the latest 100 commits is the same as over any 100 commits.
The following query returns:
- the totalCount of the commit history
- the committedDate of the latest 100 commits
{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        history {
          totalCount
          nodes {
            committedDate
          }
        }
      }
    }
  }
}
Run today, the query returns:
{
  "data": {
    "repository": {
      "object": {
        "history": {
          "totalCount": 10807,
          "nodes": [
            {
              "committedDate": "2019-04-04T01:15:33Z"
            },
            {
              "committedDate": "2019-04-03T22:07:09Z"
            },
            {
              "committedDate": "2019-04-03T20:21:27Z"
            },
            // 97 other committed dates
          ]
        }
      }
    }
  }
}
Estimating total contributed time
I estimated the time contributed in the latest 100 commits using an algorithm similar to the one explained in the git-hours README.
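The exact code for that step isn't shown here, so the following is only a rough sketch of a session-based estimate in the spirit of git-hours: consecutive commits that are close together count as continuous work, and each new session gets a flat allowance. The function name estimateHours and both 120-minute thresholds are assumptions made for illustration (tune them as needed), and unlike git-hours this sketch does not group commits by author.

```javascript
// Assumed thresholds, not taken from the post; adjust to taste.
const MAX_COMMIT_DIFF_MIN = 120;  // gaps shorter than this count as continuous work
const FIRST_COMMIT_ADD_MIN = 120; // flat allowance for the first commit of a session

// Estimates hours of work from an array of ISO 8601 commit timestamps.
function estimateHours(committedDates) {
  const times = committedDates.map((d) => Date.parse(d)).sort((a, b) => a - b);
  if (times.length === 0) return 0;
  let minutes = FIRST_COMMIT_ADD_MIN; // the earliest commit opens the first session
  for (let i = 1; i < times.length; i++) {
    const gapMin = (times[i] - times[i - 1]) / 60000;
    // A short gap is counted as time worked; a long gap starts a new session.
    minutes += gapMin < MAX_COMMIT_DIFF_MIN ? gapMin : FIRST_COMMIT_ADD_MIN;
  }
  return minutes / 60;
}

// Usage with the committedDate values returned by the query:
// const timeContributedLatest100 = estimateHours(nodes.map((n) => n.committedDate));
```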
I then scaled it to the totalCount:
const timeContributedTotal = timeContributedLatest100 * totalCount / 100;
I estimated that 13152 hours were put into Twitter's Bootstrap as of today, whereas git-hours estimated 9959 hours 7 months ago. Doesn't sound too bad.
As for React, I get a total of 15097 hours, or 629 days.
The estimate is very rough, but it's as close as I could get to what I needed. Feel free to comment or answer if you see any possible improvement.