GitHub API - Different number of results for jq filtered response

Problem description

I am trying to use the GitHub API to build a list of well-documented, open-source Java libraries. To do so, I went through the GitHub API documentation and came up with this simple curl command:

curl -G "https://api.github.com/search/repositories?q=language:Java+stars:%3E=500+library+java+in:readme" > output1.txt

The output of this is a giant text file containing information about all of the repositories found. In this example, there were 736 matches in total. However, the file produced by the command above is quite unreadable, so I decided to do some parsing with jq, which resulted in the following code:

curl -G "https://api.github.com/search/repositories?q=language:Java+stars:%3E=500+library+java+in:readme" \
 | jq '.items[] | {name, description, language, watchers_count, html_url}' > parsedOutput1.txt

After this, instead of 736 results, I got only about 30 repositories, which is unacceptable for my purposes.

Entering the query language:java stars:>=500 java library in:readme in the GitHub search box gives me the same 736 results. I don't really know what I am doing wrong, so I could use some help.
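One way to see where the two numbers come from is to compare the total_count field that the search endpoint reports with the number of entries actually present in items. The sketch below assumes jq is installed; the summarize helper and the inline sample are made up for illustration and stand in for a real response.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a search response on stdin and prints the
# reported total next to the number of items actually in this response.
summarize() {
    jq -c '{total_count, returned: (.items | length)}'
}

# Tiny hand-written sample standing in for a real API response.
sample='{"total_count": 736, "items": [{"name": "a"}, {"name": "b"}]}'
printf '%s\n' "$sample" | summarize
```

Against the live API, the same helper can be fed directly from curl by piping the response of the search request into summarize.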

Recommended answer

It was a paging problem. As described in the documentation, the API returns only 30 items per request by default, so you need to add some code to fetch all the pages. I was using bash, so my code ended up like this:

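 for i in $(seq 1 34); do
     # Request one page of results per iteration (page numbers start at 1).
     URL="https://api.github.com/search/repositories?q=language:Java+stars:%3E=500+library+java+in:readme&page=$i"
     echo "$URL"
     # Quote the URL so the shell does not interpret "?" or "&".
     curl -G "$URL" \
         | jq '.items[] | {name, description, language, watchers_count, html_url}' >> parsedOutput1.txt
 done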
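The page count does not have to be hard-coded. As a rough sketch, it can be derived from total_count and the page size; the per_page parameter accepts up to 100 items per page, which cuts the number of requests. The helper names pages_needed and page_url are made up for this illustration, and 736 is simply the total reported in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ceiling of total / per_page using integer arithmetic.
pages_needed() {
    local total=$1 per_page=$2
    echo $(( (total + per_page - 1) / per_page ))
}

# Hypothetical helper: builds the search URL for a given page and page size.
page_url() {
    local page=$1 per_page=$2
    echo "https://api.github.com/search/repositories?q=language:Java+stars:%3E=500+library+java+in:readme&per_page=${per_page}&page=${page}"
}

per_page=100
total=736   # total_count from the question; in practice, read it from the first response
for page in $(seq 1 "$(pages_needed "$total" "$per_page")"); do
    # Only the URLs are printed here; each one would be piped through curl and jq
    # in the same way as in the loop above.
    page_url "$page" "$per_page"
done
```

With 736 matches, pages_needed gives 25 pages at the default size of 30 and 8 pages at 100 per page. Note that the search endpoint only exposes the first 1,000 results of any query, so larger result sets would need a narrower query.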

On another note, when making a lot of requests you should authenticate; otherwise you will end up with the "API rate limit exceeded" message.
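A minimal sketch of an authenticated request follows. It assumes a personal access token is stored in an environment variable called GITHUB_TOKEN; the variable name and the helper names auth_header and fetch are illustrative choices, not something from the original answer. The token is sent in an Authorization header.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the Authorization header for the token in
# GITHUB_TOKEN, or nothing when the variable is unset or empty.
auth_header() {
    if [ -n "${GITHUB_TOKEN:-}" ]; then
        echo "Authorization: Bearer ${GITHUB_TOKEN}"
    fi
}

# Hypothetical helper: fetches one URL, adding the header only when a token is set.
fetch() {
    local url=$1 header
    header=$(auth_header)
    if [ -n "$header" ]; then
        curl -s -H "$header" "$url"
    else
        curl -s "$url"
    fi
}

# Example call (commented out so the snippet makes no request when sourced):
# fetch "https://api.github.com/search/repositories?q=language:Java+stars:%3E=500+library+java+in:readme&page=1"
```

Keeping the token in the environment rather than in the script avoids committing it by accident, and the unauthenticated fallback keeps the helper usable for a quick one-off request.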
