List all public packages in the npm registry

Question

For research purposes, I'd like to list all the packages that are available on npm. How can I do this?

Some old docs at https://github.com/npm/registry/blob/master/docs/REGISTRY-API.md#get-all mention an /-/all endpoint that presumably once worked, but http://registry.npmjs.org/-/all now just returns {"message":"deprecated"}.

Recommended answer

The blog post at http://blog.npmjs.org/post/157615772423/deprecating-the-all-registry-endpoint describes the deprecation of the http://registry.npmjs.org/-/all endpoint and links to the tutorial at https://github.com/npm/registry/blob/master/docs/follower.md as an alternative approach. That tutorial describes how to set up a "follower" that receives all changes made to the npm registry. That's... a bit odd, honestly. A follower of that kind is not an adequate substitute for a list of all packages if you want to do data analysis on the entire npm ecosystem.

However, that codebase reveals that at the heart of the npm registry is a CouchDB database located at https://replicate.npmjs.com. The _all_docs endpoint is not disabled, so we can hit https://replicate.npmjs.com/_all_docs to get back a JSON object whose rows property contains a list of all public packages on npm. Each row looks like:

{"id":"lodash","key":"lodash","value":{"rev":"634-9273a19c245f088da22a9e4acbabc213"}},
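To turn that response into a plain list of package names, something like the following minimal Python sketch should work, assuming the endpoint still behaves as described here. The function names fetch_all_docs and package_names are made up for illustration, and skipping ids that start with "_design/" is a precaution based on general CouchDB behaviour (design documents appear in _all_docs), not something confirmed for this particular database.

```python
import json
import urllib.request

# URL taken from the answer; whether it still responds this way is an assumption.
ALL_DOCS_URL = "https://replicate.npmjs.com/_all_docs"


def fetch_all_docs(url=ALL_DOCS_URL):
    """Download and decode the whole _all_docs response (tens of MB)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def package_names(all_docs):
    """Return the package names found in a decoded _all_docs response.

    Each row has the shape {"id": ..., "key": ..., "value": {"rev": ...}}.
    CouchDB design documents (ids starting with "_design/") are skipped,
    since they are not packages.
    """
    return [
        row["id"]
        for row in all_docs["rows"]
        if not row["id"].startswith("_design/")
    ]


# Example usage (performs a large network request):
#   names = package_names(fetch_all_docs())
#   print(len(names))
```

The parsing is kept separate from the download so it can be tried on a saved copy of the response without hitting the server again.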

At the time of writing, that response contains 618660 rows and comes to around 64MB.

If you want more data about a particular package, you can look it up by its key, e.g. hit https://replicate.npmjs.com/lodash to get a huge document containing things like Lodash's description and release history.
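A per-package lookup might be sketched as below. The helper names (package_url, fetch_package, summarize) are hypothetical, and the field names "description" and "versions" are assumptions about the document layout, so the code reads them defensively. Percent-encoding the slash in scoped names such as @babel/core is also an assumption about how the server expects such keys to be written.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL taken from the answer.
BASE_URL = "https://replicate.npmjs.com"


def package_url(name):
    """Build the document URL for a package key.

    The "/" in a scoped name is percent-encoded so the key stays a single
    path segment (assumed to be what the server expects).
    """
    return f"{BASE_URL}/{quote(name, safe='@')}"


def fetch_package(name):
    """Download and decode one package document."""
    with urllib.request.urlopen(package_url(name)) as resp:
        return json.load(resp)


def summarize(doc):
    """Pick a few fields out of a package document.

    "description" and "versions" are assumed field names; missing fields
    fall back to None or an empty collection.
    """
    return {
        "name": doc.get("_id"),
        "description": doc.get("description"),
        "version_count": len(doc.get("versions", {})),
    }


# Example usage (network request):
#   print(summarize(fetch_package("lodash")))
```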

If you want all the current data about all packages, you can pass the include_docs parameter to _all_docs to include the actual document bodies in the response, i.e. hit https://replicate.npmjs.com/_all_docs?include_docs=true. Be ready for a lot of data.
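Since a single include_docs=true response would be very large, one option is to fetch it in pages. The sketch below uses the standard CouchDB query parameters limit, startkey and skip; whether this particular server honours them is an assumption and has not been verified here. The names page_url and iter_rows are hypothetical, and the function that actually performs the HTTP request is passed in so the paging logic can be exercised offline.

```python
import json
from urllib.parse import urlencode

# URL taken from the answer.
ALL_DOCS_URL = "https://replicate.npmjs.com/_all_docs"


def page_url(limit, startkey=None, include_docs=True, base=ALL_DOCS_URL):
    """Build an _all_docs URL for one page of rows.

    startkey must be JSON-encoded, as CouchDB expects. When resuming from
    the last key already seen, skip=1 avoids returning that row twice.
    """
    params = {"limit": limit, "include_docs": "true" if include_docs else "false"}
    if startkey is not None:
        params["startkey"] = json.dumps(startkey)
        params["skip"] = 1
    return base + "?" + urlencode(params)


def iter_rows(fetch_json, page_size=1000):
    """Yield every row, one page at a time.

    fetch_json is any callable that takes a URL and returns the decoded
    JSON response, for example a thin wrapper around urllib.request.urlopen.
    """
    startkey = None
    while True:
        rows = fetch_json(page_url(page_size, startkey))["rows"]
        if not rows:
            return
        yield from rows
        if len(rows) < page_size:
            return  # a short page means there is nothing left
        startkey = rows[-1]["key"]
```

With include_docs enabled, each yielded row would also carry the document body under its "doc" key, as in standard CouchDB responses.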

If you need yet more data that is not included in these CouchDB documents, such as download counts, it is worth perusing the docs at https://github.com/npm/registry/tree/master/docs, which detail some other available APIs. The caveat noted in the question applies: not everything documented there actually works.
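As one example, download counts appear to be served from a separate host. The sketch below assumes an endpoint of the form https://api.npmjs.org/downloads/point/{period}/{package} and a "downloads" field in its JSON response; both come from my reading of the download-counts document in that docs folder and should be checked against it. The function names are hypothetical, and only unscoped package names are handled.

```python
import json
import urllib.request


def downloads_url(package, period="last-week"):
    """Build the (assumed) download-count URL for an unscoped package."""
    return f"https://api.npmjs.org/downloads/point/{period}/{package}"


def fetch_downloads(package, period="last-week"):
    """Return the download count, or None if the field is absent."""
    with urllib.request.urlopen(downloads_url(package, period)) as resp:
        return json.load(resp).get("downloads")


# Example usage (network request):
#   print(fetch_downloads("lodash"))
```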
