Caching GitHub API calls


Question

I have a general question about caching API calls, in this case calls to the GitHub API.

Let's say I have a page in my app that shows the filenames of a repo, and the content of the README. This means that I will have to do a few API calls in order to retrieve that.

Now, let's say I want to add something like memcached in between, so I'm not doing these calls over and over, if I don't need to.

How would you normally go about this? If I don't enable a webhook on GitHub, I have no way of knowing whether the cache should expire. I could always make a single call to get the current SHA of HEAD, and if it hasn't changed, use the cache instead. But that's on a repo level, not on a file level.

I can imagine doing something similar with the object SHAs, but if I need to call the API anyway to get those, it defeats the purpose of caching.

How would you go about it? I know a service like prose.io has no caching right now, but if it should, what would the approach be?

Thanks

Answer

Would plain HTTP caching be good enough for your use case? HTTP caching is not only a way to avoid making a request when you already have a fresh response - it also lets you quickly validate whether the response in your cache is still valid, without the server sending the complete response again if nothing has changed.

Looking at GitHub API responses, I can see that GitHub correctly sets the relevant HTTP headers (ETag, Last-Modified, Cache-Control).

So, you just do a GET, e.g. for:

GET https://api.github.com/users/izuzak/repos

and this returns:

200 OK
...
ETag: "df739f00c5053d12ef3c625ad6b0fd08"
Last-Modified: Thu, 14 Feb 2013 22:31:14 GMT
...

Next time - you do a GET for the same resource, but also supply the relevant HTTP caching headers so that it is actually a conditional GET:

GET https://api.github.com/users/izuzak/repos
...
If-Modified-Since: Thu, 14 Feb 2013 22:31:14 GMT
If-None-Match: "df739f00c5053d12ef3c625ad6b0fd08"
...

And lo and behold - the server returns a 304 Not Modified response with no body, and your HTTP client pulls the response from its cache:

304 Not Modified

So, the GitHub API does HTTP caching right and you should use it. Granted, you also have to use an HTTP client that supports HTTP caching. The best part is that a 304 Not Modified response does not count against your remaining API rate limit. See: http://developer.github.com/v3/#conditional-requests
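If your HTTP client does not do this for you, the conditional GET flow above can be sketched by hand. This is only a minimal illustration, not a full HTTP cache: the names `ConditionalCache` and `http_fetch` are made up for this example, and the in-memory dict stands in for whatever store you actually use (memcached, in the question's case).

```python
import urllib.error
import urllib.request


def http_fetch(url, headers):
    """Hypothetical transport helper: returns (status, headers, body)."""
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, dict(response.headers.items()), response.read()
    except urllib.error.HTTPError as error:
        # urllib reports 304 as an HTTPError, so translate it back.
        if error.code == 304:
            return 304, dict(error.headers.items()), b""
        raise


class ConditionalCache:
    """Remembers the validators (ETag, Last-Modified) and body per URL."""

    def __init__(self, fetch=http_fetch):
        self._fetch = fetch
        self._entries = {}  # url -> (etag, last_modified, body)

    def get(self, url):
        headers = {}
        entry = self._entries.get(url)
        if entry is not None:
            etag, last_modified, _ = entry
            if etag:
                headers["If-None-Match"] = etag
            if last_modified:
                headers["If-Modified-Since"] = last_modified
        status, response_headers, body = self._fetch(url, headers)
        if status == 304 and entry is not None:
            return entry[2]  # unchanged on the server: reuse the stored body
        # Header names are case-insensitive, so normalise before lookup.
        lowered = {name.lower(): value for name, value in response_headers.items()}
        self._entries[url] = (lowered.get("etag"), lowered.get("last-modified"), body)
        return body
```

With the default transport, `ConditionalCache().get("https://api.github.com/users/izuzak/repos")` would send a plain GET the first time and a conditional GET on every later call for the same URL. Passing a different `fetch` function makes it easy to try the logic without touching the network.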

The GitHub API also sets the Cache-Control: private, max-age=60 header, so a response stays fresh for 60 seconds -- which means that a caching client will not even send requests for the same resource to the server if they are made less than 60 seconds apart.
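As a rough illustration of that freshness check, the sketch below reads max-age from a Cache-Control value and decides whether a stored response can be reused without contacting the server. The function names are invented for this example, and it deliberately ignores the other directives (no-cache, no-store and so on) and the Age header that a real HTTP cache would also have to honour.

```python
import re
import time


def max_age_seconds(cache_control):
    """Return the max-age value from a Cache-Control header, or None."""
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control or "", re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(stored_at, cache_control, now=None):
    """True if a response stored at `stored_at` (epoch seconds) is still fresh."""
    max_age = max_age_seconds(cache_control)
    if max_age is None:
        return False  # no explicit lifetime: revalidate to be safe
    now = time.time() if now is None else now
    return (now - stored_at) < max_age
```

For a response stored with `private, max-age=60`, `is_fresh` is true for the first 60 seconds; after that the client should fall back to a conditional GET rather than an unconditional one.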

Your reasoning about making a single conditional GET request to a resource that is certain to change whenever anything in the repo changes (a resource showing the SHA of HEAD, for example) sounds reasonable -- if that resource hasn't changed, you don't have to check the individual files, since they surely haven't changed either.
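One way to picture that repo-level check is a file cache guarded by the HEAD SHA. This is a sketch under assumptions: `RepoFileCache`, `get_head_sha` and `fetch_file` are hypothetical names, and the two callables are left for you to implement on top of whatever GitHub API calls you already make (ideally conditional GETs).

```python
class RepoFileCache:
    """Caches file contents per repository, guarded by the HEAD commit SHA."""

    def __init__(self, get_head_sha, fetch_file):
        self._get_head_sha = get_head_sha  # hypothetical: repo -> SHA string
        self._fetch_file = fetch_file      # hypothetical: (repo, path) -> content
        self._sha = {}    # repo -> SHA the cached files belong to
        self._files = {}  # (repo, path) -> content

    def get(self, repo, path):
        head = self._get_head_sha(repo)  # the single cheap check per lookup
        if self._sha.get(repo) != head:
            # HEAD moved: drop every cached file for this repo.
            self._files = {
                key: value for key, value in self._files.items() if key[0] != repo
            }
            self._sha[repo] = head
        key = (repo, path)
        if key not in self._files:
            self._files[key] = self._fetch_file(repo, path)
        return self._files[key]
```

The trade-off is the one raised in the question: any commit invalidates every cached file for the repo, even files it did not touch. That costs some extra fetches after a push, but keeps the common case down to one small request.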
