Cloud Run - Requests latency

Problem description

I am trying to use Cloud Run to run a microservice connected to Firestore. The microservice creates objects based on s2geometry to define multiple geographical zones with specific attributes, which helps localize users so that I can send them information according to the zone I locate them in.

I used Python 3.7 and FastAPI to create the microservice and the routes to communicate with it.

The microservice runs smoothly on my local machine and on Compute Engine, as most of my routes take less than 150 ms to answer when I test them. However, I have a latency issue when I deploy it with Cloud Run. From time to time the microservice takes a really long time to answer (up to 15 minutes) and I can't pinpoint what exactly causes it.

Here is a screenshot where we can see the Request Count and the Request Latency:

[Image: Request Count and Request Latency]

There is no real correlation between the request latency and the number of requests, or at least no obvious one. I also looked at the memory usage of the service, and it is at 30% at most. The CPU usage, however, sometimes hits 100%, but not necessarily when requests are slow.

Finally, when I explored the Trace List and compared high-latency requests with fast ones, I noticed the following difference:

[Image: Trace of a slow request]
[Image: Trace of a fast request]

Fast requests seem to call themselves whereas slow requests don't and I do not know why.

For now we do not really have a lot of users so I thought that it could be a cold start issue but slow requests are not necessarily the first ones.

Now, to be honest, I don't know what's going on here and what Cloud Run does (or what I did wrong). I also find it pretty difficult to find a thorough explanation of how Cloud Run actually works, so if you have one (other than the Google one) I would gladly dive into it.

Thank you very much for your help.

Recommended answer

After different experiments, it seems that it was a cold start issue. Cloud Run containers are stopped after a certain time if they are not being used, and as we did not have a lot of traffic, the container had to boot every time a user wanted to access the app.

Solution:

I created a Cloud Function that sends a request to the container when triggered and then created a Cloud Scheduler (https://cloud.google.com/scheduler) job that runs the function every minute.
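The answer does not show the function itself, so here is a minimal sketch of what such a keep-warm Cloud Function might look like in Python, using only the standard library. The names `SERVICE_URL`, `ping_service`, and `keep_warm`, as well as the placeholder URL, are assumptions for illustration and do not come from the original post.

```python
import os
import urllib.request

# Hypothetical placeholder: set SERVICE_URL to your own Cloud Run service URL.
SERVICE_URL = os.environ.get("SERVICE_URL", "https://my-service-example.a.run.app/")


def ping_service(url, timeout=10.0, opener=urllib.request.urlopen):
    """Send one GET request to `url` and return the HTTP status code."""
    with opener(url, timeout=timeout) as response:
        return response.status


def keep_warm(request):
    """HTTP-triggered entry point (assumed name); `request` is not used."""
    try:
        status = ping_service(SERVICE_URL)
    except OSError as exc:  # urllib.error.URLError is a subclass of OSError
        return f"ping failed: {exc}", 502
    return f"pinged {SERVICE_URL}: HTTP {status}", 200
```

A Cloud Scheduler job with an HTTP target pointing at this function, on a one-minute schedule, would then trigger the ping as described above. The `opener` parameter is only there so that the request logic can be exercised without network access.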

Note:

If different revisions are routed to your service, you need to create a Cloud Scheduler job for each of the revisions. To do so, you have to create a Revision URL (tag) for each of the routed revisions (currently beta).
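As a possible variation on the note above (not what the answer describes), a single function could ping every tagged revision URL in one run instead of relying on one job per revision. The sketch below assumes the tagged URLs are supplied as a comma-separated string; `parse_urls` and `ping_all` are hypothetical helper names.

```python
import urllib.request


def parse_urls(raw):
    """Split a comma-separated string into a list of non-empty URLs."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def ping_all(urls, timeout=10.0, opener=urllib.request.urlopen):
    """Ping every URL once; map each URL to its status code or an error text."""
    results = {}
    for url in urls:
        try:
            with opener(url, timeout=timeout) as response:
                results[url] = response.status
        except OSError as exc:  # covers URLError, HTTPError and timeouts
            results[url] = f"error: {exc}"
    return results
```

One failing revision does not stop the others from being pinged, because each request is wrapped in its own `try` block.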
