Should Health Checks call other App Health Checks


Question

I have two APIs, A and B, that I control. Both have readiness and liveness health checks. A has a dependency on B.

A
/foo - This endpoint makes a call to /bar in B
/status/live
/status/ready

B
/bar
/status/live
/status/ready

Because of this dependency, should the readiness health check for A make a call to the readiness health check for B?

Recommended answer

Service A is ready if it can serve business requests. So if being able to reach B is part of what A needs to do (which it seems it is), then A's readiness check should check B.

An advantage of having A check B is that you can fail fast on a bad rolling upgrade. Say A gets misconfigured so that the upgrade carries a wrong connection detail for B - maybe B's service name is injected as an environment variable and the new version has a typo. If your A instances check B on startup, you can more easily ensure that the upgrade fails and that no traffic goes to the new, misconfigured Pods. For more on this see https://medium.com/spire-labs/utilizing-kubernetes-liveness-and-readiness-probes-to-automatically-recover-from-failure-2fe0314f2b2e
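As a rough illustration of the fail-fast idea, the sketch below shows how A's /status/ready handler might decide its response by trying to reach B through an injected setting. The environment variable name B_BASE_URL and both function names are assumptions made for this example; they do not come from the question. A typo in the injected value makes the check fail, so the new Pod never reports ready.

```python
import http.client
import os
import urllib.request
from typing import Callable, Mapping


def b_is_reachable(base_url: str, timeout: float = 1.0) -> bool:
    """Return True if B answers its /status/live endpoint with a 2xx status."""
    try:
        url = f"{base_url}/status/live"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # Covers DNS failures, refused connections, timeouts, non-2xx
        # responses (HTTPError) and malformed URLs.
        return False


def readiness_status(
    environ: Mapping[str, str] = os.environ,
    check: Callable[[str], bool] = b_is_reachable,
) -> int:
    """HTTP status code A's /status/ready could return: 200 or 503."""
    # B_BASE_URL is a hypothetical variable name chosen for this sketch.
    base_url = environ.get("B_BASE_URL", "").rstrip("/")
    if not base_url:
        return 503  # missing configuration: not ready
    return 200 if check(base_url) else 503
```

The check function is passed in as a parameter so the decision logic can be exercised without a network.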

It is typically enough for A to check B's liveness endpoint, or any minimal availability endpoint, rather than B's readiness endpoint. This is because Kubernetes will be checking B's readiness probe for you anyway, so any B instance that A can reach will be a ready one. Calling B's liveness endpoint rather than its readiness endpoint can make a difference if B's readiness endpoint performs more checks than the liveness one. Keep in mind that Kubernetes will be calling these probes regularly - readiness as well as liveness - since both have a period. The difference is whether the Pod is withdrawn from serving traffic (if readiness fails) or restarted (if liveness fails). You're not trying to do end-to-end transaction checks; you want these checks to contain minimal logic and not generate too much load.
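One possible way to keep periodic probes from generating load on B is to reuse the last result for a few seconds instead of calling B on every probe. This is only a sketch of that idea and not something the answer prescribes; the class name CachedCheck and the 5 second default are assumptions.

```python
import time
from typing import Callable, Optional


class CachedCheck:
    """Wrap a dependency check and reuse its result for ttl_seconds."""

    def __init__(
        self,
        check: Callable[[], bool],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._check = check
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at: Optional[float] = None
        self._last = False

    def __call__(self) -> bool:
        now = self._clock()
        # Call the real check only when there is no fresh result.
        if self._expires_at is None or now >= self._expires_at:
            self._last = bool(self._check())
            self._expires_at = now + self._ttl
        return self._last
```

The trade-off is that A can take up to ttl_seconds longer to notice a change in B, so the value should stay small relative to the probe period.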

It is preferable for the code within A's readiness implementation to do the check, rather than doing the check at the k8s level (in the Pod spec itself). Doing it at the k8s level is second-best, because ideally you want to know that the code running in the container really does connect.

Another way to check that dependent services are available is with a check in an initContainer. Using initContainers avoids multiple restarts during startup (by ensuring correct ordering), but checking dependencies through probes can go deeper (if implemented in the app's code), and the probes continue to run periodically after startup. So it can be advantageous to use both.
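A script run by an initContainer could be as simple as a bounded retry loop. The sketch below assumes a caller-supplied check function (for example one that requests B's /status/live); the function name wait_for_dependency and its default values are illustrative only.

```python
import time
from typing import Callable


def wait_for_dependency(
    check: Callable[[], bool],
    attempts: int = 30,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            # Pause between attempts, but not after the final one.
            sleep(delay_seconds)
    return False
```

The script would then exit with a non-zero status when the function returns False, so that the init container fails and the main container does not start.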

Be careful about checking other services from readiness too liberally, as it can lead to cascading unavailability. For example, if a backend briefly goes down and a frontend is probing it, the frontend will also become unavailable and so won't be able to display a good error message. You might want to start with simple probes and carefully add complexity as you go.
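One way to soften the cascading effect, offered here as an assumption rather than as part of the answer, is to let A ride out a brief outage of B: once B has been reached successfully, A only reports not ready after several consecutive failed checks. The class name TolerantCheck and the threshold of 3 are hypothetical.

```python
from typing import Callable


class TolerantCheck:
    """Report failure only after failure_threshold consecutive failed checks.

    Until the first success the wrapper reports failure immediately, so a
    misconfigured instance is still caught at startup.
    """

    def __init__(self, check: Callable[[], bool], failure_threshold: int = 3) -> None:
        if failure_threshold < 1:
            raise ValueError("failure_threshold must be at least 1")
        self._check = check
        self._failure_threshold = failure_threshold
        self._consecutive_failures = 0
        self._ever_succeeded = False

    def __call__(self) -> bool:
        if self._check():
            self._ever_succeeded = True
            self._consecutive_failures = 0
            return True
        if not self._ever_succeeded:
            return False
        self._consecutive_failures += 1
        return self._consecutive_failures < self._failure_threshold
```

With the default threshold, two failed checks in a row are tolerated and the third marks A as not ready. Whether masking short outages like this is appropriate depends on whether A can do anything useful while B is down.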
