Random "upstream connect error or disconnect/reset before headers" between services with Istio 1.3

Problem description

This problem seems to happen randomly, and between different services.

For example, we have a service A which needs to talk to service B. Sometimes we get this error, but after a while it goes away. The error doesn't happen very often.

When this happens, we see service A logging the "upstream connect error" message, but nothing in service B. So we think it might be related to the sidecars.

One thing we noticed is that in service B we get a lot of these error messages in the istio-proxy container:

[src/istio/mixerclient/report_batch.cc:109] Mixer Report failed with: UNAVAILABLE:upstream connect error or disconnect/reset before headers. reset reason: connection failure

According to the documentation, when a request comes in, Envoy asks Mixer whether everything is OK (authorization and other checks), and if Mixer doesn't reply, the request fails. That is why there is an option called policyCheckFailOpen. We have it set to false, which I guess is a sane default: we don't want requests to go through if Mixer cannot be reached. But why can't it be reached?

disablePolicyChecks: true
policyCheckFailOpen: false
controlPlaneSecurityEnabled: false

NOTE: istio-policy is running with the istio-proxy sidecar. Is that correct?

We don't see that error in some other services, which can also fail.

Another log message that I see a lot, in all the services not running as root with fsGroup defined in the YAML files, is:

watchFileEvents: "/etc/certs": MODIFY|ATTRIB
watchFileEvents: "/etc/certs/..2020_02_10_09_41_46.891624651": MODIFY|ATTRIB
watchFileEvents: notifying

One of the leads I'm chasing is the default circuitBreakers values. Could they be related to this?

Thanks

Recommended answer

The error you are seeing is caused by a failure to establish a connection to istio-policy.

Based on this GitHub issue, community members added two answers that could help you with your problem:

If mTLS is enabled globally, make sure you set controlPlaneSecurityEnabled: true.
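
As a rough illustration of that first suggestion, the fragment below shows where the two settings could sit if Istio was installed through its Helm chart. That install method is an assumption (the question does not say how Istio was deployed), and the nesting under global follows the chart's values layout rather than anything stated in the thread.

```yaml
# Sketch of Helm values for an Istio 1.3 install (assumed install method).
global:
  mtls:
    # Mesh-wide mutual TLS between services (assumed to be on in this scenario).
    enabled: true
  # Secures sidecar-to-control-plane traffic (Mixer, Pilot) as well;
  # the question shows this set to false.
  controlPlaneSecurityEnabled: true
```

If the mesh was installed some other way, the same key would need to be set through that mechanism instead.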


I was facing the same issue, then I read about protocol selection. I realised that the name of the port in the service definition should start with a protocol prefix, for example http-. This fixed the issue for me. If you still face the issue, you might need to look at the tls-check for the pods and resolve it using destinationrules and policies.
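
To make the port-naming point concrete, here is a minimal sketch of a Service whose port name carries the http- prefix, followed by a DestinationRule of the kind the answer alludes to. The names service-b, the default namespace, and the port numbers are hypothetical placeholders, and the ISTIO_MUTUAL mode assumes mTLS is what the tls-check reports as mismatched.

```yaml
# Hypothetical Service for "service B"; only the port name matters here.
apiVersion: v1
kind: Service
metadata:
  name: service-b          # placeholder name
  namespace: default       # placeholder namespace
spec:
  selector:
    app: service-b
  ports:
    - name: http-web       # "http-" prefix lets Istio treat the port as HTTP
      port: 80
      targetPort: 8080     # placeholder container port
---
# Hypothetical DestinationRule telling clients to use Istio mutual TLS
# when calling service B (only relevant if the server side expects mTLS).
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: service-b
  namespace: default
spec:
  host: service-b.default.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
```

Whether the DestinationRule is needed at all depends on what the tls-check shows for the pods in question.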


istio-policy is running with the istio-proxy sidecar. Is that correct?

Yes, I just checked, and it runs with a sidecar.

Let me know if that helps.
