App Service restarts causing app to be inaccessible (502.3 Bad Gateway returned)



We've been experiencing an issue a few times over the past few weeks where something that causes our app service to restart will render the application inaccessible and return 502.3 Bad Gateway results.

Background:

It's a WebApi running on ASP.NET Core 2.0.1 targeting .NET 4.6.2. Prior to the last issue, we were running on a single Standard S2 instance, which has always hovered between 5-20% CPU and around 33% RAM usage. It's now running scaled out on 2 S2's (more information on that below).

Causes:

So far the issue seems to have been triggered by a number of things:

- App deployment (VSTS deployment to a staging slot, followed by a swap)

- Updates to application settings

- General restarts out of our control (app service initiated)

The common thread seems to be things that cause the app service to restart.

We've been able to decouple the occurrence of the issue from any changes to the site's code or configuration, by deploying commits identical to what is currently running, and by making configuration changes that have no bearing on the application code (for instance, adding new app settings).

When the problem occurs, we see in the Kudu event log that the web process has successfully restarted - however, requests appear to no longer make it to the application (this is based on App Insights request metrics, which may not be low-level enough to accurately capture what's happening). External requests to the application simply spin until they finally receive a 502.3 Bad Gateway/timeout exception. The app does not resolve this issue on its own - or perhaps we haven't waited long enough - but the outages have lasted for several minutes.
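One way to work around the App Insights blind spot is to probe the site externally during a restart and record exactly what the client sees: a real HTTP status, a 502, or a hang until timeout. A minimal sketch (the classification thresholds are illustrative, not from the thread):

```python
import time
import urllib.request
import urllib.error

def classify_response(status, elapsed, timeout=120.0):
    """Label one probe result the way the symptoms in the thread
    describe them: healthy, gateway error, or hang/timeout."""
    if elapsed >= timeout:
        return "timeout"       # request "spun" until the gateway gave up
    if status == 502:
        return "bad-gateway"   # front end answered, worker did not
    if 200 <= status < 400:
        return "healthy"
    return "other-error"

def probe(url, timeout=120.0):
    """Issue one GET and return (classification, status, elapsed)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code      # non-2xx still carries a real status code
    except (urllib.error.URLError, TimeoutError):
        status = 0             # no HTTP response at all
    elapsed = time.monotonic() - start
    return classify_response(status, elapsed, timeout), status, elapsed
```

Running `probe` in a loop against the site's URL during a slot swap would show whether requests are reaching the worker at all, independently of what App Insights records.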

Existing remedies:

Two different things have appeared to correct the issue and set things straight:

- Moving the application to a different App Service plan

- Scaling the existing App Service plan out to additional instances

In both cases, without any additional deployments/config changes, the app has come back online.

My working theory (really, a guess) is that there is something going on with the front-end load balancer that App Service is using. I don't have any visibility or evidence here, but I suspect both of these operations change the view that the load balancer has of our app server, and requests are then able to make it through to a site that was really up and running the whole time.

Potentially resetting its health or availability state? (I experienced a lot of this back in the day using ARR/IIS on self-hosted servers, where a bad health or availability state for nodes in a web farm would have exactly the same result.)
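The ARR analogy can be made concrete: a front end that marks a worker unhealthy after consecutive failed probes, and only re-evaluates it when farm membership changes (a scale-out or plan move), would produce exactly the observed behavior. A toy model of that guess, not of App Service's actual load balancer:

```python
class FarmNode:
    """Toy model of a load balancer's per-node health state."""

    def __init__(self, name, fail_threshold=3):
        self.name = name
        self.failures = 0
        self.fail_threshold = fail_threshold
        self.healthy = True

    def record_probe(self, ok):
        """Consecutive failures trip the node into an unhealthy state."""
        if ok:
            self.failures = 0
            # Note: a success does NOT restore health here - the
            # unhealthy flag is sticky, mirroring the theory that the
            # site was "really up" but still received no traffic.
        else:
            self.failures += 1
            if self.failures >= self.fail_threshold:
                self.healthy = False

    def reset(self):
        """Membership change (scale-out, plan move) resets health state."""
        self.failures = 0
        self.healthy = True
```

If the real front end behaved anything like this, a restart that fails a few probes would blackhole traffic until a topology change forced a reset - consistent with both remedies above.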

Anyone have any thoughts, or additional information I can provide to help diagnose?

Thanks,

Andrew

Solution

There are 3 troubleshooting steps for 502 Bad Gateway errors:

• Observe and monitor application behavior
• Collect data
• Mitigate the issue

For more details, you may refer to this document: https://docs.microsoft.com/en-us/azure/app-service/app-service-web-troubleshoot-http-502-http-503.
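For the "mitigate" step, App Service's auto-heal feature can recycle the worker process automatically when 502s accumulate, which may shorten outages like the ones described above. A sketch of the relevant `Microsoft.Web/sites/config` fragment (the threshold and interval values are illustrative assumptions, not from the thread):

```json
{
  "properties": {
    "autoHealEnabled": true,
    "autoHealRules": {
      "triggers": {
        "statusCodes": [
          {
            "status": 502,
            "count": 20,
            "timeInterval": "00:05:00"
          }
        ]
      },
      "actions": {
        "actionType": "Recycle"
      }
    }
  }
}
```

This recycles the worker when 20 or more 502 responses occur within 5 minutes; whether a recycle actually helps depends on whether the fault is in the worker or, per the theory above, in the front end's view of it.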


