Timeouts running .NET code in Azure Batch


Question

Is anyone else getting timeouts running .NET code on a Batch account scheduled from Data Factory? The code has been live for months, but today one of our environments, the one in the West Europe region, failed. Our other three environments are in North Europe and are running OK.

Nothing is showing on the service status page, https://azure.microsoft.com/en-gb/status/history/, but the last time this happened, on 27 June 2018, it took a week before anything showed up. That time the error affected all of our environments, but the service status said there was a problem with only one.

This is the service status message from last time:

Summary of impact: Between 16:00 UTC on 27 Jun 2018 and 13:00 UTC on 28 Jun 2018, a subset of customers using App Service in West Europe may have received HTTP 500-level response codes, timeouts or high latency when accessing App Service (Web, Mobile and API Apps) deployments hosted in this region.

Root cause and mitigation: During a recent platform deployment, several App Service scale units in West Europe encountered a backend performance regression due to a modification in the telemetry collection systems. Due to this regression, customers with .NET applications running large workloads may have encountered application slowness. The root cause of this was an inefficiency in the telemetry collection pipeline which caused overall virtual machine performance degradation and slowdown. The issue was detected automatically, and the engineering team was engaged. A mitigation to remove the inefficiency causing the issue was applied at 10:00 UTC on June 28. After further review, a secondary mitigation was applied to a subset of VMs at 22:00 UTC on June 28. More than 90% of the impacted customers saw mitigation at this time. After additional monitoring, a final mitigation was applied to a single remaining scale unit at 15:00 UTC on June 29. All customers were mitigated at this time.

Next steps: We sincerely apologize for the impact to affected customers. We are continuously taking steps to improve the Microsoft Azure Platform and our processes to help ensure such incidents do not occur in the future. In this case, this includes (but is not limited to):
• Removing the changes that caused the regression
• Reviewing and, if necessary, adjusting the performance regression detection and alerting


Thanks,

Andrew

Recommended answer

Hi Andrew, I took a look and found that a small issue did occur on August 2nd affecting Virtual Machines in North Europe. Although you saw issues with your Batch account, Batch accounts run on VMs, so it is possible you were also impacted.

The issue was related to an unhealthy network infrastructure component that caused connectivity issues. This would make sense if you were seeing timeout errors.

Are you still seeing any problems, or is everything back to normal? According to my reports, the issue on August 2nd has been resolved.

