Troubleshooting an IIS .NET website outage

Problem description

Last night one of the websites (.NET 4.0 forms) hosted on my Win 2008 R2 (IIS 7.5) server started to time out, throwing the following error for all connected users.

TYPE     System.Web.HttpException
MESSAGE  Request timed out.
DETAIL   System.Web.HttpException (0x80004005): Request timed out.

The outage was confined to just one website within IIS; the others continued to work fine.

Unfortunately I was unable to identify why the website was timing out. Here are the steps I took:


  • First thing I did was look at Task Manager, which showed normal CPU and memory usage. Network activity was also moderate.
  • I then opened IIS to look at the live connections under 'Worker Processes'. There were about 60 live connections, so it didn't look like anything DDoS related.
  • Checked database connectivity (hosted on a separate server), all fine!
  • I then reset the website in IIS. That didn't work.
  • I then tried a complete iisreset... still no luck :(
  • In the end (and under some duress) the only thing I could think of to resolve this was to restart the server.

Restarting the server worked, but I am nervous not knowing why this happened in the first place. Can anyone recommend any checks that I failed to carry out? Is there an official checklist for working through these sorts of IIS problems? I have reviewed the IIS logs but don't see anything unusual in the run-up to the outage.

Any pointers or links to useful resources to help me understand and mitigate this in future would be much appreciated.

Edit

The only time I logged into the server that day was to add an additional web handler component (for remote deploy) to IIS Web Deploy. I'm doubtful this caused the outage, as the server worked for 6 hours afterwards.

Recommended answer

Because iisreset didn't help and you had to restart the whole machine, I would suspect a global resource shortage, with the most used (or most resource-consuming) website being the one impacted. It could be a lack of available RAM, or network connection congestion caused by malfunctioning calls (for example, a lot of CLOSE_WAIT sockets exhausting the connection pool; we've seen that in production because of a malfunctioning external service). It could also be a problem with one specific client, which was disconnected by the machine restart, so the problem eventually disappeared.
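One way to check the CLOSE_WAIT theory the next time the site hangs is to count TCP connections by state. The following is a minimal Python sketch, assuming the text comes from Windows `netstat -an` (columns: Proto, Local Address, Foreign Address, State). The function name and the sample addresses are hypothetical and only serve as an illustration.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections by state in `netstat -an` style output."""
    states = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        # TCP rows look like: Proto, Local Address, Foreign Address, State.
        # UDP rows have no state column and are skipped.
        if len(parts) >= 4 and parts[0].upper() == "TCP":
            states[parts[3]] += 1
    return states


# Hypothetical sample output, not taken from the server in question.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING
  TCP    10.0.0.5:49210         10.0.0.9:1433          CLOSE_WAIT
  TCP    10.0.0.5:49211         10.0.0.9:1433          CLOSE_WAIT
  TCP    10.0.0.5:80            203.0.113.7:51000      ESTABLISHED
  UDP    0.0.0.0:123            *:*
"""

if __name__ == "__main__":
    for state, count in count_tcp_states(SAMPLE).most_common():
        print(state, count)
```

On a live server the input could be captured with `subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout`. A CLOSE_WAIT count that keeps growing between samples would support the connection exhaustion hypothesis.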

I would start with:

Historical analysis


  • review Event Viewer for any errors/warnings from that period of time,
  • although you have already looked into the IIS logs, I would do it once again with the help of Log Parser Lizard to produce some statistics, such as number of requests per client, network bandwidth per client, average response time per client and so on.
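If Log Parser Lizard is not at hand, the same per-client statistics can be approximated with a short script. The sketch below is in Python and assumes W3C extended log files whose `#Fields:` directive includes `c-ip` and `time-taken` (in milliseconds). Whether those fields are present depends on the logging configuration of the site, and the sample rows are made up for illustration.

```python
from collections import defaultdict


def per_client_stats(log_lines):
    """Return {client_ip: {"requests": n, "avg_ms": mean time-taken}}."""
    fields = []
    totals = defaultdict(lambda: [0, 0])  # c-ip -> [request count, total ms]
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns of the rows that follow it.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        ip, taken = row.get("c-ip"), row.get("time-taken", "")
        if ip is None or not taken.isdigit():
            continue  # field not logged, or value missing for this row
        totals[ip][0] += 1
        totals[ip][1] += int(taken)
    return {ip: {"requests": n, "avg_ms": total / n}
            for ip, (n, total) in totals.items()}


# Hypothetical sample rows, not taken from the real logs.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 7.5
#Fields: date time cs-method cs-uri-stem c-ip sc-status time-taken
2013-01-01 21:00:01 GET /default.aspx 203.0.113.7 200 120
2013-01-01 21:00:02 GET /report.aspx 203.0.113.8 200 110000
2013-01-01 21:00:03 GET /default.aspx 203.0.113.7 200 80
""".splitlines()

if __name__ == "__main__":
    for ip, stats in per_client_stats(SAMPLE_LOG).items():
        print(ip, stats["requests"], stats["avg_ms"])
```

A client with an unusually high request count, or an average `time-taken` close to the request timeout, would be a good place to start looking.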

Monitoring


  • continuously monitor Performance Counters:
    • \Processor(_Total)\% Processor Time,
    • \.NET CLR Exceptions(_Global_)\# of Exceps Thrown / sec,
    • \Memory\Available MBytes,
    • \Web Service(Default Web Site)\Current Connections (one per site name),
    • \ASP.NET v4.0.30319\Request Wait Time,
    • \ASP.NET v4.0.30319\Requests Current,
    • \ASP.NET v4.0.30319\Requests Queued,
    • \Process(XXX)\Working Set,
    • \Process(XXX)\% Processor Time (XXX for each w3wp process),
    • \Network Interface(XXX)\Bytes Total/sec
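These counters can be logged continuously with the built-in `typeperf` command-line tool. As a rough sketch, the Python helper below only assembles the command line, using the `-si` (sample interval), `-sc` (sample count), `-o` (output file) and `-f` (output format) options of `typeperf`. The helper name, the counter subset and the default interval are illustrative assumptions, not part of the original answer.

```python
# A subset of the counters listed above; extend as needed.
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Memory\Available MBytes",
    r"\ASP.NET v4.0.30319\Requests Current",
    r"\ASP.NET v4.0.30319\Requests Queued",
]


def build_typeperf_command(counters, interval_s=15, samples=None, output_csv=None):
    """Build an argument list for typeperf (hypothetical helper)."""
    cmd = ["typeperf", *counters, "-si", str(interval_s)]
    if samples is not None:
        cmd += ["-sc", str(samples)]  # stop after this many samples
    if output_csv:
        cmd += ["-o", output_csv, "-f", "CSV"]  # write samples to a CSV file
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so this works on any OS.
    print(build_typeperf_command(COUNTERS, interval_s=15, output_csv="iis_counters.csv"))
```

On the server itself the resulting list could be passed to `subprocess.run`. A Performance Monitor data collector set would do the same job without any scripting.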

If none of this leads you to a conclusion, create a Debug Diagnostic rule (http://blogs.msdn.com/b/friis/archive/2012/01/04/debug-diagnostic-1-2-creating-a-rule-in-hang-mode-to-use-the-response-time-of-the-request-etw.aspx) that captures a memory dump of the process for long-running requests, and analyze the dump with WinDbg and the PSSCor extension for .NET debugging.
