Performance glitch with high RAM usage


Question

We have a high-memory-usage process running on a box that has 256GB of RAM. We typically use 50-60GB of RAM when the test version of our application is running, but we expect this to grow when the application is deployed. The problem is that while testing the application we get periodic slowdowns. The machine has 48 cores and is running Windows 2008 R2 Enterprise.

Initially we service roughly 5 complete request cycles per second (a request cycle being: asking our application for a response, the application computing the response, and outputting it). After about 90 seconds everything slows down, and we typically achieve a response time of only about 2 seconds per request. After about 20 seconds everything returns to normal, and we then get another 90 seconds of fast responses before the cycle begins again.
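To line the slow periods up with a trace later on, it may help to log a timestamp and latency per request and extract the slow windows from that log. The following is only a minimal sketch: the sample format, the 1-second threshold, the function names, and the synthetic data (which assumes requests are handled one after another, so 5 requests per second means about 0.2 s each) are all assumptions, not part of the original setup.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, latency in seconds)


def find_slow_windows(samples: List[Sample],
                      threshold_s: float = 1.0) -> List[Tuple[float, float]]:
    """Return (first, last) timestamps of each contiguous run of slow requests.

    `samples` must be sorted by timestamp. A request counts as slow when its
    latency exceeds `threshold_s` (an assumed cut-off, tune as needed).
    """
    windows = []
    start = None
    last_slow = None
    for ts, latency in samples:
        if latency > threshold_s:
            if start is None:
                start = ts          # a new slow run begins here
            last_slow = ts
        elif start is not None:
            windows.append((start, last_slow))  # the slow run just ended
            start = None
    if start is not None:
        windows.append((start, last_slow))      # log ended during a slow run
    return windows


def synthetic_samples(cycles: int = 2) -> List[Sample]:
    """Hypothetical data shaped like the pattern above: ~90 s fast, ~20 s slow."""
    samples = []
    base = 0.0
    for _ in range(cycles):
        for i in range(450):                     # 90 s at 0.2 s per request
            samples.append((base + i * 0.2, 0.2))
        for j in range(10):                      # 20 s at 2 s per request
            samples.append((base + 90.0 + j * 2.0, 2.0))
        base += 110.0
    return samples
```

Calling `find_slow_windows` on a real request log would give the wall-clock intervals to look at in any profiler capture taken over the same period.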

We have turned off virus checkers and every other service running on the machine. No application apart from ours is running on the machine. We have tried requests coming both locally and across the network, but the performance doesn't change. We have even disabled paging to see if that could be the root cause, but without any luck.

When we observe performance in Process Explorer, what we see during the "slow" periods is close to 0 CPU activity, 0 disk activity and 0 network activity; it appears as if everything just stops. However, we do notice one event that coincides with the slowdowns. Whenever a slowdown occurs, the kernel thread becomes the most active thread in the process table (roughly 2% CPU usage). Further, when we examine the threads running within the kernel, it is always the same thread that is active. This thread is identified as "KeDetachProcess". Reading up on this undocumented call suggests that it is related to memory management activity. This would be consistent with our experience, as the problem has only emerged as we have ramped up the RAM usage of our application.

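What we need to know is:


  • Are we correct in diagnosing this thread as the source of our slowdowns?
  • If so, why is it preventing our application from making progress?
  • And is there any way we can reconfigure Windows to remove this bottleneck?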

 

Recommended answer

I think this is a good case to try analyzing with xperf.

As I understand it, you are seeing a very brief CPU spike.

With the following steps you should be able to pick out exactly which thread is responsible.

1. Set up a machine with an internet connection.

2. Follow the instructions from the WPT online help (the .chm file) and configure the symbol path to point to the MS symbol server.

 

3. Install the WPT version required for your OS.

4. Please try the following command first to start the trace:

    xperf -on DiagEasy+Profile+PROC_THREAD+LOADER

5. Wait about 2 minutes and make sure that at least one CPU spike occurred during that time.

6. Stop the trace with the following command:

    xperf -d C:\Logs\Thread.etl

    (I think this will take a while because of your powerful machine :)

7. Copy Thread.etl to your additional machine to examine the trace, or open it on the big machine (access to the symbol files is needed).

8. Make sure that symbol support is working
    (typically I check the newly created "symcache" folder and expect to see some new files in it).

9. Wait until XperfViewer.exe has downloaded all requested symbol files; this takes time.

10. Expand the frame list on the left.

11. Add the frame for "CPU sampling by Thread".

12. Select the "CPU sampling by Thread" window; in its context menu you will find the option "Summary Table".
     At this point please make sure that the option "Load Symbols" is selected as well.
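
The capture part of the steps above (start, wait, stop) could be scripted roughly as follows. This is a sketch under assumptions rather than a tested procedure: the helper names and the local symbol cache folder `C:\Symbols` are made up for illustration, `_NT_SYMBOL_PATH` is the usual environment variable for pointing the Windows debugging and tracing tools at the Microsoft symbol server, and the xperf arguments are simply the ones from steps 4 and 6. It has to be run from an elevated prompt with xperf on the PATH.

```python
import os
import subprocess
import time
from typing import Dict, List

# Kernel flags exactly as given in step 4.
KERNEL_FLAGS = "DiagEasy+Profile+PROC_THREAD+LOADER"


def symbol_env(cache_dir: str = r"C:\Symbols") -> Dict[str, str]:
    """Copy of the environment with the symbol path set (cache_dir is an assumption)."""
    env = dict(os.environ)
    env["_NT_SYMBOL_PATH"] = (
        f"srv*{cache_dir}*https://msdl.microsoft.com/download/symbols"
    )
    return env


def start_trace_cmd() -> List[str]:
    """Command line for step 4: start kernel tracing."""
    return ["xperf", "-on", KERNEL_FLAGS]


def stop_trace_cmd(etl_path: str = r"C:\Logs\Thread.etl") -> List[str]:
    """Command line for step 6: stop tracing and write the trace file."""
    return ["xperf", "-d", etl_path]


def capture(seconds: float = 120.0) -> None:
    """Start the trace, wait long enough to cover a slowdown, then stop it."""
    subprocess.run(start_trace_cmd(), check=True)
    time.sleep(seconds)  # step 5: about 2 minutes, so one spike is included
    subprocess.run(stop_trace_cmd(), check=True)
```

Nothing runs until `capture()` is called. The environment returned by `symbol_env()` matters when the trace is opened in the viewer (steps 8 and 9), not while capturing.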

Check it out :)

BR

 

 

