似乎在等待的线程的高CPU利用率 [英] High CPU utilization for threads that seem to be waiting

查看:733
本文介绍了似乎在等待的线程的高CPU利用率的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我目前正在运行一些JMeter测试来测试Web服务的性能。它使用了非常多的CPU。对于一个JMeter请求线程,它使用10-30%的任何位置(取决于请求的类型)。当我将其提升到仅仅15个线程时,我获得了~95%的CPU利用率。当然,我想知道发生了什么。我做了一个Hprof CPU示例(我尝试了时间选项但是花了一个半小时才开始我的服务而没有消息会通过)。以下是该抽样结果的片段(超过15分钟)。

I'm currently running some JMeter tests to test the performance of a web service. It's using a very high amount of CPU. For one JMeter request thread, it's using anywhere from 10-30% (depending on the type of request). When I pump it up to a mere 15 threads, I get ~95% CPU utilization. Naturally, I want to figure out what's going on. I did an Hprof CPU sample (I tried the times option but it took an hour and a half to start my service and no messages would go through). Below are the snippets of the results from that sampling (over a 15 minute span).


CPU SAMPLES BEGIN (total = 220846) Fri Aug 22 13:38:54 2014
rank   self  accum   count trace method
   1 14.96% 14.96%   33038 300514 java.net.PlainSocketImpl.socketAccept
   2 14.84% 29.80%   32776 301258 sun.nio.ch.EPollArrayWrapper.epollWait
   3 12.45% 42.26%   27505 313002 sun.nio.ch.EPollArrayWrapper.epollWait
   4  7.48% 49.73%   16517 300604 java.net.PlainSocketImpl.socketAccept
   5  7.18% 56.91%   15856 303203 sun.nio.ch.EPollArrayWrapper.epollWait
   6  6.18% 63.09%   13639 313001 sun.nio.ch.ServerSocketChannelImpl.accept0
   7  6.04% 69.13%   13329 304259 sun.nio.ch.EPoll.epollWait
   8  5.11% 74.23%   11275 307102 sun.nio.ch.EPollArrayWrapper.epollWait

以及相应的堆栈那些顶级样本:

And the corresponding stack for those top samples:


TRACE 300514:
    java.net.PlainSocketImpl.socketAccept(:Unknown line)
    java.net.AbstractPlainSocketImpl.accept(:Unknown line)
    java.net.ServerSocket.implAccept(:Unknown line)
    java.net.ServerSocket.accept(:Unknown line)
    sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(:Unknown line)
    sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(:Unknown line)
    java.lang.Thread.run(:Unknown line)
TRACE 301258:
    sun.nio.ch.EPollArrayWrapper.epollWait(:Unknown line)
    sun.nio.ch.EPollArrayWrapper.poll(:Unknown line)
    sun.nio.ch.EPollSelectorImpl.doSelect(:Unknown line)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(:Unknown line)
    sun.nio.ch.SelectorImpl.select(:Unknown line)
    org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:327)
TRACE 313002:
    sun.nio.ch.EPollArrayWrapper.epollWait(:Unknown line)
    sun.nio.ch.EPollArrayWrapper.poll(:Unknown line)
    sun.nio.ch.EPollSelectorImpl.doSelect(:Unknown line)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(:Unknown line)
    sun.nio.ch.SelectorImpl.select(:Unknown line)
    org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1163)
    java.lang.Thread.run(:Unknown line)
TRACE 300604:
    java.net.PlainSocketImpl.socketAccept(:Unknown line)
    java.net.AbstractPlainSocketImpl.accept(:Unknown line)
    java.net.ServerSocket.implAccept(:Unknown line)
    java.net.ServerSocket.accept(:Unknown line)
    sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(:Unknown line)
    sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(:Unknown line)
    sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(:Unknown line)
    java.lang.Thread.run(:Unknown line)
TRACE 303203:
    sun.nio.ch.EPollArrayWrapper.epollWait(:Unknown line)
    sun.nio.ch.EPollArrayWrapper.poll(:Unknown line)
    sun.nio.ch.EPollSelectorImpl.doSelect(:Unknown line)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(:Unknown line)
    sun.nio.ch.SelectorImpl.select(:Unknown line)
    net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:217)
    net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:836)
TRACE 313001:
    sun.nio.ch.ServerSocketChannelImpl.accept0(:Unknown line)
    sun.nio.ch.ServerSocketChannelImpl.accept(:Unknown line)
    org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:793)
    java.lang.Thread.run(:Unknown line)
TRACE 304259:
    sun.nio.ch.EPoll.epollWait(:Unknown line)
    sun.nio.ch.EPollPort$EventHandlerTask.poll(:Unknown line)
    sun.nio.ch.EPollPort$EventHandlerTask.run(:Unknown line)
    java.lang.Thread.run(:Unknown line)
TRACE 307102:
    sun.nio.ch.EPollArrayWrapper.epollWait(:Unknown line)
    sun.nio.ch.EPollArrayWrapper.poll(:Unknown line)
    sun.nio.ch.EPollSelectorImpl.doSelect(:Unknown line)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(:Unknown line)
    sun.nio.ch.SelectorImpl.select(:Unknown line)
    net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:217)
    net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:836)

正如您所看到的,超过一半的CPU使用量似乎来自应该等待的线程。难道不应该占用CPU时间吗?

As you can see, over half of the CPU usage seems to be coming from threads that should be waiting. Shouldn't that not be taking up CPU time?

我看到这个帖子 http://www.brendangregg.com/blog/2014-06-09/java-cpu-sampling-using-hprof.html 这可能会让我觉得这个结果是误导,但我的顶级-H结果显示最大的CPU使用率,Zabbix监控也是如此。所以,它似乎实际上在消耗CPU。但是,有一个链接来自hprof作者的引用,其中指出:

I saw this thread http://www.brendangregg.com/blog/2014-06-09/java-cpu-sampling-using-hprof.html which might make me think this result is misleading, but my "top -H" results showed maximum CPU usage, as did Zabbix monitoring. So, it appears that it is actually consuming CPU. However, there is a link there to a quote from the hprof author which states:

当你的Java线程以某种方式不使用CPU但是管理保持活动状态时,那么看起来好像这些堆栈跟踪正在消耗如果没有那么大的CPU时间。
When you have Java threads that are somehow not using the CPU, but managing to stay active, then it will appear as if those stack traces are consuming large amounts of CPU time when they aren't.

有人可以解释为什么会出现这种情况以及我可以做些什么来减少CPU使用率这些情况?或者所有CPU使用率指标实际上都是误导性的?如果是这样,在我的服务中理解真正CPU利用率的更好方法是什么?

Could someone explain why this might be the case and what I can do to lessen the CPU usage in these cases? Or are all the CPU usage indicators actually misleading? If so, what would be a better way to understand true CPU utilization in my service?

推荐答案

正如Brendan Gregg所指出的那样你链接的文章,来自JVM认为可运行的所有线程的hprof样本。正如您在Thread.state的Javadoc中所看到的,JVM区分以下线程状态:

As Brendan Gregg points out the blog article you linked, hprof samples from all threads the JVM considers runnable. As you can see in the Javadoc of Thread.state, the JVM distinguishes the following thread states:



  • NEW :尚未启动的线程处于此状态。

  • RUNNABLE:在Java虚拟机中执行的线程处于此状态。

  • BLOCKED:阻塞等待监视器锁定的线程处于此状态。

  • WAITING:无限期等待另一个线程执行特定操作的线程处于此状态。 / li>
  • TIMED_WAITING:正在等待另一个线程执行最多指定等待时间的操作的线程处于此状态。

  • 已终止:A退出的线程处于此状态。

  • NEW: A thread that has not yet started is in this state.
  • RUNNABLE: A thread executing in the Java virtual machine is in this state.
  • BLOCKED: A thread that is blocked waiting for a monitor lock is in this state.
  • WAITING: A thread that is waiting indefinitely for another thread to perform a particular action is in this state.
  • TIMED_WAITING: A thread that is waiting for another thread to perform an action for up to a specified waiting time is in this state.
  • TERMINATED: A thread that has exited is in this state.

正如我们所看到的,JVM没有等待I / O的线程的专用状态。那是因为这样的线程实际上是被操作系统阻止的,而不是JVM。也就是说,就JVM而言,等待网络适配器的线程是可运行的。实际上,RUNNABLE状态的详细Javadoc写道:

As we can see, the JVM does not have a dedicated state for a thread waiting for I/O. That's because such a thread is actually blocked by the operating system, not the JVM. That is, as far the JVM is concerned, a thread waiting for the network adapter is runnable. Indeed, the detail Javadoc for the RUNNABLE state writes:


可运行线程的线程状态。处于可运行状态的线程正在Java虚拟机中执行,但它可能正在等待来自操作系统的其他资源,例如处理器。

Thread state for a runnable thread. A thread in the runnable state is executing in the Java virtual machine but it may be waiting for other resources from the operating system such as processor.

因此,在hprofcpu采样中存在I / O方法并不意味着这些方法消耗了CPU,因为它们的等待时间也被计算在内。

Therefore, the presence of I/O methods in a hprof "cpu" sampling does not imply that these methods consumed CPU, as their wait time is counted as well.

您可以:


  • 假设I / O方法不负责CPU消耗,并专注于其他方法

  • 使用更好的探查器,考虑等待操作系统级资源

这篇关于似乎在等待的线程的高CPU利用率的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆