为什么select()在我的程序中消耗了这么多CPU时间? [英] Why does select() consume so much CPU time in my program?

查看:393
本文介绍了为什么select()在我的程序中消耗了这么多CPU时间?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有几个使用MINA的Java应用程序,它们都使用20个MINA线程。一个应用程序为大约10,000个并发连接提供服20可能是该应用程序的一个合理的线程数,虽然我还没有完全描述它(这个问题正在进行中)。另一个应用程序一次只能提供大约15个连接,但启动IO工作,因此它们非常繁忙,无论如何都有20个MINA线程,这显然太多了。



对我来说很奇怪的是,两个应用程序总是把他们的CPU时间的30%,有时高达60%用于MINA的select()方法,在VisualVM中进行分析。调用堆栈如下所示:

  java.lang.Thread.State:sun.nio.ch上的RUNNABLE 
.EPollArrayWrapper.epollWait(原生方法)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
- 已锁定< 40ca5d54> (a sun.nio.ch.Util $ 2)
- 已锁定< 24649fe8> (java.util.Collections $ UnmodifiableSet)
- 已锁定< 3fae9662> (sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
at org.apache.mina.transport.socket.nio.NioProcessor。选择(NioProcessor.java:72)
org.apache.mina.core.polling.AbstractPollingIoProcessor $ Processor.run(AbstractPollingIoProcessor.java:1093)
at org.apache.mina.util.NamePreservingRunnable。 run(NamePreservingRunnable.java:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor $ Worker.run(ThreadPoolExecutor.java: 603)
at java.lang.Thread.run(Thread.java:722)

它似乎是基于繁忙的民意调查,这对我来说听起来真的不对。



当我看到数字那么高时,我应该担心吗?是什么导致这个?这是我需要优化的东西还是更类似于睡眠或闲置程序?如果它更像是睡眠例程,它会以某种方式被安排为比其他CPU工作更低的优先级吗?



更新:这个主题似乎是同一个问题。我遵循了它的建议,现在正在运行Java 1.7.0_45,但我仍然看到选择在具有10k连接的应用程序中占用高达90%的CPU时间。 / p>

我们使用的是MINA 2.0.4,这意味着这个相关的错误是固定的。

解决方案

不幸的是,这是对数字的错误解释。



我多次面对这种情况(并在 stackoverflow )。



主要原因是VisualVM没有显示正确的CPU时间。它显示 RUNNING 状态中线程时间的百分比。但是来自 Thread.State 的文档:


可运行线程的线程状态。可运行的
状态中的线程正在Java虚拟机中执行,但它可能等待来自操作系统的其他资源
,例如处理器。


这正是正在发生的事情。实际上,在OS epoll_wait()调用内阻塞了线程。在Linux框中,有几种方法可以确认是这种情况。



strace '线程



  $ strace -tttT -f -p [thread-id] 

线程ID可以从 jstack 输出获得:

  $ jstack [java-pid] 
[...]
Netty Builtin Server 1#17 prio = 5 os_prio = 31 tid = 0x00000001013dd800 nid = 0xe12f runnable [ 0x0000700001fe4000]
java.lang.Thread.State:在sun.nio.ch.KQueueArrayWrapper.kevent0(本地方法)的RUNNABLE
在sun.nio.ch.KQueueArrayWrapper.poll上的
(KQueueArrayWrapper。 java:198)
[...]

在这种情况下,线程ID是 0xe12f (应转换为十进制)。您将看到大部分时间线程将在 epoll_wait()调用。



pidstat ing thread



  $ pidstat -tu -p [java-pid] | grep [thread pid] 

你会看到这个帖子的系统和用户CPU时间都很短,这意味着它不消耗CPU。



使用 ps


$ b轮询线程状态$ b

  $ ps -eL -o pid,tid,state | grep [thread-id] 

你会看到大部分时间线程处于状态 S Sl (可中断睡眠)而不是 R (可运行)。



如果服务没有运营问题,最后你不应该担心。


I have several Java applications that use MINA, and all of them use 20 MINA threads. One application serves around 10,000 concurrent connections, that are usually idle but receive input sometimes. 20 is likely a reasonable threadcount for that application, although I haven't exactly profiled it (which this question is getting at). Another application serves only around 15 connections at a time but initiates the IO work so they are very busy, and has 20 MINA threads anyway, which is obviously too many.

The thing that's strange to me is both applications always devote about 30%, sometimes as high as 60%, of their CPU time into MINA's select() method, profiled in VisualVM. The call stack looks like this:

java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
- locked <40ca5d54> (a sun.nio.ch.Util$2)
- locked <24649fe8> (a java.util.Collections$UnmodifiableSet)
- locked <3fae9662> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:72)
at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1093)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)

It seems to be based in a busy poll, which sounds really wrong to me.

Should I be worried when I see the number that high? What causes this? Is it something I need to optimize out or is it more akin to a sleep or idle routine? If it's more like a sleep routine is it somehow scheduled to be lower priority than other CPU work?

Update: this thread seems to be the same issue. I followed its advice, and am now running Java 1.7.0_45, but I am still seeing select taking as high as 90% of CPU time in an application with 10k connections.

We are using MINA 2.0.4, which means this relevant bug is fixed.

解决方案

Unfortunately, this is wrong interpretation of the numbers.

I've faced this situation many times (and ask a question on stackoverflow too).

The main reason, is VisualVM doesn't show correct CPU time. It showing percentage of the thread time in RUNNING state. But from documentation on Thread.State:

Thread state for a runnable thread. A thread in the runnable state is executing in the Java virtual machine but it may be waiting for other resources from the operating system such as processor.

This is exactly what's going on. Actually, thread is blocked inside OS epoll_wait() call. On Linux box there are several ways you can confirm it's the case.

strace'ing thread

$ strace -tttT -f -p [thread-id]

Thread id could be obtained from jstack output:

$ jstack [java-pid]
[...]
"Netty Builtin Server 1" #17 prio=5 os_prio=31 tid=0x00000001013dd800 nid=0xe12f runnable [0x0000700001fe4000]
  java.lang.Thread.State: RUNNABLE
  at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
  at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
[...]

in this case thread id is 0xe12f (should be converted to decimal). You will see the most of the time thread will be in epoll_wait() call.

pidstating thread

$ pidstat -tu -p [java-pid] | grep [thread pid]

you will see low system as well as user CPU time by this thread, meaning it doesn't consume CPU.

polling thread state using ps

$ ps -eL -o pid,tid,state | grep [thread-id]

you will see most of the time thread in state S or Sl (interruptible sleep) instead of R (runnable).

In the end you should not worrying about that if there are no operational issues with the service.

这篇关于为什么select()在我的程序中消耗了这么多CPU时间?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆