Waiting Threads Resource Consumption

Problem description

My question:

Do large numbers of threads in a JVM consume a lot of resources (memory, CPU) when the threads are in the TIMED_WAITING state (not sleeping) more than 99.9% of the time? While the threads are waiting, how much CPU overhead, if any, does it cost to maintain them?

Does the answer also apply to non-JVM environments (such as the Linux kernel)?

Context:

My program receives a large number of space-consuming packages. It stores counts of similar attributes across the different packages. A given period after a package is received (possibly hours or days), that package expires, and any count it contributed to should be decremented.

Currently, I achieve this by storing all the packages in memory or on disk. Every 5 minutes, I delete the expired packages from storage and scan the remaining packages to count the attributes. This method uses a lot of memory and has poor complexity (O(n) in both time and memory, where n is the number of unexpired packages), so the program scales badly.

One alternative is to increment the attribute count every time a package arrives and start a Timer() thread that decrements the count once the package expires. This eliminates the need to store all the bulky packages and cuts the time complexity to O(1). However, it creates another problem: my program would then have O(n) threads, which could hurt performance. Since most of the threads will be in the TIMED_WAITING state for the vast majority of their lifecycle (Java's Timer() invokes the Object.wait(long) method), would they still have a large impact on the CPU?

Recommended answer

First, a Java (or .NET) thread != a kernel/OS thread.

A Java Thread is a high-level wrapper that abstracts some of the functionality of a system thread; these kinds of threads are also known as managed threads. At the kernel level, a thread is essentially in one of two states: running or not running. The kernel keeps track of some management information (stack, instruction pointer, thread ID, etc.), but there is no such thing at the kernel level as a thread in the TIMED_WAITING state (the Java counterpart of .NET's WaitSleepJoin state). Those "states" exist only within managed contexts (part of why the C++ std::thread does not have a state member).
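To make the managed-state idea concrete, here is a minimal sketch that parks a thread in Object.wait(long) and asks the JVM what state it reports. The class and method names (WaitStateDemo, observeWaitingState) are hypothetical and not part of the original answer; the state shown is the JVM's bookkeeping, not something the kernel tracks under that name.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; the names are illustrative only.
class WaitStateDemo {
    /** Starts a thread that blocks in Object.wait(long) and returns the state the JVM reports for it. */
    static Thread.State observeWaitingState() throws InterruptedException {
        final Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                try {
                    lock.wait(60_000); // timed wait on a monitor
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        waiter.setDaemon(true);
        waiter.start();

        // Poll until the waiter has actually parked, giving up after about 5 seconds.
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (waiter.getState() != Thread.State.TIMED_WAITING
                && System.nanoTime() < deadline) {
            Thread.sleep(10);
        }
        Thread.State state = waiter.getState();
        waiter.interrupt(); // release the waiter so it does not linger
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observeWaitingState());
    }
}
```

Note that Thread.sleep(long) also reports TIMED_WAITING, so the managed state alone does not tell you which call blocked the thread.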

Having said that, a managed thread can be blocked in a couple of ways, depending on how the block is requested at the managed level. The threading code I've seen in the OpenJDK uses semaphores to handle managed waits (as do other C++ frameworks I've seen that have a sort of "managed" thread class, as well as the .NET Core libraries) and uses a mutex for other types of waits/locks.

Since most implementations use some sort of locking mechanism (like a semaphore or mutex), the kernel generally does the same thing (at least where your question is concerned): it takes the thread off the "run" queue and puts it on a "wait" queue (a context switch). Thread scheduling, and specifically how the kernel handles thread execution, is beyond the scope of this Q&A, especially since your question concerns Java, and Java runs on quite a few different operating systems (each of which handles threading differently).

Answering your questions more directly:

Do large numbers of threads in a JVM consume a lot of resources (memory, CPU) when the threads are in the TIMED_WAITING state (not sleeping) more than 99.9% of the time?

There are a couple of things to note here: each thread you create consumes JVM memory (stack, ID, garbage-collector bookkeeping, etc.), and the kernel consumes kernel memory to manage the thread at the kernel level. That memory consumption does not change unless you explicitly change it. So whether the thread is sleeping or running, the memory is the same.
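One way to "explicitly change" per-thread memory in Java is the Thread constructor that accepts a requested stack size. The sketch below is illustrative only: StackSizeDemo and runWithStackHint are hypothetical names, the 256 KB figure is an arbitrary assumption, and the JVM documents the value as a hint that it may round or ignore on some platforms.

```java
// Hypothetical sketch; the class and method names are illustrative only.
class StackSizeDemo {
    /** Runs a trivial task on a thread created with a stack-size hint; returns true if the task ran. */
    static boolean runWithStackHint(long stackBytes) throws InterruptedException {
        final boolean[] ran = {false};
        // The four-argument constructor takes a requested stack size in bytes.
        // The JVM treats it as a suggestion and may round it or ignore it.
        Thread worker = new Thread(null, () -> ran[0] = true, "small-stack-worker", stackBytes);
        worker.start();
        worker.join(); // join() guarantees the write to ran[0] is visible here
        return ran[0];
    }

    public static void main(String[] args) throws InterruptedException {
        // 256 KB is an arbitrary example value, not a recommendation.
        System.out.println(runWithStackHint(256 * 1024));
    }
}
```

Whatever size is chosen, it is fixed when the thread is created and does not shrink while the thread waits, which is the point made above.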

CPU usage is what changes, based on thread activity and the number of threads requested (remember, a thread also consumes kernel resources and has to be managed at the kernel level, so the more threads there are to handle, the more kernel time is spent managing them).

Keep in mind that the kernel time needed to schedule and run threads is minuscule (that's part of the point of the design), but it's still something to consider if you plan on running a lot of threads. Additionally, if you know your application will run on a CPU (or cluster) with only a few cores, the fewer cores you have available, the more the kernel has to context switch, which adds time in general.

When the threads are waiting, how much CPU overhead, if any, does it cost to maintain them?

None. As noted above, the CPU overhead of managing a thread does not change with the thread's state. Extra CPU might be used for context switching, and the threads themselves will certainly use CPU while active, but there is no additional CPU "cost" to maintaining a waiting thread versus a running one.

Does the answer also apply to non-JVM environments (such as the Linux kernel)?

Yes and no. As stated, managed contexts generally apply to most of those types of environments (e.g. Java, .NET, PHP, Lua), but the contexts vary, and the threading idioms and general functionality depend on the kernel in use. So while one kernel might handle 1000+ threads per process, some have hard limits and others have other issues with high per-process thread counts; you'll have to consult the OS/CPU specs to see what limits apply.

Since most of the threads will be in the TIMED_WAITING state for the vast majority of their lifecycle (Java's Timer() invokes the Object.wait(long) method), would they still have a large impact on the CPU?

No (that's part of the point of a blocked thread), but here is something to consider: what if (edge case) all, or more than 50%, of those threads need to run at exactly the same time? If only a few threads manage your packages, that might not be an issue, but say you have 500+ threads; 250 of them being woken at the same time would cause massive CPU contention.

Since you haven't posted any code, it's hard to make specific suggestions for your scenario, but one option is to store the attributes of each package as a class and keep those objects in a list or hash map. A Timer (or a separate thread) can then check whether the current time has reached a package's expiration time and, if so, run the "expire" code. This cuts the number of threads down to 1 and the access time to O(1); but again, without code, that suggestion might not work in your scenario.
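As a rough illustration of the single-thread idea, here is a minimal sketch that swaps in a ScheduledExecutorService for the Timer mentioned above. Everything in it is an assumption for illustration: the ExpiringCounter class, its record and count methods, and the use of a plain string as the attribute key. It keeps only a small scheduled task per package rather than the package itself, and one worker thread runs every decrement.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; the class and method names are illustrative only.
class ExpiringCounter implements AutoCloseable {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    // One worker thread serves every pending expiry; tasks wait in a queue, not in threads.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Counts one package carrying the attribute and schedules the matching decrement. */
    void record(String attribute, long ttl, TimeUnit unit) {
        counts.computeIfAbsent(attribute, k -> new LongAdder()).increment();
        // Entries are never removed from the map, so the lookup below cannot return null.
        scheduler.schedule(() -> counts.get(attribute).decrement(), ttl, unit);
    }

    /** Returns the current count for the attribute, or 0 if it was never recorded. */
    long count(String attribute) {
        LongAdder adder = counts.get(attribute);
        return adder == null ? 0 : adder.sum();
    }

    @Override
    public void close() {
        scheduler.shutdownNow(); // drops any decrements that are still pending
    }

    public static void main(String[] args) throws InterruptedException {
        try (ExpiringCounter counter = new ExpiringCounter()) {
            counter.record("color=red", 300, TimeUnit.MILLISECONDS);
            counter.record("color=red", 1, TimeUnit.HOURS);
            System.out.println(counter.count("color=red")); // both packages still counted
            Thread.sleep(1500);
            System.out.println(counter.count("color=red")); // the short-lived one has expired
        }
    }
}
```

Two caveats on this design: pending tasks still occupy memory proportional to the number of unexpired packages (though far less than the packages themselves), and the executor keeps its tasks in a heap-ordered delay queue, so scheduling costs O(log n) rather than O(1). Pending decrements are also lost if the process restarts, which matters when expiry times run to hours or days.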

Hope that helps.
