Timing a remote call in a multithreaded java program


Problem description

I am writing a stress test that will issue many calls to a remote server. I want to collect the following statistics after the test:

  1. The latency of the remote call, in milliseconds.
  2. The number of operations per second that the remote server can handle.

I can successfully get (2), but I am having problems with (1). My current implementation is very similar to the one shown in this other SO question. And I have the same problem described in that question: latency reported by using System.currentTimeMillis() is longer than expected when the test is run with multiple threads.

I analyzed the problem and I am positive the problem comes from the thread interleaving (see my answer to the other question that I linked above for details), and that System.currentTimeMillis() is not the way to solve this problem.

It seems that I should be able to do it using java.lang.management, which has some interesting methods like:

  1. ThreadMXBean.getCurrentThreadCpuTime()
  2. ThreadMXBean.getCurrentThreadUserTime()
  3. ThreadInfo.getWaitedTime()
  4. ThreadInfo.getBlockedTime()
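
For reference, these are instance methods read off the beans in java.lang.management rather than static calls. Below is a minimal sketch of how they can be obtained and queried; the class name ThreadTimes and the variable names are illustrative only, and blocked/waited times are only tracked once contention monitoring is enabled:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadTimes {
    public static void main(String[] args) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();

        // CPU and user time of the current thread, in nanoseconds
        // (-1 if thread CPU time measurement is unsupported or disabled).
        long cpuNanos  = bean.getCurrentThreadCpuTime();
        long userNanos = bean.getCurrentThreadUserTime();

        // Blocked/waited times (milliseconds) are reported per thread id and
        // require contention monitoring to be switched on first.
        if (bean.isThreadContentionMonitoringSupported()) {
            bean.setThreadContentionMonitoringEnabled(true);
        }
        ThreadInfo info = bean.getThreadInfo(Thread.currentThread().getId());
        long blockedMillis = info.getBlockedTime(); // time spent blocked entering monitors
        long waitedMillis  = info.getWaitedTime();  // time spent in wait()/join()/park()

        System.out.printf("cpu=%dns user=%dns blocked=%dms waited=%dms%n",
                cpuNanos, userNanos, blockedMillis, waitedMillis);
    }
}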

My problem is that even though I have read the API, it is still unclear to me which of these methods will give me what I want. In the context of the other SO question that I linked, this is what I need:

long start_time = **rightMethodToCall()**;
result = restTemplate.getForObject("Some URL",String.class);
long difference = (**rightMethodToCall()** - start_time);

So that the difference gives me a very good approximation of the time that the remote call took, even in a multi-threaded environment.

Restriction: I'd like to avoid protecting that block of code with a synchronized block because my program has other threads that I would like to allow to continue executing.
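
One way to honour this restriction, sketched below, is to have each thread record its own duration into a lock-free collection and aggregate only after the test, so no synchronized block is needed around the timed call. This sketch assumes System.nanoTime() as the timing source (see the answer below); the StressWorker class and its names are hypothetical:

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class StressWorker implements Runnable {
    // Shared, thread-safe queue of per-call latencies in nanoseconds.
    private static final Queue<Long> latenciesNanos = new ConcurrentLinkedQueue<>();

    private final Runnable remoteCall;

    public StressWorker(Runnable remoteCall) {
        this.remoteCall = remoteCall; // e.g. () -> restTemplate.getForObject("Some URL", String.class)
    }

    @Override
    public void run() {
        long start = System.nanoTime();   // stand-in for "rightMethodToCall()"
        remoteCall.run();                 // the remote call being measured
        latenciesNanos.add(System.nanoTime() - start);
    }

    public static Queue<Long> results() {
        return latenciesNanos;
    }
}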

Providing more information:

The issue is this: I want to time the remote call, and just the remote call. If I use System.currentTimeMillis or System.nanoTime(), AND if I have more threads than cores, then it is possible that I could have this thread interleaving:

  1. Thread1: long start_time ...
  2. Thread1: result = ...
  3. Thread2: long start_time ...
  4. Thread2: result = ...
  5. Thread2: long difference ...
  6. Thread1: long difference ...

If that happens, then the difference calculated by Thread2 is correct, but the one calculated by Thread1 is incorrect (it would be greater than it should be). In other words, for the measurement of the difference in Thread1, I would like to exclude the time of lines 4 and 5. Is this time that the thread was WAITING?

Summarizing question in a different way in case it helps other people understand it better (this quote is how @jason-c put it in his comment.):

I am trying to time the remote call, but am running the test with multiple threads just to increase testing volume.

Recommended answer

Use System.nanoTime() (but see updates at end of this answer).

You definitely don't want to use the current thread's CPU or user time, as user-perceived latency is wall clock time, not thread CPU time. You also don't want to use the current thread's blocking or waiting time, as it measures per-thread contention times which also doesn't accurately represent what you are trying to measure.

System.nanoTime() will return relatively accurate results (although granularity is technically only guaranteed to be as good or better than currentTimeMillis(), in practice it tends to be much better, generally implemented with hardware clocks or other performance timers, e.g. QueryPerformanceCounter on Windows or clock_gettime on Linux) from a high resolution clock with a fixed reference point, and will measure exactly what you are trying to measure.

long start_time = System.nanoTime();
result = restTemplate.getForObject("Some URL",String.class);
long difference = (System.nanoTime() - start_time);
long milliseconds = difference / 1000000;
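
As a small aside that is not part of the original answer, java.util.concurrent.TimeUnit expresses the same nanosecond-to-millisecond conversion a bit more readably:

long milliseconds = java.util.concurrent.TimeUnit.NANOSECONDS.toMillis(difference);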

System.nanoTime() does have its own set of issues, but be careful not to get whipped up in paranoia; for most applications it is more than adequate. You just wouldn't want to use it for, say, precise timing when sending audio samples to hardware or something (which you wouldn't do directly in Java anyways).

Update 1:

More importantly, how do you know the measured values are longer than expected? If your measurements are showing true wall clock time, and some threads are taking longer than others, that is still an accurate representation of user-perceived latency, as some users will experience those longer delay times.

Update 2 (based on clarification from the comments):

Much of my answer above is still valid then, but for different reasons.

Using per-thread time does not give you an accurate representation because a thread could be idle/inactive while the remote request is still processing, and you would therefore exclude that time from your measurement even though it is part of perceived latency.

Further inaccuracies are introduced by the remote server taking longer to process the simultaneous requests you are making - this is an extra variable that you are adding (although it may be acceptable as representative of the remote server being busy).

Wall time is also not completely accurate because, as you have seen, variances in local thread overhead may add extra latency that isn't typically present in single-request client applications (although this may still be acceptable as representative of a multi-threaded client application, it is a variable you cannot control).

Of those two, wall time will still get you closer to the actual result than per-thread time, which is why I left the previous answer above. You have a few options:

  • You could do your tests on a single thread, serially -- this is ultimately the most accurate way to achieve your stated requirements.
  • You could not create more threads than cores, e.g. a fixed size thread pool with bound affinities (tricky: Java thread affinity) to each core and measurements running as tasks on each. Of course this still adds any variables due to synchronization of underlying mechanisms that are beyond your control. This may reduce the risk of interleaving (especially if you set the affinities) but you still do not have full control over e.g. other threads the JVM is running or other unrelated processes on the system.
  • You could measure the request handling time on the remote server; of course this does not take network latency into account.
  • You could continue using your current approach and do some statistical analysis on the results to remove outliers (see the sketch after this list).
  • You could not measure this at all, and simply do user tests and wait for a comment on it before attempting to optimize it (i.e. measure it with people, who are what you're developing for anyways). If the only reason to optimize this is for UX, it could very well be the case that users have a pleasant experience and the wait time is totally acceptable.
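
As a rough sketch of the statistical-analysis option, one could sort the collected nanoTime samples and drop the slowest tail before averaging. The LatencyStats class, the 95% cut-off, and the sample values are illustrative assumptions, not something prescribed by the answer:

import java.util.Arrays;

public class LatencyStats {
    // Mean of the fastest keepFraction of the samples, converted to milliseconds;
    // e.g. keepFraction = 0.95 discards the slowest 5% as outliers.
    public static double trimmedMeanMillis(long[] nanoSamples, double keepFraction) {
        long[] sorted = nanoSamples.clone();
        Arrays.sort(sorted);
        int keep = Math.max(1, (int) (sorted.length * keepFraction));
        double sumNanos = 0;
        for (int i = 0; i < keep; i++) {
            sumNanos += sorted[i];
        }
        return (sumNanos / keep) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Four fake samples in nanoseconds; the last one is an obvious outlier.
        long[] samples = {12_000_000, 13_500_000, 11_800_000, 95_000_000};
        System.out.printf("trimmed mean = %.2f ms%n", trimmedMeanMillis(samples, 0.95));
    }
}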

Also, none of this makes any guarantees that other unrelated threads on the system won't be affecting your timings, but that is why it is important to both a) run your test multiple times and average (obviously) and b) set an acceptable requirement for the timing error that you are OK with (do you really need to know this to, e.g., 0.1ms accuracy?).

Personally, I would either do the first, single-threaded approach and let it run overnight or over a weekend, or use your existing approach and remove outliers from the result and accept a margin of error in the timings. Your goal is to find a realistic estimate within a satisfactory margin of error. You will also want to consider what you are going to ultimately do with this information when deciding what is acceptable.
