Protocol Buffer first usage high latency

Problem Description

In one of our Java applications we have quite a few protocol buffer classes, and the jar essentially exposes one interface with one method that is used by another application. We have noticed that the first time this method is called the invocation time is quite high (>500ms), while subsequent calls are much faster (<10ms). At first we assumed this had something to do with our code; however, after profiling we could not confirm this. Through a process of elimination it became obvious that it has something to do with protocol buffers.

This was further confirmed when a different application, which works completely differently but also uses protocol buffers, showed the same behavior. Additionally, we tried creating a dummy instance (XY.newBuilder().build()) of all the protocol buffer classes at startup, and with each one we added we noticed the overhead of the first invocation drop.
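A minimal sketch of that startup workaround. Real code would touch each generated protobuf class directly (e.g. XY.newBuilder().build()); here a stand-in class with a static initializer mimics the one-time descriptor setup that a generated message class performs, so the example is self-contained. EagerInit, FakeProtoMessage, and preload are illustrative names, not from the original post.

```java
import java.util.ArrayList;
import java.util.List;

public class EagerInit {
    static final List<String> initialized = new ArrayList<>();

    // Stand-in for a generated protobuf message class: the expensive work
    // happens in its static initializer on first touch.
    static class FakeProtoMessage {
        static { initialized.add("FakeProtoMessage"); }
    }

    // Force class initialization up front, as the first newBuilder() call would.
    static void preload(String... classNames) {
        for (String name : classNames) {
            try {
                Class.forName(name, /*initialize=*/ true,
                        EagerInit.class.getClassLoader());
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        preload("EagerInit$FakeProtoMessage");
        // The static block has already run before any "real" use of the class.
        System.out.println(initialized);
    }
}
```

Calling preload at application startup moves the one-time class-initialization cost out of the first real request.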

In .NET I can find another question that shows a similar problem (Why is ProtoBuf so slow on the 1st call but very fast inside loops?); however, the solution there seems to be specific to C#, with precompiled serializers. I couldn't find the same issue covered for Java so far. Are there workarounds like the one shown in the question above that apply to Java?

Answer

The JVM ships with a just-in-time (JIT) compiler, which performs a lot of optimization on your code. You can dig into JVM internals if you want to understand it further: there is class loading and unloading, performance profiling, code compilation and decompilation, biased locking, and so on.

To give you an example of how complex this can get, as per this article, OpenJDK has two compilers (C1 and C2) with five possible tiers of code compilation:

Tiered compilation has five tiers of optimization. It starts in tier-0, the interpreter tier, where instrumentation provides information on the performance critical methods. Soon enough the tier 1 level, the simple C1 (client) compiler, optimizes the code. At tier 1, there is no profiling information. Next comes tier 2, where only a few methods are compiled (again by the client compiler). At tier 2, for those few methods, profiling information is gathered for entry-counters and loop-back branches. Tier 3 would then see all the methods getting compiled by the client compiler with full profiling information, and finally tier 4 would avail itself of C2, the server compiler.
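As a small aside, you can confirm which JIT compiler your JVM is running, and how much time it has spent compiling, through the standard java.lang.management API (these are real JDK classes; the exact output varies by JVM):

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitInfo {
    public static void main(String[] args) {
        // Query the running JVM's JIT compiler via the management API.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        System.out.println("JIT compiler: " + jit.getName());
        if (jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("Cumulative compile time (ms): "
                    + jit.getTotalCompilationTime());
        }
    }
}
```

On HotSpot you can also start the JVM with -XX:+PrintCompilation to watch individual methods move through the tiers described above.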

The takeaway here is that if you require predictable performance, you should always warm up your code by running some dummy requests after each deployment.

You did the right thing with the dummy code creating all the protobuf objects you use, but you should take it a step further and warm up the actual method you are hitting.
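A hedged sketch of that kind of warm-up. MessageService and handleRequest are placeholder names standing in for the single interface and method the jar exposes; the iteration count is arbitrary:

```java
public class Warmup {
    // Hypothetical stand-in for the jar's exposed interface.
    interface MessageService {
        String handleRequest(String payload);
    }

    // Call the real method enough times that class loading, protobuf
    // initialization, and JIT compilation all happen before real traffic.
    static void warmUp(MessageService service, int iterations) {
        for (int i = 0; i < iterations; i++) {
            service.handleRequest("warmup-payload");
        }
    }

    public static void main(String[] args) {
        // Dummy implementation, just for the sketch.
        MessageService service = payload -> "ok:" + payload;
        warmUp(service, 1_000);
        System.out.println(service.handleRequest("real"));  // → ok:real
    }
}
```

Running this at startup (or after each deployment) shifts the first-call cost away from real users; ideally the warm-up payloads resemble real requests so the same code paths get compiled.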
