How does NUMA architecture affect the performance of ActivePivot?


Question

We are migrating an ActivePivot application to a new server (4-socket Intel Xeon, 512GB of memory). After deploying, we ran our application benchmark (a mix of large OLAP queries running concurrently with real-time transactions). The measured performance is almost two times slower than on our previous server, which has similar processors but half as many cores and half as much memory.

We have investigated the differences between the two servers, and it appears the larger one has a NUMA architecture (non-uniform memory access). Each CPU socket is physically close to 1/4 of the memory, but further away from the rest of it. The JVM that runs our application allocates one large global heap, and an arbitrary fraction of that heap ends up on each NUMA node. Our analysis is that the memory access pattern is fairly random, so CPU cores frequently waste time accessing remote memory.

We are looking for more feedback about running ActivePivot on NUMA servers. Can we configure ActivePivot cubes or thread pools, change our queries, or configure the operating system?

Recommended answer

Peter described the general JVM options available today to reduce the performance impact of NUMA architectures. In short, a NUMA-aware JVM partitions the heap with respect to the NUMA nodes, and when a thread creates a new object, the object is allocated on the NUMA node of the core that runs that thread (so if the same thread uses it later, the object is in local memory). Also, when compacting the heap, a NUMA-aware JVM avoids moving large chunks of data between nodes (which reduces the length of stop-the-world events).

So on any NUMA hardware, and for any Java application, the -XX:+UseNUMA option should probably be enabled.
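As an illustration of where the flag goes, here is a minimal dry-run sketch that only prints a launch command rather than starting a JVM. The heap size and jar name are hypothetical placeholders, not values from the question, and which garbage collectors honor -XX:+UseNUMA depends on the JVM version, so check the documentation for yours.

```shell
#!/bin/sh
# Dry-run sketch: print a JVM command line with NUMA-aware allocation enabled.
# HEAP_SIZE and APP_JAR are hypothetical placeholders, not values from the post.
HEAP_SIZE="${HEAP_SIZE:-256g}"
APP_JAR="${APP_JAR:-activepivot-app.jar}"

jvm_numa_cmd() {
  # $1 = heap size, $2 = application jar
  # -XX:+UseNUMA asks HotSpot to allocate new objects on the allocating thread's node.
  printf 'java -XX:+UseNUMA -Xms%s -Xmx%s -jar %s\n' "$1" "$1" "$2"
}

# Print the command; run the printed line by hand once it looks right.
jvm_numa_cmd "$HEAP_SIZE" "$APP_JAR"
```

Setting -Xms equal to -Xmx is a common choice for a long-lived in-memory application, but it is an assumption here rather than something the answer prescribes.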

But for ActivePivot that does not help much: ActivePivot is an in-memory database. There are real-time updates, but the bulk of the data resides in main memory for the life of the application. Whatever the JVM options, the data will be split among the NUMA nodes, and the threads that execute queries will access memory randomly. Because most sections of the ActivePivot query engine run as fast as memory can be fetched, the NUMA impact is particularly visible.

So how can you get the most from your ActivePivot solution on NUMA hardware?

There is an easy solution when the ActivePivot application only uses a fraction of the resources (we find this is often the case when several ActivePivot solutions run on the same server). Take, for instance, an ActivePivot solution that only uses 16 cores out of 64, and 256GB out of a terabyte. In that case you can restrict the JVM process itself to a single NUMA node.

On Linux, you prefix the JVM launch command with the following ( http://linux.die.net/man/8/numactl ):

numactl --cpunodebind=xxx
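To make the placeholder concrete, the sketch below assumes node 0 and a hypothetical jar name, and prints the command instead of running it. It also adds --membind, which the answer does not mention: that numactl option restricts memory allocation to the given node, so that both the threads and the heap stay on the same node. `numactl --hardware` lists the nodes available on a machine.

```shell
#!/bin/sh
# Dry-run sketch: print a numactl-prefixed JVM launch bound to one NUMA node.
# NODE and APP_JAR are hypothetical placeholders; list real nodes with: numactl --hardware
NODE="${NODE:-0}"
APP_JAR="${APP_JAR:-activepivot-app.jar}"

bound_jvm_cmd() {
  # $1 = NUMA node number, $2 = application jar
  # --cpunodebind pins the threads to the node's cores (from the answer);
  # --membind keeps allocations on the same node (an addition, not in the answer).
  printf 'numactl --cpunodebind=%s --membind=%s java -jar %s\n' "$1" "$1" "$2"
}

bound_jvm_cmd "$NODE" "$APP_JAR"
```

With --membind, the JVM heap must fit in the memory of that one node, otherwise allocation fails rather than spilling over to a remote node.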

If the entire server is dedicated to one ActivePivot solution, you can leverage the ActivePivot Distributed Architecture to partition the data. If there are 4 NUMA nodes, you start 4 JVMs, each hosting one ActivePivot node and each bound to its own NUMA node. With this deployment, queries are distributed among the ActivePivot nodes, and each one performs its share of the work at maximum performance, entirely within its NUMA node.
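A one-JVM-per-node launch could be sketched as follows. This is a dry run that only prints the commands. The jar name and the -Dexample.numa.node system property are hypothetical placeholders for illustration, not ActivePivot settings; the actual configuration of the distributed nodes is product-specific and not covered here. The --membind option is again an addition to the answer's --cpunodebind.

```shell
#!/bin/sh
# Dry-run sketch: print one launch command per NUMA node (4 by default).
# NODE_COUNT, APP_JAR and -Dexample.numa.node are hypothetical, not ActivePivot settings.
NODE_COUNT="${NODE_COUNT:-4}"
APP_JAR="${APP_JAR:-activepivot-node.jar}"

launch_plan() {
  # $1 = number of NUMA nodes, $2 = application jar
  count="$1"
  jar="$2"
  node=0
  while [ "$node" -lt "$count" ]; do
    # Each JVM is pinned to the CPUs and memory of exactly one node.
    printf 'numactl --cpunodebind=%s --membind=%s java -Dexample.numa.node=%s -jar %s\n' \
      "$node" "$node" "$node" "$jar"
    node=$((node + 1))
  done
}

launch_plan "$NODE_COUNT" "$APP_JAR"
```

Each printed line can then be started in the background or wrapped in a service definition; the heap of each JVM should be sized to fit within a single node's memory.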
