ChronicleMap causes JVM to crash when values are highly variable in size


Problem Description

We've had success so far using ChronicleMap for most things we wanted to use it for, and most data sets have worked just fine. One use case we have is using it as a multimap, covering most of the concerns with doing so. We're using it as a Map<String,Set<Integer>> specifically in this case. However, we've run into some interesting JVM crashes and are having trouble finding a deterministic pattern so we can avoid them.

So, before we put all the Set<Integer> into ChronicleMap, we build each set entirely in the JVM heap first, so we can write it in one shot to reduce fragmentation. Since everything is in memory, we can determine the max and average Set<Integer> sizes, and can easily size the ChronicleMap appropriately using ChronicleMapBuilder.averageValueSize. In most cases, this works just fine.
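A back-of-envelope sizing sketch for the step above. The 4-bytes-per-int figure is an assumption (it ignores collection framing overhead), and the commented builder call is only an illustration of the 2.x-style API named in the question, not verified against 2.1.17:

```java
public class MapSizing {
    // Rough serialized size of a Set<Integer>: ~4 bytes per int
    // (an assumption; real serialization adds some framing overhead).
    static int estimatedValueBytes(int setSize) {
        return 4 * setSize;
    }

    public static void main(String[] args) {
        int avgBytes = estimatedValueBytes(400); // average set of 400 ints
        System.out.println(avgBytes);            // 1600 bytes

        // Hypothetical builder usage (2.x style, per the question text):
        // ChronicleMap<String, Set<Integer>> map = ChronicleMapBuilder
        //         .of(String.class, (Class<Set<Integer>>) (Class<?>) Set.class)
        //         .averageValueSize(avgBytes)
        //         .entries(expectedEntryCount)
        //         .create();
    }
}
```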

In some cases, however, the JVM crashes when the size of the Set<Integer> deviates too far from the average. For example, the average size might be 400, but we could have outlier sets with 20,000 integers in them. We can still size the map using the average serialized size of a set of 400 integers, and it starts populating ChronicleMap just fine until it reaches a list of a very large size.

So the question is: how do I figure out how far I can deviate from the average? I was hoping the average was indeed an average, but there appears to be some maximum size above which the JVM dies.

We devised an algorithm to split the large sets into smaller sets (e.g. if the key was AAA, then now there are keys AAA:1, AAA:2, ... AAA:n). The size of each split set was 10 times the average size. In other words, if the average size was 500, but we had a set that was 20,000, we'd split it into four 5,000 (500 * 10) element sets.
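The splitting described above can be sketched like this (the class name, the chunking order, and the sub-key format are illustrative, not the actual implementation):

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class SetSplitter {
    // Split a large set into chunks of at most factor * averageSize elements,
    // keyed "<key>:1", "<key>:2", ... as described above. Sets at or under
    // the limit keep their original key.
    static Map<String, Set<Integer>> split(String key, Set<Integer> values,
                                           int averageSize, int factor) {
        int limit = averageSize * factor;
        Map<String, Set<Integer>> out = new LinkedHashMap<>();
        if (values.size() <= limit) {
            out.put(key, values);
            return out;
        }
        int part = 1;
        Set<Integer> chunk = new LinkedHashSet<>();
        for (Integer v : values) {
            chunk.add(v);
            if (chunk.size() == limit) {      // chunk full: emit and start a new one
                out.put(key + ":" + part++, chunk);
                chunk = new LinkedHashSet<>();
            }
        }
        if (!chunk.isEmpty()) out.put(key + ":" + part, chunk);
        return out;
    }

    public static void main(String[] args) {
        Set<Integer> big = new LinkedHashSet<>();
        for (int i = 0; i < 20_000; i++) big.add(i);
        // average size 500, factor 10 -> chunks of 5,000
        Map<String, Set<Integer>> parts = split("AAA", big, 500, 10);
        System.out.println(parts.size());              // 4 sub-keys
        System.out.println(parts.get("AAA:1").size()); // 5000 elements each
    }
}
```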

This worked in most cases, but then we ran into another curious case where even this splitting wasn't sufficient. I reduced the factor to 5 times the average size and now it works again... but how do I know that's small enough? I think knowing the root cause, or how to determine exactly what triggers it, would be the best way; but alas, I have no idea why ChronicleMap is struggling here.

Also, FWIW, I'm using an older version 2.1.17. If this is a bug that was fixed in a newer version, I'd like to know a little detail about the bug and if we can avoid it through our own means (like splitting the sets) but still continue using 2.1.17 (we'll upgrade later; just don't want to rock the boat too much more).

Answer

I cannot be 100% sure without reproducing the bug, but I have an idea why JVM crashes occur in this case. If I am right, it happens when your entry size exceeds 64 * chunkSize of the ChronicleMap. The chunk size can be configured directly, but if you configure only the average key and value sizes, it defaults to a power of 2 between averageEntrySize/8 and averageEntrySize/4, where the average entry size is the sum of your averageKeySize and averageValueSize, plus some internal overhead. So in your case, with average values of sets of 400 or 500 ints (4 bytes each) plus small keys, I suppose the chunkSize is computed as 256 bytes, so your entries should be smaller than 256 * 64 = 16384 bytes.
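The rule described above can be sketched numerically as follows. This mirrors the answer's description, not Chronicle Map's actual source, and the ~1700-byte average entry size is an assumed figure (400 ints * 4 bytes + small key + overhead):

```java
public class ChunkSizeRule {
    // Largest power of two not exceeding averageEntrySize / 4.
    // A power of two is always more than half of averageEntrySize / 4,
    // so the result lands in the [averageEntrySize/8, averageEntrySize/4] range.
    static int defaultChunkSize(int averageEntrySize) {
        return Integer.highestOneBit(averageEntrySize / 4);
    }

    // The suspected hard limit: entries spanning more than 64 chunks crash the JVM.
    static int maxSafeEntryBytes(int averageEntrySize) {
        return 64 * defaultChunkSize(averageEntrySize);
    }

    public static void main(String[] args) {
        // Average set of 400 ints (1600 bytes) + small key + overhead, say ~1700 bytes:
        System.out.println(defaultChunkSize(1700));   // 256
        System.out.println(maxSafeEntryBytes(1700));  // 16384
        // An outlier set of 20,000 ints serializes to roughly 80,000 bytes --
        // far past the limit, which matches the observed crashes.
    }
}
```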

Again, if my hypothesis about where this bug comes from is right, Chronicle Map 3 shouldn't have this bug and should allow entries arbitrarily larger than the average size or chunk size.
