Hazelcast vs. Ignite benchmark

Question

I am using data grids as my primary "database". I noticed a drastic difference between Hazelcast and Ignite query performance. I optimized my data grid usage with proper custom serialization and indexes, but the difference is still noticeable IMO.

Since no one has asked it here, I am going to answer my own question for future reference. This is not an abstract (learning) exercise but a real-world benchmark that models my data grid usage in large SaaS systems, primarily to display sorted and filtered paginated lists. I mainly wanted to know how much overhead my universal JDBC-ish data grid access layer adds compared to raw, framework-free Hazelcast and Ignite usage. But since I am comparing apples to apples, here comes the benchmark.

Answer

I have reviewed the provided code on GitHub and have many comments:

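  1. Probably the most important point is that Apache Ignite indexing is a lot more sophisticated than Hazelcast's. Unlike Hazelcast, Ignite supports ANSI-99 SQL, so you can write your queries freely.
  2. On top of that, unlike Hazelcast, Ignite supports group indexes and SQL JOINs across different caches or data types. Suppose you have Person and Organization tables and need to select all Persons working for the same organization. In Hazelcast this cannot be done in one step (correct me if I am wrong), but in Ignite it is a simple SQL JOIN query.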

Given the above, Ignite indexes will take a bit longer to create, especially in your test, where you have 7 of them.

In your code, the entity you store in the cache, TestEntity, recalculates the values of idSort, createdAtSort, and modifiedAtSort every time the getter is called. Ignite calls these getters several times while the entity is being stored in the index tree. A simple fix to the TestEntity class provides a 4x performance improvement: https://gist.github.com/dsetrakyan/6bfe089d53f888448503
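To illustrate the idea (this is not the code from the gist), here is a minimal sketch of an entity that derives a sort key once in the constructor instead of on every getter call. The class name `SortKeyEntity`, the zero-padded key format, and the `DERIVATIONS` counter are all assumptions made for the example; the real TestEntity may compute its keys differently.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for TestEntity; the fields and key format are assumptions. */
final class SortKeyEntity {
    // Counts how often the key derivation runs, only to make the effect visible.
    static final AtomicInteger DERIVATIONS = new AtomicInteger();

    private final long id;
    private final String idSort; // derived once in the constructor, then reused

    SortKeyEntity(long id) {
        this.id = id;
        this.idSort = deriveSortKey(id);
    }

    // Assumed key format: a zero-padded decimal string that sorts lexicographically.
    private static String deriveSortKey(long value) {
        DERIVATIONS.incrementAndGet();
        return String.format(Locale.ROOT, "%019d", value);
    }

    long getId() {
        return id;
    }

    // Cheap getter: an indexing layer can call it repeatedly without recomputation.
    String getIdSort() {
        return idSort;
    }
}
```

With this shape, repeated getter calls during indexing cost a field read each, regardless of how expensive the derivation is.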

The way you measure heap is incorrect. You should at least call System.gc() before taking the heap measurement, and even that would not be accurate. For example, in the results below, I get a negative heap size using your method.
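A rough sketch of what such a measurement could look like, using only the standard `Runtime` API. The class name `HeapProbe` is made up for this example. The negative figure in the second run of the log further down comes from exactly this kind of subtraction: 978473664 - 1139467848 = -160994184, because a collection happened between the two readings.

```java
/** Hypothetical helper; the name and structure are assumptions for illustration. */
final class HeapProbe {
    /** Returns the currently used heap in bytes after requesting a collection. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        // System.gc() is only a hint to the JVM, so the reading stays approximate.
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        byte[][] payload = new byte[64][1024 * 1024]; // roughly 64 MB kept reachable
        long after = usedHeapBytes();
        System.out.println("Approximate delta: " + (after - before)
                + " bytes for " + payload.length + " blocks");
    }
}
```

Even with the explicit collection, treat the delta as an estimate; a heap dump or a profiler gives a more trustworthy number.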

Every benchmark requires a warm-up. For example, when I apply the TestEntity fix, as suggested above, and do the cache population and queries 2 times, I get better results.
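A minimal sketch of a warm-up-aware timer, assuming the workload can be expressed as a `Runnable`. The class `WarmupTimer` and its parameters are hypothetical; a dedicated harness such as JMH is the more rigorous option.

```java
/** Hypothetical timing helper; names and parameters are assumptions for illustration. */
final class WarmupTimer {
    /**
     * Runs the task warmupRuns times without timing it, then returns the
     * elapsed nanoseconds of each of the measuredRuns timed executions.
     */
    static long[] time(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // lets the JIT compile the hot paths before measuring
        }
        long[] nanos = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }
}
```

For the benchmark discussed here, the task would be one full round of cache population plus queries, with at least one warm-up round discarded.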

I don't think comparing a single-node Data Grid test to MySQL is fair, either for Ignite or for Hazelcast. Databases have their own caching, and whenever you work with such small memory sizes, you are usually testing the database in-memory cache vs. the Data Grid in-memory cache.

The performance benefit usually comes in whenever doing a distributed test over a partitioned cache. This way a Data Grid will execute the query on each cluster node in parallel and the results should come back a lot faster.
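The principle can be simulated inside a single JVM. The sketch below is not how Ignite implements distributed queries; it only assumes that each partition (standing in for a cluster node) filters, sorts, and truncates its own data in parallel, and that the caller merges the partial results. The class `PartitionedTopN` and its parameters are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical single-JVM simulation of a query over a partitioned data set. */
final class PartitionedTopN {
    /**
     * Returns the smallest `limit` values that are >= minValue.
     * Each partition computes its own sorted first-`limit` slice, possibly on
     * different threads, and the partial results are merged and truncated again.
     */
    static List<Integer> query(List<List<Integer>> partitions, int minValue, int limit) {
        return partitions.parallelStream()
                .flatMap(p -> p.stream()
                        .filter(v -> v >= minValue)
                        .sorted()
                        .limit(limit))
                .sorted()
                .limit(limit)
                .collect(Collectors.toList());
    }
}
```

Because every partition returns at most `limit` rows, the merge step handles a small amount of data no matter how large each partition is, which is why a paginated list query tends to benefit from partitioning.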

Here are the results I got for Apache Ignite. They look a lot better after I made the aforementioned fixes.

Note that the 2nd time we execute the cache population and cache queries, we get better results because the HotSpot JVM is warmed up.

It is worth mentioning that Ignite does not cache query results. Every time you run the query, you are executing it from scratch.

[00:45:15] Ignite node started OK (id=0960e091, grid=Benchmark)
[00:45:15] Topology snapshot [ver=1, servers=1, clients=0, CPUs=4, heap=8.0GB]
Starting - used heap: 225847216 bytes
Inserting 100000 records: ....................................................................................................
Inserted all records - used heap: 1001824120 bytes
Cache: 100000 entries, heap size: 775976904 bytes, inserts took 14819 ms
------------------------------------
Starting - used heap: 1139467848 bytes
Inserting 100000 records: ....................................................................................................
Inserted all records - used heap: 978473664 bytes
Cache: 100000 entries, heap size: **-160994184** bytes, inserts took 11082 ms
------------------------------------
Query 1 count: 100, time: 110 ms, heap size: 1037116472 bytes
Query 2 count: 100, time: 285 ms, heap size: 1037116472 bytes
Query 3 count: 100, time: 19 ms, heap size: 1037116472 bytes
Query 4 count: 100, time: 123 ms, heap size: 1037116472 bytes
------------------------------------
Query 1 count: 100, time: 10 ms, heap size: 1037116472 bytes
Query 2 count: 100, time: 116 ms, heap size: 1056692952 bytes
Query 3 count: 100, time: 6 ms, heap size: 1056692952 bytes
Query 4 count: 100, time: 119 ms, heap size: 1056692952 bytes
------------------------------------
[00:45:52] Ignite node stopped OK [uptime=00:00:36:515]

I will create another GitHub repo with the corrected code and post it here when I am more awake (coffee is not helping anymore).
