Neo4J huge performance degradation after records added to spatial layer


Problem description

I have around 70 million spatial records that I want to add to a spatial layer. I tested with a small set and everything went smoothly: queries return the same results as PostGIS and the layer operations seem fine. But when I try to add all the spatial records to the database, performance degrades rapidly: it gets really slow at around 5 million records (about 2 hours of running time) and hangs at about 7.7 million (8 hours elapsed).

Since the spatial index is an R-tree that uses the graph structure to construct itself, I am wondering why it degrades as the number of records increases. R-tree insertions are O(n) if I'm not mistaken, which is why I'm concerned that rearranging the bounding boxes of the non-leaf nodes is what makes the addToLayer process slower over time.

Currently I'm adding nodes to the layer like this (lots of hardcoded stuff, since I'm trying to figure out the problem before worrying about patterns and code style):

// Outer transaction: stays open for the whole run.
Transaction tx = database.beginTx();
try {
    ResourceIterable<Node> layerNodes = GlobalGraphOperations.at(database).getAllNodesWithLabel(label);
    long i = 0L;
    for (Node node : layerNodes) {
        // Inner transaction: opened once per node, inside the outer one.
        Transaction tx2 = database.beginTx();
        try {
            layer.add(node);
            i++;
            if (i % commitInterval == 0) {
                log("indexing (" + i + " nodes added) ... time in seconds: "
                        + (1.0 * (System.currentTimeMillis() - startTime) / 1000));
            }
            tx2.success();
        } finally {
            tx2.close();
        }
    }
    tx.success();
} finally {
    tx.close();
}
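One thing worth checking in the loop above: as far as I know, in embedded Neo4j 2.x a beginTx() issued while another transaction is already open on the same thread joins that outer transaction instead of committing on its own, so nothing would actually be committed until the outermost close(). Below is a minimal sketch of committing in fixed-size batches instead. Tx, TxFactory and BatchedIndexer are hypothetical stand-ins written for illustration, not Neo4j types; with Neo4j the factory role would be played by database.beginTx() and the action by layer.add(node). How to keep the node iterator valid across commits (for example by collecting node ids beforehand) is left out.

```java
import java.util.Iterator;
import java.util.function.Consumer;

/** Hypothetical stand-in for a database transaction (not a Neo4j type). */
interface Tx extends AutoCloseable {
    void success();

    @Override
    void close();
}

/** Hypothetical factory; with Neo4j this role would be played by database.beginTx(). */
interface TxFactory {
    Tx begin();
}

final class BatchedIndexer {
    /**
     * Applies the action to every item, opening one top-level transaction per
     * batch of batchSize items instead of one nested transaction per item.
     * Returns the number of transactions that were marked successful.
     */
    static <T> int indexInBatches(Iterator<T> items, int batchSize,
                                  TxFactory txFactory, Consumer<T> action) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int committed = 0;
        while (items.hasNext()) {
            Tx tx = txFactory.begin();
            try {
                int inBatch = 0;
                while (inBatch < batchSize && items.hasNext()) {
                    action.accept(items.next());
                    inBatch++;
                }
                tx.success();
                committed++;
            } finally {
                // Closing here bounds how much uncommitted state is held at once.
                tx.close();
            }
        }
        return committed;
    }
}
```

The batch size is a trade-off: very small batches pay the commit cost too often, very large ones let uncommitted state pile up on the heap again.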

Any thoughts? Any ideas on how performance could be improved?

PS: Using the Java API, Neo4j 2.1.2, Spatial 0.13. Core i5 3570K @ 4.5 GHz, 32 GB RAM, and a dedicated 2 TB 7200 rpm hard drive for the database (no OS, no virtual memory files, only the data itself).

PS2: All geometries are LineStrings (if that's important :P); they represent streets, roads, etc.

PS3: The nodes are already in the database; I only need to add them to the layer so that I can perform spatial queries. The bbox and wkb attributes are OK, tested and working for a small set.

Thank you in advance

After altering the code and running it again (it takes 5 hours just to insert the points into the database, with no layer involved), this happened. I will try to increase the JVM heap and the embedded graph memory parameters.

indexing (4020000 nodes added) ... time in seconds: 8557.361
Exception in thread "main" org.neo4j.graphdb.TransactionFailureException: Unable to commit transaction
    at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:140)
    at gis.CataImporter.addDataToLayer(CataImporter.java:263)
    at Neo4JLoadData.addDataToLayer(Neo4JLoadData.java:138)
    at Neo4JLoadData.main(Neo4JLoadData.java:86)
Caused by: javax.transaction.SystemException: Kernel has encountered some problem, please perform neccesary action (tx recovery/restart)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at org.neo4j.kernel.impl.transaction.KernelHealth.assertHealthy(KernelHealth.java:61)
    at org.neo4j.kernel.impl.transaction.TxManager.assertTmOk(TxManager.java:339)
    at org.neo4j.kernel.impl.transaction.TxManager.getTransaction(TxManager.java:725)
    at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:119)
    ... 3 more
Caused by: javax.transaction.xa.XAException
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:560)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:448)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:385)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:123)
    at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:124)
    at gis.CataImporter.addDataToLayer(CataImporter.java:256)
    ... 2 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.neo4j.kernel.impl.nioneo.store.DynamicRecord.clone(DynamicRecord.java:179)
    at org.neo4j.kernel.impl.nioneo.store.PropertyBlock.clone(PropertyBlock.java:215)
    at org.neo4j.kernel.impl.nioneo.store.PropertyRecord.clone(PropertyRecord.java:221)
    at org.neo4j.kernel.impl.nioneo.xa.Loaders$2.clone(Loaders.java:118)
    at org.neo4j.kernel.impl.nioneo.xa.Loaders$2.clone(Loaders.java:81)
    at org.neo4j.kernel.impl.nioneo.xa.RecordChanges$RecordChange.ensureHasBeforeRecordImage(RecordChanges.java:217)
    at org.neo4j.kernel.impl.nioneo.xa.RecordChanges$RecordChange.prepareForChange(RecordChanges.java:162)
    at org.neo4j.kernel.impl.nioneo.xa.RecordChanges$RecordChange.forChangingData(RecordChanges.java:157)
    at org.neo4j.kernel.impl.nioneo.xa.PropertyCreator.primitiveChangeProperty(PropertyCreator.java:64)
    at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransactionContext.primitiveChangeProperty(NeoStoreTransactionContext.java:125)
    at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.nodeChangeProperty(NeoStoreTransaction.java:1244)
    at org.neo4j.kernel.impl.persistence.PersistenceManager.nodeChangeProperty(PersistenceManager.java:119)
    at org.neo4j.kernel.impl.api.KernelTransactionImplementation$1.visitNodePropertyChanges(KernelTransactionImplementation.java:344)
    at org.neo4j.kernel.impl.api.state.TxStateImpl$6.visitPropertyChanges(TxStateImpl.java:238)
    at org.neo4j.kernel.impl.api.state.PropertyContainerState.accept(PropertyContainerState.java:187)
    at org.neo4j.kernel.impl.api.state.NodeState.accept(NodeState.java:148)
    at org.neo4j.kernel.impl.api.state.TxStateImpl.accept(TxStateImpl.java:160)
    at org.neo4j.kernel.impl.api.KernelTransactionImplementation.createTransactionCommands(KernelTransactionImplementation.java:332)
    at org.neo4j.kernel.impl.api.KernelTransactionImplementation.prepare(KernelTransactionImplementation.java:123)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.prepareKernelTx(XaResourceManager.java:900)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:510)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:548)
    ... 7 more

28/07 -> Increasing memory did not help. Now I'm testing some modifications in RTreeIndex and LayerRTreeIndex. What exactly does the field maxNodeReferences do?

// Constructor

public LayerRTreeIndex(GraphDatabaseService database, Layer layer) {
    this(database, layer, 100);     
}

public LayerRTreeIndex(GraphDatabaseService database, Layer layer, int maxNodeReferences) {
    super(database, layer.getLayerNode(), layer.getGeometryEncoder(), maxNodeReferences);
    this.layer = layer;
}

It is hardcoded to 100, and changing its value changes the point (in terms of number of nodes added) at which my addToLayer method crashes with an OutOfMemoryError. If I'm not mistaken, changing that field's value increases or decreases the tree's width and depth (100 being wider than 50, and 50 being deeper than 100).
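To put rough numbers on the width versus depth trade-off, here is a small sketch that computes the minimum number of levels a tree needs for a given fan-out. It assumes maxNodeReferences is the maximum number of entries per index node (which the constructor name suggests but the post does not confirm) and that every node is completely full, so real trees would be somewhat deeper. RTreeHeight and minLevels are names made up for this example.

```java
final class RTreeHeight {
    /**
     * Smallest number of levels needed to hold the given number of entries
     * when every node holds exactly fanOut children (best case, full nodes).
     */
    static int minLevels(long entries, int fanOut) {
        if (entries < 1 || fanOut < 2) {
            throw new IllegalArgumentException("entries must be >= 1 and fanOut >= 2");
        }
        int levels = 0;
        long remaining = entries;
        while (remaining > 1) {
            // Number of parent nodes needed to reference `remaining` children.
            remaining = (remaining + fanOut - 1) / fanOut;
            levels++;
        }
        return Math.max(1, levels);
    }

    public static void main(String[] args) {
        long records = 70_000_000L; // record count mentioned in the question
        System.out.println("fan-out 100: " + minLevels(records, 100) + " levels");
        System.out.println("fan-out 50:  " + minLevels(records, 50) + " levels");
    }
}
```

Under these assumptions 70 million entries need at least 4 levels at a fan-out of 100 and at least 5 at a fan-out of 50, so the depth difference between the two settings is about one level.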

To summarize the progress so far:

  • Incorrect usage of transactions, corrected by @Jim
  • Heap memory increased to 27 GB following @Peter's advice
  • 3 spatial layers to go, but now the problem gets real because they're the big ones
  • Did some memory profiling while adding nodes to the spatial layer and found some interesting points

Memory and GC profiling: http://postimg.org/gallery/biffn9zq/

The type that uses the most memory throughout the entire process is byte[], which I can only assume belongs to the geometries' wkb properties (either the geometry itself or the R-tree's bbox). With that in mind, I also noticed (you can check the new profiling images) that the amount of heap space used never goes below the 18 GB mark.

According to the question "Are Java primitives garbage collected", primitive types in Java are raw data and therefore not subject to garbage collection; they are only freed from the method's stack when the method returns (so maybe when I create a new spatial layer, all those wkb byte arrays remain in memory until I manually close the layer object).

Does that make any sense? Isn't there a better way to manage memory resources so that the layer doesn't keep unused, old data loaded?
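On the garbage-collection point: a byte[] is an ordinary heap object even though its elements are primitive, so it becomes collectable once nothing references it. Only standalone primitive values (locals, fields) fall outside the collector. A heap floor of 18 GB would then suggest that something still holds references to those arrays; a cache or pending transaction state would be candidates, though that is a guess. The sketch below only demonstrates the language-level fact; ArrayFacts is a name invented for the example.

```java
import java.lang.ref.WeakReference;

final class ArrayFacts {
    /** True when the value is an array whose elements are primitives, e.g. byte[]. */
    static boolean isPrimitiveArray(Object value) {
        Class<?> type = value.getClass();
        return type.isArray() && type.getComponentType().isPrimitive();
    }

    public static void main(String[] args) {
        byte[] wkb = new byte[1024]; // stand-in for a geometry's WKB payload

        // Compiles because every array, including byte[], is an Object.
        Object asObject = wkb;
        System.out.println("primitive array: " + isPrimitiveArray(asObject));

        // Type arguments must be reference types, so this only compiles
        // because byte[] is one. The referent may be reclaimed by the
        // collector once no strong reference to it remains.
        WeakReference<byte[]> weak = new WeakReference<>(new byte[1024]);
        System.out.println("weak reference created: " + (weak != null));

        // The element type itself is primitive; the array that holds it is not.
        System.out.println("byte is primitive: " + byte.class.isPrimitive());
        System.out.println("byte[] is primitive: " + byte[].class.isPrimitive());
    }
}
```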

Solution

Finally solved the problem with three fixes:

  • setting cache_type=none
  • increasing the heap size for the neostore low-level graph engine
  • setting use_memory_mapped_buffers=true, so that memory management is done by the OS and not the slowish JVM

That way, my custom batch insertion into the spatial layers went much faster, and without any errors or exceptions.

Thanks for all the help provided. I guess my answer is just a combination of all the tips people provided here. Thanks very much.
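For reference, the three settings could be collected in a plain map before being handed to the database builder. This is a sketch under assumptions: cache_type and use_memory_mapped_buffers come from the answer, while reading "heap size for the neostore low-level graph engine" as the neostore.*.mapped_memory keys of Neo4j 2.x is my interpretation, and the size passed in is a placeholder, not a recommendation. ImportTuning is a made-up helper name; in 2.x the map would, as far as I know, be passed to GraphDatabaseBuilder.setConfig(Map).

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ImportTuning {
    /**
     * Builds the configuration described in the answer.
     * mappedMemoryPerStore is a placeholder size such as "2G".
     */
    static Map<String, String> settings(String mappedMemoryPerStore) {
        Map<String, String> config = new LinkedHashMap<>();

        // Named explicitly in the answer.
        config.put("cache_type", "none");
        config.put("use_memory_mapped_buffers", "true");

        // Assumption: these store-level keys are what the answer means by
        // "heap size for the neostore low-level graph engine".
        config.put("neostore.nodestore.db.mapped_memory", mappedMemoryPerStore);
        config.put("neostore.relationshipstore.db.mapped_memory", mappedMemoryPerStore);
        config.put("neostore.propertystore.db.mapped_memory", mappedMemoryPerStore);
        config.put("neostore.propertystore.db.arrays.mapped_memory", mappedMemoryPerStore);
        return config;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : settings("2G").entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
    }
}
```

Memory-mapped buffers live outside the Java heap, so the sum of the mapped sizes plus the JVM heap would have to fit in the machine's 32 GB.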
