save method of CRUDRepository is very slow?

Question

I want to store some data in my Neo4j database. I use spring-data-neo4j for that.

My code looks like this:

    for (int i = 0; i < newRisks.size(); i++) {
        myRepository.save(newRisks.get(i));
        System.out.println("saved " + newRisks.get(i).name);
    }

My newRisks list contains about 60,000 objects and 60,000 edges. Every node and edge has one property. This loop takes about 15 to 20 minutes; is this normal? I used Java VisualVM to look for bottlenecks, but my average CPU usage was 10 to 25% (of 4 cores) and my heap was less than half full.

Are there any options to speed up this operation?

Edit: Additionally, on the first call of myRepository.save(newRisks.get(i)); the JVM seems to stall for several minutes before the first output appears.

Second edit:

Class Risk:

@NodeEntity
public class Risk {
    //...
    @Indexed
    public String name;

    @RelatedTo(type = "CHILD", direction = Direction.OUTGOING)
    Set<Risk> risk = new HashSet<Risk>();

    public void addChild(Risk child) {
        risk.add(child);
    }

    //...
}

Risk creation:

@Autowired
private Repository myRepository;

@Transactional
public Collection<Risk> makeSomeRisks() {

    ArrayList<Risk> newRisks = new ArrayList<Risk>();

    newRisks.add(new Risk("Root"));

    for (int i = 0; i < 60000; i++) {
        Risk risk = new Risk("risk " + (i + 1));
        newRisks.get(0).addChild(risk);
        newRisks.add(risk);
    }

    for (int i = 0; i < newRisks.size(); i++) {
        myRepository.save(newRisks.get(i));
    }

    return newRisks;
}


Recommended answer

The problem here is that you are doing mass inserts with an API that is not intended for that.

You create a Risk and 60k children. You first save the root, which also persists the 60k children at the same time (and creates the relationships). That's why the first save takes so long. Then you save the children again.

There are some ways to speed it up with SDN:


  1. Don't use the collection approach for mass inserts; persist both participants and use template.createRelationshipBetween(root, child, "CHILD", false);

  2. Persist the children first, then add all the persisted children to the root object and persist that.

  3. As you did, use the Neo4j Core API, but call template.postEntityCreation(node, Risk.class) so that you can access the entities via SDN. Then you also have to index the entities on your own (db.index.forNodes("Risk").add(node, "name", name);), or use the Neo4j Core API auto-index, but that's not compatible with SDN.
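The second option (children first, then the root) can be sketched as follows. This only illustrates the order of the calls, not SDN itself: Risk is reduced to the fields shown in the question, and RiskStore is a hypothetical stand-in for myRepository (it is not an SDN type) so that the snippet runs on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ChildrenFirst {

    // Reduced version of the question's entity, without the SDN annotations.
    static class Risk {
        final String name;
        final Set<Risk> children = new LinkedHashSet<>();

        Risk(String name) { this.name = name; }

        void addChild(Risk child) { children.add(child); }
    }

    // Hypothetical stand-in for myRepository; assumed to persist one entity per call.
    interface RiskStore {
        Risk save(Risk risk);
    }

    // Saves every child while it is still unconnected, then links the saved
    // children to the root and saves the root once.
    static Risk saveChildrenFirst(RiskStore store, Risk root, List<Risk> children) {
        List<Risk> persisted = new ArrayList<>();
        for (Risk child : children) {
            persisted.add(store.save(child));
        }
        for (Risk child : persisted) {
            root.addChild(child);
        }
        return store.save(root);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Fake store that only records the order in which entities are saved.
        RiskStore store = risk -> { log.add(risk.name); return risk; };

        Risk root = new Risk("Root");
        List<Risk> children = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            children.add(new Risk("risk " + (i + 1)));
        }

        saveChildrenFirst(store, root, children);
        System.out.println(log); // prints [risk 1, risk 2, risk 3, Root]
    }
}
```

Each entity is handed to save exactly once here, whereas the loop in the question saves the root (together with all its children) and then every child a second time.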

Regardless of whether you use the Core API or SDN, you should use transaction sizes of around 10-20k nodes/relationships for best performance.
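One way to get transactions of that size is to cut the list into chunks and save each chunk in its own transaction. The sketch below shows only the chunking. The saveBatch callback is hypothetical: it is assumed to begin a transaction, save the entities it receives, and commit. The batch size of 10000 is simply the low end of the range mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchedSave {

    // Splits items into consecutive chunks of at most batchSize elements and
    // hands each chunk to saveBatch. Returns the number of chunks processed.
    static <T> int saveInBatches(List<T> items, int batchSize, Consumer<List<T>> saveBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // subList is a view on the original list; no elements are copied.
            saveBatch.accept(items.subList(from, to));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 60001; i++) {
            names.add("risk " + i);
        }
        // Hypothetical callback: real code would begin a transaction, save, and commit here.
        int batches = saveInBatches(names, 10000,
                batch -> System.out.println("saving " + batch.size() + " entities"));
        System.out.println(batches + " batches"); // prints "7 batches"
    }
}
```

If the per-batch method relies on Spring's @Transactional, it has to be invoked through the Spring proxy (for example on another bean); a call from within the same class bypasses the proxy and would not start a new transaction per batch.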
