Really slow load speed Neo4jClient C# LoadCsv


Problem Description

The code I use now is really slow, at about 20 inserts per second, and uses a splitter to create multiple CSV files to load. Is there a way to use "USING PERIODIC COMMIT 1000" in a proper way with the Neo4jClient for .NET?

    public async Task InsertEdgesByName(List<string> nodeListA, List<string> nodeListB,
        List<int> weightList, string type)
    {
        // Write the edges in batches of 200 rows to a temporary CSV file,
        // then load each batch with LOAD CSV.
        for (var i = 0; i < nodeListA.Count; i += 200)
        {
            using (var sw = new StreamWriter(File.OpenWrite($"tempEdge-{type}.csv")))
            {
                sw.Write("From,To,Weight\n");
                for (var j = i; j < i + 200 && j < nodeListA.Count; j++)
                {
                    sw.Write($"{nodeListA[j]}," +
                             $"{nodeListB[j]}," +
                             $"{weightList[j]} + id:{j}" +
                             $"\n");
                }
            }
            var f = new FileInfo($"tempEdge-{type}.csv");

            // Match both endpoints by their label property, then create the typed edge.
            await Client.Cypher
                .LoadCsv(new Uri("file://" + f.FullName), "rels", true)
                .Match("(from {label: rels.From}), (to {label: rels.To})")
                .Create($"(from)-[:{type} {{weight: rels.Weight}}]->(to);")
                .ExecuteWithoutResultsAsync();

            _logger.LogDebug($"{DateTime.Now}\tEdges inserted\t\tedges inserted: {i}");
        }
    }
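
For the USING PERIODIC COMMIT part of the question: some Neo4jClient versions expose an optional periodicCommit argument on LoadCsv, which prepends USING PERIODIC COMMIT to the generated statement. The following is only a minimal sketch, assuming that overload exists in the Neo4jClient version in use; file name and identifiers are taken from the code above:

    // Sketch only: assumes a LoadCsv overload with a periodicCommit parameter,
    // which would make the generated query start with "USING PERIODIC COMMIT 1000".
    var f = new FileInfo($"tempEdge-{type}.csv");

    await Client.Cypher
        .LoadCsv(new Uri("file://" + f.FullName), "rels", true, periodicCommit: 1000)
        .Match("(from {label: rels.From}), (to {label: rels.To})")
        .Create($"(from)-[:{type} {{weight: rels.Weight}}]->(to)")
        .ExecuteWithoutResultsAsync();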

To create the nodes I use

        await Client.Cypher
            .Create("INDEX ON :Node(label);")
            .ExecuteWithoutResultsAsync();

        await Client.Cypher
            .LoadCsv(new Uri("file://" + f.FullName), "csvNode", true)
            .Create("(n:Node {label:csvNode.label, source:csvNode.source})")
            .ExecuteWithoutResultsAsync();

The indexing on label does not seem to change the speed of either insert statement. I have about 200,000 edges to insert; at 20 per second this would take hours. Being able to add USING PERIODIC COMMIT 1000 would clean up my code but wouldn't improve performance much.

Is there a way to speed up the inserts? I know the Neo4jClient is not the fastest, but I would really like to stay within the ASP.NET environment.

public class SimpleNodeModel
{
    public long id { get; set; }
    public string label { get; set; }
    public string source { get; set; } = "";

    public override string ToString()
    {
        return $"label: {label}, source: {source}, id: {id}";
    }

    public SimpleNodeModel(string label, string source)
    {
        this.label = label;
        this.source = source;
    }

    public SimpleNodeModel() { }

    public static string Header => "label,source";

    public string ToCSVWithoutID()
    {
        return $"{label},{source}";
    }
}

Cypher

USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM 'file://F:/edge.csv' AS rels
MATCH (from {label: rels.From}), (to {label: rels.To})
CREATE (from)-[:edge {weight: rels.Weight}]->(to);

Recommended Answer

Regarding the slow speed of the Cypher code at the bottom: that's because you're not using labels in your MATCH, so the MATCH never uses the index to find the nodes quickly. Instead it must scan every node in your database TWICE, once for from and again for to.

Your use of label in the node properties is not the same as the node label. Since you created the nodes with the :Node label, reuse this label in your MATCH:

...
MATCH (from:Node {label: rels.From}), (to:Node {label: rels.To})
...
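
Applied to the Neo4jClient call from the question, only the Match string changes; a minimal sketch reusing the variables from the original method:

    // Adding the :Node label lets the index on :Node(label) be used for both
    // lookups instead of scanning every node in the database twice.
    await Client.Cypher
        .LoadCsv(new Uri("file://" + f.FullName), "rels", true)
        .Match("(from:Node {label: rels.From}), (to:Node {label: rels.To})")
        .Create($"(from)-[:{type} {{weight: rels.Weight}}]->(to)")
        .ExecuteWithoutResultsAsync();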
