Neo4j over bolt protocol has very high latency

Question

I'm using Neo4j for a project, with the official Neo4j driver for .NET found here:

https://www.nuget.org/packages/Neo4j.Driver

This driver works over the Bolt protocol; my assumption was that a specialized binary protocol would be more efficient than the HTTP API. But ever since the start of the project, I've noticed relatively high latencies from Neo4j for even very simple operations. For example, a match like the following takes 30-60ms when UserID is an indexed field and the database is otherwise completely empty:

match(n:User { UserID: 1 }) return n.UserID

This behavior occurs both on my local machine (near-zero network overhead) and in our production environment. I started investigating this today and found that the query returns quickly, but actually streaming in the results takes a long time. For example, on localhost the query below takes 0.2ms before the call returns, but then calling ToArray() on result (buffering the records, which in this case is a single integer field) increases the total time to 60ms.

using (var driver = GraphDatabase.Driver("bolt://localhost:7687", AuthTokens.Basic("neo4j", "1")))
{    
    using (var session = driver.Session())
    {
        // 0.2ms to return from this call
        var result = session.Run("match(n:User { ID: 1}) return n.ID"); 

        // Uncommenting this makes the whole thing take 60ms
        // result.ToArray(); 
    }
}
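The split between "the call returns" and "the records are buffered" can be measured separately. The following is a minimal sketch of that measurement, not the driver's actual behavior: `FakeRun` is a hypothetical stand-in that uses a lazy iterator and `Thread.Sleep` to imitate a result whose cost is only paid on enumeration, and `LazyTimingSketch` and `Measure` are illustrative names. It does not touch Neo4j at all.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public static class LazyTimingSketch
{
    // Hypothetical stand-in for a lazily streamed result: as an iterator
    // method, its body does not run until the sequence is enumerated.
    public static IEnumerable<int> FakeRun(int delayMs)
    {
        Thread.Sleep(delayMs); // simulated cost of pulling records in
        yield return 1;
    }

    // Returns (time until the call returns, time spent buffering the records).
    public static Tuple<TimeSpan, TimeSpan> Measure(Func<IEnumerable<int>> run)
    {
        var sw = Stopwatch.StartNew();
        var result = run();               // returns at once for a lazy sequence
        var callTime = sw.Elapsed;

        sw.Restart();
        var count = result.ToArray().Length; // enumeration (and the delay) happens here
        var consumeTime = sw.Elapsed;

        if (count == 0) throw new InvalidOperationException("expected at least one record");
        return Tuple.Create(callTime, consumeTime);
    }

    public static void Main()
    {
        var timings = Measure(() => FakeRun(50));
        Console.WriteLine("call: {0:F3} ms, consume: {1:F3} ms",
            timings.Item1.TotalMilliseconds, timings.Item2.TotalMilliseconds);
    }
}
```

To time the real driver the same way, the `FakeRun` delegate would be swapped for a call that returns the session's result, with the `int` element type adjusted to the driver's record type.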

I then tried the community-sponsored Neo4jClient package, which works over HTTP:

https://github.com/Readify/Neo4jClient

With the same query, the total time is reduced to just 0.5ms:

var client = new GraphClient(new Uri("http://localhost:7474/db/data"), "neo4j", "1");
client.Connect();

client.Cypher.Match("(n:User { ID: 1})").Return<int>("n.ID").Results.ToArray();

Running a more formal benchmark gives the following results, which show a huge difference between the Bolt-based official driver and the HTTP-based Neo4jClient.

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4770 CPU 3.40GHz, ProcessorCount=8
Frequency=3312642 ticks, Resolution=301.8739 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
GC=Concurrent Workstation
JitModules=clrjit-v4.6.1586.0

Type=Neo4jBenchmarks  Mode=Throughput  Platform=X64  
Jit=RyuJit  

      Method |         Median |      StdDev | Scaled | Scaled-SD |
------------- |--------------- |------------ |------- |---------- |
  Neo4jClient |    382.5675 us |   3.3771 us |   1.00 |      0.00 |
Neo4jSession | 61,299.9382 us | 690.1626 us | 160.02 |      2.24 |

So the HTTP client is 160x faster when network overhead is negligible.

I also ran the benchmark on our production environment, and while the difference wasn't as large, the HTTP method was still 6x faster (and my network connection to production is pretty slow).

Full benchmark code:

public class Neo4jBenchmarks
{
    private readonly IDriver _driver;
    private readonly GraphClient _client;

    public Neo4jBenchmarks()
    {
      _driver = GraphDatabase.Driver("bolt://localhost:7687", AuthTokens.Basic("neo4j", "1"));
      _client = new GraphClient(new Uri("http://localhost:7474/db/data"), "neo4j", "1");
      _client.Connect();
    }

    [Benchmark(Baseline = true)]
    public void Neo4jClient()
    {
      _client.Cypher.Match("(n:User { ID: 1})").Return<int>("n.ID").Results.ToArray();
    }

    [Benchmark]
    public void Neo4jSession()
    {
      using (var session = _driver.Session())
      {
        session.Run("match(n:User { ID: 1}) return n.ID").ToArray();
      }
    }
}

Both my machine and production are running Neo4j CE 3.0.4 (currently the community edition), though I'm running it on Windows 10 and production is a Linux machine. We haven't tweaked any settings to my knowledge, but I doubt that could explain a 160x performance difference.

Because creating a session involves creating a transaction, I also tried reusing the session object (which I think is a very bad idea since it isn't thread-safe) to see if that made a difference, but the change wasn't noticeable.

I wish I could use Neo4jClient, but we really need the ability to execute arbitrary string queries. Neo4jClient relies heavily on a fluent API, and while it offers a low-level string mode, that mode is deprecated and actively discouraged in the documentation.

Recommended answer

After further digging, I traced the problem to the Neo4j.Driver package specifically, as the driver for NodeJS didn't suffer from the same issue.

Cloning the current source of the package, building it, and referencing the DLL directly instead of the NuGet package eliminated the problem entirely. To put this into perspective: the current version on NuGet (1.0.2) takes 62 seconds to do 1000 simple match requests against localhost, whereas the current source does so in 0.3 seconds (even beating the NodeJS driver by a factor of 10).
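A comparison like the 1000-request run above can be reproduced with a small timing loop. The sketch below is an assumption about how such a measurement might be set up, not the code actually used for the answer: `RequestLoopSketch`, `TimeRequests` and `AverageMs` are illustrative names, and the request is left as an empty placeholder delegate where a real query call would go. The last two lines only convert the totals quoted in the answer into per-request averages.

```csharp
using System;
using System.Diagnostics;

public static class RequestLoopSketch
{
    // Runs the request `count` times back to back and returns the total wall-clock time.
    public static TimeSpan TimeRequests(int count, Action request)
    {
        var sw = Stopwatch.StartNew();
        for (var i = 0; i < count; i++)
        {
            request();
        }
        return sw.Elapsed;
    }

    // Average time per request in milliseconds.
    public static double AverageMs(TimeSpan total, int count)
    {
        return total.TotalMilliseconds / count;
    }

    public static void Main()
    {
        // Placeholder request; a real run would execute a query and buffer its result here.
        var total = TimeRequests(1000, () => { });
        Console.WriteLine("placeholder loop: {0:F3} ms per request", AverageMs(total, 1000));

        // Per-request averages implied by the totals quoted in the answer.
        Console.WriteLine("NuGet 1.0.2: {0:F1} ms per request", AverageMs(TimeSpan.FromSeconds(62), 1000));
        Console.WriteLine("current source: {0:F1} ms per request", AverageMs(TimeSpan.FromSeconds(0.3), 1000));
    }
}
```

The 62-second total works out to about 62 ms per request, which lines up with the roughly 61 ms median reported for the official driver in the benchmark in the question.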

I'm not quite sure why, but I'm pretty sure it has something to do with the rda.SocketsForPCL dependency of the current package, which appears to be a glue library to make sockets work cross-platform. The current source, however, references the System.Net.Sockets package for that.

So in conclusion, this issue can be worked around by referencing a current build of the source, and it will be resolved entirely when a new version of the package is released.
