Performance issues using neo4j rest http client


Problem description


We are struggling with this after replacing the neo4j-jdbc client with the Apache HTTP client.

It seems we still have issues when running only 1k concurrent users that execute our query.

This is how we use the client: https://gist.github.com/IdanFridman/1989b600a0a032329a5e

This is how we execute the query using that REST client:

https://gist.github.com/IdanFridman/22637f95ba696f498b6c

After profiling, we see poor performance, with an average latency of 3 seconds per request.

Should we ditch Neo4j? We are getting desperate about the performance results.

Thanks.

Solution

So, you want to handle more concurrent requests? Let's explore what we can do here.

Queries

First of all, check that the query itself performs well enough. Copy and paste it into the Neo4j Browser, prepend PROFILE, and explore the output.

Your query might be doing a lot more work than you expect. That results in long wait times, because the client is waiting while Neo4j is still executing the query.

Client

HttpClient configuration

You are using PoolingHttpClientConnectionManager. From documentation:

PoolingHttpClientConnectionManager maintains a maximum limit of connections on a per route basis and in total. Per default this implementation will create no more than 2 concurrent connections per given route and no more 20 connections in total.

So, we should increase our limits. Example:

PoolingHttpClientConnectionManager cnnMgr = new PoolingHttpClientConnectionManager();
cnnMgr.setMaxTotal(500);
cnnMgr.setDefaultMaxPerRoute(100);
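
To get a feel for why the default of 2 connections per route hurts under 1k concurrent users, here is a rough, JDK-only simulation. It is a sketch, not HttpClient code: a Semaphore stands in for the connection pool, and the class name PoolLimitDemo, the method peakConcurrency, and the 5 ms "request" are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: a Semaphore plays the role of the connection pool.
final class PoolLimitDemo {

    // Starts `callers` concurrent tasks that each need one of `poolSize` permits
    // and returns the highest number of tasks that held a permit at the same time.
    static int peakConcurrency(int poolSize, int callers) {
        Semaphore pool = new Semaphore(poolSize);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService users = Executors.newFixedThreadPool(callers);
        CountDownLatch done = new CountDownLatch(callers);
        try {
            for (int i = 0; i < callers; i++) {
                users.submit(() -> {
                    try {
                        pool.acquire();                  // wait for a free "connection"
                        try {
                            int now = inFlight.incrementAndGet();
                            peak.accumulateAndGet(now, Math::max);
                            Thread.sleep(5);             // pretend to talk to the server
                        } finally {
                            inFlight.decrementAndGet();
                            pool.release();              // hand the "connection" back
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                });
            }
            done.await();                                // wait until every caller finished
            return peak.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            users.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak with limit 2:   " + peakConcurrency(2, 50));
        System.out.println("peak with limit 100: " + peakConcurrency(100, 50));
    }
}
```

With a limit of 2, the peak can never exceed 2 no matter how many callers are started; everyone else simply waits, and that waiting shows up as latency on the client side.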

HttpRequest

Try adding a keep-alive header to the request. Example:

request.setHeader("Connection", "keep-alive");

Then, you should always close your response as soon as possible. You shouldn't rely on the connection being released just because you have read the stream to the end. Code:

try(CloseableHttpResponse response = httpClient.execute(request)) {
    // do stuff with response here
    // close response when try-with-resource block ends
}

Remember - the content you receive from the server's transaction endpoint is streamed back to the client.

return createResultSet(new JsonObject(IOUtils.toString(response.getEntity().getContent())));

So, in this code sample, we wait until the full response has been retrieved, and only after that do we start serialization.

In your case you are looking for something like this:

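String rawJsonResult = null;
try (CloseableHttpResponse response = httpClient.execute(request)) {
    // Read the whole body while the response is still open (assumes a UTF-8 body)
    rawJsonResult = IOUtils.toString(response.getEntity().getContent(), StandardCharsets.UTF_8);
} catch (IOException e) {
    throw new RuntimeException(e);
}
// The response is already closed here, before any parsing starts
return createResultSet(new JsonObject(rawJsonResult));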

By doing this, we ensure that we retrieve the result and close the response before any serialization occurs. This frees up resources for other concurrent connections.
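
The ordering can be checked without any HTTP library. The following JDK-only sketch uses a hypothetical TrackedStream as a stand-in for the response body (it is not part of HttpClient) and shows that the stream is already closed by the time the caller gets the raw text back. It assumes Java 9 or newer for readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: drain the body first, close it, and only then parse.
final class ReadThenCloseDemo {

    // Stand-in for a response body; it only records whether close() was called.
    static final class TrackedStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackedStream(String body) {
            super(body.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Reads the whole stream inside try-with-resources and returns the raw text.
    // The stream is closed before this method returns to the caller.
    static String readFully(InputStream body) {
        try (InputStream in = body) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TrackedStream body = new TrackedStream("{\"results\":[],\"errors\":[]}");
        String raw = readFully(body);
        // Any JSON parsing of `raw` would start here, after the stream is closed.
        System.out.println("closed before parsing: " + body.closed);
        System.out.println("raw payload: " + raw);
    }
}
```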

Server

Neo4j uses Jetty as its web server. Jetty hands incoming requests to a pool of worker threads that is fed by a BlockingQueue. This means that only x HTTP requests can be processed concurrently, where x is the number of worker threads. If more than x requests arrive at the same time, the extra ones wait in the queue for a free thread.

Fortunately, you can configure the size of this pool. You are interested in this property:

org.neo4j.server.webserver.maxthreads=200

Note: there is no magic here. By default, Neo4j uses cpuCount * 4 web server threads. Increasing this number allows more requests to be processed concurrently, but each individual request may become slower.
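
The queueing behaviour can be illustrated with the JDK's own ThreadPoolExecutor. This is only an analogy, not Jetty's actual thread pool, and the names WorkerQueueDemo, defaultThreads, and queuedWhileBusy are invented for the sketch. It blocks every worker on purpose and then counts how many requests are left waiting.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: a fixed pool of workers fed by a BlockingQueue.
final class WorkerQueueDemo {

    // The default mentioned above: four web server threads per CPU core.
    static int defaultThreads() {
        return Runtime.getRuntime().availableProcessors() * 4;
    }

    // Submits `requests` tasks that block until released and returns how many
    // of them are still sitting in the queue once all `threads` workers are busy.
    static int queuedWhileBusy(int threads, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(Math.min(threads, requests));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < requests; i++) {
                pool.execute(() -> {
                    started.countDown();             // this request got a worker thread
                    try {
                        release.await();             // keep the worker busy
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await();                         // every available worker is occupied
            return pool.getQueue().size();           // the rest are waiting for a free thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();                     // let the blocked tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("default thread count on this machine: " + defaultThreads());
        System.out.println("waiting with 4 threads and 10 requests: " + queuedWhileBusy(4, 10));
    }
}
```

With 4 workers and 10 simultaneous requests, 6 requests are left waiting in the queue. Raising the thread count shortens that line, but the threads then compete for the same CPU cores.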

Linux

You should check the limit on open file descriptors. Each TCP connection is a separate file descriptor. Usually, the default limit on most Linux distributions is 1024. You need to increase it. You can try 40000.

Remember - this applies not only to the server, but to the client as well. You don't just receive connections, you also need to open them.
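
On the shell, `ulimit -n` prints the current limit. From inside a JVM, a quick check is possible through the `com.sun.management.UnixOperatingSystemMXBean` interface. This is a sketch that assumes a HotSpot/OpenJDK runtime on a Unix-like system; the class name FdLimitCheck is made up, and on other platforms the method below just returns -1 for both values.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper: reports open and maximum file descriptors for this JVM process.
final class FdLimitCheck {

    // Returns {open, max}, or {-1, -1} when the JVM does not expose the counts
    // (for example on non-Unix platforms).
    static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return new long[] { -1L, -1L };
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        System.out.println("open file descriptors: " + counts[0]);
        System.out.println("max file descriptors:  " + counts[1]);
    }
}
```

Running this on both the client machine and the Neo4j host shows whether either side is close to its limit under load.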

General Notes

You shouldn't trust the profiling results too much. It's totally OK that we spend time waiting while making HTTP requests. Overall, this is the most expensive part of the communication.

Also, you should ensure that your client and server are located on the same local network. Sending requests over a public network can significantly degrade performance.

And the last one - there is an upper limit on concurrent HTTP connections. Exceeding this limit can make the database almost unresponsive (as with any other web application). You might need to consider horizontal scaling (Neo4j Cluster) to be able to make more concurrent requests.


Good luck!
