Performance issues using neo4j rest http client


Question

We are still struggling with this after replacing the neo4j-jdbc client with the Apache HTTP client.

It seems we still have issues when running only 1k concurrent users that execute our query.

This is how we are using the client: https://gist.github.com/IdanFridman/1989b600a0a032329a5e

This is how we execute the query using that REST client:

https://gist.github.com/IdanFridman/22637f95ba696f498b6c

After profiling we see bad performance results, with an average latency of 3 seconds per request.

Should we ditch neo4j? We are getting desperate with the performance results.

Thanks.

Solution

So, you want more concurrent requests? Let's explore what we can do here.

Queries

First of all, check that the query itself performs well enough. Copy-paste it into the Neo4j Browser, prepend it with PROFILE, and explore the output.

Your query might be doing a lot more work than you expect. That results in long wait times, because Neo4j is still executing the query while the client waits.

Client

HttpClient configuration

You are using PoolingHttpClientConnectionManager. From the documentation:

PoolingHttpClientConnectionManager maintains a maximum limit of connections on a per route basis and in total. Per default this implementation will create no more than 2 concurrent connections per given route and no more 20 connections in total.

So, we should increase our limits. Example:

PoolingHttpClientConnectionManager cnnMgr = new PoolingHttpClientConnectionManager();
cnnMgr.setMaxTotal(500);
cnnMgr.setDefaultMaxPerRoute(100);
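
To get a feel for why these limits matter, here is a rough back-of-the-envelope sketch. All requests to a single Neo4j server share one route, so the per-route limit is the one that bites. The 60 ms service time is purely an assumed number, the model pretends requests are served in equal batches, and the class and method names are made up for illustration (they are not part of HttpClient):

```java
// Rough model, not a benchmark: estimates when the last of N simultaneous
// requests finishes if only `poolSize` connections can be open at once.
final class PoolWaitEstimate {

    // Number of sequential "batches" needed to push all requests
    // through the pool (ceiling division).
    static long batches(long concurrentRequests, long poolSize) {
        if (poolSize <= 0) {
            throw new IllegalArgumentException("poolSize must be positive");
        }
        return (concurrentRequests + poolSize - 1) / poolSize;
    }

    // Completion time of the last request, assuming every request
    // takes the same (hypothetical) service time.
    static long worstCaseMillis(long concurrentRequests, long poolSize, long serviceMillis) {
        return batches(concurrentRequests, poolSize) * serviceMillis;
    }

    public static void main(String[] args) {
        long users = 1000;        // concurrent users from the question
        long serviceMillis = 60;  // ASSUMED per-request time, illustrative only
        System.out.println("per-route limit 2:   "
                + worstCaseMillis(users, 2, serviceMillis) + " ms");
        System.out.println("per-route limit 100: "
                + worstCaseMillis(users, 100, serviceMillis) + " ms");
    }
}
```

Under these assumptions the default per-route limit of 2 forces 500 batches, while a limit of 100 needs only 10. Real numbers will differ, but the ratio shows how much time can be spent just waiting for a free connection.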

HttpRequest

Try adding a keep-alive header to the request. Example:

request.setHeader("Connection", "keep-alive");

Then, you should always close your response as soon as possible. Don't rely on the connection being released just because you have exhausted the stream content. Code:

try(CloseableHttpResponse response = httpClient.execute(request)) {
    // do stuff with response here
    // close response when try-with-resource block ends
}

Remember: the content you receive from the server's transactional endpoint is streamed back to the client.

return createResultSet(new JsonObject(IOUtils.toString(response.getEntity().getContent())));

So, in this code sample, we wait until the full response has been retrieved, and only after that do we start serialization.

In your case, you are looking for something like this:

String rawJsonResult = null;
try(CloseableHttpResponse response = httpClient.execute(request);) {
    rawJsonResult = IOUtils.toString(response.getEntity().getContent());
} catch (IOException e) {
    throw new RuntimeException(e);
}
return createResultSet(new JsonObject(rawJsonResult));

By doing this, we ensure that we retrieve the result and close the connection before any serialization occurs. This frees up resources for other concurrent connections.

Server

Neo4j uses Jetty as its web server. Jetty is backed by a BlockingQueue. This means that there is some number x of concurrent HTTP requests that can be processed at once. This x is the queue size. If there are more than x concurrent requests, the extra ones wait for a free spot in the queue.

Fortunately, you can configure how large the queue is. You are interested in this property:

org.neo4j.server.webserver.maxthreads=200

Note: there is no magic here. By default, Neo4j uses cpuCount * 4 web server threads. Increasing this number allows a higher number of concurrent requests, but each individual request can slow down.
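
The general pattern (a fixed set of worker threads fed from a BlockingQueue) can be sketched with plain JDK classes. This is not Jetty's or Neo4j's actual code, and the class and method names are hypothetical. It only shows that no more than maxThreads requests ever run at the same time, while the rest wait in the queue:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration only: a fixed pool of workers fed by a BlockingQueue.
final class BoundedWorkerDemo {

    // Runs `requests` simulated requests on `maxThreads` workers and
    // returns the highest number observed running at the same time.
    static int peakConcurrency(int maxThreads, int requests, long workMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());   // waiting requests sit here
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(requests);
        try {
            for (int i = 0; i < requests; i++) {
                pool.execute(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(workMillis);   // stand-in for query work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                        done.countDown();
                    }
                });
            }
            done.await();   // block until every simulated request finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 40 simulated requests, but never more than 4 in flight.
        System.out.println("peak concurrency = " + peakConcurrency(4, 40, 10));
    }
}
```

Raising maxThreads raises the ceiling on requests in flight, but all of those threads still share the same CPU cores, which is why each request can get slower.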

Linux

You should check the limit on open files. Each TCP connection counts as a separate open file. Usually, the default value on most Linux distributions is 1024. You need to increase it. You can try 40000.

Remember: this applies not only to the server, but to the client as well. You not only want to receive connections, you also need to open them.
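
If you want to see which limit a JVM process actually runs with (on the client or on the server), a minimal sketch like the following can help. It relies on the com.sun.management.UnixOperatingSystemMXBean interface, which is available on typical Unix-like JVMs but not everywhere, so the code falls back to -1. The class name and the hasHeadroom helper are hypothetical:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Illustration only: reads the open-file limit as seen by this JVM.
final class FileDescriptorCheck {

    // Maximum number of file descriptors for this process,
    // or -1 if the platform does not expose it.
    static long maxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getMaxFileDescriptorCount();
        }
        return -1;
    }

    // File descriptors currently open in this process, or -1 if unknown.
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1;
    }

    // True if `needed` more descriptors (for example, one per planned
    // connection) still fit under the limit.
    static boolean hasHeadroom(long open, long max, long needed) {
        return max - open >= needed;
    }

    public static void main(String[] args) {
        long max = maxFileDescriptors();
        long open = openFileDescriptors();
        System.out.println("open=" + open + ", max=" + max);
        if (max > 0) {
            System.out.println("room for 1000 connections: "
                    + hasHeadroom(open, max, 1000));
        }
    }
}
```

With a limit of 1024, a client trying to hold 1000 connections has almost no room left for anything else, which is why raising the limit on both sides matters.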

General Notes

You shouldn't trust the profiling results that much. It's totally OK that we are waiting while making HTTP requests. Overall, this is the most expensive part of the communication.

Also, you should ensure that your client and server are located on the same local network. Sending requests over a public network can significantly degrade performance.

And the last one: there is an upper limit on concurrent HTTP connections. Going past this limit can make the database almost unresponsive (as with any other web application). You might need to consider horizontal scaling (Neo4j Cluster) to be able to make more concurrent requests.


Good luck!
