Couchbase Get operation slows down when the number of incoming threads increases
Question
Summary:
We are facing a major performance problem with Spring-Boot 2.0.4 and Couchbase Server 5.5.1.
DB response time degrades rapidly as the number of threads increases. Here is another report about the issue.
Details:
Spring Boot is running with 500 threads:
server:
  tomcat:
    max-threads: 500
    max-connections: 500
We are using the following dependency:
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-couchbase</artifactId>
    <version>3.0.9.RELEASE</version>
</dependency>
Our "select" from the DB is performed through a Spring-Data repository:
Cat findFirstByOwnerIdAndNameAndColor(String ownerId, String name, String color);
We have an index created specifically for this query:
CREATE INDEX `cat_by_ownerId_name_and_color_idx` ON `pets`(`ownerId`,`name`,`color`) WHERE (`_class` = "com.example.Cat")
As the number of requests increases, the time the DB takes to answer the query degrades quickly.
For example, at 300 requests per second, the 99th percentile of response time is about 10 seconds, and the 50th percentile is about 5 seconds.
The average size of a returned document is about 300 bytes, meaning that we are trying to extract only about 90 kilobytes per second, a relatively low amount.
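The figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the class and method names are hypothetical, a kilobyte is taken as 1000 bytes, and the 5-second median is used as a rough stand-in for the average latency when applying Little's law (requests in flight = arrival rate × latency).

```java
// Back-of-the-envelope check of the numbers quoted in the question.
// All names here are hypothetical and exist only for illustration.
final class LoadEstimate {

    /** Payload throughput in kilobytes per second (1 KB = 1000 bytes here). */
    static double kilobytesPerSecond(int requestsPerSecond, int avgDocBytes) {
        return requestsPerSecond * (double) avgDocBytes / 1000.0;
    }

    /** Little's law: average requests in flight = arrival rate * average latency. */
    static double requestsInFlight(double requestsPerSecond, double latencySeconds) {
        return requestsPerSecond * latencySeconds;
    }

    public static void main(String[] args) {
        // 300 requests/s, each returning about 300 bytes
        System.out.println(kilobytesPerSecond(300, 300) + " KB/s");
        // 300 requests/s, each taking about 5 s (median used as an approximation)
        System.out.println(requestsInFlight(300, 5.0) + " requests in flight");
    }
}
```

Under these assumptions the payload is indeed about 90 KB/s, while roughly 1500 requests would have to be in flight at once to sustain that rate. Since a blocking call holds a thread for its whole duration, that is well above the 500 configured Tomcat threads, which is consistent with the problem being on the waiting side rather than in the data volume.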
Edit:
Here is the result of running the same query in the Couchbase UI (in the UI, the query takes 1.75 ms to complete):
{
"plan": {
"#operator": "Sequence",
"~children": [
{
"#operator": "IndexScan3",
"index": "cats_by_ownerId_name_and_color_idx",
"index_id": "c061141c2d373067",
"index_projection": {
"primary_key": true
},
"keyspace": "pets",
"namespace": "default",
"spans": [
{
"exact": true,
"range": [
{
"high": "\"bf23fa4c-22c3-42ac-b141-39cdc76bb2x5\"",
"inclusion": 3,
"low": "\"bf23fa4c-22c3-42ac-b141-39cdc76bb2x5\""
},
{
"high": "\"Oscar\"",
"inclusion": 3,
"low": "\"Oscar\""
},
{
"high": "\"red\"",
"inclusion": 3,
"low": "\"red\""
}
]
}
],
"using": "gsi"
},
{
"#operator": "Fetch",
"keyspace": "pets",
"namespace": "default"
},
{
"#operator": "Parallel",
"~child": {
"#operator": "Sequence",
"~children": [
{
"#operator": "Filter",
"condition": "(((((`pets`.`_class`) = \"com.example.Cat\") and ((`pets`.`ownerId`) = \"bf23fa4c-22c3-42ac-b141-39cdc76bb2x5\")) and ((`pets`.`name`) = \"Oscar\")) and ((`pets`.`color`) = \"red\"))"
},
{
"#operator": "InitialProject",
"result_terms": [
{
"expr": "self",
"star": true
}
]
},
{
"#operator": "FinalProject"
}
]
}
}
]
},
"text": "select * from pets where _class=\"com.example.Cat\" and projectId=\"bf23fa4c-22c3-42ac-b141-39cdc76bb2x5\" and name=\"Oscar\" and color=\"red\""
}
Edit 2:
We also tried writing the N1QL query explicitly, but the outcome is the same. As before, we get many TimeoutExceptions:
Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.dao.QueryTimeoutException: java.util.concurrent.TimeoutException: {"b":"pets","s":"n1ql","t":7500000,"i":"f8cdf670-d32a-4d74-858c-f9dd9789d264"}; nested exception is java.lang.RuntimeException: java.util.concurrent.TimeoutException: {"b":"pets","s":"n1ql","t":7500000,"i":"f8cdf670-d32a-4d74-858c-f9dd9789d264"}] with root cause
java.util.concurrent.TimeoutException: {"b":"pets","s":"n1ql","t":7500000,"i":"f8cdf670-d32a-4d74-858c-f9dd9789d264"}
at com.couchbase.client.java.bucket.api.Utils$1.call(Utils.java:131) ~[java-client-2.7.0.jar:na]
at com.couchbase.client.java.bucket.api.Utils$1.call(Utils.java:127) ~[java-client-2.7.0.jar:na]
at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4.onError(OperatorOnErrorResumeNextViaFunction.java:140) ~[rxjava-1.3.8.jar:1.3.8]
at rx.internal.operators.OnSubscribeTimeoutTimedWithFallback$TimeoutMainSubscriber.onTimeout(OnSubscribeTimeoutTimedWithFallback.java:166) ~[rxjava-1.3.8.jar:1.3.8]
at rx.internal.operators.OnSubscribeTimeoutTimedWithFallback$TimeoutMainSubscriber$TimeoutTask.call(OnSubscribeTimeoutTimedWithFallback.java:191) ~[rxjava-1.3.8.jar:1.3.8]
at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55) ~[rxjava-1.3.8.jar:1.3.8]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_161]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_161]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_161]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_161]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_161]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
Is there a way to fix this, or do we need a different DB?
Recommended answer
After further investigation, the problem was found in the Spring-Data component.
To overcome it, we had to move to a non-blocking mechanism.
We did two things:
- All calls from the controller layer down to the service and repository layers were changed to return CompletableFuture<Cat>.
- To bypass the Spring-Data connection to Couchbase, we created a repository class of our own, with implementation code that looks something like this:
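// Static imports assumed from the Couchbase Java SDK 2.x query DSL:
//   com.couchbase.client.java.query.Select.select
//   com.couchbase.client.java.query.dsl.Expression.i, s, x
Statement statement = select("*")
    .from(i(bucket.name()))
    .where(x("name").eq(s(name))
        .and(x("ownerId").eq(s(ownerId)))
        .and(x("color").eq(s(color)))
        .and(x("_class").eq(s("com.example.Cat"))));
// The diamond operator avoids a raw-type warning.
CompletableFuture<Cat> completableFuture = new CompletableFuture<>();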
bucket.async().query(statement)
...
After we did that, the latency problem disappeared, and query performance is now about 2 milliseconds, even with a few hundred concurrent requests.
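The answer elides how the asynchronous result ends up in the CompletableFuture. As a general illustration of that bridging pattern, and not the author's actual code, here is a minimal sketch that uses only the JDK. The `AsyncQueryClient` interface is a hypothetical stand-in for a callback-based async client; it is not a Couchbase API (the SDK version in the stack trace is RxJava-based), and rows are modelled as plain strings.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class AsyncBridgeSketch {

    /** Hypothetical stand-in for a callback-based async client; NOT a Couchbase API. */
    interface AsyncQueryClient {
        void query(String statement, Consumer<String> onResult, Consumer<Throwable> onError);
    }

    /** Wraps one callback-style query in a CompletableFuture without blocking the caller. */
    static CompletableFuture<String> queryAsync(AsyncQueryClient client, String statement) {
        CompletableFuture<String> future = new CompletableFuture<>();
        // The client completes the future from its own thread, on success or on failure.
        client.query(statement, future::complete, future::completeExceptionally);
        return future;
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService io = Executors.newSingleThreadScheduledExecutor();
        // Fake client that answers 10 ms later on a separate thread.
        AsyncQueryClient fake = (stmt, onResult, onError) ->
                io.schedule(() -> onResult.accept("row for: " + stmt), 10, TimeUnit.MILLISECONDS);

        // The calling thread is free as soon as queryAsync returns.
        CompletableFuture<String> result =
                queryAsync(fake, "SELECT ...").thenApply(String::toUpperCase);

        // Blocking here only to print the demo result before exiting.
        System.out.println(result.get(1, TimeUnit.SECONDS));
        io.shutdown();
    }
}
```

The point of the pattern is that the request thread registers callbacks and returns immediately, so a slow query no longer pins one thread for its whole duration.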