Cassandra Throwing WriteTimeout Exception without Thread.sleep


Problem description

I have a batch job that writes around 300,000 rows into Cassandra. I divide them into smaller batches of 50 rows each.

Pseudocode is below.

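import java.util.ArrayList;
import java.util.List;

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.ResultSetFuture;

@Override
public void executeQuery(List<BatchStatement> batches) {
    // List.of() returns an immutable list, so add() would throw; use a mutable list.
    List<ResultSetFuture> futures = new ArrayList<>();
    // Submit every batch without waiting for the previous one to finish.
    for (BatchStatement batch : batches) {
        futures.add(session.executeAsync(batch));
    }

    // Block until each write has completed.
    for (ResultSetFuture rsf : futures) {
        rsf.getUninterruptibly();
        /* I have to add the following code to avoid WriteTimeoutException
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            logger.error("Thread.sleep", e);
        }
        */
    }
}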

I don't understand why, without the Thread.sleep, it always throws a WriteTimeoutException. How can I avoid this?

Solution

By using batch statements on data that most probably belongs to different partitions, you are overloading the system: the coordinator node has to send requests to other nodes and wait for their answers. Use batches only for specific use cases, not the way you would in a relational database to speed up execution. This documentation describes the bad use of batches.

Sending an individual asynchronous request for every row will improve the situation, but you need to take care not to send too many requests at the same time (for example, by using a semaphore), and you can increase the number of in-flight requests per connection via the pooling options.
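
The semaphore idea can be sketched roughly as follows. This is a minimal illustration, not the driver's API: `ThrottledWriter`, `writeAll`, and the `asyncWrite` function are hypothetical names, and `CompletableFuture` stands in for whatever future type the driver returns, so the sketch runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper: caps the number of outstanding async writes.
final class ThrottledWriter {

    // Calls asyncWrite once per row, allowing at most maxInFlight
    // unfinished requests at any time. Returns when all writes are done.
    static <T> void writeAll(List<T> rows,
                             Function<T, CompletableFuture<Void>> asyncWrite,
                             int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<Void>> futures = new ArrayList<>(rows.size());
        for (T row : rows) {
            // Blocks the submitting thread while maxInFlight requests are pending.
            permits.acquireUninterruptibly();
            CompletableFuture<Void> pending;
            try {
                pending = asyncWrite.apply(row);
            } catch (RuntimeException e) {
                permits.release(); // the request was never sent
                throw e;
            }
            // Give the permit back when the request finishes, success or failure.
            futures.add(pending.whenComplete((result, error) -> permits.release()));
        }
        // Wait for the remaining requests; join() rethrows a failure, if any.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add(i);
        }
        AtomicInteger written = new AtomicInteger();
        // Simulated write: a real job would issue one statement per row here.
        writeAll(rows, row -> CompletableFuture.runAsync(written::incrementAndGet), 128);
        System.out.println("rows written: " + written.get());
    }
}
```

With the driver used in the question, `session.executeAsync` returns a `ResultSetFuture`, which is a Guava `ListenableFuture`, so the permit would presumably be released from a callback registered on that future rather than from `whenComplete`. The limit of 128 is an arbitrary placeholder; a sensible value depends on the cluster and on the per-connection request limit configured in the pooling options.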
