Tombstone vs nodetool and repair

Problem description

I inserted 10K entries into a single partition of a Cassandra table, each with a TTL of 1 minute.

After the insert succeeded, I tried to read all the data from that single partition, but the read failed and the log showed the following:

WARN  [ReadStage-2] 2018-04-04 11:39:44,833 ReadCommand.java:533 - Read 0 live rows and 100001 tombstone cells for query SELECT * FROM qcs.job LIMIT 100 (see tombstone_warn_threshold)
DEBUG [Native-Transport-Requests-1] 2018-04-04 11:39:44,834 ReadCallback.java:132 - Failed; received 0 of 1 responses
ERROR [ReadStage-2] 2018-04-04 11:39:44,836 StorageProxy.java:1906 - Scanned over 100001 tombstones during query 'SELECT * FROM qcs.job LIMIT 100' (last scanned row partion key was ((job), 2018-04-04 11:19+0530, 1, jobType1522820944168, jobId1522820944168)); query aborted

I understand that a tombstone is a marker in the SSTable, not the actual deletion.

So I used nodetool to run compaction and repair.

Even after that, reading from the table produces the same error in the log file.

1) How should I handle this scenario?

2) Could someone explain why this happened, and why compaction and repair didn't solve the issue?

Recommended answer

Tombstones are only actually removed, during compaction, once the period specified by the table's gc_grace_seconds setting has elapsed (10 days by default). Until then, running compaction or repair does not drop them, which is why the read still scans them. This delay ensures that any node that was down at the time of the deletion picks up the change once it recovers. The following discuss this in great detail: the blog post from thelastpickle (recommended), two further posts (1, 2), and the DSE documentation or Cassandra documentation.
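To see which value currently applies, the table's setting can be read from the schema tables. This is a minimal sketch, assuming Cassandra 3.0 or later (where the system_schema keyspace exists) and taking the keyspace and table names qcs and job from the query in the log above:

```cql
-- Assumes Cassandra 3.0+; keyspace "qcs" and table "job" are taken from the logged query.
-- The default value is 864000 seconds (10 days).
SELECT gc_grace_seconds
FROM system_schema.tables
WHERE keyspace_name = 'qcs' AND table_name = 'job';
```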

You can lower the gc_grace_seconds option on an individual table to remove deleted data faster, but this should be done only for tables with TTLed data. You may also need to tweak the tombstone_threshold and tombstone_compaction_interval table options so that compactions happen sooner. See this document or this document for a description of these options.
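As a rough sketch, the options mentioned above could be set as follows. The table name qcs.job comes from the logged query; the compaction class and all numeric values are illustrative assumptions, not recommendations from the answer, and should be chosen to match the table's actual compaction strategy and repair schedule:

```cql
-- Illustrative values only (assumptions): pick numbers that suit your workload.
-- Note: setting "compaction" replaces the whole compaction map, so the class
-- must be restated; SizeTieredCompactionStrategy is assumed here.
ALTER TABLE qcs.job
WITH gc_grace_seconds = 3600              -- default is 864000 (10 days)
AND compaction = {
  'class': 'SizeTieredCompactionStrategy',
  'tombstone_threshold': '0.1',           -- default 0.2
  'tombstone_compaction_interval': '3600' -- seconds; default 86400 (1 day)
};
```

Tombstones older than the new gc_grace_seconds only disappear once a compaction rewrites the SSTables that contain them, so the effect may not be immediate.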
