Cassandra - Delete not working


Problem description


    Sometimes, when I perform a DELETE, it doesn't work.

    My config: [cqlsh 5.0.1 | Cassandra 3.0.3 | CQL spec 3.4.0 | Native protocol v4]

    cqlsh:my_db> SELECT * FROM conversations  WHERE user_id=120 AND conversation_id=2 AND peer_type=1;
    
    user_id | conversation_id | peer_type | message_map
    ---------+-----------------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     120 |               2 |         1 | {0: {real_id: 68438, date: 1455453523, sent: True}, 1: {real_id: 68437, date: 1455453520, sent: True}, 2: {real_id: 68436, date: 1455453517, sent: True}, 3: {real_id: 68435, date: 1455453501, sent: True}, 4: {real_id: 68434, date: 1455453500, sent: True}, 5: {real_id: 68433, date: 1455453499, sent: True}, 6: {real_id: 68432, date: 1455453498, sent: True}, 7: {real_id: 68431, date: 1455453494, sent: True}, 8: {real_id: 68430, date: 1455453480, sent: True}}
    
    (1 rows)
    cqlsh:my_db> DELETE message_map FROM conversations WHERE user_id=120 AND conversation_id=2 AND peer_type=1;
    cqlsh:my_db> SELECT * FROM conversations  WHERE user_id=120 AND conversation_id=2 AND peer_type=1;
    
    user_id | conversation_id | peer_type | message_map
    ---------+-----------------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     120 |               2 |         1 | {0: {real_id: 68438, date: 1455453523, sent: True}, 1: {real_id: 68437, date: 1455453520, sent: True}, 2: {real_id: 68436, date: 1455453517, sent: True}, 3: {real_id: 68435, date: 1455453501, sent: True}, 4: {real_id: 68434, date: 1455453500, sent: True}, 5: {real_id: 68433, date: 1455453499, sent: True}, 6: {real_id: 68432, date: 1455453498, sent: True}, 7: {real_id: 68431, date: 1455453494, sent: True}, 8: {real_id: 68430, date: 1455453480, sent: True}}
    
    (1 rows)
    

    cqlsh doesn't return any error for the DELETE statement, but it's as if it had not been taken into account.

    Do you know why?

    NB: This is my table definition:

    CREATE TABLE be_telegram.conversations (
    user_id bigint,
    conversation_id int,
    peer_type int,
    message_map map<int, frozen<message>>,
    PRIMARY KEY (user_id, conversation_id, peer_type)
    ) WITH CLUSTERING ORDER BY (conversation_id ASC, peer_type ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
    

    Solution

    A DELETE statement removes one or more columns from one or more rows in a table, or it removes the entire row if no columns are specified. Cassandra applies selections within the same partition key atomically and in isolation.

    When a column is deleted, it is not removed from disk immediately. The deleted column is marked with a tombstone and then removed after the configured grace period has expired. The optional timestamp defines the new tombstone record.
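To make the tombstone idea concrete, here is a minimal Python sketch of last-write-wins reconciliation. It is a simplified model, not Cassandra's implementation: the `Cell` class, the `read_column` helper, and the timestamps are hypothetical names and values chosen for illustration. It shows that a delete only appends a marker, and that a tombstone carrying an older timestamp than the write (for example one set through `USING TIMESTAMP`) does not hide the value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """One version of a column; value=None stands for a tombstone (hypothetical model)."""
    value: Optional[str]
    timestamp: int  # write time; the numbers used below are made up

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


def read_column(versions: List[Cell]) -> Optional[str]:
    """Return the live value, or None when the newest version is a tombstone."""
    if not versions:
        return None
    # Newest timestamp wins; on a tie the tombstone is preferred.
    newest = max(versions, key=lambda c: (c.timestamp, c.is_tombstone))
    return newest.value


# A write followed by a delete with a newer timestamp: the tombstone wins.
history = [Cell("hello", timestamp=100)]
history.append(Cell(None, timestamp=200))  # DELETE adds a marker, nothing is erased
after_delete = read_column(history)

# A delete with an OLDER timestamp than the write does not hide the value.
stale = [Cell("hello", timestamp=300), Cell(None, timestamp=200)]
after_stale_delete = read_column(stale)

print(after_delete, after_stale_delete)
```

If a DELETE appears to be ignored, comparing the write timestamps of the existing data with the time of the delete is one thing this model suggests checking.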

    About deletes in Cassandra

    The way Cassandra deletes data differs from the way a relational database deletes data. A relational database might spend time scanning through data looking for expired data and throwing it away or an administrator might have to partition expired data by month, for example, to clear it out faster. Data in a Cassandra column can have an optional expiration date called TTL (time to live).

    Facts about deleted data to keep in mind are:

    1. Cassandra does not immediately remove data marked for deletion from disk. The deletion occurs during compaction.
    2. If you use the size-tiered or date-tiered compaction strategy, you can drop data immediately by manually starting the compaction process. Before doing so, understand the documented disadvantages of the process.
    3. A deleted column can reappear if you do not run node repair routinely.

    Why deleted data can reappear

    Marking data with a tombstone signals Cassandra to retry sending a delete request to a replica that was down at the time of the delete. If the replica comes back up within the grace period, it eventually receives the delete request. However, if a node is down longer than the grace period, the node can miss the delete because the tombstone disappears after gc_grace_seconds. Cassandra always attempts to replay missed updates when the node comes back up again. After a failure, it is a best practice to run node repair to repair inconsistencies across all of the replicas when bringing a node back into the cluster. If the node doesn't come back within gc_grace_seconds, remove the node, wipe it, and bootstrap it again.
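The reappearance scenario can be sketched with a toy two-replica model in Python. Everything here is an assumption made for illustration (the `Replica` type, the helper names, timestamps in seconds). Only the `gc_grace_seconds` value of 864000 comes from the table definition in the question. Real clusters also involve hints, read repair, and consistency levels, which this sketch ignores.

```python
from typing import Dict, Optional, Tuple

GC_GRACE_SECONDS = 864000  # value from the table definition in the question

# Each replica maps key -> (value, write_time); value None is a tombstone.
Replica = Dict[str, Tuple[Optional[str], int]]


def purge_tombstones(replica: Replica, now: int) -> None:
    """Drop tombstones older than gc_grace_seconds, as compaction may do."""
    expired = [k for k, (v, ts) in replica.items()
               if v is None and now - ts > GC_GRACE_SECONDS]
    for key in expired:
        del replica[key]


def reconcile(a: Replica, b: Replica, key: str) -> Optional[str]:
    """Return the newest version of key across two replicas (None if deleted)."""
    versions = [r[key] for r in (a, b) if key in r]
    if not versions:
        return None
    return max(versions, key=lambda v: v[1])[0]


def scenario(downtime: int) -> Optional[str]:
    """Replica B misses the delete and comes back after `downtime` seconds, with no repair."""
    a: Replica = {"k": ("old value", 0)}
    b: Replica = {"k": ("old value", 0)}
    a["k"] = (None, 10)                      # the delete reaches A only; B is down
    purge_tombstones(a, now=10 + downtime)   # compaction on A while B is away
    return reconcile(a, b, "k")              # what a read sees once B is back


within_grace = scenario(downtime=3600)
beyond_grace = scenario(downtime=GC_GRACE_SECONDS + 1)
print(within_grace, beyond_grace)
```

In the first case the tombstone on A still shadows B's stale copy. In the second case the tombstone is already gone, so the stale copy on B is the only version left and the deleted value comes back.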

    In your case, the compaction strategy is size-tiered, so please try running the compaction process.

    Compaction

    Periodic compaction is essential to a healthy Cassandra database because Cassandra does not insert/update in place. As inserts/updates occur, instead of overwriting the rows, Cassandra writes a new timestamped version of the inserted or updated data in another SSTable. Cassandra manages the accumulation of SSTables on disk using compaction.

    Cassandra also does not delete in place because the SSTable is immutable. Instead, Cassandra marks data to be deleted using a tombstone. Tombstones exist for a configured time period defined by the gc_grace_seconds value set on the table. During compaction, there is a temporary spike in disk space usage and disk I/O because the old and new SSTables co-exist. The compaction process works as follows:

    Compaction merges the data in each SSTable by partition key, selecting the latest data for storage based on its timestamp. Cassandra can merge the data efficiently, without random I/O, because rows are sorted by partition key within each SSTable. After evicting tombstones and removing deleted data, columns, and rows, the compaction process consolidates SSTables into a single file. The old SSTable files are deleted as soon as any pending reads finish using the files. Disk space occupied by old SSTables becomes available for reuse.

    Data input to SSTables is sorted to prevent random I/O during SSTable consolidation. After compaction, Cassandra uses the new consolidated SSTable instead of multiple old SSTables, fulfilling read requests more efficiently than before compaction.
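The merge described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `Row` and `SSTable` types and the `compact` function are hypothetical, a dictionary stands in for the streaming merge of sorted files, and the timestamps are made up. It keeps the newest version of each key and evicts tombstones only after their grace period has passed.

```python
from typing import Dict, List, Optional, Tuple

# An SSTable is modeled as a key-sorted list of
# (partition_key, value, write_time); value None is a tombstone.
Row = Tuple[str, Optional[str], int]
SSTable = List[Row]


def compact(sstables: List[SSTable], now: int, gc_grace_seconds: int) -> SSTable:
    """Merge SSTables into one, keeping only the newest version per key."""
    newest: Dict[str, Tuple[Optional[str], int]] = {}
    for table in sstables:
        for key, value, ts in table:
            if key not in newest or ts > newest[key][1]:
                newest[key] = (value, ts)

    merged: SSTable = []
    for key in sorted(newest):  # output stays sorted by partition key
        value, ts = newest[key]
        # Evict tombstones whose grace period has expired; keep younger ones.
        if value is None and now - ts > gc_grace_seconds:
            continue
        merged.append((key, value, ts))
    return merged


old = [("a", "v1", 1), ("b", "v1", 1), ("c", "v1", 1)]
new = [("a", "v2", 5), ("b", None, 5), ("c", None, 90)]
result = compact([old, new], now=100, gc_grace_seconds=50)
print(result)
```

Key `a` keeps its newest value, key `b` disappears entirely because its tombstone is past the grace period, and key `c` keeps its tombstone because it is still within the grace period.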

    So try this:

    nodetool <options> compact <keyspace> <table> ...
    
    options are:
    ( -h | --host ) <host name> | <ip address>
    ( -p | --port ) <port number>
    ( -pw | --password ) <password>
    ( -u | --username ) <user name>
    -- Separates an option and argument that could be mistaken for an option.
    keyspace is the name of a keyspace.
    table is one or more table names, separated by a space.
    

    This command starts the compaction process on tables that use the SizeTieredCompactionStrategy or DateTieredCompactionStrategy. You can specify a keyspace for compaction. If you do not specify a keyspace, nodetool compacts all keyspaces. You can specify one or more tables for compaction. If you do not specify any tables, all tables in the keyspace are compacted. This is called a major compaction. If you do specify tables, only the specified tables are compacted. This is called a minor compaction. A major compaction consolidates all existing SSTables into a single SSTable. During compaction, there is a temporary spike in disk space usage and disk I/O because the old and new SSTables co-exist. A major compaction can cause considerable disk I/O.
