Bad performance when writing log data to Cassandra with timeuuid as a column name


Question

Following the pointers in an eBay tech blog and a DataStax developers blog, I model some event log data in Cassandra 1.2. As the partition key, I use "ddmmyyhh|bucket", where bucket is any number between 0 and the number of nodes in the cluster.


Data model

cqlsh:Log> CREATE TABLE transactions (yymmddhh varchar, bucket int, rId int, created timeuuid, data map<text, text>, PRIMARY KEY ((yymmddhh, bucket), created));

(rId identifies the resource that fired the event.)
(map holds key-value pairs derived from a JSON document; the keys change, but not much.)

I assume that this translates into a composite primary/row key with X buckets per hour. My column names are then timeuuids. Querying this data model works as expected (I can query time ranges).
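
For illustration, a minimal sketch of what such a time-range query could look like in CQL, using the minTimeuuid()/maxTimeuuid() functions to bound the timeuuid clustering column. The literal partition-key values, map contents, and bucket choice here are assumptions, not taken from the question:

-- Hypothetical insert: the client picks a bucket, e.g. by hashing rId.
INSERT INTO transactions (yymmddhh, bucket, rId, created, data)
VALUES ('13031215', 0, 42, now(), {'user': 'alice', 'action': 'login'});

-- Query a time range within one partition; minTimeuuid()/maxTimeuuid()
-- convert timestamps into timeuuid bounds for the clustering column.
SELECT * FROM transactions
WHERE yymmddhh = '13031215' AND bucket = 0
  AND created >= minTimeuuid('2013-03-12 15:00+0000')
  AND created < maxTimeuuid('2013-03-12 16:00+0000');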

The problem is the performance: the time to insert a new row increases continuously. So I am doing something wrong, but I can't pinpoint the problem.

When I use the timeuuid as a part of the row key, the performance remains stable at a high level, but this prevents me from querying it (a query without the full row key of course throws an error message about "filtering").
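
A sketch of that problematic pattern, assuming a variant of the table with created moved into the partition key (the literal values are placeholders):

-- Assumed variant: PRIMARY KEY ((yymmddhh, bucket, created))
-- Partition-key components only support equality, so a time-range
-- query is rejected with an error message about "filtering":
SELECT * FROM transactions
WHERE yymmddhh = '13031215' AND bucket = 0
  AND created >= minTimeuuid('2013-03-12 15:00+0000');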

Any help? Thanks a lot!

Update: Switching from the map data type to predefined column names alleviates the problem. Insert times now seem to remain below about 0.005 s per insert.
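
A minimal sketch of what such a predefined-columns variant might look like; the table name and the three value columns are assumptions standing in for the mostly-fixed JSON keys:

CREATE TABLE transactions_flat (
  yymmddhh varchar,
  bucket int,
  rId int,
  created timeuuid,
  -- assumed stand-ins for the mostly-fixed JSON keys:
  user varchar,
  action varchar,
  details varchar,
  PRIMARY KEY ((yymmddhh, bucket), created)
);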

The core question remains: how is my usage of the "map" data type inefficient? And what would be an efficient way to handle thousands of inserts with only slight variation in the keys?

The keys I put into the map mostly remain the same. I understood the DataStax documentation (can't post the link due to reputation limitations, sorry, but it's easy to find) to say that each key creates an additional column -- or does it create one new column per "map"? That would be... hard for me to believe.

Answer

I suggest you model your rows a little differently. Collections aren't very good to use in cases where you might end up with too many elements in them. The reason is a limitation in the Cassandra binary protocol, which uses two bytes to represent the number of elements in a collection. This means that if your collection has more than 2^16 elements in it, the size field will overflow, and even though the server sends all of the elements back to the client, the client only sees the N % 2^16 first elements (so if you have 2^16 + 3 elements, it will look to the client as if there are only 3 elements).

If there is no risk of getting that many elements into your collections, you can ignore this advice. I would not expect that using collections gives you worse performance; I'm not really sure how that would happen.

CQL3 collections are basically just a hack on top of the storage model (and I don't mean hack in any negative sense), so you can make a MAP-like row that is not constrained by the above limitation yourself:

CREATE TABLE transactions (
  yymmddhh VARCHAR,
  bucket INT,
  created TIMEUUID,
  rId INT,
  key VARCHAR,
  value VARCHAR,
  PRIMARY KEY ((yymmddhh, bucket), created, rId, key)
)


(Notice that I moved rId and the map key into the primary key. I don't know what rId is, but I assume that this is correct.)

This has two drawbacks over using a MAP: it requires you to reassemble the map when you query the data (you would get back one row per map entry), and it uses a little more space since C* will insert a few extra columns. But the upside is that there is no problem with getting too big collections.
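
For illustration, reading one logical map back is then a single-partition slice restricted by the clustering-column prefix; each returned row carries one key/value pair, and the client folds them back into a map. The literal partition-key values and the timeuuid are placeholders:

-- Fetch all key/value entries for one event; the client reassembles
-- the returned (key, value) rows into a single map.
SELECT key, value FROM transactions
WHERE yymmddhh = '13031215' AND bucket = 0
  AND created = 64d9e790-8b0f-11e2-9e96-0800200c9a66  -- assumed timeuuid
  AND rId = 42;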

In the end it depends a lot on how you want to query your data. Don't optimize for insertions; optimize for reads. For example: if you don't need to read back the whole map every time, but usually just read one or two keys from it, put the key in the partition/row key instead and have a separate partition/row per key (this assumes that the set of keys will be fixed, so you know what to query for; as I said, it depends a lot on how you want to query your data).
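
A sketch of that read-optimized variant, assuming a fixed set of keys; the table name and layout are assumptions:

-- One partition per (hour, bucket, key); reading a single key
-- over a time range touches exactly one partition.
CREATE TABLE transactions_by_key (
  yymmddhh varchar,
  bucket int,
  key varchar,
  created timeuuid,
  rId int,
  value varchar,
  PRIMARY KEY ((yymmddhh, bucket, key), created)
);

SELECT created, rId, value FROM transactions_by_key
WHERE yymmddhh = '13031215' AND bucket = 0 AND key = 'user'
  AND created >= minTimeuuid('2013-03-12 15:00+0000');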

You also mentioned in a comment that the performance improved when you increased the number of buckets from three (0-2) to 300 (0-299). The reason for this is that you spread the load much more evenly throughout the cluster. When you have a partition/row key that is based on time, like your yymmddhh, there will always be a hot partition where all writes go (it moves throughout the day, but at any given moment it will hit only one node). You correctly added a smoothing factor with the bucket column/cell, but with only three values the likelihood of at least two ending up on the same physical node is too high. With three hundred you will get a much better spread.
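
For illustration, the bucket value is computed client-side before each insert; one assumed scheme is bucket = rId % 300 (any stable hash modulo the bucket count would do), so writes in the same hour fan out across many partitions:

-- Client computes the bucket before issuing the statement,
-- e.g. bucket = rId % 300; here rId = 42 gives bucket 42.
INSERT INTO transactions (yymmddhh, bucket, created, rId, key, value)
VALUES ('13031215', 42, now(), 42, 'user', 'alice');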

