Intersect Cassandra rows


Question

We have a Cassandra column family. Each row has multiple columns. The columns have names, but their values are empty. Given 5-10 row keys, how can we find the column names that appear in all of those rows? For example:

row1: php, programming, accounting
row2: php, bookkeeping, accounting
row3: php, accounting

must return:

result: php, accounting

Note that we cannot easily load a whole row into memory, because it may contain 1M+ columns. The solution does not need to be fast.

Recommended answer

To intersect several rows, we first intersect two of them, then intersect that result with the third row, and so on.
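The pairwise folding described above can be sketched in a few lines. This is only an illustration in Python, independent of any Cassandra client: the rows from the question are modeled as small in-memory sets, which sidesteps the 1M+ column constraint, and `intersect_all` is a hypothetical helper name.

```python
from functools import reduce

# The sample rows from the question, modeled as sets of column names.
# Assumption: they are small enough to hold in memory for this illustration.
rows = [
    {"php", "programming", "accounting"},
    {"php", "bookkeeping", "accounting"},
    {"php", "accounting"},
]


def intersect_all(column_sets):
    """Intersect the first two sets, then the result with the third, and so on."""
    return reduce(lambda acc, cols: acc & cols, column_sets)


result = intersect_all(rows)
print(sorted(result))  # ['accounting', 'php']
```

Because each step can only shrink the running result, the order of the rows does not change the answer, only how much intermediate data is carried along.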

It looks like Cassandra lets us query data by column names, and this is a relatively fast operation.

So we first get a column slice of, say, 10k columns from the first row and build a list of their column names (in phpcassa, put them in an array). Then we select those columns from the second row.

The code might look like this:

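// $column_slice is a placeholder: some column slice covering the next
// batch of columns in the first row.
$x = $cf->get($first_key, $column_slice);

// The column names are the array keys; the values are empty.
$column_names = array_keys($x);

// Request only those columns from the second row. The columns that come
// back are the ones present in both rows.
$result = $cf->get($second_key, null, $column_names);

// write result somewhere, and proceed with next slice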
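
To make the batch-by-batch loop concrete, here is a minimal sketch in Python that simulates it without a Cassandra connection. Each row is modeled as a sorted list of column names, and `get_slice` and `get_named` are hypothetical stand-ins for a column-slice query and a query by column names; they are not phpcassa calls. Only one batch of names is held by the loop at a time.

```python
from bisect import bisect_right


def get_slice(row, after, count):
    """Hypothetical stand-in for a column-slice query.

    Returns up to `count` column names that sort after `after`
    (or from the start of the row when `after` is None).
    """
    start = 0 if after is None else bisect_right(row, after)
    return row[start:start + count]


def get_named(row, names):
    """Hypothetical stand-in for a query by column names.

    Returns the requested names that exist in the row. The set built here
    only simulates the database lookup.
    """
    present = set(row)
    return [name for name in names if name in present]


def intersect_rows(rows, batch_size):
    """Yield the column names found in every row, one batch at a time."""
    first, others = rows[0], rows[1:]
    last = None
    while True:
        batch = get_slice(first, last, batch_size)
        if not batch:
            break  # the first row is exhausted
        last = batch[-1]
        survivors = batch
        for row in others:
            if not survivors:
                break  # nothing left to look up in the remaining rows
            survivors = get_named(row, survivors)
        yield from survivors


# The rows from the question, with column names in sorted order.
rows = [
    ["accounting", "php", "programming"],
    ["accounting", "bookkeeping", "php"],
    ["accounting", "php"],
]
print(list(intersect_rows(rows, batch_size=2)))  # ['accounting', 'php']
```

If one of the rows is known to be much shorter than the others, putting it first should reduce the number of batches, since every batch is drawn from the first row.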
