How do I filter through data in Cassandra?

Question

I've been using MySQL for an app for some time, and the more data I collect, the slower it gets, so I have been looking into NoSQL options. One of the things I have in MySQL is a view created from a bunch of joins. The app shows all the important info in a grid, and the user can select ranges, run searches, and so on against this data set. Standard query stuff.

Looking at Cassandra, everything is already sorted based on the parameters I provide in my storage-conf.xml. So I would have a certain string as my key in the SuperColumn, and keep a bunch of the data in Columns below that. But I can only sort by one Column, and I can't do any real searching within the columns without pulling all the SuperColumns and looping through the data, right?

I don't want to duplicate data across different ColumnFamilies, so I want to make sure Cassandra is appropriate for me. Facebook, Digg, and Twitter have plenty of search functions, so maybe I am just not seeing the solution.

Is there a way with Cassandra for me to search for or filter specific data values in a SuperColumn, or its associated Column(s)? If not, is there another NoSQL option?

In the example below, it seems I can only query for phatduckk, friend1, John, etc. But what if I wanted to find anyone in the ColumnFamily who lived in city == "Beverley Hills"? Can it be done without returning all records? If so, could I do a search for city == "Beverley Hills" AND state == "CA"? It doesn't seem like I can do either, but I want to make sure and see what my options are.

AddressBook = { // this is a ColumnFamily of type Super
  phatduckk: {    // this is the key to this row inside the Super CF
    friend1: {street: "8th street", zip: "90210", city: "Beverley Hills", state: "CA"},
    John: {street: "Howard street", zip: "94404", city: "FC", state: "CA"},
    Kim: {street: "X street", zip: "87876", city: "Balls", state: "VA"},
    Tod: {street: "Jerry street", zip: "54556", city: "Cartoon", state: "CO"},
    Bob: {street: "Q Blvd", zip: "24252", city: "Nowhere", state: "MN"},
  }, // end row
  ieure: {     
    joey: {street: "A ave", zip: "55485", city: "Hell", state: "NV"},
    William: {street: "Armpit Dr", zip: "93301", city: "Bakersfield", state: "CA"},
  },

}

Recommended answer

You cannot perform those kinds of operations in Cassandra. There are certain kinds of selection predicates that can be set on column keys, but nothing on the values they hold. Look at the API and check the get_slice/get_superslice and get_range query types. Again, all of this concerns the keys in the ColumnFamily or SuperColumnFamily, not the values.
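
To make the key-versus-value distinction concrete, here is a minimal Python sketch. It is only a toy in-memory model of part of the AddressBook example from the question, not the Cassandra API: plain dicts stand in for rows and SuperColumns, and the helper names get_by_key and scan_by_value are made up for illustration.

```python
# Toy model of the AddressBook Super ColumnFamily from the question.
# NOT the Cassandra API: nested dicts stand in for rows and SuperColumns.
address_book = {
    "phatduckk": {
        "friend1": {"street": "8th street", "zip": "90210",
                    "city": "Beverley Hills", "state": "CA"},
        "John": {"street": "Howard street", "zip": "94404",
                 "city": "FC", "state": "CA"},
        "Kim": {"street": "X street", "zip": "87876",
                "city": "Balls", "state": "VA"},
    },
    "ieure": {
        "joey": {"street": "A ave", "zip": "55485",
                 "city": "Hell", "state": "NV"},
        "William": {"street": "Armpit Dr", "zip": "93301",
                    "city": "Bakersfield", "state": "CA"},
    },
}


def get_by_key(row_key, super_column):
    """Key-based access: go straight to one entry by its names."""
    return address_book[row_key][super_column]


def scan_by_value(**wanted):
    """Value-based filtering: must visit every row and every SuperColumn."""
    matches = []
    for row_key, super_columns in address_book.items():
        for name, columns in super_columns.items():
            if all(columns.get(k) == v for k, v in wanted.items()):
                matches.append((row_key, name))
    return matches
```

Calling get_by_key("phatduckk", "friend1") touches a single entry, while scan_by_value(city="Beverley Hills", state="CA") has to walk the whole structure. That full walk is the "pull everything and loop" cost the question is worried about.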

If you want the kind of functionality that you have described, then your best bet is a SQL database. Build proper indexes on your tables, especially on the columns that are queried most, and you will see a big difference in query performance. Hope this helps.
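
As a rough sketch of the indexing advice, the snippet below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs anywhere). The table name address_book, the index name idx_city_state, and the helper find_by_city_state are hypothetical; the rows are taken from the question's example.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address_book ("
    "owner TEXT, friend TEXT, street TEXT, zip TEXT, city TEXT, state TEXT)"
)
# Composite index on the columns that the searches filter on.
conn.execute("CREATE INDEX idx_city_state ON address_book (city, state)")

rows = [
    ("phatduckk", "friend1", "8th street", "90210", "Beverley Hills", "CA"),
    ("phatduckk", "John", "Howard street", "94404", "FC", "CA"),
    ("phatduckk", "Kim", "X street", "87876", "Balls", "VA"),
    ("ieure", "joey", "A ave", "55485", "Hell", "NV"),
    ("ieure", "William", "Armpit Dr", "93301", "Bakersfield", "CA"),
]
conn.executemany("INSERT INTO address_book VALUES (?, ?, ?, ?, ?, ?)", rows)


def find_by_city_state(city, state):
    """Filter on column values; the database can use the index for this."""
    cur = conn.execute(
        "SELECT owner, friend FROM address_book "
        "WHERE city = ? AND state = ? ORDER BY owner, friend",
        (city, state),
    )
    return cur.fetchall()
```

With this in place, find_by_city_state("Beverley Hills", "CA") answers the city AND state search from the question without the application looping over every record. The equivalent CREATE INDEX statement in MySQL would be the place to start on the real tables.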
