Cassandra secondary indices vs. Lucene


Question

I understand that Cassandra is a NoSQL database and that patching it with many indices is not the way to go, but here I'm looking for a solution for my analytics cluster, not for the production/real-time one.

So I think it makes sense to add indices to reduce the amount of data filtered by Spark.

How do native Cassandra secondary indices compare to Lucene's indices?

Many features are not available with Cassandra alone, but what about the things you can do with both?

Is it better / does it make sense to only use Lucene?

Another advantage that I see is that I can install Lucene only on my analytics cluster, without overloading the real-time one with indices (and therefore improving the write performance on that side).

Answer

Don't bother with Lucene integration.

Since Cassandra 3.4, there is a new secondary index implementation called SASI (SSTable Attached Secondary Index) that offers full-text search and performs well.

Read this: https://github.com/apache/cassandra/blob/trunk/doc/SASI.md
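
To make the SASI suggestion concrete, here is a minimal sketch of creating a SASI index and running a LIKE query against it. The keyspace, table, column and index names (analytics, events, message, events_message_sasi) are hypothetical, and the run() helper assumes the DataStax cassandra-driver package and a node reachable on 127.0.0.1. Depending on the Cassandra version, SASI may also need to be enabled in cassandra.yaml.

```python
# Hypothetical names, chosen only for illustration.
KEYSPACE = "analytics"
TABLE = "events"
COLUMN = "message"

# Index implementation class shipped with Cassandra 3.4+.
SASI_CLASS = "org.apache.cassandra.index.sasi.SASIIndex"


def create_sasi_index_cql(keyspace, table, column, mode="CONTAINS"):
    """Build a CREATE CUSTOM INDEX statement for a SASI index.

    mode is one of PREFIX, CONTAINS or SPARSE; CONTAINS allows
    LIKE '%term%' queries on text columns.
    """
    return (
        f"CREATE CUSTOM INDEX IF NOT EXISTS {table}_{column}_sasi "
        f"ON {keyspace}.{table} ({column}) "
        f"USING '{SASI_CLASS}' "
        f"WITH OPTIONS = {{'mode': '{mode}'}}"
    )


def like_query_cql(keyspace, table, column):
    """Build a LIKE query; the pattern is passed as a %s parameter."""
    return f"SELECT * FROM {keyspace}.{table} WHERE {column} LIKE %s"


def run(contact_points=("127.0.0.1",)):
    """Create the index and query it (requires a running cluster)."""
    # Imported lazily so the statement builders work without the driver.
    from cassandra.cluster import Cluster

    cluster = Cluster(list(contact_points))
    session = cluster.connect()
    try:
        session.execute(create_sasi_index_cql(KEYSPACE, TABLE, COLUMN))
        rows = session.execute(
            like_query_cql(KEYSPACE, TABLE, COLUMN), ("%timeout%",)
        )
        for row in rows:
            print(row)
    finally:
        cluster.shutdown()


if __name__ == "__main__":
    print(create_sasi_index_cql(KEYSPACE, TABLE, COLUMN))
    print(like_query_cql(KEYSPACE, TABLE, COLUMN))
```

Running the file only prints the two CQL statements; call run() against a test cluster to execute them. Because the index is created per cluster, it can be added on the analytics side only, which matches the setup described in the question.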
