Compound Queries with Redis

Problem description

For learning purposes I'm trying to write a simple structured document store in Redis. In my example application I'm indexing millions of documents that look a little like the following.

<book id="1234">
    <title>Quick Brown Fox</title>
    <year>1999</year>
    <isbn>309815</isbn>
    <author>Fred</author>
</book>

I'm writing a little query language that allows me to say YEAR = 1999 AND TITLE="Quick Brown Fox" (again, just for my learning, I don't care that I'm reinventing the wheel!), and this should return the IDs of the matching documents (1234 in this case). The AND and OR expressions can be arbitrarily nested.

For each document I generate keys as follows:

BOOK_TITLE.QUICK_BROWN_FOX = 1234
BOOK_YEAR.1999 = 1234

I'm using SADD to put these document IDs into a series of sets of the form KEYNAME.VALUE = { REFS }.
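The indexing step can be sketched roughly as follows. This is only an illustration that builds the SADD commands as tuples rather than sending them to Redis; the helper names index_key and index_commands, and the normalization rule (upper-case, non-alphanumerics collapsed to underscores), are assumptions inferred from the example keys above, not part of the original post.

```python
import re

def index_key(doc_type, field, value):
    """Build a set name such as BOOK_TITLE.QUICK_BROWN_FOX (assumed naming rule)."""
    # Upper-case the value and collapse runs of non-alphanumerics into underscores.
    normalized = re.sub(r"[^A-Z0-9]+", "_", str(value).upper()).strip("_")
    return f"{doc_type.upper()}_{field.upper()}.{normalized}"

def index_commands(doc_type, doc_id, fields):
    """Return one SADD command per indexed field of a document."""
    return [("SADD", index_key(doc_type, name, value), str(doc_id))
            for name, value in fields.items()]

commands = index_commands("book", 1234, {"title": "Quick Brown Fox", "year": 1999})
for command in commands:
    print(" ".join(command))
```

Each tuple could then be sent with whatever Redis client is in use; several documents that share a value simply end up as members of the same set.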

When I do the querying, I parse the expression into an AST. A simple expression such as YEAR=1999 maps directly to an SMEMBERS command, which gets me the set of matching documents back. However, I'm not sure how to perform the AND and OR parts most efficiently.

Given a query such as:

(TITLE=Dental Surgery OR TITLE=DIY Appendectomy)
    AND
(YEAR = 1999 AND AUTHOR = FOO)

I currently make the following requests to Redis to answer such a query.

-- Stage one generates the intermediate results and returns RANDOMLY_GENERATED_KEY3
SUNIONSTORE RANDOMLY_GENERATED_KEY1 BOOK_TITLE.DENTAL_SURGERY BOOK_TITLE.DIY_APPENDECTOMY
SINTERSTORE RANDOMLY_GENERATED_KEY2 BOOK_YEAR.1999 BOOK_AUTHOR.FOO
SINTERSTORE RANDOMLY_GENERATED_KEY3 RANDOMLY_GENERATED_KEY1 RANDOMLY_GENERATED_KEY2

-- Retrieving the top level results just requires the last key generated
SMEMBERS RANDOMLY_GENERATED_KEY3

When I encounter an AND, I use SINTERSTORE on the two child keys (and similarly SUNIONSTORE for an OR). I randomly generate a key to store the result in (and set a short TTL so I don't fill Redis up with cruft). By the end of this series of commands, the return value is a key that I can use to retrieve the results with SMEMBERS. I use the store variants because I don't want to transfer every intermediate set of matching document references back to my application; temporary keys keep the intermediate results on the Redis instance, and only the final matches are brought back at the end.
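The AST walk described above might look something like this minimal sketch. The tuple-based AST shape, the compile_query function, and the TMP_QUERY_ key prefix are all hypothetical; a counter replaces the randomly generated keys so the output is reproducible, and the 30-second TTL is an arbitrary placeholder. The sketch only builds the command list and does not talk to Redis.

```python
import itertools

def compile_query(node, commands, counter, ttl_seconds=30):
    """Walk the AST bottom-up, append Redis commands, return the key holding the result."""
    op = node[0]
    if op == "EQ":
        # Leaf: the index set already exists, so no command is needed.
        _, field, value = node
        return f"BOOK_{field}.{value}"
    # Inner node: compile the children first, then combine them server-side.
    child_keys = [compile_query(child, commands, counter, ttl_seconds)
                  for child in node[1:]]
    store = {"AND": "SINTERSTORE", "OR": "SUNIONSTORE"}[op]
    dest = f"TMP_QUERY_{next(counter)}"  # stand-in for a randomly generated key
    commands.append((store, dest, *child_keys))
    # Short TTL so temporary result sets do not accumulate.
    commands.append(("EXPIRE", dest, str(ttl_seconds)))
    return dest

query = ("AND",
         ("OR", ("EQ", "TITLE", "DENTAL_SURGERY"), ("EQ", "TITLE", "DIY_APPENDECTOMY")),
         ("AND", ("EQ", "YEAR", "1999"), ("EQ", "AUTHOR", "FOO")))

commands = []
result_key = compile_query(query, commands, itertools.count(1))
commands.append(("SMEMBERS", result_key))  # only the final result leaves Redis
for command in commands:
    print(" ".join(command))
```

Because every intermediate step writes to a key on the server, the whole list could plausibly be sent in a single pipeline, with only the final SMEMBERS reply carrying document IDs back.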

My question is simply: is this the best way to make use of Redis as a document store?

Recommended answer

I'm using a similar approach with sorted sets to implement full-text indexing. The overall approach is good, though there are a couple of fairly simple improvements you could make.

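  • Instead of a randomly generated key, you can use the query itself (or a short form of it) as the key. That lets you reuse sets that have already been computed, which can improve performance significantly if your queries often combine the same two large sets in a similar way.
  • Handling the title as a complete string will produce a very large number of single-member sets. It may be better to index the individual words in the title and, if you really need it, filter the final results for an exact match.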
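Both suggestions can be illustrated with a small sketch. The tuple-based AST, the QUERY_CACHE. and BOOK_TITLE_WORD. prefixes, and the choice of a truncated SHA-1 digest as the "short form" of the query are assumptions made for this example; the answer does not prescribe any of them.

```python
import hashlib
import re

def canonical(node):
    """Render an AST as a string; sort AND/OR operands so A AND B equals B AND A."""
    if node[0] == "EQ":
        return f"{node[1]}={node[2]}"
    parts = sorted(canonical(child) for child in node[1:])
    return "(" + f" {node[0]} ".join(parts) + ")"

def cache_key(node):
    """Short, deterministic key for the stored result of a sub-query."""
    digest = hashlib.sha1(canonical(node).encode("utf-8")).hexdigest()[:16]
    return f"QUERY_CACHE.{digest}"

def title_word_keys(title):
    """One index set per distinct word instead of one per full title."""
    words = sorted(set(re.findall(r"[A-Z0-9]+", title.upper())))
    return [f"BOOK_TITLE_WORD.{word}" for word in words]

a = ("AND", ("EQ", "YEAR", "1999"), ("EQ", "AUTHOR", "FOO"))
b = ("AND", ("EQ", "AUTHOR", "FOO"), ("EQ", "YEAR", "1999"))
print(cache_key(a) == cache_key(b))
print(title_word_keys("Quick Brown Fox"))
```

With deterministic keys, a query evaluator could check whether the destination key already exists before issuing SINTERSTORE or SUNIONSTORE again. A cached set can go stale when new documents are indexed, so keeping a TTL on these keys still seems sensible. For titles, intersecting the per-word sets yields candidates that contain all the words, and an exact-phrase check would then run on that much smaller result.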
