How do I create a compound multi-index in rethinkdb?


Question

I am using RethinkDB 1.10.1 with the official Python driver. I have a table of tagged things, each associated with one user:

{
    "id": "PK",
    "user_id": "USER_PK",
    "tags": ["list", "of", "strings"],
    // Other fields...
}

I want to query by user_id and tag (say, to find all the things by user "tawmas" with tag "tag"). Starting with RethinkDB 1.10, I can create a multi-index like this:

r.table('things').index_create('tags', multi=True).run(conn)

My query would be:

res = (r.table('things')
       .get_all('TAG', index='tags')
       .filter(r.row['user_id'] == 'USER_PK').run(conn))

However, this query still has to scan every document with the given tag, so I would like to create a compound index on the user_id and tags fields. Such an index would let me query with:

res = r.table('things').get_all(['USER_PK', 'TAG'], index='user_tags').run(conn)

The documentation says nothing about compound multi-indexes. However, I tried a custom index function that combines the requirements for compound indexes and multi-indexes by returning a list of ["USER_PK", "tag"] pairs.

My first attempt was in Python:

r.table('things').index_create(
    'user_tags',
    lambda each: [[each['user_id'], tag] for tag in each['tags']],
    multi=True).run(conn)

This makes the Python driver choke with a MemoryError while trying to parse the index function (I guess list comprehensions aren't really supported by the driver).
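One plausible explanation for the MemoryError (an assumption, not something the source confirms): a Python object that defines `__getitem__` but not `__iter__` is iterated through the legacy sequence protocol, which calls `obj[0]`, `obj[1]`, and so on until an `IndexError` is raised. If `__getitem__` always returns a new query term, the comprehension never terminates. The `FakeTerm` class below is a hypothetical stand-in, not the driver's real class.

```python
from itertools import islice


class FakeTerm:
    """Hypothetical stand-in for a driver query term (not the real RqlQuery)."""

    def __init__(self, description):
        self.description = description

    def __getitem__(self, key):
        # Never raises IndexError: every key just builds a new symbolic term.
        return FakeTerm(f"{self.description}[{key!r}]")


each = FakeTerm("var_1")
tags = each["tags"]

# Python falls back to tags[0], tags[1], ... because __iter__ is missing.
# islice caps the otherwise endless iteration at three items.
first_three = [t.description for t in islice(tags, 3)]
print(first_three)
```

Without the `islice` cap, a list comprehension over `tags` would keep allocating terms until memory runs out.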

So I turned to my (admittedly rusty) JavaScript and came up with this:

r.table('things').index_create(
    'user_tags',
    r.js(
        """(function (each) {
            var result = [];
            var user_id = each["user_id"];
            var tags = each["tags"];
            for (var i = 0; i < tags.length; i++) {
                result.push([user_id, tags[i]]);
            }
            return result;
        })
        """),
    multi=True).run(conn)

The server rejects this with a curious exception: rethinkdb.errors.RqlRuntimeError: Could not prove function deterministic. Index functions must be deterministic.

So, what is the correct way to define a compound multi-index? Or is it something that is not supported at this time?

Answer

Short answer:

List comprehensions don't work in ReQL functions. You need to use map instead, like so:

r.table('things').index_create(
    'user_tags',
    lambda each: each["tags"].map(lambda tag: [each['user_id'], tag]),
    multi=True).run(conn)
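To make concrete what this index function yields, here is a plain-Python model of the index keys, with no database involved. The helper names (`index_keys`, `build_index`) and the sample documents are made up for illustration; the real index is built and stored by the server.

```python
from collections import defaultdict


def index_keys(doc):
    # Mirrors the ReQL map: one (user_id, tag) key per tag in the document.
    return [(doc["user_id"], tag) for tag in doc["tags"]]


def build_index(docs):
    # With multi=True, every key returned by the function points to the document.
    index = defaultdict(list)
    for doc in docs:
        for key in index_keys(doc):
            index[key].append(doc["id"])
    return index


things = [
    {"id": 1, "user_id": "tawmas", "tags": ["python", "db"]},
    {"id": 2, "user_id": "tawmas", "tags": ["db"]},
    {"id": 3, "user_id": "other", "tags": ["db"]},
]

index = build_index(things)
# Rough analogue of get_all(['tawmas', 'db'], index='user_tags')
matches = index[("tawmas", "db")]
print(matches)
```

In this model the lookup touches only the entries for that exact user and tag pair, which is the behavior the question's `get_all(['USER_PK', 'TAG'], index='user_tags')` query is after.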


Long answer:

This is a somewhat subtle aspect of how RethinkDB drivers work. Your original function fails because your Python code never sees real copies of the documents. In the expression:

lambda each: [[each['user_id'], tag] for tag in each['tags']]

each is never bound to an actual document from your database; it is bound to a special Python object that represents the document. You can run the following to see this:

q = r.table('things').index_create(
       'user_tags',
       lambda each: print(each)) #only works in python 3

It will print out something like:

<RqlQuery instance: var_1 >

The driver only knows that this is a variable from the function. In particular, it has no idea whether each["tags"] is an array or something else (it is actually just another, very similar abstract object), so Python doesn't know how to iterate over that field. Essentially the same problem exists in JavaScript.
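The idea can be sketched with a toy symbolic object. This is a simplified, hypothetical model (the `Term` class and `render` helper are not the driver's real internals): the lambda is called once with a placeholder, and methods such as `map` record an expression for the server to evaluate later instead of looping in Python.

```python
class Term:
    """Toy symbolic term; a hypothetical simplification of a driver query object."""

    def __init__(self, text):
        self.text = text

    def __getitem__(self, key):
        # Field access is recorded, not evaluated.
        return Term(f"{self.text}[{key!r}]")

    def map(self, func):
        # Call the Python function once with a placeholder to capture its body.
        body = func(Term("var_2"))
        return Term(f"{self.text}.map(var_2 => {render(body)})")


def render(value):
    # Turn terms and lists of terms into a printable expression string.
    if isinstance(value, Term):
        return value.text
    if isinstance(value, list):
        return "[" + ", ".join(render(v) for v in value) + "]"
    return repr(value)


index_function = lambda each: each["tags"].map(lambda tag: [each["user_id"], tag])
expression = render(index_function(Term("var_1")))
print(expression)
```

The printed expression describes the whole mapping symbolically, which is why `map` works where a list comprehension cannot: the iteration happens on the server, over real data.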
