How to query a multi index in RethinkDB over an array of objects

Question

I'm working with a data set that looks something like this:

{
"bitrates": [
  {
    "format":  "mp3" ,
    "rate":  "128K"
  } ,
  {
    "format":  "aac" ,
    "rate":  "192K"
  }
] ,
"details": [ ... ] ,
"id": 1 ,
"name":  "For Those About To Rock We Salute You" ,
"price": 1026 ,
"requires_shipping": false ,
"sku":  "ALBUM-1" 
}

And I wanted to create a secondary index on bitrates, flexing {multi:true}. This was my attempt:

r.db("music").table("catalog").indexCreate("bitrates", {multi: true})

The index built just fine, but when I query it, nothing returns - which seems contrary to every example I've read here:

http://rethinkdb.com/docs/secondary-indexes/javascript/

The query I wrote is this:

r.db("music").table("catalog").getAll(["mp3", "128K"], {index : "bitrates"})

There is no error, just 0 results (and I have 300 or so documents with this exact data).

I'm using RethinkDB 2.0 RC1.

Recommended answer

When you create an index on a field, the values in that field are used literally as the keys of the index. In your case, the keys for your bitrates index would be the objects within the bitrates array in the document, so a lookup with the array ["mp3", "128K"] has no key to match.

It seems like what you want is an index that's derived from the values in a field of the document. To do that, you want to define a custom indexing function that reduces the document to just the data you care about. The easiest way to experiment with them is to start by writing a query, and once you're happy with the results, converting it into an indexCreate() statement.

Here's a statement that grabs your sample document (with id 1), plucks the format and rate terms from all of the objects in its bitrates array, and then merges them together to create a distinct set of strings:

r.db('music').table('catalog').get(1).do(function(row) {
  return row('bitrates').map(function(bitrate) {
    return [bitrate('format'), bitrate('rate')];
  }).reduce(function(left, right) {
    return left.setUnion(right);
  })
})

Running this statement will return the following:

["mp3", "128K", "aac", "192K"]
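
If it helps to see the same steps without a database, here is a minimal plain-JavaScript sketch of the map and reduce above. It is not RethinkDB code: `setUnion` and `indexTerms` are hypothetical helpers written for this illustration, and `setUnion` only approximates the ReQL term by appending items that are not already present.

```javascript
// Plain-JavaScript model of the statement above; no RethinkDB driver involved.
const doc = {
  id: 1,
  bitrates: [
    { format: 'mp3', rate: '128K' },
    { format: 'aac', rate: '192K' }
  ]
};

// Hypothetical stand-in for ReQL's setUnion: keep left, append unseen items from right.
function setUnion(left, right) {
  const result = left.slice();
  right.forEach(function (item) {
    if (result.indexOf(item) === -1) {
      result.push(item);
    }
  });
  return result;
}

// Same shape as the query: map each object to [format, rate],
// then reduce the pairs into one flat set of strings.
function indexTerms(row) {
  return row.bitrates
    .map(function (bitrate) {
      return [bitrate.format, bitrate.rate];
    })
    .reduce(function (left, right) {
      return setUnion(left, right);
    });
}

const terms = indexTerms(doc);
console.log(terms); // [ 'mp3', '128K', 'aac', '192K' ]
```

Note that this sketch calls `Array.prototype.reduce` without an initial value, so it throws on a document whose bitrates array is empty; passing `[]` as the initial value would avoid that in the model.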

This looks good, so we can use our function to create an index. In this case, since we're expecting the indexing function to return a set of items, we also want to specify {multi: true} to ensure we can query by the items in the set, not the set itself:

r.db('music').table('catalog').indexCreate('bitrates', function(row) {
  return row('bitrates').map(function(bitrate) {
    return [bitrate('format'), bitrate('rate')];
  }).reduce(function(left, right) {
    return left.setUnion(right);
  })
}, {multi: true})

Once created, you can query your index like this:

r.db('music').table('catalog').getAll('mp3', {index: 'bitrates'})

You can also supply multiple query terms, to match rows that match any of the items:

r.db('music').table('catalog').getAll('mp3', '128K', {index: 'bitrates'})

However, if a single document matches more than one term in your query, it will be returned more than once. To fix this, add distinct():

r.db('music').table('catalog').getAll('mp3', '128K', {index: 'bitrates'}).distinct()
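
To see why the duplicates appear, here is a toy plain-JavaScript model of a multi index lookup. It is only a sketch: `getAllModel` and `distinctModel` are hypothetical helpers, the `terms` arrays stand for what the index function would produce, and nothing here calls RethinkDB.

```javascript
// Toy model of a multi index lookup; not RethinkDB code.
const catalog = [
  { id: 1, terms: ['mp3', '128K', 'aac', '192K'] },
  { id: 2, terms: ['aac', '256K'] }
];

// Hypothetical helper: one hit per (query term, matching document) pair,
// which is how a document can show up more than once.
function getAllModel(rows, queryTerms) {
  const hits = [];
  queryTerms.forEach(function (term) {
    rows.forEach(function (row) {
      if (row.terms.indexOf(term) !== -1) {
        hits.push(row.id);
      }
    });
  });
  return hits;
}

// Hypothetical helper standing in for distinct(): keep the first occurrence only.
function distinctModel(values) {
  return values.filter(function (value, position) {
    return values.indexOf(value) === position;
  });
}

const hits = getAllModel(catalog, ['mp3', '128K']);
console.log(hits);                // [ 1, 1 ]
console.log(distinctModel(hits)); // [ 1 ]
```

Document 1 matches both 'mp3' and '128K', so it is returned once per matching term until the duplicates are removed.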

If necessary, you might also consider using downcase() to normalize the casing of the terms used in the secondary index.
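
As a rough illustration of that normalization, the plain-JavaScript sketch below lowercases every term when building the set and lowercases the search term too. `normalizedTerms` and `matches` are hypothetical helpers, `toLowerCase()` stands in for ReQL's downcase(), and the mixed-case sample data is invented for the example.

```javascript
// Plain-JavaScript model of case normalization; not RethinkDB code.
const doc = {
  id: 1,
  bitrates: [
    { format: 'MP3', rate: '128K' },
    { format: 'aac', rate: '192k' }
  ]
};

// Hypothetical helper: collect format and rate values, lowercased and de-duplicated.
function normalizedTerms(row) {
  const terms = [];
  row.bitrates.forEach(function (bitrate) {
    [bitrate.format, bitrate.rate].forEach(function (value) {
      const term = value.toLowerCase();
      if (terms.indexOf(term) === -1) {
        terms.push(term);
      }
    });
  });
  return terms;
}

// Hypothetical helper: the search term must be normalized the same way as the keys.
function matches(terms, query) {
  return terms.indexOf(query.toLowerCase()) !== -1;
}

const terms = normalizedTerms(doc);
console.log(terms);                  // [ 'mp3', '128k', 'aac', '192k' ]
console.log(matches(terms, 'Mp3'));  // true
console.log(matches(terms, '128K')); // true
```

The point to carry over is that once the index keys are lowercased, a search for '128K' only matches if the search term is lowercased as well.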

You could also skip all of the indexing business entirely and use a filter() query:

r.db('music').table('catalog').filter(function(row) {
  return row('bitrates').map(function(bitrate) {
    return [bitrate('format'), bitrate('rate')];
  }).reduce(function(left, right) {
    return left.setUnion(right);
  }).contains('mp3');
})

That said, if you're almost always querying your table in the same manner, generating a secondary index using a custom function will result in dramatically better performance.
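
The difference between the two approaches can be sketched in plain JavaScript. This is only a conceptual model with hypothetical helpers (`scan`, `buildIndex`); RethinkDB's real index storage is different, but the idea that a filter inspects every row per query while an index is built once and then looked up by key is the same.

```javascript
// Conceptual model: full scan versus a prebuilt lookup table. Not RethinkDB code.
const catalog = [
  { id: 1, terms: ['mp3', '128K', 'aac', '192K'] },
  { id: 2, terms: ['aac', '256K'] },
  { id: 3, terms: ['mp3', '320K'] }
];

// filter() style: every row is inspected on every query.
function scan(rows, term) {
  return rows
    .filter(function (row) {
      return row.terms.indexOf(term) !== -1;
    })
    .map(function (row) {
      return row.id;
    });
}

// Index style: walk the rows once, recording which ids carry each term.
function buildIndex(rows) {
  const index = new Map();
  rows.forEach(function (row) {
    row.terms.forEach(function (term) {
      if (!index.has(term)) {
        index.set(term, []);
      }
      index.get(term).push(row.id);
    });
  });
  return index;
}

const index = buildIndex(catalog);
console.log(scan(catalog, 'mp3'));   // [ 1, 3 ]
console.log(index.get('mp3') || []); // [ 1, 3 ]
```

Both calls return the same ids; the second one does not touch rows that lack the term, which is where the saving comes from on a large table.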
