RethinkDB - Find documents with missing field

Question

I'm trying to write the most efficient query to find all documents that do not have a specific field. Is there a better way to do this than the examples below?

// Get the ids of all documents missing "location"
r.db("mydb").table("mytable").filter({location: null},{default: true}).pluck("id")

// Get a count of all documents missing "location"
r.db("mydb").table("mytable").filter({location: null},{default: true}).count()

Right now, these queries take about 300-400ms on a table with ~40k documents, which seems rather slow. Furthermore, in this specific case, the "location" attribute contains latitude/longitude and has a geospatial index.

Is there a faster way to accomplish this? Thanks!

Recommended answer

A naive suggestion

You could use the hasFields method along with the not method to filter out unwanted documents:

r.db("mydb").table("mytable")
  .filter(function (row) {
    return row.hasFields({ location: true }).not()
  })

This might or might not be faster, but it's worth trying. Like the original query, it still scans every document in the table.

Using a secondary index

Ideally, you'd make location a secondary index and then use getAll or between, since queries that use an index are generally much faster than a full table scan. Documents that lack the indexed field are not included in the index, though, so one workaround is to give every row without a location the value false for location. Then you create a secondary index on location and can query the table with getAll as often as you want.

  1. Add a location property to all documents that don't have one

For that, you first need to set location: false on all rows without a location. You could do this as follows:

r.db("mydb").table("mytable")
  .filter(function (row) {
    return row.hasFields({ location: true }).not()
  })
  .update({
    location: false
  })

After this, you need to make sure that every new document without a location is inserted with location: false.
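One way to handle this on the application side is to normalize each document just before inserting it. The sketch below is only an illustration: withDefaultLocation is a hypothetical helper, not part of the RethinkDB driver. It treats null the same as a missing field, because hasFields only counts keys with a non-null value.

```javascript
// Hypothetical helper (not part of the RethinkDB driver): make sure a document
// carries a `location` key before it is inserted, so the index can see it.
function withDefaultLocation(doc) {
  // Mirror hasFields: a key that is absent or null counts as "no location".
  if (doc.location !== undefined && doc.location !== null) {
    return doc;
  }
  // Copy the document instead of mutating the caller's object.
  return { ...doc, location: false };
}
```

With the driver, the insert would then look something like r.db("mydb").table("mytable").insert(withDefaultLocation(doc)).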

  2. Create a secondary index for the table

Now that all documents have a location field, we can create a secondary index for location.

r.db("mydb").table("mytable")
 .indexCreate('location')

Keep in mind that you only have to run the { location: false } update and create the index once.

  3. Use getAll

Now we can use getAll to query documents through the location index.

r.db("mydb").table("mytable")
 .getAll(false, { index: 'location' })

This will probably be faster than the query above, because getAll reads only the matching index entries instead of scanning the whole table.

Using a secondary index (function)

You can also create a secondary index from a function. Basically, you define a function and then query its results using getAll. This is probably easier and more straightforward than what I proposed before.

  1. Create the index

Here it is:

r.db("mydb").table("mytable")
 .indexCreate('has_location', function (x) {
   return x.hasFields('location');
 })

  2. Use getAll

Here it is:

r.db("mydb").table("mytable")
 .getAll(false, { index: 'has_location' })
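
Putting the two steps together, a rough sketch with the JavaScript driver might look like the following. The function name countMissingLocation is hypothetical; r is assumed to be the rethinkdb driver module and conn an already open connection. The indexList and indexWait calls are additions that are not in the answer above: they avoid creating the index twice and avoid querying it before it has finished building.

```javascript
// Sketch only: `r` is assumed to be the rethinkdb driver module and `conn`
// an open connection; both are passed in by the caller.
async function countMissingLocation(r, conn) {
  const table = r.db("mydb").table("mytable");

  // Create the function-based index only if it does not exist yet.
  const indexes = await table.indexList().run(conn);
  if (!indexes.includes("has_location")) {
    await table
      .indexCreate("has_location", function (row) {
        return row.hasFields("location");
      })
      .run(conn);
  }

  // Block until the index is fully built and ready for queries.
  await table.indexWait("has_location").run(conn);

  // Count the documents whose index value is false, i.e. no location.
  return table.getAll(false, { index: "has_location" }).count().run(conn);
}
```

To get the ids instead of a count, as in the question, the final count() could be swapped for pluck("id").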
