RethinkDB - Find documents with missing field
Question
I'm trying to write the most optimal query to find all of the documents that do not have a specific field. Is there any better way to do this than the examples I have listed below?
// Get the ids of all documents missing "location"
r.db("mydb").table("mytable").filter({location: null},{default: true}).pluck("id")
// Get a count of all documents missing "location"
r.db("mydb").table("mytable").filter({location: null},{default: true}).count()
Right now, these queries take about 300-400ms on a table with ~40k documents, which seems rather slow. Furthermore, in this specific case, the "location" attribute contains latitude/longitude and has a geospatial index.
Is there any way to accomplish this? Thanks!
Recommended answer
A naive suggestion
You could use the hasFields method along with the not method to filter out unwanted documents:
r.db("mydb").table("mytable")
.filter(function (row) {
return row.hasFields({ location: true }).not()
})
This might or might not be faster, but it's worth trying.
Using a secondary index
Ideally, you'd want a way to make location a secondary index and then use getAll or between, since queries that use an index are usually faster than a filter that scans the whole table. One way to work around this is to give every row in your table that has no location the value false for location. Then you would create a secondary index on location. Finally, you can query the table with getAll as often as you want!
- Add a location property to all documents without a location
For that, you'd first need to insert location: false into all rows without a location. You could do this as follows:
r.db("mydb").table("mytable")
.filter(function (row) {
return row.hasFields({ location: true }).not()
})
.update({
location: false
})
After this, you would need to find a way to insert location: false every time you add a document without a location.
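One way to keep new documents consistent is to normalize them in application code before inserting. The following is a minimal sketch, not part of RethinkDB: withLocationDefault is a hypothetical helper name, and it assumes a missing or null location should both be stored as false (hasFields treats a null value as an absent field).

```javascript
// Hypothetical helper (not a RethinkDB API): returns a copy of `doc` with
// `location: false` when the field is missing or null, and returns the
// original document unchanged when it already has a location.
function withLocationDefault(doc) {
  if (doc.location !== undefined && doc.location !== null) {
    return doc;
  }
  return Object.assign({}, doc, { location: false });
}

// Plain-JavaScript usage; no database connection is needed for this part.
const hasLocation = withLocationDefault({ id: 1, location: { lat: 1, lng: 2 } });
const noLocation = withLocationDefault({ id: 2 });
```

The result could then be passed to insert, for example r.db("mydb").table("mytable").insert(withLocationDefault(newDoc)), where newDoc stands for whatever document the application is about to store.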
- Create a secondary index for the table
Now that all documents have a location field, we can create a secondary index for location.
r.db("mydb").table("mytable")
.indexCreate('location')
Keep in mind that you only have to add { location: false } to the existing documents and create the index once.
- Use getAll
Now we can just use getAll to query documents using the location index.
r.db("mydb").table("mytable")
.getAll(false, { index: 'location' })
This will probably be faster than the query above.
Using a secondary index (function)
You can also create a secondary index as a function. Basically, you create a function and then query the results of that function using getAll. This is probably easier and more straightforward than what I proposed before.
- Create the index
Here it is:
r.db("mydb").table("mytable")
.indexCreate('has_location', function (x) {
return x.hasFields('location');
})
- Use getAll
Here it is:
r.db("mydb").table("mytable")
.getAll(false, { index: 'has_location' })