Find CouchDB docs missing an arbitrary field
Question
I need a CouchDB view that returns all the documents lacking an arbitrary field. This is easy to do if you know in advance which fields a document might lack. For example, the following view lets you request view/my_view/?key="foo" to retrieve docs without the "foo" field:
function (doc) {
  // Fields to check; one row is emitted for each field the doc lacks
  var fields = [ "foo", "bar", "etc" ];
  for (var idx = 0; idx < fields.length; idx++) {
    if (!doc.hasOwnProperty(fields[idx])) {
      emit(fields[idx], 1);
    }
  }
}
However, you're limited to asking about the three fields listed in the view; a request like view/my_view/?key="baz" returns nothing, even if many docs are missing that field. I need a view that handles that case, one where I don't have to specify the possibly missing fields in advance. Any thoughts?
Recommended answer
This technique is called the Thai massage. Use it to efficiently find documents not in a view if (and only if) the view is keyed on the document id.
function(doc) {
  // _view/fields map, showing all fields of all docs
  // In principle you could emit e.g. "foo.bar.baz"
  // for nested objects. Obviously I do not.
  for (var field in doc)
    emit(field, doc._id);
}
function(keys, vals, is_rerun) {
  // _view/fields reduce; could also be the string "_count"
  // On a rereduce, vals holds partial counts, so add them up
  return is_rerun ? sum(vals) : vals.length;
}
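To make the shape of the view rows concrete, here is a small local simulation of the map function above. It is only a sketch: `runMap`, `fieldsMap`, and `sampleDocs` are hypothetical names, `emit` is passed in explicitly because CouchDB normally provides it, and the sort is a simplified stand-in for CouchDB's key-then-id ordering.

```javascript
// Local simulation only: CouchDB itself supplies emit() when it runs a map function.
function runMap(mapFn, docs) {
  const rows = [];
  for (const doc of docs) {
    // Hypothetical stand-in for CouchDB's emit(); records the source doc id too.
    const emit = (key, value) => rows.push({ id: doc._id, key, value });
    mapFn(doc, emit);
  }
  // Simplified ordering: by key, then by document id.
  const cmp = (x, y) => (x < y ? -1 : x > y ? 1 : 0);
  rows.sort((a, b) => cmp(a.key, b.key) || cmp(a.id, b.id));
  return rows;
}

// Same logic as the map function above, with emit passed as a parameter.
function fieldsMap(doc, emit) {
  for (const field in doc) emit(field, doc._id);
}

// Made-up sample documents for illustration.
const sampleDocs = [
  { _id: "a", _rev: "1-x", foo: 1 },
  { _id: "b", _rev: "1-y", bar: 2 },
  { _id: "c", _rev: "1-z", foo: 3, bar: 4 },
];

const rows = runMap(fieldsMap, sampleDocs);
const withFoo = rows.filter((r) => r.key === "foo").map((r) => r.id);
console.log(withFoo); // [ 'a', 'c' ]
```

Note that the loop also emits rows for `_id` and `_rev`, since those are ordinary properties of every document.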
To find documents not having that field:
- GET /db/_all_docs and remember all the ids
- GET /db/_design/ex/_view/fields?reduce=false&key="some_field"
- Compare the ids from _all_docs vs the ids from the query.
The ids in _all_docs but not in the view are those missing that field.
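The comparison step can be sketched as a plain set difference. This is a minimal illustration, assuming both responses have already been fetched and parsed; `missingIds` is a hypothetical helper, and the sample objects only mimic the `rows[].id` layout of CouchDB responses.

```javascript
// Return the ids present in allIds but absent from idsWithField.
function missingIds(allIds, idsWithField) {
  const has = new Set(idsWithField);
  return allIds.filter((id) => !has.has(id));
}

// Made-up sample data shaped like parsed CouchDB responses: { rows: [{ id, ... }] }
const allDocsResponse = { rows: [{ id: "a" }, { id: "b" }, { id: "c" }] };
const viewResponse = { rows: [{ id: "a" }, { id: "c" }] };

const missing = missingIds(
  allDocsResponse.rows.map((r) => r.id),
  viewResponse.rows.map((r) => r.id)
);
console.log(missing); // [ 'b' ]
```

This version holds every id from the view in memory, which is fine for small databases.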
It sounds bad to keep the ids in memory, but you don't have to! You can use a merge sort strategy, iterating through both queries simultaneously. You start with the first id of the has list (from the view) and the first id of the full list (from _all_docs).
- If full < has, it is missing the field; redo with the next full element
- If full = has, it has the field; redo with the next full element
- If full > has, redo with the next has element
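The three rules above can be sketched as a generator that walks two sorted sequences of ids in step. This is an illustration under stated assumptions: `missingFromSorted` is a hypothetical name, both inputs must arrive in the same sort order, and JavaScript's `<` on strings must agree with that order (worth verifying against CouchDB's collation for ids that are not simple ASCII). Once the has list is exhausted, every remaining full id counts as missing.

```javascript
// Yield each id from fullIds that does not appear in hasIds.
// Both inputs are iterables of ids sorted in the same order.
function* missingFromSorted(fullIds, hasIds) {
  const full = fullIds[Symbol.iterator]();
  const has = hasIds[Symbol.iterator]();
  let f = full.next();
  let h = has.next();
  while (!f.done) {
    if (h.done || f.value < h.value) {
      // full < has (or has is exhausted): the field is missing
      yield f.value;
      f = full.next();
    } else if (f.value === h.value) {
      // full = has: the doc has the field, move to the next full element
      f = full.next();
    } else {
      // full > has: move to the next has element
      h = has.next();
    }
  }
}

console.log([...missingFromSorted(["a", "b", "c", "d"], ["a", "c"])]); // [ 'b', 'd' ]
```

Because it accepts any iterable, the same function could be fed ids lazily as pages of each query arrive, so only one id from each list needs to be held at a time.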
Depending on your language, that might be difficult. But it is pretty easy in JavaScript, for example, or other event-driven programming frameworks.