CouchDB视图-在键数组上进行筛选和分组 [英] CouchDB View - Filter and Group By on Key Array
问题描述
我在CouchDB视图中有一组键, [doc.time,doc.address]
。两者都不是唯一的。 doc.time
是UNIX时间戳,而 doc.address
是字符串。 reduce
函数设置为 _sum
,因为每组键的唯一值是数字。
I have an array of keys in a CouchDB view, [doc.time, doc.address]
. Neither is unique. doc.time
is a UNIX timestamp and doc.address
is a string. The reduce
function is set to _sum
as the only value for each set of keys is a number.
我想要的是按 doc.time
过滤,然后将其余记录按 doc.address分组
。如果我将 doc.time
作为第一个键,则无论我指定为 group_level $ c是什么,我似乎都无法按唯一地址进行分组。 $ c>。如果我先放置
doc.address
,我似乎无法按时间过滤查询。
What I want is to filter by doc.time
, then group the remaining records by doc.address
. If I put doc.time
as the first key, I cannot seem to group by unique addresses no matter what I specify as a group_level
. If I put doc.address
first, I cannot seem to filter the query by time.
查询:?group_level = 1& startkey = [0,1230000000]& endkey = [{},1340000000]
第一个关键字: doc.address
在 doc.time
问题:不能按时间过滤
代码:
rows: [
{
key: [ "1126GDuGLQTX3LFHHmjCctdn8WKDjn7QNA" ],
value: 50
},
{
key: [ "112AobLhjLJQ3LGqXFrsdnWMPqWCQqoiS6" ],
value: 50
}
]
查询:?group_level = 1& startkey = [1230000000]& endkey = [1340000000,{}]
第一个键: doc.time
在 doc.address之前
问题:看不到,我没有按 doc.address分组
Problem: Cannot see and I am not grouped by doc.address
代码:
rows: [
{
key: [ 1231469665 ],
value: 50
},
{
key: [ 1231469744 ],
value: 50
}
]
推荐答案
您提到:
...如果我把
doc.time
作为第一个键,我似乎无法按唯一地址分组无论我指定为group_level ...
... If I put
doc.time
as the first key, I cannot seem to group by unique addresses no matter what I specify as a group_level ...
查询参数 group_level = N
在 N th
逗号上分割字符串,并通过字符串匹配将左元素分组在一起。因此,当您的 array键如下所示: [doc.time,doc.address]
时,您将无法按地址
,该地址不在逗号的左侧。
The query parameter group_level=N
splits the string on the Nth
comma and groups the left elements together by string match. Therefore, When your array key is like this: [doc.time, doc.address]
, you won't be able to group by address
, which is not on the left side of the comma.
...如果我先放置
doc.address
,我似乎无法按时间过滤查询...
... If I put
doc.address
first, I cannot seem to filter the query by time ...
当您的数组键类似于: [doc.address,doc.time]
时,请注意您正在发出<地图函数中的em>数组键。您需要考虑CouchDB中关于数组键或复合键的后续点:
When your array key is like: [doc.address, doc.time]
, notice that you are emitting an array key inside your Map function. You need to consider the following points regarding array key or compound key in CouchDB:
在此引用:
...首先要注意的事项和非常非常重要...来自javascript Map函数的数组输出...这些索引键中的每个都是字符串,并按字符串作为字符串(包括方括号和逗号)按字符排序...
... First thing of note and very important ... an array output ... from the javascript Map function ... each of those Index Keys are strings, and are ordered character by character as strings, including the brackets and commas ...
上述引用对CA中CouchDB索引的工作方式有重大影响复合键或数组键的本身。
The above statement and explanations on the reference have a significant impact on how CouchDB indexing works in the case of compound key or array key.
为澄清起见,让我们在<$ c上创建如下所示的文档$ c>样本数据库:
To clarify, lets create documents like below on a sample
database:
{"time":"2011","address":"CT"}
{"time":"2012","address":"CT"}
...
{"time":"2011","address":"TX"}
...
{"time":"2015","address":"TX"}
...
{"time":"2014","address":"NY"}
...
{"time":"2014","address":"CA"}
{"time":"2015","address":"CA"}
{"time":"2016","address":"CA"}
我实现了这样的视图地图功能:
I implemented a view map function like this:
function (doc) {
if(doc.time && doc.address){
emit([doc.address, doc.time], null);
}
}
目前,我没有使用任何 Reduce 函数,因为可以忽略任何分组或 reducing ,而将注意力放在简单的简单索引上。上面的视图正在生成以下用于索引的键/值对:
For now, I'm not using any Reduce function, because, lets ignore any grouping or reducing and focus on plain simple indexing. The above view is generating the following key/value pairs for indexing:
$ curl -k -X GET 'https://admin:****@192.168.1.106:6984/sample/_design/by_addr_time/_view/by_addr_time'
{"total_rows":25,"offset":0,"rows":[
{"id":"doc_0022","key":["CA","2014"],"value":null},
{"id":"doc_0023","key":["CA","2015"],"value":null},
{"id":"doc_0024","key":["CA","2016"],"value":null},
{"id":"doc_0000","key":["CT","2011"],"value":null},
{"id":"doc_0001","key":["CT","2012"],"value":null},
{"id":"doc_0002","key":["CT","2013"],"value":null},
{"id":"doc_0003","key":["CT","2014"],"value":null},
{"id":"doc_0004","key":["CT","2015"],"value":null},
{"id":"doc_0005","key":["CT","2016"],"value":null},
{"id":"doc_0014","key":["NY","2011"],"value":null},
{"id":"doc_0015","key":["NY","2012"],"value":null},
{"id":"doc_0016","key":["NY","2013"],"value":null},
{"id":"doc_0017","key":["NY","2014"],"value":null},
{"id":"doc_0018","key":["NY","2015"],"value":null},
{"id":"doc_0019","key":["NY","2016"],"value":null},
{"id":"doc_0020","key":["NY","2017"],"value":null},
{"id":"doc_0021","key":["NY","2018"],"value":null},
{"id":"doc_0006","key":["TX","2011"],"value":null},
{"id":"doc_0008","key":["TX","2012"],"value":null},
{"id":"doc_0007","key":["TX","2013"],"value":null},
{"id":"doc_0009","key":["TX","2014"],"value":null},
{"id":"doc_0010","key":["TX","2015"],"value":null},
{"id":"doc_0011","key":["TX","2016"],"value":null},
{"id":"doc_0012","key":["TX","2017"],"value":null},
{"id":"doc_0013","key":["TX","2018"],"value":null}
]}
现在,我要执行一个查询,以按<$ c过滤视图$ c> doc.time 。我的查询参数是:
Now, I'm going to do a query to filter the view by doc.time
. My query parameters are:
?startkey=["AA","2017"]&endkey=["ZZ","2018"]
我希望上述查询仅返回时间
字段,位于 2017
和 2018
之间,地址<这些文档的/ code>字段可以具有 any 值,因为我是从
AA
到 ZZ
,其中包括我数据库中的所有地址。我正在使用 curl
这样的查询:
I expect the above query to return only the docs with the time
field between 2017
and 2018
, the address
field of those docs can have any value since I specified from AA
to ZZ
which includes all addresses on my database. I'm doing the query with curl
like this:
$ curl -k -X GET 'https://admin:****@192.168.1.106:6984/sample/_design/by_addr_time/_view/by_addr_time?startkey=\["AA","2017"\]&endkey=\["ZZ","2018"\]'
{"total_rows":25,"offset":0,"rows":[
{"id":"doc_0022","key":["CA","2014"],"value":null},
{"id":"doc_0023","key":["CA","2015"],"value":null},
{"id":"doc_0024","key":["CA","2016"],"value":null},
{"id":"doc_0000","key":["CT","2011"],"value":null},
{"id":"doc_0001","key":["CT","2012"],"value":null},
{"id":"doc_0002","key":["CT","2013"],"value":null},
{"id":"doc_0003","key":["CT","2014"],"value":null},
{"id":"doc_0004","key":["CT","2015"],"value":null},
{"id":"doc_0005","key":["CT","2016"],"value":null},
{"id":"doc_0014","key":["NY","2011"],"value":null},
{"id":"doc_0015","key":["NY","2012"],"value":null},
{"id":"doc_0016","key":["NY","2013"],"value":null},
{"id":"doc_0017","key":["NY","2014"],"value":null},
{"id":"doc_0018","key":["NY","2015"],"value":null},
{"id":"doc_0019","key":["NY","2016"],"value":null},
{"id":"doc_0020","key":["NY","2017"],"value":null},
{"id":"doc_0021","key":["NY","2018"],"value":null},
{"id":"doc_0006","key":["TX","2011"],"value":null},
{"id":"doc_0008","key":["TX","2012"],"value":null},
{"id":"doc_0007","key":["TX","2013"],"value":null},
{"id":"doc_0009","key":["TX","2014"],"value":null},
{"id":"doc_0010","key":["TX","2015"],"value":null},
{"id":"doc_0011","key":["TX","2016"],"value":null},
{"id":"doc_0012","key":["TX","2017"],"value":null},
{"id":"doc_0013","key":["TX","2018"],"value":null}
]}
上述查询返回的响应似乎令人震惊。因为看起来它不只返回在 2017
和 2018之间提交的
。这就是数组键的CouchDB索引的工作方式。 CouchDB对数组键进行索引,就好像整个数组是一个包含该数组的括号和逗号的字符串一样!如果您阅读参考,它将开始使时间
的文档
The response returned by the above query seems shocking. Because it looks like it did NOT return only the docs with time
filed between 2017
and 2018
. That's just how the CouchDB indexing for array keys work. CouchDB does the indexing of array keys as if the whole array is a string including the brackets and commas of the array! If you read the reference, it would start to make sense.
现在让我们更改查询:
?startkey=["CT","2016"]&endkey=["TX","2011"]
根据我们的解释,上述查询的结果如下所示:
The result of the above query is shown below, based on our explanations, this should make sense:
$ curl -k -X GET 'https://admin:****@192.168.1.106:6984/sample/_design/by_addr_time/_view/by_addr_time?startkey=\["CT","2016"\]&endkey=\["TX","2011"\]'
{"total_rows":25,"offset":8,"rows":[
{"id":"doc_0005","key":["CT","2016"],"value":null},
{"id":"doc_0014","key":["NY","2011"],"value":null},
{"id":"doc_0015","key":["NY","2012"],"value":null},
{"id":"doc_0016","key":["NY","2013"],"value":null},
{"id":"doc_0017","key":["NY","2014"],"value":null},
{"id":"doc_0018","key":["NY","2015"],"value":null},
{"id":"doc_0019","key":["NY","2016"],"value":null},
{"id":"doc_0020","key":["NY","2017"],"value":null},
{"id":"doc_0021","key":["NY","2018"],"value":null},
{"id":"doc_0006","key":["TX","2011"],"value":null}
]}
UPDATE
UPDATE
...我要过滤的是
doc.time
,然后将其余记录按doc.address
...
... What I want is to filter by
doc.time
, then group the remaining records bydoc.address
...
$ b分组$ b
那么,我们该怎么办? 问题和答案,并提供了基本思路。
So, what should we do? There is a good question and answer and provides the basic ideas.
我不确定哪个主意最好,但是我实现了这样一个主意:创建了一个名为 t_red
如下所示,内置 _count
reduce:
I'm not sure which idea is the best, but I implemented one idea like this: created a view named t_red
like below with a builtin _count
reduce:
function (doc) {
if(doc.time && doc.address){
emit([doc.time, doc.address], null);
}
}
我还创建了一个名为<$ c $的视图c> a_red 具有内置 _count
减少:
Also, I created a view named a_red
with a builtin _count
reduce:
function (doc) {
if(doc.address && doc.time){
emit([doc.address, doc.time], null);
}
}
然后我在 NodeJS上开发了以下代码查询在 2012
和 2015 $ c $之间的
doc.time
c>,然后根据 doc.address
将结果分组,控制台日志将在代码内显示为注释。我希望这段代码会有所帮助(不要混淆!):
Then I developed the following code on NodeJS to query doc.time
between 2012
and 2015
and then group the results according to the doc.address
, console logs are shown inside the code as comments. I hope this code will be helpful (not confusing!):
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0"; // Ignore rejection, becasue CouchDB SSL certificate is self-signed
const fetch=require('node-fetch')
// query "t_red" view/index
fetch(`https://admin:****@192.168.1.106:6984/sample/_design/t_red/_view/t_red?group_level=2&startkey=["2012", "AA"]&endkey=["2015", "ZZ"]`, {
method: 'GET',
headers: {
'Content-Type': 'application/json',
}
}).then(
res=>res.json()
).then(data=>{
let unique_addr=[]
data.rows.map(row=>{
console.log('row.key-> ', row.key, ' row.value-> ', row.value)
// console log is shown below:
//
// row.key-> [ '2012', 'CT' ] row.value-> 1
// row.key-> [ '2012', 'NY' ] row.value-> 1
// row.key-> [ '2012', 'TX' ] row.value-> 1
// row.key-> [ '2013', 'CT' ] row.value-> 1
// row.key-> [ '2013', 'NY' ] row.value-> 1
// row.key-> [ '2013', 'TX' ] row.value-> 1
// row.key-> [ '2014', 'CA' ] row.value-> 1
// row.key-> [ '2014', 'CT' ] row.value-> 1
// row.key-> [ '2014', 'NY' ] row.value-> 1
// row.key-> [ '2014', 'TX' ] row.value-> 1
// row.key-> [ '2015', 'CA' ] row.value-> 1
// row.key-> [ '2015', 'CT' ] row.value-> 1
// row.key-> [ '2015', 'NY' ] row.value-> 1
// row.key-> [ '2015', 'TX' ] row.value-> 1
if(unique_addr.indexOf(row.key[1])==-1){ // Push unique addresses into an array
unique_addr.push(row.key[1])
}
})
console.log(unique_addr)
// Console log is shown below:
//
// [ 'CT', 'NY', 'TX', 'CA' ]
return unique_addr
}).then(unique_addr=>{
// Group the unique addresses
let group_by_address=unique_addr.map(addr=>{
// For each unique address, do a query of "a_red" view/index
return fetch(`https://admin:****@192.168.1.106:6984/sample/_design/a_red/_view/a_red?group_level=2&startkey=["${addr}","2012"]&endkey=["${addr}","2015"]`, {
method: 'GET',
headers: {
'Content-Type': 'application/json',
}
}).then(
res=>res.json()
).then(data=>{
data.rows.map(row=>{console.log('row.key-> ', row.key, ' row.value-> ', row.value)})
// Console logs related to this section of code are shown below
//row.key-> [ 'CA', '2014' ] row.value-> 1
//row.key-> [ 'CA', '2015' ] row.value-> 1
//row.key-> [ 'NY', '2012' ] row.value-> 1
//row.key-> [ 'NY', '2013' ] row.value-> 1
//row.key-> [ 'NY', '2014' ] row.value-> 1
//row.key-> [ 'NY', '2015' ] row.value-> 1
//row.key-> [ 'CT', '2012' ] row.value-> 1
//row.key-> [ 'CT', '2013' ] row.value-> 1
//row.key-> [ 'CT', '2014' ] row.value-> 1
//row.key-> [ 'CT', '2015' ] row.value-> 1
//row.key-> [ 'TX', '2012' ] row.value-> 1
//row.key-> [ 'TX', '2013' ] row.value-> 1
//row.key-> [ 'TX', '2014' ] row.value-> 1
//row.key-> [ 'TX', '2015' ] row.value-> 1
let obj={}
obj[addr]=data.rows.length // This object contains unique address and its corresponding frequency in above query
return obj
}).catch(err=>{
console.log('err-> ', err)
})
})
return group_by_address
}).then(group_by_address=>{
group_by_address.map(group=>{
group.then(()=>{
console.log('Grouped by address-> ', group)
// Console logs related this section of code are shown below:
//Grouped by address-> Promise { { CA: 2 } }
//Grouped by address-> Promise { { NY: 4 } }
//Grouped by address-> Promise { { CT: 4 } }
//Grouped by address-> Promise { { TX: 4 } }
})
})
}).catch(err=>{
console.log('err-> ', err)
})
这篇关于CouchDB视图-在键数组上进行筛选和分组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!