MongoDB query with conditional group by statement
Question
I need to export customer records from a MongoDB database. The exported customer records should not contain duplicates. "firstName+lastName+code" is the key used to de-duplicate the records, and if two records with the same key are present in the database, I need to give preference to the one whose source field has a value other than email.
The customer collection has the fields (id, firstName, lastName, code, source).
If there are 3 records with the same unique key and 3 different sources, then I need to choose only one record from among the non-email sources (TV, internet); if there are n sources, I still need only one record. The email record should be chosen only when it is the only record present for that unique key. The query I am using:
`db.customer.aggregate([
  { "$match": { "active": true, "dealerCode": { "$in": ["111391"] }, "source": { "$in": ["email", "TV", "internet"] } } },
  { $group: {
      "_id": {
        "firstName": "$personalInfo.firstName",
        "lastName": "$personalInfo.lastName",
        "code": "$vehicle.code"
      },
      "source": { $addToSet: { "source": "$source" } }
  } },
  { $redact: { $cond: [ { $eq: [ { $ifNull: ["$source", "other"] }, "email" ] }, "$$PRUNE", "$$DESCEND" ] } },
  { $project: {
      "source": { $map: {
          "input": { $cond: [ { $eq: [ { $size: "$source" }, 0 ] }, [ { "source": "email" } ], "$source" ] },
          "as": "inp",
          "in": "$$inp.source"
      } },
      "record": { "_id": 1 }
  } }
])`
Sample output:
{ "_id" : { "firstName" : "sGI6YaJ36WRfI4xuJQzI7A==", "lastName" : "99eQ7i+uTOqO8X+IPW+NOA==", "code" : "1GTHK23688F113955" }, "source" : [ "internet" ] }
{ "_id" : { "firstName" : "WYDROTF/9vs9O7XhdIKd5Q==", "lastName" : "BM18Uq/ltcbdx0UJOXh7Sw==", "code" : "1G4GE5GV5AF180133" }, "source" : [ "internet" ] }
{ "_id" : { "firstName" : "id+U2gYNHQaNQRWXpe34MA==", "lastName" : "AIs1G33QnH9RB0nupJEvjw==", "code" : "1G4GE5EV0AF177966" }, "source" : [ "internet" ] }
{ "_id" : { "firstName" : "qhreJVuUA5l8lnBPVhMAdw==", "lastName" : "petb0Qx3YPfebSioY0wL9w==", "code" : "1G1AL55F277253143" }, "source" : [ "TV" ] }
{ "_id" : { "firstName" : "qhreJVuUA5l8lnBPVhMAdw==", "lastName" : "6LB/NmhbfqTagbOnHFGoog==", "code" : "1GCVKREC0EZ168134" }, "source" : [ "TV", "internet" ] }
There is a problem with this query; please suggest a fix :(
Recommended answer
Your code doesn't work because $cond is not an accumulator operator. Only accumulator operators, such as $first, $last and $addToSet, can be used to compute the fields of a $group stage.
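To illustrate that rule, here is a minimal sketch (not part of the original answer) of one way $cond can still take part in a $group stage: nested inside an accumulator instead of standing in for one. It assumes the simplified field names used in this answer (firstName, lastName, code, source), and the variable name pipeline is only illustrative. It relies on $max preferring any string over null, so the result should be treated as a starting point and checked against your own data.

```javascript
// Sketch only: field paths follow the simplified schema assumed in this answer.
const pipeline = [
  {
    $group: {
      _id: { firstName: "$firstName", lastName: "$lastName", code: "$code" },
      // $max is the accumulator; $cond only shapes the value handed to it.
      // "email" is mapped to null, so any other source in the group wins.
      source: {
        $max: { $cond: [{ $eq: ["$source", "email"] }, null, "$source"] },
      },
    },
  },
  // A group that holds only email records ends up with null: restore "email".
  { $project: { source: { $ifNull: ["$source", "email"] } } },
];

// In the mongo shell this would be run as: db.customer.aggregate(pipeline)
```

When a key has several non-email sources, this variant keeps whichever one sorts last as a string, which is an arbitrary but deterministic choice.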
Assuming your records contain no more than two possible values of source, as you mention in your question, you could add a conditional $project stage and modify the $group stage as follows.
Code:
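db.customer.aggregate([
  // Group records that share the same key and remember the first and last source seen.
  // Note: if id is unique per record, leave it out of _id, otherwise no records are merged.
  { $group: {
      "_id": { "id": "$id", "firstName": "$firstName", "lastName": "$lastName", "code": "$code" },
      "sourceA": { $first: "$source" },
      "sourceB": { $last: "$source" }
  } },
  // Prefer the non-email source: if sourceA is email, fall back to sourceB.
  { $project: { "source": { $cond: [ { $eq: ["$sourceA", "email"] }, "$sourceB", "$sourceA" ] } } }
])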
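The selection rule of that pipeline can be traced without a database. The following is a rough plain-JavaScript sketch that mimics $first, $last and the $cond on an in-memory array; the sample records and the function name pickSourceFirstLast are made up for illustration, and the key deliberately omits id so that duplicates are merged.

```javascript
// Made-up sample data standing in for the customer collection.
const customers = [
  { firstName: "Ann", lastName: "Lee", code: "C1", source: "email" },
  { firstName: "Ann", lastName: "Lee", code: "C1", source: "TV" },
  { firstName: "Bob", lastName: "Ray", code: "C2", source: "email" },
];

// Mimics $group with $first/$last followed by the $cond in $project.
function pickSourceFirstLast(records) {
  const groups = new Map();
  for (const r of records) {
    const key = JSON.stringify([r.firstName, r.lastName, r.code]);
    const g = groups.get(key);
    if (g === undefined) {
      // First record of the group: it is both the first and the last so far.
      groups.set(key, { sourceA: r.source, sourceB: r.source });
    } else {
      // $last: overwritten by every later record of the group.
      g.sourceB = r.source;
    }
  }
  const result = {};
  for (const [key, g] of groups) {
    // $cond: if the first source is email, take the last one instead.
    result[key] = g.sourceA === "email" ? g.sourceB : g.sourceA;
  }
  return result;
}

const picked = pickSourceFirstLast(customers);
console.log(picked);
```

Because only the first and last records of a group are inspected, a key with three records ordered email, TV, email would still resolve to email, which is why this version is limited to at most two records per key.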
If there can be more than two possible values for source, then you could do the following:
- Group by the id, firstName, lastName and code. Accumulate the unique values of source, using the $addToSet operator.
- Use $redact to keep only the values other than email.
- Project the required fields; if the source array is empty (all the elements have been removed), add the value email to it.
- Unwind the source field to list it as a field and not an array. (optional)
Code:
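db.customer.aggregate([
  // Collect the distinct sources of each key as sub-documents, so that $redact can prune them.
  { $group: {
      "_id": { "id": "$id", "firstName": "$firstName", "lastName": "$lastName", "code": "$code" },
      "sourceArr": { $addToSet: { "source": "$source" } }
  } },
  // Drop every sub-document whose source is email; levels without a source field are kept.
  { $redact: { $cond: [ { $eq: [ { $ifNull: ["$source", "other"] }, "email" ] },
                        "$$PRUNE",
                        "$$DESCEND" ] } },
  // If nothing is left, the key only had email records, so put email back.
  { $project: { "source": { $map: {
      "input": { $cond: [ { $eq: [ { $size: "$sourceArr" }, 0 ] },
                          [ { "source": "email" } ],
                          "$sourceArr" ] },
      "as": "inp",
      "in": "$$inp.source" } } } }
])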
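The four steps listed above can also be followed on an in-memory array. This is a minimal plain-JavaScript sketch under stated assumptions, not the pipeline itself: the sample records and the function name dedupeSources are hypothetical, and the grouping key uses only firstName, lastName and code.

```javascript
// Made-up sample data standing in for the customer collection.
const sampleRecords = [
  { firstName: "Ann", lastName: "Lee", code: "C1", source: "email" },
  { firstName: "Ann", lastName: "Lee", code: "C1", source: "TV" },
  { firstName: "Ann", lastName: "Lee", code: "C1", source: "internet" },
  { firstName: "Bob", lastName: "Ray", code: "C2", source: "email" },
];

function dedupeSources(records) {
  // Step 1 ($group + $addToSet): collect the distinct sources per key.
  const groups = new Map();
  for (const r of records) {
    const key = JSON.stringify([r.firstName, r.lastName, r.code]);
    if (!groups.has(key)) groups.set(key, new Set());
    groups.get(key).add(r.source);
  }

  const rows = [];
  for (const [key, sources] of groups) {
    // Step 2 ($redact): keep only the values other than email.
    const kept = [...sources].filter((s) => s !== "email");
    // Step 3 ($project): an empty array means the key only had email.
    const finalSources = kept.length === 0 ? ["email"] : kept;
    // Step 4 ($unwind, optional): one row per remaining source.
    for (const source of finalSources) {
      rows.push({ key: JSON.parse(key), source });
    }
  }
  return rows;
}

const rows = dedupeSources(sampleRecords);
console.log(rows);
```

As in the pipeline, a key with several non-email sources still yields more than one value, so a final choice among them (for example, taking the first) would have to be added if exactly one record per key is required.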