MongoDB Aggregate, find duplicate records within 7 days

Question

I have to create a check for this use case:

Duplicate payment check

• Same amount

I haven't used MongoDB much; this would have been easier for me to write in SQL.

This is what I have tried so far, without the 7-day part:

db.transactiondetails.aggregate({$group: {"_id":{"account_number":"$account_number","amount":"$amount"},"count": { $sum: 1 }}}) 

This returns something like:

{ "_id" : { "account_number" : "xxxxxxxy", "amount" : 19760 }, "count" : 2 }
{ "_id" : { "account_number" : "xxxxzzzz", "amount" : 20140 }, "count" : 2 }
...

I have created_at and updated_at, which are date fields. I am using updated_at to detect duplicates.

For example:

"created_at" : ISODate("2019-01-07T15:40:53.683Z"),
"updated_at" : ISODate("2019-01-09T06:48:44.839Z"), 

In SQL we could create groups of 7 days: each date acts as a start date, and we look for duplicates between that date and the date plus 7 days.

These are running (rolling) 7-day groups, and I need to find the duplicates within each of them.

Any help on how to go about this would be appreciated.

Recommended answer

Check whether this meets your requirements:


  1. We sort the documents (I assume you have indexes). The order matters because the next steps iterate over the array.
  2. We group by account_number + amount and push the documents into two identical arrays (data, tmp).
  3. We $unwind (flatten) the tmp array so that, for each item i, we can calculate how many days passed between item i and each later item i+1 ... n.
  4. We count how many of those later items fall within 7 days of item i.
  5. We skip every result whose count = 0.
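The five steps can also be traced outside the database. The following is a minimal in-memory sketch in plain JavaScript, not the pipeline itself: the function name `findDuplicatesWithin`, the `maxDays` parameter, and the sample documents are all invented for illustration, and only the field names `account_number`, `amount`, and `updated_at` come from the question.

```javascript
// In-memory sketch of the five steps. Sample data and names are hypothetical.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function findDuplicatesWithin(docs, maxDays = 7) {
  // Step 1: sort by account_number, amount, updated_at (ascending).
  const sorted = [...docs].sort(
    (a, b) =>
      a.account_number.localeCompare(b.account_number) ||
      a.amount - b.amount ||
      a.updated_at - b.updated_at
  );

  // Step 2: group by account_number + amount, keeping the sorted order.
  const groups = new Map();
  for (const doc of sorted) {
    const key = JSON.stringify([doc.account_number, doc.amount]);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc);
  }

  const results = [];
  for (const data of groups.values()) {
    data.forEach((current, i) => {
      // Step 3: compare item i only with the later items i+1 ... n.
      // Step 4: count the later items that are at most maxDays away.
      const count = data
        .slice(i + 1)
        .filter(
          (later) =>
            (later.updated_at - current.updated_at) / MS_PER_DAY <= maxDays
        ).length;

      // Step 5: skip items that have no duplicate inside the window.
      if (count > 0) {
        results.push({
          _id: {
            account_number: current.account_number,
            amount: current.amount,
            updated_at: current.updated_at,
          },
          count,
        });
      }
    });
  }
  return results;
}

// Hypothetical sample: two A1 payments 2 days apart, a third 19 days later.
const sample = [
  { account_number: "A1", amount: 100, updated_at: new Date("2019-01-20T00:00:00Z") },
  { account_number: "A1", amount: 100, updated_at: new Date("2019-01-01T00:00:00Z") },
  { account_number: "B2", amount: 100, updated_at: new Date("2019-01-02T00:00:00Z") },
  { account_number: "A1", amount: 100, updated_at: new Date("2019-01-03T00:00:00Z") },
];

const duplicates = findDuplicatesWithin(sample);
console.log(JSON.stringify(duplicates, null, 2));
```

With this sample, only the A1 payment dated 2019-01-01 is reported, with a count of 1, because the 2019-01-03 payment is the only later one inside its 7-day window.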





db.transactiondetails.aggregate([
  {
    $sort: {
      account_number: 1,
      amount: 1,
      updated_at: 1
    }
  },
  {
    $group: {
      "_id": {
        "account_number": "$account_number",
        "amount": "$amount"
      },
      "data": {
        $push: "$$ROOT"
      },
      "tmp": {
        $push: "$$ROOT"
      }
    }
  },
  {
    $unwind: "$tmp"
  },
  {
    $project: {
      _id: {
        account_number: "$_id.account_number",
        amount: "$_id.amount",
        updated_at: "$tmp.updated_at"
      },
      data: {
        $map: {
          input: {
            $slice: [
              "$data",
              {
                $add: [
                  {
                    $indexOfArray: [
                      "$data",
                      "$tmp"
                    ]
                  },
                  1
                ]
              },
              {
                $size: "$data"
              }
            ]
          },
          in: {
            "_id": "$$this._id",
            "account_number": "$$this.account_number",
            "amount": "$$this.amount",
            "created_at": "$$this.created_at",
            "updated_at": "$$this.updated_at",
            "days": {
              $divide: [
                {
                  $subtract: [
                    "$$this.updated_at",
                    "$tmp.updated_at"
                  ]
                },
                {
                  $multiply: [
                    24,
                    60,
                    60,
                    1000
                  ]
                }
              ]
            }
          }
        }
      }
    }
  },
  {
    $project: {
      count: {
        $size: {
          $filter: {
            input: "$data",
            cond: {
              $lte: [
                "$$this.days",
                7
              ]
            }
          }
        }
      }
    }
  },
  {
    $match: {
      "count": {
        $gt: 0
      }
    }
  }
])
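The `days` field in the pipeline relies on the fact that subtracting two dates yields milliseconds, which are then divided by 24 * 60 * 60 * 1000. The same arithmetic in plain JavaScript is sketched below; the helper name `daysBetween` is hypothetical, and the two timestamps are borrowed from the question's sample document purely as convenient dates (the pipeline itself compares updated_at values of different documents).

```javascript
// Milliseconds in one day, matching the $multiply: [24, 60, 60, 1000] stage.
const MILLIS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: fractional number of days from `earlier` to `later`.
function daysBetween(earlier, later) {
  // Subtracting two Date objects gives the difference in milliseconds.
  return (later - earlier) / MILLIS_PER_DAY;
}

const first = new Date("2019-01-07T15:40:53.683Z");
const second = new Date("2019-01-09T06:48:44.839Z");

const days = daysBetween(first, second);
console.log(days, days <= 7);
```

Because the result is fractional, the `$lte: 7` condition means "at most 168 hours apart" rather than "on one of the next 7 calendar dates". If calendar days are what is needed, the dates would have to be truncated to midnight before subtracting.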

