如何仅查询组中具有最新时间戳的文档? [英] How to query only documents with the latest timestamp from a group?

查看:72
本文介绍了如何仅查询组中具有最新时间戳的文档?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在我查询的MongoDB集合中,每个文档在特定时间代表一个项目.更新文档时,将使用相同的项目ID和新的时间戳创建一个新文档.所有项目都有唯一的项目ID.

In the MongoDB collection I'm querying, each document represents an item at a specific time. When updating a document, a new document is created with the same item id and a new timestamp. All items have unique item ids.

为说明起见,请考虑以下示例.我们从一个项目的一个修订版本开始:

To illustrate, consider this example. We start with one revision of an item:

{
    _id: x,
    itemId: 123,
    createdOn: ISODate("2013-01-30T11:16:20.102Z"),
    field1: "foo",
    field2: "bar
}

更新后,我们对该项目进行了两次修订,具有相同的 itemId 和不同的时间戳.

After an update, we have two revisions of the item, with the same itemId and different timestamps.

[{
  _id: x,
  itemId: 123,
  createdOn: ISODate("2013-01-30T11:16:20.102Z"),
  field1: "foo",
  field2: "bar"
},
{
  _id: y,
  itemId: 123,
  createdOn: ISODate("2014-02-09T14:26:20.102Z"),
  field1: "baz",
  field2: "fiz"
}]

如何找到最新版本中满足特定查询的所有项目?

我目前(错误的)方法是先找到匹配的文档,然后按时间戳排序,再按 itemId 分组,然后从组中的第一个文档返回值:

My current (wrong) approach is to first find the matching documents, then sort by timestamp, group them by itemId, and return the values from the first document in the group:

ItemModel.aggregate({ $match: { field1: "foo"} }).sort({createdOn: -1}).group(
    {
        _id: '$itemId', // grouping key
        createdOn: {$first: '$createdOn'},
        field1: {$first: '$field1'},
        field2: {$first: '$field2'}
    }).exec(...);

这是错误的,因为它与项目的旧修订版匹配.仅项目的最新修订版本应匹配.在上面的示例中,此方法返回项目"123",而正确的结果是空结果集.

This is wrong because it matches old revisions of items. Only the latest revisions of items should match. In the example above, this approach returns item "123", while the correct result is an empty result set.

推荐答案

在聚合管道中可以完成所有操作时,您正在此处混合使用几种方法.否则,只是按照正确的顺序进行操作即可.

You are mixing a few methods here when you can be doing everything in the aggregation pipeline. Otherwise it's just a matter of getting your steps in the right order:

db.collection.aggregate([
    {$sort: { createdOn: -1 }},
    {$group: { _id: "$itemId", 
        createdOn: {$first: "$createdOn"},
        field1: {$first: "$field1" },
        field2: {$first: "$field2" }
    }},
    {$match: { field1: "foo" }}
])

因此,请首先对最新文档进行排序.在itemId上分组(将为$ first保留订单),然后根据需要使用$ match进行过滤.但是您分组的文档将是最新的文档.

So sort first for newest documents. Group on the itemId ( order will be maintained for $first ), and then filter with $match if you must. But your grouped documents will be latest ones.

这篇关于如何仅查询组中具有最新时间戳的文档?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆