How to query only documents with the latest timestamp from a group?

Problem description

In the MongoDB collection I'm querying, each document represents an item at a specific time. When updating a document, a new document is created with the same item id and a new timestamp. All items have unique item ids.

To illustrate, consider this example. We start with one revision of an item:

{
    _id: x,
    itemId: 123,
    createdOn: ISODate("2013-01-30T11:16:20.102Z"),
    field1: "foo",
    field2: "bar"
}

After an update, we have two revisions of the item, with the same itemId and different timestamps.

[{
  _id: x,
  itemId: 123,
  createdOn: ISODate("2013-01-30T11:16:20.102Z"),
  field1: "foo",
  field2: "bar"
},
{
  _id: y,
  itemId: 123,
  createdOn: ISODate("2014-02-09T14:26:20.102Z"),
  field1: "baz",
  field2: "fiz"
}]

How can I find all items whose latest revision satisfies a given query?

My current (wrong) approach is to first find the matching documents, then sort by timestamp, group them by itemId, and return the values from the first document in the group:

ItemModel.aggregate({ $match: { field1: "foo"} }).sort({createdOn: -1}).group(
    {
        _id: '$itemId', // grouping key
        createdOn: {$first: '$createdOn'},
        field1: {$first: '$field1'},
        field2: {$first: '$field2'}
    }).exec(...);

This is wrong because it matches old revisions of items. Only the latest revisions of items should match. In the example above, this approach returns item "123", while the correct result is an empty result set.
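To make the failure concrete, here is a minimal sketch that replays the match, sort, group order on the two example revisions in plain JavaScript, with no database involved. The `revisions` array and the `firstPerItem` helper are illustrative stand-ins (assumptions, not Mongoose or MongoDB APIs) for the collection and for `$group` with `$first`.

```javascript
// In-memory stand-in for the collection: the two revisions of item 123.
const revisions = [
  { _id: "x", itemId: 123, createdOn: new Date("2013-01-30T11:16:20.102Z"), field1: "foo", field2: "bar" },
  { _id: "y", itemId: 123, createdOn: new Date("2014-02-09T14:26:20.102Z"), field1: "baz", field2: "fiz" },
];

// Hypothetical helper mimicking $group with $first:
// keeps the first document seen for each itemId.
function firstPerItem(docs) {
  const seen = new Map();
  for (const doc of docs) {
    if (!seen.has(doc.itemId)) seen.set(doc.itemId, doc);
  }
  return [...seen.values()];
}

// Same order as the query above: match, then sort newest first, then group.
const matched = revisions.filter((doc) => doc.field1 === "foo");
const sorted = [...matched].sort((a, b) => b.createdOn - a.createdOn);
const wrongResult = firstPerItem(sorted);

// The newer revision "y" was filtered out before grouping,
// so the stale revision "x" is what comes back.
console.log(wrongResult.map((doc) => doc._id));
```

Because the filter runs first, the grouping step never sees revision `y` and cannot know that `x` is outdated.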

Recommended answer

You are mixing a few methods here when you can be doing everything in the aggregation pipeline. Otherwise it's just a matter of getting your steps in the right order:

db.collection.aggregate([
    {$sort: { createdOn: -1 }},
    {$group: { _id: "$itemId", 
        createdOn: {$first: "$createdOn"},
        field1: {$first: "$field1" },
        field2: {$first: "$field2" }
    }},
    {$match: { field1: "foo" }}
])

So sort first for newest documents. Group on the itemId ( order will be maintained for $first ), and then filter with $match if you must. But your grouped documents will be latest ones.
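The effect of that ordering can be checked without a database. The sketch below mirrors the sort, group, match steps on the example data in plain JavaScript; `latestMatching` is a hypothetical helper written for illustration, not a MongoDB or Mongoose function.

```javascript
// In-memory stand-in for the collection: the two revisions of item 123.
const revisions = [
  { _id: "x", itemId: 123, createdOn: new Date("2013-01-30T11:16:20.102Z"), field1: "foo", field2: "bar" },
  { _id: "y", itemId: 123, createdOn: new Date("2014-02-09T14:26:20.102Z"), field1: "baz", field2: "fiz" },
];

// Hypothetical helper mirroring the pipeline order:
// $sort (newest first), $group with $first, then $match.
function latestMatching(docs, predicate) {
  const newestFirst = [...docs].sort((a, b) => b.createdOn - a.createdOn);
  const latest = new Map();
  for (const doc of newestFirst) {
    // The first document seen per itemId is its newest revision.
    if (!latest.has(doc.itemId)) latest.set(doc.itemId, doc);
  }
  // Filter only after each item has been reduced to its latest revision.
  return [...latest.values()].filter(predicate);
}

const fooItems = latestMatching(revisions, (doc) => doc.field1 === "foo");
const bazItems = latestMatching(revisions, (doc) => doc.field1 === "baz");

console.log(fooItems.length);                // 0: the latest revision no longer has field1 "foo"
console.log(bazItems.map((doc) => doc._id)); // [ 'y' ]
```

Filtering on `"foo"` now yields an empty result, which is what the question asks for, while filtering on `"baz"` returns the current revision `y`.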
