Re-sort the result of a reduced view

Question

Consider the following document structure:

Thread:

 - doc_type   1
 - _id        
 - subject    (string)

Post:

 - doc_type   2
 - _id        
 - thread_id  (_id of Thread)
 - time       (milliseconds since 1970)
 - comment    (string)

I need the threads sorted by the last post on each thread, together with the latest 5 posts of each. I would like to avoid updating the thread document every time a new post is made, to eliminate the possibility of conflicts in a distributed environment across DB nodes. Besides, that would be working for the DB, where the DB should be working for you.

For simplicity, let's start with finding just the latest post. The 5 posts can be gathered the same way.

Now, I'm not sure I'm heading in the right direction. However, looking here, I found how to find the last post in a thread using a reduce function that relies on group-level to return the thread subject taken from doc-type 1 and the last post document taken from doc-type 2.

BTW, as opposed to the sample in the link, in my case a thread is always created with a first post (so, for example, the creation date of a Thread is the date of its first Post).

map:

function(doc){
  switch(doc.doc_type){
     case 1: emit([doc._id],doc); return;
     case 2: emit([doc.thread_id],doc); return;
  }
}

reduce: real-world keys are more compound, so it must be used with an appropriate group-level. I also ignore the re-reduce case here, just for simplicity's sake. You can find the full picture here:

function(keys, vals, rr){
   var result = { subject: null, lastPost: null, count: 0 };
   // I'll ignore the re-reduce case for simplicity
   vals.forEach(function(doc){
      switch(doc.doc_type){
         case 1:
            // thread document: carries the subject
            result.subject = doc.subject;
            return;
         case 2:
            // post document: keep the newest one; lastPost starts out as null
            if (!result.lastPost || result.lastPost.time < doc.time) result.lastPost = doc;
            result.count++;
            return;
      }
   });
   return result;
}
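
The re-reduce case that the question leaves out could be handled roughly as follows. This is only a sketch, not the code from the linked sample: the name `reduceThread` and the sample documents are hypothetical, and it assumes every value under a key is either a thread or post document (first pass) or a partial result of the same shape (re-reduce).

```javascript
// Sketch of a reduce that also merges partial results (rereduce === true).
function reduceThread(keys, values, rereduce) {
  var result = { subject: null, lastPost: null, count: 0 };

  // Keep whichever post has the larger time.
  function considerPost(post) {
    if (post && (!result.lastPost || result.lastPost.time < post.time)) {
      result.lastPost = post;
    }
  }

  values.forEach(function (v) {
    if (rereduce) {
      // v is a partial result produced by an earlier reduce call
      if (v.subject !== null) result.subject = v.subject;
      considerPost(v.lastPost);
      result.count += v.count;
    } else if (v.doc_type === 1) {
      result.subject = v.subject;
    } else if (v.doc_type === 2) {
      considerPost(v);
      result.count++;
    }
  });
  return result;
}

// Hypothetical sample data following the question's schema.
var thread = { _id: 't1', doc_type: 1, subject: 'First thread' };
var p1 = { _id: 'p1', doc_type: 2, thread_id: 't1', time: 100, comment: 'a' };
var p2 = { _id: 'p2', doc_type: 2, thread_id: 't1', time: 200, comment: 'b' };
var p3 = { _id: 'p3', doc_type: 2, thread_id: 't1', time: 300, comment: 'c' };

var partA = reduceThread(null, [thread, p1], false);
var partB = reduceThread(null, [p2, p3], false);
var merged = reduceThread(null, [partA, partB], true);
```

Merging the two partial results gives the same answer as reducing all four documents in one call, which is the property a re-reduce step has to preserve.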

But how do I page it afterwards, sorted by the latest-post date? Is there a way to feed doc-ids from the result of one query as the filter criteria of another (preferably in one round-trip)?

There is no limit to the number of posts in a thread, so I'm a little reluctant to rely on a list function here; since the page size can also vary, the last post might end up not showing at all.

Recommended answer

If you're only after the last post or the last five posts, there's a much simpler method. In fact, you can avoid the reducer completely.

If you add the time as the second part of the key, you can use a combination of endkey, descending, and limit to get the last N posts for a given thread_id.

Here's the map function I wrote, with some test data based on your schemas:

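function(doc) {
  // doc_type is the type field from the question's schema
  if (doc.doc_type) {
    if (doc.subject) {
      // thread document
      emit([doc._id, doc.time], doc.subject);
      // strings collate after numbers, so this row follows all posts of the thread
      emit([doc._id, 'Z'], doc.subject);
    } else {
      // post document: grouped under its thread, ordered by time
      emit([doc.thread_id, doc.time], {_id: doc._id});
    }
  }
}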

The odd-looking 'Z' key is there to let you get the subject from the "bottom" of the list of items: in a CouchDB view, strings collate after numbers, so that row sorts after all of the thread's posts.

The query parameters would look something like:

?endkey=["thread_id"]&descending=true&limit=6

The limit should be N+1, where N is the number of posts you'd like back. In the results you'll have the thread subject and the _id objects (or whatever you'd like) from the post documents.
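
To see what the N+1 rows look like, here is a small in-memory simulation, offered as a sketch rather than as CouchDB itself. The helper names (`mapDoc`, `compareKeys`, `queryDescending`) and the sample documents are hypothetical, and the key comparison is a simplified stand-in for CouchDB's collation that only covers the key shapes used here. One assumption goes beyond the query shown above: the simulation also takes a startkey of [thread_id, {}], because a descending scan begins at startkey and would otherwise start from the top of the whole view.

```javascript
// Map function adapted to run outside CouchDB: `emit` is passed in.
function mapDoc(doc, emit) {
  if (doc.doc_type === 1) {
    emit([doc._id, 'Z'], doc.subject);            // subject row, sorts after posts
  } else if (doc.doc_type === 2) {
    emit([doc.thread_id, doc.time], { _id: doc._id });
  }
}

// Simplified collation: numbers < strings < objects (enough for these keys).
function typeRank(v) {
  if (typeof v === 'number') return 1;
  if (typeof v === 'string') return 2;
  return 3;
}
function compareScalar(a, b) {
  var ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra - rb;
  if (ra === 3) return 0;
  return a < b ? -1 : a > b ? 1 : 0;
}
function compareKeys(a, b) {
  var n = Math.min(a.length, b.length);
  for (var i = 0; i < n; i++) {
    var c = compareScalar(a[i], b[i]);
    if (c !== 0) return c;
  }
  return a.length - b.length;                      // shorter array sorts first
}

// Emulates ?startkey=..&endkey=..&descending=true&limit=..
function queryDescending(docs, startkey, endkey, limit) {
  var rows = [];
  docs.forEach(function (doc) {
    mapDoc(doc, function (key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    });
  });
  rows.sort(function (x, y) { return compareKeys(y.key, x.key); });
  return rows.filter(function (r) {
    return compareKeys(r.key, startkey) <= 0 && compareKeys(r.key, endkey) >= 0;
  }).slice(0, limit);
}

var docs = [
  { _id: 't1', doc_type: 1, subject: 'First thread' },
  { _id: 't2', doc_type: 1, subject: 'Second thread' },
  { _id: 'p1', doc_type: 2, thread_id: 't1', time: 100, comment: 'a' },
  { _id: 'p2', doc_type: 2, thread_id: 't1', time: 200, comment: 'b' },
  { _id: 'p3', doc_type: 2, thread_id: 't1', time: 300, comment: 'c' },
  { _id: 'p4', doc_type: 2, thread_id: 't2', time: 250, comment: 'd' }
];

// N = 2 posts, so limit = N + 1 = 3.
var rows = queryDescending(docs, ['t1', {}], ['t1'], 3);
```

With this data, the first row carries the thread subject and the next two rows point at the two newest posts of thread t1, newest first.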

The _id objects are output in this example so that you can combine the view with include_docs=true if you want the full post. Emit only the pieces of the post document you actually want (title, etc.) to keep the overall index size low, and use include_docs in those places where you need the full contents of the document. However, if you always need the full post document, output it in the emit, as that will give you a faster response (at the cost of a larger index on disk).
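
View parameters such as keys have to be sent as URL-encoded JSON, which is easy to get wrong by hand. The helper below is a hypothetical sketch (the name `buildViewQuery` is not part of CouchDB) that builds the query string, including include_docs=true. The startkey entry is an assumption added here to bound the descending scan to one thread; it is not in the query shown above.

```javascript
// Builds a view query string; every value is JSON-encoded, then URL-encoded.
function buildViewQuery(params) {
  return '?' + Object.keys(params).map(function (name) {
    return encodeURIComponent(name) + '=' +
           encodeURIComponent(JSON.stringify(params[name]));
  }).join('&');
}

var threadId = 't1';        // hypothetical thread _id
var wantedPosts = 5;        // N

var query = buildViewQuery({
  startkey: [threadId, {}], // assumption: upper bound for the descending scan
  endkey: [threadId],
  descending: true,
  limit: wantedPosts + 1,   // N + 1: the subject row plus N posts
  include_docs: true        // fetch the documents referenced by {_id: ...}
});
```

The resulting string would be appended to the view's URL, whose design document and view names depend on your own database.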

Also, if you need a list of all threads sorted by last post as well as 5 posts per thread, you'd need to output keys like [time, thread_id, 'thread'] and [time, thread_id, 'post'] and use a _list function to collect the posts "under" each thread document, because sorting by time pushes a thread and its posts farther apart in the results. The _list function can then combine/find them again. However, doing two requests may still be easier/lighter.
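
One possible reading of the two-request route, sketched under assumptions: views are ordered by key, not by reduced value, so the client takes the rows of a group-level query on the question's reduce view, sorts them by the last post's time, and slices out a page; the posts of the threads on that page would then be fetched in a second request. The function name `pageThreadsByLastPost` and the sample rows are hypothetical.

```javascript
// rows: [{ key: [threadId], value: { subject, lastPost, count } }, ...]
// Returns one page of threads, newest last post first.
function pageThreadsByLastPost(rows, pageSize, pageIndex) {
  var sorted = rows.slice().sort(function (a, b) {
    return b.value.lastPost.time - a.value.lastPost.time;
  });
  var start = pageIndex * pageSize;
  return sorted.slice(start, start + pageSize);
}

// Hypothetical grouped reduce output for three threads.
var reducedRows = [
  { key: ['t1'], value: { subject: 'First',  lastPost: { _id: 'p3', time: 300 }, count: 3 } },
  { key: ['t2'], value: { subject: 'Second', lastPost: { _id: 'p9', time: 900 }, count: 1 } },
  { key: ['t3'], value: { subject: 'Third',  lastPost: { _id: 'p5', time: 500 }, count: 2 } }
];

var firstPage = pageThreadsByLastPost(reducedRows, 2, 0);
var secondPage = pageThreadsByLastPost(reducedRows, 2, 1);
```

The obvious cost is that every thread's reduced row has to be fetched before sorting, so this only suits a moderate number of threads.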
