CouchDB - re-sort a result of reduced view


Question

Consider the following document structure:

Thread:

 - doc_type   1
 - _id        
 - subject    (string)

Post:

 - doc_type   2
 - _id        
 - thread_id  (_id of Thread)
 - time       (milliseconds since 1970)
 - comment    (string)

I need the threads sorted by the last post in each thread, together with the latest 5 posts. I want to avoid updating the thread document every time a new post is made, in order to eliminate the possibility of conflicts in a distributed environment across DB nodes. Besides, that would be working for the DB, where the DB should be working for you.

For simplicity, let's just start with finding the latest post. The 5 posts can be gathered the same way.

Now, I'm not sure I'm headed in the right direction. However, looking here I found how to find the last post in a thread using a reduce function that uses a group level to return the thread subject taken from doc_type 1 and the last post document taken from doc_type 2.

BTW, as opposed to the sample in the link, in my case a thread is always created with a first post (so, for example, the creation date of a thread will be the date of its first post).

map:

function(doc){
  switch(doc.doc_type){
     case 1: emit([doc._id],doc); return;
     case 2: emit([doc.thread_id],doc); return;
  }
}

reduce: in the real world the keys are more compound, so it must be used with an appropriate group level. I also ignore the re-reduce case here, just for simplicity's sake. You can find the full picture here:

function(keys, vals, rr){
   var result = { subject: null, lastPost: null, count :0 };
   //I'll ignore the re-reduce case for simplicity
   vals.forEach(function(doc){
      switch(doc.doc_type){
         case 1: 
            result.subject = doc.subject; 
            return;
         case 2: 
            if (!result.lastPost || result.lastPost.time < doc.time) result.lastPost = doc; 
            result.count++;
            return;
      }
   });
   return result;
}
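
The reduce above skips the re-reduce case. A minimal sketch of how it could be handled follows, assuming the map function shown earlier (whole documents as values). On re-reduce, CouchDB passes the outputs of earlier reduce calls as the values, so partial results have to be merged instead of inspected as documents. The function is given the name `reduce` here only so the example can be called directly; in a design document it would be anonymous.

```javascript
// Sketch of a reduce function that also handles re-reduce.
// Assumes the map above, which emits whole documents as values.
function reduce(keys, vals, rereduce) {
  var result = { subject: null, lastPost: null, count: 0 };

  // Keep whichever post has the larger time stamp.
  function keepLatest(post) {
    if (post && (!result.lastPost || result.lastPost.time < post.time)) {
      result.lastPost = post;
    }
  }

  vals.forEach(function (val) {
    if (rereduce) {
      // val is a partial result produced by an earlier reduce call.
      if (val.subject !== null) result.subject = val.subject;
      keepLatest(val.lastPost);
      result.count += val.count;
    } else if (val.doc_type === 1) {
      result.subject = val.subject;
    } else if (val.doc_type === 2) {
      keepLatest(val);
      result.count++;
    }
  });
  return result;
}

// Two partial reductions over made-up sample documents, then a re-reduce.
var a = reduce(null, [{ doc_type: 1, subject: 'Hello' }, { doc_type: 2, time: 10 }], false);
var b = reduce(null, [{ doc_type: 2, time: 30 }, { doc_type: 2, time: 20 }], false);
var merged = reduce(null, [a, b], true);
```

Here `merged` carries the subject from the first batch, the post with time 30 as `lastPost`, and a count of 3.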

But how do I page it afterwards, sorted by the latest-post date? Is there a way to feed doc ids from the result of one query as the filter criteria of another (preferably in one round trip)?

There is no limit to the number of posts in a thread, so I'm a little reluctant to rely on a list function here, since the page size can also vary, which could result in the last post not showing at all...

Anyone?

Recommended answer

If you're only after the last post or the last five posts, there's a much simpler method. In fact, you can avoid the reducer completely.

If you add the time as the second part of the key, you can use a combination of endkey, descending, and limit to get the last N posts for a given thread_id.

Here's the map function I wrote with some test data based on your schemas:

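function(doc) {
  // doc_type is the discriminator field from the question's schema
  if (doc.doc_type) {
    if (doc.subject) {
      // Thread document: emit the subject under two keys
      emit([doc._id, doc.time], doc.subject);
      // 'Z' sorts after any numeric time, so this row ends the thread's range
      emit([doc._id, 'Z'], doc.subject);
    } else {
      // Post document: key by thread and time, keep the value small
      emit([doc.thread_id, doc.time], {_id: doc._id});
    }
  }
}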

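The strange extra row with the 'Z' key is there to let you get the subject from the "bottom" of the list of items. Strings collate after numbers in CouchDB views, so [thread_id, 'Z'] sorts after every [thread_id, time] row of the same thread.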
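
To see why the 'Z' row lands at the bottom of each thread, here is a small sketch of the ordering. The comparator is a simplified stand-in for CouchDB's view collation, covering only numbers and strings inside arrays (real collation also handles null, booleans, objects, and locale-aware string comparison); the thread ids and times are made up.

```javascript
// Simplified stand-in for CouchDB's view collation, covering only the
// cases used here: numbers sort before strings, arrays compare element-wise.
function rank(v) { return typeof v === 'number' ? 0 : 1; }

function compareKeys(a, b) {
  for (var i = 0; i < Math.min(a.length, b.length); i++) {
    var x = a[i], y = b[i];
    if (rank(x) !== rank(y)) return rank(x) - rank(y);
    if (x < y) return -1;
    if (x > y) return 1;
  }
  // A shorter array sorts before a longer one with the same prefix.
  return a.length - b.length;
}

var keys = [
  ['t1', 'Z'],
  ['t1', 1700000002000],
  ['t2', 1700000001000],
  ['t1', 1700000001000],
  ['t2', 'Z']
];
var sorted = keys.slice().sort(compareKeys);
// Within each thread, the time-keyed rows come first and the 'Z' row last.
```

Read in descending order, each thread's range therefore starts with its subject row, followed by its posts from newest to oldest.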

The query parameters would look something like:

?endkey=["thread_id"]&descending=true&limit=6

The limit should be N+1, where N is the number of posts you'd like back. In the results you'll have the thread subject and the _id objects (or whatever you'd like) from the post documents.
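
A sketch of building that query string from code follows. Keys have to be JSON-encoded and then URL-encoded. The database, design document, and view names in the path are hypothetical. The sketch also adds a startkey of [thread_id, {}], which the query above does not have: this is an assumption on my part that the view holds more than one thread, in which case a descending scan without a startkey would begin at the end of the whole index rather than at this thread's 'Z' row.

```javascript
// Build the query string for the last N posts of one thread.
function lastPostsQuery(threadId, n) {
  var params = {
    // Assumption: start at the top of this thread's key range so that
    // rows from other threads are not returned first.
    startkey: JSON.stringify([threadId, {}]),
    endkey: JSON.stringify([threadId]),
    descending: 'true',
    limit: String(n + 1) // N posts plus the subject row
  };
  return Object.keys(params).map(function (name) {
    return name + '=' + encodeURIComponent(params[name]);
  }).join('&');
}

// The view path is hypothetical; substitute your own names.
var url = '/forum/_design/threads/_view/by_thread?' + lastPostsQuery('abc', 5);
```

An empty object collates after every number and string, so [thread_id, {}] is a convenient upper bound for one thread's rows.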

The _id objects are output in this example so you can use the view with include_docs=true if you want the full post. Emit only the other post fields you actually need (title, etc.) to keep the overall index size low, and use include_docs in those places where you need the full contents of the document. However, if you always need the full post document, emit it as the value, since that gives a faster response (at the cost of a larger index on disk).

Also, if you need a list of all threads sorted by last post as well as 5 posts per thread, you'd need to emit keys like [time, thread_id, 'thread'] and [time, thread_id, 'post'] and use a _list function to collect the posts "under" each thread document, since sorting by time will place threads and their posts far apart in the results. The _list function can then combine them again. However, doing two requests may still be easier and lighter.
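
The collecting step might look roughly like this. It is written as a plain function over already-fetched rows rather than as an actual _list function, and it assumes post rows keyed [time, thread_id, 'post'] read in descending order; the row ids and times are made up, and thread subjects would still have to be looked up separately.

```javascript
// Group post rows (already sorted newest first) into threads ordered by
// their latest post, keeping at most perThread posts for each thread.
function groupLatest(rows, perThread) {
  var byThread = {};
  var ordered = [];
  rows.forEach(function (row) {
    var threadId = row.key[1];
    var entry = byThread[threadId];
    if (!entry) {
      // First sighting is the newest post, so it fixes the thread order.
      entry = byThread[threadId] = { thread_id: threadId, posts: [] };
      ordered.push(entry);
    }
    if (entry.posts.length < perThread) entry.posts.push(row.id);
  });
  return ordered;
}

var rows = [
  { id: 'p4', key: [400, 't2', 'post'] },
  { id: 'p3', key: [300, 't1', 'post'] },
  { id: 'p2', key: [200, 't2', 'post'] },
  { id: 'p1', key: [100, 't1', 'post'] }
];
var threads = groupLatest(rows, 1);
// threads lists t2 first (latest post p4), then t1 (latest post p3).
```

Note that the scan may have to read many post rows before every thread on a page has been seen, which is the cost the question worries about and one reason two requests can end up lighter.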
