Chained map/reduce in CouchDB

Question

In CouchDB, I have a set of items like the following (simplified for the sake of example):

{_id: 1, date: "Jul 1", user: "user1"}
{_id: 2, date: "Jul 2", user: "user1"}
{_id: 3, date: "Jul 3", user: "user2"}
...etc...

I'd like to get a list of "most recent activity", sorted by date, with no duplicate user _ids. I can create a view with results like so:

{key: "July 3", _id: 3, user: "user2"}
{key: "July 2", _id: 2, user: "user1"}
{key: "July 1", _id: 1, user: "user1"}

but this contains duplicate entries for the same user. Or I can create a view that maps {key: user, value: date} and reduces to

{key: "user1", mostRecentDate: "July 2"}
{key: "user2", mostRecentDate: "July 3"}

but that isn't sorted by most recent date.

I know that the obvious solution, reducing over the results of another view, isn't supported. BigCouch supports chained map/reduce, but appears to be rather out of date and unsupported (last release 2012).

This seems like a rather common problem - what are some existing solutions (beyond "switch databases")?

Recommended answer

Here is a general idea of how you can do chained map/reduce with CouchDB 1.x. What we want is the ability to pass the results of one map/reduce to another.


  1. Subscribe to the _changes feed filtered by the view. This gives you the list of docs that the map function actually emits.

  2. Call the view for these filtered docs. This is simple because a view accepts a list of keys, so we pass the keys and get back just the subset of view results we need.

  3. Push this result either into a separate database or into the same one. Bulk inserts make this faster. If you use a separate database you can even reuse the _ids from the view results, which makes the bulk updates a lot easier.

  4. Within this database, define another view that sorts the results based on value.

{key: "user1", mostRecentDate: "July 2"}
{key: "user2", mostRecentDate: "July 3"}
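The first three steps can be sketched roughly as follows. This is a minimal in-memory sketch under stated assumptions, not a definitive implementation: `chain_once`, `query_view`, and `bulk_save` are hypothetical names, the two callables stand in for the HTTP calls to the first view and to the target database, and the sketch assumes the first view is keyed on each doc's `user` field and that the change rows carry the docs themselves.

```python
from typing import Callable, Dict, Iterable, List


def chain_once(
    changes: Iterable[Dict],
    query_view: Callable[[List[str]], List[Dict]],
    bulk_save: Callable[[List[Dict]], None],
) -> List[Dict]:
    """Run one pass of the chain for a batch of change rows."""
    # Step 1: collect the distinct view keys touched by the changed docs
    # (assumption: the first view emits doc["user"] as its key).
    keys = sorted(
        {c["doc"]["user"] for c in changes if "user" in c.get("doc", {})}
    )
    if not keys:
        return []
    # Step 2: fetch only the grouped view rows for those keys.
    rows = query_view(keys)
    # Step 3: reuse each view key as the _id of the second-stage doc.
    docs = [{"_id": r["key"], "mostRecentDate": r["value"]} for r in rows]
    bulk_save(docs)
    return docs
```

Note that when a second-stage doc already exists in the target database, CouchDB needs its current `_rev` to accept the update, so a real `bulk_save` would have to look that up first.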

Since you have already gotten to this step, all you need to do is create a view on mostRecentDate in the second database, and you will get user activity sorted by date.
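A sketch of what that second view could look like, assuming the second-stage docs carry the user as `_id` plus a `mostRecentDate` field, and that dates are stored in a sortable form such as ISO 8601 (strings like "July 2" would not sort chronologically). The design doc name `activity` and the view name `by_date` are made up for illustration, and `rows_by_date` only emulates in Python what querying such a view with `descending=true` would return.

```python
# Design document for the second database. The map function is JavaScript,
# as CouchDB expects, and is stored as a string.
BY_DATE_DESIGN_DOC = {
    "_id": "_design/activity",  # hypothetical name
    "views": {
        "by_date": {  # hypothetical name
            "map": "function (doc) {"
                   " if (doc.mostRecentDate) { emit(doc.mostRecentDate, null); }"
                   " }"
        }
    },
}


def rows_by_date(docs, descending=True):
    """Emulate the view: one row per doc, ordered by mostRecentDate."""
    rows = [
        {"id": d["_id"], "key": d["mostRecentDate"], "value": None}
        for d in docs
        if d.get("mostRecentDate")  # mirrors the guard in the map function
    ]
    return sorted(rows, key=lambda r: r["key"], reverse=descending)
```

Because the first stage already left one doc per user, each user appears once in the output, newest first.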

I hope you are using a dummy reduce though: one that returns null and is only used for group=true.
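To illustrate what a dummy reduce does, here is a small Python emulation of querying a view with `group=true`: the reduce returns null, so the result is simply one row per distinct key. The JavaScript string shows what such a reduce could look like in a design doc; `group_rows` is a hypothetical helper for illustration, not CouchDB's implementation.

```python
from itertools import groupby

# A reduce that ignores its input; stored as a string in the design doc.
DUMMY_REDUCE_JS = "function (keys, values, rereduce) { return null; }"


def group_rows(rows, reduce_fn=lambda values: None):
    """Emulate group=true: collapse rows with equal keys into a single row."""
    ordered = sorted(rows, key=lambda r: r["key"])
    return [
        {"key": key, "value": reduce_fn([r["value"] for r in group])}
        for key, group in groupby(ordered, key=lambda r: r["key"])
    ]
```

Passing `max` as `reduce_fn` instead emulates the reduce from the question, which keeps the most recent date per user.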

Using a list function when preparing the bulk insert can make your life easier. Because bulk updates require the list of docs to be in the form {"docs":[....]}, a list function lets you produce it in one go.
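For comparison, the same `{"docs": [...]}` body can be assembled on the client. A minimal sketch, assuming grouped view rows of the form `{key, value}`; `bulk_docs_body` and its optional `revs` mapping are hypothetical names, and a list function would produce equivalent JSON on the server instead.

```python
import json


def bulk_docs_body(rows, revs=None):
    """Build a JSON body for _bulk_docs from grouped view rows."""
    revs = revs or {}
    docs = []
    for row in rows:
        doc = {"_id": row["key"], "mostRecentDate": row["value"]}
        # Include _rev only for docs known to exist already in the target.
        if row["key"] in revs:
            doc["_rev"] = revs[row["key"]]
        docs.append(doc)
    return json.dumps({"docs": docs})
```

The resulting string is what would be sent as the body of a POST to the target database's `_bulk_docs` endpoint.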
