Using map reduce in CouchDB to output fewer rows

Question

Let's say you have two document types, customers and orders. A customer document contains basic information such as name and address, and an order document contains all the order information for each time a customer orders something. Each document is stored with type = order or type = customer.

If I do a map function over a set of 10 customers and 30 orders it will output 40 rows. Some rows will be customers, some will be orders.

The question is, how do I write the reduce so that the order information is "stuffed" inside the rows that have the customer information? It would then return 10 rows (10 customers), each with all the relevant orders for that customer.

Basically, I don't want separate records in the output; I want to combine them (orders into one customer row). Is reduce the way to do that?

Recommended answer

This is called view collation and it is a very useful CouchDB technique.

Fortunately, you don't even need a reduce step. Just use map to get the customers and their orders "clumped" together.

The key is that you need a unique id for each customer, and it has to be known both from customer docs and from order docs.

Example customer:

{ "_id": "customer me@example.com"
, "type": "customer"
, "name": "Jason"
}

Example order:

{ "_id": "abcdef123456"
, "type": "order"
, "for_customer": "customer me@example.com"
}

I have conveniently used the customer ID as the document _id but the important thing is that both docs know the customer's identity.

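The goal is a view query where, given the customer ID "customer me@example.com", you get back (1) first, the customer info, and (2) any and all orders placed. Because the keys emitted by the map function are arrays, an exact ?key="customer me@example.com" will not match them; query a key range instead, for example ?startkey=["customer me@example.com"]&endkey=["customer me@example.com",{}].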

This map function would do that:

function(doc) {
  var CUSTOMER_VAL = 1;
  var ORDER_VAL    = 2;
  var key;

  if(doc.type === "customer") {
    key = [doc._id, CUSTOMER_VAL];
    emit(key, doc);
  }

  if(doc.type === "order") {
    key = [doc.for_customer, ORDER_VAL];
    emit(key, doc);
  }
}
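
To see what this map function emits without a running CouchDB server, it can be exercised locally. The sketch below is only an illustration: `runMap`, `compareKeys`, and the sample documents are hypothetical and not part of CouchDB, `emit` is passed in as a parameter rather than provided by the server, and the comparison is a simplified stand-in for CouchDB's collation that handles only the [string, number] keys used here.

```javascript
// The map function from the answer, with emit passed in so it can run outside CouchDB.
function mapFn(doc, emit) {
  var CUSTOMER_VAL = 1;
  var ORDER_VAL = 2;
  if (doc.type === "customer") {
    emit([doc._id, CUSTOMER_VAL], doc);
  }
  if (doc.type === "order") {
    emit([doc.for_customer, ORDER_VAL], doc);
  }
}

// Simplified stand-in for view collation: customer ID first, then the tiebreaker.
function compareKeys(a, b) {
  if (a[0] !== b[0]) return a[0] < b[0] ? -1 : 1;
  return a[1] - b[1];
}

// Hypothetical helper: run the map over every doc and return rows in key order.
// Rows with equal keys are ordered by document ID, as CouchDB does.
function runMap(docs) {
  var rows = [];
  docs.forEach(function (doc) {
    mapFn(doc, function (key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    });
  });
  return rows.sort(function (r1, r2) {
    var byKey = compareKeys(r1.key, r2.key);
    if (byKey !== 0) return byKey;
    return r1.id < r2.id ? -1 : r1.id > r2.id ? 1 : 0;
  });
}

var sampleDocs = [
  { _id: "fedcba654321", type: "order", for_customer: "customer me@example.com" },
  { _id: "customer me@example.com", type: "customer", name: "Jason" },
  { _id: "abcdef123456", type: "order", for_customer: "customer me@example.com" }
];

var rows = runMap(sampleDocs);
rows.forEach(function (row) {
  console.log(JSON.stringify(row.key), row.id);
});
```

Whatever order the documents are stored in, the customer row comes out first and its orders follow.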

All rows will sort primarily on the customer the document is about, and the "tiebreaker" sort is either the integer 1 or 2. That makes customer docs always sort above their corresponding order docs.

["customer me@example.com", 1], ...customer doc...
["customer me@example.com", 2], ...customer's order...
["customer me@example.com", 2], ...customer's other order.
... etc...
["customer another@customer.com", 1], ... different customer...
["customer another@customer.com", 2], ... different customer's order
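
The question asked for one row per customer with the orders "stuffed" inside. Since the view already returns each customer immediately followed by its orders, one way to get that shape is to fold the rows on the client in a single pass. This is a minimal sketch, not something CouchDB provides: `groupByCustomer` and the sample rows are hypothetical, and the rows are assumed to look like the entries of the `rows` array in a view response, in view order.

```javascript
// Fold collated view rows into one object per customer.
// Assumes rows arrive in view order: each customer row (tiebreaker 1)
// immediately followed by that customer's order rows (tiebreaker 2).
function groupByCustomer(rows) {
  var result = [];
  var current = null;
  rows.forEach(function (row) {
    var customerId = row.key[0];
    var isCustomer = row.key[1] === 1;
    if (current === null || current.id !== customerId) {
      // New customer ID: start a new group.
      // "customer" stays null if only orders exist for this ID.
      current = { id: customerId, customer: null, orders: [] };
      result.push(current);
    }
    if (isCustomer) {
      current.customer = row.value;
    } else {
      current.orders.push(row.value);
    }
  });
  return result;
}

// Sample rows shaped like the "rows" array of a view response.
var viewRows = [
  { key: ["customer another@customer.com", 1], value: { type: "customer", name: "Other" } },
  { key: ["customer another@customer.com", 2], value: { type: "order", _id: "o3" } },
  { key: ["customer me@example.com", 1], value: { type: "customer", name: "Jason" } },
  { key: ["customer me@example.com", 2], value: { type: "order", _id: "o1" } },
  { key: ["customer me@example.com", 2], value: { type: "order", _id: "o2" } }
];

var grouped = groupByCustomer(viewRows);
console.log(JSON.stringify(grouped, null, 2));
```

With 10 customers and 30 orders, the view would still return 40 rows, and this fold would produce 10 grouped objects.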

P.S. If you follow all that: instead of 1 and 2, a better value might be null for the customer and the order timestamp for each order. Since null sorts before any timestamp, the rows sort the same way as before, except that each customer's orders now come out in chronological order.
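
The null-and-timestamp variant could look like the sketch below. It rests on assumptions: the `created_at` field is hypothetical (the example order document does not define a timestamp), timestamps are assumed to be ISO 8601 UTC strings so that string order matches chronological order, and `compareTiebreak` is a simplified stand-in for CouchDB's collation, in which null sorts before numbers and strings.

```javascript
// Variant of the map function: null for the customer row, a timestamp for orders.
// "created_at" is a hypothetical field; the original documents do not define one.
function mapByTime(doc, emit) {
  if (doc.type === "customer") {
    emit([doc._id, null], doc);
  }
  if (doc.type === "order") {
    emit([doc.for_customer, doc.created_at], doc);
  }
}

// Simplified comparison of the second key element: null first, then timestamps ascending.
function compareTiebreak(a, b) {
  if (a === null && b === null) return 0;
  if (a === null) return -1;
  if (b === null) return 1;
  return a < b ? -1 : a > b ? 1 : 0;
}

var timeDocs = [
  { _id: "order-b", type: "order", for_customer: "customer me@example.com",
    created_at: "2011-03-02T09:30:00Z" },
  { _id: "customer me@example.com", type: "customer", name: "Jason" },
  { _id: "order-a", type: "order", for_customer: "customer me@example.com",
    created_at: "2011-01-15T12:00:00Z" }
];

// Collect the emitted rows, then sort them the way the view would for these keys.
var timeRows = [];
timeDocs.forEach(function (doc) {
  mapByTime(doc, function (key, value) {
    timeRows.push({ key: key, value: value });
  });
});
timeRows.sort(function (r1, r2) {
  if (r1.key[0] !== r2.key[0]) return r1.key[0] < r2.key[0] ? -1 : 1;
  return compareTiebreak(r1.key[1], r2.key[1]);
});

timeRows.forEach(function (row) {
  console.log(JSON.stringify(row.key));
});
```

The customer row still leads its group, and the orders follow from oldest to newest.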
