Using map reduce in CouchDB to output fewer rows

Question

Let's say you have two document types, customers and orders. A customer document contains basic information like name, address, etc., and an order document contains all the order information each time a customer orders something. When the documents are stored, each has either type = order or type = customer.

If I run a map function over a set of 10 customers and 30 orders, it will output 40 rows. Some rows will be customers, some will be orders.

The question is, how do I write the reduce so that the order information is "stuffed" inside the rows that have the customer information? It would then return 10 rows (10 customers), each with all the relevant orders for that customer.

Basically I don't want separate records in the output; I want to combine them (orders into one customer row), and I think reduce is the way?

Recommended answer

This is called view collation and it is a very useful CouchDB technique.

Fortunately, you don't even need a reduce step. Just use map to get the customers and their orders "clumped" together.

The key is that you need a unique ID for each customer, and it has to be known both from customer docs and from order docs.

Example customer:

{ "_id": "customer me@example.com"
, "type": "customer"
, "name": "Jason"
}

Example order:

{ "_id": "abcdef123456"
, "type": "order"
, "for_customer": "customer me@example.com"
}

I have conveniently used the customer ID as the document _id, but the important thing is that both docs know the customer's identity.

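The goal is a map query where, given the customer ID "customer me@example.com", you get back (1) first, the customer info, and (2) any and all orders placed. Because the map function below emits array keys, a plain ?key="customer me@example.com" will not match them; query a key range instead, such as ?startkey=["customer me@example.com"]&endkey=["customer me@example.com",{}].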

This map function would do that:

function(doc) {
  var CUSTOMER_VAL = 1;
  var ORDER_VAL    = 2;
  var key;

  if(doc.type === "customer") {
    key = [doc._id, CUSTOMER_VAL];
    emit(key, doc);
  }

  if(doc.type === "order") {
    key = [doc.for_customer, ORDER_VAL];
    emit(key, doc);
  }
}
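
To see what this map function emits without a running CouchDB, it can be exercised locally. The following is only a sketch: `runMap` and the stubbed global `emit` are hypothetical test helpers, not CouchDB APIs, and the harness does not sort the rows the way a view would.

```javascript
// Local harness (not part of CouchDB): collects whatever the map function emits.
function runMap(mapFn, docs) {
  const rows = [];
  // Inside CouchDB, emit() is supplied by the view server; here it is stubbed.
  globalThis.emit = (key, value) => rows.push({ key, value });
  docs.forEach((doc) => mapFn(doc));
  delete globalThis.emit;
  return rows;
}

// Same logic as the map function above, given a name so it can be called.
function map(doc) {
  var CUSTOMER_VAL = 1;
  var ORDER_VAL = 2;
  if (doc.type === "customer") emit([doc._id, CUSTOMER_VAL], doc);
  if (doc.type === "order") emit([doc.for_customer, ORDER_VAL], doc);
}

// The two example documents from this answer, deliberately out of order.
const sampleDocs = [
  { _id: "abcdef123456", type: "order", for_customer: "customer me@example.com" },
  { _id: "customer me@example.com", type: "customer", name: "Jason" },
];

const emitted = runMap(map, sampleDocs);
console.log(JSON.stringify(emitted.map((row) => row.key)));
```

Both documents end up under keys that start with the same customer ID, which is what lets the view sort them next to each other.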

All rows sort primarily on the customer the document is about, and the "tiebreaker" is the integer 1 or 2. That makes each customer doc always sort above its corresponding order docs.

["customer me@example.com", 1], ...customer doc...
["customer me@example.com", 2], ...customer's order...
["customer me@example.com", 2], ...customer's other order.
... etc...
["customer another@customer.com", 1], ... different customer...
["customer another@customer.com", 2], ... different customer's order
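
If you still want one object per customer (the 10 rows asked for in the question), the sorted rows can be folded together on the client. This is a minimal sketch, not part of the original answer: `customerRange` and `groupRows` are hypothetical helper names, and the rows are assumed to arrive in view order with the `{ key, value }` shape CouchDB returns.

```javascript
// Query string for one customer's block of rows. Keys are arrays, so a range
// is used: [id] sorts before [id, 1], and {} sorts after any number or string.
function customerRange(customerId) {
  const startkey = encodeURIComponent(JSON.stringify([customerId]));
  const endkey = encodeURIComponent(JSON.stringify([customerId, {}]));
  return `?startkey=${startkey}&endkey=${endkey}`;
}

// Fold sorted view rows into one entry per customer.
// Relies on the customer row (tiebreaker 1) arriving before its order rows.
function groupRows(rows) {
  const customers = [];
  let current = null;
  for (const row of rows) {
    const [customerId, kind] = row.key;
    if (kind === 1) {
      current = { id: customerId, customer: row.value, orders: [] };
      customers.push(current);
    } else if (current && current.id === customerId) {
      current.orders.push(row.value);
    }
    // Orders whose customer doc is missing are skipped in this sketch.
  }
  return customers;
}

// Rows shaped like the listing above (values shortened for illustration).
const sampleRows = [
  { key: ["customer another@customer.com", 1], value: { name: "Other" } },
  { key: ["customer another@customer.com", 2], value: { _id: "o3" } },
  { key: ["customer me@example.com", 1], value: { name: "Jason" } },
  { key: ["customer me@example.com", 2], value: { _id: "o1" } },
  { key: ["customer me@example.com", 2], value: { _id: "o2" } },
];

const grouped = groupRows(sampleRows);
console.log(grouped.map((c) => `${c.id}: ${c.orders.length} order(s)`));
console.log(customerRange("customer me@example.com"));
```

The fold is a single pass and holds only the current customer, so it can also be applied to rows as they stream in.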

P.S. If you follow all that: instead of 1 and 2, a better value might be null for the customer and the order timestamp for each order. The rows will sort the same way as before (null collates before numbers and strings), except now you have a chronological list of orders.
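
A sketch of that variant, under stated assumptions: the `created_at` field (epoch milliseconds) is hypothetical, since the example order document does not define a timestamp, and `compareKeys` is only a simplified local stand-in for CouchDB's collation that handles just these two key shapes.

```javascript
// Variant map function: null for the customer row, a timestamp for order rows.
// In a design document this would be stored as an anonymous function.
const mapByTime = function (doc) {
  if (doc.type === "customer") emit([doc._id, null], doc);
  // created_at is an assumed field name, not part of the original documents.
  if (doc.type === "order") emit([doc.for_customer, doc.created_at], doc);
};

// Simplified stand-in for view collation: customer ID first, then null
// before any number, then ascending timestamp.
function compareKeys(a, b) {
  if (a[0] !== b[0]) return a[0] < b[0] ? -1 : 1;
  if (a[1] === null) return b[1] === null ? 0 : -1;
  if (b[1] === null) return 1;
  return a[1] - b[1];
}

// Stub emit() locally, run the map over sample docs, then sort like a view.
const timeRows = [];
globalThis.emit = (key, value) => timeRows.push({ key, value });
[
  { _id: "o2", type: "order", for_customer: "customer me@example.com", created_at: 1700000500000 },
  { _id: "customer me@example.com", type: "customer", name: "Jason" },
  { _id: "o1", type: "order", for_customer: "customer me@example.com", created_at: 1700000000000 },
].forEach(mapByTime);
delete globalThis.emit;

timeRows.sort((a, b) => compareKeys(a.key, b.key));
console.log(timeRows.map((row) => row.value._id));
```

With this key shape, the same startkey/endkey range on the customer ID still returns the customer first, followed by that customer's orders from oldest to newest.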
