Is it advisable to use MapReduce to 'flatten' irregular entities in CouchDB?


Question

In a previous question of mine on CouchDB (Can you implement document joins using CouchDB 2.0 'Mango'?), the answer suggested creating domain objects instead of storing relational data in Couch.

My use case, however, is not necessarily to store relational data in Couch but to flatten relational data. For example, I have an Invoice entity that I collect from several suppliers, so I have two different schemas for that entity.

So I might end up with two docs in Couch that look like this:

{
    "type": "Invoice",
    "subType": "supplier B",
    "total": 22.5,
    "date": "10 Jan 2017",
    "customerName": "me"
}

{
    "type": "Invoice",
    "subType": "supplier A",
    "InvoiceTotal": 10.2,
    "OrderDate": <some other date format>,
    "customerName": "me"
}

I also have a document like this:

{
    "type": "Customer",
    "name": "me",
    "details": "etc..."
}

My intention then is to 'flatten' the Invoice entities and then join them in the reduce function. So the map function looks like this:

function (doc) {
    switch (doc.type) {
        case 'Customer':
            // Customer docs keep the name in `name`, so use it as the join key.
            emit(doc.name, { /* doc information ... */ type: 'Customer' });
            break;
        case 'Invoice':
            // Map each supplier's field names onto one common shape.
            switch (doc.subType) {
                case 'supplier B':
                    emit(doc.customerName, { total: doc.total, date: doc.date, type: 'Invoice' });
                    break;
                case 'supplier A':
                    emit(doc.customerName, { total: doc.InvoiceTotal, date: doc.OrderDate, type: 'Invoice' });
                    break;
            }
            break;
    }
}

Then I would use the reduce function to compare docs with the same customerName (i.e. a join).

Is this advisable using CouchDB? If not, why?

Recommended answer

It is totally fine to "normalize" your different schemas (or subTypes) via a view. CouchDB cannot build a view on top of another view's output, though, so you cannot create further views based on those normalized schemas, and in the long run it might be hard to manage the different schemas.

The better solution might be to normalize the documents before writing them to CouchDB. If you still need the documents in their original schema, you can add a sub-property original where you store each document in its original form. This would make working with the data much easier:

{
  "type": "Invoice",
  "total": 22.5,
  "date": "2017-01-10T00:00:00.000Z",
  "customerName": "me",
  "original": {
    "subType": "supplier B",
    "total": 22.5,
    "date": "10 Jan 2017",
    "customerName": "me"
  }
}

{
  "type": "Invoice",
  "total": 10.2,
  "date": "2017-01-12T00:00:00.000Z",
  "customerName": "me",
  "original": {
    "subType": "supplier A",
    "InvoiceTotal": 10.2,
    "OrderDate": <some other date format>,
    "customerName": "me"
  }
}
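
The normalization step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: normalizeInvoice and parseDayMonthYear are hypothetical helper names, and because supplier A's date format is not given in the question, the date parser is passed in by the caller instead of being guessed.

```javascript
const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// Hypothetical parser for supplier B's "10 Jan 2017" style, read as midnight UTC.
function parseDayMonthYear(text) {
  const [day, mon, year] = text.trim().split(/\s+/);
  const month = MONTHS.indexOf(mon);
  if (month === -1) {
    throw new Error('Unrecognized month in date: ' + text);
  }
  return new Date(Date.UTC(Number(year), month, Number(day)));
}

// Hypothetical helper: maps a raw supplier invoice onto the common schema.
// parseDate is supplied by the caller and must return a Date for that
// supplier's date string.
function normalizeInvoice(raw, parseDate) {
  let total;
  let rawDate;
  switch (raw.subType) {
    case 'supplier A':
      total = raw.InvoiceTotal;
      rawDate = raw.OrderDate;
      break;
    case 'supplier B':
      total = raw.total;
      rawDate = raw.date;
      break;
    default:
      throw new Error('Unknown invoice subType: ' + raw.subType);
  }
  return {
    type: 'Invoice',
    total: total,
    date: parseDate(rawDate).toISOString(),
    customerName: raw.customerName,
    original: raw // keep the supplier's document untouched
  };
}
```

Each supplier adds one case and one date parser, and every view written afterwards only has to deal with the common fields total, date, and customerName.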

I'd also convert the date to ISO format because it parses well with new Date(), sorts correctly, and is human-readable. With that, you can easily emit invoices grouped by year, month, day, and so on.
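
As a rough sketch of the grouping idea, a map function over the normalized documents could emit a [year, month, day] key. The function name mapInvoiceByDate and the local emit stand-in are assumptions made so the example runs outside CouchDB; inside a view, CouchDB provides emit itself and the function would be stored anonymously. The map function body sticks to ES5 syntax, on the assumption that the server's JavaScript engine may be an older one.

```javascript
// Stand-in for CouchDB's emit() so the sketch can run outside the database.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Hypothetical map function for a date-keyed view over normalized invoices.
function mapInvoiceByDate(doc) {
  if (doc.type !== 'Invoice') {
    return;
  }
  var d = new Date(doc.date);
  // UTC getters keep the key independent of the server's time zone.
  emit([d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate()], doc.total);
}

mapInvoiceByDate({ type: 'Invoice', total: 22.5, date: '2017-01-10T00:00:00.000Z', customerName: 'me' });
mapInvoiceByDate({ type: 'Invoice', total: 10.2, date: '2017-01-12T00:00:00.000Z', customerName: 'me' });
mapInvoiceByDate({ type: 'Customer', name: 'me' }); // ignored: not an invoice
```

Paired with the built-in _sum reduce, querying such a view with group_level=1, 2, or 3 would return totals per year, per month, or per day.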

Preferably use reduce only with the built-in functions (such as _sum, _count, and _stats), because reduces have to be re-executed at query time, and running JavaScript over many documents is a complex and time-intensive operation, even if the database has not changed at all. You can find more information about the reduce process in the CouchDB documentation. It makes more sense to preprocess the documents as much as you can before storing them in CouchDB.
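
For illustration, a design document that relies on a built-in reduce might look like the sketch below. The design document id _design/invoices and the view name totalByCustomer are made-up names, and the documents are assumed to be already normalized so that every invoice has total and customerName.

```javascript
// Hypothetical design document. CouchDB stores map functions as strings,
// so the function is serialized with toString() before upload.
const designDoc = {
  _id: '_design/invoices',
  views: {
    totalByCustomer: {
      map: function (doc) {
        if (doc.type === 'Invoice') {
          emit(doc.customerName, doc.total);
        }
      }.toString(),
      // Built-in reducer: no custom JavaScript reduce function is needed.
      reduce: '_sum'
    }
  }
};
```

Querying this view with group=true would then return one summed total per customerName, without any hand-written reduce code.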
