CouchDB - Filtered Replication - Can the speed be improved?


Question

I have a single database (300 MB, 42,924 documents) containing about 20 different kinds of documents from about 200 users. The documents range in size from a few bytes to many kilobytes (150 KB or so).

When the server is not under load, a replication using the following filter function takes about 2.5 minutes to complete. When the server is under load, it takes more than 10 minutes.

Can anyone comment on whether these times are expected, and if not, suggest how I might optimize things in order to get better performance?

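function(doc, req) {
    // Documents without a date always pass the date check.
    var acceptedDate = true;
    if (doc.date) {
        // doc.date is indexed as [year, month, day]; the cutoff arrives as the
        // query parameters year, month and day. Both dates are built from the
        // same three components with the time of day at midnight, so equal
        // dates compare as equal.
        var dateKey = doc.date;
        var docDate = new Date(dateKey[0], dateKey[1], dateKey[2]);
        var reqDate = new Date(req.query.year, req.query.month, req.query.day);

        acceptedDate = docDate.getTime() >= reqDate.getTime();
    }

    // Replicate only this user's documents, and never design documents.
    return doc.user_id && doc.user_id == req.query.userid && doc._id.indexOf("_design") != 0 && acceptedDate;
}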
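
For context, the values the filter reads from req.query come from the query_params field of the replication request. Below is a minimal sketch of such a request body; the database names, the design document name "app", the filter name "by_user_and_date", and the parameter values are hypothetical placeholders, not taken from the question.

```javascript
// Build the JSON body for a POST to CouchDB's /_replicate endpoint.
// All names and values here are illustrative assumptions.
function buildReplicationRequest(userId, year, month, day) {
    return {
        source: "source_db",               // hypothetical source database
        target: "target_db",               // hypothetical target database
        filter: "app/by_user_and_date",    // "<design doc name>/<filter name>"
        query_params: {                    // exposed to the filter as req.query
            userid: userId,
            year: year,
            month: month,
            day: day
        }
    };
}

var body = buildReplicationRequest("user-17", 2012, 5, 1);
console.log(JSON.stringify(body, null, 2));
```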


Recommended answer

Filtered replication is slow because, for each fetched document, CouchDB goes through a multi-step process to decide whether to replicate it:


  1. CouchDB fetches the next document;
  2. Because a filter function has to be applied, the document is converted to JSON;
  3. The JSON-encoded document is passed through stdio to the query server;
  4. The query server receives the document and decodes it from JSON;
  5. The query server then looks up and runs your filter function, which returns true or false to CouchDB;
  6. If the result is true, the document is replicated;
  7. Go back to step 1 and repeat for all documents.

For a non-filtered replication, take this list, drop steps 2-5, and let step 6 always have a true result. This overhead slows down the whole replication process.
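
The per-document round trip described above can be modeled roughly as follows. This is only a simulation written for illustration, not CouchDB's actual code: filteredPass, unfilteredPass, and the sample documents are hypothetical, and the stdio transfer in step 3 is not modeled at all.

```javascript
// Simulates the extra work done for every document in a filtered replication.
function filteredPass(docs, filterFn, req) {
    var accepted = [];
    for (var i = 0; i < docs.length; i++) {
        var wire = JSON.stringify(docs[i]);   // step 2: encode the document as JSON
        // step 3: transfer over stdio (not modeled here)
        var decoded = JSON.parse(wire);       // step 4: decode inside the query server
        if (filterFn(decoded, req)) {         // step 5: run the filter function
            accepted.push(docs[i]);           // step 6: replicate when the result is true
        }
    }
    return accepted;
}

// Without a filter, steps 2-5 disappear and every document is accepted.
function unfilteredPass(docs) {
    return docs.slice();
}

var docs = [
    { _id: "a", user_id: "u1" },
    { _id: "b", user_id: "u2" },
    { _id: "c", user_id: "u1" }
];
var req = { query: { userid: "u1" } };
var byUser = function (doc, req) { return doc.user_id === req.query.userid; };

var result = filteredPass(docs, byUser, req);
console.log(result.map(function (d) { return d._id; }));
```

Even in this simplified model, the filtered pass performs one encode and one decode per document that the unfiltered pass skips entirely.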

To significantly improve filtered replication speed, you can write the filter in Erlang and run it on the Erlang native query server. Erlang filters run inside CouchDB, do not pass through any stdio interface, and incur no JSON decode/encode overhead.

Note that the Erlang query server does not run in a sandbox the way the JavaScript one does, so you need to really trust the code you run with it.

Another option is to optimize your filter function, e.g. by reducing object creation and method calls, but in practice you would not gain much from this.
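
As a sketch of what such an optimization might look like, the filter below does the cheap user check first and compares the date components as plain numbers instead of creating Date objects. It assumes that doc.date holds valid, in-range [year, month, day] values and that the query parameters may arrive as strings; the variable name filter is only for illustration.

```javascript
var filter = function (doc, req) {
    var q = req.query;
    // Cheapest checks first: wrong user or design document means no replication.
    if (!doc.user_id || doc.user_id != q.userid) return false;
    if (doc._id.indexOf("_design") === 0) return false;
    // Documents without a date pass the date check.
    if (!doc.date) return true;
    var d = doc.date;
    // Compare year, then month, then day; the first non-zero difference decides.
    var diff = (d[0] - Number(q.year)) ||
               (d[1] - Number(q.month)) ||
               (d[2] - Number(q.day));
    return diff >= 0;
};

var req = { query: { userid: "u1", year: "2012", month: "5", day: "1" } };
console.log(filter({ _id: "x", user_id: "u1", date: [2012, 5, 1] }, req));
console.log(filter({ _id: "y", user_id: "u1", date: [2012, 4, 30] }, req));
```

Unlike the Date-based version, this comparison does not normalize out-of-range components (for example a month value of 12 rolling over into the next year). As the answer says, the gain from this kind of change is likely to be small next to the JSON and stdio overhead.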
