CouchDB - Filtered Replication - Can the speed be improved?
Question
I have a single database (300 MB, 42,924 documents) consisting of about 20 different kinds of documents from about 200 users. The documents range in size from a few bytes to many kilobytes (150 KB or so).
When the server is unloaded, the following replication filter function takes about 2.5 minutes to complete. When the server is loaded, it takes >10 minutes.
Can anyone comment on whether these times are expected, and if not, suggest how I might optimize things in order to get better performance?
    function(doc, req) {
        var acceptedDate = true;
        if(doc.date) {
            var docDate = new Date();
            var dateKey = doc.date;
            docDate.setFullYear(dateKey[0], dateKey[1], dateKey[2]);
            var reqYear = req.query.year;
            var reqMonth = req.query.month;
            var reqDay = req.query.day;
            var reqDate = new Date();
            reqDate.setFullYear(reqYear, reqMonth, reqDay);
            acceptedDate = docDate.getTime() >= reqDate.getTime();
        }
        return doc.user_id && doc.user_id == req.query.userid && doc._id.indexOf("_design") != 0 && acceptedDate;
    }
Recommended answer
Filtered replication is slow because, for each fetched document, a fairly involved sequence runs to decide whether to replicate it:
1. CouchDB fetches the next document;
2. Because the filter function has to be applied, the document is converted to JSON;
3. The JSON-encoded document is passed through stdio to the query server;
4. The query server receives the document and decodes it from JSON;
5. The query server looks up and runs your filter function, which returns true or false to CouchDB;
6. If the result is true, the document is replicated;
7. Go to step 1 and loop over all documents.
For non-filtered replication, take this list, drop steps 2-5, and let step 6 always have a true result. The extra steps are the overhead that slows down the whole replication process.
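The difference between the two paths can be mimicked with a toy model. This is only a sketch, not CouchDB's actual code: the function names and sample documents are hypothetical, and the stdio hop is represented by a comment rather than a real process boundary.

```javascript
// Toy model only: this is not CouchDB's actual code. It mimics the extra
// encode/decode work that a JavaScript filter adds for every document.

// Filtered path: every document makes the JSON round trip.
function replicateFiltered(docs, filterFn, req) {
  var replicated = [];
  for (var i = 0; i < docs.length; i++) {   // fetch the next document
    var wire = JSON.stringify(docs[i]);     // convert the document to JSON
    // In CouchDB, `wire` would cross stdio to the query server here.
    var decoded = JSON.parse(wire);         // query server decodes the JSON
    if (filterFn(decoded, req)) {           // query server runs the filter
      replicated.push(docs[i]);             // true means "replicate it"
    }
  }
  return replicated;
}

// Unfiltered path: no encoding, no decoding, no filter call;
// every document is treated as if the filter had returned true.
function replicateUnfiltered(docs) {
  return docs.slice();
}

// Hypothetical sample data for illustration.
var sampleDocs = [
  { _id: "a", user_id: "u1" },
  { _id: "b", user_id: "u2" }
];
var byUser = function (doc, req) {
  return doc.user_id == req.query.userid;
};

console.log(replicateFiltered(sampleDocs, byUser, { query: { userid: "u1" } }).length);
console.log(replicateUnfiltered(sampleDocs).length);
```

In the model, the filtered path pays for one `JSON.stringify` and one `JSON.parse` per document regardless of what the filter decides, which is why a cheaper filter body alone changes little.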
To significantly improve filtered replication speed, you can write the filter in Erlang and run it via the native Erlang query server. Erlang filters run inside CouchDB, do not pass through any stdio interface, and incur no JSON decode/encode overhead.
Note that the Erlang query server does not run in a sandbox the way the JavaScript one does, so you need to really trust the code you run with it.
Another option is to optimize your filter function, e.g. by reducing object creation and method calls, but in practice you would not gain much from this.
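As a rough illustration of that second option, here is one way the filter from the question might be trimmed. It is a sketch under assumptions, not a drop-in replacement: it assumes `doc.date` is always a valid `[year, month, day]` triple, compares the parts numerically instead of building `Date` objects (so out-of-range values such as day 31 in a 30-day month are not normalized the way `setFullYear` would), and is given the name `filter` here only so it can be called directly. In a design document it would be stored as an anonymous function.

```javascript
// Sketch of a leaner filter. Assumes doc.date is a valid [year, month, day]
// array and that userid/year/month/day arrive as query parameters (strings).
function filter(doc, req) {
  var q = req.query;

  // Cheap rejections first: wrong user or a design document.
  if (!doc.user_id || doc.user_id != q.userid) return false;
  if (doc._id.indexOf("_design") === 0) return false;

  // Documents without a date are accepted, as in the original filter.
  if (!doc.date) return true;

  // Compare year, then month, then day; subtraction coerces the
  // query-string values to numbers. No Date objects are created.
  var d = doc.date;
  var diff = (d[0] - q.year) || (d[1] - q.month) || (d[2] - q.day);
  return diff >= 0;
}

// Hypothetical request and document for illustration.
var req = { query: { userid: "u1", year: "2012", month: "5", day: "10" } };
console.log(filter({ _id: "x", user_id: "u1", date: [2012, 5, 10] }, req));
```

Avoiding the two `new Date()` calls also sidesteps a subtle issue in the original: both dates inherit the current time of day, so two equal calendar dates can differ by a millisecond. Even so, the per-document JSON and stdio cost remains, which is why the gain is expected to be small.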