Does SolrCloud's scalability extend to indexing?


Question

In all the literature I've seen, the scalability of SolrCloud appears to concern querying only. That is, replication and sharding distribute the load of client queries across more CPU and wider bandwidth.

What about indexing?

Does SolrCloud's scalability improve indexing performance? Can it be configured to speed up indexing time? In my case, we need to commit new content to the index frequently; does that special case change anything?

Mark Miller's presentation from Lucene Revolution 2012 is fascinating and covers some details of indexing. But it seems that certain cloud features (like replication) could conceivably make indexing slower, not faster. Has anyone tried SolrCloud?

Recommended answer

Well, I was finally able to set up a proper cloud environment for testing and, in brief, indexing speed is doomed even with RAMDirectory. I don't know whether the indexing speed is related to the number of followers in the cloud or to the number of collections, but a 1-leader, 2-follower structure with 8 collections makes indexing 4 to 5 times slower. On a standalone instance I can index around 3.5M docs in 17 minutes, while with the same config for each instance in the cloud I can only index 650K docs in 17 minutes... I am not sure how to speed up SolrCloud indexing, and I am somewhat surprised to see my expectations about the cloud destroyed one by one as I keep running into new bugs and problems while working with it.
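To put the two figures side by side, here is a minimal sketch of the throughput arithmetic in Python. The document counts and the 17-minute window come from the answer; treating the 3.5M run as the standalone baseline is an assumption based on the comparison the answer draws, and the function name is only illustrative.

```python
# Figures quoted in the answer; the "standalone" label for the 3.5M run
# is an assumption based on the comparison the answer makes.
MINUTES = 17
STANDALONE_DOCS = 3_500_000
CLOUD_DOCS = 650_000


def docs_per_second(docs: int, minutes: int) -> float:
    """Average indexing throughput over the whole run."""
    return docs / (minutes * 60)


standalone_rate = docs_per_second(STANDALONE_DOCS, MINUTES)
cloud_rate = docs_per_second(CLOUD_DOCS, MINUTES)
slowdown = standalone_rate / cloud_rate

print(f"standalone: {standalone_rate:.0f} docs/s")  # standalone: 3431 docs/s
print(f"cloud:      {cloud_rate:.0f} docs/s")       # cloud:      637 docs/s
print(f"slowdown:   {slowdown:.1f}x")               # slowdown:   5.4x
```

Taken at face value, the quoted numbers work out to a slowdown of about 5.4x, slightly above the "4 to 5 times" estimate given in the answer.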

If this happens with other setups too, I don't understand the point of using the cloud for Solr. I mean, if indexing slows down this much, I can reindex everything on a classic standalone Solr instance much faster.

It would be really nice to hear about other experiences with SolrCloud, if anyone has tried it or runs it in a real environment.
