Solr multicore vs sharding vs 1 big collection


Question

I currently have a single collection with 40 million documents and an index size of 25 GB. The collection is updated every n minutes, and as a result the number of deleted documents is constantly growing. The data in the collection is an amalgamation of records from more than 1,000 customers. Each customer has around 100,000 documents on average.

That said, I'm trying to get a handle on the growing number of deleted documents. Because the index keeps growing, both disk space and memory are being used up, and I would like to reduce the index to a manageable size.

I have been thinking of splitting the data into multiple cores, one per customer. This would let me manage the smaller collections easily and create or update them quickly. My concern is that the number of collections might become an issue. Any suggestions on how to address this problem?

Solr: 4.9
Index size: 25 GB
Max doc: 40 million
Doc count: 29 million
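
The gap between Max doc and Doc count is the number of deleted documents still held in the index. A minimal sketch of that arithmetic with the figures above; the function name is hypothetical, and the inputs are treated as exact counts even though the question gives rounded ones:

```python
from typing import Tuple


def deleted_doc_stats(max_doc: int, num_docs: int) -> Tuple[int, float]:
    """Return (deleted document count, deleted fraction of max_doc)."""
    if max_doc <= 0 or num_docs < 0 or num_docs > max_doc:
        raise ValueError("expected 0 <= num_docs <= max_doc and max_doc > 0")
    deleted = max_doc - num_docs
    return deleted, deleted / max_doc


# Figures from the question: Max doc 40 million, Doc count 29 million.
deleted, fraction = deleted_doc_stats(40_000_000, 29_000_000)
print(f"{deleted:,} deleted documents ({fraction:.1%} of max doc)")
```

With these numbers, roughly 11 million of the 40 million index entries are deleted documents.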

Thanks

Recommended answer

I had a similar issue, with multiple customers and a large index.

I implemented this on version 3.4 by creating a separate core for each customer.

That is, one core per customer. Creating a core per customer partitions the index and splits the data, much as we do when sharding.
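
A minimal sketch of how one core per customer might be created through Solr's CoreAdmin API (`action=CREATE`). The base URL, the `customer_<id>` naming scheme, and the use of the core name as `instanceDir` are assumptions for illustration, not details from this answer, and the instance directory is assumed to already hold a valid configuration. The code only builds the request URLs and sends nothing:

```python
from urllib.parse import urlencode

# Assumed Solr location; adjust to the real host and port.
SOLR_BASE = "http://localhost:8983/solr"


def core_name(customer_id: int) -> str:
    """Hypothetical naming scheme: one core per customer id."""
    return f"customer_{customer_id}"


def create_core_url(customer_id: int, base: str = SOLR_BASE) -> str:
    """Build a CoreAdmin CREATE request URL for one customer's core."""
    name = core_name(customer_id)
    params = {"action": "CREATE", "name": name, "instanceDir": name}
    return f"{base}/admin/cores?{urlencode(params)}"


for customer_id in (1, 2, 3):
    print(create_core_url(customer_id))
```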

Here you are splitting the large index into several smaller indexes.

Each search then runs against a smaller index, so the response time should be faster.

I have almost 700 cores created as of now, and it is running fine for me.

So far I have not faced any issues with managing the cores.

I would suggest going with a combination of cores and sharding.
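
One way to combine the two is Solr's distributed search, where a request sent to one core carries a `shards` parameter listing the cores to query. This is a sketch under assumptions: the host, the core names, and the idea of spreading one large customer over two cores are hypothetical, and all listed cores are assumed to share a compatible schema. The code only builds the URL:

```python
from typing import Sequence
from urllib.parse import urlencode


def shards_param(host: str, cores: Sequence[str]) -> str:
    """Comma-separated shard list in host:port/solr/core form (no scheme)."""
    return ",".join(f"{host}/solr/{core}" for core in cores)


def sharded_select_url(host: str, cores: Sequence[str], query: str) -> str:
    """Build a select URL on the first core that fans out to all cores."""
    if not cores:
        raise ValueError("at least one core is required")
    params = {"q": query, "shards": shards_param(host, cores)}
    return f"http://{host}/solr/{cores[0]}/select?{urlencode(params)}"


# Hypothetical: a large customer split across two cores.
print(sharded_select_url("localhost:8983", ["customer_7_a", "customer_7_b"], "status:open"))
```

Small customers would still be queried on their single core without the `shards` parameter.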

This will help you achieve the following:

It allows each core to have its own configuration and behavior, without affecting the other cores.

You can perform actions such as update, load, etc. on each core independently.
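
Because each core has its own endpoints, an update or a reload can target a single customer's core without touching the rest. A minimal sketch, assuming a hypothetical base URL and core name, and reading "load" as a CoreAdmin reload, which is an interpretation rather than something the answer states. The code only builds URLs:

```python
from urllib.parse import urlencode

# Assumed Solr location; adjust to the real host and port.
SOLR_BASE = "http://localhost:8983/solr"


def update_url(core: str, commit: bool = True, base: str = SOLR_BASE) -> str:
    """URL of one core's update handler, optionally committing."""
    params = {"commit": "true" if commit else "false"}
    return f"{base}/{core}/update?{urlencode(params)}"


def reload_url(core: str, base: str = SOLR_BASE) -> str:
    """CoreAdmin RELOAD URL for one core, e.g. after a config change."""
    params = {"action": "RELOAD", "core": core}
    return f"{base}/admin/cores?{urlencode(params)}"


print(update_url("customer_42"))
print(reload_url("customer_42"))
```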
