Solr cloud sharding


Question

Currently I have a ZooKeeper instance controlling replication across 3 servers. It is the ZooKeeper embedded in Solr. It works well in my web-based application.

I have a new requirement that calls for sharding in the cloud, and I am not sure how to implement it. Basically, I want to separate the data that only I can update (shard 1) from the data that users can update (shard 2). From time to time I will completely replace the data directory in shard 1, but I don't want to disturb the user-created data in shard 2.

Shard 1 does not need replication, since I can copy the new data to each server whenever I choose to update it. Shard 2, however, does need replication.

Currently I run the following command on the server running ZooKeeper:

java -Dbootstrap_confdir=solr/myApp/conf -Dcollection.configName=myConfig -DzkRun -DnumShards=1 -jar start.jar

And the following command on the other 2 non-ZooKeeper servers:

java -Djetty-port=8983 -DzkHost=129.**.30.11:9983 -jar start.jar&

This creates a single-shard Solr instance on each of the 3 servers.

I think I just need to add 1 static shard to this configuration, but I am not sure of the sequence of commands needed to accomplish it.

Many thanks.

Recommended answer

First, you are using ZooKeeper to maintain your shards and leaders/replicas. So if you want one shard with two instances and another shard with only a leader, you will have to modify your command as follows:

1) Provide -DnumShards=2 so that ZooKeeper knows you need two shards.

2) Specify the -DzkHost parameter for this first Solr instance as well.

java -Dbootstrap_confdir=solr/myApp/conf -Dcollection.configName=myConfig -DzkRun -DnumShards=2 -DzkHost=** -jar start.jar

When you do this, you will see some errors on the console because shard2 has not been created yet. Now start your other two servers. You should see shard1 with two servers (a leader and a replica), while shard2 will have only one instance, i.e. the leader.

If you want separate indexes and independent control over them, you will have to create two collections instead of two shards.
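As a rough illustration of the two-collection approach, the sketch below builds the request URLs that the Solr Collections API (available in Solr 4.x and later) accepts for its CREATE action. It only constructs the URLs and does not send them. The base URL, the collection names static_data and user_data, and the replication values are hypothetical placeholders, and the sketch assumes the config set myConfig from the question has already been uploaded to ZooKeeper.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, config_name,
                          num_shards=1, replication_factor=1):
    """Build a Collections API CREATE URL (the request is not sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical host and collection names, for illustration only.
BASE_URL = "http://localhost:8983/solr"

# Data maintained only by the administrator: one shard, no extra replicas.
static_url = create_collection_url(BASE_URL, "static_data", "myConfig")

# User-updatable data: one shard replicated across the 3 servers.
user_url = create_collection_url(BASE_URL, "user_data", "myConfig",
                                 replication_factor=3)

print(static_url)
print(user_url)
```

Each URL could then be requested once against any running node, for example with a browser or curl. Because the two collections have separate indexes, replacing the data behind one does not touch the other.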

Explanation

You have 3 servers, right? So when you start SolrCloud using ZooKeeper, the following will happen:

1) Start the first Solr server along with ZooKeeper, and you will get 1 shard for the Solr cloud, shard1.

2) Start the second Solr server and point it to ZooKeeper. Since you have declared -DnumShards=2, ZooKeeper will see that 1 more shard is needed, so it creates shard2 for your collection. At this point the admin console will show 1 collection with 2 shards.

3) Now start your 3rd server and point it to ZooKeeper. ZooKeeper sees that both shards already exist, so it creates a replica for shard1 instead of a new shard.

So the layout will look like:

collection ---> shard1 ---> server1, server3
           ---> shard2 ---> server2
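The start-up sequence described above can be mimicked with a small toy model. This is not Solr's actual assignment code, just a sketch of the behaviour the answer describes: each node that joins goes to the shard with the fewest members, and ties go to the lowest-numbered shard. The function name assign_nodes and the server names are made up for illustration, and the first node in each list stands in for the leader.

```python
def assign_nodes(nodes, num_shards):
    """Toy model of shard assignment as nodes start one after another.

    Each node joins the shard that currently has the fewest members;
    on a tie, the lowest-numbered shard wins.
    """
    shards = {f"shard{i}": [] for i in range(1, num_shards + 1)}
    for node in nodes:
        # min() returns the first smallest entry, and the dict keeps
        # shard1, shard2, ... in insertion order, so ties favour shard1.
        target = min(shards, key=lambda name: len(shards[name]))
        shards[target].append(node)
    return shards


layout = assign_nodes(["server1", "server2", "server3"], num_shards=2)
for shard, members in layout.items():
    print(shard, "->", ", ".join(members))
# shard1 -> server1, server3
# shard2 -> server2
```

With 3 servers and 2 shards, the model reproduces the layout from the answer: shard1 ends up with a leader and a replica, and shard2 has only a leader.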
