Is there any loss of functionality if I use a load balancer that does not communicate with ZooKeeper in SolrCloud?

Question

In a SolrCloud setup, there are 8 Solr nodes and 3 ZooKeeper nodes. A single load balancer receives all indexing and search requests and distributes them across the 8 Solr nodes. Before sending a request to a particular Solr node, it first checks whether that node's service endpoint is active, and it forwards the request only if it is. ZooKeeper handles leader election within each shard, but in this setup it plays no part in query distribution. Is this setup bad for distributed queries? What other functionality offered by SolrCloud is lost because the load balancer does the work of query distribution?

Please note that the load balancer is necessary because different clients (Java, Ruby, JavaScript) access the Solr service, and only SolrJ can communicate with ZooKeeper, through its CloudSolrServer class. The load balancer also makes it possible to scale the ZooKeeper nodes without changing any settings on the client side.

Recommended answer

The SolrJ CloudSolrClient has a couple of advantages:

  1. Node autodiscovery: It always knows which nodes are in the cluster, using the same ZooKeeper mechanism that the SolrCloud cluster itself uses.

  2. Query-specific routing: Although any request can go to any node in the SolrCloud cluster, many of these requests will simply be proxied to the node that should actually handle them.

     2a. Indexing requests are routed directly to the leader of the shard that handles the document's id. For a bulk-insert request, this can mean several sub-requests, farming out batches of documents directly to each appropriate shard.

     2b. Queries to a collection are routed to a node that hosts a shard from that collection.

The CloudSolrClient already knows the cluster layout and routes each request directly, avoiding the proxy hop within the cluster.
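To make the proxy hop concrete, here is a small simulation, offered as a rough sketch rather than a model of Solr itself. It assumes a hypothetical layout of 8 nodes and 4 shard leaders, and it uses a plain MD5-modulo hash as a stand-in for Solr's real document router, which is not reproduced here. All node names and helper functions are made up for illustration.

```python
import hashlib
from itertools import cycle

# Hypothetical layout: 8 nodes, 4 shards, one leader per shard.
NODES = [f"solr{i}" for i in range(1, 9)]
SHARD_LEADERS = {0: "solr1", 1: "solr3", 2: "solr5", 3: "solr7"}


def shard_for(doc_id: str) -> int:
    """Map a document id to a shard number.

    Assumption: MD5 modulo the shard count stands in for Solr's own
    document router, which works differently.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % len(SHARD_LEADERS)


def round_robin_picker():
    """Load balancer behaviour: pick the next node regardless of the document."""
    nodes = cycle(NODES)
    return lambda doc_id: next(nodes)


def leader_aware_picker():
    """Cluster-aware client behaviour: pick the leader of the document's shard."""
    return lambda doc_id: SHARD_LEADERS[shard_for(doc_id)]


def count_forwards(doc_ids, pick_node) -> int:
    """Count indexing requests that reach a non-leader and need an internal forward."""
    forwards = 0
    for doc_id in doc_ids:
        leader = SHARD_LEADERS[shard_for(doc_id)]
        if pick_node(doc_id) != leader:
            forwards += 1
    return forwards


doc_ids = [f"doc-{n}" for n in range(1000)]
lb_forwards = count_forwards(doc_ids, round_robin_picker())
direct_forwards = count_forwards(doc_ids, leader_aware_picker())
print(f"forwards via load balancer: {lb_forwards}")
print(f"forwards via leader-aware client: {direct_forwards}")
```

In this toy layout half of the nodes never lead a shard, so at least half of the round-robin requests need a forward, while the leader-aware picker needs none. The counts only illustrate where the extra hop comes from; they say nothing about how expensive that hop is on a real cluster.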

All that said, the internal routing requests are pretty lightweight. Going through the load balancer will add some latency to the requests, increase internal network bandwidth, and add the tiniest bit of CPU usage to the SolrCloud cluster.

So what I'm saying is that if it's too difficult to reproduce these advantages, Solr will handle things, and you'll probably get by just fine without them.
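If a non-Java client did want to reproduce part of the autodiscovery advantage, one possible route is to ask Solr itself for the cluster layout instead of talking to ZooKeeper. The sketch below parses a response shaped like the output of the Collections API `CLUSTERSTATUS` action. The field names (`live_nodes`, `replicas`, `leader`, `base_url`, and so on) are my understanding of that response and should be checked against the Solr version in use. The sample data, host names, collection name, and the `shard_leaders` helper are all made up for illustration.

```python
def shard_leaders(cluster_status: dict, collection: str) -> dict:
    """Return {shard_name: leader_base_url} for active leaders on live nodes.

    Assumption: cluster_status has the shape of a CLUSTERSTATUS response
    already decoded from JSON.
    """
    cluster = cluster_status["cluster"]
    live_nodes = set(cluster.get("live_nodes", []))
    leaders = {}
    shards = cluster["collections"][collection]["shards"]
    for shard_name, shard in shards.items():
        for replica in shard["replicas"].values():
            is_leader = replica.get("leader") == "true"
            is_active = replica.get("state") == "active"
            is_live = replica.get("node_name") in live_nodes
            if is_leader and is_active and is_live:
                leaders[shard_name] = replica["base_url"]
    return leaders


# Made-up sample data with hypothetical host and collection names.
SAMPLE_STATUS = {
    "cluster": {
        "live_nodes": ["solr1:8983_solr", "solr2:8983_solr", "solr4:8983_solr"],
        "collections": {
            "products": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {
                                "state": "active",
                                "node_name": "solr1:8983_solr",
                                "base_url": "http://solr1:8983/solr",
                                "leader": "true",
                            },
                            "core_node2": {
                                "state": "active",
                                "node_name": "solr2:8983_solr",
                                "base_url": "http://solr2:8983/solr",
                            },
                        }
                    },
                    "shard2": {
                        "replicas": {
                            "core_node3": {
                                "state": "down",
                                "node_name": "solr3:8983_solr",
                                "base_url": "http://solr3:8983/solr",
                            },
                            "core_node4": {
                                "state": "active",
                                "node_name": "solr4:8983_solr",
                                "base_url": "http://solr4:8983/solr",
                                "leader": "true",
                            },
                        }
                    },
                }
            }
        },
    }
}

leaders = shard_leaders(SAMPLE_STATUS, "products")
print(leaders)
```

A client would still have to refresh this map periodically, because leaders change when nodes fail. Whether that extra work is worth it depends on how much the proxy hop actually costs in a given deployment.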
