ElasticSearch: Jest vs Rest vs TransportClient vs NodeClient


Question

I have gone through the official documentation at https://www.elastic.co/blog/found-interfacing-elasticsearch-picking-client

But it does not give any benchmarks or performance numbers to help choose among the clients. And I am finding it non-trivial to set up a TransportClient or set up a NodeClient (https://stackoverflow.com/questions/35759866/unable-to-run-hello-world-java-client-of-elastic-search), because the documentation for that is also really sparse, with few or no examples.

So if someone has already benchmarked these clients, I would really appreciate seeing the results, so that I can focus on tuning an established client rather than evaluating which client to choose.

Our application is a write-heavy application and we plan to have a 50-shard, 50-replica ES cluster for that.

Solution

All those clients are fine for querying, and they all have their pros and cons (the list below is not exhaustive):

  • A Node client provides a single hop into the cluster, but since it is also part of the cluster, it can induce too much chatter within the cluster.
  • A Transport client is not part of the cluster, hence requires a two-hop round trip, and communicates with a single node at a time in a round-robin fashion (from the list provided during its construction).
  • Jest is basically the missing client for the ES REST interface.
  • If you feel you don't need everything Jest has to offer and simply want to interact with a few endpoints, you might as well create your own REST client using Spring REST template, Apache HTTP, etc.
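
To give a feel for how small a hand-rolled REST client of the kind mentioned in the last bullet can be, here is a minimal sketch. It is written in Python for brevity rather than Java, and it only builds index requests without sending them. It also rotates through a list of node URLs in round-robin fashion, similar in spirit to what the Transport client bullet describes. The class name `TinyEsClient`, the host names and the `events` index are hypothetical, and the `/{index}/_doc/{id}` path assumes a recent Elasticsearch version (older versions put a type name in that position).

```python
import itertools
import json
import urllib.request


class TinyEsClient:
    """Hypothetical minimal REST client: builds index requests, one node at a time."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node URL is required")
        # cycle() hands out the node URLs in round-robin order, forever.
        self._nodes = itertools.cycle(nodes)

    def build_index_request(self, index, doc_id, document):
        """Return a PUT request that would index `document`; nothing is sent."""
        node = next(self._nodes)
        url = f"{node}/{index}/_doc/{doc_id}"  # assumed path layout, see lead-in
        body = json.dumps(document).encode("utf-8")
        return urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="PUT",
        )


# Placeholder host names; replace with real node addresses.
client = TinyEsClient(["http://es-node-1:9200", "http://es-node-2:9200"])
reqs = [client.build_index_request("events", i, {"n": i}) for i in range(3)]
# To actually send one: urllib.request.urlopen(reqs[0], timeout=5)
```

Note that a client like this still writes synchronously, so it shares the drawback discussed next.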

If you're going to have a write-heavy application, I suggest you don't use any of those clients at all. The main reason is that they are all synchronous in nature, so if any component of your architecture or the network were to fail for some reason, you'd lose data, and that might not be an option for you.

If you have plenty of data to ingest, you normally go the asynchronous way, i.e. storing your data in a temporary (yet durable) queue (Kafka, Redis, JMS, etc.) and then letting another process stream it to ES. There are many ways to do that, but a very simple one is to use Logstash.
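
The queue-then-stream pattern can be sketched as follows. This is only an illustration in Python, not what Logstash does internally: an in-process `queue.Queue` stands in for the durable queue (it is not durable), the helper names `drain_batch` and `to_bulk_payload` and the `events` index are made up, and the payloads are built but never sent. The body layout follows the Elasticsearch `_bulk` API, which takes newline-delimited JSON with an action line before each document.

```python
import json
import queue


def drain_batch(q, max_items):
    """Pull up to max_items documents off the queue without blocking."""
    batch = []
    while len(batch) < max_items:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch


def to_bulk_payload(index, docs):
    """Render documents as a _bulk request body (newline-delimited JSON)."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                            # document line
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Producer side: the application only enqueues and moves on.
buffer = queue.Queue()  # stand-in for Kafka/Redis/JMS; NOT durable
for i in range(5):
    buffer.put({"event_id": i})

# Consumer side: a separate process would loop like this and POST each
# payload to the cluster's /_bulk endpoint, retrying on failure.
payloads = []
while True:
    batch = drain_batch(buffer, max_items=2)
    if not batch:
        break
    payloads.append(to_bulk_payload("events", batch))
```

The point of the split is that the application's write path ends at the queue, so an outage on the ES side delays indexing instead of dropping data.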

Whether you decide to store your data in Kafka or JMS or Redis, you can then let Logstash consume your data and stream it to ES, i.e. you let Logstash worry about the heavy write part, which it does very well. That can be achieved very easily with

With that kind of well-tuned setup, you can handle very heavy write loads without needing to worry about which client to use and how to tune it. The question is still open for querying, but since the write part is paramount in your case, you need to make it solid, and the only serious way to do that is to go asynchronous and let a well-developed and tested ETL tool (such as Logstash or fluentd) do it for you.

UPDATE

It is worth noting that as of ES 5.0, there will be a new Java REST client available.
