Create Input Format of Elasticsearch using Flink Rich InputFormat


Problem description

We are using Elasticsearch 6.8.4 and Flink 1.0.18.

We have an index with 1 shard and 1 replica in Elasticsearch, and I want to create a custom input format to read and write data in Elasticsearch using the Apache Flink DataSet API with more than 1 input split, in order to achieve better performance. Is there any way I can achieve this requirement?

Note: each document is large (almost 8 MB), and because of this size constraint I can read only 10 documents at a time, while per read request we want to retrieve 500k records.

As per my understanding, the parallelism should be equal to the number of shards/partitions of the data source. However, since we store only a small amount of data, we have kept the number of shards at just 1, and the data is static, increasing only slightly per month.

Any help or source code example would be much appreciated.

Recommended answer

You need to be able to generate queries to ES that effectively partition your source data into relatively equal chunks. Then you can run your input source with a parallelism > 1, and have each sub-task read only part of the index data.
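One way to generate such partitioning queries on a single-shard index is Elasticsearch's sliced scroll: when the number of slices exceeds the shard count, Elasticsearch splits the scroll within the shard on the `_id` field, so each slice sees a disjoint, roughly equal subset of the documents. Below is a minimal sketch of a Flink `RichInputFormat` built on that idea, using the Elasticsearch high-level REST client. The host, index name (`my-index`), slice count, and page size are illustrative assumptions, not values from the question, and error handling is kept minimal.

```java
import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
import org.apache.flink.api.common.io.RichInputFormat;
import org.apache.flink.api.common.io.statistics.BaseStatistics;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.io.GenericInputSplit;
import org.apache.flink.core.io.InputSplitAssigner;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchScrollRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.slice.SliceBuilder;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Queue;

public class ElasticsearchSliceInputFormat extends RichInputFormat<String, GenericInputSplit> {

    private static final int NUM_SLICES = 4;  // assumed: tune to the desired parallelism
    private static final int PAGE_SIZE = 10;  // 10 docs per page, per the 8 MB size constraint
    private static final TimeValue KEEP_ALIVE = TimeValue.timeValueMinutes(5);

    private transient RestHighLevelClient client;
    private transient Queue<String> buffer;
    private transient String scrollId;
    private transient boolean exhausted;

    @Override
    public void configure(Configuration parameters) {}

    @Override
    public BaseStatistics getStatistics(BaseStatistics cached) { return cached; }

    @Override
    public GenericInputSplit[] createInputSplits(int minNumSplits) {
        // One split per scroll slice; each parallel subtask will read exactly one slice.
        GenericInputSplit[] splits = new GenericInputSplit[NUM_SLICES];
        for (int i = 0; i < NUM_SLICES; i++) splits[i] = new GenericInputSplit(i, NUM_SLICES);
        return splits;
    }

    @Override
    public InputSplitAssigner getInputSplitAssigner(GenericInputSplit[] splits) {
        return new DefaultInputSplitAssigner(splits);
    }

    @Override
    public void open(GenericInputSplit split) throws IOException {
        client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http"))); // assumed host
        buffer = new ArrayDeque<>();
        exhausted = false;
        // Sliced scroll: slice id = split number, max = total splits, so every
        // subtask reads a disjoint, roughly equal share even on a 1-shard index.
        SearchSourceBuilder source = new SearchSourceBuilder()
                .query(QueryBuilders.matchAllQuery())
                .slice(new SliceBuilder(split.getSplitNumber(), split.getTotalNumberOfSplits()))
                .size(PAGE_SIZE);
        SearchRequest request = new SearchRequest("my-index") // assumed index name
                .source(source)
                .scroll(KEEP_ALIVE);
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        scrollId = response.getScrollId();
        enqueue(response);
    }

    private void enqueue(SearchResponse response) {
        SearchHit[] hits = response.getHits().getHits();
        if (hits.length == 0) { exhausted = true; return; }
        for (SearchHit hit : hits) buffer.add(hit.getSourceAsString());
    }

    @Override
    public boolean reachedEnd() throws IOException {
        if (!buffer.isEmpty()) return false;
        if (exhausted) return true;
        // Pull the next page of at most PAGE_SIZE documents from this slice's scroll.
        SearchScrollRequest scroll = new SearchScrollRequest(scrollId).scroll(KEEP_ALIVE);
        SearchResponse response = client.scroll(scroll, RequestOptions.DEFAULT);
        scrollId = response.getScrollId();
        enqueue(response);
        return buffer.isEmpty();
    }

    @Override
    public String nextRecord(String reuse) {
        return buffer.poll(); // raw _source JSON; parse downstream as needed
    }

    @Override
    public void close() throws IOException {
        if (client != null) client.close();
    }
}
```

With something like this in place, the source could be wired up roughly as `env.createInput(new ElasticsearchSliceInputFormat()).setParallelism(4)`, so that each of the four subtasks scrolls through its own slice while every page stays at 10 documents to respect the size constraint.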
