What is the maximum Elasticsearch document size?


Question

I read notes about Lucene being limited to 2 GB documents. Are there any additional limitations on the size of documents that can be indexed in Elasticsearch?

Answer

Lucene internally uses a byte buffer that is addressed with 32-bit integers, which by definition limits the size of a document. So 2 GB is the theoretical maximum.
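To make the arithmetic concrete, here is a minimal sketch of where the 2 GB figure comes from. It assumes the addressing integer is signed, as Java's int is; the helper name fits_in_int32_buffer is made up for illustration and is not part of Lucene.

```python
# Largest value a signed 32-bit integer can hold (Java's Integer.MAX_VALUE).
INT32_MAX = 2**31 - 1


def fits_in_int32_buffer(num_bytes: int) -> bool:
    """Return True if a buffer of num_bytes can be sized with a signed 32-bit int."""
    return 0 <= num_bytes <= INT32_MAX


if __name__ == "__main__":
    print(INT32_MAX)                     # the limit in bytes
    print(INT32_MAX / 2**30)             # the same limit in GiB, just under 2
    print(fits_in_int32_buffer(2**31))   # one byte past the limit
```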

In Elasticsearch:

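The Elasticsearch source code on GitHub validates the maximum HTTP request size against Integer.MAX_VALUE (2^31-1 bytes). So 2 GB is, in effect, the upper bound on a document sent for bulk indexing over HTTP. In practice the configured http.max_content_length setting (100mb by default) applies first unless you raise it. Note also that Elasticsearch does not start processing an HTTP request until the whole request has been received.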
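Because the whole request has to arrive before it is processed, it can be worth checking the body size on the client before sending. The following is a rough sketch of such a pre-flight check; the function names and the default limit are assumptions for illustration, not part of any Elasticsearch client, and the limit should be set to whatever your cluster is actually configured to accept.

```python
import json

# Assumption: the theoretical ceiling discussed above. A real cluster is
# usually configured with a much lower HTTP request size limit.
INT32_MAX = 2**31 - 1


def request_body_size(document: dict) -> int:
    """Size in bytes of the document as serialized here (JSON, UTF-8 encoded)."""
    return len(json.dumps(document).encode("utf-8"))


def check_request_size(document: dict, limit_bytes: int = INT32_MAX) -> int:
    """Return the body size, or raise ValueError if it exceeds limit_bytes."""
    size = request_body_size(document)
    if size > limit_bytes:
        raise ValueError(f"request body is {size} bytes, limit is {limit_bytes}")
    return size
```

A caller would run check_request_size(doc, limit_bytes=...) before issuing the HTTP request and split or reject the document if it raises.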

Good Practices:

  • Do not use a very large Java heap if you can help it: set it only as large as is necessary (ideally no more than half of the machine's RAM) to hold the overall maximum working set for your usage of Elasticsearch. This leaves the remaining (hopefully sizable) RAM for the OS to use for I/O caching.
  • On the client side, always use the bulk API, which indexes multiple documents in one request, and experiment to find the right number of documents to send with each bulk request. The optimal size depends on many factors, but err in the direction of too few rather than too many documents. Send bulk requests concurrently, using client-side threads or separate asynchronous requests.
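The bulk advice above can be sketched as follows, using only the Python standard library. The code builds newline-delimited JSON bodies in the shape the _bulk endpoint expects (an action line followed by a source line, each ending in a newline) and hands them to a caller-supplied send function from a small thread pool. The names bulk_bodies and send_all, the default of 500 documents per batch, and the worker count are all assumptions chosen for illustration; the actual HTTP call is left to the send callable.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List


def bulk_bodies(docs: Iterable[dict], index: str, max_docs: int = 500) -> Iterator[str]:
    """Yield NDJSON bulk bodies holding at most max_docs documents each."""
    action = json.dumps({"index": {"_index": index}})
    lines: List[str] = []
    count = 0
    for doc in docs:
        lines.append(action)           # action line: index into the given index
        lines.append(json.dumps(doc))  # source line: the document itself
        count += 1
        if count == max_docs:
            yield "\n".join(lines) + "\n"  # body must end with a newline
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"


def send_all(bodies: Iterable[str], send: Callable[[str], object], workers: int = 4) -> list:
    """Call send(body) for every body from a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, bodies))
```

Here send would be a function that POSTs one body to the cluster's _bulk endpoint and returns the response. Starting with a small max_docs and increasing it while measuring throughput follows the "too few rather than too many" advice.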

For further study, refer to these links:

1) Performance considerations for Elasticsearch indexing

2) Document maximum size for bulk indexing over HTTP
