Where can I find performance benchmarks for Apache Lucene/Solr?
Question
Are there any links or resources with performance benchmarks for Lucene/Solr on large datasets, in the range of 500 GB to 5 TB or larger?
Thanks
Recommended answer
Lucene committer Mike McCandless runs benchmarks on a regular basis to track performance improvements and regressions. They are built from Wikipedia exports, which may be somewhat smaller than what you are looking for.
However, performance depends less on the raw input size than on the number of documents and the number of unique terms. If you already have data similar to what you will need to index, I would recommend checking out Mike's test tool, adapting it to your needs, and running it with your own dataset and hardware to find out what kind of performance numbers you can expect.
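Before adapting a benchmark, it can help to measure the two quantities mentioned above on a sample of your own data. The following is only a rough sketch: `corpus_stats` is a hypothetical helper, not part of Lucene, Solr, or Mike's tool, and its regex tokenizer is a simplifying assumption that will not match the term counts produced by a real Lucene analyzer.

```python
import re
from typing import Iterable, Tuple

# Assumption: a naive word tokenizer; a real Lucene analyzer will split,
# stem, and filter terms differently, so treat the result as an estimate.
TOKEN_RE = re.compile(r"\w+")


def corpus_stats(docs: Iterable[str]) -> Tuple[int, int, int]:
    """Return (document count, total tokens, unique terms) for a sample."""
    doc_count = 0
    total_tokens = 0
    unique_terms = set()
    for text in docs:
        doc_count += 1
        tokens = TOKEN_RE.findall(text.lower())
        total_tokens += len(tokens)
        unique_terms.update(tokens)
    return doc_count, total_tokens, len(unique_terms)


if __name__ == "__main__":
    # Tiny illustrative sample; replace with an iterator over your own data.
    sample = [
        "Lucene indexes documents",
        "Solr is built on Lucene",
    ]
    docs, tokens, terms = corpus_stats(sample)
    print(f"documents={docs} tokens={tokens} unique_terms={terms}")
```

Running this over a representative slice of the real dataset gives a rough idea of how its document and term counts compare with the Wikipedia-based benchmarks, which is more informative than comparing sizes in gigabytes.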