Too many open files Error on Lucene


Problem description

The project I'm working on indexes a certain amount of data (with long texts) and compares it against a list of words at each interval (about 15 to 30 minutes).

After some time, say the 35th round, this error occurred while starting to index a new set of data for the 36th round:

    [ERROR] (2011-06-01 10:08:59,169) org.demo.service.LuceneService.countDocsInIndex(?:?) : Exception on countDocsInIndex: 
    java.io.FileNotFoundException: /usr/share/demo/index/tag/data/_z.tvd (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:69)
        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:90)
        at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:91)
        at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:78)
        at org.apache.lucene.index.TermVectorsReader.<init>(TermVectorsReader.java:81)
        at org.apache.lucene.index.SegmentReader$CoreReaders.openDocStores(SegmentReader.java:299)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:580)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:556)
        at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:113)
        at org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:29)
        at org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:81)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:736)
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:75)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:428)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:274)
        at org.demo.service.LuceneService.countDocsInIndex(Unknown Source)
        at org.demo.processing.worker.DataFilterWorker.indexTweets(Unknown Source)
        at org.demo.processing.worker.DataFilterWorker.processTweets(Unknown Source)
        at org.demo.processing.worker.DataFilterWorker.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:636)

I've already tried raising the maximum number of open files with:

        ulimit -n <number>

But after some time, when an interval has about 1050 rows of long texts, the same error occurs. It has only occurred once so far, though.

Should I follow the advice on modifying the Lucene IndexWriter's mergeFactor from (Too many open files) - SOLR, or is this an issue with the amount of data being indexed?
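
For context, here is a minimal sketch of how mergeFactor (and the related compound-file setting) can be adjusted on a Lucene 3.0 IndexWriter. The index path and the value 10 are illustrative assumptions, not settings from my actual code:

    import java.io.File;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class TuningSketch {
        public static void main(String[] args) throws Exception {
            Directory dir = FSDirectory.open(new File("/usr/share/demo/index/tag"));
            IndexWriter writer = new IndexWriter(dir,
                    new StandardAnalyzer(Version.LUCENE_30),
                    IndexWriter.MaxFieldLength.UNLIMITED);

            // A lower mergeFactor keeps fewer segments on disk before merging,
            // so an open reader needs fewer file descriptors.
            writer.setMergeFactor(10);
            // The compound file format packs each segment into a single .cfs file,
            // further cutting the number of files that must be held open.
            writer.setUseCompoundFile(true);

            // ... addDocument(...) calls go here ...

            writer.close();
            dir.close();
        }
    }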

I've also read that it's a choice between batch indexing and interactive indexing. How would one determine whether indexing is interactive, just by the frequency of updates? Should I categorize this project under interactive indexing?

UPDATE: I'm adding a snippet of my IndexWriter:

        writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30), IndexWriter.MaxFieldLength.UNLIMITED);

Seems like the max field length (the MaxFieldLength argument), at least, is already set to unlimited.

Answer

I had already used ulimit, but the error still showed up. I then inspected the customized core adapters for the Lucene functions, and it turned out that too many directories opened via IndexWriter.open were LEFT OPEN.

Note that after processing, closing the opened directory should always be called.
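
For illustration, a minimal sketch of that pattern, reusing the countDocsInIndex name from the stack trace (its body here is an assumption, not the original code): every Directory and IndexReader opened for a round is closed in a finally block, so descriptors cannot leak across rounds:

    import java.io.File;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class LuceneServiceSketch {
        // Counts documents in the index, ensuring everything opened is closed again.
        public static int countDocsInIndex(String indexPath) throws Exception {
            Directory dir = FSDirectory.open(new File(indexPath));
            IndexReader reader = null;
            try {
                reader = IndexReader.open(dir, true); // read-only reader
                return reader.numDocs();
            } finally {
                if (reader != null) {
                    reader.close();
                }
                dir.close();
            }
        }
    }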

