How to integrate RAMDirectory into FSDirectory in Lucene

Problem description

I have a question about Lucene. I am trying to write an indexer that first builds the index in memory using RAMDirectory and then flushes that in-memory index to disk using FSDirectory. I have made some modifications to the code below, but to no avail. Maybe some of you can help me out a bit.

So what is the best way to integrate RAMDirectory into this source code before writing the index to an FSDirectory? Any help would be appreciated. Here is the source code.

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class SimpleFileIndexer {
    public static void main(String[] args) throws Exception {
        File indexDir = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
        File dataDir = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
        String suffix = "txt";
        SimpleFileIndexer indexer = new SimpleFileIndexer();
        int numIndex = indexer.index(indexDir, dataDir, suffix);
        System.out.println("Total files indexed " + numIndex);
    }

    private int index(File indexDir, File dataDir, String suffix) throws Exception {
        IndexWriter indexWriter = new IndexWriter(
                FSDirectory.open(indexDir),
                new SimpleAnalyzer(),
                true,
                IndexWriter.MaxFieldLength.LIMITED);
        indexWriter.setUseCompoundFile(false);
        indexDirectory(indexWriter, dataDir, suffix);
        int numIndexed = indexWriter.maxDoc();
        indexWriter.optimize();
        indexWriter.close();
        return numIndexed;
    }

    private void indexDirectory(IndexWriter indexWriter, File dataDir, String suffix) throws IOException {
        File[] files = dataDir.listFiles();
        for (int i = 0; i < files.length; i++) {
            File f = files[i];
            if (f.isDirectory()) {
                indexDirectory(indexWriter, f, suffix);
            } else {
                indexFileWithIndexWriter(indexWriter, f, suffix);
            }
        }
    }

    private void indexFileWithIndexWriter(IndexWriter indexWriter, File f, String suffix) throws IOException {
        if (f.isHidden() || f.isDirectory() || !f.canRead() || !f.exists()) {
            return;
        }
        if (suffix != null && !f.getName().endsWith(suffix)) {
            return;
        }
        System.out.println("Indexing file " + f.getCanonicalPath());
        Document doc = new Document();
        doc.add(new Field("contents", new FileReader(f)));
        doc.add(new Field("filename", f.getCanonicalPath(), Field.Store.YES, Field.Index.ANALYZED));
        indexWriter.addDocument(doc);
    }
}

Solution

I'm not really sure that you'll get any performance gain from doing this, but you could do all the indexing on a RAMDirectory and then copy the directory to an FSDirectory.

Like this:

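// Imports needed in addition to those in the question:
// import org.apache.lucene.store.Directory;
// import org.apache.lucene.store.RAMDirectory;

private int index(File indexDir, File dataDir, String suffix) throws Exception {
    RAMDirectory ramDir = new RAMDirectory();          // 1: create the in-memory directory
    IndexWriter indexWriter = new IndexWriter(
            ramDir,                                    // 2: index into RAM instead of the FSDirectory
            new SimpleAnalyzer(),
            true,
            IndexWriter.MaxFieldLength.LIMITED);
    indexWriter.setUseCompoundFile(false);
    indexDirectory(indexWriter, dataDir, suffix);
    int numIndexed = indexWriter.maxDoc();
    indexWriter.optimize();
    indexWriter.close();                               // finish the in-memory index before copying

    // 3: copy the finished index from RAM to disk; false leaves ramDir open
    Directory.copy(ramDir, FSDirectory.open(indexDir), false);

    return numIndexed;
}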
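
The pattern in the answer (build everything in memory, close the writer, then copy to disk in one pass) can also be illustrated without Lucene on the classpath. The sketch below is a plain-JDK analogy only: `StagedStore`, `put`, and `flushTo` are hypothetical names invented for this illustration and are not part of the Lucene API. The map plays the role of the RAMDirectory, and `flushTo` plays the role of the `Directory.copy` step.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the RAM-first, disk-later pattern; not Lucene API. */
final class StagedStore {
    // In-memory "directory": file name -> file contents.
    private final Map<String, byte[]> staged = new LinkedHashMap<>();

    /** Stage a file in memory; nothing is written to disk yet. */
    void put(String name, String contents) {
        staged.put(name, contents.getBytes(StandardCharsets.UTF_8));
    }

    /** Number of files currently held in memory. */
    int size() {
        return staged.size();
    }

    /** Write every staged file into targetDir in one pass and return the count written. */
    int flushTo(Path targetDir) {
        try {
            Files.createDirectories(targetDir);
            for (Map.Entry<String, byte[]> entry : staged.entrySet()) {
                Files.write(targetDir.resolve(entry.getKey()), entry.getValue());
            }
            return staged.size();
        } catch (IOException ex) {
            // Rethrow unchecked so callers are not forced to declare IOException.
            throw new UncheckedIOException(ex);
        }
    }
}
```

As in the answer, the disk is touched only once, after all in-memory work is finished. Whether this is actually faster than writing to disk directly depends on the workload, which is the same caveat the answer raises.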
