Deleting document by Term from Lucene


Question

The following code does not delete the document by Term as expected:

        RAMDirectory idx     = new RAMDirectory();
        IndexWriter writer  = new IndexWriter(idx,
                                   new SnowballAnalyzer(Version.LUCENE_30, "English"),
                                   IndexWriter.MaxFieldLength.LIMITED);
        Document doc = new Document();
        doc.add(new Field("title", "mydoc", Field.Store.YES, Field.Index.NO));
        doc.add(new Field("content", "some content, deleteme", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        Document doc2 = new Document();        
        doc2.add(new Field("title", "mydoc2", Field.Store.YES, Field.Index.NO));
        doc2.add(new Field("content", "other content, don't deleteme", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc2);
        writer.optimize();
        writer.close();

        /*
        IndexReader reader = IndexReader.open(idx, false);
        int docs_up_for_deletion = reader.docFreq(new Term("title"));
        int before = reader.numDocs();
        int docs_deleted = reader.deleteDocuments(new Term("title", "mydoc"));
        reader.close();
        */

        IndexWriter writer2  = new IndexWriter(idx,
                                   new SnowballAnalyzer(Version.LUCENE_30, "English"),
                                   IndexWriter.MaxFieldLength.LIMITED);
        int before = writer2.numDocs();
        writer2.deleteDocuments(new Term("title", "mydoc"));
        writer2.commit();
        writer2.optimize();
        int after = writer2.numDocs();
        writer2.close();
        int docs_deleted = before - after;

I've tried deleting with the IndexReader and IndexWriter and neither works.

I've also tried adding another IndexReader search after the above code just in case the number only gets updated after closing writer2 (mentioned in this FAQ), but that doesn't help. Doing a writer.deleteAll() works, just not the delete by Term.

I found an old reference stating that only fields of type Field.Keyword can be deleted, but this is no longer a valid field type in Lucene 3.x.

Recommended answer

Your title field is not indexed. Change

new Field("title", "mydoc", Field.Store.YES, Field.Index.NO)

to

new Field("title", "mydoc", Field.Store.YES, Field.Index.ANALYZED)

or

new Field("title", "mydoc", Field.Store.YES, Field.Index.NOT_ANALYZED)

depending on whether or not you want your field analyzed.
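
To illustrate why an unindexed field cannot be used for delete-by-term, here is a minimal sketch of a toy inverted index in plain Java. It is not Lucene's implementation: the class TermDeleteSketch, its IndexMode enum, and its methods are hypothetical names that only loosely mirror Field.Index and IndexWriter.deleteDocuments. The assumption it models is that a delete by term looks the term up in the postings, so a field that was stored but never indexed has nothing to match.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only (hypothetical names); it does not use or reproduce Lucene code.
class TermDeleteSketch {
    // Loosely mirrors Field.Index.NO / NOT_ANALYZED / ANALYZED.
    enum IndexMode { NO, NOT_ANALYZED, ANALYZED }

    // Postings: "field:term" -> ids of documents containing that term.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final Set<Integer> liveDocs = new HashSet<>();
    private int nextId = 0;

    // Adds a single-field document and returns its id.
    int addDocument(String field, String value, IndexMode mode) {
        int id = nextId++;
        liveDocs.add(id);
        switch (mode) {
            case NO:
                break; // stored only: no postings are written
            case NOT_ANALYZED:
                post(field, value, id); // whole value becomes one term
                break;
            case ANALYZED:
                // Crude stand-in for an analyzer: lowercase and split on non-word characters.
                for (String token : value.toLowerCase(Locale.ROOT).split("\\W+")) {
                    if (!token.isEmpty()) {
                        post(field, token, id);
                    }
                }
                break;
        }
        return id;
    }

    private void post(String field, String term, int id) {
        postings.computeIfAbsent(field + ":" + term, k -> new HashSet<>()).add(id);
    }

    // Deletes every live document whose postings contain the exact term; returns the count.
    int deleteDocuments(String field, String term) {
        Set<Integer> hits = postings.getOrDefault(field + ":" + term, Collections.<Integer>emptySet());
        int deleted = 0;
        for (Integer id : hits) {
            if (liveDocs.remove(id)) {
                deleted++;
            }
        }
        return deleted;
    }

    int numDocs() {
        return liveDocs.size();
    }

    public static void main(String[] args) {
        TermDeleteSketch unindexed = new TermDeleteSketch();
        unindexed.addDocument("title", "mydoc", IndexMode.NO);
        unindexed.addDocument("title", "mydoc2", IndexMode.NO);
        System.out.println(unindexed.deleteDocuments("title", "mydoc")); // prints 0

        TermDeleteSketch indexed = new TermDeleteSketch();
        indexed.addDocument("title", "mydoc", IndexMode.NOT_ANALYZED);
        indexed.addDocument("title", "mydoc2", IndexMode.NOT_ANALYZED);
        System.out.println(indexed.deleteDocuments("title", "mydoc")); // prints 1
        System.out.println(indexed.numDocs()); // prints 1
    }
}
```

In this sketch the term is compared to the postings exactly as given. With an analyzed field, the term passed to the delete therefore has to equal the token the analyzer produced, which is one reason NOT_ANALYZED is often the simpler choice for identifier-like fields.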
