Avoid removal of current Lucene.NET index during rebuild


Question

I'm new to Lucene.NET, but I'm using an open source tool built for Sitecore CMS that uses Lucene.NET to index lots of content from the CMS. I confirmed yesterday that when I rebuild my indexes, the current index files are wiped clean, so anything that relies on the index gets no data for about 30-60 seconds (the time a full index rebuild takes). Is there a best practice or a way to make Lucene.NET not overwrite the current index files until the new index is completely rebuilt? I'm basically thinking I'd like it to write to new temporary index files and, when the rebuild is done, have those files overwrite the current index.

An example of what I mean:

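  • Build a fresh index (~30 seconds)
  • The index has about 500 documents
  • Use code to access data in the index and display it on the website
  • Rebuild the index (~30 seconds)
    • Any code that reads the index now returns nothing because the index files are being overwritten, so the website shows no data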

Thanks in advance.

Recommended answer

I have no experience with Sitecore itself, but here's my story.

We've recently incorporated index-based search (using Lucene.Net) into our eCommerce sub-system. The index update process in our case can take about half an hour (~50,000 products plus lots of related information). To prevent "denial of service" responses while the index is being updated, we first create a "backup" version of it (simply by copying the index directory to another location), and all further requests are redirected to this "backup" version. When the index update is completed, we delete the backup so that clients start using the updated (or "live") version of the index. This also helps if an unhandled exception occurs during the update, because otherwise you might end up with no index at all; in our case clients can always fall back on the "backup" version.
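The backup-and-redirect flow described above can be sketched at the filesystem level. This is only an illustration under stated assumptions, not the answerer's actual code: it is written in Python rather than .NET, it makes no Lucene.NET calls, and the names `IndexLocator`, `update_index`, `rebuild`, and `demo` are hypothetical. It also assumes that readers ask the locator for the directory each time they open the index.

```python
import shutil
import tempfile
from pathlib import Path


class IndexLocator:
    """Hypothetical helper that tells readers which directory to search."""

    def __init__(self, live_dir: Path, backup_dir: Path):
        self.live_dir = live_dir
        self.backup_dir = backup_dir

    def current(self) -> Path:
        # A backup only exists while an update is running (or after one
        # failed), so its presence means "do not trust the live index yet".
        return self.backup_dir if self.backup_dir.exists() else self.live_dir


def update_index(locator: IndexLocator, rebuild) -> None:
    """Copy the live index aside, rebuild in place, then drop the copy."""
    if not locator.backup_dir.exists():
        # Copy under a staging name and rename afterwards, so readers
        # never get redirected to a half-copied backup.
        staging = locator.backup_dir.with_name(locator.backup_dir.name + ".tmp")
        if staging.exists():
            shutil.rmtree(staging)
        shutil.copytree(locator.live_dir, staging)
        staging.rename(locator.backup_dir)
    # If a backup is left over from a failed run, it is kept as-is,
    # because the live directory may be incomplete.
    rebuild(locator.live_dir)          # slow step; may raise
    shutil.rmtree(locator.backup_dir)  # reached only if rebuild succeeded


def demo() -> list:
    """Simulate one rebuild and record what a reader sees during and after."""
    root = Path(tempfile.mkdtemp())
    live, backup = root / "index", root / "index_backup"
    live.mkdir()
    (live / "segments").write_text("old")
    locator = IndexLocator(live, backup)
    seen = []

    def rebuild(index_dir: Path) -> None:
        (index_dir / "segments").unlink()  # live index is wiped
        # A reader arriving mid-rebuild is sent to the backup copy.
        seen.append((locator.current() / "segments").read_text())
        (index_dir / "segments").write_text("new")

    update_index(locator, rebuild)
    seen.append((locator.current() / "segments").read_text())
    shutil.rmtree(root)
    return seen
```

If `rebuild` raises, the backup directory is left in place, so `locator.current()` keeps returning it until a later update succeeds. That matches the fallback behaviour the answer describes for unhandled exceptions.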

The API reference (Lucene 2.4) for the Lucene.Net.Index.IndexWriter object states the following:

Note that you can open an index with create=true even while readers are using the index. The old readers will continue to search the "point in time" snapshot they had opened, and won't see the newly created index until they re-open.

So at least you shouldn't need to worry about the clients that are currently searching your index.

I hope this helps you make the right decision.
