LockObtainFailedException updating Lucene search index using solr


Question

I've googled this a lot. Most of these issues are caused by a lock being left around after a JVM crash. This is not my case.

I have an index with multiple readers and writers. I'm trying to do a mass index update (delete and add, which is how Lucene does updates). I'm using Solr's embedded server (org.apache.solr.client.solrj.embedded.EmbeddedSolrServer). Other writers are using the remote, non-streaming server (org.apache.solr.client.solrj.impl.CommonsHttpSolrServer).

I kick off this mass update, it runs fine for a while, then dies with a

Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/.../lucene-ff783c5d8800fd9722a95494d07d7e37-write.lock

I've adjusted my lock timeouts in solrconfig.xml:

<writeLockTimeout>20000</writeLockTimeout>
<commitLockTimeout>10000</commitLockTimeout>

I'm about to start reading the lucene code to figure this out. Any help so I don't have to do this would be great!

All my updates go through the following code (Scala):

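import org.apache.solr.client.solrj.request.{AbstractUpdateRequest, UpdateRequest}

val req = new UpdateRequest
// Commit as part of this request; the two flags are waitFlush and waitSearcher
req.setAction(AbstractUpdateRequest.ACTION.COMMIT, false, false)
req.add(docs) // docs: presumably a java.util.Collection[SolrInputDocument]

val rsp = req.process(solrServer)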

solrServer is an instance of org.apache.solr.client.solrj.impl.CommonsHttpSolrServer, org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer, or org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.

Update: I stopped using EmbeddedSolrServer and it works now. I have two separate processes that update the Solr search index:

1) Servlet
2) Command line tool

The command line tool was using the EmbeddedSolrServer and it would eventually crash with the LockObtainFailedException. When I started using StreamingUpdateSolrServer, the problems went away.

I'm still a little confused that the EmbeddedSolrServer would work at all. Can someone explain this? I thought that it would play nice with the Servlet process and that each would wait while the other is writing.

Recommended answer

I'm assuming that you're doing something like:

writer1.writeSomeStuff();
writer2.writeSomeStuff();  // this one doesn't write

This won't work because the writer stays open unless you close it. So writer1 writes and holds on to the lock, even after it's done writing. (Once a writer gets the lock, it doesn't release it until the writer is closed or destroyed.) writer2 can't get the lock, since writer1 is still holding onto it, so it throws a LockObtainFailedException.
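
To make the lock lifetime concrete, here is a minimal sketch using only the Java standard library. SketchWriter and canOpenSecondWriter are hypothetical names, not Lucene or Solr classes, and the lock is modeled as an atomically created file rather than Lucene's NativeFSLock. It also fails immediately instead of waiting for a timeout. The point it illustrates is only that the lock is taken when the writer is opened and given back when it is closed, not after each write.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for an index writer; NOT Lucene's IndexWriter.
final class SketchWriter implements AutoCloseable {
    private final Path lockFile;

    // Opening the writer takes the lock, which is then held until close().
    SketchWriter(Path indexDir) throws IOException {
        lockFile = indexDir.resolve("write.lock");
        try {
            Files.createFile(lockFile); // atomic: fails if the file already exists
        } catch (FileAlreadyExistsException e) {
            throw new IOException("Lock obtain failed: " + lockFile, e);
        }
    }

    // Writing neither takes nor releases the lock.
    void writeSomeStuff() { }

    @Override
    public void close() throws IOException {
        Files.delete(lockFile); // only now can another writer be opened
    }
}

final class WriteLockSketch {
    // Returns true if another writer can be opened on indexDir right now.
    static boolean canOpenSecondWriter(Path indexDir) {
        try (SketchWriter second = new SketchWriter(indexDir)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index");
        SketchWriter writer1 = new SketchWriter(dir);
        writer1.writeSomeStuff();
        // writer1 has finished writing but is still open
        System.out.println(canOpenSecondWriter(dir));
        writer1.close();
        // writer1 has released the lock
        System.out.println(canOpenSecondWriter(dir));
    }
}
```

In the real setup the second writer may live in a different process (for example the servlet versus the command line tool), but the same rule applies: whoever opened a writer first keeps the lock for as long as that writer stays open.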

If you want to use two writers, you'd need to do something like:

writer1.writeSomeStuff();
writer1.close();
writer2.open();
writer2.writeSomeStuff();
writer2.close();

Since you can only have one writer open at a time, this pretty much negates any benefit you would get from using multiple writers. (It's actually much worse to open and close them all the time since you'll be constantly paying a warmup penalty.)

So the answer to what I suspect is your underlying question is: don't use multiple writers. Use a single writer with multiple threads accessing it (IndexWriter is thread safe). If you're connecting to Solr via REST or some other HTTP API, a single Solr writer should be able to handle many requests.
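
The single-writer pattern might look roughly like the sketch below. SharedWriter is a hypothetical, stdlib-only stand-in for a thread-safe writer such as IndexWriter, and indexWithThreads is an illustrative helper, not part of Lucene or Solr. What it shows is one writer instance, opened once, being shared by every worker thread instead of each thread opening its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical thread-safe writer stand-in; NOT Lucene's IndexWriter.
final class SharedWriter {
    private final List<String> docs = new ArrayList<>();

    synchronized void addDocument(String doc) {
        docs.add(doc);
    }

    synchronized int docCount() {
        return docs.size();
    }
}

final class SingleWriterSketch {
    // Indexes threads * docsPerThread documents through ONE shared writer
    // and returns how many documents the writer ended up with.
    static int indexWithThreads(int threads, int docsPerThread) throws InterruptedException {
        SharedWriter writer = new SharedWriter(); // opened once, shared by all threads
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    writer.addDocument("doc-" + id + "-" + i);
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("indexing did not finish in time");
        }
        return writer.docCount();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(indexWithThreads(4, 100));
    }
}
```

Because only one writer ever exists, there is no second party to contend for the write lock, and no writer is repeatedly opened and closed.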

I'm not sure what your use case is, but another possible answer is to look at Solr's recommendations for managing multiple indices. In particular, the ability to hot-swap cores might be of interest.
