Does Solr index size decrease after deleting documents?


Question

I have a Solr instance where I index a large number of documents from my client so that users can search them in a web application.

Because we have a large number of files and users only need to search the recent ones (the last 90 days or so), we have a scheduled job that removes old documents from the index.

The problem is that disk usage grows by about 2 GB a day, even with the deletions.

Is this normal behavior, or should we do something more to keep the index at a stable size?

We are using a Java application to add files to and remove files from the index.

Recommended answer

Deletions only mark documents as deleted; they are still present in the index. Since removing them would require rewriting the index files, the space is not reclaimed until the affected segments are rewritten, either by a regular segment merge or when you issue an optimize command.
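As a rough illustration of the delete-then-optimize sequence, here is a minimal Java sketch that talks to Solr's `/update` handler with JSON commands over plain HTTP (Java 11+ `HttpClient`, no SolrJ). The core name `documents`, the date field `indexed_at`, and the base URL passed on the command line are assumptions for the example, not details from the question; adapt them to your own schema. Without arguments the program only prints the payloads it would send.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SolrCleanup {
    // JSON update command that forces a merge of the index segments.
    public static final String OPTIMIZE_BODY = "{\"optimize\":{}}";

    // Delete-by-query for everything older than `days` days.
    // `dateField` is a hypothetical date field in your schema.
    public static String deleteOlderThanBody(String dateField, int days) {
        return "{\"delete\":{\"query\":\"" + dateField + ":[* TO NOW-" + days + "DAYS]\"}}";
    }

    // Update handler of one core; commit=true makes the change visible.
    public static URI updateUri(String baseUrl, String core) {
        return URI.create(baseUrl + "/" + core + "/update?commit=true");
    }

    // Sends one JSON update command and returns the HTTP status code.
    public static int post(HttpClient http, URI uri, String jsonBody) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
        return http.send(request, HttpResponse.BodyHandlers.ofString()).statusCode();
    }

    public static void main(String[] args) throws Exception {
        String deleteBody = deleteOlderThanBody("indexed_at", 90);
        if (args.length == 0) {
            // Dry run: show what would be sent, without contacting Solr.
            System.out.println(deleteBody);
            System.out.println(OPTIMIZE_BODY);
            return;
        }
        // args[0] is the Solr base URL, e.g. http://localhost:8983/solr (assumed).
        URI update = updateUri(args[0], "documents");
        HttpClient http = HttpClient.newHttpClient();
        System.out.println("delete   -> HTTP " + post(http, update, deleteBody));
        System.out.println("optimize -> HTTP " + post(http, update, OPTIMIZE_BODY));
    }
}
```

If the application already uses SolrJ, the same two steps map onto its delete-by-query and optimize calls; the raw HTTP form is used here only to keep the sketch free of dependencies.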

There's also an expungeDeletes option when you issue a commit, but as far as I can see, it's better to issue an optimize outside of normal operating hours. If you remove documents nightly, you can issue the optimize after the removal, or even less frequently, such as every second or third day.
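One way to keep the optimize outside operating hours is to schedule it from the same Java application. The sketch below computes the wait until a chosen off-hours time and sets up a repeating task; the 02:00 run time and the three-day interval are illustrative assumptions, and the task body only prints the XML update message it would send. The `<commit expungeDeletes="true"/>` message is included as the lighter alternative mentioned above.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class OptimizeSchedule {
    // XML update messages accepted by Solr's update handler.
    public static final String OPTIMIZE = "<optimize/>";
    public static final String EXPUNGE_COMMIT = "<commit expungeDeletes=\"true\"/>";

    // Time to wait from `now` until the next occurrence of `runAt`.
    public static Duration delayUntil(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1); // today's slot has passed, use tomorrow's
        }
        return Duration.between(now, next);
    }

    // Starts a repeating task; the caller owns the returned scheduler
    // and should shut it down when the application stops.
    public static ScheduledExecutorService start(Runnable task, LocalTime runAt, int everyDays) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long initialSeconds = delayUntil(LocalDateTime.now(), runAt).getSeconds();
        long periodSeconds = TimeUnit.DAYS.toSeconds(everyDays);
        scheduler.scheduleAtFixedRate(task, initialSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        // Only report the plan here; call start(...) from the long-running application.
        Duration wait = delayUntil(LocalDateTime.now(), LocalTime.of(2, 0));
        System.out.println("Next optimize in " + wait.toMinutes() + " minutes, sending " + OPTIMIZE);
    }
}
```

A fixed period in seconds drifts across daylight-saving changes; for a job that must run at an exact wall-clock time, an external scheduler such as cron may be the simpler choice.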

Optimizing requires as much free disk space as the index currently takes up (since in the worst case the whole index is written again).
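Because of that requirement, it can be worth checking free space before triggering an optimize. The following is a small sketch using only the Java standard library (Java 11+); the index directory path is passed as an argument and is an assumption about your deployment, since the question does not say where the index lives.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class OptimizeDiskCheck {
    // Total size in bytes of all regular files under `dir`.
    public static long directorySize(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        }
    }

    // Worst case: the whole index is written again, so the free
    // space must be at least as large as the current index.
    public static boolean enoughSpace(long indexBytes, long usableBytes) {
        return usableBytes >= indexBytes;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the core's index directory (assumed; defaults to ".").
        Path indexDir = Path.of(args.length > 0 ? args[0] : ".");
        long indexBytes = directorySize(indexDir);
        long usableBytes = Files.getFileStore(indexDir).getUsableSpace();
        System.out.printf("index: %d bytes, free: %d bytes, safe to optimize: %b%n",
                indexBytes, usableBytes, enoughSpace(indexBytes, usableBytes));
    }
}
```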
