How to retrieve a list of directories QUICKLY in Java?

Question

Suppose a very simple program that lists all the subdirectories of a given directory. Sounds simple enough? Except that the only way I know of to list all subdirectories in Java is to use a FilenameFilter combined with File.list().
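For reference, the approach described above might look roughly like the following. This is a minimal sketch; the class and method names (`SubdirLister`, `listSubdirectories`) are made up for illustration and are not taken from the original program.

```java
import java.io.File;
import java.io.FilenameFilter;

public class SubdirLister {
    /**
     * Returns the names of the immediate subdirectories of parent,
     * or an empty array if parent cannot be listed.
     */
    static String[] listSubdirectories(File parent) {
        String[] names = parent.list(new FilenameFilter() {
            @Override
            public boolean accept(File dir, String name) {
                // One isDirectory() check per entry: this is the part
                // that gets expensive when the directory is huge.
                return new File(dir, name).isDirectory();
            }
        });
        // File.list() returns null if parent is not a readable directory.
        return names != null ? names : new String[0];
    }
}
```

The filter is called once for every entry in the directory, so the cost grows with the total number of files, not with the number of subdirectories.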

This works for the trivial case, but when the folder has, say, 150,000 files and 2 subfolders, it's silly to wait 45 seconds while the code iterates through every file and tests file.isDirectory(). Is there a better way to list subdirectories?

P.S. Sorry, but please save the lectures on having too many files in the same directory. Our live environment has this as part of the requirement.

Recommended answer

As has already been mentioned, this is basically a hardware problem. Disk access is always slow, and most file systems aren't really designed to handle directories with that many files.

If for some reason you have to store all the files in the same directory, I think you'll have to maintain your own cache. This could be done using a local database such as SQLite or HSQLDB. If you want extreme performance, use a Java TreeSet and cache it in memory. This means, at the very least, that you'll have to read the directory less often, and the reading could possibly be done in the background. You could reduce the need to refresh the list even further by using your system's native file update notification API (inotify on Linux) to subscribe to changes to the directory.
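One possible shape for such an in-memory cache is sketched below. It assumes Java 7 or later and uses the standard `java.nio.file.WatchService` as the notification mechanism, which is a substitution on my part for calling inotify directly. The class name `SubdirCache` and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.SortedSet;
import java.util.TreeSet;

public class SubdirCache {
    private final Path root;
    private final SortedSet<String> names = new TreeSet<>();

    SubdirCache(Path root) {
        this.root = root;
    }

    /** Full rescan of the directory; meant to run rarely, e.g. on a background thread. */
    synchronized void refresh() throws IOException {
        SortedSet<String> fresh = new TreeSet<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(root, Files::isDirectory)) {
            for (Path entry : entries) {
                fresh.add(entry.getFileName().toString());
            }
        }
        names.clear();
        names.addAll(fresh);
    }

    /** Returns a copy of the cached subdirectory names without touching the disk. */
    synchronized SortedSet<String> snapshot() {
        return new TreeSet<>(names);
    }

    /** Subscribes to create/delete events on the root; the caller closes the returned service. */
    WatchService watch() throws IOException {
        WatchService watcher = root.getFileSystem().newWatchService();
        root.register(watcher,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_DELETE);
        return watcher;
    }

    /** Applies any queued change events to the cache without blocking. */
    synchronized void applyPendingChanges(WatchService watcher) throws IOException {
        WatchKey key;
        while ((key = watcher.poll()) != null) {
            for (WatchEvent<?> event : key.pollEvents()) {
                if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                    refresh(); // events were lost, so fall back to a full rescan
                    continue;
                }
                String name = event.context().toString();
                if (event.kind() == StandardWatchEventKinds.ENTRY_DELETE) {
                    names.remove(name);
                } else if (Files.isDirectory(root.resolve(name))) {
                    names.add(name); // only newly created directories are cached
                }
            }
            key.reset();
        }
    }
}
```

A caller would run `refresh()` once at startup, then call `applyPendingChanges()` periodically and serve every lookup from `snapshot()`. Note that the initial scan is still as slow as before; the cache only avoids repeating it.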

This doesn't seem to be possible for you, but I once solved a similar problem by "hashing" the files into subdirectories. In my case, the challenge was to store a couple of million images with numeric IDs. I constructed the directory structure as follows:

images/[id - (id % 1000000)]/[id - (id % 1000)]/[id].jpg
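As a small worked example of that layout, the path for a given ID could be computed as below. The class and method names are hypothetical, and the sketch assumes non-negative IDs.

```java
public class ImagePaths {
    /** Builds images/[id - (id % 1000000)]/[id - (id % 1000)]/[id].jpg for a non-negative id. */
    static String pathFor(long id) {
        long millionBucket = id - (id % 1_000_000);  // e.g. 1234567 -> 1000000
        long thousandBucket = id - (id % 1_000);     // e.g. 1234567 -> 1234000
        return "images/" + millionBucket + "/" + thousandBucket + "/" + id + ".jpg";
    }

    public static void main(String[] args) {
        System.out.println(pathFor(1234567)); // prints images/1000000/1234000/1234567.jpg
    }
}
```

With this scheme each second-level directory holds at most 1,000 images, and each first-level directory holds at most 1,000 subdirectories.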

This has worked well for us, and it's the solution that I would recommend. You could do something similar for alphanumeric filenames by simply taking the first two letters of the filename as the first directory level and the next two letters as the second. I've done this once as well, and it did the job.
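A rough sketch of the alphanumeric variant follows. The names are hypothetical, and padding names shorter than four characters with underscores is an assumption of this example, not something the scheme above specifies.

```java
public class NameSharding {
    /** Maps a file name to [first two chars]/[next two chars]/[file name]. */
    static String shardedPath(String fileName) {
        // Assumption: pad short names with '_' so both levels always exist.
        String padded = fileName + "____";
        return padded.substring(0, 2) + "/" + padded.substring(2, 4) + "/" + fileName;
    }

    public static void main(String[] args) {
        System.out.println(shardedPath("report2021.pdf")); // prints re/po/report2021.pdf
    }
}
```

How evenly the files spread out depends on how varied the leading characters of the names are; names that share a common prefix will all land in the same directory.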
