How to retrieve a list of directories QUICKLY in Java?
Problem description
Suppose you have a very simple program that lists all the subdirectories of a given directory. Sounds simple enough? Except that the only way to list all subdirectories in Java is to use a FilenameFilter combined with File.list().
This works for the trivial case, but when the folder has, say, 150,000 files and 2 subfolders, it's silly to wait 45 seconds while it iterates through all the files and tests file.isDirectory() on each one. Is there a better way to list subdirectories?
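For reference, here is a minimal sketch of the FilenameFilter plus File.list() approach described above. The class and method names (`SubdirectoryLister`, `listSubdirectories`) are made up for illustration and are not from any library.

```java
import java.io.File;
import java.io.FilenameFilter;

// Hypothetical helper illustrating the approach from the question.
class SubdirectoryLister {

    // Returns the names of the immediate subdirectories of dir,
    // or an empty array if dir cannot be listed (File.list() returns null then).
    static String[] listSubdirectories(File dir) {
        FilenameFilter directoriesOnly = new FilenameFilter() {
            @Override
            public boolean accept(File parent, String name) {
                // One isDirectory() check per entry; with 150,000 files
                // this is where the time goes.
                return new File(parent, name).isDirectory();
            }
        };
        String[] names = dir.list(directoriesOnly);
        return names != null ? names : new String[0];
    }
}
```

Every entry in the directory is visited and tested, so the cost grows with the total number of files rather than with the number of subdirectories.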
PS. Sorry, please save the lectures on having too many files in the same directory. Our live environment has this as part of the requirement.
Answer
As has already been mentioned, this is basically a hardware problem. Disk access is always slow, and most file systems aren't really designed to handle directories with that many files.
If for some reason you have to store all the files in the same directory, I think you'll have to maintain your own cache. This could be done using a local database such as SQLite or HSQLDB. If you want extreme performance, use a Java TreeSet and cache it in memory. At the very least, this means you'll have to read the directory less often, and the reads could be done in the background. You could reduce the need to refresh the list even further by using your system's native file update notification API (inotify on Linux) to subscribe to changes to the directory.
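A rough sketch of the in-memory cache idea, assuming Java 7 or later. It uses the standard java.nio.file.WatchService, which on Linux is typically backed by inotify, in place of calling the native API directly. `SubdirectoryCache` and its methods are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: keeps the subdirectory names of one directory in memory.
class SubdirectoryCache {
    private final Path dir;
    private final SortedSet<String> names =
            Collections.synchronizedSortedSet(new TreeSet<String>());

    SubdirectoryCache(Path dir) throws IOException {
        this.dir = dir;
        rescan();
    }

    // Full scan: still slow on a huge directory, but only needed at
    // startup or after events were lost.
    final void rescan() throws IOException {
        SortedSet<String> found = new TreeSet<>();
        try (DirectoryStream<Path> entries =
                     Files.newDirectoryStream(dir, Files::isDirectory)) {
            for (Path entry : entries) {
                found.add(entry.getFileName().toString());
            }
        }
        synchronized (names) {
            names.clear();
            names.addAll(found);
        }
    }

    // Returns a copy so callers can iterate without holding the lock.
    SortedSet<String> snapshot() {
        synchronized (names) {
            return new TreeSet<>(names);
        }
    }

    // Blocks while applying create/delete events to the cache;
    // intended to run on a background thread.
    void watch() throws IOException, InterruptedException {
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_DELETE);
            while (true) {
                WatchKey key = watcher.take();
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                        rescan(); // events were dropped, rebuild from disk
                        continue;
                    }
                    Path name = (Path) event.context();
                    if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                        if (Files.isDirectory(dir.resolve(name))) {
                            names.add(name.toString());
                        }
                    } else {
                        names.remove(name.toString()); // no-op for plain files
                    }
                }
                if (!key.reset()) {
                    return; // the directory is no longer accessible
                }
            }
        }
    }
}
```

Callers would read `snapshot()` instead of touching the disk, while a daemon thread runs `watch()`. The initial scan still pays the full cost once; after that, only changes are processed.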
This doesn't seem to be an option for you, but I once solved a similar problem by "hashing" the files into subdirectories. In my case, the challenge was to store a couple of million images with numeric IDs. I constructed the directory structure as follows:
images/[id - (id % 1000000)]/[id - (id % 1000)]/[id].jpg
This has worked well for us, and it's the solution I would recommend. You could do something similar for alphanumeric filenames by simply taking the first two letters of the filename as one directory level and the next two letters as the next. I've done that once as well, and it did the job.
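Both layouts can be sketched in a few lines. This is only an illustration: `ShardedPaths`, `forNumericId` and `forFilename` are invented names, the ID is assumed to be non-negative, and rejecting filenames shorter than four characters is an assumption of this sketch rather than part of the original scheme.

```java
// Hypothetical helper that builds the sharded paths described above.
class ShardedPaths {

    // images/[id - (id % 1000000)]/[id - (id % 1000)]/[id].jpg
    // Assumes id is non-negative.
    static String forNumericId(long id) {
        long millionBucket = id - (id % 1_000_000);
        long thousandBucket = id - (id % 1_000);
        return "images/" + millionBucket + "/" + thousandBucket + "/" + id + ".jpg";
    }

    // [root]/[first two characters]/[next two characters]/[filename]
    // Assumption: names shorter than four characters are rejected.
    static String forFilename(String root, String filename) {
        if (filename.length() < 4) {
            throw new IllegalArgumentException("filename too short to shard: " + filename);
        }
        return root + "/" + filename.substring(0, 2)
                + "/" + filename.substring(2, 4)
                + "/" + filename;
    }
}
```

For example, `ShardedPaths.forNumericId(1234567)` yields `images/1000000/1234000/1234567.jpg`, and `ShardedPaths.forFilename("files", "report.pdf")` yields `files/re/po/report.pdf`. With the numeric scheme, no leaf directory holds more than 1,000 images.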