Why does usage of java.nio.file.Files::list cause this breadth-first file traversal program to crash with a "Too many open files" error?
Problem description
Streams are lazy, hence the following statement does not load the entire children of the directory referenced by path into memory; instead it loads them one by one, and after each invocation of forEach, the directory referenced by p is eligible for garbage collection, so its file descriptor should also become closed:
Files.list(path).forEach(p ->
        absoluteFileNameQueue.add(
                p.toAbsolutePath().toString()
        )
);
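The assumption can be checked directly: on Linux, the process's open file descriptors are visible under /proc/self/fd, so one can count them before and after consuming an unclosed stream. This is a sketch, assuming a Linux /proc filesystem; the class and method names (FdLeakDemo, openFdCount, demo) are mine, not from the original program:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class FdLeakDemo {

    // Count this process's open file descriptors by listing /proc/self/fd
    // (Linux-specific; returns -1 elsewhere).
    static long openFdCount() {
        File[] fds = new File("/proc/self/fd").listFiles();
        return fds == null ? -1 : fds.length;
    }

    // Returns {before, afterConsuming, afterClosing}.
    static long[] demo() throws IOException {
        Path dir = Files.createTempDirectory("fd-demo");
        Files.createFile(dir.resolve("a.txt"));

        long before = openFdCount();

        // Files.list opens a DirectoryStream under the hood; fully consuming
        // the Stream does NOT release the directory's file descriptor.
        Stream<Path> children = Files.list(dir);
        children.forEach(p -> { });
        long afterConsuming = openFdCount();

        // Only close() (or try-with-resources) releases the descriptor.
        children.close();
        long afterClosing = openFdCount();

        return new long[] { before, afterConsuming, afterClosing };
    }

    public static void main(String[] args) throws IOException {
        long[] counts = demo();
        System.out.println("before = " + counts[0]
                + ", after consuming = " + counts[1]
                + ", after closing = " + counts[2]);
    }
}
```

On Linux this typically shows one extra descriptor after the stream is consumed, which disappears only after close() is called.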
Based on this assumption, I have implemented a breadth-first file traversal tool:
import static java.lang.Math.max;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayDeque;
import java.util.Queue;

public class FileSystemTraverser {

    public void traverse(String path) throws IOException {
        traverse(Paths.get(path));
    }

    public void traverse(Path root) throws IOException {
        final Queue<String> absoluteFileNameQueue = new ArrayDeque<>();
        absoluteFileNameQueue.add(root.toAbsolutePath().toString());

        int maxSize = 0;
        int count = 0;

        while (!absoluteFileNameQueue.isEmpty()) {
            maxSize = max(maxSize, absoluteFileNameQueue.size());
            count += 1;

            Path path = Paths.get(absoluteFileNameQueue.poll());

            if (Files.isDirectory(path)) {
                Files.list(path).forEach(p ->
                        absoluteFileNameQueue.add(
                                p.toAbsolutePath().toString()
                        )
                );
            }

            if (count % 10_000 == 0) {
                System.out.println("maxSize = " + maxSize);
                System.out.println("count = " + count);
            }
        }

        System.out.println("maxSize = " + maxSize);
        System.out.println("count = " + count);
    }
}
And I use it in a fairly straightforward way:
import java.io.IOException;

public class App {

    public static void main(String[] args) throws IOException {
        FileSystemTraverser traverser = new FileSystemTraverser();
        traverser.traverse("/media/Backup");
    }
}
The disk mounted in /media/Backup has about 3 million files.
For some reason, around the 140,000 mark, the program crashes with this stack trace:
Exception in thread "main" java.nio.file.FileSystemException: /media/Backup/Disk Images/Library/Containers/com.apple.photos.VideoConversionService/Data/Documents: Too many open files
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:427)
    at java.nio.file.Files.newDirectoryStream(Files.java:457)
    at java.nio.file.Files.list(Files.java:3451)
It seems to me that, for some reason, the file descriptors are not getting closed or the Path objects are not garbage collected, which eventually causes the app to crash.
- OS: Ubuntu 15.04
- Kernel: 4.4.0-28-generic
- ulimit: unlimited
- File system: btrfs
- Java runtime: tested with both OpenJDK 1.8.0_91 and Oracle JDK 1.8.0_91
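The open-descriptor count can also be watched from inside the JVM itself, without an external tool, via the HotSpot-specific com.sun.management MXBean. A sketch, assuming a HotSpot JVM on a Unix-like OS; the class and method names (FdMonitor, fdStats) are mine:

```java
import java.lang.management.ManagementFactory;

import com.sun.management.UnixOperatingSystemMXBean;

public class FdMonitor {

    // Returns {openFds, maxFds}, or null when the platform MXBean is not
    // the Unix HotSpot implementation (e.g. on Windows or other JVMs).
    static long[] fdStats() {
        Object os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                    unix.getOpenFileDescriptorCount(),
                    unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] stats = fdStats();
        if (stats != null) {
            System.out.println("open FDs = " + stats[0] + ", limit = " + stats[1]);
        }
    }
}
```

Printing these two numbers inside the traversal loop (for example, in the existing count % 10_000 block) would show the open-FD count climbing toward the limit as the leak accumulates.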
Any ideas what I am missing here and how I can fix this problem (without resorting to java.io.File::list, i.e. by staying within the realm of NIO2 and Paths)?
I suspected that the JVM was keeping the file descriptors open. I took this heap dump around the 120,000 file mark:
I installed a file descriptor probing plugin in VisualVM and indeed it revealed that the FDs are not getting disposed of (as correctly pointed out by cerebrotecnologico and k5):
Recommended answer
It seems the Stream returned from Files.list(Path) is not being closed correctly. In addition, you should not use forEach on a stream unless you are certain it is not parallel (hence the .sequential()).
try (Stream<Path> stream = Files.list(path)) {
    stream.map(p -> p.toAbsolutePath().toString())
          .sequential()
          .forEach(absoluteFileNameQueue::add);
}
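Folding this fix back into the question's traverser gives the sketch below. The class name FixedFileSystemTraverser is mine, and I have it return the visit count instead of printing maxSize/count, purely to make the behavior easy to check; the traversal logic is otherwise the same:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.stream.Stream;

public class FixedFileSystemTraverser {

    public int traverse(String path) throws IOException {
        return traverse(Paths.get(path));
    }

    // Breadth-first traversal; returns the number of entries visited
    // (returning the count instead of printing it is my change).
    public int traverse(Path root) throws IOException {
        final Queue<String> absoluteFileNameQueue = new ArrayDeque<>();
        absoluteFileNameQueue.add(root.toAbsolutePath().toString());

        int count = 0;
        while (!absoluteFileNameQueue.isEmpty()) {
            count += 1;
            Path path = Paths.get(absoluteFileNameQueue.poll());

            if (Files.isDirectory(path)) {
                // try-with-resources closes the underlying DirectoryStream,
                // releasing its file descriptor as soon as the listing is done.
                try (Stream<Path> children = Files.list(path)) {
                    children.map(p -> p.toAbsolutePath().toString())
                            .sequential()
                            .forEach(absoluteFileNameQueue::add);
                }
            }
        }
        return count;
    }
}
```

With the descriptor released after each directory listing, the number of simultaneously open FDs stays constant instead of growing with every directory visited, so the traversal no longer hits the per-process open-file limit.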