How to get Streams of File Attributes from the FileSystem?
Problem description
I am writing a web server and want it to be as efficient as possible by minimizing file system calls. The problem is that the methods that return Streams, such as java.nio.file.Files.list, return a Stream of Paths, and I would like a Stream of BasicFileAttributes, so that I can return the creation time and update time for each Path (say, when returning results for an LDP Container).
Of course, a simple solution would be to map each element of the Stream with a function that takes the path and returns a file attribute, (p: Path) => Files.getAttributeView..., but that sounds like it would make a call to the FS for each Path. That seems like a waste, because to get the file information the JDK can't have been far from the attribute info.
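For reference, the per-path approach described above could be sketched in Java roughly as follows. The class name NaiveAttributeListing and the method listWithAttributes are made up for illustration, and Files.readAttributes is used here instead of getAttributeView as an assumption about what the mapping function would do. Each readAttributes call is a separate request to the file system, which is the cost the question wants to avoid.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper illustrating the "map every Path" baseline.
class NaiveAttributeListing {

    // Lists the direct children of dir, then reads the attributes of each child.
    static List<Map.Entry<Path, BasicFileAttributes>> listWithAttributes(Path dir) {
        try (Stream<Path> children = Files.list(dir)) {
            return children
                    .map(p -> Map.entry(p, readAttributes(p)))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One extra file system request per path; links are not followed.
    private static BasicFileAttributes readAttributes(Path p) {
        try {
            return Files.readAttributes(p, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = Path.of(args.length > 0 ? args[0] : ".");
        for (Map.Entry<Path, BasicFileAttributes> e : listWithAttributes(dir)) {
            BasicFileAttributes a = e.getValue();
            System.out.println(e.getKey() + " created=" + a.creationTime()
                    + " modified=" + a.lastModifiedTime());
        }
    }
}
```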
I actually came across this mail from the 2009 OpenJDK mailing list, which states that they had discussed adding an API that would return a pair of a Path and Attributes...
I found a non-public class in the JDK, java.nio.file.FileTreeWalker, which has an API that would allow one to fetch the attributes: FileTreeWalker.Event. That actually makes use of sun.nio.fs.BasicFileAttributesHolder, which allows a Path to keep a cache of the attributes. But it's not public, and it is not clear where it works.
There is of course also the whole FileVisitor API, which has methods that are passed both a Path and BasicFileAttributes, as shown here:
public FileVisitResult visitFile(Path file, BasicFileAttributes attr) {...}
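As a rough illustration of that callback, the sketch below collects every visited path together with the attributes handed to visitFile. The class name VisitorAttributeCollector and its collect method are hypothetical. Note that the visitor is push-style: walkFileTree runs to completion before anything is returned, which is why it does not give a lazy, back-pressured Stream on its own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper: eagerly gathers (Path, BasicFileAttributes) pairs via a FileVisitor.
class VisitorAttributeCollector {

    static List<Map.Entry<Path, BasicFileAttributes>> collect(Path dir, int maxDepth) {
        List<Map.Entry<Path, BasicFileAttributes>> result = new ArrayList<>();
        try {
            Files.walkFileTree(dir, EnumSet.noneOf(FileVisitOption.class), maxDepth,
                    new SimpleFileVisitor<Path>() {
                        @Override
                        public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                            // The attributes arrive with the path; no extra lookup is made here.
                            result.add(Map.entry(file, attrs));
                            return FileVisitResult.CONTINUE;
                        }
                    });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        Path dir = Path.of(args.length > 0 ? args[0] : ".");
        // With maxDepth 1, directories directly under dir are also reported to visitFile.
        collect(dir, 1).forEach(e ->
                System.out.println(e.getKey() + " modified=" + e.getValue().lastModifiedTime()));
    }
}
```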
So I am looking for a way to turn that into a Stream that respects the principle of back pressure from the Reactive Manifesto, which was pushed by Akka, without hogging too many resources. I checked the open source Alpakka File project, but that also streams the Files methods that return Paths ...
Recommended answer
You can access file attributes together with their path by using Files.find, which accepts a BiPredicate<Path, BasicFileAttributes>, and storing the value as it tests each path.
The side-effect action inside the BiPredicate enables operations on both objects without needing to touch the file system again for each item. Given your predicate condition yourPred, the side-effect predicate below collects the attributes for you to retrieve inside the stream processing:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashMap;
import java.util.function.BiPredicate;

public static void main(String[] args) throws IOException {
    Path dir = Path.of(args[0]);
    // Use `ConcurrentHashMap` if using `stream.parallel()`
    HashMap<Path, BasicFileAttributes> attrs = new HashMap<>();
    BiPredicate<Path, BasicFileAttributes> yourPred = (p, a) -> true;
    // Side effect: remember the attributes of every path that passes yourPred
    BiPredicate<Path, BasicFileAttributes> predicate = (p, a) -> {
        return yourPred.test(p, a)
            // && p.getNameCount() == dir.getNameCount()+1 // Simulates Files.list
            && attrs.put(p, a) == null;
    };
    try (var stream = Files.find(dir, Integer.MAX_VALUE, predicate)) {
        stream.forEach(p -> System.out.println(p.toString() + " => " + attrs.get(p)));
        // Or: if you put all your handling code in the predicate, use stream.count();
    }
}
To simulate the effect of Files.list, use a one-level find scanner:
BiPredicate<Path, BasicFileAttributes> yourPred = (p,a) -> p.getNameCount() == dir.getNameCount()+1;
For a large folder scan, you should clean up the attrs map as you go by calling attrs.remove(p); after consuming each path.
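A minimal sketch combining both points, under a few assumptions: the class name OneLevelAttributeScan and the method describeChildren are invented for illustration, and the one-level scan is done by passing a maxDepth of 1 to Files.find and excluding the start directory in the predicate, rather than comparing name counts. Each entry is removed from the map as soon as it is consumed, so with a sequential stream the map should stay small.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: one-level scan that drops each cached attribute after use.
class OneLevelAttributeScan {

    static List<String> describeChildren(Path dir) {
        // Sequential stream only; switch to ConcurrentHashMap for parallel streams.
        Map<Path, BasicFileAttributes> attrs = new HashMap<>();
        // Files.find also offers the start directory itself, so it is filtered out here.
        BiPredicate<Path, BasicFileAttributes> predicate =
                (p, a) -> !p.equals(dir) && attrs.put(p, a) == null;
        try (Stream<Path> stream = Files.find(dir, 1, predicate)) {
            return stream.map(p -> {
                // remove() hands back the cached attributes and frees the map entry.
                BasicFileAttributes a = attrs.remove(p);
                return p.getFileName() + " modified=" + a.lastModifiedTime();
            }).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = Path.of(args.length > 0 ? args[0] : ".");
        describeChildren(dir).forEach(System.out::println);
    }
}
```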
Edit
The answer above can be refactored into a 3-line call returning a stream of Map.Entry<Path, BasicFileAttributes>, or it's easy to add a class/record to hold the Path/BasicFileAttributes pair and return Stream<PathInfo> instead:
/**
 * Call Files.find() returning a stream with both Path+BasicFileAttributes
 * as type Map.Entry<Path, BasicFileAttributes>
 * <p>Could declare a specific record to replace Map.Entry as:
 *    record PathInfo(Path path, BasicFileAttributes attr) { };
 */
public static Stream<Map.Entry<Path, BasicFileAttributes>>
find(Path dir, int maxDepth, BiPredicate<Path, BasicFileAttributes> matcher, FileVisitOption... options) throws IOException {
    HashMap<Path, BasicFileAttributes> attrs = new HashMap<>();
    BiPredicate<Path, BasicFileAttributes> predicate = (p, a) -> (matcher == null || matcher.test(p, a)) && attrs.put(p, a) == null;
    return Files.find(dir, maxDepth, predicate, options).map(p -> Map.entry(p, attrs.remove(p)));
}
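The record variant mentioned in the comment could look something like the sketch below. It assumes Java 16 or later for records. The enclosing class name PathInfoFinder is hypothetical, and wrapping IOException in UncheckedIOException is a choice made for this example rather than part of the original answer. The caller is still responsible for closing the returned stream.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitOption;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.stream.Stream;

// Hypothetical wrapper class around the find() refactoring, returning Stream<PathInfo>.
class PathInfoFinder {

    // Pair of a path and the attributes Files.find() already read for it.
    record PathInfo(Path path, BasicFileAttributes attr) { }

    static Stream<PathInfo> find(Path dir, int maxDepth,
                                 BiPredicate<Path, BasicFileAttributes> matcher,
                                 FileVisitOption... options) {
        // Sequential use only; the map holds an entry just until map() consumes it.
        Map<Path, BasicFileAttributes> attrs = new HashMap<>();
        BiPredicate<Path, BasicFileAttributes> predicate =
                (p, a) -> (matcher == null || matcher.test(p, a)) && attrs.put(p, a) == null;
        try {
            return Files.find(dir, maxDepth, predicate, options)
                    .map(p -> new PathInfo(p, attrs.remove(p)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = Path.of(args.length > 0 ? args[0] : ".");
        // try-with-resources closes the underlying directory handles.
        try (Stream<PathInfo> infos = find(dir, 1, (p, a) -> !p.equals(dir))) {
            infos.forEach(i -> System.out.println(i.path()
                    + " created=" + i.attr().creationTime()
                    + " modified=" + i.attr().lastModifiedTime()));
        }
    }
}
```

Because the stream stays lazy, a consumer that pulls elements slowly also slows the directory traversal, which is closer to the back-pressure behaviour asked about than an eager FileVisitor walk.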