How to efficiently manage files on a filesystem in Java?


Question



    I am creating a few JAX-WS endpoints, and I want to save the received and sent messages for later inspection. To do this, I plan to save the messages (XML files) to the filesystem in some sensible hierarchy. There will be hundreds, even thousands, of files per day. I also need to store metadata for each file.

    I am considering putting the metadata (just a couple of fields) into a database table, but keeping the XML content itself in files on the filesystem, so as not to bloat the database with content data that is seldom read.

    Is there some simple library that helps me with saving, loading, deleting, etc. the files? It's not that tricky to implement myself, but I wonder if there are existing solutions. Just a simple library that already provides easy access to the filesystem (preferably across different operating systems).

    Or do I even need that, should I just go with raw/custom Java?

    Solution

    Is there some simple library that helps me with saving, loading, deleting, etc. the files? It's not that tricky to implement myself, but I wonder if there are existing solutions. Just a simple library that already provides easy access to the filesystem (preferably across different operating systems).

    Java API

    Well, if what you need to do is really simple, you should be able to achieve your goal with java.io.File (delete, check existence, read, write, etc.) and a few stream manipulations with FileInputStream and FileOutputStream.
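As a rough illustration of the plain java.io approach described above, the following sketch saves, loads, and deletes an XML message. The class name MessageFiles and its method names are hypothetical, and UTF-8 is assumed as the message encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; class and method names are illustrative only.
final class MessageFiles {

    // Writes the XML text to the target file, creating parent directories if needed.
    static void save(File target, String xml) throws IOException {
        File parent = target.getParentFile();
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("Cannot create directory " + parent);
        }
        try (OutputStream out = new FileOutputStream(target)) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Reads the whole file back as a UTF-8 string.
    static String load(File source) throws IOException {
        try (InputStream in = new FileInputStream(source);
             ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Returns true only if the file existed and was actually deleted.
    static boolean delete(File target) {
        return target.exists() && target.delete();
    }
}
```

The try-with-resources blocks close the streams even when a write or read fails, which matters when thousands of files are handled per day.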

    You can also throw in Apache commons-io and its handy FileUtils for a few more utility functions.

    Java is independent of the OS. You just need to make sure you use File.separator (not File.pathSeparator, which separates the entries of a path list such as the classpath), or use the constructor File(File parent, String child) so that you don't need to mention the separator explicitly.

    The Java file API is relatively high-level, in order to abstract away the differences between operating systems. Most of the time it is sufficient. Its shortcomings show only if you need some relatively OS-specific feature that is not in the API, e.g. checking the physical size of a file on disk (not the logical size), security rights on *nix, free space/quota of the hard drive, etc. (Some of these gaps were filled later: File.getUsableSpace arrived in Java 6, and java.nio.file in Java 7 exposes POSIX permissions.)
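A minimal sketch of building a storage path with the File(File parent, String child) constructor, so that no separator appears in the code. The layout root/endpoint/yyyy/MM/dd/messageId.xml is only an assumption about what a "sensible hierarchy" might look like, and MessagePaths.locate is a hypothetical name.

```java
import java.io.File;
import java.util.Locale;

// Hypothetical layout: <root>/<endpoint>/<yyyy>/<MM>/<dd>/<messageId>.xml
final class MessagePaths {

    // Builds the file location one path segment at a time; File inserts the
    // platform separator itself, so none is hard-coded here.
    static File locate(File root, String endpoint, int year, int month, int day, String messageId) {
        File dir = new File(root, endpoint);
        dir = new File(dir, String.format(Locale.ROOT, "%04d", year));
        dir = new File(dir, String.format(Locale.ROOT, "%02d", month));
        dir = new File(dir, String.format(Locale.ROOT, "%02d", day));
        return new File(dir, messageId + ".xml");
    }
}
```

Splitting by date keeps each directory down to one day's worth of files, which is a common way to avoid very large directories.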

    Most operating systems keep an internal buffer for file reads and writes. Calling FileOutputStream.write and FileOutputStream.flush ensures that the data has been handed to the OS, but not necessarily that it has been written to disk. The Java API also supports the low-level integration needed to manage these buffering issues (example here), which matters for systems such as databases.

    Also, both files and directories are abstracted as File, so you need to check with isDirectory. This can be confusing, for instance if you have one file x and one directory /x (I don't remember exactly how to handle this issue, but there is a way).
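One way to force buffered data down to the storage device is FileDescriptor.sync(), shown in the sketch below; FileChannel.force(true) is an alternative. The class name DurableWrite is hypothetical, and this is not necessarily the example the answer originally linked to.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical helper illustrating an explicit sync after a write.
final class DurableWrite {

    // Writes the bytes, then asks the OS to push them to the storage device
    // before the method returns.
    static void writeAndSync(File target, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(target)) {
            out.write(data);
            out.flush();         // hands stream-level buffers to the OS
            out.getFD().sync();  // blocks until the OS reports the data as written
        }
    }
}
```

Syncing every message costs throughput, so it is usually reserved for data that must survive a crash or power loss.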

    Web service

    The web service can either pass the data as xs:base64Binary or use MTOM (Message Transmission Optimization Mechanism) if the files are large.
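To see why size matters here: Base64 encodes every 3 bytes as 4 characters, so an inline xs:base64Binary payload grows by about a third. The sketch below measures this with java.util.Base64 (Java 8 or later); the class name Base64Overhead is hypothetical.

```java
import java.util.Base64;

// Hypothetical helper that measures Base64 expansion for a given payload size.
final class Base64Overhead {

    // Returns the number of characters Base64 produces for a payload of this many bytes.
    static int encodedLength(int payloadBytes) {
        return Base64.getEncoder().encodeToString(new byte[payloadBytes]).length();
    }
}
```

For example, a 3000-byte payload encodes to 4000 characters, whereas MTOM sends the binary content as an attachment without that expansion.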

    Transactions

    Note that the database is transactional and the file system is not. So you might have to add a few checks in case operations fail and are retried.

    You could go with a complicated design involving some form of distributed transaction (see this answer), or try to go with a simpler design that provides the level of robustness that you need. A possible design could be:

    • Update. If the user wants to overwrite a file, you actually create a new one. The indirection between the logical file name and the physical file is stored in the database. This way you never overwrite a physical file once it is written, which keeps rollback consistent.
    • Create. Same story when the user wants to create a file.
    • Delete. If the user wants to delete a file, you do it only in the database at first. A periodic job polls the file system to identify files that are not listed in the database and removes them. This two-phase delete ensures that the delete operation can be rolled back.
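The three operations above can be sketched as follows. This is only an illustration under stated assumptions: an in-memory map stands in for the database table that links logical names to physical files, and the class name WriteOnceStore and its methods are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: the map stands in for the database table that maps
// logical names to physical file names.
final class WriteOnceStore {
    private final File root;
    private final Map<String, String> index = new ConcurrentHashMap<>();

    WriteOnceStore(File root) {
        this.root = root;
    }

    // Create and update are the same operation: write a brand-new physical
    // file, then repoint the logical name. Existing files are never overwritten.
    void put(String logicalName, String xml) throws IOException {
        String physicalName = UUID.randomUUID() + ".xml";
        Files.write(new File(root, physicalName).toPath(), xml.getBytes(StandardCharsets.UTF_8));
        index.put(logicalName, physicalName); // in a real system: part of the DB transaction
    }

    // Returns the current content for the logical name, or null if unknown.
    String get(String logicalName) throws IOException {
        String physicalName = index.get(logicalName);
        if (physicalName == null) {
            return null;
        }
        byte[] bytes = Files.readAllBytes(new File(root, physicalName).toPath());
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Phase one of delete: only the index entry is removed.
    void delete(String logicalName) {
        index.remove(logicalName);
    }

    // Phase two: the periodic job removes files no index entry refers to.
    // Returns the number of files removed.
    int sweep() {
        Set<String> live = new HashSet<>(index.values());
        File[] files = root.listFiles();
        int removed = 0;
        if (files != null) {
            for (File f : files) {
                if (f.isFile() && !live.contains(f.getName()) && f.delete()) {
                    removed++;
                }
            }
        }
        return removed;
    }
}
```

A real sweeper would also need to skip recently written files, for example by checking their age, so that it does not delete a file whose database transaction has not committed yet.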

    This is not as robust as writing a BLOB to a real transactional database, but it provides some robustness. Otherwise, you could have a look at commons-transaction, but I feel like the project is dead (2007).
