How can I safely encode a string in Java to use as a filename?

Question

I'm receiving a string from an external process. I want to use that String to make a filename, and then write to that file. Here's my code snippet to do this:

    String s = ... // comes from external source
    File currentFile = new File(System.getProperty("user.home"), s);
    PrintWriter currentWriter = new PrintWriter(currentFile);

If s contains an invalid character, such as '/' on a Unix-based OS, then a java.io.FileNotFoundException is (rightly) thrown.

How can I safely encode the String so that it can be used as a filename?

What I'm hoping for is an API call that does this for me.

I could do this:

    String s = ... // comes from external source
    File currentFile = new File(System.getProperty("user.home"), URLEncoder.encode(s, "UTF-8"));
    PrintWriter currentWriter = new PrintWriter(currentFile);

But I'm not sure whether URLEncoder is reliable for this purpose.

Recommended answer

If you want the result to resemble the original string, SHA-1 or any other hashing scheme is not the answer. If collisions must be avoided, then simple replacement or removal of "bad" characters is not the answer either.

Instead you want something like this. (Note: this should be treated as an illustrative example, not something to copy and paste.)

    char fileSep = '/'; // ... or do this portably, e.g. with File.separatorChar.
    char escape = '%'; // ... or some other legal char.
    String s = ...
    int len = s.length();
    StringBuilder sb = new StringBuilder(len);
    for (int i = 0; i < len; i++) {
        char ch = s.charAt(i);
        if (ch < ' ' || ch >= 0x7F || ch == fileSep || ... // add other illegal chars
            || (ch == '.' && i == 0) // we don't want to collide with "." or ".."!
            || ch == escape) {
            // Write the character as the escape char followed by two hex digits.
            sb.append(escape);
            if (ch < 0x10) {
                sb.append('0'); // pad single-digit values to two digits
            }
            sb.append(Integer.toHexString(ch));
        } else {
            sb.append(ch);
        }
    }
    File currentFile = new File(System.getProperty("user.home"), sb.toString());
    PrintWriter currentWriter = new PrintWriter(currentFile);

This solution gives a reversible encoding (with no collisions) where the encoded strings resemble the original strings in most cases. I'm assuming that you are using 8-bit characters: a char above 0xFF would produce more than two hex digits, so the escape sequences would no longer have a fixed width and could not be decoded unambiguously.
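For reference, here is a minimal compilable sketch of the same escaping idea. The class name `FileNameEscaper`, the method name `escape`, and the `ILLEGAL` character set are assumptions made for illustration and are not part of the original answer. The sketch rejects characters above 0xFF instead of silently producing ambiguous output, which is one way to enforce the 8-bit assumption.

```java
final class FileNameEscaper {
    private static final char ESCAPE = '%';

    // Assumption: a conservative set of characters that are illegal on
    // common file systems. Adjust this for the platforms you target.
    private static final String ILLEGAL = "/\\:*?\"<>|";

    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch > 0xFF) {
                // Two hex digits cannot represent this character.
                throw new IllegalArgumentException(
                        "Character above 0xFF at index " + i);
            }
            boolean mustEscape = ch < ' ' || ch >= 0x7F
                    || ILLEGAL.indexOf(ch) >= 0
                    || (ch == '.' && i == 0)   // avoid "." and ".."
                    || ch == ESCAPE;           // keeps the encoding reversible
            if (mustEscape) {
                sb.append(ESCAPE);
                if (ch < 0x10) {
                    sb.append('0');            // pad to two hex digits
                }
                sb.append(Integer.toHexString(ch));
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }
}
```

With this sketch, `FileNameEscaper.escape("a/b")` yields `a%2fb`, and a leading dot as in `.hidden` becomes `%2ehidden`.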

URLEncoder works, but it has the disadvantage that it encodes a whole lot of legal file name characters.
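A small demonstration of that trade-off, as a sketch: the wrapper class `UrlEncoderDemo` is a hypothetical name, while `URLEncoder.encode(String, String)` is the standard `java.net` call used in the question. Spaces and parentheses are legal in file names on most systems, yet they are rewritten. Note also that URLEncoder leaves `.` and `*` unchanged, so a caller may still want to handle names such as `.` and `..` separately.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class UrlEncoderDemo {
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("my file (1).txt")); // my+file+%281%29.txt
        System.out.println(encode(".."));              // ..
    }
}
```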

If you want a not-guaranteed-to-be-reversible solution, then simply remove the "bad" characters rather than replacing them with escape sequences.
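As a rough sketch of the removal approach (the class name `FileNameStripper` and the choice of which characters count as "bad" are assumptions for illustration), the following drops control characters, non-ASCII characters, and '/'. It also shows why this variant can collide: two different inputs may map to the same name.

```java
final class FileNameStripper {
    static String strip(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            // Assumption: "bad" means control chars, non-ASCII, or '/'.
            boolean bad = ch < ' ' || ch >= 0x7F || ch == '/';
            if (!bad) {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Both inputs produce "ab", so the original cannot be recovered.
        System.out.println(strip("a/b"));
        System.out.println(strip("ab"));
    }
}
```

The result can also be empty, or be `.` or `..`, so a caller would likely need to check for those cases before creating the file.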

The reverse of the above encoding should be equally straightforward to implement.
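One possible shape for that reverse step, as a sketch: it assumes the escape character is '%' and that every escape is followed by exactly two hex digits, as in the encoding above. The class name `FileNameUnescaper` and the method name `unescape` are hypothetical.

```java
final class FileNameUnescaper {
    private static final char ESCAPE = '%';

    static String unescape(String encoded) {
        StringBuilder sb = new StringBuilder(encoded.length());
        int i = 0;
        while (i < encoded.length()) {
            char ch = encoded.charAt(i);
            if (ch != ESCAPE) {
                sb.append(ch);
                i++;
                continue;
            }
            // An escape must be followed by two hex digits.
            if (i + 2 >= encoded.length()) {
                throw new IllegalArgumentException(
                        "Truncated escape at index " + i);
            }
            int hi = Character.digit(encoded.charAt(i + 1), 16);
            int lo = Character.digit(encoded.charAt(i + 2), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException(
                        "Invalid escape at index " + i);
            }
            sb.append((char) (hi * 16 + lo));
            i += 3;
        }
        return sb.toString();
    }
}
```

For example, `FileNameUnescaper.unescape("a%2fb")` returns `a/b`, and a name without any escapes is returned unchanged.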
