File name corruption on file download (IE)
Question
I have implemented a simple file upload/download mechanism. When a user clicks a file name, the file is downloaded with these HTTP headers:
HTTP/1.1 200 OK
Date: Tue, 30 Sep 2008 14:00:39 GMT
Server: Microsoft-IIS/6.0
Content-Disposition: attachment; filename=filename.doc;
Content-Type: application/octet-stream
Content-Length: 10754
I also support Japanese file names. To do that, I encode the file name with this Java method:
private String encodeFileName(String name) throws Exception {
    String agent = request.getHeader("USER-AGENT");
    if (agent != null && agent.indexOf("MSIE") != -1) { // is IE
        StringBuilder res = new StringBuilder();
        char[] chArr = name.toCharArray();
        for (int j = 0; j < chArr.length; j++) {
            if (chArr[j] < 128) { // plain ASCII char
                // every dot except the last one (the extension separator) is escaped
                if (chArr[j] == '.' && j != name.lastIndexOf("."))
                    res.append("%2E");
                else
                    res.append(chArr[j]);
            } else { // non-ASCII char
                byte[] byteArr = name.substring(j, j + 1).getBytes("UTF8");
                for (int i = 0; i < byteArr.length; i++) {
                    // byte must be converted to unsigned int
                    res.append("%").append(Integer.toHexString(byteArr[i] & 0xFF));
                }
            }
        }
        return res.toString();
    }
    // Firefox/Mozilla
    return MimeUtility.encodeText(name, "UTF8", "B");
}
This worked well until someone found that it fails with long file names. For example: あああああああああああああああ2008.10.1あ.doc. If I change one of the single-byte dots to a single-byte underscore, or if I remove the first character, it works. That is, the failure depends on the length of the name and on the URL-encoding of the dot character.
A few examples follow.
This is broken (あああああああああああああああ2008.10.1あ.doc):
Content-Disposition: attachment; filename=%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%822008%2E10%2E1%e3%81%82.doc;
This is OK (あああああああああああああああ2008_10.1あ.doc):
Content-Disposition: attachment; filename=%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%822008_10%2E1%e3%81%82.doc;
This is also fine (first character removed: ああああああああああああああ2008.10.1あ.doc):
Content-Disposition: attachment; filename=%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%822008%2E10%2E1%e3%81%82.doc;
Does anyone know what is going on?
Recommended answer
Gmail handles file name escaping somewhat differently: the file name is enclosed in double quotes, and single-byte periods are not URL-escaped. With that encoding, the long file name in the question works.
Content-Disposition: attachment; filename="%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%822008.10.1%E3%81%82.doc"
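The header above could be produced along these lines. This is only a minimal sketch of the quoting approach described in the answer, not Gmail's actual code: the class and method names (QuotedFileNameEncoder, encodeQuoted, contentDisposition) are made up for illustration, and the decision to also percent-encode the double quote, backslash, percent sign and control characters is an assumption added here so that the quoted value stays well-formed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, named for illustration only.
final class QuotedFileNameEncoder {

    // Percent-encodes the UTF-8 bytes of non-ASCII characters and leaves
    // printable ASCII, including every '.', unchanged.
    static String encodeQuoted(String name) {
        StringBuilder res = new StringBuilder();
        for (byte b : name.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF; // treat the byte as unsigned
            // Assumption: '"', '\\', '%' and control characters are escaped too,
            // so they cannot break the quoted string or be mistaken for an escape.
            boolean plain = v >= 0x20 && v < 0x7F && v != '"' && v != '\\' && v != '%';
            if (plain) {
                res.append((char) v);
            } else {
                res.append(String.format("%%%02X", v)); // always two hex digits
            }
        }
        return res.toString();
    }

    // Builds the header value with the file name in double quotes.
    static String contentDisposition(String name) {
        return "attachment; filename=\"" + encodeQuoted(name) + "\"";
    }

    public static void main(String[] args) {
        // \u3042 is the hiragana character used in the question's examples
        System.out.println(contentDisposition("\u30422008.10.1\u3042.doc"));
    }
}
```

Iterating over the UTF-8 bytes rather than over char values also keeps characters outside the Basic Multilingual Plane intact, which the per-char loop in the question would not.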
However, there is still a limit (apparently IE-only) on the byte length of the file name (a bug, I assume). Even if the file name consists only of single-byte characters, the beginning of the file name is truncated when the name is too long. The limit is around 160 bytes.
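One possible way to stay under such a limit is to shorten the name on the server before sending it, so that the browser does not cut off the beginning. The sketch below is an illustration under several assumptions: the class and method names are hypothetical, the budget of 150 is an arbitrary margin below the "around 160 bytes" reported above, and the limit is assumed to apply to the percent-encoded form with ASCII characters sent as-is. None of this was verified against IE.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, named for illustration only.
final class FileNameLengthLimiter {

    // Assumption: a safety margin below the roughly 160-byte limit.
    static final int MAX_ENCODED_LENGTH = 150;

    // Length of the name once every non-ASCII UTF-8 byte is written as %XX.
    static int encodedLength(String s) {
        int len = 0;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            len += (b & 0xFF) < 128 ? 1 : 3;
        }
        return len;
    }

    // Trims characters from the end of the base name, keeping the extension,
    // until the encoded name fits into maxEncodedLength.
    static String shorten(String name, int maxEncodedLength) {
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        while (base.length() > 0 && encodedLength(base + ext) > maxEncodedLength) {
            // step back one whole code point so surrogate pairs are never split
            int end = base.offsetByCodePoints(base.length(), -1);
            base = base.substring(0, end);
        }
        return base + ext;
    }

    public static void main(String[] args) {
        System.out.println(shorten("short.doc", MAX_ENCODED_LENGTH));
    }
}
```

Trimming from the end keeps the start of the name, which is usually the most recognizable part, and preserves the extension so the file still opens with the right application.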