How to set UTF-8 for file upload in Java?
Question
I have the function below to read uploaded files:
public static Map<Integer, Map<String, byte[]>> getFiles(IMultipartBody bimp) {
    List<IAttachment> parts = bimp.getAllAttachments();
    Iterator<IAttachment> it = parts.iterator();
    ByteArrayOutputStream baos = null;
    InputStream inputStream = null;
    String fileName = null;
    byte[] bytes = null;
    Map<Integer, Map<String, byte[]>> files = new HashMap<Integer, Map<String, byte[]>>();
    Map<String, String> duplicateFileMap = new HashMap<String, String>();
    int counter = 0;
    while (it.hasNext()) {
        try {
            IAttachment name = (IAttachment) it.next();
            MultivaluedMap<String, String> headers = name.getHeaders();
            if (headers.get("Content-Disposition") != null
                    && !headers.get("Content-Disposition").isEmpty()) {
                String header = headers.get("Content-Disposition").get(0);
                String[] dispositions = header.split(";");
                for (String disposition : dispositions) {
                    if (disposition.indexOf("filename") != -1) {
                        String tmpStr = disposition.substring(
                                disposition.indexOf("=") + 1,
                                disposition.length()).replaceAll("\"",
                                Constant.EMPTY);
                        ByteBuffer byteBuffs = StandardCharsets.UTF_8.encode(tmpStr);
                        fileName = StandardCharsets.UTF_8.decode(byteBuffs).toString();
                        // fileName = new String(tmpStr.getBytes(), Charset.forName("UTF-8"));
                    }
                }
            }
            inputStream = name.getDataHandler().getInputStream();
            baos = new ByteArrayOutputStream();
            int reads = inputStream.read();
            while (reads != -1) {
                baos.write(reads);
                reads = inputStream.read();
            }
            bytes = baos.toByteArray();
            if (bytes == null || bytes.length < 1) {
                continue;
            }
            Map<String, byte[]> file = new HashMap<String, byte[]>();
            if (fileName != null) {
                // Fix for firefox, remove '/'
                if (fileName.startsWith("/")) {
                    fileName = fileName.substring(1);
                }
                // Fix for IE, remove physical address, only get file name
                if (fileName.lastIndexOf("\\") != -1) {
                    fileName = fileName.substring(fileName.lastIndexOf("\\") + 1);
                }
            }
            String md5 = generateMD5CheckSum(bytes);
            if (duplicateFileMap.containsKey(md5)
                    && duplicateFileMap.get(md5).equalsIgnoreCase(fileName)) {
                continue;
            }
            counter++;
            file.put(fileName, bytes);
            duplicateFileMap.put(md5, fileName);
            files.put(Integer.valueOf(counter), file);
        } catch (IOException e) {
            e.printStackTrace();
            LOGGER.error(e.getMessage());
        } finally {
            try {
                if (inputStream != null) {
                    inputStream.close();
                }
                if (baos != null) {
                    baos.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
                LOGGER.error(e.getMessage());
            }
        }
    }
    return files;
}
But when I debug an upload of a file named ALMS_ขั้นตอนลงทะเบียน.pdf (the name is in Thai), the attachment headers look like this:
{Content-Disposition=[form-data; name="file"; filename="ALMS_ขั้นตà¸à¸™à¸¥à¸‡à¸—ะเบียน.pdf"], Content-Type=[application/pdf], Content-ID=[root.message@cxf.apache.org]}
I think the IMultipartBody is not set to UTF-8 before the upload. Can anyone help me resolve this problem? Thanks.
Recommended answer
Use of the Content-Disposition header is covered by RFC 6266.
The filename attribute must be encoded in ISO-8859-1. Other charsets can be supported by using the same attribute name followed by an asterisk, filename*, with a URL-encoded (percent-encoded) filename.
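Because a plain filename value is read as ISO-8859-1, a name that the client actually sent as UTF-8 bytes shows up as mojibake, much like the header in the question. A minimal sketch of one possible server-side workaround follows. It assumes the header really was decoded as ISO-8859-1 (which maps every byte to a character without loss) and that the client really sent UTF-8; the class and method names are hypothetical, and this is not part of any CXF or Wink API.

```java
import java.nio.charset.StandardCharsets;

class FilenameRedecode {

    // Turns the mis-decoded string back into the raw bytes it came from,
    // then decodes those bytes as UTF-8.
    // Assumption: the string was produced by decoding UTF-8 bytes as ISO-8859-1.
    static String redecodeAsUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Thai characters written as escapes so the source file encoding does not matter.
        String original = "ALMS_\u0E02\u0E31\u0E49\u0E19.pdf";

        // Simulate what happens when UTF-8 bytes are read as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println(redecodeAsUtf8(misdecoded).equals(original));
    }
}
```

If the framework decoded the header with a different charset, or already replaced unmappable bytes, the original bytes cannot be recovered this way.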
See the examples in section 5 of the RFC, which include the filename "€ rates" (euro rates) encoded in UTF-8:
filename*=UTF-8''%e2%82%ac%20rates
Yes, that's a weird notation, not a typo: the original attribute name is followed by an asterisk, and the value starts with the encoding (UTF-8), followed by two single quotes (an optional language tag may appear between them), and then the URL-encoded filename (note that this is path encoding, not parameter encoding: spaces are replaced by %20, not +).
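The notation can be produced and parsed with the standard library alone. The sketch below is illustrative rather than a complete RFC 5987 implementation: the class and method names are hypothetical, and it relies on java.net.URLEncoder, which performs form encoding, so the + it emits for spaces is rewritten to %20 and the * it leaves untouched is escaped by hand.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class ExtendedFilename {

    // Builds a filename* parameter such as filename*=UTF-8''%E2%82%AC%20rates
    static String encode(String fileName) {
        try {
            String pct = URLEncoder.encode(fileName, "UTF-8")
                    .replace("+", "%20")   // form encoding uses '+', path encoding uses %20
                    .replace("*", "%2A");  // URLEncoder leaves '*' as-is; escape it as well
            return "filename*=UTF-8''" + pct;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Decodes the value part, e.g. UTF-8''%e2%82%ac%20rates
    static String decodeValue(String extValue) {
        int first = extValue.indexOf('\'');
        int second = extValue.indexOf('\'', first + 1);
        if (first < 0 || second < 0) {
            throw new IllegalArgumentException("expected charset'language'value");
        }
        String charset = extValue.substring(0, first);
        // URLDecoder would turn a literal '+' into a space, so protect it first.
        String encoded = extValue.substring(second + 1).replace("+", "%2B");
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("\u20AC rates"));
        System.out.println(decodeValue("UTF-8''%e2%82%ac%20rates"));
    }
}
```

URLEncoder writes hexadecimal digits in upper case while the RFC example uses lower case; percent-encoded digits are case-insensitive, so both forms decode to the same name.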