Parse a String containing multipart/form-data request body in Java
Problem description
I think the title says it all: I'm looking for a way to parse a String containing the body of a multipart/form-data HTTP request. That is, the contents of the string would look something like this:
--xyzseparator-blah
Content-Disposition: form-data; name="param1"

hello, world
--xyzseparator-blah
Content-Disposition: form-data; name="param2"

42
--xyzseparator-blah
Content-Disposition: form-data; name="param3"

blah, blah, blah
--xyzseparator-blah--
What I'm hoping to obtain is a parameters map, or something similar, like this:
parameters.get("param1"); // returns "hello, world"
parameters.get("param2"); // returns "42"
parameters.get("param3"); // returns "blah, blah, blah"
parameters.keys(); // returns ["param1", "param2", "param3"]
Further criteria
- It would be best if I don't have to supply the separator (i.e. xyzseparator-blah in this case), but I can live with it if I do have to.
- I'm looking for a library-based solution, possibly from a mainstream library (like "Apache Commons" or something similar).
- I want to avoid rolling my own solution, but at the current stage, I'm afraid I will have to. Reason: while the example above seems trivial to split and parse with some string manipulation, real multipart request bodies can have many more headers. Besides that, I do not want to re-invent (much less re-test!) the wheel :)
If there were a solution that satisfies the above criteria but whose input is an Apache HttpRequest instead of a String, that would be acceptable too.
(Basically I do receive an HttpRequest, but the in-house library I'm using is built such that it extracts the body of this request as a String and passes that to the class responsible for doing the parsing. However, if need be, I could also work directly on the HttpRequest.)
No matter how I try to find an answer, through Google, here on SO, or on other forums, the solution always seems to be to use Commons FileUpload to go through the parts (several existing answers suggest this).
However, the parseRequest method used in that solution expects a RequestContext, which I do not have (only an HttpRequest).
The other way, also mentioned in some of those answers, is getting the parameters from the HttpServletRequest (but again, I only have an HttpRequest).
EDIT: In other words: I could include Commons FileUpload (I have access to it), but that would not help me, because I have an HttpRequest and Commons FileUpload needs a RequestContext. (Unless there is an easy way to convert an HttpRequest to a RequestContext that I have overlooked.)
Recommended answer
You can parse your String using Commons FileUpload by wrapping it in a class implementing 'org.apache.commons.fileupload.UploadContext', like below.
I recommend wrapping the HttpRequest, as in your proposed alternate solution, instead though, for a couple of reasons. First, using a String means that the whole multipart POST body, including any file contents, needs to fit into memory. Wrapping the HttpRequest would allow you to stream it, with only a small buffer in memory at any one time. Second, without the HttpRequest, you'll need to sniff out the multipart boundary, which would normally be in the 'Content-Type' header (see RFC 1867).
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.commons.fileupload.FileItem;
import org.apache.commons.fileupload.FileItemFactory;
import org.apache.commons.fileupload.FileUpload;
import org.apache.commons.fileupload.UploadContext;
import org.apache.commons.fileupload.disk.DiskFileItemFactory;

public class MultiPartStringParser implements UploadContext {

    public static void main(String[] args) throws Exception {
        // Read the raw multipart body from the file named by the first argument.
        String s = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        MultiPartStringParser p = new MultiPartStringParser(s);
        for (String key : p.parameters.keySet()) {
            System.out.println(key + "=" + p.parameters.get(key));
        }
    }

    private final byte[] postBodyBytes;
    private final String boundary;
    private final Map<String, String> parameters = new HashMap<String, String>();

    public MultiPartStringParser(String postBody) throws Exception {
        // Encode once so the input stream and the reported length agree.
        this.postBodyBytes = postBody.getBytes(StandardCharsets.UTF_8);
        // Sniff out the multipart boundary: the first line is "--" + boundary.
        this.boundary = postBody.substring(2, postBody.indexOf('\n')).trim();
        // Parse out the parameters.
        final FileItemFactory factory = new DiskFileItemFactory();
        FileUpload upload = new FileUpload(factory);
        List<FileItem> fileItems = upload.parseRequest(this);
        for (FileItem fileItem : fileItems) {
            if (fileItem.isFormField()) {
                // Decode with the same encoding that getCharacterEncoding() reports.
                parameters.put(fileItem.getFieldName(), fileItem.getString(getCharacterEncoding()));
            } // else it is an uploaded file
        }
    }

    public Map<String, String> getParameters() {
        return parameters;
    }

    // The methods below here are to implement the UploadContext interface.

    @Override
    public String getCharacterEncoding() {
        return "UTF-8"; // You should know the actual encoding.
    }

    // This is the deprecated method from RequestContext that unnecessarily
    // limits the length of the content to ~2GB by returning an int.
    @Override
    public int getContentLength() {
        return -1; // Don't use this
    }

    @Override
    public String getContentType() {
        // Use the boundary that was sniffed out above.
        return "multipart/form-data; boundary=" + this.boundary;
    }

    @Override
    public InputStream getInputStream() throws IOException {
        return new ByteArrayInputStream(postBodyBytes);
    }

    @Override
    public long contentLength() {
        // Length in bytes, not in chars; the two differ for non-ASCII content.
        return postBodyBytes.length;
    }
}
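If the Content-Type header is available (for example from the original HttpRequest), the boundary can be read from it rather than sniffed from the first line of the body. The following is a minimal sketch using only the JDK; the class name ContentTypeBoundary and the method extractBoundary are hypothetical names chosen for illustration and are not part of Commons FileUpload. It assumes the header parameters are separated by semicolons and that the boundary value, if quoted, contains no semicolon.

```java
import java.util.Optional;

// Hypothetical helper, not part of any library: pulls the "boundary"
// parameter out of a Content-Type header value.
public class ContentTypeBoundary {

    /**
     * Returns the boundary parameter of a multipart Content-Type value,
     * or an empty Optional if the value is null or has no boundary.
     */
    public static Optional<String> extractBoundary(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        // Parameters follow the media type, separated by ';'.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair
            }
            String name = param.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("boundary")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip the optional surrounding double quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            return value.isEmpty() ? Optional.<String>empty() : Optional.of(value);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String header = "multipart/form-data; boundary=xyzseparator-blah";
        // prints "xyzseparator-blah"
        System.out.println(extractBoundary(header).orElse("(none)"));
    }
}
```

With a helper like this, the boundary could be passed to the parser explicitly instead of being derived from the body, which avoids depending on the first line of the String being the opening delimiter.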