Parse multipart/form-data Body on AWS Lambda in Java
Question
I am new to AWS Lambda and I am trying to implement a Lambda function that receives a POST request containing data encoded as multipart/form-data. The message is received through API Gateway using Lambda Proxy integration, and the body is Base64-encoded when it arrives at the Lambda function. After decoding it manually, I see it contains a multipart body like the following:
    ------WebKitFormBoundary3EZ0C3tbP2JpAmz4
    Content-Disposition: form-data; name="param1"

    value1
    ------WebKitFormBoundary3EZ0C3tbP2JpAmz4
    Content-Disposition: form-data; name="param2"

    value2
    ------WebKitFormBoundary3EZ0C3tbP2JpAmz4
    Content-Disposition: form-data; name="myfile"; filename="ivr.png"
    Content-Type: image/png

    PNG ... [binary stuff]
    ------WebKitFormBoundary3EZ0C3tbP2JpAmz4--
What I need is to parse this message in Java 8 so I can access the individual parts. I managed to do it using the javax.mail.Multipart object, but it seems I cannot access the "name" property of the parts, so I cannot distinguish between elements of the same type, e.g. "param1" and "param2". I believe this is related to the fact that this class is meant for parsing email messages. Is there another way to parse this multipart body inside the Lambda function? This is the code I use to parse it (base64 is the string containing the body):
    DataSource source = new ByteArrayDataSource(new ByteArrayInputStream(Base64.decodeBase64(base64)), "multipart/mixed");
    MimeMultipart mp = new MimeMultipart(source);
I'd appreciate any help you can provide.
Answer
Ok, so this is definitely NOT the ideal solution, but I was able to make this work.
Problem
There are actually many libraries out there that parse multipart form data. The actual problem is that the libraries I found all rely on the javax.servlet package, most importantly the HttpServletRequest class (and a few more). Since we can't access the javax.servlet package classes in the AWS Lambda environment, my solution is to fix that.
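For simple payloads like the one in the question, one alternative to bringing in the servlet classes is a small parser built only on the JDK. The following is a minimal sketch, not the approach this answer uses: the class name MultipartSketch, the Part type and the regular expressions are hypothetical, and it assumes quoted name/filename attributes, CRLF line endings and a body small enough to hold in memory. It is shown mainly because it exposes the "name" attribute that the question could not reach through javax.mail.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a JDK-only multipart/form-data splitter (illustrative, not production-grade).
public class MultipartSketch {

    /** One parsed part: form field name, optional filename, raw content bytes. */
    public static final class Part {
        public final String name;
        public final String filename; // null for plain form fields
        public final byte[] content;

        Part(String name, String filename, byte[] content) {
            this.name = name;
            this.filename = filename;
            this.content = content;
        }
    }

    private static final Pattern BOUNDARY = Pattern.compile("boundary=\"?([^\";]+)\"?");
    // The leading [;\s] keeps "name=" from matching inside "filename=".
    private static final Pattern NAME = Pattern.compile("[;\\s]name=\"([^\"]*)\"");
    private static final Pattern FILENAME = Pattern.compile("[;\\s]filename=\"([^\"]*)\"");

    public static List<Part> parse(String base64Body, String contentType) {
        Matcher b = BOUNDARY.matcher(contentType);
        if (!b.find()) {
            throw new IllegalArgumentException("no boundary in Content-Type");
        }
        String delimiter = "--" + b.group(1);

        // ISO-8859-1 maps every byte to exactly one char, so binary content survives the round trip.
        String raw = new String(Base64.getDecoder().decode(base64Body), StandardCharsets.ISO_8859_1);

        List<Part> parts = new ArrayList<>();
        for (String chunk : raw.split(Pattern.quote(delimiter))) {
            int headerEnd = chunk.indexOf("\r\n\r\n");
            if (headerEnd < 0) {
                continue; // preamble, or the closing "--" marker
            }
            String headers = chunk.substring(0, headerEnd);
            String body = chunk.substring(headerEnd + 4);
            if (body.endsWith("\r\n")) {
                body = body.substring(0, body.length() - 2); // CRLF before the next delimiter
            }
            Matcher n = NAME.matcher(headers);
            if (!n.find()) {
                continue; // not a form-data part
            }
            Matcher f = FILENAME.matcher(headers);
            String filename = f.find() ? f.group(1) : null;
            parts.add(new Part(n.group(1), filename, body.getBytes(StandardCharsets.ISO_8859_1)));
        }
        return parts;
    }
}
```

Calling MultipartSketch.parse(body, contentType) with the Base64 body and the request's Content-Type header returns one Part per field. A real library is still the safer choice for large uploads or unusual headers.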
Solution
- Download the javax.servlet package from GitHub and add it to your Lambda function. In my project, my class MultipartFormDataTest is within my package com... and I also have the javax.servlet package within the same Java module.
Once we do this, we can use one or more libraries that will do all the work for us. The best library I've found that will parse the file content is Delight FileUpload - https://mvnrepository.com/artifact/org.javadelight/delight-fileupload.
Once that library is added, the method below, getFilesFromMultipartFormData(), will return an ArrayList<File> where each item in the list represents a File that was sent in the request.
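    // Package names below are assumed from delight-fileupload, commons-fileupload
    // and aws-lambda-java-core; check them against the versions you use.
    import com.amazonaws.services.lambda.runtime.Context;
    import com.amazonaws.services.lambda.runtime.LambdaLogger;
    import delight.fileupload.FileUpload;
    import org.apache.commons.fileupload.FileItem;

    import java.io.File;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.StandardCopyOption;
    import java.util.ArrayList;
    import java.util.List;

    /**
     * @param context     context
     * @param body        this value is taken from `request.getBody()`
     * @param contentType this value is taken from `request.headers().get("Content-Type")`
     * @return List of File objects
     */
    private List<File> getFilesFromMultipartFormData(Context context, String body, String contentType) {
        // The original snippet used an undeclared `logger`; take it from the Lambda context.
        LambdaLogger logger = context.getLogger();
        ArrayList<File> files = new ArrayList<>();
        List<FileItem> fileItems = FileUpload.parse(body.getBytes(StandardCharsets.UTF_8), contentType);

        for (FileItem fileItem : fileItems) {
            if (fileItem == null) {
                continue;
            }

            logger.log("fileItem name: " + fileItem.getName());
            logger.log("fileItem content-type: " + fileItem.getContentType());
            logger.log("fileItem size: " + fileItem.getSize());

            // Note: instead of storing it locally, you can also directly store it to S3 for example
            try {
                // I'm setting the extension to .png but you can look at the fileItem.getContentType()
                // to make sure it is an image vs. pdf vs. some other format
                File temp = File.createTempFile(fileItem.getName(), ".png");
                Files.copy(fileItem.getInputStream(), temp.toPath(), StandardCopyOption.REPLACE_EXISTING);
                files.add(temp);
            } catch (Exception e) {
                // Plain form fields have no file name, so createTempFile fails for them; skip those items.
                logger.log("skipping fileItem: " + e);
                continue;
            }
        }

        return files;
    }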
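One detail worth checking before calling the method above: the question says the body arrives Base64-encoded, while the method converts the body string to bytes as UTF-8. The sketch below shows one way to pick the right conversion. The class name BodyDecoder and the method toBytes are hypothetical, and it assumes the caller knows whether the body is encoded, for example from the proxy event's isBase64Encoded flag.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: turn the proxy event body into the raw multipart bytes.
public class BodyDecoder {

    /**
     * @param body            request body as delivered to the Lambda function
     * @param isBase64Encoded whether API Gateway Base64-encoded the body (assumed to be known by the caller)
     * @return the raw bytes to hand to a multipart parser
     */
    public static byte[] toBytes(String body, boolean isBase64Encoded) {
        if (body == null) {
            return new byte[0];
        }
        return isBase64Encoded
                ? Base64.getDecoder().decode(body)        // binary-safe path
                : body.getBytes(StandardCharsets.UTF_8);  // plain-text path
    }
}
```

With this in place, the resulting byte array could be passed to FileUpload.parse(...) instead of body.getBytes(StandardCharsets.UTF_8), so that binary parts such as the PNG are not corrupted.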