Streaming large files in a Java servlet


Question

I am building a Java server that needs to scale. One of the servlets will be serving images stored in Amazon S3.

Recently, under load, my VM ran out of memory. This happened after I added the code to serve the images, so I'm pretty sure that streaming larger servlet responses is causing my trouble.

My question is: is there any best practice for how to code a Java servlet to stream a large (>200k) response back to a browser when the data is read from a database or other cloud storage?

I've considered writing the file to a local temp drive and then spawning another thread to handle the streaming so that the Tomcat servlet thread can be reused. This seems like it would be I/O heavy.

Any thoughts would be appreciated. Thanks.

Answer

When possible, you should not store the entire contents of a file to be served in memory. Instead, acquire an InputStream for the data and copy it to the servlet's OutputStream in pieces. For example:

String mimeType = [ code to get mimetype of data to be served ];
response.setContentType(mimeType);

// try-with-resources closes both streams even if the copy fails,
// which is what the original "do this in a finally block" note intended
try (InputStream in = [ code to get source input stream ];
     ServletOutputStream out = response.getOutputStream()) {
    byte[] buffer = new byte[FILEBUFFERSIZE]; // e.g. 4 * 1024
    int bytesRead;
    while ((bytesRead = in.read(buffer)) != -1) {
        out.write(buffer, 0, bytesRead);
    }
}

I do agree with toby: you should instead "point them to the S3 url."

As for the OOM exception, are you sure it has to do with serving the image data? Let's say your JVM has 256MB of "extra" memory to use for serving image data. With Google's help, "256MB / 200KB" = 1310. With 2GB of "extra" memory (these days a very reasonable amount), over 10,000 simultaneous clients could be supported. Even so, 1300 simultaneous clients is a pretty large number. Is this the type of load you experienced? If not, you may need to look elsewhere for the cause of the OOM exception.
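A quick back-of-the-envelope check of that arithmetic (the heap and response sizes are the illustrative figures from the paragraph above, not measurements from the asker's system):

class CapacityEstimate {
    public static void main(String[] args) {
        long extraHeapBytes = 256L * 1024 * 1024;    // 256MB of "extra" heap
        long responseBytes = 200L * 1024;            // one fully buffered ~200KB image
        System.out.println(extraHeapBytes / responseBytes); // prints 1310

        long bigHeapBytes = 2L * 1024 * 1024 * 1024; // 2GB of "extra" heap
        System.out.println(bigHeapBytes / responseBytes);   // prints 10485
    }
}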

Edit - regarding:

In this use case the images can contain sensitive data...

When I read through the S3 documentation a few weeks ago, I noticed that you can generate time-expiring keys that can be attached to S3 URLs. So you would not have to open up the files on S3 to the public. My understanding of the technique is:

  1. The initial HTML page has download links that point to your web app.
  2. The user clicks a download link.
  3. Your web app generates an S3 URL that includes a key which expires in, say, 5 minutes.
  4. Send an HTTP redirect to the client with the URL from step 3 (a sketch of steps 3-4 follows this list).
  5. The user downloads the file from S3. This works even if the download takes more than 5 minutes - once a download starts, it can continue to completion.
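A minimal sketch of steps 3 and 4, using the AWS SDK for Java v1 (which postdates this answer, so treat it as one way to implement the technique rather than the author's code); the bucket name, object key, and method name here are hypothetical:

import java.io.IOException;
import java.net.URL;
import java.util.Date;
import javax.servlet.http.HttpServletResponse;
import com.amazonaws.HttpMethod;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.GeneratePresignedUrlRequest;

// Step 3: generate a pre-signed S3 URL that expires in 5 minutes.
// Step 4: redirect the client to it, so the file bytes never pass
// through the servlet container.
void redirectToS3(HttpServletResponse response) throws IOException {
    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
    Date expiration = new Date(System.currentTimeMillis() + 5 * 60 * 1000);

    // "my-image-bucket" and "images/photo.jpg" are hypothetical names
    GeneratePresignedUrlRequest request =
            new GeneratePresignedUrlRequest("my-image-bucket", "images/photo.jpg")
                    .withMethod(HttpMethod.GET)
                    .withExpiration(expiration);

    URL url = s3.generatePresignedUrl(request);
    response.sendRedirect(url.toString());
}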
