Sending large files as a stream to process.getOutputStream


Problem description

I am using the gzip utility on a Windows machine. I compressed a file and stored it in the DB as a blob. When I want to decompress this file using the gzip utility, I write the byte stream to process.getOutputStream. But after about 30 KB it is unable to read the file; it just hangs there.

I tried adjusting memory arguments and the read/flush logic. But if I write the same data to a file instead, it is pretty fast.

 OutputStream stdin = proc.getOutputStream();
 Blob blob = Hibernate.createBlob(inputFileReader);
 InputStream source = blob.getBinaryStream();
 byte[] buffer = new byte[256];
 long readBufferCount = 0;
 int bytesRead;
 while ((bytesRead = source.read(buffer)) > 0)
 {
  // write only the bytes actually read; read() may fill less than the whole buffer
  stdin.write(buffer, 0, bytesRead);
  stdin.flush();
  readBufferCount = readBufferCount + bytesRead;
  log.info("Reading the file - Read bytes: " + readBufferCount);
 }
 stdin.flush();

Regards,
Mani Kumar Adari.

Recommended answer

I suspect that the problem is that the external process (connected to proc) is either


  • not reading its standard input, or

  • writing things to its standard output that the Java application is not reading.

Bear in mind that Java talks to the external process using a pair of "pipes", and these have a limited amount of buffering. If you exceed the buffering capacity of a pipe, the writer process will be blocked writing to the pipe until the reader process has read enough data from the pipe to make space. If the reader doesn't read, the pipeline locks up.

If you provided more context (e.g. the part of the application that launches the gzip process) I'd be able to be more definitive.

Follow-up


gzip.exe is a Unix utility we are using on Windows. gzip.exe works fine from the command prompt, but not with the Java program. Is there any way we can increase the buffer size that Java writes to a pipe? I am concerned about the input part at present.

On UNIX, the gzip utility is typically used in one of two ways:


  • gzip file compresses file, turning it into file.gz.

  • ... | gzip | ... (or something similar), which writes a compressed version of its standard input to its standard output.

I suspect that you are doing the equivalent of the latter, with the Java application as both the source of the gzip command's input and the destination of its output. And this is precisely the scenario that can lock up ... if the Java application is not implemented correctly. For instance:

    Process proc = Runtime.getRuntime().exec(...);  // gzip.exe pathname.
    OutputStream out = proc.getOutputStream();
    while (...) {
        out.write(...);        // write ALL of the input first ...
    }
    out.flush();
    InputStream in = proc.getInputStream();
    while (...) {
        in.read(...);          // ... and only then start reading the output
    }

If the write phase of the application above writes too much data, it is guaranteed to lock up.

Communication between the Java application and gzip is via two pipes. As I stated above, a pipe will buffer a certain amount of data, but that amount is relatively small, and certainly bounded. This is the cause of the lockup. Here is what happens:


  1. The gzip process is created with a pair of pipes connecting it to the Java application process.
  2. The Java application writes data to its out stream.
  3. The gzip process reads that data from its standard input, compresses it, and writes to its standard output.
  4. Steps 2 and 3 are repeated a few times, until finally the gzip process's attempt to write to its standard output blocks.

What has been happening is that gzip has been writing into its output pipe, but nothing has been reading from it. Eventually we reach the point where we've exhausted the output pipe's buffer capacity, and the write to the pipe blocks.

Meanwhile, the Java application is still writing to the out stream, and after a couple more rounds this too blocks, because we've filled the other pipe.

The only solution is for the Java application to read and write at the same time. The simple way to do this is to create a second thread, and do the writing to the external process from one thread and the reading from the process in the other one; a sketch of this approach is shown below.
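A minimal sketch of that two-thread approach, reusing the blob from the question. The command line "gzip -d" (decompress stdin to stdout), the 4 KB buffer size, and the decompressed.out destination file are assumptions for illustration, not from the original post; imports from java.io and checked-exception handling in the enclosing method are omitted for brevity:

    // Writer thread pumps the blob bytes into gzip's stdin while the
    // current thread drains gzip's stdout, so neither pipe fills up.
    final Process proc = Runtime.getRuntime().exec("gzip -d");   // assumed command: decompress stdin to stdout
    final InputStream source = blob.getBinaryStream();           // blob from the question

    Thread writer = new Thread(new Runnable() {
        public void run() {
            OutputStream stdin = proc.getOutputStream();
            try {
                byte[] buf = new byte[4096];
                int n;
                while ((n = source.read(buf)) > 0) {
                    stdin.write(buf, 0, n);                       // write only the bytes actually read
                }
            } catch (IOException e) {
                // log the failure; closing stdin below still unblocks gzip
            } finally {
                try { stdin.close(); } catch (IOException ignored) { }  // EOF lets gzip finish
            }
        }
    });
    writer.start();

    // Meanwhile, read the decompressed data on this thread.
    InputStream stdout = proc.getInputStream();
    OutputStream decompressed = new FileOutputStream("decompressed.out");  // hypothetical destination
    byte[] buf = new byte[4096];
    int n;
    while ((n = stdout.read(buf)) > 0) {
        decompressed.write(buf, 0, n);
    }
    decompressed.close();
    writer.join();
    proc.waitFor();

Because the reader loop runs concurrently with the writer thread, gzip's output pipe is drained as fast as it fills, so neither side ever blocks indefinitely.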

(Changing the Java buffering or the Java read/write sizes won't help. The buffering that matters is in the OS implementation of the pipes, and there's no way to change that from pure Java, if at all.)
