Strange byte[] behavior reading from a URL

Problem description

In the end, my ultimate goals are:


  • Read from a URL (what this question is about)
  • Save the retrieved [PDF] content to a BLOB field in a DB (already have that nailed down)
  • Read from the BLOB field and attach that content to an email
  • All without going to a filesystem

The goal with the following method is to get a byte[] that can be used downstream as an email attachment (to avoid writing to disk):

public byte[] retrievePDF() {

    HttpClient httpClient = new HttpClient();

    GetMethod httpGet = new GetMethod("http://website/document.pdf");
    httpClient.executeMethod(httpGet);
    InputStream is = httpGet.getResponseBodyAsStream();

    byte[] byteArray = new byte[(int) httpGet.getResponseContentLength()];

    is.read(byteArray, 0, byteArray.length);

    return byteArray;
}

For a particular PDF, the getResponseContentLength() method returns 101,689 as the length. The strange part is that if I set a break-point and interrogate the byteArray variable, it has 101,689 byte elements, however, after byte #3744 the remaining bytes of the array are all zeroes (0). The resulting PDF is then not readable by a PDF-reader client, like Adobe Reader.

Why does this happen?

Retrieving this same PDF via browser and saving to disk, or using a method like the following (which I patterned after an answer to this StackOverflow post), results in a readable PDF:

public void retrievePDF() {
    FileOutputStream fos = null;
    URL url;
    ReadableByteChannel rbc = null;

    url = new URL("http://website/document.pdf");

    DataSource urlDataSource = new URLDataSource(url);

    /* Open a connection, then set appropriate time-out values */
    URLConnection conn = url.openConnection();
    conn.setConnectTimeout(120000);
    conn.setReadTimeout(120000);

    rbc = Channels.newChannel(conn.getInputStream());

    String filePath = "C:\\temp\\";
    String fileName = "testing1234.pdf";
    String tempFileName = filePath + fileName;

    fos = new FileOutputStream(tempFileName);
    fos.getChannel().transferFrom(rbc, 0, 1 << 24);
    fos.flush();

    /* Clean-up everything */
    fos.close();
    rbc.close();
}

For both approaches, the resulting PDF is 101,689 bytes according to Right-click > Properties... in Windows.

Why would the byte array essentially stop partway through?

Recommended answer

InputStream.read reads up to the requested number of bytes but may return fewer. It returns the number of bytes it actually read, or -1 at end of stream. A single call can return after only part of the response has arrived, which leaves the rest of the array at its default zeroes. You should call it repeatedly to fully read the data, like this:

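int bytesRead = 0;
// Keep reading until the buffer is full or the stream ends early.
while (bytesRead < byteArray.length) {
    // Request only the space that is left; a larger length throws IndexOutOfBoundsException.
    int n = is.read(byteArray, bytesRead, byteArray.length - bytesRead);
    if (n == -1) break;
    bytesRead += n;
}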
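
The loop above still depends on the reported content length being accurate. As far as I know, getResponseContentLength() returns -1 when the server sends no Content-Length header, in which case the array could not be allocated at all. Below is a minimal sketch of a variant that does not need the length up front. StreamBytes and readFully are hypothetical names chosen for illustration, and the 8192-byte chunk size is an arbitrary assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of the original question or answer.
final class StreamBytes {

    private StreamBytes() {}

    /** Reads every remaining byte from the stream, however many read() calls that takes. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192]; // arbitrary chunk size (assumption)
        int n;
        // read() may return fewer bytes than requested; only -1 signals end of stream.
        while ((n = in.read(chunk, 0, chunk.length)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

In the question's method this would be called as byte[] pdf = StreamBytes.readFully(httpGet.getResponseBodyAsStream()); and the result can go straight into the BLOB field or the email attachment without touching the filesystem. On Java 9 or later, InputStream.readAllBytes() does the same job without a helper.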
