Java: Reading a pdf file from URL into Byte array/ByteBuffer in an applet

Question

I'm trying to figure out why this particular snippet of code isn't working for me. I've got an applet which is supposed to read a .pdf and display it with a pdf-renderer library, but for some reason when I read in the .pdf files which sit on my server, they end up as being corrupt. I've tested it by writing the files back out again.

I've tried viewing the applet in both IE and Firefox, and the files come out corrupt in both. Funny thing is, when I try viewing the applet in Safari (for Windows), the file is actually fine! I understand the JVM might be different, but I am still lost. I've compiled with Java 1.5. The JVMs are 1.6. The snippet which reads the file is below.

public static ByteBuffer getAsByteArray(URL url) throws IOException {
        ByteArrayOutputStream tmpOut = new ByteArrayOutputStream();

        URLConnection connection = url.openConnection();
        int contentLength = connection.getContentLength();
        InputStream in = url.openStream();
        byte[] buf = new byte[512];
        int len;
        while (true) {
            len = in.read(buf);
            if (len == -1) {
                break;
            }
            tmpOut.write(buf, 0, len);
        }
        tmpOut.close();
        ByteBuffer bb = ByteBuffer.wrap(tmpOut.toByteArray(), 0,
                                        tmpOut.size());
        //Lines below used to test if file is corrupt
        //FileOutputStream fos = new FileOutputStream("C:\\abc.pdf");
        //fos.write(tmpOut.toByteArray());
        return bb;
}

I must be missing something, and I've been banging my head trying to figure it out. Any help is greatly appreciated. Thanks.


Edit:
To further clarify my situation: the difference between the files before I read them with the snippet and after is that the ones I output after reading are significantly smaller than the originals. When I open them, they are not recognized as .pdf files. No exceptions are being thrown and ignored, and I have tried flushing to no avail.

This snippet works in Safari, meaning the files are read in their entirety, with no difference in size, and can be opened with any .pdf reader. In IE and Firefox, the files always end up corrupted, and always at the same smaller size.

I monitored the len variable (when reading a 59kb file), hoping to see how many bytes get read in at each loop. In IE and Firefox, at 18kb, the in.read(buf) returns a -1 as if the file has ended. Safari does not do this.
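One way to make a premature end-of-stream like the one described above fail loudly, instead of silently producing a short file, is to compare the number of bytes actually read against the length the server reported. The following is only a sketch under assumptions: the class name TruncationCheck and the method readChecked are hypothetical, and the expected length is assumed to come from something like URLConnection.getContentLength(), which returns -1 when the length is unknown.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the original post.
final class TruncationCheck {

    // Reads the stream to its end and verifies the byte count.
    // expectedLength < 0 means "unknown", in which case no check is made.
    static byte[] readChecked(InputStream in, int expectedLength) throws IOException {
        int initialSize = expectedLength > 0 ? expectedLength : 16384;
        ByteArrayOutputStream out = new ByteArrayOutputStream(initialSize);
        byte[] buf = new byte[512];
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
        }
        if (expectedLength >= 0 && out.size() != expectedLength) {
            throw new IOException("Expected " + expectedLength
                    + " bytes but read " + out.size());
        }
        return out.toByteArray();
    }
}
```

With a check like this, a 59kb file that ends at 18kb would raise an IOException at the point of reading rather than surfacing later as an unreadable PDF. It does not explain why the stream ends early; it only reports that it did.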

I'll keep at it, and I appreciate all the suggestions so far.

Recommended answer

Just in case these small changes make a difference, try this:

public static ByteBuffer getAsByteArray(URL url) throws IOException {
    URLConnection connection = url.openConnection();
    // Since you get a URLConnection, use it to get the InputStream
    InputStream in = connection.getInputStream();
    // Now that the InputStream is open, get the content length
    int contentLength = connection.getContentLength();

    // To avoid having to resize the array over and over and over as
    // bytes are written to the array, provide an accurate estimate of
    // the ultimate size of the byte array
    ByteArrayOutputStream tmpOut;
    if (contentLength != -1) {
        tmpOut = new ByteArrayOutputStream(contentLength);
    } else {
        tmpOut = new ByteArrayOutputStream(16384); // Pick some appropriate size
    }

    byte[] buf = new byte[512];
    while (true) {
        int len = in.read(buf);
        if (len == -1) {
            break;
        }
        tmpOut.write(buf, 0, len);
    }
    in.close();
    tmpOut.close(); // No effect, but good to do anyway to keep the metaphor alive

    byte[] array = tmpOut.toByteArray();

    //Lines below used to test if file is corrupt
    //FileOutputStream fos = new FileOutputStream("C:\\abc.pdf");
    //fos.write(array);
    //fos.close();

    return ByteBuffer.wrap(array);
}

You forgot to close fos, which may result in that file being shorter if your application is still running or is abruptly terminated. Also, I now create the ByteArrayOutputStream with an appropriate initial size. (Otherwise Java will have to repeatedly allocate a new array and copy, allocate a new array and copy, which is expensive.) Replace the value 16384 with a more appropriate value. 16k is probably small for a PDF, but I don't know what the "average" size is that you expect to download.

Since you use toByteArray() twice (even though one is in diagnostic code), I assigned that to a variable. Finally, although it shouldn't make any difference, when you are wrapping the entire array in a ByteBuffer, you only need to supply the byte array itself. Supplying the offset 0 and the length is redundant.
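To illustrate the last point, here is a small sketch showing that the two forms of ByteBuffer.wrap produce equivalent buffers when the offset is 0 and the length is the full array length. The class name WrapDemo and the method sameView are hypothetical and exist only for this illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of the original post.
final class WrapDemo {

    // Returns true when wrap(array) and wrap(array, 0, array.length)
    // have the same position, limit, capacity and remaining content.
    static boolean sameView(byte[] array) {
        ByteBuffer whole = ByteBuffer.wrap(array);
        ByteBuffer explicit = ByteBuffer.wrap(array, 0, array.length);
        return whole.position() == explicit.position()
                && whole.limit() == explicit.limit()
                && whole.capacity() == explicit.capacity()
                && whole.equals(explicit);
    }
}
```

The three-argument form only matters when the buffer should start at a non-zero offset or expose fewer bytes than the array holds.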

Note that if you are downloading large PDF files this way, then ensure that your JVM is running with a large enough heap that you have enough room for several times the largest file size you expect to read. The method you're using keeps the whole file in memory, which is OK as long as you can afford that memory. :)
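If holding the whole file in memory ever becomes a problem, one alternative is to copy the stream straight to another stream (for example a temporary file) through a fixed-size buffer. This is a sketch of a different technique from the method above, not a drop-in replacement: it assumes the consumer of the PDF can read from a file or stream instead of a ByteBuffer, which the original post does not confirm, and the names StreamCopy and copy are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the original post.
final class StreamCopy {

    // Copies everything from in to out using one fixed 8 KB buffer,
    // so memory use does not grow with the size of the file.
    // Returns the total number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
            total += len;
        }
        out.flush();
        return total;
    }
}
```

The caller remains responsible for closing both streams. The returned count can also be compared against the reported content length to spot truncated downloads.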
