Uncompressing gzip with stream_filter_append and stream_copy_to_stream

Problem description

Found this: https://stackoverflow.com/a/11373078/530599, which is great, but what about stream_filter_append($fp, 'zlib.inflate', STREAM_FILTER_*? I'm looking for another way to uncompress data.

$fp = fopen($src, 'rb');
$to = fopen($output, 'wb');

// some filtering here?
stream_copy_to_stream($fp, $to);
fclose($fp);
fclose($to);

Where $src is some URL such as http://.../file.gz, for example a 200+ MB file :)

Added test code that works, but in two steps:

<?php

    $src = 'http://is.auto.ru/catalog/catalog.xml.gz';
    $fp = fopen($src, 'rb');
    $to = fopen(dirname(__FILE__) . '/output.txt.gz', 'wb');
    stream_copy_to_stream($fp, $to);
    fclose($fp);
    fclose($to);

    copy('compress.zlib://' . dirname(__FILE__) . '/output.txt.gz', dirname(__FILE__) . '/output.txt');

Recommended answer

One of the annoying omissions in PHP's stream filter subsystem is the lack of a gzip filter. Gzip is essentially content compressed using the deflate method. It adds a header of at least 10 bytes before the deflated data, however, and a CRC-32 checksum plus the uncompressed length at the end (the 2-byte header and Adler-32 checksum belong to the related zlib format). If you just add a zlib.inflate filter to a stream, it's not going to work, because by default the filter expects raw deflate data. You have to skip the header before attaching the filter.
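
As a hedged sketch of the single-step copy the question asks for: on reasonably recent PHP builds, the zlib.inflate filter takes a `window` parameter, and a value of 15 + 16 should make zlib expect the gzip wrapper itself. That behaviour is an assumption to verify on your PHP version. The helper name `gunzip_copy` and the locally generated test data are hypothetical and stand in for the remote .gz URL.

```php
<?php
// Hypothetical helper (not from the original answer): copies a gzip stream
// to $dest, inflating it on the fly with a read filter.
function gunzip_copy(string $src, string $dest): int
{
    $fp = fopen($src, 'rb');
    $to = fopen($dest, 'wb');
    if ($fp === false || $to === false) {
        throw new RuntimeException('Could not open source or destination');
    }

    // Assumption: a window of 15 + 16 is handed through to zlib, which then
    // expects a gzip header and trailer instead of raw deflate data.
    $filter = stream_filter_append(
        $fp,
        'zlib.inflate',
        STREAM_FILTER_READ,
        ['window' => 15 + 16]
    );
    if ($filter === false) {
        throw new RuntimeException('zlib.inflate filter is not available');
    }

    // Returns the number of (decompressed) bytes written to $to.
    $bytes = stream_copy_to_stream($fp, $to);
    fclose($fp);
    fclose($to);

    return (int) $bytes;
}

// Local demonstration data so the sketch runs without network access.
$payload = str_repeat("line of catalog data\n", 5000);
$gzPath  = tempnam(sys_get_temp_dir(), 'gz_');
$outPath = tempnam(sys_get_temp_dir(), 'out_');
file_put_contents($gzPath, gzencode($payload));

$written   = gunzip_copy($gzPath, $outPath);
$roundTrip = file_get_contents($outPath);

unlink($gzPath);
unlink($outPath);
```

With an http:// source the call would look the same, since the filter is attached to whatever stream fopen() returns.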

Note that there's a serious bug with stream filters in PHP 5.2.x, caused by stream buffering. PHP fails to pass data that is already in the stream's internal buffer through the filter. If you do a small fread(), such as fread($handle, 2), to read header bytes before attaching the inflate filter, there's a good chance that it will fail. A call to fread() causes PHP to try to fill up its buffer. Even if the call asks for only two bytes, PHP might actually read many more bytes (let's say 1024) from the physical medium in an attempt to improve performance. Due to the bug, the extra 1022 bytes would not get sent to the decompression routine.
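
On PHP versions without that buffering bug, the skip-the-header approach can be sketched roughly as follows. This is a minimal illustration with several assumptions: the header is the minimal 10-byte gzip header from RFC 1952 with no optional fields (which is what gzencode() is assumed to produce here), the trailing checksum is simply ignored, and the helper name `open_gzip_skipping_header` is hypothetical.

```php
<?php
// Hypothetical helper: opens a gzip file, consumes its fixed 10-byte header,
// and returns a handle that yields the inflated data when read.
function open_gzip_skipping_header(string $path)
{
    $fp = fopen($path, 'rb');
    if ($fp === false) {
        throw new RuntimeException("Could not open $path");
    }

    // Bytes 0-1: magic number, byte 2: compression method (8 = deflate),
    // byte 3: flags announcing optional header fields.
    $header = fread($fp, 10);
    if (strlen($header) !== 10
        || substr($header, 0, 2) !== "\x1f\x8b"
        || ord($header[2]) !== 8) {
        fclose($fp);
        throw new RuntimeException('Not a gzip/deflate file');
    }
    if (ord($header[3]) !== 0) {
        // Optional fields (file name, comment, ...) would need extra parsing.
        fclose($fp);
        throw new RuntimeException('Optional gzip header fields are not handled');
    }

    // Default zlib.inflate expects raw deflate data, which now follows.
    // Data already sitting in the read buffer must pass through the filter;
    // this is the step that the PHP 5.2.x bug described above breaks.
    stream_filter_append($fp, 'zlib.inflate', STREAM_FILTER_READ);

    return $fp;
}

// Local demonstration data so the sketch runs without network access.
$payload = str_repeat("hello gzip\n", 1000);
$gzPath  = tempnam(sys_get_temp_dir(), 'gz_');
file_put_contents($gzPath, gzencode($payload));

$fp       = open_gzip_skipping_header($gzPath);
$inflated = stream_get_contents($fp);
fclose($fp);
unlink($gzPath);
```

Because nothing verifies the CRC-32 in the trailer, a corrupted download would go unnoticed with this variant.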
