Decompress zlib stream in Clojure

Question

I have a binary file whose contents were created by zlib.compress in Python. Is there an easy way to open and decompress it in Clojure?

import zlib
import json

with open('data.json.zlib', 'wb') as f:
    f.write(zlib.compress(json.dumps(data).encode('utf-8')))

Basically, it isn't a gzip file; it is just bytes representing deflated data.

I could only find these references, but not quite what I'm looking for (I think the first two are the most relevant):

Must I really implement this multi-line wrapper around java.util.zip, or is there a nice library out there? Actually, I'm not even sure whether these byte streams are compatible across libraries, or whether I'm just trying to mix and match the wrong libraries.

Steps in Python:

>>> '{"hello": "world"}'.encode('utf-8')
b'{"hello": "world"}'
>>> zlib.compress(b'{"hello": "world"}')
b'x\x9c\xabV\xcaH\xcd\xc9\xc9W\xb2RP*\xcf/\xcaIQ\xaa\x05\x009\x99\x06\x17'
>>> [int(i) for i in zlib.compress(b'{"hello": "world"}')]
[120, 156, 171, 86, 202, 72, 205, 201, 201, 87, 178, 82, 80, 42, 207, 47, 202, 73, 81, 170, 5, 0, 57, 153, 6, 23]
>>> import numpy
>>> [numpy.int8(i) for i in zlib.compress(b'{"hello": "world"}')]
[120, -100, -85, 86, -54, 72, -51, -55, -55, 87, -78, 82, 80, 42, -49, 47, -54, 73, 81, -86, 5, 0, 57, -103, 6, 23]
>>> zlib.decompress(bytes([120, 156, 171, 86, 202, 72, 205, 201, 201, 87, 178, 82, 80, 42, 207, 47, 202, 73, 81, 170, 5, 0, 57, 153, 6, 23])).decode('utf-8')
'{"hello": "world"}'

Decode attempt in Clojure:

; https://github.com/funcool/buddy-core/blob/master/src/buddy/util/deflate.clj#L40 without try-catch
(ns so.core
  (:import java.io.ByteArrayInputStream
           java.io.ByteArrayOutputStream
           java.util.zip.Deflater
           java.util.zip.DeflaterOutputStream
           java.util.zip.InflaterInputStream
           java.util.zip.Inflater
           java.util.zip.ZipException)
  (:gen-class))

(defn uncompress
  "Given a compressed data as byte-array, uncompress it and return as an other byte array."
  ([^bytes input] (uncompress input nil))
  ([^bytes input {:keys [nowrap buffer-size]
                  :or {nowrap true buffer-size 2048}
                  :as opts}]
   (let [buf  (byte-array (int buffer-size))
         os   (ByteArrayOutputStream.)
         inf  (Inflater. ^Boolean nowrap)]
     (with-open [is  (ByteArrayInputStream. input)
                 iis (InflaterInputStream. is inf)]
       (loop []
         (let [readed (.read iis buf)]
           (when (pos? readed)
             (.write os buf 0 readed)
             (recur)))))
     (.toByteArray os))))

(uncompress (byte-array [120, -100, -85, 86, -54, 72, -51, -55, -55, 87, -78, 82, 80, 42, -49, 47, -54, 73, 81, -86, 5, 0, 57, -103, 6, 23]))
ZipException invalid stored block lengths  java.util.zip.InflaterInputStream.read (InflaterInputStream.java:164)

Any help would be appreciated. I would rather not use zip or gzip files, as I only care about the raw content, not file names or modification dates in this context. However, it is possible to use another compression algorithm on the Python side if that is the only option.

Recommended answer

Here is an easy way to do it with gzip:

Python code:

import gzip

content = "the quick brown fox"
# In binary mode ('wb') gzip.open expects bytes, so encode the string first
with gzip.open('fox.txt.gz', 'wb') as f:
    f.write(content.encode('utf-8'))

Clojure code:

(with-open [in (java.util.zip.GZIPInputStream.
                (clojure.java.io/input-stream
                 "fox.txt.gz"))]
  (println "result:" (slurp in)))

;=>  result: the quick brown fox

Keep in mind that "gzip" is a file format (a thin wrapper around DEFLATE-compressed data) as well as a command-line tool. Using the format does not mean you need the "gzip" command-line tool.
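To make the difference between the formats concrete, here is a minimal Python sketch. It assumes Python 3.6 or later, and the variable names are illustrative only. It compresses the same payload as zlib, gzip, and raw DEFLATE, and prints the leading bytes that identify the first two.

```python
import gzip
import zlib

payload = b'{"hello": "world"}'

# zlib format: 2-byte header, DEFLATE data, 4-byte Adler-32 trailer
zlib_bytes = zlib.compress(payload)

# gzip format: header starting with the magic bytes 1f 8b, DEFLATE data,
# then a CRC-32 and length trailer
gzip_bytes = gzip.compress(payload)

# raw DEFLATE: negative wbits asks zlib for no header and no trailer
compressor = zlib.compressobj(wbits=-15)
raw_bytes = compressor.compress(payload) + compressor.flush()

print(zlib_bytes[:2].hex())  # 789c (default compression level)
print(gzip_bytes[:2].hex())  # 1f8b

# Each container needs the matching wbits value to be read back
from_zlib = zlib.decompress(zlib_bytes)            # default: zlib wrapper
from_gzip = zlib.decompress(gzip_bytes, wbits=31)  # 16 + 15: gzip wrapper
from_raw = zlib.decompress(raw_bytes, wbits=-15)   # raw DEFLATE

# Stripping the zlib header and trailer leaves a raw DEFLATE stream
from_stripped = zlib.decompress(zlib_bytes[2:-4], wbits=-15)
```

All three containers hold the same kind of compressed stream, so the reader only has to be told which wrapper to expect.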

Please note that the input to Clojure doesn't have to be a file. You could send the gzip-compressed data as raw bytes over a socket and still decompress it on the Clojure side. Full details at: https://clojuredocs.org/clojure.java.io/input-stream

If you need to use the pure zlib format instead of gzip, the result is very similar:

Python code:

import zlib

# zlib.compress expects bytes, hence the b'...' literal
with open('balloon.txt.z', 'wb') as fp:
    fp.write(zlib.compress(b'the big red baloon'))

Clojure code:

(with-open [in (java.util.zip.InflaterInputStream.
                (clojure.java.io/input-stream
                 "balloon.txt.z"))]
  (println "result:" (slurp in)))

;=> result: the big red baloon
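
A likely reason the decode attempt in the question failed is its default of nowrap true. That setting tells java.util.zip.Inflater to expect raw DEFLATE data without the zlib header, while zlib.compress writes the header. The sketch below reproduces the same situation in Python, under the assumption that a negative wbits value is a fair stand-in for nowrap true. It is an illustration of the mismatch, not a port of the Clojure code.

```python
import zlib

# Bytes from the question: the output of zlib.compress(b'{"hello": "world"}')
data = bytes([120, 156, 171, 86, 202, 72, 205, 201, 201, 87, 178, 82, 80,
              42, 207, 47, 202, 73, 81, 170, 5, 0, 57, 153, 6, 23])

# Negative wbits means "raw DEFLATE, no header", comparable to nowrap true.
# The zlib header bytes are then misread as compressed data and parsing fails.
try:
    zlib.decompress(data, wbits=-15)
    raw_parse_failed = False
except zlib.error as exc:
    raw_parse_failed = True
    print("raw DEFLATE parse failed:", exc)

# Default wbits expects the zlib header and trailer, comparable to nowrap false.
text = zlib.decompress(data).decode('utf-8')
print(text)
```

On the Clojure side this suggests either calling the question's uncompress function with {:nowrap false}, or using the single-argument InflaterInputStream constructor as in the answer above, which creates a default Inflater that expects the zlib wrapper.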