Compressing floating point data
Question
Are there any lossless compression methods that can be applied to floating point time-series data, and will significantly outperform, say, writing the data as binary into a file and running it through gzip?
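The baseline described above can be sketched as follows; the sine series is illustrative stand-in data, not the question's actual files:

```python
import gzip
import math
import struct

# Illustrative correlated time series (a stand-in for the question's data).
xs = [math.sin(i / 100.0) for i in range(10_000)]

raw = struct.pack(f"<{len(xs)}d", *xs)   # 8 bytes per double, written as binary
compressed = gzip.compress(raw)

# gzip is lossless: decompressing recovers the exact bytes.
assert gzip.decompress(compressed) == raw
```

Generic byte-oriented compressors like gzip tend to do poorly here because the low mantissa bits of each double look essentially random to them.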
Reduction of precision might be acceptable, but it must happen in a controlled way (i.e. I must be able to set a bound on how many digits must be kept).
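One way to enforce such a bound is to zero out low-order mantissa bits before compressing: the runs of zero bits compress much better, and the relative error is strictly bounded. A minimal sketch, where the helper names and the digits-to-bits conversion are my own illustration rather than any particular tool's API:

```python
import math
import struct

def truncate_mantissa(x: float, keep_bits: int) -> float:
    """Zero the low (52 - keep_bits) bits of a double's 52-bit mantissa.

    Truncates the magnitude toward zero; the relative error is below
    2**-keep_bits.
    """
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    mask = ~((1 << (52 - keep_bits)) - 1) & 0xFFFFFFFFFFFFFFFF
    return struct.unpack("<d", struct.pack("<Q", bits & mask))[0]

def keep_digits(x: float, digits: int) -> float:
    """Keep roughly `digits` significant decimal digits (~3.32 bits each)."""
    return truncate_mantissa(x, min(52, math.ceil(digits * math.log2(10))))
```

After this quantization step the data is still stored as ordinary doubles, so any downstream compressor can be applied unchanged.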
I am working with some large data files which are series of correlated doubles, describing a function of time (i.e. the values are correlated). I don't generally need the full double precision, but I might need more than float.
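For correlated series like this, a common trick (used, with more sophisticated predictors, by schemes such as FPC) is to XOR each double's bit pattern with its predecessor's before handing the bytes to a general-purpose compressor: neighbouring values share sign, exponent, and high mantissa bits, so the XOR words are mostly zero. A minimal sketch of the idea:

```python
import gzip
import math
import struct

def xor_encode(values):
    """XOR each double's 64-bit pattern with the previous value's pattern."""
    prev, words = 0, []
    for v in values:
        bits = struct.unpack("<Q", struct.pack("<d", v))[0]
        words.append(bits ^ prev)
        prev = bits
    return struct.pack(f"<{len(words)}Q", *words)

def xor_decode(blob):
    """Invert xor_encode by re-XORing the running bit pattern."""
    prev, out = 0, []
    for (word,) in struct.iter_unpack("<Q", blob):
        prev ^= word
        out.append(struct.unpack("<d", struct.pack("<Q", prev))[0])
    return out

xs = [math.sin(i / 100.0) for i in range(10_000)]
assert xor_decode(xor_encode(xs)) == xs   # transform is exactly lossless

# The XORed stream is typically much more compressible than the raw doubles.
smaller = gzip.compress(xor_encode(xs))
```

The transform itself is lossless and byte-for-byte invertible; only the subsequent entropy coding (gzip here) determines the final size.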
Since there are specialized lossless methods for images/audio, I was wondering if anything specialized exists for this situation.
Clarification: I am looking for existing practical tools rather than a paper describing how to implement something like this. Something comparable to gzip in speed would be excellent.
Answer
You might want to have a look at these resources:
- Lossless Compression of Predicted Floating-Point Values
- Papers by Martin Burtscher: The FPC Double-Precision Floating-Point Compression Algorithm and its Implementation, Fast Lossless Compression of Scientific Floating-Point Data and High Throughput Compression of Double-Precision Floating-Point Data
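Very roughly, Burtscher's FPC predicts each double (using hash-based FCM/DFCM predictors), XORs the prediction against the actual bits, and then stores a small count of leading zero bytes plus only the residual's non-zero tail. A much-simplified sketch of that shape, with a previous-value predictor standing in for FPC's hash predictors (so this is not the real algorithm, just its outline):

```python
import struct

def encode(values):
    """Toy FPC-style coder: previous-value predictor + leading-zero-byte count."""
    prev = 0
    out = bytearray()
    for v in values:
        bits = struct.unpack("<Q", struct.pack("<d", v))[0]
        resid = bits ^ prev            # residual vs. the (trivial) prediction
        prev = bits
        tail = resid.to_bytes(8, "big").lstrip(b"\x00")  # drop leading zeros
        out.append(8 - len(tail))      # 1-byte header: leading-zero-byte count
        out += tail                    # only the non-zero tail is stored
    return bytes(out)

def decode(blob):
    prev = 0
    vals = []
    i = 0
    while i < len(blob):
        nzeros = blob[i]
        tail = blob[i + 1 : i + 1 + (8 - nzeros)]
        i += 1 + (8 - nzeros)
        resid = int.from_bytes(b"\x00" * nzeros + tail, "big")
        prev ^= resid
        vals.append(struct.unpack("<d", struct.pack("<Q", prev))[0])
    return vals
```

The real FPC packs two residual headers per byte and picks the better of two predictors per value, but the leading-zero-byte encoding above is the core of why correlated doubles shrink.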
You might also want to try Logluv-compressed TIFF for this, though I haven't used it myself.