h5py: Compound datatypes and scale-offset in the compression pipeline

Question

Using NumPy and h5py, it is possible to create ‘compound datatype’ datasets to be stored in an HDF5 file:

import h5py
import numpy as np

# Create a new file using default properties.
file = h5py.File('compound.h5', 'w')

# Define a compound type and create a dataset under the root group.
comp_type = np.dtype([('fieldA', 'i4'), ('fieldB', 'f4')])
dataset = file.create_dataset("comp", (4,), dtype=comp_type)

It is also possible to use various compression filters in a ‘compression pipeline’, among them the ‘scale-offset’ filter:

cmpr_dataset = file.create_dataset("cmpr", (4,), 'i4', scaleoffset=0)

However, it is not clear to me whether, and if so how, the scale-offset filter can be specified with a specific parameter (e.g., the 0 in the above example) for the different fields of a compound datatype.

More generally, it is not clear to me whether and how any filter can be applied with field-specific parameters.

So, the question is:

  • Is it possible to apply filters only to a compound-datatype dataset as a whole, or can they also be applied, with specific parameters, to a specific field?

If so, what is the syntax for doing this?

My guess (fear) is that the way compound data is stored (all fields in one ‘column’, instead of each field in its own ‘column’) will prohibit the application of such field-specific filters, but I wanted to check, just to be sure.

Recommended answer

Besides the h5py docs, look at the HDF5 docs; they go into more detail. If the underlying HDF5 library does not support this, then the h5py/NumPy interface won't either.

https://support.hdfgroup.org/HDF5/doc/UG/OldHtmlSource/10_Datasets.html#ScaleOffset

Elsewhere the documentation says that filters are applied to whole chunks.
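
As a rough illustration of the whole-chunk behaviour, the sketch below applies a general-purpose filter (gzip) to a compound dataset. The filter settings belong to the dataset as a whole, not to individual fields. The file name, chunk size and sample values are made up for the example, and the in-memory 'core' driver is used only so that nothing is written to disk.

```python
import h5py
import numpy as np

comp_type = np.dtype([('fieldA', 'i4'), ('fieldB', 'f4')])

# Sample records (values are arbitrary, for illustration only).
records = np.zeros(1000, dtype=comp_type)
records['fieldA'] = np.arange(1000)
records['fieldB'] = np.linspace(0.0, 1.0, 1000)

# driver='core' with backing_store=False keeps the file in memory.
with h5py.File('chunked_compound.h5', 'w',
               driver='core', backing_store=False) as f:
    # gzip is configured once per dataset and runs on each 250-record chunk,
    # i.e. on the interleaved bytes of fieldA and fieldB together.
    ds = f.create_dataset('comp', data=records, chunks=(250,),
                          compression='gzip', compression_opts=4)
    chunk_shape = ds.chunks
    compression = ds.compression
    roundtrip = ds[...]

print(chunk_shape, compression)
```

There is no per-field argument in this call: the only knobs are the dataset-level chunk shape and filter options.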

The expression defining the compound type is pure NumPy. h5py must be translating that descriptor into an equivalent HDF5 C-struct description. The HDF5 docs include sample C and Fortran compound type definitions.
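
One way to look at that translation is through h5py's low-level h5t module. This is only a sketch: it assumes that h5py.h5t.py_create returns a compound type object for a structured NumPy dtype, and it decodes member names defensively in case they come back as bytes.

```python
import h5py
import numpy as np

comp_type = np.dtype([('fieldA', 'i4'), ('fieldB', 'f4')])

# Ask h5py for the HDF5 datatype that corresponds to the NumPy dtype.
tid = h5py.h5t.py_create(comp_type)

n_members = tid.get_nmembers()
member_names = []
for i in range(n_members):
    name = tid.get_member_name(i)
    # Member names may be returned as bytes; normalise to str.
    if isinstance(name, bytes):
        name = name.decode()
    member_names.append(name)

print(type(tid).__name__, n_members, member_names)
```

The result is a single HDF5 compound type with two members, which is consistent with the fields being stored interleaved in one record rather than as separate columns.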

All the docs say that the scale-offset filter applies only to integer and floating-point types. That can be understood as excluding string, vlen, and compound types. What you are hoping is that it would still work with the numeric fields inside a compound type. I don't think so.
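
If field-specific scale-offset settings are needed, one possible workaround (not part of the answer above, just a sketch) is to store each field as its own dataset, each with its own scaleoffset value. The group and dataset names, the sample values and the choice of scaleoffset=2 for the float field are all assumptions made for illustration. The first part of the sketch also tries the filter on the compound dataset directly, which is expected to be rejected.

```python
import h5py
import numpy as np

data = np.array([(1, 0.5), (2, 1.25), (3, 2.0), (4, 3.75)],
                dtype=[('fieldA', 'i4'), ('fieldB', 'f4')])

with h5py.File('split_fields.h5', 'w',
               driver='core', backing_store=False) as f:
    # Trying scale-offset on the compound dataset itself; this is
    # expected to raise, since the filter only handles int/float types.
    try:
        f.create_dataset('comp', data=data, scaleoffset=0)
        rejected = False
    except Exception:
        rejected = True

    # Workaround: one dataset per field, each with its own parameter.
    grp = f.create_group('comp_split')
    ds_a = grp.create_dataset('fieldA',
                              data=np.ascontiguousarray(data['fieldA']),
                              scaleoffset=0)   # integers: lossless
    ds_b = grp.create_dataset('fieldB',
                              data=np.ascontiguousarray(data['fieldB']),
                              scaleoffset=2)   # floats: lossy, ~2 decimals
    a_back = ds_a[...]
    b_back = ds_b[...]

print(rejected, a_back, b_back)
```

The cost of this layout is that the fields are no longer read back as a single record array; they have to be recombined on the NumPy side if a structured array is wanted.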
