Python HDF5 H5Py打开多个文件时出现问题 [英] Python HDF5 H5Py issues opening multiple files

查看:209
本文介绍了Python HDF5 H5Py打开多个文件时出现问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用64位版本的Enthought Python,以处理多个HDF5文件中的数据.我在64位Windows上使用h5py版本1.3.1(HDF5 1.8.4).

I am usint the 64-bit version of Enthought Python to process data across multiple HDF5 files. I'm using h5py version 1.3.1 (HDF5 1.8.4) on 64-bit Windows.

我有一个对象,可以为我的特定数据层次结构提供方便的接口,但是独立测试h5py.File(fname,'r')会产生相同的结果.我正在遍历一长串列表(一次约100个文件),并尝试从文件中提取特定信息.我遇到的问题是,我从几个文件中得到了相同的信息!我的循环看起来像:

I have an object that provides a convenient interface to my specific data heirarchy, but testing the h5py.File(fname, 'r') independently yields the same results. I am iterating through a long list (~100 files at a time) and attempting to pull out specific pieces of information from the files. The problem I'm having is that I'm getting the same information out of several files! My loop looks something like:

files = glob(r'path\*.h5')
out_csv = csv.writer(open('output_file.csv', 'rb'))

for filename in files:
  handle = hdf5.File(filename, 'r')
  data = extract_data_from_handle(handle)
  for row in data:
     out_csv.writerow((filename, ) +row)

当我使用hdfview之类的文件检查文件时,我知道内部结构是不同的.但是,我得到的csv似乎表明所有文件都包含相同的数据.有人见过这种行为吗?有什么建议可以开始调试此问题吗?

When I inspect the files using something like hdfview, I know the internals are different. However, the csv I get seems to indicate that all the files contain the same data. Has anyone seen this behavior before? Any suggestions where I could go to start debugging this issue?

推荐答案

我得出的结论是,这是

I've concluded that this is a strange manifestation of Perplexing assignment behavior with h5py object as instance variable . I re-wrote my code so that each file is handled within a function call and the variable is not reused. Using this approach, I don't see the same strange behavior and it seems to work much better. For clarity, the solution looks more like:

files = glob(r'path\*.h5')
out_csv = csv.writer(open('output_file.csv', 'rb'))

def extract_data_from_filename(filename):
    return extract_data_from_handle(hdf5.File(filename, 'r'))

for filename in files:
  data = extract_data_from_filename(filename)
  for row in data:
     out_csv.writerow((filename, ) +row)

这篇关于Python HDF5 H5Py打开多个文件时出现问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆