Python: How to read a data file with uneven number of columns

Published: 2017/11/3 19:04:00. Tags: python, file, numpy


Problem description

A friend of mine needs to read a lot of data (about 18000 data sets) that is all formatted annoyingly. Specifically, the data is supposed to be 8 columns and ~8000 rows, but instead it is delivered in rows of 7 columns, with the last entry spilling into the first column of the next row.

In addition, every ~30 rows there is a row with only 4 columns. This is because some upstream program is reshaping a 200 x 280 array into a 7x8120 array.

My question is this: how can we read the data into an 8x7000 array? My usual arsenal of np.loadtxt and np.genfromtxt fails when there is an uneven number of columns.

Keep in mind that performance is a factor since this has to be done for ~18000 datafiles.

Here is a link to a typical data file: http://users-phys.au.dk/hha07/hk_L1.ref

Solution

An even easier approach I just thought of:

import numpy
# Read the whole file, split on any whitespace (newlines included), then reshape.
with open("hk_L1.ref") as f:
    data = numpy.array(f.read().split(), dtype=float).reshape(7000, 8)

This reads the data as a one-dimensional array first, completely ignoring all new-line characters, and then we reshape it to the desired shape.
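To see why ignoring the line breaks is enough, here is a small sketch on synthetic data. The sample values and the names `original`, `ragged_text`, `flat` and `data` are made up for illustration; the layout only imitates the one described in the question on a smaller scale (lines of 7 values, with a short line of 4 closing each record).

```python
import numpy as np

# Hypothetical miniature of the file: 8 records of 11 values each,
# written out as one line of 7 values followed by one short line of 4.
original = np.arange(88, dtype=float).reshape(8, 11)

lines = []
for record in original:
    for start in range(0, record.size, 7):
        chunk = record[start:start + 7]
        lines.append(" ".join(str(v) for v in chunk))
ragged_text = "\n".join(lines)

# str.split() with no arguments splits on any run of whitespace,
# newlines included, so the uneven line lengths never reach NumPy.
flat = np.array(ragged_text.split(), dtype=float)

# The values are still in their original order, so a plain reshape
# recovers whatever 2-D layout is wanted, here 8 columns.
data = flat.reshape(-1, 8)
```

Because the values stay in file order, `reshape` can produce any shape whose row and column counts multiply to the total number of values, which is why 7000 x 8 works for the real file.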

While I think that the task will be I/O-bound anyway, this approach should use little processor time if it matters.
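For the full set of ~18000 files, the same idea can be wrapped in a helper. The following is only a sketch under assumptions: the function names `load_ragged` and `load_all`, the `*.ref` glob pattern, and the idea that all files sit in one directory are not from the original answer. The size check is an added safeguard so that a truncated file fails with a clear message.

```python
from pathlib import Path

import numpy as np


def load_ragged(path, shape=(7000, 8)):
    """Read whitespace-separated numbers from `path`, ignoring line breaks."""
    flat = np.array(Path(path).read_text().split(), dtype=float)
    expected = shape[0] * shape[1]
    if flat.size != expected:
        # Hypothetical safeguard: report files with missing or extra values.
        raise ValueError(f"{path}: expected {expected} values, found {flat.size}")
    return flat.reshape(shape)


def load_all(directory, pattern="*.ref", shape=(7000, 8)):
    """Load every file matching `pattern` into a dict keyed by file name."""
    files = sorted(Path(directory).glob(pattern))
    return {p.name: load_ragged(p, shape) for p in files}
```

Usage would look like `arrays = load_all("data_dir")`, where `data_dir` is a placeholder for wherever the files live. Holding every array in memory at once may not be practical for 18000 files, so processing each result of `load_ragged` as it is read is a reasonable alternative.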
