Skip a specified number of columns with numpy.genfromtxt()

Question

I have a large table (numbers in text format) that I would like to load with numpy.genfromtxt(). I would like to ignore the first n columns, say 5. I do not know the size of the table (number of rows or columns) in advance.

I saw that genfromtxt() has a skip_header option that skips a specified number of header rows, but there seems to be no equivalent option for columns. There is a usecols option, but it requires the numbers of the columns I want to keep rather than those I want to discard, and I cannot list them because I do not know the total number of columns in advance.

Obviously I could just load the whole thing and then throw away the first n columns, but this is not elegant and wastes memory.

I could also peek into the file, find the number of columns, and then construct the usecols argument, but that is rather messy.

Any ideas on how to solve this elegantly? Is there some hidden or undocumented argument that I can use?

Recommended answer

In newer versions of NumPy, np.genfromtxt can take an iterable argument, so you can wrap the file you're reading in a generator that yields each line with the first N columns removed. If your numbers are space-separated, that's something like:

np.genfromtxt(" ".join(ln.split()[N:]) for ln in f)
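The one-liner above assumes an open file f and a value for N. A minimal, self-contained sketch of the same idea follows, assuming whitespace-separated fields. The helper name skip_columns and the in-memory sample table are hypothetical, introduced only for illustration; they are not part of NumPy.

```python
import io

import numpy as np


def skip_columns(lines, n):
    """Yield each line with its first n whitespace-separated fields removed.

    Hypothetical helper for illustration; assumes whitespace-delimited input.
    """
    for line in lines:
        # split() with no argument splits on any run of whitespace and
        # discards the trailing newline.
        yield " ".join(line.split()[n:])


# Stand-in for a real file: five leading columns to drop, three to keep.
sample = io.StringIO(
    "a b c d e 1.0 2.0 3.0\n"
    "f g h i j 4.0 5.0 6.0\n"
)

# genfromtxt consumes the generator one line at a time.
data = np.genfromtxt(skip_columns(sample, 5))
print(data)
```

With a real file, the StringIO object would be replaced by a handle opened inside a with block. Because the leading fields are dropped before genfromtxt sees each line, they are never parsed or stored, and the number of remaining columns does not need to be known beforehand.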
