Abbreviate the import of multiple files with loadtxt (Python)


Question

I want to shorten the way I import multiple files with loadtxt. Currently I do the following:

rc1    =loadtxt("20120701_Gp_xr_5m.txt", skiprows=19)
rc2    =loadtxt("20120702_Gp_xr_5m.txt", skiprows=19)
rc3    =loadtxt("20120703_Gp_xr_5m.txt", skiprows=19)
rc4    =loadtxt("20120704_Gp_xr_5m.txt", skiprows=19)
rc5    =loadtxt("20120705_Gp_xr_5m.txt", skiprows=19)
rc6    =loadtxt("20120706_Gp_xr_5m.txt", skiprows=19)
rc7    =loadtxt("20120707_Gp_xr_5m.txt", skiprows=19)
rc8    =loadtxt("20120708_Gp_xr_5m.txt", skiprows=19)
rc9    =loadtxt("20120709_Gp_xr_5m.txt", skiprows=19)
rc10   =loadtxt("20120710_Gp_xr_5m.txt", skiprows=19)

Then I concatenate them using:

GOES   =concatenate((rc1,rc2,rc3,rc4,rc5,rc6,rc7,rc8,rc9,
                     rc10),axis=0)

But my question is: can I reduce all of this? Maybe with a for loop or something like that, since the file names are a sequence of dates (strings).

I was thinking of doing something like this:

day = ####  I don't know how to define a string going from 01 to 31, for example

data="201207"+day+"_Gp_xr_5m.txt"

Then do this, but I don't think it is correct:

GOES=loadtxt(data, skiprows=19)

Recommended answer

Yes, you can easily get your sub-arrays with a for-loop, or with an equivalent list comprehension. Use the glob module to get the desired file names:

import numpy as np  # you probably don't need this line
from glob import glob

fnames = glob('path/to/dir')
arrays = [np.loadtxt(f, skiprows=19) for f in fnames]
final_array = np.concatenate(arrays)

If memory use becomes a problem, you can also iterate over all files line by line by chaining them and feeding that generator to np.loadtxt.
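The chaining idea could be sketched roughly as follows. Every file carries its own header, so a plain itertools.chain over the open files would let skiprows drop only the first header. This sketch therefore uses a small generator that skips the header of each file before yielding its data lines. The helper names iter_data_lines and load_chained are illustrative and not part of the original answer.

```python
import numpy as np


def iter_data_lines(fnames, skiprows):
    """Yield the data lines of each file in turn, skipping each file's header."""
    for fname in fnames:
        with open(fname) as f:
            for lineno, line in enumerate(f):
                if lineno >= skiprows:
                    yield line


def load_chained(fnames, skiprows=19):
    # np.loadtxt accepts a generator of strings, so only one file is open
    # at a time and no per-file intermediate arrays are kept in memory.
    return np.loadtxt(iter_data_lines(fnames, skiprows))
```

Assuming the files sit in the current directory, this could be called as load_chained(sorted(glob('201207*_Gp_xr_5m.txt'))).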

Edit after the OP's comment

My example with glob wasn't very clear.

You can use the wildcard * to match files, e.g. glob('*') to get a list of all files in the current directory. Part of the code above could therefore be better written as:

fnames = glob('path/to/dir/201207*_Gp_xr_5m.txt')

Or if your program already runs from the right directory:

fnames = glob('201207*_Gp_xr_5m.txt')

I forgot this earlier, but you should also sort the list of filenames, because the list of filenames from glob is not guaranteed to be sorted.

fnames.sort()
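Putting the glob pattern, the sort, and the concatenation together might look like the sketch below. The function name load_matching and the empty-match check are assumptions added for illustration. The sketch relies on the zero-padded date prefix, so that sorting the names alphabetically also sorts them chronologically.

```python
import numpy as np
from glob import glob


def load_matching(pattern, skiprows=19):
    """Load every file matching `pattern` and stack the rows in filename order."""
    # sorted() gives lexicographic order, which is also chronological order
    # for names that start with a zero-padded YYYYMMDD date.
    fnames = sorted(glob(pattern))
    if not fnames:
        raise FileNotFoundError("no files match %r" % pattern)
    arrays = [np.loadtxt(f, skiprows=skiprows) for f in fnames]
    return np.concatenate(arrays, axis=0)
```

With the files from the question this would be called as load_matching('201207*_Gp_xr_5m.txt').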


A slightly different approach, closer to what you had in mind, is the following. When the variable day contains the day number, you can put it in the filename like so:

daystr = str(day).zfill(2)
fname = '201207' + daystr + '_Gp_xr_5m.txt'

Or using a clever format specifier:

fname = '201207{:02}_Gp_xr_5m.txt'.format(day)

Or the "old" way:

fname = '201207%02i_Gp_xr_5m.txt' % day

Then simply use this in a for-loop:

arrays = []
for day in range(1, 32):
    daystr = str(day).zfill(2)
    fname = '201207' + daystr + '_Gp_xr_5m.txt'
    a = np.loadtxt(fname, skiprows=19)
    arrays.append(a)

final_array = np.concatenate(arrays)
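The loop above expects a file for every day from 01 to 31, and np.loadtxt raises an error when a file is missing. A possible variant is sketched below, under the assumption that missing days should simply be skipped. The helper load_days and its parameters are illustrative, not from the original answer.

```python
import os
import numpy as np


def load_days(first_day, last_day, prefix="201207",
              suffix="_Gp_xr_5m.txt", skiprows=19):
    """Load one file per day in [first_day, last_day], skipping missing days."""
    arrays = []
    for day in range(first_day, last_day + 1):
        # {:02d} zero-pads the day number, e.g. 7 -> "07"
        fname = "{}{:02d}{}".format(prefix, day, suffix)
        if not os.path.exists(fname):
            continue  # assumption: a missing day is not an error
        arrays.append(np.loadtxt(fname, skiprows=skiprows))
    if not arrays:
        raise FileNotFoundError("no data files found for the requested days")
    return np.concatenate(arrays, axis=0)
```

For the ten files in the question, load_days(1, 10) would build the names 20120701_Gp_xr_5m.txt through 20120710_Gp_xr_5m.txt.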
