Filtering whilst using numpy.genfromtxt

Problem description

I have a file from which I only need to read certain values into an array. The file is divided by rows which specify a TIMESTEP value. I need the section of data following the highest TIMESTEP in the file.

The files will contain over 200,000 rows, although I won't know on which row the section I need begins for any given file, and I won't know what the largest TIMESTEP value will be.

I am assuming that if I can find the row number of the largest TIMESTEP, then I can import starting at that line. All these TIMESTEP lines begin with a space character. Any ideas on how I might proceed would be helpful.

Example file

 headerline 1 to skip
 headerline 2 to skip
 headerline 3 to skip
 TIMESTEP =    0.00000000    
0,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
1,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
2,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
2,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
 TIMESTEP =   0.119999997    
0,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
1,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
2,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
3,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
 TIMESTEP =    3.00000000    
0,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
1,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
1,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0
2,    1.0,   1.0,    1.0,   1.0,      1.0,   1.0

Basic code

import numpy as np

with open('myfile.txt') as f_in:
  data = np.genfromtxt(f_in, skip_header=3, comments=" ")

Recommended answer

You can use a custom iterator. Here is a working example:

from numpy import genfromtxt

class Iter:
    """A custom iterator which returns a timestep and the corresponding data."""

    def __init__(self, fd):
        self.__fd = fd
        self.__timestep = None
        self.__next_timestep = None
        self.__finish = False
        for _ in self.to_next_timestep():  # skip the header lines
            pass

    def to_next_timestep(self):
        """Yield lines until the next TIMESTEP row (or the end of the file)."""
        for line in self.__fd:
            if 'TIMESTEP' in line:
                self.__timestep = self.__next_timestep
                self.__next_timestep = float(line.split('=')[1])
                return
            yield line
        self.__timestep = self.__next_timestep
        self.__finish = True

    def __iter__(self):
        return self

    def __next__(self):  # Python 3 iterator protocol (this was next() in Python 2)
        if self.__finish:
            raise StopIteration
        data = genfromtxt(self.to_next_timestep(), delimiter=',')
        return self.__timestep, data

with open('myfile.txt') as fd:
    for timestep, data in Iter(fd):
        print(timestep, data)  # data can be selected upon highest timestep
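The final comment leaves the selection step open. The following is a minimal, self-contained sketch of one way to do it, not part of the original answer: it groups the raw lines by TIMESTEP, keeps the block with the largest value using `max`, and only then parses that block. The helper names `read_blocks` and `load_highest_timestep` are hypothetical, and `SAMPLE` is a shortened, made-up stand-in for the example file (one header line, three columns).

```python
import io

import numpy as np

# Shortened, made-up stand-in for the example file (assumption).
SAMPLE = (
    " headerline 1 to skip\n"
    " TIMESTEP =    0.00000000    \n"
    "0,    1.0,   1.5\n"
    "1,    1.0,   1.5\n"
    " TIMESTEP =   0.119999997    \n"
    "0,    2.0,   2.5\n"
    "1,    2.0,   2.5\n"
    " TIMESTEP =    3.00000000    \n"
    "0,    3.0,   3.5\n"
    "1,    3.0,   3.5\n"
)


def read_blocks(lines):
    """Yield (timestep, rows) pairs, where rows is a list of raw text lines."""
    timestep, rows = None, []
    for line in lines:
        if 'TIMESTEP' in line:
            if timestep is not None:
                yield timestep, rows
            timestep, rows = float(line.split('=')[1]), []
        elif timestep is not None and line.strip():
            # Lines before the first TIMESTEP row (the header) are ignored.
            rows.append(line)
    if timestep is not None:
        yield timestep, rows


def load_highest_timestep(lines):
    """Parse only the block that follows the largest TIMESTEP value."""
    timestep, rows = max(read_blocks(lines), key=lambda block: block[0])
    return timestep, np.genfromtxt(rows, delimiter=',')


timestep, data = load_highest_timestep(io.StringIO(SAMPLE))
print(timestep, data.shape)
```

With a real file, `open('myfile.txt')` would take the place of `io.StringIO(SAMPLE)`. Because `max` consumes the generator one block at a time, only the current block and the best one so far are held in memory, and `genfromtxt` is called once rather than once per timestep.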

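The idea from the question (find the row of the largest TIMESTEP, then import starting there) can also be sketched directly with the `skip_header` and `max_rows` parameters of `genfromtxt`. This is an illustrative two-pass sketch, not the answerer's method: the helper `locate_highest_timestep` is hypothetical, the sample data is made up, and it assumes there are no blank lines inside a data block.

```python
import os
import tempfile

import numpy as np

# Shortened, made-up stand-in for the example file (assumption).
SAMPLE = (
    " headerline 1 to skip\n"
    " headerline 2 to skip\n"
    " headerline 3 to skip\n"
    " TIMESTEP =    0.00000000    \n"
    "0,    1.0,   1.5\n"
    "1,    1.0,   1.5\n"
    " TIMESTEP =   0.119999997    \n"
    "0,    2.0,   2.5\n"
    "1,    2.0,   2.5\n"
    " TIMESTEP =    3.00000000    \n"
    "0,    3.0,   3.5\n"
    "1,    3.0,   3.5\n"
)


def locate_highest_timestep(path):
    """Return (timestep, skip_header, max_rows) for the largest TIMESTEP block."""
    marks = []   # (line index, timestep) of every TIMESTEP row
    n_lines = 0
    with open(path) as f:
        for n_lines, line in enumerate(f, start=1):
            if 'TIMESTEP' in line:
                marks.append((n_lines - 1, float(line.split('=')[1])))
    if not marks:
        raise ValueError('no TIMESTEP row found')
    pos = max(range(len(marks)), key=lambda k: marks[k][1])
    start, timestep = marks[pos]
    # The block ends at the next TIMESTEP row, or at the end of the file.
    end = marks[pos + 1][0] if pos + 1 < len(marks) else n_lines
    return timestep, start + 1, end - start - 1


# Write the sample to a temporary file so the sketch runs on its own.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(SAMPLE)
    path = tmp.name

try:
    timestep, skip, n_rows = locate_highest_timestep(path)
    data = np.genfromtxt(path, delimiter=',', skip_header=skip, max_rows=n_rows)
finally:
    os.remove(path)

print(timestep, skip, n_rows)
print(data)
```

The first pass only scans for TIMESTEP rows, so the numeric parsing is limited to the one block that is needed. The price is that the file is read twice.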