Grabbing data from certain files

Views: 134 Posted: 2017/2/25 0:28:51 Tags: python python-2.7 csv grab


Problem description

I have about 200 files and would like to grab the data from each one, then collect all of it in a single .csv file.

For example, the file list is:

#targeted folder
a001.opd
a002.opd
a003.opd
...
..
.
a200.opd

Each file has the same data structure, which looks like this:

model <many spaces>          1 <many spaces>       0.003    0.002  # Title
mo(data,1) <many spaces>     1 <many spaces>       0.2      0.0001 # 1
mo(data,1) <many spaces>     2 <many spaces>      -0.1      0.04   # 2
mo(data,1) <many spaces>     3 <many spaces>      -0.4      0.005  # 3
....................................
................
.............                                                      # n-1
......                                                             # n

Below is what I would like to see in my grab_result.csv file. Does anyone know how to achieve this with Python?

#grab_result.csv  # order will be from left to right that is a001 to a200

a001                                      a002                                             
model      1 0.003 0.002  <empty column>  model      1  0.02   0.1   <empty column> 
mo(data,1) 1 0.2   0.0001 <empty column>  mo(data,1) 1  0.04   0.001 <empty column>
mo(data,1) 2 -0.1  0.04   <empty column>  mo(data,1) 2 -0.145  0.014 <empty column>
mo(data,1) 3 -0.2  0.003  <empty column>  mo(data,1) 3 -0.24   0.06  <empty column>

Below is the code I have written so far.

import os

def openfolder(path, outputfile='grab_result.csv'):  # get .opd file from folder and open an output file
    if os.path.isdir(path): 
        fo = open(outputfile, 'wb')
        fo.write('filename') # write title here
        for filename in [os.path.abspath(path)+'\\'+each for each in os.listdir(path) if each.endswith('.opd')]:                         
            return openfile(filename)
    else:
        print "path unavailable"

    openfolder('C:\\path', 'C:\\path\\grab_result.csv')     


def openfile(filename):   # open file.opd
    if os.path.isfile(filename) and filename.endswith('.opd'): 
        return grabdata(open(filename, 'rb').read())         
    else:
        print "invalid file"
        return []       


def grabdata(string):   # start to grab data
    ret = []
    idx_data = string.find('model')
    # then I stop here....

Does anyone know how to grab the data from these files?

Here is my example file ( http://goo.gl/HyT0wM )

Recommended answer

If you have many files with lots of content, I would use generators. That avoids loading all the contents into memory at once. Here is how I would go about it:

import glob
import os

def get_all_files(path):
    # Return a lazy iterator over the paths of all .opd files in the folder.
    # Note: glob does not guarantee any particular order.
    return glob.iglob(os.path.join(path, '*.opd'))

def get_all_data(files):
    # Yield every line of every file, one at a time.
    for fil in files:
        with open(fil, 'r') as the_file:
            for line in the_file:
                yield line

def write_lines_to_file(lines, outfile):
    with open(outfile, 'w') as the_file:
        for line in lines:
            # Add an if statement here if not all lines should be written to outfile.
            # Lines read from a file usually keep their own newline, so strip it
            # before adding one; otherwise the output gets blank lines in between.
            the_file.write(line.rstrip('\n') + '\n')

path = 'blah blah'
outfile = 'blah.csv'
files = get_all_files(path)
lines = get_all_data(files)
write_lines_to_file(lines, outfile)
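
The code above writes the files one after another. It does not produce the side-by-side layout shown in the question. The following is a minimal sketch of one way to get that layout, not part of the original answer. It is written for Python 3, while the question is tagged python-2.7. It assumes that fields are separated by whitespace, that everything after `#` on a line is a comment, and that each row has four fields, as in the sample. The names `parse_opd`, `write_side_by_side`, and `COLUMNS_PER_FILE` are hypothetical. Unlike the generator version, it keeps all parsed rows in memory, which should be acceptable for about 200 small files.

```python
import csv
import glob
import os
from itertools import zip_longest

COLUMNS_PER_FILE = 4  # label, index, and two values, as in the sample rows


def parse_opd(path):
    """Return the rows of one .opd file as lists of whitespace-separated fields."""
    rows = []
    with open(path, 'r') as handle:
        for line in handle:
            # Assumption: everything after '#' is a comment, as in the sample.
            fields = line.split('#', 1)[0].split()
            if fields:
                rows.append(fields)
    return rows


def write_side_by_side(folder, outfile):
    """Write every .opd file in folder as its own block of columns in outfile."""
    # sorted() gives a001, a002, ... order; glob alone guarantees no order.
    paths = sorted(glob.glob(os.path.join(folder, '*.opd')))
    tables = [parse_opd(p) for p in paths]
    blank = [''] * COLUMNS_PER_FILE

    with open(outfile, 'w', newline='') as handle:
        writer = csv.writer(handle)
        # Header row: the file name, padded to the width of one block.
        header = []
        for p in paths:
            name = os.path.splitext(os.path.basename(p))[0]
            header += [name] + [''] * COLUMNS_PER_FILE
        writer.writerow(header)
        # zip_longest pads shorter files so the columns stay aligned.
        for group in zip_longest(*tables, fillvalue=blank):
            out = []
            for fields in group:
                padded = (fields + blank)[:COLUMNS_PER_FILE]
                out += padded + ['']  # trailing empty separator column
            writer.writerow(out)
```

With the paths from the question, the call would be `write_side_by_side('C:\\path', 'C:\\path\\grab_result.csv')`. The `csv` module quotes fields such as `mo(data,1)` that contain a comma, so they stay in a single cell.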
