Computing averages of records from multiple files with Python


Problem description


Dear all,
I am a beginner in Python. I am looking for the best way to do the following: let's assume I have three text files, A, B, and C, each with m rows and n columns of numbers. Their contents can be indexed as A[i][j], B[k][l], and so on. I need to compute the average of A[0][0], B[0][0], and C[0][0] and write it to file D at D[0][0], and likewise for the remaining records. For instance, let's assume that:

A:  
1 2 3   
4 5 6  
B:  
0 1 3  
2 4 5  
C:  
2 5 6  
1 1 1

Therefore, file D should be

D:  
1     2.67   4    
2.33  3.33   4  

My actual files are of course larger than these, on the order of a few MB. I am unsure which is the better solution: reading all the file contents into a nested structure indexed by filename, or reading each line of each file in turn and computing the mean. After reading the manual, I found that the fileinput module is not useful here because it reads the lines "serially", not "in parallel" as I need. Any guidance or advice is highly appreciated.

Solution

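Have a look at numpy. It can read the three files into three arrays (using loadtxt), calculate the average and export it to a text file (using savetxt). These two functions handle whitespace-separated text and keep the row and column layout, whereas fromfile and tofile default to raw binary.

import numpy as np

# loadtxt parses whitespace-separated text and keeps the m x n shape
a = np.loadtxt('A.csv')
b = np.loadtxt('B.csv')
c = np.loadtxt('C.csv')

# element-wise average of the three arrays
d = (a + b + c) / 3.0

# write the result as text with two decimal places
np.savetxt('D.csv', d, fmt='%.2f')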
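
To check the idea against the sample data from the question, here is a minimal sketch that generalizes it to any number of files. The helper name average_files, the .txt file names, and the temporary directory are assumptions made for illustration. It also assumes every file has the same shape and contains only whitespace-separated numbers.

```python
import os
import tempfile

import numpy as np


def average_files(in_paths, out_path):
    """Write the element-wise mean of equally shaped text matrices to out_path."""
    # loadtxt splits on whitespace by default; ndmin=2 keeps the m x n layout
    # even when a file has a single row.
    arrays = [np.loadtxt(p, ndmin=2) for p in in_paths]
    # Treat the list as one (files, m, n) array and average over the file axis.
    mean = np.mean(arrays, axis=0)
    np.savetxt(out_path, mean, fmt="%.2f")
    return mean


# Demo with the sample data from the question, written to a temporary directory.
samples = {"A": "1 2 3\n4 5 6\n", "B": "0 1 3\n2 4 5\n", "C": "2 5 6\n1 1 1\n"}
workdir = tempfile.mkdtemp()
paths = []
for name, text in samples.items():
    path = os.path.join(workdir, name + ".txt")
    with open(path, "w") as fh:
        fh.write(text)
    paths.append(path)

result = average_files(paths, os.path.join(workdir, "D.txt"))
print(result.round(2))
```

The rounded values match the file D expected in the question (1, 2.67, 4 and 2.33, 3.33, 4).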

Files of a few MB should not be a problem.
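
If the files were ever too large to load at once, the "parallel" line-by-line reading raised in the question can be sketched in plain Python with zip over the open file objects. This is only an illustration under some assumptions: the helper name average_line_by_line and the file names are hypothetical, every file is assumed to have the same number of rows and columns, and zip silently stops at the shortest file.

```python
import os
import tempfile


def average_line_by_line(in_paths, out_path):
    """Average matching lines of several text files, one row at a time."""
    files = [open(p) for p in in_paths]
    try:
        with open(out_path, "w") as out:
            # zip(*files) yields one tuple per row: the i-th line of every file.
            for lines in zip(*files):
                rows = [[float(x) for x in line.split()] for line in lines]
                # zip(*rows) groups the j-th value of each file's row.
                means = [sum(col) / len(col) for col in zip(*rows)]
                out.write(" ".join(f"{m:.2f}" for m in means) + "\n")
    finally:
        for fh in files:
            fh.close()


# Demo with the sample data from the question, written to a temporary directory.
samples = {"A": "1 2 3\n4 5 6\n", "B": "0 1 3\n2 4 5\n", "C": "2 5 6\n1 1 1\n"}
workdir = tempfile.mkdtemp()
paths = []
for name, text in samples.items():
    path = os.path.join(workdir, name + ".txt")
    with open(path, "w") as fh:
        fh.write(text)
    paths.append(path)

out_path = os.path.join(workdir, "D.txt")
average_line_by_line(paths, out_path)
with open(out_path) as fh:
    output = fh.read()
print(output)
```

Only one line per input file is held in memory at a time, at the cost of doing the arithmetic in pure Python rather than in numpy.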
