Calculate Mean for each CSV row

Problem description

I have three CSV files named file1, file2, and file3. Each CSV has 3 columns and 5653 rows:

1   0   -95
2   0   -94
3   0   -93
...
51  0   -93
0   1   -92
1   1   -91
2   1   -90
..

The first column is an x variable, the second is a y variable, and the third is a measured value whose mean I want to compute.

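What I want to do is:

  • Read the first row of file1
  • Read the first row of file2
  • Read the first row of file3, then calculate the mean of the three measured values.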

For example:

file1 row1 -98 
file2 row1 -97
file3 row1 -95

mean -96.666666667
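
As a quick check of the arithmetic, the three example values can be averaged directly. This is only a sketch using Python's standard library; note that the result is negative, since all three measured values are negative.

```python
from statistics import mean

# Measured values from row 1 of file1, file2 and file3 in the example above.
values = [-98, -97, -95]

row_mean = mean(values)
print(round(row_mean, 9))  # -96.666666667
```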

I want to write that mean to a new CSV file in the following format:

 1,0,mean_of_row1 (which would be -96.666666667)
 2,0,mean_of_row2
 3,0,mean_of_row3
 4,0,mean_of_row4

Currently I am able to calculate the mean of the measurement column of each file and store it as a row in a results file:

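import pandas as pd
import numpy as np

csv_file_list = ["file1.csv", "file2.csv", "file3.csv"]
result_csv = "result.csv"

# Text mode ('w'): in Python 3, writing a str to a file opened with 'wb' raises a TypeError.
with open(result_csv, 'w') as rf:
    for idx, csv_file in enumerate(csv_file_list):
        # header=None: the files have no header row, so the first data row is kept.
        csv_data = pd.read_csv(csv_file, header=None).values
        # Mean of the whole measurement column of this one file (not per row).
        mean_measured = np.mean(csv_data[:, 2])
        rf.write(','.join([str(0), str(idx), str(mean_measured)]) + "\n")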

But how can I achieve what I actually want? Thanks in advance.

Recommended answer

In this situation, pandas is really helpful. You can avoid explicit loops and read each CSV straight into a DataFrame. Then concatenate the three DataFrames side by side and compute pandas.DataFrame.mean over the measurement columns row-wise (axis=1).

pandas.read_csv can limit the number of rows it reads via the nrows parameter.

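import pandas as pd

# names=... supplies column labels for files without a header row;
# nrows caps the number of rows read from each file.
df1 = pd.read_csv('file1.csv', names=['x1', 'Y1', 'Value1'], nrows=5653)
df2 = pd.read_csv('file2.csv', names=['x2', 'Y2', 'Value2'], nrows=5653)
df3 = pd.read_csv('file3.csv', names=['x3', 'Y3', 'Value3'], nrows=5653)

# Place the three frames side by side; rows are aligned on the default row index.
df_concat = pd.concat([df1, df2, df3], axis=1)
print(df_concat)

# Row-wise mean of the three measurement columns.
df_concat['meanvalue'] = df_concat[['Value1', 'Value2', 'Value3']].mean(axis=1)
print(df_concat.to_csv(columns=['meanvalue'], index=False))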

Output

meanvalue
-96.5
-97.0
-86.0
-95.0
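
The output above contains only the mean column, whereas the question asks for rows of the form x,y,mean. One possible way to finish that step is sketched below. It assumes that the files are comma-separated, have no header row, and list the same (x, y) pairs in the same order. The helper name `write_row_means` and the in-memory sample data are hypothetical and stand in for the real files.

```python
import io

import pandas as pd


def write_row_means(sources, target):
    """Average the third column across all sources, row by row, and write x,y,mean."""
    # names=... labels the columns of header-less input (assumed layout: x, y, value).
    frames = [pd.read_csv(src, header=None, names=["x", "y", "value"]) for src in sources]
    # Put the measurement columns side by side and average across them.
    values = pd.concat([frame["value"] for frame in frames], axis=1)
    # Take x and y from the first file (assumes all files share the same x, y order).
    result = frames[0][["x", "y"]].copy()
    result["mean"] = values.mean(axis=1)
    # No header and no index, matching the format requested in the question.
    result.to_csv(target, header=False, index=False)
    return result


# Hypothetical sample data standing in for file1.csv, file2.csv and file3.csv.
file1 = io.StringIO("1,0,-98\n2,0,-94\n")
file2 = io.StringIO("1,0,-97\n2,0,-96\n")
file3 = io.StringIO("1,0,-95\n2,0,-92\n")

out = io.StringIO()
result = write_row_means([file1, file2, file3], out)
print(out.getvalue())
```

With real files, the same helper could be called with file paths instead of the in-memory buffers, for example `write_row_means(["file1.csv", "file2.csv", "file3.csv"], "result.csv")`.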
