Calculate Mean for each CSV row
Problem description
I have three CSV files named file1, file2, and file3. Each CSV has 3 columns and 5653 rows:
1 0 -95
2 0 -94
3 0 -93
...
51 0 -93
0 1 -92
1 1 -91
2 1 -90
..
The first column is an x variable, the second is a y variable, and the third is a measured value for which I want the mean.
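What I want to do is:
- read the first row of file1
- read the first row of file2
- read the first row of file3, then calculate the mean of the measured values.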
For example:
file1 row1 -98
file2 row1 -97
file3 row1 -95
mean -96.666666667
I want to write those means into a new CSV file with the following format:
1,0,mean_of_row1 (which would be -96.666666667)
2,0,mean_of_row2
3,0,mean_of_row3
4,0,mean_of_row4
Currently I am able to calculate the mean of the whole measurement column of each file and store it as a row in a results file:
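import pandas as pd
import numpy as np

csv_file_list = ["file1.csv", "file2.csv", "file3.csv"]
result_csv = "result.csv"

# Text mode ('w'): in Python 3, writing a str to a file opened with 'wb' raises TypeError
with open(result_csv, 'w') as rf:
    for idx, csv_file in enumerate(csv_file_list):
        # header=None so the first data row is not consumed as column names
        csv_data = pd.read_csv(csv_file, header=None).values
        # Mean of the entire measurement column (third column) of this file
        mean_measured = np.mean(csv_data[:, 2])
        rf.write(','.join([str(0), str(idx), str(mean_measured) + "\n"]))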
But how can I achieve what I actually want? Thanks in advance.
Recommended answer
In this situation, pandas is really helpful. You can avoid explicit loops and read each CSV straight into a DataFrame. Then join the three DataFrames into one and use pandas.DataFrame.mean to calculate the row-wise mean of the required columns.
pandas.read_csv can optionally limit the number of rows it reads via the nrows parameter.
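import pandas as pd

# The files have no header row, so column names are supplied via names=
# nrows=5653 matches the row count stated in the question
df1 = pd.read_csv('file1.txt', names=['x1', 'Y1', 'Value1'], nrows=5653)
df2 = pd.read_csv('file2.txt', names=['x2', 'Y2', 'Value2'], nrows=5653)
df3 = pd.read_csv('file3.txt', names=['x3', 'Y3', 'Value3'], nrows=5653)

# Place the three frames side by side; rows are aligned by position
df_concat = pd.concat([df1, df2, df3], axis=1)
print(df_concat)

# Row-wise mean of the three measurement columns
df_concat['meanvalue'] = df_concat[['Value1', 'Value2', 'Value3']].mean(axis=1)
print(df_concat.to_csv(columns=['meanvalue'], index=False))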
Output
meanvalue
-96.5
-97.0
-86.0
-95.0
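The output above contains only the mean column, while the question asks for rows of the form x,y,mean. A minimal sketch of one way to get that format follows. It assumes the three files list the same x/y pairs in the same order, that they are comma-separated with no header row, and it uses in-memory sample data in place of the real files. The helper name mean_across_files and the sample values are hypothetical and are not taken from the original answer.

```python
import io

import pandas as pd


def mean_across_files(sources, sep=","):
    """Return a DataFrame with columns x, y, mean (row-wise mean across sources).

    Assumes every source has the same x/y rows in the same order.
    """
    frames = [
        pd.read_csv(src, sep=sep, header=None, names=["x", "y", "value"])
        for src in sources
    ]
    # Put the measurement columns side by side, one column per source;
    # rows are aligned by position.
    values = pd.concat(
        [frame["value"] for frame in frames], axis=1, keys=range(len(frames))
    )
    # Take x and y from the first source and append the row-wise mean.
    result = frames[0][["x", "y"]].copy()
    result["mean"] = values.mean(axis=1)
    return result


# Hypothetical sample data standing in for file1, file2 and file3.
file1 = io.StringIO("1,0,-98\n2,0,-94\n")
file2 = io.StringIO("1,0,-97\n2,0,-93\n")
file3 = io.StringIO("1,0,-95\n2,0,-92\n")

result = mean_across_files([file1, file2, file3])

# Without a path, to_csv returns the CSV text instead of writing a file.
csv_text = result.to_csv(index=False, header=False)
print(csv_text)
```

With real files, pass the file paths instead of the StringIO objects and write the result with result.to_csv("result.csv", index=False, header=False). If the files are whitespace-separated, as the sample rows in the question suggest, passing sep=r"\s+" should work instead of the comma default.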