Python: How to write data in new columns for every file read by numpy?
Problem description
I have several text files with the following structure. They all have the same number of columns but different numbers of rows:
1.txt
2013-08-29T15:11:18.55912 0.019494552 0.110042184 0.164076427 0.587849877
2013-08-29T15:11:18.65912 0.036270974 0.097213155 0.122628797 0.556928624
2013-08-29T15:11:18.75912 0.055350041 0.104121094 0.121641949 0.593113069
2013-08-29T15:11:18.85912 0.057159263 0.107410588 0.198122695 0.591797271
2013-08-29T15:11:18.95912 0.05288292 0.102476346 0.172958062 0.591139372
2013-08-29T15:11:19.05912 0.043507861 0.104121094 0.162102731 0.598376261
2013-08-29T15:11:19.15912 0.068343545 0.102805296 0.168517245 0.587849877
2013-08-29T15:11:19.25912 0.054527668 0.105765841 0.184306818 0.587191978
2013-08-29T15:11:19.35912 0.055678991 0.107739538 0.169997517 0.539165352
2013-08-29T15:11:19.45912 0.05321187 0.102476346 0.167530397 0.645744989
2.txt
2013-08-29T16:46:05.41730 0.048771052 0.10642374 0.180852849 0.430612023
2013-08-29T16:46:05.51730 0.046303932 0.112673779 0.166050124 0.518112585
2013-08-29T16:46:05.61730 0.059955334 0.149845068 0.164569851 0.511533595
2013-08-29T16:46:05.71730 0.042192064 0.107410588 0.115227435 0.476007051
2013-08-29T16:46:05.81730 0.037915721 0.115634324 0.177892304 0.519428383
2013-08-29T16:46:05.91730 0.043507861 0.120568566 0.187267364 0.483243939
2013-08-29T16:46:06.01730 0.042356538 0.10642374 0.143352612 0.522059978
This code reads all the text files in the folder, does some math, and is supposed to write the results of each text file in new columns of a single CSV file.
import csv
import glob
from numpy import loadtxt, mean

# raw string so the backslashes in the Windows path are not read as escapes
files_ = glob.glob(r'D:\Test files\New folder\*.txt')
averages_ = []
seg_len = 3

def cum_sum(lis):
    # running total of the second entry (first data column mean) of each row
    total = 0
    for x in lis:
        total += x[1]
        yield total

# Python 3: text mode with newline='' (the original 'wb' is the Python 2 form)
with open('outfile.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    for path in files_:
        acols, f_column, average_original, fcol = [], [], [], []
        data = loadtxt(path, usecols=(1, 2, 3, 4))
        for x in range(0, len(data[:, 0]), seg_len):
            # some math on each column
            sample_means = [x] + [mean(data[x:x + seg_len, col]) for col in range(4)]
            # change types and save in a list
            float_means = ["%1f" % (v) for v in sample_means]
            # append previous two lines in lists
            average_original.append(sample_means)
            acols.append(float_means)
        fcol = list(cum_sum(average_original))
        # write fcol in a column next to acols
        acols = [row + [col] for row, col in zip(acols, fcol)]
        averages_.append(acols)
    # each file's rows are written below the previous file's rows
    for row in averages_:
        writer.writerows(row)
Q:
However, I cannot get the code to write new columns for each new file. The most relevant post I found was "Python: How do I get a new column for every file I read?", but line.strip() doesn't work for me.
I would appreciate any hints on how to approach this.
Recommended answer
Does this work for you?
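import pandas as pd

# first value of each window plus the mean of that window
mad = lambda x: x[0] + x.mean()
frames = []
for f in ['1.txt', '2.txt']:
    tmp = pd.read_csv(f, header=None, sep=r'\s+')
    tmp = tmp.iloc[:, 1:5]  # keep the four numeric columns (was tmp.ix[:,1:5])
    # rolling(3).apply is the current spelling of pd.rolling_apply(tmp, 3, mad)
    frames.append(tmp.rolling(3).apply(mad, raw=True))
# axis=1 places each file's results side by side as new columns
df = pd.concat(frames, axis=1)
df.to_csv('test.csv')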
The rolling_apply function (written as DataFrame.rolling(3).apply(...) in current pandas releases) applies a moving function along each column, here with a window of 3.
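To make the window idea concrete, here is a minimal plain-Python sketch of what a moving function with a window of 3 computes on one column. The helper name rolling_apply_1d is hypothetical and is not part of pandas; it only mimics the behaviour of leaving the first window - 1 positions as NaN until a full window is available. The sample values are the first data column of 1.txt.

```python
import math

def rolling_apply_1d(values, window, func):
    """Hypothetical helper: apply func to each full trailing window."""
    out = []
    for end in range(1, len(values) + 1):
        if end < window:
            out.append(math.nan)  # not enough samples for a full window yet
        else:
            out.append(func(values[end - window:end]))
    return out

def mad(x):
    # same function as in the answer: first element plus the window mean
    return x[0] + sum(x) / len(x)

# first data column of 1.txt (first five rows)
column = [0.019494552, 0.036270974, 0.055350041, 0.057159263, 0.05288292]
result = rolling_apply_1d(column, 3, mad)
print(result)
```

Note that these windows overlap (rows 0-2, 1-3, 2-4, ...), whereas the loop in the question steps through the data in non-overlapping segments of seg_len rows.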
I'm sorry if this isn't quite what you want, but I think it shows how powerful pandas can be.
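If you would rather stay with the csv module from the question, one possible approach is to pair up row k of every file and write the pairs as a single CSV line. The sketch below assumes the per-file results are already collected as lists of rows (like averages_ in the question); the helper write_side_by_side, its width parameter, and the sample numbers are made up for illustration. It writes to an in-memory buffer so it can be run without any input files.

```python
import csv
import io
from itertools import zip_longest

def write_side_by_side(blocks, out, width):
    """Hypothetical helper: blocks holds one list of rows per input file."""
    writer = csv.writer(out)
    # zip_longest pairs row k of every file; files that ran out yield None
    for rows in zip_longest(*blocks):
        line = []
        for row in rows:
            # pad with empty cells when a file has fewer rows than the others
            line.extend(row if row is not None else [""] * width)
        writer.writerow(line)

# made-up stand-ins for the per-file lists collected in averages_
file1 = [[0, 0.1, 0.2], [3, 0.3, 0.4], [6, 0.5, 0.6]]
file2 = [[0, 1.1, 1.2], [3, 1.3, 1.4]]

buf = io.StringIO()
write_side_by_side([file1, file2], buf, width=3)
csv_text = buf.getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with open('outfile.csv', 'w', newline='') in place of the buffer. Padding with empty cells keeps the columns of later files aligned even though the input files have different numbers of rows.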