Looping through Multiple CSV files and producing Multiple Outputs
Question
I am writing a Python script that opens a .csv file, defines the dataframe, runs some analysis (e.g. aggregating data, splitting columns, finding averages) and plots the output of the analysis on a graph. The outputs will be both a graph (.png file) and a csv file whose name is the original file name with "_ANALYSIS" added at the end.
I have set it up as a loop in Jupyter Notebook:
# import multiple csv files
import glob
import pandas as pd
import numpy as np
from pytz import all_timezones
import matplotlib.pyplot as plt

files = glob.glob('folder/*.csv')
for file in files:
    df = pd.read_csv(file)
    # START OF THE ANALYSIS
    # Multiple lines of code starts here
    # GRAPH some outputs from the analysis
    df2 = df.replace(0, np.nan)
    fig, ax = plt.subplots()
    df2.groupby('Day_type').plot(x = 'Time', y = 'avg_vt', ax=ax, grid=True)
    # OUTPUT FILES: graph + csv file
    plt.savefig('*.png', index = False)
    file_name="file"+str(i+1)+"_ANALYSIS"
    df.to_csv('file1_ANALYSIS.csv', index = False)
Unfortunately, it isn't producing any outputs. There is no problem with the analysis code itself, as I tested it before I added the loop.
Thanks, R
Recommended answer
Slightly more elegant with pathlib:
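from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd

# Raw string, so the backslashes in the Windows path are not read as escapes
folder = r"C:\Users\Renaldo.Moonu\Desktop\folder name"

for file in Path(folder).glob('*.csv'):
    df = pd.read_csv(file)
    df.fillna(0, inplace=True)
    fig, ax = plt.subplots()
    df.groupby('Day_type').plot(x='Time', y='avg_vt', ax=ax, grid=True)
    # One graph and one csv per input file, named after that file
    fig.savefig(file.with_name(file.stem + '_ANALYSIS.png'))  # savefig takes no index argument
    df.to_csv(file.with_name(file.stem + '_ANALYSIS.csv'), index=False)
    plt.close(fig)  # release the figure before the next file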
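One detail worth sketching separately is how the output names can be derived so that each input file gets its own "_ANALYSIS" graph and csv, as the question asks. The following is a minimal sketch and not part of the original answer: the helper names output_paths and input_files and the example file site_a.csv are hypothetical. It assumes the results are written into the same folder as the inputs, in which case files left by an earlier run should be skipped so they are not analysed again.

```python
from pathlib import Path


def output_paths(csv_path):
    """Return (png_path, csv_path) for the results of one input file.

    Hypothetical helper: appends "_ANALYSIS" to the original file name
    and keeps the results in the same folder as the input.
    """
    stem = csv_path.stem + "_ANALYSIS"
    return csv_path.with_name(stem + ".png"), csv_path.with_name(stem + ".csv")


def input_files(folder):
    """List the csv files to analyse, skipping results of earlier runs."""
    return sorted(
        p for p in Path(folder).glob("*.csv")
        if not p.stem.endswith("_ANALYSIS")
    )


# Example with a made-up file name
png_path, csv_out = output_paths(Path("folder") / "site_a.csv")
print(png_path.name)  # site_a_ANALYSIS.png
print(csv_out.name)   # site_a_ANALYSIS.csv
```

Inside the loop, the two returned paths could then be passed to fig.savefig and df.to_csv respectively, with input_files(folder) taking the place of the plain glob call.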