Storing a CSV file's contents into DataFrames [Python Pandas]

Question

I have written a function that reads CSV files, stores them as DataFrames, and resamples them on an hourly basis. Below is my code:

import pandas as pd

def ABC(path1, path2):
    df1 = pd.read_csv(path1, sep='\t', names=["Datetime", "Value"])
    df2 = pd.read_csv(path2, sep='\t', names=["Datetime", "Value"])
    df1['Datetime'] = pd.to_datetime(df1['Datetime'])
    df1 = df1.set_index('Datetime')
    # Hourly sums (older pandas wrote this as resample('H', how='sum'))
    df1 = df1.resample('h').sum()
    df2['Datetime'] = pd.to_datetime(df2['Datetime'])
    df2 = df2.set_index('Datetime')
    df2 = df2.resample('h').sum()
    # Result names differ from the function name so the function is not shadowed
    total = pd.DataFrame(df1['Value'] + df2['Value'])
    scaled = total * 0.519
    return total, scaled

# Raw strings keep the backslashes in the Windows paths literal
total, scaled = ABC(r'C:\Users\Desktop\B1.tsv',
                    r'C:\Users\Desktop\B2.tsv')
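
To make the hourly resampling step concrete, here is a minimal sketch on in-memory data. The three sample rows are made up for illustration and only assume the same headerless, tab-separated timestamp/value layout as the files in the question.

```python
import io

import pandas as pd

# Hypothetical TSV content (timestamp, value); not taken from the real files.
sample = io.StringIO(
    "2020-01-01 00:10:00\t1\n"
    "2020-01-01 00:40:00\t2\n"
    "2020-01-01 01:05:00\t5\n"
)

df = pd.read_csv(sample, sep="\t", names=["Datetime", "Value"])
df["Datetime"] = pd.to_datetime(df["Datetime"])

# Group rows into hourly bins by timestamp and sum each bin.
hourly = df.set_index("Datetime").resample("h").sum()
print(hourly)
```

The two readings in the 00:00 hour collapse into one row with their sum, and the 01:00 hour gets its own row.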

This program works well, but with 30 file paths it becomes impractical to create 30 DataFrames by hand and repeat this process for each one. I was thinking of doing it the following way:

def ABC():
    Path = ['B1', 'B2', 'B3']
    general = []
    for i in Path:
        # Raw string so the backslashes in the Windows path stay literal
        url = r'C:\Users\Desktop\%s.tsv' % i
        X = pd.read_csv(url, sep='\t', names=["Datetime", "Value"])
        X['Datetime'] = pd.to_datetime(X['Datetime'])
        X = X.set_index('Datetime')
        X = X.resample('h').sum()
        general.append(X)
        return general
df = ABC()
print(df)

The code above outputs just one DataFrame and does not do what the first script does. Any idea what I am doing wrong?

Recommended answer

You return too early: the return statement sits inside the for loop, so the function exits after the very first iteration. It's an indentation problem. The function should read:

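def ABC():
    Path = ['B1', 'B2', 'B3']
    general = pd.DataFrame()
    for i in Path:
        # Raw string so the backslashes in the Windows path stay literal
        url = r'C:\Users\Desktop\%s.tsv' % i
        X = pd.read_csv(url, sep='\t', names=["Datetime", "Value"])
        X['Datetime'] = pd.to_datetime(X['Datetime'])
        X = X.set_index('Datetime')
        # Hourly sums (older pandas wrote this as resample('H', how='sum'))
        X = X.resample('h').sum()
        if len(general) == 0:
            # First file: start the running total
            general = X
        else:
            # Later files: add to the running total, aligned on the Datetime index
            general = general + X
    return general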

Note the indentation of the last line.
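
The effect of that indentation can be seen in isolation with a small sketch. The two helper functions below are hypothetical and exist only to contrast a return inside the loop with a return after it.

```python
def collect_early(names):
    collected = []
    for name in names:
        collected.append(name)
        return collected  # inside the loop: exits during the first iteration


def collect_all(names):
    collected = []
    for name in names:
        collected.append(name)
    return collected  # after the loop: runs once every item has been handled


print(collect_early(["B1", "B2", "B3"]))  # ['B1']
print(collect_all(["B1", "B2", "B3"]))    # ['B1', 'B2', 'B3']
```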

EDIT: added code to sum the DataFrames inside the loop, as requested in a comment.

First, create an empty DataFrame called general. On the first iteration (when the length of general is still 0), assign the current DataFrame X to general. On each subsequent iteration, add the values of the current DataFrame to general, which holds the running sum of all previous DataFrames' values.
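
The same accumulation can also be written without the explicit first-iteration check. This is only a sketch of one possible alternative, not the answer's method: load_hourly and sum_hourly are hypothetical names, and fill_value=0 is an assumption that an hour missing from one file should count as 0 rather than produce NaN (plain + would give NaN for such hours).

```python
import functools

import pandas as pd


def load_hourly(path):
    """Read one headerless two-column TSV and return its hourly sums."""
    frame = pd.read_csv(path, sep="\t", names=["Datetime", "Value"])
    frame["Datetime"] = pd.to_datetime(frame["Datetime"])
    return frame.set_index("Datetime").resample("h").sum()


def sum_hourly(paths):
    """Sum the hourly 'Value' columns of all files, aligned on the timestamp index."""
    frames = [load_hourly(p) for p in paths]
    # DataFrame.add aligns on the index; fill_value=0 treats an hour that is
    # missing from one file as 0 (an assumption of this sketch).
    return functools.reduce(lambda acc, f: acc.add(f, fill_value=0), frames)
```

With 30 files, the call would take a list of 30 paths. Note that functools.reduce raises a TypeError when the list of paths is empty, so a caller may want to guard against that case.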
