Renaming pandas data frame columns using a for loop


Question

I'm not sure if this is a dumb way to go about things, but I've got several data frames, all of which have identical columns. I need to rename the columns in each one to reflect the name of that data frame (I'll be performing an outer merge of all of them afterwards).

Let's say the data frames are called df1, df2 and df3, and each contains the columns name, date, and count.

I'd like to rename the columns in df1 to name_df1, date_df1, and count_df1.

I've written a function to rename the columns, thus:

df_list=[df1, df2, df3]

def rename_cols():
    col_name="name"+suffix
    col_count="count"+suffix
    col_date="date"+suffix

for x in df_list:
    if x['name'].tail(1).item() == df1['name'].tail(1).item():
        suffix="_"+"df1"
        rename_cols()
        continue
    elif x['name'].tail(1).item() == df2['name'].tail(1).item():
        suffix="_"+"df2"
        rename_cols()
        continue
    else:
        suffix="_"+"df3"
        rename_cols()

    col_names=[col_name,col_date,col_count]
    x.columns=col_names

Unfortunately, I get the following error: KeyError: 'name'

I'm really struggling to figure out why that's going on. The columns of df1, the first data frame in df_list, get renamed. Everything else stays the same... Am I messing up basic syntax (probably), or do I have a fundamental misunderstanding of how things should work?

From what I can ascertain, the first data frame in the list is being iterated through more than once, but why would that be the case?

Recommended answer

I guess you can achieve this with something simpler, like this:

df_list=[df1, df2, df3]
for i, df in enumerate(df_list, 1):
    df.columns = [col_name+'_df{}'.format(i) for col_name in df.columns]
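
To see the loop above run end to end, here is a minimal sketch on three toy frames. The sample data is made up for illustration; only the column names name, date and count come from the question. It also shows that the rename happens in place, so the old labels no longer exist afterwards. That may be one reason a lookup like x['name'] raises KeyError when the renaming code is run a second time on the same frames, though this is an assumption about the asker's session, not something the question confirms.

```python
import pandas as pd

# Toy frames standing in for df1, df2, df3 (the values are invented).
df1 = pd.DataFrame({"name": ["a"], "date": ["2020-01-01"], "count": [1]})
df2 = pd.DataFrame({"name": ["b"], "date": ["2020-01-02"], "count": [2]})
df3 = pd.DataFrame({"name": ["c"], "date": ["2020-01-03"], "count": [3]})

df_list = [df1, df2, df3]
for i, df in enumerate(df_list, 1):
    # Assigning to .columns modifies the frame in place, so df1, df2 and df3
    # themselves are changed, not copies of them.
    df.columns = [f"{col}_df{i}" for col in df.columns]

renamed = [list(df.columns) for df in df_list]

# After the rename the original label "name" is gone from df1.
try:
    df1["name"]
    old_label_present = True
except KeyError:
    old_label_present = False
```

Because the frames are modified in place, rebuilding them from the source data before re-running the loop avoids stacking suffixes or looking up labels that no longer exist.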

If your DataFrames have prettier names, you can try:

df_names=('Home', 'Work', 'Park')
for df_name in df_names:
    df = globals()[df_name]
    df.columns = [col_name+'_{}'.format(df_name) for col_name in df.columns]

Or you can fetch the name of each variable by looking it up in globals() (or locals()):

df_list = [Home, Work, Park]
for df in df_list:
    # Find the first public global name bound to this exact object.
    name = [k for k, v in globals().items() if v is df and not k.startswith('_')][0]
    df.columns = [col_name+'_{}'.format(name) for col_name in df.columns]
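
As an alternative to looking names up in globals(), the labels can be kept next to the frames in a dict. This is a sketch of a different technique from the one in the answer, assuming the frames can be collected into a dict up front; the dict name frames and the sample data are hypothetical. It uses pandas' DataFrame.add_suffix, which returns a new frame with the suffix appended to every column label and leaves the original untouched.

```python
import pandas as pd

# Hypothetical labelled frames; the labels mirror Home, Work and Park above.
frames = {
    "Home": pd.DataFrame({"name": ["a"], "date": ["2020-01-01"], "count": [1]}),
    "Work": pd.DataFrame({"name": ["b"], "date": ["2020-01-02"], "count": [2]}),
    "Park": pd.DataFrame({"name": ["c"], "date": ["2020-01-03"], "count": [3]}),
}

# add_suffix returns a renamed copy, so re-running this is safe:
# the frames in `frames` keep their original column labels.
renamed = {label: df.add_suffix(f"_{label}") for label, df in frames.items()}
```

Keeping the label as a dict key means the name never has to be recovered from the variable itself, and the originals stay available if the rename needs to be repeated.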
