Equivalent of transform in R/ddply in Python/pandas?


Question

In R's ddply function, you can compute new columns group-wise and append the result to the original data frame, for example:

ddply(mtcars, .(cyl), transform, n=length(cyl)) # n is appended to the df

In Python/pandas, I compute the result first and then merge it back, for example:

df1 = mtcars.groupby("cyl").apply(lambda x: pd.Series(x["cyl"].count(), index=["n"])).reset_index()
mtcars = pd.merge(mtcars, df1, on=["cyl"])

or something like that.

However, this always feels cumbersome. Is it possible to do it all in one step?

Thanks.

Recommended answer

You can add a column to a DataFrame by assigning the result of a groupby/transform operation to it:

mtcars['n'] = mtcars.groupby("cyl")['cyl'].transform('count')


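import pandas as pd
# Note: pandas.rpy was removed in pandas 0.20; on newer versions, load mtcars
# another way (for example via the rpy2 project) before running the lines below.
import pandas.rpy.common as com

mtcars = com.load_data('mtcars')
# transform returns one value per original row, so it can be assigned directly
mtcars['n'] = mtcars.groupby("cyl")['cyl'].transform('count')
print(mtcars.head())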

yields

                    mpg  cyl  disp   hp  drat     wt   qsec  vs  am  gear  carb   n
Mazda RX4          21.0    6   160  110  3.90  2.620  16.46   0   1     4     4   7
Mazda RX4 Wag      21.0    6   160  110  3.90  2.875  17.02   0   1     4     4   7
Datsun 710         22.8    4   108   93  3.85  2.320  18.61   1   1     4     1  11
Hornet 4 Drive     21.4    6   258  110  3.08  3.215  19.44   1   0     3     1   7
Hornet Sportabout  18.7    8   360  175  3.15  3.440  17.02   0   0     3     2  14
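
If the mtcars dataset is not at hand, the same pattern can be tried on a small hand-built frame. The following is a minimal sketch; the data values and index labels are made up for illustration and are not taken from mtcars:

```python
import pandas as pd

# Toy data standing in for mtcars (values are illustrative, not the real dataset).
df = pd.DataFrame(
    {"cyl": [6, 6, 4, 8, 4], "wt": [2.5, 3.0, 2.0, 3.5, 2.25]},
    index=["a", "b", "c", "d", "e"],
)

# transform returns a Series aligned to df's index, with one value per
# original row, so it can be assigned directly as a new column.
df["n"] = df.groupby("cyl")["cyl"].transform("count")
print(df)
```

Because the result keeps the original index, no merge step is needed.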


To add multiple columns, you could use groupby/apply. Make sure the function you apply returns a DataFrame with the same index as its input. For example,

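# group_keys=False keeps the original index on the result (needed on pandas >= 2.0);
# len(x) avoids relying on the grouping column being present inside each group.
mtcars[['n','total_wt']] = mtcars.groupby("cyl", group_keys=False).apply(
    lambda x: pd.DataFrame({'n': len(x), 'total_wt': x['wt'].sum()},
                           index=x.index))
print(mtcars.head())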

yields

                    mpg  cyl  disp   hp  drat     wt   qsec  vs  am  gear  carb   n  total_wt
Mazda RX4          21.0    6   160  110  3.90  2.620  16.46   0   1     4     4   7    21.820
Mazda RX4 Wag      21.0    6   160  110  3.90  2.875  17.02   0   1     4     4   7    21.820
Datsun 710         22.8    4   108   93  3.85  2.320  18.61   1   1     4     1  11    25.143
Hornet 4 Drive     21.4    6   258  110  3.08  3.215  19.44   1   0     3     1   7    21.820
Hornet Sportabout  18.7    8   360  175  3.15  3.440  17.02   0   0     3     2  14    55.989
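
When each new column is a simple per-group aggregate, another option is one transform call per column, which avoids apply altogether. This is a sketch of that alternative rather than the approach from the answer above; it uses made-up toy data, and the column names n and total_wt simply mirror the example above:

```python
import pandas as pd

# Toy data standing in for mtcars (values are illustrative, not the real dataset).
df = pd.DataFrame({"cyl": [6, 6, 4, 8, 4], "wt": [2.5, 3.0, 2.0, 3.5, 2.25]})

# Build the groupby once and reuse it for every new column.
g = df.groupby("cyl")

# Each transform returns a Series aligned to df's index, so assign can
# attach several group-wise columns in a single expression.
df = df.assign(
    n=g["wt"].transform("count"),
    total_wt=g["wt"].transform("sum"),
)
print(df)
```

Note that "count" skips missing values, so n here is the number of non-null wt entries in each group.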
