How to apply a function to two columns of a Pandas dataframe


Question

Suppose I have a df with the columns 'ID', 'col_1', and 'col_2', and I define a function:

f = lambda x, y: my_function_expression

Now I want to apply f to df's two columns 'col_1' and 'col_2' to compute a new column 'col_3' element-wise, something like:

df['col_3'] = df[['col_1','col_2']].apply(f)  
# Pandas gives : TypeError: ('<lambda>() takes exactly 2 arguments (1 given)'

How can I do this?

A detailed example, added for clarity:

import pandas as pd

df = pd.DataFrame({'ID':['1','2','3'], 'col_1': [0,2,3], 'col_2':[1,4,5]})
mylist = ['a','b','c','d','e','f']

def get_sublist(sta,end):
    return mylist[sta:end+1]

#df['col_3'] = df[['col_1','col_2']].apply(get_sublist,axis=1)
# expect above to output df as below 

  ID  col_1  col_2            col_3
0  1      0      1       ['a', 'b']
1  2      2      4  ['c', 'd', 'e']
2  3      3      5  ['d', 'e', 'f']

Recommended answer

Here's an example using apply on the dataframe, called with axis=1 so that the function receives one row at a time.

Note the difference: instead of trying to pass two values to the function f, rewrite the function to accept a single pandas Series object (the row), and then index that Series to get the values you need.

In [49]: df
Out[49]: 
          0         1
0  1.000000  0.000000
1 -0.494375  0.570994
2  1.000000  0.000000
3  1.876360 -0.229738
4  1.000000  0.000000

In [50]: def f(x):    
   ....:  return x[0] + x[1]  
   ....:  

In [51]: df.apply(f, axis=1) #passes a Series object, row-wise
Out[51]: 
0    1.000000
1    0.076619
2    1.000000
3    1.646622
4    1.000000
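
Applied to the sample data from the question, the same idea might look like the sketch below. The wrapper name get_sublist_from_row is hypothetical (it does not appear in the original post); it accepts the row Series and indexes it by column label before calling the two-argument function.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})
mylist = ['a', 'b', 'c', 'd', 'e', 'f']

def get_sublist(sta, end):
    # Original two-argument function from the question (inclusive end index)
    return mylist[sta:end + 1]

def get_sublist_from_row(row):
    # Hypothetical wrapper: row is a Series holding one row of the two columns
    return get_sublist(row['col_1'], row['col_2'])

# axis=1 passes each row to the wrapper as a Series
df['col_3'] = df[['col_1', 'col_2']].apply(get_sublist_from_row, axis=1)
print(df)
```

The wrapper keeps get_sublist unchanged, so it can still be called directly with two plain integers elsewhere.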

Depending on your use case, it is sometimes helpful to create a pandas groupby object first, and then use apply on the groups.
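
The answer does not show the grouped variant, so the following is only a minimal sketch under assumed data: the column names 'key', 'a', 'b' and the function sum_of_products are hypothetical and chosen for illustration. With a groupby object, the applied function receives a DataFrame per group rather than a Series per row.

```python
import pandas as pd

# Hypothetical data; 'key', 'a' and 'b' are illustrative names, not from the post
df = pd.DataFrame({'key': ['x', 'x', 'y'], 'a': [1, 2, 3], 'b': [10, 20, 30]})

def sum_of_products(group):
    # group is a DataFrame containing the 'a' and 'b' rows of one key
    return (group['a'] * group['b']).sum()

# Select the two columns of interest, then apply the function once per group
result = df.groupby('key')[['a', 'b']].apply(sum_of_products)
print(result)
```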
