How to apply a function to two columns of a Pandas dataframe
Question
Suppose I have a df with the columns 'ID', 'col_1', 'col_2', and I define a function:
f = lambda x, y : my_function_expression
Now I want to apply f to df's two columns 'col_1' and 'col_2' to calculate a new column 'col_3' element-wise, somewhat like:
df['col_3'] = df[['col_1','col_2']].apply(f)
# Pandas gives : TypeError: ('<lambda>() takes exactly 2 arguments (1 given)'
How can I do this?
** Detailed sample added below **
import pandas as pd
df = pd.DataFrame({'ID':['1','2','3'], 'col_1': [0,2,3], 'col_2':[1,4,5]})
mylist = ['a','b','c','d','e','f']
def get_sublist(sta,end):
    return mylist[sta:end+1]
#df['col_3'] = df[['col_1','col_2']].apply(get_sublist,axis=1)
# expect above to output df as below
  ID  col_1  col_2            col_3
0  1      0      1       ['a', 'b']
1  2      2      4  ['c', 'd', 'e']
2  3      3      5  ['d', 'e', 'f']
Recommended answer
Here's an example using apply on the dataframe, which I am calling with axis = 1.
Note the difference: instead of trying to pass two values to the function f, rewrite the function to accept a single pandas Series object (one row of the dataframe), and then index the Series to get the values needed.
In [49]: df
Out[49]:
          0         1
0  1.000000  0.000000
1 -0.494375  0.570994
2  1.000000  0.000000
3  1.876360 -0.229738
4  1.000000  0.000000
In [50]: def f(x):
   ....:     return x[0] + x[1]
   ....:
In [51]: df.apply(f, axis=1) #passes a Series object, row-wise
Out[51]:
0 1.000000
1 0.076619
2 1.000000
3 1.646622
4 1.000000
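Applied to the sample from the question, the same idea might look like the sketch below. The dataframe, mylist and get_sublist are taken from the question. The lambda wrapper and the 'col_3_zip' column name are illustrative assumptions, not part of the original answer. Because the columns here have string labels, the row is indexed with row['col_1'] rather than x[0].

```python
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})
mylist = ['a', 'b', 'c', 'd', 'e', 'f']

def get_sublist(sta, end):
    # Inclusive slice of mylist from index sta to index end
    return mylist[sta:end + 1]

# With axis=1, apply passes each row as a Series; unpack the two values
# by column label and hand them to the unchanged two-argument function.
df['col_3'] = df.apply(lambda row: get_sublist(row['col_1'], row['col_2']), axis=1)

# Alternative (an assumption, not from the answer): pair the two columns
# with zip and call the function directly, bypassing apply.
df['col_3_zip'] = [get_sublist(a, b) for a, b in zip(df['col_1'], df['col_2'])]

print(df)
```

The wrapper leaves get_sublist untouched, which may be convenient when the two-argument function is also used elsewhere.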
Depending on your use case, it is sometimes helpful to create a pandas group object (via groupby), and then use apply on the group.
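A minimal sketch of the group-based variant follows. The 'key' column, the data and the sum_of_products function are all hypothetical, chosen only to show the shape of the call. Here the function receives a whole sub-dataframe per group rather than a single row.

```python
import pandas as pd

df = pd.DataFrame({
    'key': ['a', 'a', 'b', 'b'],   # hypothetical grouping column
    'col_1': [1, 2, 3, 4],
    'col_2': [10, 20, 30, 40],
})

def sum_of_products(group):
    # group is a DataFrame holding the col_1/col_2 rows of one key
    return (group['col_1'] * group['col_2']).sum()

# Select the two columns of interest, then apply the function once per group.
result = df.groupby('key')[['col_1', 'col_2']].apply(sum_of_products)

print(result)
```

The result is a Series indexed by the group key, with one value per group.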