pandas apply function that returns multiple values to rows in pandas dataframe
Question
I have a dataframe with a time index and 3 columns containing the coordinates of a 3D vector:
x y z
ts
2014-05-15 10:38 0.120117 0.987305 0.116211
2014-05-15 10:39 0.117188 0.984375 0.122070
2014-05-15 10:40 0.119141 0.987305 0.119141
2014-05-15 10:41 0.116211 0.984375 0.120117
2014-05-15 10:42 0.119141 0.983398 0.118164
I would like to apply a transformation to each row that also returns a vector:
def myfunc(a, b, c):
    # do something
    return e, f, g
But if I do:
df.apply(myfunc, axis=1)
I end up with a pandas Series whose elements are tuples. This is because apply takes the result of myfunc without unpacking it. How can I change myfunc so that I obtain a new df with 3 columns?
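To make the problem concrete, here is a minimal sketch that reproduces it. The two-row frame is copied from the sample data above; the function name `myfunc_tuple` and its "+ 1" transformation are hypothetical placeholders, since the real transformation is not given.

```python
import pandas as pd

# Two rows copied from the sample data in the question.
df = pd.DataFrame(
    {"x": [0.120117, 0.117188], "y": [0.987305, 0.984375], "z": [0.116211, 0.122070]},
    index=pd.to_datetime(["2014-05-15 10:38", "2014-05-15 10:39"]),
)
df.index.name = "ts"


def myfunc_tuple(row):
    # Hypothetical transformation: any function that returns a tuple behaves the same way.
    return row["x"] + 1, row["y"] + 1, row["z"] + 1


result = df.apply(myfunc_tuple, axis=1)

# The result is a single Series; each element is one whole tuple, not three columns.
print(type(result).__name__)
print(type(result.iloc[0]).__name__)
```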
All solutions below work. The Series solution allows for column names, while the List solution seems to execute faster.
import pandas as pd

def myfunc1(args):
    e=args[0] + 2*args[1]
    f=args[1]*args[2] +1
    g=args[2] + args[0] * args[1]
    return pd.Series([e,f,g], index=['a', 'b', 'c'])

def myfunc2(args):
    e=args[0] + 2*args[1]
    f=args[1]*args[2] +1
    g=args[2] + args[0] * args[1]
    return [e,f,g]
%timeit df.apply(myfunc1 ,axis=1)
100 loops, best of 3: 4.51 ms per loop
%timeit df.apply(myfunc2 ,axis=1)
100 loops, best of 3: 2.75 ms per loop
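A note for newer pandas releases: as far as I am aware, since version 0.23 a list-returning function like myfunc2 is no longer expanded into columns by default and instead yields a Series of lists. The sketch below assumes such a version and uses the `result_type="expand"` argument of `DataFrame.apply` to get a DataFrame back. The names `myfunc_list` and `expanded` are illustrative, and the rows are accessed by label rather than by position.

```python
import pandas as pd

# Two rows copied from the sample data in the question.
df = pd.DataFrame(
    {"x": [0.120117, 0.117188], "y": [0.987305, 0.984375], "z": [0.116211, 0.122070]},
    index=pd.to_datetime(["2014-05-15 10:38", "2014-05-15 10:39"]),
)
df.index.name = "ts"


def myfunc_list(row):
    # Same arithmetic as myfunc2, using column labels instead of positions.
    e = row["x"] + 2 * row["y"]
    f = row["y"] * row["z"] + 1
    g = row["z"] + row["x"] * row["y"]
    return [e, f, g]


# result_type="expand" turns each returned list into one row of a new DataFrame.
expanded = df.apply(myfunc_list, axis=1, result_type="expand")

# The expanded columns are numbered 0, 1, 2; rename them if labels are needed.
expanded.columns = ["a", "b", "c"]
print(expanded)
```

This keeps the cheaper list return while still giving named columns, at the cost of one extra renaming step.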
Accepted answer
Just return a list instead of a tuple.
In [81]: df
Out[81]:
x y z
ts
2014-05-15 10:38:00 0.120117 0.987305 0.116211
2014-05-15 10:39:00 0.117188 0.984375 0.122070
2014-05-15 10:40:00 0.119141 0.987305 0.119141
2014-05-15 10:41:00 0.116211 0.984375 0.120117
2014-05-15 10:42:00 0.119141 0.983398 0.118164
[5 rows x 3 columns]
In [82]: def myfunc(args):
   ....:     e=args[0] + 2*args[1]
   ....:     f=args[1]*args[2] +1
   ....:     g=args[2] + args[0] * args[1]
   ....:     return [e,f,g]
   ....:
In [83]: df.apply(myfunc ,axis=1)
Out[83]:
x y z
ts
2014-05-15 10:38:00 2.094727 1.114736 0.234803
2014-05-15 10:39:00 2.085938 1.120163 0.237427
2014-05-15 10:40:00 2.093751 1.117629 0.236770
2014-05-15 10:41:00 2.084961 1.118240 0.234512
2014-05-15 10:42:00 2.085937 1.116202 0.235327
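The output above keeps the original column names x, y, z, which reflects the pandas version used at the time. On a newer pandas (0.23 or later, as far as I know), the same shape of result can be requested explicitly with `result_type="broadcast"`, which keeps the original index and columns. The following is a sketch under that assumption; `myfunc_list` and `same_shape` are illustrative names.

```python
import pandas as pd

# First row of the sample data, enough to compare against Out[83].
df = pd.DataFrame(
    {"x": [0.120117], "y": [0.987305], "z": [0.116211]},
    index=pd.to_datetime(["2014-05-15 10:38"]),
)
df.index.name = "ts"


def myfunc_list(row):
    # Same arithmetic as the answer's myfunc, using column labels.
    e = row["x"] + 2 * row["y"]
    f = row["y"] * row["z"] + 1
    g = row["z"] + row["x"] * row["y"]
    return [e, f, g]


# "broadcast" writes the returned values back into a frame shaped like df,
# so the function must return exactly as many values as df has columns.
same_shape = df.apply(myfunc_list, axis=1, result_type="broadcast")
print(same_shape)
```

Use `result_type="expand"` instead when the number of returned values differs from the number of input columns.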