Access index in pandas.Series.apply


Question

Let's say I have a MultiIndex Series s:

>>> s
     values
a b
1 2  0.1 
3 6  0.3
4 4  0.7

I want to apply a function that uses the index of each row:

def f(x):
   # conditions or computations using the indexes
   if x.index[0] and ...: 
   other = sum(x.index) + ...
   return something

How can I do s.apply(f) for such a function? What is the recommended way to perform this kind of operation? I expect to obtain a new Series that holds the result of applying this function to each row and keeps the same MultiIndex.

Recommended answer

I don't believe apply has access to the index; it passes each element to the function as a bare numpy scalar, not as a Series, as you can see:

In [27]: s.apply(lambda x: type(x))
Out[27]: 
a  b
1  2    <type 'numpy.float64'>
3  6    <type 'numpy.float64'>
4  4    <type 'numpy.float64'>

To get around this limitation, promote the indexes to columns, apply your function, and recreate a Series with the original index.

import pandas as pd
pd.Series(s.reset_index().apply(f, axis=1).values, index=s.index)
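
As a rough illustration of the workaround above, here is a minimal sketch that rebuilds the Series from the question and runs the one-liner end to end. The body of f is a made-up rule (scale each value by the sum of its index levels), since the question leaves the real computation unspecified; the column names a, b, and values are taken from the printed Series.

```python
import pandas as pd

# Rebuild the example Series shown in the question.
index = pd.MultiIndex.from_tuples([(1, 2), (3, 6), (4, 4)], names=["a", "b"])
s = pd.Series([0.1, 0.3, 0.7], index=index, name="values")


def f(row):
    # Hypothetical rule, not from the original post:
    # multiply the value by the sum of the two index levels.
    # After reset_index(), the levels "a" and "b" are ordinary columns.
    return row["values"] * (row["a"] + row["b"])


# Promote the index to columns, apply f row by row, then reattach the MultiIndex.
result = pd.Series(s.reset_index().apply(f, axis=1).values, index=s.index)
print(result)
```

Taking .values before rebuilding the Series matters here: reset_index() replaces the MultiIndex with a default integer index, so passing the applied result directly together with index=s.index would try to align labels that no longer match.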

Other approaches might use s.index.get_level_values, which often gets a little ugly in my opinion, or iterate with s.items() (a Series has no iterrows(); that method belongs to DataFrame), which is likely to be slower, perhaps depending on exactly what f does.
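
For comparison, the two alternatives just mentioned might look like the sketch below. It reuses the same made-up rule (value times the sum of the index levels) purely for illustration; the real f from the question is unknown, and the vectorized form only works when f can be written as array arithmetic.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples([(1, 2), (3, 6), (4, 4)], names=["a", "b"])
s = pd.Series([0.1, 0.3, 0.7], index=index, name="values")

# Option 1: pull each index level out as an array and compute in one shot.
a = s.index.get_level_values("a").to_numpy()
b = s.index.get_level_values("b").to_numpy()
vectorized = s * (a + b)

# Option 2: loop explicitly; s.items() yields (index_tuple, value) pairs,
# so each MultiIndex entry can be unpacked into its levels.
looped = pd.Series(
    [value * (a_i + b_i) for (a_i, b_i), value in s.items()],
    index=s.index,
)

print(vectorized)
print(looped)
```

Both results keep the original MultiIndex. The loop is the more general of the two, at the cost of running Python code once per row.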
