How to apply a function on a Series


Question

Given a Series s:

                  Name
0     Tennessee Oilers
1     Tennessee Titans
2  Washington Redskins

I would like to apply a function to rename the values.

translate = {
    'Houston Oilers': 'Tennessee Titans',
    'Tennessee Oilers': 'Tennessee Titans'
}

s = s.apply(lambda x: translate.get(x, x))

This raises:

TypeError: ("'Series' objects are mutable, thus they cannot be hashed", u'occurred at index 0')

Had I applied this to a DataFrame's column instead, it would have worked.

I thought I was doing this according to the docs. Can you correct me, please?

Recommended answer

Use map to perform the lookup:

In [204]:
translate = {
    'Houston Oilers': 'Tennessee Titans',
    'Tennessee Oilers': 'Tennessee Titans'
}
s.map(translate)

Out[204]:
0    Tennessee Titans
1                 NaN
2                 NaN
Name: Name, dtype: object

The reason s = s.apply(lambda x: translate.get(x, x)) fails is that the x passed to the lambda here is a pandas Series rather than a single value. A Series cannot be hashed, and dict keys must be hashable, so it cannot be used as the lookup key. That is exactly what the error message says.
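The hashing point can be checked directly. The following is a minimal sketch, assuming the Series from the question is rebuilt by hand (the construction of s below is an assumption, not the asker's actual code). It compares a dict lookup on one string with the same lookup on the whole Series:

```python
import pandas as pd

translate = {
    'Houston Oilers': 'Tennessee Titans',
    'Tennessee Oilers': 'Tennessee Titans',
}
# Assumed reconstruction of the Series shown in the question.
s = pd.Series(
    ['Tennessee Oilers', 'Tennessee Titans', 'Washington Redskins'],
    name='Name',
)

# A plain string is hashable, so it works as a dict lookup key.
scalar_result = translate.get(s.iloc[0], s.iloc[0])

# A whole Series is not hashable, so the same lookup raises TypeError.
try:
    translate.get(s, s)
    series_lookup_failed = False
except TypeError:
    series_lookup_failed = True
```

The exact wording of the TypeError message may differ between pandas versions, which is why the sketch only checks the exception type.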

Edit

Actually, I can't reproduce your error:

In [210]:
s.apply(lambda x: translate.get(x, x))

Out[210]:
0       Tennessee Titans
1       Tennessee Titans
2    Washington Redskins
Name: Name, dtype: object

The above works fine.

Edit 1

To keep values that have no entry in the dict, call dropna on the mapped result and pass it to update:

In [219]:
s.update(s.map(translate).dropna())
s

Out[219]:
0       Tennessee Titans
1       Tennessee Titans
2    Washington Redskins
Name: Name, dtype: object
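Two other ways to get the same result without modifying s in place are sketched below. This is not part of the original answer; it assumes the same s and translate as above, rebuilt here so the snippet runs on its own:

```python
import pandas as pd

translate = {
    'Houston Oilers': 'Tennessee Titans',
    'Tennessee Oilers': 'Tennessee Titans',
}
# Assumed reconstruction of the Series shown in the question.
s = pd.Series(
    ['Tennessee Oilers', 'Tennessee Titans', 'Washington Redskins'],
    name='Name',
)

# Option 1: map, then fill the NaN gaps with the original values.
filled = s.map(translate).fillna(s)

# Option 2: replace swaps only the values that appear as dict keys
# and leaves every other value untouched.
replaced = s.replace(translate)
```

Both return a new Series and leave s unchanged, whereas update modifies s itself.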

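When you read a CSV with read_csv, it returns a DataFrame even if the file has only a single column. If you want a Series, squeeze the result. Older pandas versions accepted squeeze=True in read_csv for this; that parameter was deprecated in pandas 1.4 and removed in 2.0, so the call below uses the DataFrame.squeeze method instead:

In [223]:
import io
import pandas as pd

t="""Name
Tennessee Oilers
Tennessee Titans
Washington Redskins"""
# squeeze("columns") turns a single-column DataFrame into a Series
type(pd.read_csv(io.StringIO(t)).squeeze("columns"))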

Out[223]:
pandas.core.series.Series

Edit 2

Your error occurred because you called apply on a single-column DataFrame:

pd.DataFrame(s).apply(lambda x: translate.get(x, x))

This differs from calling apply on a Series. Series.apply passes each value to the function one at a time, and each value is hashable. DataFrame.apply passes an entire column as a Series, which cannot be hashed. It works if you do this instead:

In [227]:
pd.DataFrame(s).apply(lambda x: translate.get(x['Name'], x['Name']), axis=1)

Out[227]:
0       Tennessee Titans
1       Tennessee Titans
2    Washington Redskins
dtype: object

Passing axis=1 makes apply call the function once per row, so x['Name'] is a single hashable value.
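To see what each form of apply actually hands to the function, here is a small sketch that records the argument's type name. It assumes the same data as above, rebuilt by hand. It also shows what may be the simplest fix, which is to select the column first so that Series.apply is used:

```python
import pandas as pd

translate = {
    'Houston Oilers': 'Tennessee Titans',
    'Tennessee Oilers': 'Tennessee Titans',
}
# Assumed reconstruction of the Series shown in the question.
s = pd.Series(
    ['Tennessee Oilers', 'Tennessee Titans', 'Washington Redskins'],
    name='Name',
)
df = pd.DataFrame(s)

# Series.apply: the function receives one scalar value at a time.
series_arg_types = s.apply(lambda x: type(x).__name__)

# DataFrame.apply (default axis=0): the function receives a whole column.
frame_arg_types = df.apply(lambda x: type(x).__name__)

# DataFrame.apply with axis=1: the function receives one row, itself a
# Series, which is why the working version indexes it with x['Name'].
row_arg_types = df.apply(lambda x: type(x).__name__, axis=1)

# Selecting the column first gives a Series, so the original lambda works.
fixed = df['Name'].apply(lambda x: translate.get(x, x))
```

With axis=1 the argument is still a Series (the row), so the lookup only works because x['Name'] pulls out the single string before it reaches the dict.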
