How to apply a function on a Series
Question
Given a Series s:
Name
0 Tennessee Oilers
1 Tennessee Titans
2 Washington Redskins
I would like to apply a function to rename the values.
translate = {
'Houston Oilers': 'Tennessee Titans',
'Tennessee Oilers': 'Tennessee Titans'
}
s = s.apply(lambda x: translate.get(x, x))
This raises:
TypeError: ("'Series' objects are mutable, thus they cannot be hashed", u'occurred at index 0')
Had I applied this to a DataFrame column instead, it would have worked.
I thought I was doing this according to the docs. Can you correct me, please?
Recommended answer
Use map to perform the lookup:
In [204]:
translate = {
'Houston Oilers': 'Tennessee Titans',
'Tennessee Oilers': 'Tennessee Titans'
}
s.map(translate)
Out[204]:
0 Tennessee Titans
1 NaN
2 NaN
Name: Name, dtype: object
The reason s = s.apply(lambda x: translate.get(x, x)) fails is that the argument passed to the lambda here is a pandas Series rather than a single value. A Series cannot be used as a dict lookup key: dict keys must be hashable, and a Series is not, which is what the error message says.
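The hashing point can be checked directly. The following is a minimal sketch, assuming the Series from the question is rebuilt by hand; the names scalar_result and series_error are made up for illustration:

```python
import pandas as pd

translate = {
    'Houston Oilers': 'Tennessee Titans',
    'Tennessee Oilers': 'Tennessee Titans',
}
s = pd.Series(
    ['Tennessee Oilers', 'Tennessee Titans', 'Washington Redskins'],
    name='Name',
)

# A single string is hashable, so it works as a dict key.
scalar_result = translate.get(s.iloc[0], s.iloc[0])

# A whole Series is not hashable, so the same lookup raises TypeError.
try:
    translate.get(s, s)
    series_error = None
except TypeError as exc:
    series_error = exc

print(scalar_result)
print(type(series_error).__name__)
```

The exact wording of the TypeError message may differ between pandas versions.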
Edit
Actually, I can't reproduce your error:
In [210]:
s.apply(lambda x: translate.get(x, x))
Out[210]:
0 Tennessee Titans
1 Tennessee Titans
2 Washington Redskins
Name: Name, dtype: object
The above works fine.
Edit 1
To keep non-existing values you can call dropna and update:
In [219]:
s.update(s.map(translate).dropna())
s
Out[219]:
0 Tennessee Titans
1 Tennessee Titans
2 Washington Redskins
Name: Name, dtype: object
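Two other ways to keep the unmatched values, shown as a sketch rather than as part of the original answer, are to fill the NaN results of map from the original Series, or to use Series.replace with the same dict. The names via_fillna and via_replace are made up for illustration:

```python
import pandas as pd

translate = {
    'Houston Oilers': 'Tennessee Titans',
    'Tennessee Oilers': 'Tennessee Titans',
}
s = pd.Series(
    ['Tennessee Oilers', 'Tennessee Titans', 'Washington Redskins'],
    name='Name',
)

# map yields NaN for values missing from the dict;
# fillna(s) puts the original values back at those positions.
via_fillna = s.map(translate).fillna(s)

# replace only changes values that appear as keys in the dict.
via_replace = s.replace(translate)

print(via_fillna)
print(via_replace)
```

Unlike update, neither call modifies s in place; both return a new Series.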
When you read a CSV using read_csv, it returns a df even if the file has only a single column. If you want a Series, pass the parameter squeeze=True (available in older pandas; the parameter was deprecated in 1.4 and removed in 2.0):
In [223]:
t="""Name
Tennessee Oilers
Tennessee Titans
Washington Redskins"""
type(pd.read_csv(io.StringIO(t), squeeze=True))
Out[223]:
pandas.core.series.Series
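A sketch of getting the same result without the squeeze parameter, assuming a pandas version that provides DataFrame.squeeze, is to read the DataFrame first and then squeeze it or select the column by name:

```python
import io

import pandas as pd

t = """Name
Tennessee Oilers
Tennessee Titans
Washington Redskins"""

# read_csv returns a single-column DataFrame here.
df = pd.read_csv(io.StringIO(t))

# Collapse the single column into a Series.
s = df.squeeze("columns")

# Selecting the column by name gives an equivalent Series.
s_by_name = df["Name"]

print(type(df).__name__)
print(type(s).__name__)
```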
Edit 2
Your error occurred because you called apply on a single-column df:
pd.DataFrame(s).apply(lambda x: translate.get(x, x))
This differs from apply on a Series, which iterates over each value, and each value can be hashed. Here apply passes the entire Series (the whole column), which cannot be hashed. It would work if you did this:
In [227]:
pd.DataFrame(s).apply(lambda x: translate.get(x['Name'], x['Name']), axis=1)
Out[227]:
0 Tennessee Titans
1 Tennessee Titans
2 Washington Redskins
dtype: object
Passing axis=1 makes apply pass the values row by row.
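To see what each form of apply hands to the function, the sketch below records the type of the argument in each case. The variable names are made up for illustration, and the Series from the question is rebuilt by hand:

```python
import pandas as pd

s = pd.Series(
    ['Tennessee Oilers', 'Tennessee Titans', 'Washington Redskins'],
    name='Name',
)
df = pd.DataFrame(s)  # single-column DataFrame with column 'Name'

# Series.apply calls the function once per value.
series_arg_types = s.apply(lambda x: type(x).__name__)

# DataFrame.apply with the default axis=0 calls it once per column,
# passing the whole column as a Series.
column_arg_types = df.apply(lambda col: type(col).__name__)

# With axis=1 it is called once per row; each row is also a Series,
# which is why the working example indexes it with x['Name'].
row_arg_types = df.apply(lambda row: type(row).__name__, axis=1)

print(series_arg_types.tolist())
print(column_arg_types.to_dict())
print(row_arg_types.tolist())
```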