pandas select from Dataframe using startswith
Question
This works (using Pandas 12 dev):
table2 = table[table['SUBDIVISION'] == 'INVERNESS']
Then I realized I needed to select on the field using "starts with", since I was missing a bunch of rows. So, following the Pandas docs as closely as I could, I tried:
criteria = table['SUBDIVISION'].map(lambda x: x.startswith('INVERNESS'))
table2 = table[criteria]
And got AttributeError: 'float' object has no attribute 'startswith'
So I tried an alternate syntax, with the same result:
table[[x.startswith('INVERNESS') for x in table['SUBDIVISION']]]
Reference: http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing Section 4: "List comprehensions and map method of Series can also be used to produce more complex criteria:"
What am I missing?
Recommended answer
You can use the Series string method str.startswith to get more consistent results:
In [11]: s = pd.Series(['a', 'ab', 'c', 11, np.nan])

In [12]: s
Out[12]:
0      a
1     ab
2      c
3     11
4    NaN
dtype: object

In [13]: s.str.startswith('a', na=False)
Out[13]:
0     True
1     True
2    False
3    False
4    False
dtype: bool
and boolean indexing will then work just fine (I prefer to use loc, but it works just the same without it):
In [14]: s.loc[s.str.startswith('a', na=False)]
Out[14]:
0     a
1    ab
dtype: object
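Applied to the question's data, the same pattern might look like the sketch below. The rows and the LOT column are invented for illustration; only the SUBDIVISION column name and the 'INVERNESS' prefix come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `table`; the rows are made up.
table = pd.DataFrame({
    "SUBDIVISION": ["INVERNESS", "INVERNESS PHASE 2", "OAKWOOD", np.nan],
    "LOT": [1, 2, 3, 4],
})

# na=False makes missing values count as non-matches, so the mask is
# purely boolean and safe to use for indexing.
mask = table["SUBDIVISION"].str.startswith("INVERNESS", na=False)
table2 = table.loc[mask]
print(table2)
```

Without na=False, the missing entry would stay missing in the mask, and indexing with a mask that contains missing values can fail.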
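It looks like at least one of the elements in your Series/column is a float, which has no startswith method, hence the AttributeError; the list comprehension should raise the same error...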
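A small sketch of why this happens, assuming the float in question is a missing value (NaN), which is the most likely culprit but not something the question confirms. The Series values here are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value; dtype=object keeps the raw
# Python objects, as in a typical mixed string column.
s = pd.Series(["INVERNESS", np.nan], dtype=object)

# The missing value is a float NaN, and floats have no startswith method.
print(type(s.iloc[1]))

error_message = None
try:
    s.map(lambda x: x.startswith("INVERNESS"))
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# If you want to keep the map approach, one option is to guard on the type.
criteria = s.map(lambda x: isinstance(x, str) and x.startswith("INVERNESS"))
print(criteria.tolist())
```

The type guard is only one way to make the map version tolerate non-string values; str.startswith with na=False does the same job with less code.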