DataFrame String Manipulation
Question
I have a dataframe that has one column with data that looks like this:
AAH.
AAH.
AAR.UN
AAR.UN
AAR.UN
AAR.UN
AAV.
AAV.
AAV.
I think I need to use the apply method to trim the column data: if there is anything after the period, keep the value unchanged; if there is nothing after the period, return just the letters without the trailing period. I can probably do this with a lambda function and perhaps a string split, but I don't have much of an idea how to make it work.
This is roughly what I have so far:
df.apply(lambda x: string.split('.'))
I am not sure whether I can use an if statement or something similar inside the lambda function this way.
Any guidance is appreciated.
Recommended answer
Since there's only one column, you can take advantage of vectorized string operations via the .str accessor:
>>> df
0
0 AAH.
1 AAH.
2 AAR.UN
3 AAR.UN
4 AAR.UN
5 AAR.UN
6 AAV.
7 AAV.
8 AAV.
>>> df[0] = df[0].str.rstrip('.')
>>> df
0
0 AAH
1 AAH
2 AAR.UN
3 AAR.UN
4 AAR.UN
5 AAR.UN
6 AAV
7 AAV
8 AAV
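The question also asked whether a conditional can go inside a lambda. As a minimal sketch (the sample values, including the extra "AB.." row, and the variable names are made up for illustration), the snippet below compares .str.rstrip with two alternatives. Note that rstrip('.') removes every trailing period, while the anchored regex and the conditional lambda each remove at most one.

```python
import pandas as pd

# Hypothetical sample data; "AB.." is added to show the difference between approaches.
df = pd.DataFrame({0: ["AAH.", "AAR.UN", "AAV.", "AB.."]})

# Vectorized: strips ALL trailing periods.
stripped_all = df[0].str.rstrip(".")

# Vectorized regex: strips at most one period at the end of the string.
stripped_one = df[0].str.replace(r"\.$", "", regex=True)

# apply with a conditional expression inside the lambda (usually slower than .str).
via_apply = df[0].apply(lambda s: s[:-1] if s.endswith(".") else s)

print(stripped_all.tolist())
print(stripped_one.tolist())
print(via_apply.tolist())
```

For the data in the question, where each value has at most one trailing period, all three give the same result.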
Otherwise (that is, with more than one column) you'd have to do something like df.applymap(lambda x: x.rstrip(".")), or drop down to the numpy char methods.
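For the multi-column case, here is a rough sketch of both alternatives on a made-up two-column frame (the column names "a" and "b" are assumptions). It applies .str.rstrip column by column rather than calling applymap, because applymap is deprecated in pandas 2.1 and later in favor of DataFrame.map; the numpy variant assumes every cell is a string.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame of strings.
df = pd.DataFrame({"a": ["AAH.", "AAR.UN"], "b": ["AAV.", "XY."]})

# Column-wise: reuse the vectorized .str accessor on each column.
per_column = df.apply(lambda col: col.str.rstrip("."))

# NumPy: convert to a fixed-width string array, strip, and rebuild the frame.
arr = np.char.rstrip(df.to_numpy().astype(str), ".")
via_numpy = pd.DataFrame(arr, index=df.index, columns=df.columns)

print(per_column)
print(via_numpy)
```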