Keep elements with pattern in pandas series without converting them to list


Question

I have the following DataFrame:

import pandas as pd

df = pd.DataFrame(["Air type:1, Space kind:2, water", "something, Space blu:3, somethingelse"], columns=['A'])

I want to create a new column that contains, for each row, all the elements that have a ":" in them. For example, for the first row I want to return "type:1, kind:2" and for the second row I want "blu:3". I managed this with a list comprehension in the following way:

df['new'] = [[y for y in x  if ":" in y] for x in df['A'].str.split(",")]

But my issue is that the new column contains list elements:

    A                                                       new
0   Air type:1, Space kind:2, water                         [Air type:1, Space kind:2]
1   something at the start:4, Space blu:3, somethingelse    [something at the start:4, Space blu:3]

I have not used Python a lot, so I am not 100% sure whether I am missing a more pandas-specific way to do this. If there is one, I am more than happy to learn about it and use it. If this is a correct approach, how can I convert the elements back into strings in order to run regexes on them? I tried "How to concatenate items in a list to a single string?", but it is not working as I would like it to.

Recommended answer

You can use pd.Series.str.findall here.

df['new'] = df['A'].str.findall(r'\w+:\w+')

                                 A               new
0            type:1, kind:2, water  [type:1, kind:2]
1  something, blu:3, somethingelse           [blu:3]
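
To make the behaviour above concrete, here is a minimal, self-contained sketch that uses the DataFrame from the question. The variable name `matches` is only illustrative. It shows that `str.findall` returns one Python list per row, and that `str.join` can turn each list back into a plain string if a list column is not wanted.

```python
import pandas as pd

df = pd.DataFrame(
    ["Air type:1, Space kind:2, water", "something, Space blu:3, somethingelse"],
    columns=["A"],
)

# Each match is a run of word characters, a colon, and another run of word characters.
# The raw string (r"...") keeps Python from treating \w as a string escape.
matches = df["A"].str.findall(r"\w+:\w+")

# findall yields one list per row; .str.join collapses each list into a single string.
df["new"] = matches.str.join(", ")

print(df["new"].tolist())
```

Note that `\w+` only captures the single word directly in front of the colon, so "Air type:1" is reduced to "type:1" here.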

Edit:

When an element consists of multiple words, try:

df['new'] = df['A'].str.findall(r'[^\s,][^:,]+:[^:,]+').str.join(', ')

                                      A                       new
0        Air type:1, Space kind:2, water  Air type:1, Space kind:2
1  something, Space blu:3, somethingelse               Space blu:3
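
The multi-word pattern reads, piece by piece: one character that is neither whitespace nor a comma, then one or more characters that are neither a colon nor a comma, then a colon, then again one or more characters that are neither a colon nor a comma. The sketch below applies it to the rows that appear in the question and, for comparison, also keeps the question's split-based approach and joins the result back into a string. The names `pattern`, `keep_tagged`, `new_regex` and `new_split` are hypothetical and chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [
        "Air type:1, Space kind:2, water",
        "something, Space blu:3, somethingelse",
        "something at the start:4, Space blu:3, somethingelse",
    ],
    columns=["A"],
)

# [^\s,]   first character: not whitespace, not a comma
# [^:,]+   rest of the label: anything except a colon or comma
# :        the separator we are looking for
# [^:,]+   the value: anything except a colon or comma
pattern = r"[^\s,][^:,]+:[^:,]+"

# Regex route: find every "label:value" element and join the matches per row.
df["new_regex"] = df["A"].str.findall(pattern).str.join(", ")


def keep_tagged(text: str) -> str:
    """Split on commas, keep the pieces that contain a colon, join them again."""
    pieces = (piece.strip() for piece in text.split(","))
    return ", ".join(piece for piece in pieces if ":" in piece)


# Split route: the question's list comprehension, followed by a join.
df["new_split"] = df["A"].apply(keep_tagged)

print(df[["new_regex", "new_split"]])
```

On these rows both routes produce the same strings. They are not equivalent in general: as written, the regex needs at least two characters before the colon, so a very short element such as "a:1" would not be matched, while the split-based version would keep it.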
