Extracting rows with substring containing whitespace after + in pandas df


Problem description

I want to get all the rows in df whose path column (named a in the example below) contains the substring "new+ folder". The question "Select by partial string from a pandas DataFrame" and the answer by cs95 have been very helpful for substrings like "new+" or "fol", but the results are not correct when I search for "new+ folder".

>>>import pandas
>>>dft = pandas.DataFrame([['/new+folder/'], ['/new+ folder/']], columns=['a'])
>>>dft
               a
0   /new+folder/
1  /new+ folder/

Now using query:

>>>print(dft.query('a.str.contains("new+")', engine='python').head())
               a
0   /new+folder/
1  /new+ folder/



>>>print(dft.query('a.str.contains("new+ ")', engine='python').head())
Empty DataFrame
Columns: [a]
Index: []



>>>print(dft.query('a.str.contains("new+ f")', engine='python').head())
Empty DataFrame
Columns: [a]
Index: []

Testing with str.contains and boolean indexing:

>>>dft[dft['a'].str.contains('new+')]
               a
0   /new+folder/
1  /new+ folder/



>>>dft[dft['a'].str.contains('new+ ')]
a



>>>dft[dft['a'].str.contains('new+ f')]
a

How can I fix the incorrect results that appear when there is a space after a +, or, as I suspect, after other special characters?

Pandas 0.24.2, Python 3.7.3 64-bit

Recommended answer

Yes, + is a special regex character (it means "one or more of the preceding character"), so it must be escaped if you need a working solution with query:

print(dft.query(r'a.str.contains("new\+ ")', engine='python').head())
               a
1  /new+ folder/
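
Why the unescaped pattern finds nothing can be checked with Python's standard re module, which uses the same regex syntax as the patterns above. This is only a minimal sketch: the list paths and the result names are illustrative and not part of the original post, and re.escape is shown as one possible way to avoid escaping by hand.

```python
import re

# Same two values as in the example DataFrame.
paths = ['/new+folder/', '/new+ folder/']

# Unescaped: "w+" means "one or more w", so this looks for
# "ne", then w's, then a space. Neither path contains that.
unescaped = [p for p in paths if re.search('new+ ', p)]

# Escaped by hand: the plus sign is matched literally.
escaped = [p for p in paths if re.search(r'new\+ ', p)]

# re.escape builds the escaped pattern from the plain substring.
auto_escaped = [p for p in paths if re.search(re.escape('new+ '), p)]

print(unescaped)
print(escaped)
print(auto_escaped)
```

Only the second path should survive the escaped searches, which matches the query result above.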

The solution with regex=False does not work here inside query:

print(dft.query('a.str.contains("new+ ", regex=False)', engine='python').head())




AttributeError: 'dict' object has no attribute 'append'

If you filter by boolean indexing instead, both solutions work.
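
A short sketch of the two boolean indexing variants, assuming the same dft as in the question. The names by_regex and by_literal are only illustrative.

```python
import pandas as pd

dft = pd.DataFrame([['/new+folder/'], ['/new+ folder/']], columns=['a'])

# Option 1: keep regex matching, but escape the plus sign.
by_regex = dft[dft['a'].str.contains(r'new\+ ')]

# Option 2: switch regex off so the pattern is a plain substring.
by_literal = dft[dft['a'].str.contains('new+ ', regex=False)]

print(by_regex)
print(by_literal)
```

Both variants should select only the row with '/new+ folder/'. The regex=False form is probably the simpler choice when the search text is a fixed string, since nothing needs escaping.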
