Regex case-insensitive filtering of columns in pandas
Problem description
I am trying to match a column name in a CSV file using Python, but nothing matches. I want the match to be case-insensitive. I am quite new to this, but this is what I tried:
test = pd.read_csv("data.csv")
mytest= pd.DataFrame(test, columns=[re.search("[a-zA-Z1-9_]", "columnname1", re.IGNORECASE),])
print(mytest)
Any help will be highly appreciated.
Recommended answer
If I understand what you're after, you can use filter on your df to return only the columns whose names match, and make the match case-insensitive:
In [298]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'columnname1':np.arange(5), 'ColumnName1':np.arange(5), 'columnname2':0, 'column name 1':0})
df
Out[298]:
   ColumnName1  column name 1  columnname1  columnname2
0            0              0            0            0
1            1              0            1            0
2            2              0            2            0
3            3              0            3            0
4            4              0            4            0
In [299]:
import re
df.filter(regex=re.compile("columnname1", re.IGNORECASE))
Out[299]:
   ColumnName1  columnname1
0            0            0
1            1            1
2            2            2
3            3            3
4            4            4
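As a rough sketch of two equivalent ways to get the same selection: the inline flag (?i) lets you pass a plain string instead of a compiled pattern, and a list comprehension over df.columns makes the matching step explicit. The variable names by_inline_flag, matching and by_list are only illustrative, and the sample frame is the same one as above.

```python
import re

import numpy as np
import pandas as pd

# Same sample frame as in the answer above.
df = pd.DataFrame({
    "columnname1": np.arange(5),
    "ColumnName1": np.arange(5),
    "columnname2": 0,
    "column name 1": 0,
})

# Option 1: the inline (?i) flag makes a plain string pattern case-insensitive.
by_inline_flag = df.filter(regex="(?i)columnname1")

# Option 2: pick the matching column names yourself, then index with the list.
pattern = re.compile("columnname1", re.IGNORECASE)
matching = [col for col in df.columns if pattern.search(col)]
by_list = df[matching]

print(list(by_inline_flag.columns))
print(list(by_list.columns))
```

Both options use search semantics, so the pattern may match anywhere inside a column name; that is why 'column name 1' (with spaces) is not selected but any name containing 'columnname1' in any case is.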
Edit
To match only the exact name, with no other words before it (so matching 'Test' but not 'My Test'):
In [52]:
df = pd.DataFrame({'Test':np.arange(5), 'ColumnName1':np.arange(5), 'My Test':0, 'My column name 1':0})
import re
df.filter(regex=re.compile(r"^Test$", re.IGNORECASE))
Out[52]:
   Test
0     0
1     1
2     2
3     3
4     4
So the ^ anchors the match at the beginning of the str and the $ anchors it at the end, which means only a column named exactly 'Test' (in any letter case) matches. There is a handy cheat sheet.
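To see what the anchors change, here is a small sketch comparing an unanchored and an anchored pattern on the same frame. The column names 'test' and 'Testing' are made up for this illustration and are not part of the original example.

```python
import re

import pandas as pd

# Hypothetical columns chosen to show the difference the anchors make.
df = pd.DataFrame({
    "Test": [0, 1],
    "My Test": [0, 0],
    "test": [2, 3],
    "Testing": [4, 5],
})

# Without anchors the pattern may match anywhere inside the column name.
unanchored = df.filter(regex=re.compile("test", re.IGNORECASE))

# With ^ and $ the whole column name has to be 'test', ignoring case.
anchored = df.filter(regex=re.compile(r"^test$", re.IGNORECASE))

print(list(unanchored.columns))
print(list(anchored.columns))
```

The unanchored pattern keeps all four columns, while the anchored one keeps only 'Test' and 'test'.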