Check if key is present in the row as a substring
Problem description
I have a dictionary:
myDict = {
    'apple': 'FULL FORM',
    'ball': 'NEW'
}
And a DataFrame:
myCol
..
new apple
netball
hello
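To reproduce the setup, the dictionary and DataFrame above could be built roughly as follows. This is a minimal sketch: it assumes pandas is imported as pd and that the DataFrame is named df, the name used in the loop further down.

```python
import pandas as pd

myDict = {
    'apple': 'FULL FORM',
    'ball': 'NEW',
}

# One string column named myCol, holding the three rows listed above.
df = pd.DataFrame({'myCol': ['new apple', 'netball', 'hello']})
print(df)
```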
I want to iterate over all rows of the column myCol and all the keys of my dictionary to see whether any key is present as a substring of the row value. If so, I want to get the key and append it to a list. For example, the key 'apple' appears as a substring in my first row, 'new apple', so I want to extract the key 'apple'.
I am trying the following, but the iteration doesn't seem to work, since I get all 'Not Found's:
myList = []
for index, row in df.iterrows():
    for key, value in myDict.items():
        if key in row['myCol'].lower():
            myList.append(key)
        else:
            print(row['myCol'].lower())
            myList.append('Not Found')
print(myList)
Recommended answer
Your solution should be changed to use break, with the else clause attached to the inner for loop rather than to the if:
myList = []
for index, row in df.iterrows():
    for key, value in myDict.items():
        if key in row['myCol'].lower():
            myList.append(key)
            break
    else:
        print(row['myCol'].lower())
        myList.append('Not Found')
print(myList)
['apple', 'ball', 'Not Found']
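The loop above relies on Python's for ... else: the else branch runs only when the loop finishes without reaching break. As a hedged illustration, the same logic can be isolated in a small pure-Python helper. The function name first_matching_key and the rows and keys lists are hypothetical, introduced here only for the example.

```python
def first_matching_key(text, keys):
    """Return the first key found in text (case-insensitive), or 'Not Found'."""
    for key in keys:
        if key in text.lower():
            result = key
            break  # stop at the first hit, so the else branch is skipped
    else:
        # Reached only when the loop finished without hitting break.
        result = 'Not Found'
    return result


rows = ['new apple', 'netball', 'hello']
keys = ['apple', 'ball']
print([first_matching_key(row, keys) for row in rows])
# ['apple', 'ball', 'Not Found']
```

Without the break, every non-matching key would add its own 'Not Found' entry, so the list would end up with more entries than there are rows.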
Or use Series.str.extract with the dictionary keys joined by | (regex OR). If no key matches, the result is a missing value, so replace it with Series.fillna and convert the Series to a list:
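import re

# Series.str.extract takes flags, not case; re.IGNORECASE makes the match case-insensitive
myList = (df['myCol'].str.extract(f'({"|".join(myDict.keys())})', flags=re.IGNORECASE, expand=False)
          .fillna('Not Found')
          .tolist())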
print(myList)
['apple', 'ball', 'Not Found']
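If the keys may contain regex metacharacters such as . or +, a slightly more defensive variant of the str.extract approach escapes each key with re.escape before joining. The sketch below is self-contained and rebuilds the sample data. The names pattern and matches are introduced only for illustration, and the DataFrame construction is an assumption about how df was created.

```python
import re

import pandas as pd

myDict = {'apple': 'FULL FORM', 'ball': 'NEW'}
df = pd.DataFrame({'myCol': ['new apple', 'netball', 'hello']})

# Escape each key so characters such as '.' or '+' are matched literally.
pattern = '(' + '|'.join(re.escape(key) for key in myDict) + ')'

matches = (df['myCol'].str.extract(pattern, flags=re.IGNORECASE, expand=False)
           .fillna('Not Found')   # rows with no match come back as missing values
           .tolist())
print(matches)
# ['apple', 'ball', 'Not Found']
```

Note that the regex returns the leftmost match in each row, whereas the loop returns the first key in dictionary order, so the two approaches can differ when a row contains more than one key.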