How do I calculate fuzz ratio between two columns?
Problem description
I'm just getting started with pandas.
I have two columns:
A           B
Something   Something Else
Everything  Evythn
Someone     Cat
Everyone    Evr1
I want to calculate the fuzz ratio between the two columns for each row, so the output would look something like this:
A           B               Ratio
Something   Something Else  12
Everything  Evythn          14
Someone     Cat             10
Everyone    Evr1            20
How can I accomplish this? Both columns are in the same df.
Recommended answer
Use a lambda function with DataFrame.apply:
from fuzzywuzzy import fuzz

# axis=1 passes each row to the lambda, so x.A and x.B are that row's two strings
df['Ratio'] = df.apply(lambda x: fuzz.ratio(x.A, x.B), axis=1)

# Alternative with a list comprehension
# df['Ratio'] = [fuzz.ratio(a, b) for a, b in zip(df.A, df.B)]
print(df)
            A               B  Ratio
0   Something  Something Else     78
1  Everything          Evythn     75
2     Someone             Cat      0
3    Everyone            Evr1     50
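For intuition about where these scores come from: as far as I know, fuzz.ratio is the SequenceMatcher similarity 2*M/T (M = matched characters, T = total characters in both strings), scaled to 0-100 and rounded. fuzzywuzzy falls back to the standard library's difflib when python-Levenshtein is not installed. The sketch below reproduces that calculation with difflib only; simple_ratio is a hypothetical stand-in name, not part of fuzzywuzzy.

```python
from difflib import SequenceMatcher

def simple_ratio(s1, s2):
    """Hypothetical stand-in for fuzz.ratio: 2*M/T scaled to 0-100 and rounded."""
    return int(round(100 * SequenceMatcher(None, s1, s2).ratio()))

# The same four rows as in the question
pairs = [
    ("Something", "Something Else"),
    ("Everything", "Evythn"),
    ("Someone", "Cat"),
    ("Everyone", "Evr1"),
]

ratios = [simple_ratio(a, b) for a, b in pairs]
print(ratios)  # [78, 75, 0, 50]
```

For example, "Something" and "Something Else" share 9 characters out of 9 + 14 = 23, so the score is 2 * 9 / 23, which rounds to 78.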
Edit:
If the columns can contain missing values, the code above fails, so add DataFrame.dropna to skip those rows:
print(df)
            A               B
0   Something  Something Else
1  Everything             NaN
2     Someone             Cat
3    Everyone            Evr1
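from fuzzywuzzy import fuzz

# Rows with a missing A or B are dropped before scoring; on assignment the
# result is aligned on the index, so those rows get NaN in the Ratio column
df['Ratio'] = df.dropna(subset=['A', 'B']).apply(lambda x: fuzz.ratio(x.A, x.B), axis=1)
print(df)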
            A               B  Ratio
0   Something  Something Else   78.0
1  Everything             NaN    NaN
2     Someone             Cat    0.0
3    Everyone            Evr1   50.0
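Another way to handle missing values, shown here only as a sketch, is to guard the scorer itself so it returns NaN whenever either value is not a string. The names safe_ratio and simple_ratio are hypothetical; simple_ratio is a difflib-based stand-in so the example runs without fuzzywuzzy installed, and in real use fuzz.ratio could be passed as the scorer instead.

```python
from difflib import SequenceMatcher

def simple_ratio(s1, s2):
    # Stand-in scorer (assumption): mimics fuzz.ratio using only the standard library
    return int(round(100 * SequenceMatcher(None, s1, s2).ratio()))

def safe_ratio(a, b, scorer=simple_ratio):
    """Return scorer(a, b), or NaN when either value is not a string (e.g. NaN/None)."""
    if not (isinstance(a, str) and isinstance(b, str)):
        return float("nan")
    return scorer(a, b)

# Plain lists standing in for df.A and df.B, with one missing value in B
col_a = ["Something", "Everything", "Someone", "Everyone"]
col_b = ["Something Else", float("nan"), "Cat", "Evr1"]

ratios = [safe_ratio(a, b) for a, b in zip(col_a, col_b)]
print(ratios)  # [78, nan, 0, 50]
```

With a DataFrame this would be used in the list-comprehension form from the answer, for example df['Ratio'] = [safe_ratio(a, b, fuzz.ratio) for a, b in zip(df.A, df.B)].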