How do I calculate fuzz ratio between two columns?


Question

I am just getting started with pandas.

I have two columns:
A                     B
Something             Something Else
Everything            Evythn
Someone               Cat
Everyone              Evr1

I want to calculate fuzz ratio for each row between the two columns so the output would be something like this:

A                     B                  Ratio
Something             Something Else     12
Everything            Evythn             14
Someone               Cat                10
Everyone              Evr1               20

How would I be able to accomplish this? Both the columns are in the same df.

Recommended answer

Use a lambda function with DataFrame.apply and axis=1, so that the function is called once per row:

from fuzzywuzzy import fuzz

df['Ratio'] = df.apply(lambda x: fuzz.ratio(x.A, x.B), axis=1)
# Alternative: a list comprehension over the two columns
# df['Ratio'] = [fuzz.ratio(a, b) for a, b in zip(df.A, df.B)]
print (df)
            A               B  Ratio
0   Something  Something Else     78
1  Everything          Evythn     75
2     Someone             Cat      0
3    Everyone            Evr1     50
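
To make the scores above less opaque, here is a minimal sketch of how such a ratio can be computed with only the standard library. It assumes that fuzz.ratio behaves like difflib's SequenceMatcher ratio (2 * matched characters / total characters) scaled to 0-100, which is my understanding of fuzzywuzzy's pure-Python fallback and not something the answer states. The helper name simple_ratio and the plain lists standing in for the two columns are hypothetical.

```python
from difflib import SequenceMatcher


def simple_ratio(a, b):
    """Hypothetical stand-in for fuzz.ratio: 2*M/T scaled to 0-100 and rounded."""
    # M is the number of matched characters, T the combined length of both strings.
    return round(100 * SequenceMatcher(None, a, b).ratio())


# Plain lists standing in for df.A and df.B from the question.
col_a = ["Something", "Everything", "Someone", "Everyone"]
col_b = ["Something Else", "Evythn", "Cat", "Evr1"]

# Same row-wise pairing as the list-comprehension alternative in the answer.
ratios = [simple_ratio(a, b) for a, b in zip(col_a, col_b)]
print(ratios)
```

For the first row, the 9 characters of "Something" match and the two strings have 9 + 14 = 23 characters in total, so the score is 2 * 9 / 23, which is about 0.78. The comparison is case-sensitive, which is why "Someone" and "Cat" share nothing.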

Edit:

If the columns may contain missing values, the code above fails, so drop those rows first with DataFrame.dropna. The rows that were dropped get NaN in the Ratio column:

print (df)
            A               B
0   Something  Something Else
1  Everything             NaN
2     Someone             Cat
3    Everyone            Evr1

from fuzzywuzzy import fuzz

df['Ratio'] = df.dropna(subset=['A', 'B']).apply(lambda x: fuzz.ratio(x.A, x.B), axis=1)
print (df)
            A               B  Ratio
0   Something  Something Else   78.0
1  Everything             NaN    NaN
2     Someone             Cat    0.0
3    Everyone            Evr1   50.0
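
The Ratio column is now float (78.0 rather than 78) because it has to hold NaN for the skipped row. If the list-comprehension variant is preferred, dropna cannot be chained in the same way, so one option is to guard inside the scoring function instead. The sketch below is only an illustration under assumptions: safe_ratio is a hypothetical helper, the scoring uses difflib from the standard library as a stand-in for fuzz.ratio, and plain lists stand in for the DataFrame columns.

```python
import math
from difflib import SequenceMatcher


def safe_ratio(a, b):
    """Hypothetical helper: 0-100 similarity, or NaN if either value is not a string."""
    # NaN is a float, so a type check catches missing values and None alike.
    if not isinstance(a, str) or not isinstance(b, str):
        return math.nan
    return round(100 * SequenceMatcher(None, a, b).ratio())


# Plain lists standing in for df.A and df.B, with one missing value in B.
col_a = ["Something", "Everything", "Someone", "Everyone"]
col_b = ["Something Else", math.nan, "Cat", "Evr1"]

ratios = [safe_ratio(a, b) for a, b in zip(col_a, col_b)]
print(ratios)
```

With a real DataFrame the same guard could wrap fuzz.ratio, and the result could be assigned with df['Ratio'] = [safe_ratio(a, b) for a, b in zip(df.A, df.B)].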
