Python fuzzy search and replace


Question

I need to perform a fuzzy search for a substring within a string and replace the matched part. For example:

str_a = "Alabama"
str_b = "REPLACED"
orig_str = "Flabama is a state located in the southeastern region of the United States."
print(fuzzy_replace(str_a, str_b, orig_str)) # fuzzy_replace code should be implemented
# Output: REPLACED is a state located in the southeastern region of the United States.

The search itself is simple with the fuzzywuzzy module, but it only gives me a similarity ratio between two strings. Is there a way to find the position in the original string where the substring fuzzily matches?

Recommended answer

Try this:

from fuzzywuzzy import fuzz

def fuzzy_replace(str_a, str_b, orig_str):
    n = len(str_a.split())  # Number of words in the search phrase (window size)
    words = orig_str.split()
    for i in range(len(words) - n + 1):
        window = " ".join(words[i:i + n])
        if fuzz.ratio(str_a, window) > 75:  # Use the fuzzywuzzy library to score the window
            # Rejoin the words before the match, the replacement, and the words after the match
            return " ".join(words[:i] + [str_b] + words[i + n:])
    return orig_str  # No window scored above the threshold; return the input unchanged

str_a = "Alabama is a"
str_b = "REPLACED"
orig_str = "Flabama is a state located in the southeastern region of the United States."
print(fuzzy_replace(str_a, str_b, orig_str))

This prints:

REPLACED state located in the southeastern region of the United States.
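
The question also asks for the position of the match in the original string, which the word-window approach above does not report. The following is a minimal sketch of one way to get it, not part of the original answer. It assumes the standard library's difflib.SequenceMatcher as a stand-in for fuzz.ratio (it returns a score between 0 and 1 rather than 0 and 100), and the names find_fuzzy_span and fuzzy_replace_at and the 0.75 threshold are hypothetical choices made for illustration.

```python
import re
from difflib import SequenceMatcher

def find_fuzzy_span(needle, haystack, threshold=0.75):
    """Return (start, end) character offsets of the best-scoring word window, or None."""
    n = len(needle.split())
    if n == 0:
        return None
    # Record the character span of every word so offsets refer to the original string.
    tokens = [m.span() for m in re.finditer(r"\S+", haystack)]
    best_score, best_span = threshold, None
    for i in range(len(tokens) - n + 1):
        start, end = tokens[i][0], tokens[i + n - 1][1]
        # SequenceMatcher.ratio() is a stdlib similarity score in [0, 1] (assumed substitute for fuzz.ratio).
        score = SequenceMatcher(None, needle, haystack[start:end]).ratio()
        if score > best_score:
            best_score, best_span = score, (start, end)
    return best_span

def fuzzy_replace_at(needle, replacement, haystack, threshold=0.75):
    """Replace the best fuzzy match by slicing, leaving the rest of the string untouched."""
    span = find_fuzzy_span(needle, haystack, threshold)
    if span is None:
        return haystack
    start, end = span
    return haystack[:start] + replacement + haystack[end:]

orig_str = "Flabama is a state located in the southeastern region of the United States."
print(find_fuzzy_span("Alabama", orig_str))  # (0, 7)
print(fuzzy_replace_at("Alabama", "REPLACED", orig_str))
```

Because the replacement is done by slicing at character offsets, any spacing in the rest of the string is kept as it was, whereas splitting and rejoining on spaces collapses repeated whitespace. This sketch keeps the highest-scoring window rather than the first one above the threshold, which is a design choice and may not suit every use case.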
