Edit distance similarity in SAS?
Question
I have a list of misspelled domains in the table V_tablas.arreglo (column: domainsBad):
@hotmai.es
@ghotmail.es
@hotmaol.com
@hotmai.com
@otmail.com
..., etc. (more than 10k)
I need to correct these domains to "@hotmail.com".
My question is about Oracle's EDIT_DISTANCE_SIMILARITY (fuzzy matching), which "returns an integer between 0 and 100, where 0 indicates no similarity at all and 100 indicates a perfect match". Is something like this possible in SAS?
Recommended answer
SAS has at least two functions for calculating the edit distance between two strings:
COMPGED, for generalized edit distance: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002206133.htm
COMPLEV, for Levenshtein distance: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002206137.htm
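Neither SAS function returns a 0 to 100 score directly; both return a distance, so the score has to be derived from it. The sketch below, in Python for illustration only, shows one common way to do that: compute the Levenshtein distance (what COMPLEV reports) and scale it by the length of the longer string. The function names `levenshtein` and `similarity` are hypothetical, and the normalization is an assumption; it may not match Oracle's EDIT_DISTANCE_SIMILARITY in every case.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> int:
    """Scale the distance to an integer from 0 (nothing in common) to
    100 (identical). Assumed normalization: divide by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100  # two empty strings are treated as identical
    return round(100 * (1 - levenshtein(a, b) / longest))


if __name__ == "__main__":
    target = "@hotmail.com"
    for bad in ["@hotmai.es", "@ghotmail.es", "@hotmaol.com",
                "@hotmai.com", "@otmail.com"]:
        print(bad, similarity(bad, target))
```

In SAS the same idea would be to call COMPLEV on each domainsBad value against "@hotmail.com", divide by the longer of the two lengths, and correct only the rows whose score clears a threshold you choose after inspecting a sample.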