What is the third parameter to Text::JaroWinkler::strcmp95 for?

Question

I am interested in using the Jaro-Winkler module written in Perl to compute the distance (or similarity) between two strings:

http://search.cpan.org/~scw/Text-JaroWinkler-0.1/JaroWinkler.pm

The syntax of the function is not clear to me; I could not find any clear documentation for it.

Here is the sample code:

#!/usr/bin/perl

use 5.10.0;
use Text::JaroWinkler qw( strcmp95 );
print strcmp95("it is a dog","i am a dog.",11);

What exactly does the 11 represent? I gather it is a length, but which length? Is it the number of characters I want compared? Is it required at all?

Recommended answer

See the source for an answer to your question. It contains this line:

$ying = sprintf("%*.*s", -$y_length, $y_length, $ying);

So $y_length is used to reformat the strings: shorter strings are padded and longer ones are trimmed, so that both end up with the same length. These equal-length strings are then fed into the actual comparison function. This suggests that Alex is correct, and that passing a length of max(length $ying, length $yang) is going to give the best results under most circumstances.
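To make the padding and trimming concrete, here is a minimal sketch that applies the same sprintf format outside the module. The helper name fit_to_length is hypothetical and not part of Text::JaroWinkler; it only reproduces the quoted line so its effect can be inspected. The sketch also computes the suggested max(length $ying, length $yang) for the strings from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw( max );

# Hypothetical helper (not part of Text::JaroWinkler): apply the same
# sprintf format as the quoted source line. The negative width
# left-justifies and pads with spaces; the precision truncates.
sub fit_to_length {
    my ($str, $len) = @_;
    return sprintf("%*.*s", -$len, $len, $str);
}

# The two strings from the question.
my ($ying, $yang) = ("it is a dog", "i am a dog.");

# The length suggested in the answer: the longer of the two strings.
my $len = max(length $ying, length $yang);
print "suggested third argument: $len\n";

# Brackets make the trailing padding visible.
printf "[%s]\n", fit_to_length("dog", 5);        # padded on the right
printf "[%s]\n", fit_to_length("doghouse", 5);   # truncated to 5 characters
```

Both sample strings happen to be 11 characters long, which is why 11 works in the question. For strings of unknown length, one option would be to pass the computed value, as in strcmp95($ying, $yang, $len).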

Reading the source also reveals that if you fail to supply $y_length, no default is supplied. So you'll be comparing the empty string to the empty string. Those should have a pretty short JW distance.
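The effect of a missing length can be checked in isolation. This is only a sketch of the sprintf behaviour, assuming the module passes the undefined value straight into the format quoted earlier, as described above; it does not call Text::JaroWinkler itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Deliberately left undefined, as if the third argument were omitted.
my $y_length;

my $ying;
{
    # undef is treated as 0 for both width and precision; the warnings
    # this would normally raise are silenced for the demonstration.
    no warnings 'uninitialized';
    $ying = sprintf("%*.*s", -$y_length, $y_length, "it is a dog");
}

# A precision of 0 keeps no characters, so the result is empty.
printf "length after formatting: %d\n", length $ying;
```

With a precision of zero nothing of the original string survives, so both inputs would reach the comparison as empty strings regardless of what was passed in.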
