Whole word matching with unexpected insertion in data
Question
Consider the string:
my $string = 'String need to be evaluated';
I am searching $string for the word evaluated, or for any other word.
The problem is that tags may be inserted into the string unexpectedly, e.g.
Str<data>ing need to be eval<data>ua<data>ted
In that case, how can I search for the words?
Here is the code I tried:
my $string  = 'Text to be evaluated';
my $string2 = "Te<data>xt need to be eval<data2>ua<data>ted";
# pattern to match
my $pattern = "evaluated";
my @b = split //, $pattern;
for my $i (@b) {
    # allow an optional literal <data> tag after each character
    $i = $i . "(?:<data>)?";
    print "$i#\n";
}
$pattern = join '', @b;
print "\n$pattern\n";
if ($string2 =~ /$pattern/) {
    print "$pattern found\n";
}
Can you suggest any other method or module that would make this easier? I don't know what kind of data will be inserted.
Recommended answer
Not sure if this is what you need, but how about:
my @b = split //, $pattern;
for my $i (@b) {
    # allow any run of characters after each character of the pattern
    $i = $i . ".*";
    print "$i \n";
}
$pattern = join '', @b;
This should match any string that contained the pattern before the random insertions, as long as the characters of the pattern are still present and in the correct order.
It does find evaluated in the string
esouhgvw8vwrg355#*asrgl/\u[\w]atet(45)<data>efdvd
which is about as noisy as it gets. Of course, if insertions cannot be distinguished from the original text, you will get false positives. For example, if the string used to be evaluted and becomes something like evalu<hereisyourmissinga>ted, you will get a positive match. If you know that insertions always appear inside tags and the text never does, the other user's answer is much safer.
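The trade-off described above can be tried out with a minimal sketch. The helper names loose_pattern and tag_aware_pattern are hypothetical and not part of the original answer, and the tag-aware variant assumes that every insertion is an angle-bracket tag containing no nested '>'.

```perl
use strict;
use warnings;

# Hypothetical helper: allow any characters between the letters of $word.
sub loose_pattern {
    my ($word) = @_;
    my $re = join '.*', map { quotemeta($_) } split //, $word;
    return qr/$re/;
}

# Hypothetical helper: allow only <...> tags between the letters of $word.
# Assumption: insertions are always tags of the form <...> without a nested '>'.
sub tag_aware_pattern {
    my ($word) = @_;
    my $re = join '(?:<[^>]*>)*', map { quotemeta($_) } split //, $word;
    return qr/\b$re\b/;
}

my $noisy  = 'esouhgvw8vwrg355#*asrgl/\u[\w]atet(45)<data>efdvd';
my $tagged = 'Te<data>xt need to be eval<data2>ua<data>ted';
my $typo   = 'evalu<hereisyourmissinga>ted';    # "evaluted" plus an inserted tag

for my $text ($noisy, $tagged, $typo) {
    printf "%-50s loose=%s tag_aware=%s\n", $text,
        ($text =~ loose_pattern('evaluated')     ? 'yes' : 'no'),
        ($text =~ tag_aware_pattern('evaluated') ? 'yes' : 'no');
}
```

The loose pattern matches all three strings, including the misspelled one. The tag-aware pattern matches only the second string, because it accepts nothing but tags between the letters.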
As long as you single-quote your input string, characters like [\w] and (45) should not cause trouble either. I cannot see why they would be interpolated at any point.
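The point about special characters applies to the text being searched. If the search word itself may contain regex metacharacters, it may be worth escaping each character with quotemeta before joining. The following is a small sketch of that idea; the helper name escaped_loose_pattern and the sample strings a4b5c and x(4y5)z are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: escape each character of $word, then join with '.*'.
sub escaped_loose_pattern {
    my ($word) = @_;
    return join '.*', map { quotemeta($_) } split //, $word;
}

# Metacharacters in the text being searched are plain data, not regex syntax.
my $text    = 'esouhgvw8vwrg355#*asrgl/\u[\w]atet(45)<data>efdvd';
my $word_re = escaped_loose_pattern('evaluated');
print "noisy text: ", ($text =~ /$word_re/ ? "match" : "no match"), "\n";

# Metacharacters in the search word are the risky part.
my $raw     = join '.*', split //, '(45)';      # the parentheses become a regex group
my $escaped = escaped_loose_pattern('(45)');    # the parentheses are matched literally

print "unescaped (45) vs a4b5c:   ", ('a4b5c'   =~ /$raw/     ? "match" : "no match"), "\n";
print "escaped   (45) vs a4b5c:   ", ('a4b5c'   =~ /$escaped/ ? "match" : "no match"), "\n";
print "escaped   (45) vs x(4y5)z: ", ('x(4y5)z' =~ /$escaped/ ? "match" : "no match"), "\n";
```

Without escaping, the word (45) turns into a capture group and matches a4b5c, which contains no parentheses at all. With quotemeta, only text that really contains the parentheses matches.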