Code golf: "Color highlighting" of repeated text
Question
(Thanks to greg0ire below for helping with key concepts)
The challenge: Build a program that finds all repeated substrings and "tags" them with color attributes (effectively highlighting them in XML).
Rules:
- This should only be done for substrings of length 2 or more.
- Substrings are just strings of consecutive characters, which may include non-alphabetic characters. Note that spaces and other punctuation do not delimit substrings.
- Character casing cannot be ignored.
- The "highlight" should be done by tagging the substring in XML. Your tagging should be of the form <TAG#>theSubstring</TAG#>, where # is a positive number unique to that substring and identical substrings.
- The priority of the algorithm is to find the longest substring, not how many times it matches within the text.
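The "longest substring first" rule can be sketched as a brute-force search. This is a minimal, ungolfed Python illustration, not part of the original challenge. The name `longest_repeat` is hypothetical, and the non-overlap requirement is an assumption (the rules do not state it, but two tagged copies cannot share characters).

```python
from typing import Optional


def longest_repeat(text: str, min_len: int = 2) -> Optional[str]:
    """Return the longest substring (at least min_len characters) that occurs
    twice without overlapping, or None when no such substring exists."""
    # Two non-overlapping copies of a substring need 2 * length characters,
    # so the search starts at half the text length and shrinks from there.
    for length in range(len(text) // 2, min_len - 1, -1):
        for start in range(len(text) - 2 * length + 1):
            candidate = text[start:start + length]
            # Look for a second copy strictly after the first one.
            # str.find compares exact characters, so case is never ignored.
            if text.find(candidate, start + length) != -1:
                return candidate
    return None


print(longest_repeat("hello!TAG!</hello.TAG.</"))  # hello
print(longest_repeat("Abab"))  # None: "Ab" and "ab" differ in case
```

Punctuation needs no special handling because candidates are plain character slices. The search is far from golfed or efficient; it only makes the rule concrete.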
Note: The order of the tagging shown in the example below is not important. It's just used by the OP for clarity.
Example input:
LoremIpsumissimplydummytextoftheprintingandtypesettingindustry.LoremIpsumhasbeentheindustry'sstandarddummytexteversincethe1500s,whenanunknownprintertookagalleyoftypeandscrambledittomakeatypespecimenbook.
A partially correct output (the OP may not have tagged every repeat perfectly in this example):
<TAG1>LoremIpsum</TAG1>issimply<TAG2>dummytext</TAG2>of<TAG5>the</TAG5><TAG3>print</TAG3>ingand<TAG4>type</TAG4>setting<TAG6>industry</TAG6>.<TAG1>LoremIpsum</TAG1>hasbeen<TAG5>the</TAG5><TAG6>industry</TAG6>'sstandard<TAG2>dummytext</TAG2>eversince<TAG5>the</TAG5>1500s,whenanunknown<TAG3>print</TAG3>ertookagalleyof<TAG4>type</TAG4>andscrambledittomakea<TAG4>type</TAG4>specimenbook.
Your code should be able to handle edge cases, such as the following:
Example input 2:
hello!TAG!</hello.TAG.</
Example output 2:
<TAG1>hello</TAG1>!<TAG2>TAG</TAG2>!<TAG3></</TAG3><TAG1>hello</TAG1>.<TAG2>TAG</TAG2>.<TAG3></</TAG3>
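A quick way to sanity-check a candidate output against the tagging rules is to strip the tags and compare with the input. The sketch below is a hypothetical helper (`check_tagging` is not from the thread) and assumes the input itself contains no literal `<TAG#>` markup. It checks only that the output is well formed, not that every repeat was found or that the longest ones were preferred.

```python
import re

# A tag pair with the same number on both sides; the body is matched lazily
# so that a body such as "</" is not confused with the closing tag.
TAG_PATTERN = re.compile(r"<TAG(\d+)>(.*?)</TAG\1>", re.DOTALL)


def check_tagging(original: str, tagged: str) -> bool:
    """Return True if `tagged` is a well-formed tagging of `original`."""
    labels = {}
    for number, substring in TAG_PATTERN.findall(tagged):
        if len(substring) < 2:
            return False  # only substrings of length 2 or more may be tagged
        if labels.setdefault(number, substring) != substring:
            return False  # one number used for two different substrings
    if len(set(labels.values())) != len(labels):
        return False  # one substring spread over two different numbers
    # Removing the markup must give back the input unchanged.
    return TAG_PATTERN.sub(lambda m: m.group(2), tagged) == original


sample_in = "hello!TAG!</hello.TAG.</"
sample_out = ("<TAG1>hello</TAG1>!<TAG2>TAG</TAG2>!<TAG3></</TAG3>"
              "<TAG1>hello</TAG1>.<TAG2>TAG</TAG2>.<TAG3></</TAG3>")
print(check_tagging(sample_in, sample_out))  # True
```

Because completeness is not checked, a partially tagged output can still pass, provided its numbering is consistent and the stripped text equals the input.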
The winner:
- Most elegant solution wins (judged by others' comments, upvotes)
- Bonus points/consideration for solutions utilizing shell scripting
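Minor clarifications:
- Input can be hard-coded or read from a file
- The criterion remains "elegance", which is admittedly a little vague, but it also encapsulates simple character/line count. Comments and/or votes from others also indicate how the SO community views the challenge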
Recommended answer
Perl 206, 189, 188, 199, 157 chars
excluding original string and last print.
# New algorithm that produces correct outputs for many cases
push@z,q/LoremIpsumissimplydummytextoftheprintingandtypesettingindustry.LoremIpsumhasbeentheindustry'sstandarddummytexteversincethe1500s,whenanunknownprintertookagalleyoftypeandscrambledittomakeatypespecimenbook/;
push@z,q/oktooktobookokm/;
push@z,q!dino1</dino2</!;
push@z,q!dino1TAG2dino3TAG!;
## loop for tests doesn't count
for $z(@z) {
print "input : $z\n";
$i=0;@r=();
#### begin count
$c=127;$l=length($_=$z);while($l>1){if(/(.{$l}).*\1/){push@r,$1;++$c;s/$1/chr($c)/eg}else{$l--}}$z=$_;map{++$i;$x=chr(127+$i);$z=~s:$x:<TAG$i>$_</TAG$i>:g}@r
#### end count 157 chars
## output doesn't count
;print "output : $z\n","="x80,"\n"
}
__END__
perl tags2.pl
input : LoremIpsumissimplydummytextoftheprintingandtypesettingindustry.LoremIpsumhasbeentheindustry'sstandarddummytexteversincethe1500s,whenanunknownprintertookagalleyoftypeandscrambledittomakeatypespecimenbook
output : <TAG1>LoremIpsum</TAG1>i<TAG11>ss</TAG11><TAG12>im</TAG12>ply<TAG2>dummytext</TAG2><TAG6>oft</TAG6><TAG13>he</TAG13><TAG4>print</TAG4><TAG7>ing</TAG7><TAG8>and</TAG8><TAG5>types</TAG5>e<TAG14>tt</TAG14><TAG7>ing</TAG7><TAG3>industry</TAG3>.<TAG1>LoremIpsum</TAG1>hasbe<TAG15>en</TAG15><TAG9>the</TAG9><TAG3>industry</TAG3>'<TAG11>ss</TAG11>t<TAG8>and</TAG8>ard<TAG2>dummytext</TAG2>ev<TAG16>er</TAG16>since<TAG9>the</TAG9>1500s,w<TAG13>he</TAG13>nanunknown<TAG4>print</TAG4><TAG16>er</TAG16>t<TAG10>ook</TAG10>agal<TAG17>le</TAG17>y<TAG6>oft</TAG6>y<TAG18>pe</TAG18><TAG8>and</TAG8>scramb<TAG17>le</TAG17>di<TAG14>tt</TAG14>omakea<TAG5>types</TAG5><TAG18>pe</TAG18>c<TAG12>im</TAG12><TAG15>en</TAG15>b<TAG10>ook</TAG10>
================================================================================
input : oktooktobookokm
output : <TAG1>okto</TAG1><TAG1>okto</TAG1>bo<TAG2>ok</TAG2><TAG2>ok</TAG2>m
================================================================================
input : dino1</dino2</
output : <TAG1>dino</TAG1>1<TAG2></</TAG2><TAG1>dino</TAG1>2<TAG2></</TAG2>
================================================================================
input : dino1TAG2dino3TAG
output : <TAG1>dino</TAG1>1<TAG2>TAG</TAG2>2<TAG1>dino</TAG1>3<TAG2>TAG</TAG2>
================================================================================
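For readers who find the golfed line hard to follow, here is an ungolfed Python sketch of the same greedy idea. It looks for a repeated substring at the current length, collapses every copy into a one-character placeholder, and only shortens the length when nothing repeats. At the end it expands the placeholders into numbered tags.

It is not a byte-for-byte port. The names `find_repeat` and `tag_repeats` are hypothetical, and the placeholders are private-use code points (assumed absent from the input) rather than `chr(128)` and up. Candidates containing a placeholder are skipped, and replacement is literal rather than regex-based. The output can therefore differ from the Perl version on some inputs.

```python
from typing import List, Optional

PLACEHOLDER_BASE = 0xE000  # private-use code points, assumed absent from the input


def find_repeat(text: str, length: int, used: int) -> Optional[str]:
    """Return a substring of exactly `length` characters that occurs twice
    without overlapping and contains no placeholder, or None."""
    for start in range(len(text) - 2 * length + 1):
        candidate = text[start:start + length]
        # Skip candidates that include an already collapsed substring.
        if any(PLACEHOLDER_BASE <= ord(ch) < PLACEHOLDER_BASE + used
               for ch in candidate):
            continue
        if text.find(candidate, start + length) != -1:
            return candidate
    return None


def tag_repeats(text: str, min_len: int = 2) -> str:
    """Tag repeated substrings, longest first, as <TAGn>...</TAGn>."""
    found: List[str] = []
    length = len(text) // 2  # longest length that can occur twice
    while length >= min_len:
        candidate = find_repeat(text, length, len(found))
        if candidate is None:
            length -= 1  # nothing repeats at this length, try shorter
            continue
        # Collapse every copy into one placeholder so it cannot be re-tagged.
        text = text.replace(candidate, chr(PLACEHOLDER_BASE + len(found)))
        found.append(candidate)
    # Expand placeholders into numbered tags, in order of discovery.
    for index, substring in enumerate(found):
        number = index + 1
        text = text.replace(chr(PLACEHOLDER_BASE + index),
                            f"<TAG{number}>{substring}</TAG{number}>")
    return text


print(tag_repeats("oktooktobookokm"))
# <TAG1>okto</TAG1><TAG1>okto</TAG1>bo<TAG2>ok</TAG2><TAG2>ok</TAG2>m
```

On the three short test inputs above, this sketch gives the same output as the Perl run. The long Lorem Ipsum input was not checked.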