How can I find repeated words in a string in C#?
Hi,
I have a sequence of characters immediately followed by the same sequence, and I want the program to remove the repetition. For example, given the string "abcdabcd" I need "abcd".
How can I do that?
Thanks in advance.
Regex is your friend!
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

namespace ConsoleApplication16
{
    class Program
    {
        static readonly string[] Tests = { "abcdabcd", "xabcdabcd", "abcdabc", "xaaabcdabcd" };

        // (.+) captures one or more characters; \1 requires the same text to follow immediately.
        static readonly Regex FindDup = new Regex(@"(.+)\1", RegexOptions.IgnoreCase);

        static void Main(string[] args)
        {
            foreach (string t in Tests)
            {
                MatchCollection allMatches = FindDup.Matches(t);
                // Trace output goes to the registered trace listeners (by default, the debugger's output).
                Trace.WriteLine(string.Format("{0}: {1}", t, allMatches.Count));
            }
        }
    }
}
Results:
abcdabcd: 1
xabcdabcd: 1
abcdabc: 0
xaaabcdabcd: 2
This will also identify what the matching strings are, in the allMatches collection.
EDIT: Add response to Collin's comments:
Each of the Match values in allMatches contains the information about the doubled text in its Groups property. Groups[0] contains the whole matched string (both copies), and Groups[1] contains the single copy.
If you change the loop above to:
foreach (string t in Tests)
{
    MatchCollection allMatches = FindDup.Matches(t);
    Trace.WriteLine(string.Format("{0}: {1}", t, allMatches.Count));
    foreach (Match item in allMatches)
    {
        Trace.WriteLine(string.Format(@"  ""{0}"" is doubled", item.Groups[1]));
    }
}
You'll see that.
If the objective is to remove the duplications, then the Replace() method of the Regex will do the job:
string t2 = FindDup.Replace(t, string.Empty);
Trace.WriteLine(string.Format(@"Final: ""{0}""", t2));
Note that replacing with string.Empty removes both copies of the doubled text. Of course, a different string can be substituted instead of string.Empty; the replacement pattern "$1" keeps a single copy.
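Since the question asks for "abcdabcd" to become "abcd", here is a minimal sketch of the replacement variant that keeps one copy of each doubled run. It reuses the same pattern as FindDup; the class name DuplicateCollapser and the method names CollapseOnce and CollapseAll are made up for this illustration and are not part of the answer above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper (names are assumptions for illustration only).
public static class DuplicateCollapser
{
    // Same pattern as FindDup: some text immediately followed by the same text.
    private static readonly Regex FindDup = new Regex(@"(.+)\1", RegexOptions.IgnoreCase);

    // One pass: each doubled run is replaced by capture group 1 (a single copy).
    public static string CollapseOnce(string input)
    {
        return FindDup.Replace(input, "$1");
    }

    // Repeats the pass until the string stops changing, so a sequence
    // repeated more than twice is also reduced to a single copy.
    public static string CollapseAll(string input)
    {
        string previous;
        do
        {
            previous = input;
            input = FindDup.Replace(input, "$1");
        } while (input != previous);
        return input;
    }
}
```

With this sketch, DuplicateCollapser.CollapseOnce("abcdabcd") gives "abcd", and a string with no doubled run, such as "abcdabc", comes back unchanged. Whether to collapse once or repeatedly depends on what should happen to text repeated three or more times, which the question does not specify.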
If the string consists of only one word repeated twice, then you can try the following:
string str = "abcdabcd";
string temp = str.Substring(0, str.Length / 2);
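The Substring approach trusts that the input really is one word written twice. A small sketch that checks this assumption before trimming might look like the following; the class name HalfDoubling and the method name RemoveDoubling are hypothetical and only serve the example.

```csharp
// Hypothetical helper (names are assumptions for illustration only).
public static class HalfDoubling
{
    // Returns the first half when the string is exactly one sequence
    // written twice; otherwise returns the input unchanged.
    public static string RemoveDoubling(string s)
    {
        // An odd-length string cannot be two equal halves.
        if (s.Length % 2 != 0)
        {
            return s;
        }

        int half = s.Length / 2;
        string first = s.Substring(0, half);
        string second = s.Substring(half);

        // Ordinal, case-sensitive comparison of the two halves.
        return first == second ? first : s;
    }
}
```

For example, HalfDoubling.RemoveDoubling("abcdabcd") returns "abcd", while "abcdabce" is returned as is because its halves differ.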
I am making some assumptions about what you are looking for (your question isn't entirely clear), but I think this does the trick.
You can use a System.Text.StringBuilder to build up a prefix one character at a time, and then split the original string using the built-up string. Once all of the parsed items are empty strings, the built-up string is your repeated string and you break out of the loop.
// Requires "using System;" and "using System.Linq;" (for StringSplitOptions and Any()).
string val = "abcabcabc";
System.Text.StringBuilder sb = new System.Text.StringBuilder();
string result = string.Empty;
foreach (var c in val)
{
    sb.Append(c);
    var parsed = val.Split(new string[] { sb.ToString() }, StringSplitOptions.None);
    var stringFound = !parsed.Any(s => s != string.Empty); // All items are empty
    if (stringFound)
    {
        result = sb.ToString();
        break;
    }
}
After this runs, the repeated string will be in result. Note that the algorithm breaks out after the first occurrence, because the criteria are also met when sb contains all the characters of the original string. This algorithm will find the sequence no matter how many times it is repeated, but it assumes the string contains only the repeated sequence.
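As an alternative sketch of the same idea (finding the shortest sequence that, repeated, makes up the whole string), the prefix lengths can be tested directly by comparing characters instead of splitting. This is not the method from the answer above; the class name RepeatUnit and the method name ShortestUnit are hypothetical, and like the original it assumes the string contains nothing but the repeated sequence.

```csharp
// Hypothetical helper (names are assumptions for illustration only).
public static class RepeatUnit
{
    // Returns the shortest prefix whose repetition yields the whole string,
    // or the string itself when no shorter repeating prefix exists.
    public static string ShortestUnit(string s)
    {
        for (int len = 1; len <= s.Length / 2; len++)
        {
            // A repeating unit must divide the total length evenly.
            if (s.Length % len != 0)
            {
                continue;
            }

            // Every character must equal the one a full unit earlier.
            bool repeats = true;
            for (int i = len; i < s.Length; i++)
            {
                if (s[i] != s[i - len])
                {
                    repeats = false;
                    break;
                }
            }

            if (repeats)
            {
                return s.Substring(0, len);
            }
        }

        return s;
    }
}
```

RepeatUnit.ShortestUnit("abcabcabc") returns "abc", and a string with no repetition, such as "abcab", is returned whole, which matches the behavior of the StringBuilder loop on the same inputs.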