Tokenize a string with delimiters that are strings

Problem description


If I have a string like

"This is a string that will be split by this and that"

I would like to get the split results as

  1. "is a string that will be split by"
  2. "and that"
  3. "this is a string"
  4. "will be split by this and"

1 and 2 are the result of splitting by "this"; 3 and 4 are the result of splitting by "that".

My solution is to use a map of string to string and store the results in another map of the same type (string to string). However, for longer and more complex text, the results stored in the map become repeated. For example, in 1 and 3 above the substring "is a string" appears twice, and this redundancy produces incorrect statistical results.

Could you please suggest a neater solution for tokenizing a long string when the delimiters are themselves different long strings?

Solution

string myString = "This is a string that will be split by this and that";
// Upper-case with the invariant culture so matching ignores case (the pieces come back upper-cased).
string foo = myString.ToUpperInvariant();

// Split on one delimiter at a time.
string[] byThis = foo.Split(new string[] { "THIS" }, StringSplitOptions.RemoveEmptyEntries);
string[] byThat = foo.Split(new string[] { "THAT" }, StringSplitOptions.RemoveEmptyEntries);

// Split on both delimiters in a single pass.
string[] all = foo.Split(new string[] { "THAT", "THIS" }, StringSplitOptions.RemoveEmptyEntries);
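Upper-casing the input makes the match case-insensitive, but the pieces lose their original casing. If that matters, one possible workaround is to search with IndexOf and StringComparison.OrdinalIgnoreCase instead. The sketch below is only an illustration: the class name CaseInsensitiveSplit and its Split method are hypothetical helpers, not part of the original answer or of .NET.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveSplit
{
    // Hypothetical helper: splits input on a single delimiter, ignoring case,
    // and returns the pieces with their original casing. Empty pieces are dropped.
    public static List<string> Split(string input, string delimiter)
    {
        if (string.IsNullOrEmpty(delimiter))
            throw new ArgumentException("Delimiter must not be empty.", nameof(delimiter));

        var pieces = new List<string>();
        int start = 0;
        while (true)
        {
            // Find the next occurrence of the delimiter at or after 'start'.
            int hit = input.IndexOf(delimiter, start, StringComparison.OrdinalIgnoreCase);
            int end = hit < 0 ? input.Length : hit;
            if (end > start)
                pieces.Add(input.Substring(start, end - start));
            if (hit < 0)
                break;
            start = hit + delimiter.Length;
        }
        return pieces;
    }
}
```

Called as CaseInsensitiveSplit.Split(myString, "this"), this yields the two pieces " is a string that will be split by " and " and that", which correspond to results 1 and 2 in the question (surrounding whitespace is left untrimmed).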

Alternatively, you can use Regex.Split, which can ignore case without changing the text itself:

string[] all = System.Text.RegularExpressions.Regex.Split(myString, "your pattern", System.Text.RegularExpressions.RegexOptions.IgnoreCase);
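As a rough illustration of what "your pattern" could look like, the sketch below builds an alternation from the delimiter strings and passes it to Regex.Split. The names DelimiterSplitter and SplitOnAny are hypothetical and chosen only for this example. It assumes that delimiters should be matched literally (hence Regex.Escape) and that surrounding whitespace and empty pieces are unwanted. Note that a delimiter is also matched inside longer words; wrapping the pattern in \b anchors would be one way to restrict it to whole words.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DelimiterSplitter
{
    // Hypothetical helper: splits input on any of the given delimiter strings,
    // ignoring case, and returns trimmed, non-empty pieces in original casing.
    public static string[] SplitOnAny(string input, params string[] delimiters)
    {
        if (delimiters.Length == 0)
            return new[] { input };

        // Escape each delimiter so regex metacharacters match literally.
        // Longer delimiters go first so they win over shorter ones sharing a prefix.
        string pattern = string.Join("|",
            delimiters.OrderByDescending(d => d.Length)
                      .Select(d => Regex.Escape(d)));

        return Regex.Split(input, pattern, RegexOptions.IgnoreCase)
                    .Select(piece => piece.Trim())
                    .Where(piece => piece.Length > 0)
                    .ToArray();
    }
}
```

For the string in the question, DelimiterSplitter.SplitOnAny(myString, "this", "that") returns "is a string", "will be split by" and "and". Because every delimiter is handled in one pass, the pieces do not overlap, which avoids the repeated substrings described in the question.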
