How do I extract a string of text in C#


Question

I am having trouble splitting a string in C#. I have this string:

start and dffdfdddddddfd<m>one</m><m>two</m><m>three</m><m>four</m>dbfjnbjvbnvbnjvbnv and end



I want to extract the text between <m> and </m>, and I need three outputs:

Output 1: one two three four

Output 2: four

Output 3: one

How would I do this? Please give me some sample code.
Thanks and regards.

Recommended answer

Use a Regex:
(?<=\<m\>)[^\<]+(?=\</m\>)


using System.Text.RegularExpressions;

/// <summary>
///  Regular expression built for C# on: Sun, Oct 9, 2011, 07:37:07 AM
///  Using Expresso Version: 3.0.3634, http://www.ultrapico.com
///
///  A description of the regular expression:
///
///  Match a prefix but exclude it from the capture. [\<m\>]
///      \<m\>
///          Literal <
///          m
///          Literal >
///  Any character that is NOT in this class: [\<], one or more repetitions
///  Match a suffix but exclude it from the capture. [\</m\>]
///      \</m\>
///          Literal <
///          /m
///          Literal >
/// </summary>
public static Regex regex = new Regex("(?<=\\<m\\>)[^\\<]+(?=\\</m\\>)",
    RegexOptions.CultureInvariant | RegexOptions.Compiled);

// Capture all matches in InputText (the string to search)
MatchCollection ms = regex.Matches(InputText);
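The snippet above stops at the MatchCollection, so here is a minimal sketch of how the three requested outputs could be read from it. The class name TagTextDemo and the helper Extract are made up for illustration; the pattern is the one from the answer, and the sample string is the one from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagTextDemo
{
    // Same pattern as in the answer: text preceded by <m> and followed by </m>.
    private static readonly Regex TagText =
        new Regex(@"(?<=\<m\>)[^\<]+(?=\</m\>)", RegexOptions.CultureInvariant);

    // Hypothetical helper: returns every piece of text found
    // between <m> and </m>, in the order it appears.
    public static string[] Extract(string input)
    {
        return TagText.Matches(input)
                      .Cast<Match>()
                      .Select(m => m.Value)
                      .ToArray();
    }

    public static void Main()
    {
        string input = "start and dffdfdddddddfd<m>one</m><m>two</m>" +
                       "<m>three</m><m>four</m>dbfjnbjvbnvbnjvbnv and end";

        string[] items = Extract(input);

        Console.WriteLine(string.Join(" ", items)); // output 1: one two three four
        Console.WriteLine(items.Last());            // output 2: four
        Console.WriteLine(items.First());           // output 3: one
    }
}
```

Note that First() and Last() throw if the input contains no <m>...</m> pair, so real code would probably check items.Length first.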




Get a copy of Expresso: it's free, and it examines and generates regular expressions. I really wish I'd written it!


I am voting +5 for OriginalGriff's elegant solution above ... and you should too :) ... but ... I was curious to know the pain of implementing this using 'Split', so here goes:
// This code uses LINQ: be sure to add these using directives
// at the top of your Form's source file:
// using System;
// using System.Collections.Generic;
// using System.Linq;
//
// Assume a WinForm with:
// textBox1, textBox2, button1
// textBox1 (Multiline = false) holds the string to be split
// textBox2 (Multiline = true) will hold the result of splitting
// button1 triggers the parsing

// variables to hold the results of indexing the split string
private List<string> method1;
private string method2;
private string method3;

// the opening and closing tags, used as whole-string delimiters;
// splitting on the individual characters '<', 'm', '>' would leave
// stray "/" entries and break up any text that contains an 'm'
private readonly string[] delimiters = { "<m>", "</m>" };

private void button1_Click(object sender, EventArgs e)
{
    var result = textBox1.Text
        .Split(delimiters, StringSplitOptions.None)
        .Where(s => !String.IsNullOrWhiteSpace(s))
        .ToList();

    // ignore the first and last entries in the result
    // (the text before the first <m> and after the last </m>)
    result = result.GetRange(1, result.Count - 2);

    method1 = result;          // output 1: one, two, three, four
    method2 = result.Last();   // output 2: four
    method3 = result.First();  // output 3: one

    // examine the result ...
    textBox2.Lines = result.ToArray();
}

Discussion:

1. It would be interesting to compare the performance of Split versus Regex for this scenario.

2. To really compare the 'generalized usefulness' of this technique with Regex would require skills beyond my knowledge of Regex.
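For point 1, a rough timing harness could look something like the sketch below. It is only an illustration: the class SplitVsRegexTiming, its method names, and the iteration count are all made up here, and the Split variant assumes whole-tag delimiters and untagged text on both ends of the input. Stopwatch timings of this kind vary a lot between machines and runtimes, so no result is claimed.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitVsRegexTiming
{
    private static readonly Regex TagText =
        new Regex(@"(?<=\<m\>)[^\<]+(?=\</m\>)", RegexOptions.CultureInvariant);

    private static readonly string[] Delimiters = { "<m>", "</m>" };

    // Regex approach: collect every match value.
    public static string[] WithRegex(string input)
    {
        return TagText.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();
    }

    // Split approach. Assumption: the input has untagged text before the
    // first <m> and after the last </m>, so the first and last parts are dropped.
    public static string[] WithSplit(string input)
    {
        string[] parts = input.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
        return parts.Skip(1).Take(Math.Max(0, parts.Length - 2)).ToArray();
    }

    // Runs one extractor repeatedly and returns the elapsed milliseconds.
    private static long TimeIt(Func<string, string[]> extract, string input, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            extract(input);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string input = "start and dffdfdddddddfd<m>one</m><m>two</m>" +
                       "<m>three</m><m>four</m>dbfjnbjvbnvbnjvbnv and end";
        const int iterations = 100000; // arbitrary, chosen for illustration

        // Warm up both paths once so JIT compilation is not part of the timing.
        WithRegex(input);
        WithSplit(input);

        Console.WriteLine("Regex: {0} ms", TimeIt(WithRegex, input, iterations));
        Console.WriteLine("Split: {0} ms", TimeIt(WithSplit, input, iterations));
    }
}
```

Both methods return the same four items for the sample string, which makes the comparison fair for this input. The Split variant would give different results on input that starts or ends with a tag.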

