How can I find a string after a specific string/character using regex


Question

I am hopeless with regex (C#), so I would appreciate some help:

Basically, I need to parse a text and find the following information inside it:

Sample text:

KeywordB:TextToFind the rest is not relevant but KeywordB: Text ToFindB and then some more text.

I need to find the word(s) after a certain keyword, which may end with a ":".

[Update]

Thanks Andrew and Alan. Sorry for reopening the question, but there is quite an important thing missing in that regex. As I wrote in my last comment: is it possible to have a variable (how many words to look for, depending on the keyword) as part of the regex?

Or: I could have a different regex for each keyword (there will only be a handful). But I still don't know how to put the "words to look for" count as a constant inside the regex.

Recommended answer

Let me know if I should delete the old post, but perhaps someone wants to read it.

The way to do a "words to look for" inside the regex is like this:

regex = @"(Key1|Key2|Key3|LastName|FirstName|Etc):"

What you are doing probably isn't worth the effort in a regex, though it can probably be done the way you want (I am still not 100% clear on the requirements). It involves looking ahead to the next match and stopping at that point.
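
For readers who want to see the lookahead idea in isolation, here is a minimal sketch. It is not the code this answer recommends, only an illustration of "look ahead to the next match and stop there". The class name, the `value` group, and the choice to trim the captured text are assumptions made for the example; the keys and sample string mirror the ones used elsewhere in this answer.

```csharp
using System;
using System.Text.RegularExpressions;

class LookaheadSketch
{
    static void Main()
    {
        // Illustrative keyword alternation and sample input.
        const string keys = "Key1|Key2|Key3";
        string source = "Key1:Value1Key2: ValueAnd A: To Test Key3:   Something";

        // key:   one of the keywords, followed by ':'
        // value: as little text as possible (lazy), ending where the
        //        lookahead sees the next "keyword:" or the end of the input.
        string pattern = @"(?<key>" + keys + @"):(?<value>.*?)(?=(?:" + keys + @"):|$)";

        // Singleline lets '.' match newlines, so a value may span several lines.
        foreach (Match m in Regex.Matches(source, pattern, RegexOptions.Singleline))
        {
            // Trim() drops the spaces around the value (an assumption of this sketch).
            Console.WriteLine("Key={0}, Value={1}",
                m.Groups["key"].Value, m.Groups["value"].Value.Trim());
        }
    }
}
```

With this input the sketch should report Value1, "ValueAnd A: To Test" and Something for the three keys. Note that an unrelated "A:" inside a value does not end it, because the lookahead only stops at one of the listed keywords.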

Here is a rewrite as a regex plus ordinary code that should do the trick. It doesn't care about spaces, so if you ask for "Key2" as below, it will separate it from the value.

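// Needs: using System; using System.Collections.Generic;
//        using System.Linq; using System.Text.RegularExpressions;

// Call site (inside a method of the same class):
string[] keys = {"Key1", "Key2", "Key3"};
string source = "Key1:Value1Key2: ValueAnd A: To Test Key3:   Something";
FindKeys(keys, source);

private void FindKeys(IEnumerable<string> keywords, string source) {
    var found = new Dictionary<string, string>(10);
    // Build the alternation "Key1|Key2|Key3"; escape each keyword so any
    // regex metacharacters in it are matched literally.
    var keys = string.Join("|", keywords.Select(k => Regex.Escape(k)).ToArray());
    var matches = Regex.Matches(source, @"(?<key>" + keys + "):",
                          RegexOptions.IgnoreCase);

    foreach (Match m in matches) {
        var key = m.Groups["key"].ToString();
        // The value starts right after "key:" ...
        var start = m.Index + m.Length;
        // ... and runs up to the next keyword match, or to the end of the input.
        var nx = m.NextMatch();
        var end = (nx.Success ? nx.Index : source.Length);
        // Indexer instead of Add: a repeated keyword overwrites the earlier
        // value rather than throwing an ArgumentException.
        found[key] = source.Substring(start, end - start);
    }

    foreach (var n in found) {
        Console.WriteLine("Key={0}, Value={1}", n.Key, n.Value);
    }
}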



And the output from this is:

Key=Key1, Value=Value1
Key=Key2, Value= ValueAnd A: To Test 
Key=Key3, Value=   Something
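
The update in the question asks whether the number of words to capture can vary per keyword. One possible approach, sketched below under stated assumptions, is to build the pattern string from a variable, since a regex in C# is just a string. The helper `WordsAfter`, the dictionary of word counts, and the sample sentence are all hypothetical and only serve the illustration; "word" here means a run of non-whitespace characters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class WordCountSketch
{
    // Returns up to `wordCount` whitespace-separated words that follow `keyword`
    // (with an optional ':' after it), or null if the keyword is not found.
    static string WordsAfter(string source, string keyword, int wordCount)
    {
        if (wordCount < 1)
            throw new ArgumentOutOfRangeException("wordCount");

        // First word, then at most (wordCount - 1) further words;
        // the quantifier bound is spliced in from the variable.
        string pattern = Regex.Escape(keyword)
            + @":?\s*(?<words>\S+(?:\s+\S+){0," + (wordCount - 1) + "})";

        Match m = Regex.Match(source, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["words"].Value : null;
    }

    static void Main()
    {
        // Hypothetical keywords and word counts, chosen only for illustration.
        var wordsPerKeyword = new Dictionary<string, int>
        {
            { "FirstName", 1 },
            { "LastName", 2 },
        };
        string source = "FirstName: Anna and more text LastName: van Dijk and more text";

        foreach (var entry in wordsPerKeyword)
        {
            Console.WriteLine("{0} -> {1}", entry.Key,
                WordsAfter(source, entry.Key, entry.Value));
        }
    }
}
```

For this input, the lookup for FirstName yields "Anna" and the lookup for LastName yields "van Dijk". Because the quantifier is greedy but bounded, fewer words are returned when the text ends early.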
