Regular expression - having a braindead moment on this string

Views: 91  Posted: 2019/6/17 4:47:24  Tag: regular-expression

Question

Hello, I'm hoping somebody can kickstart my brain on regular expressions. I have read The 30 Minute Regex Tutorial, but I'm still stuck.

My requirement is to take a string like this:

public abc.def.ghi.MYDATA1 MYDATA2;

and recognise it by:

1. it starts with public
2. it contains .def.ghi, but abc may differ.

I need to extract:

1. MYDATA1
2. MYDATA2

So:

\bpublic\b finds the word public: \bpublic\b
.* will cover the next unknown characters: \bpublic\b.*
def.ghi. is the next known part (this could be wrong from here): \bpublic\b.*(def\.ghi\.)

And now I have hit a block: the tester at regex Planet tells me I have already gone wrong.

Help!

Recommended answer

To parse source code, you should:

1. Treat the whole file as "singleline", i.e. the Regex should not treat new lines as special characters.
2. Allow whitespace between language tokens (all these \s* look a bit ugly, but they are necessary to catch all entries).
3. Tokenize carefully, e.g. if you only expect unescaped identifiers, use \w+; otherwise you need to be more creative ;-)

using System;
using System.Text.RegularExpressions;

string text = "..."; // file content
// public Word . def . ghi . Type Name ;
string pattern = @"\bpublic\s+\w+\s*\.\s*def\s*\.\s*ghi\s*\.\s*(\w+)\s+(\w+)\s*;";
foreach (Match m in Regex.Matches(text, pattern, RegexOptions.Singleline))
{
    Console.WriteLine("1. {0}", m.Groups[1].Value);
    Console.WriteLine("2. {0}", m.Groups[2].Value);
}
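
To see the whitespace handling in action, here is a minimal sketch that runs the same pattern over a declaration split across several lines. The sample text is made up for illustration. Note that the pattern contains no unescaped dot, so it is the \s tokens (which match line breaks) that let the match span lines here; RegexOptions.Singleline only changes what an unescaped . matches.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sample input: one declaration spread over four lines.
string text = "public abc\n    .def.ghi\n    .MYDATA1\n    MYDATA2 ;";

// Same pattern as in the answer above.
string pattern = @"\bpublic\s+\w+\s*\.\s*def\s*\.\s*ghi\s*\.\s*(\w+)\s+(\w+)\s*;";

// Collect "Type Name" pairs for every declaration found.
var found = new List<string>();
foreach (Match m in Regex.Matches(text, pattern, RegexOptions.Singleline))
{
    found.Add(m.Groups[1].Value + " " + m.Groups[2].Value);
}

Console.WriteLine(string.Join(", ", found));
```

With this input the loop should find a single declaration and report the type and name captured by groups 1 and 2.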

If you have multiple "Word." layers (e.g. A.B.C.def.ghi...), you may extend the pattern as follows:

string pattern = @"\bpublic\s+(?:\w+\s*\.\s*)+def\s*\.\s*ghi\s*\.\s*(\w+)\s+(\w+)\s*;";
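
As a rough check of the extended pattern, the sketch below tries it against a few sample declarations. The sample strings are assumptions chosen for illustration, not taken from real source files.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Extended pattern from the answer: one or more "Word." layers before def.ghi.
string pattern = @"\bpublic\s+(?:\w+\s*\.\s*)+def\s*\.\s*ghi\s*\.\s*(\w+)\s+(\w+)\s*;";

// Hypothetical inputs: single layer, several layers, and one without ".def.".
string[] samples =
{
    "public abc.def.ghi.MYDATA1 MYDATA2;",
    "public A.B.C.def.ghi.Type Name;",
    "public abc.xyz.ghi.Type Name;",
};

var results = new List<string>();
foreach (string sample in samples)
{
    Match m = Regex.Match(sample, pattern);
    // Group 1 is the type, group 2 is the name.
    results.Add(m.Success ? m.Groups[1].Value + " " + m.Groups[2].Value : "no match");
}

foreach (string r in results)
{
    Console.WriteLine(r);
}
```

The repeated group first swallows "def." and "ghi." as well, then backtracks until the literal def can match, which is why both the single-layer and multi-layer inputs are accepted.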

Cheers
Andi

How about something like this?

public\s+(?'space'\w+\.\w+\.\w+)\.(?'type'\w+)\s+(?'name'\w+);

I prefer to use named capture groups to extract pieces.
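
A small sketch of how the named groups could be read back in C#, using the sample string from the question. One thing to be aware of: as written, this pattern accepts any three-part dotted prefix, so it does not by itself check for .def.ghi, and it expects no space before the semicolon.

```csharp
using System;
using System.Text.RegularExpressions;

// Named-group pattern; (?'name'...) is the .NET alternative spelling of (?<name>...).
string pattern = @"public\s+(?'space'\w+\.\w+\.\w+)\.(?'type'\w+)\s+(?'name'\w+);";

// Sample declaration from the question.
string input = "public abc.def.ghi.MYDATA1 MYDATA2;";

Match m = Regex.Match(input, pattern);

// Read each piece by group name instead of by position.
string space = m.Groups["space"].Value;
string type = m.Groups["type"].Value;
string name = m.Groups["name"].Value;

Console.WriteLine("space = {0}", space);
Console.WriteLine("type  = {0}", type);
Console.WriteLine("name  = {0}", name);
```

Reading groups by name keeps the extraction code stable if more groups are added to the pattern later.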
