How can I use RegEx to find a term if the term is broken by a newline in the searched text?


Problem description

Say I'm searching for "applicant" and, as has happened to me before, I receive a text file like this:

We have considered the applica
nt's experience and qualification, 
and wish to grant him an interview.

Now I still want my RegEx to return a match for the whole word "applicant" at index 23, and I want to tell the user that the partial match starts on line m and column n. How can I achieve this?

A rather tedious solution I have in mind is to insert a special marker character before each match, each time incrementing the indices of the remaining matches. Then I would repeat the search line by line and look for the marker followed by the first character of the search term.

Recommended answer

Insert [\t\r\n]* (which matches zero or more characters from the defined set) between each pair of adjacent characters in the search word. Then take the part of the text from index 0 up to match.Index, split it with a regex that matches line breaks (@"\r?\n|\r"), and count the resulting pieces to get the line number:

using System;
using System.Text.RegularExpressions;

// Sample text: "applicant" is broken by a tab and a CRLF line break.
var text = "Morelines\n\nWe have considered the applica\t\r\nnt's experience and qualification, \nand wish to grant him an interview.";
Console.WriteLine(string.Format("Our text:\n{0}\n---------", text));
var search = "applicant";
// Allow any run of tabs and line-break characters between adjacent characters of the search term.
var pattern = string.Join(@"[\t\r\n]*", search.ToCharArray());
Console.WriteLine(string.Format("Our pattern: {0}\n----------", pattern));
var result = Regex.Match(text, pattern);
if (result.Success) {
    Console.WriteLine(string.Format("Match: {0} at {1}\n----------", result.Value, result.Index));
    // Split the text before the match on line breaks; the number of pieces is the 1-based line number.
    var lineNo = Regex.Split(text.Substring(0, result.Index), @"\r?\n|\r").Length;
    Console.WriteLine(string.Format("Line No: {0}", lineNo));
}

See the online C# demo.

Output:

Our text:
Morelines

We have considered the applica  
nt's experience and qualification, 
and wish to grant him an interview.
---------
Our pattern: a[\t\r\n]*p[\t\r\n]*p[\t\r\n]*l[\t\r\n]*i[\t\r\n]*c[\t\r\n]*a[\t\r\n]*n[\t\r\n]*t
----------
Match: applica  
nt at 34
----------
Line No: 3
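
The question also asks for the column at which the match starts, which the code above does not compute. The following is a minimal sketch of one way to get it, not part of the original answer: the class name BrokenTermSearch and the method Find are hypothetical, line and column numbers are assumed to be 1-based, and each character of the search term is passed through Regex.Escape so that terms containing regex metacharacters are matched literally.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BrokenTermSearch
{
    // Hypothetical helper (an illustration, not from the original answer).
    // Returns the 0-based index plus the 1-based line and column of the
    // first match, or null when the term is not found.
    public static (int Index, int Line, int Column)? Find(string text, string search)
    {
        // Escape every character so that metacharacters such as '.' or '+'
        // are treated literally, then allow tabs and line breaks in between.
        var pattern = string.Join(@"[\t\r\n]*",
            search.Select(c => Regex.Escape(c.ToString())));

        var match = Regex.Match(text, pattern);
        if (!match.Success) return null;

        // Split everything before the match on line breaks.
        var linesBefore = Regex.Split(text.Substring(0, match.Index), @"\r?\n|\r");

        // The number of pieces is the line number; the length of the last
        // piece is the count of characters preceding the match on that line.
        int line = linesBefore.Length;
        int column = linesBefore[linesBefore.Length - 1].Length + 1;
        return (match.Index, line, column);
    }
}
```

For the sample text used above, BrokenTermSearch.Find(text, "applicant") yields index 34, line 3, column 24, since the 23 characters of "We have considered the " precede the match on its line.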
