Highlight a list of words using a regular expression in C#


Problem description

I have some site content that contains abbreviations. I have a list of recognised abbreviations for the site, along with their explanations. I want to create a regular expression which will allow me to replace all of the recognised abbreviations found in the content with some markup.

For example:

content:

This is just a little test of the memb to see if it gets picked up. 
Deb of course should also be caught here.

abbreviations:

memb = Member; deb = Debut; 

result:

This is just a little test of the [a title="Member"]memb[/a] to see if it gets picked up. 
[a title="Debut"]Deb[/a] of course should also be caught here.

(This is just example markup for simplicity).

Thanks.

EDIT:

CraigD's answer is nearly there, but there are issues. I only want to match whole words. I also want to keep the correct capitalisation of each word replaced, so that deb is still deb, and Deb is still Deb as per the original text. For example, this input:

This is just a little test of the memb. 
And another memb, but not amemba. 
Deb of course should also be caught here.deb!

Solution

First, you need to pass every abbreviation through Regex.Escape() so that any regex metacharacters it contains are matched literally.
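To illustrate why the escaping step matters, here is a minimal sketch. The abbreviation "a.k.a" and the names EscapeDemo and WholeWord are hypothetical, chosen only for this example; Regex.Escape and Regex.IsMatch are the standard .NET calls.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: builds a whole-word pattern for a literal abbreviation.
    public static string WholeWord(string abbr)
    {
        return @"\b" + Regex.Escape(abbr) + @"\b";
    }

    public static void Main()
    {
        string abbr = "a.k.a"; // hypothetical abbreviation containing regex metacharacters

        // Unescaped, each '.' matches any character, so unrelated text matches too.
        Console.WriteLine(Regex.IsMatch("aXkYa", @"\b" + abbr + @"\b")); // True

        // Escaped, only the literal text matches.
        Console.WriteLine(Regex.IsMatch("aXkYa", WholeWord(abbr)));      // False
        Console.WriteLine(Regex.IsMatch("a.k.a", WholeWord(abbr)));      // True
    }
}
```

Note that \b only helps when the abbreviation starts and ends with a word character; an abbreviation ending in punctuation would likely need a different boundary check.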

Then you can look for them in the string and replace them one at a time with the markup you have in mind:

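// requires: using System.Text.RegularExpressions;
// input is the site content to process
string abbr       = "memb";
string word       = "Member";
// verbatim string (@) so that \b is a regex word boundary, not a backspace character
string pattern    = String.Format(@"\b{0}\b", Regex.Escape(abbr));
string substitute = String.Format("[a title=\"{0}\"]{1}[/a]", word, abbr);
string output     = Regex.Replace(input, pattern, substitute);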
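The question's edit also asks that the original capitalisation be kept. One way this might be handled when looping over a whole list is sketched below; it is an illustration under stated assumptions, not the answerer's code. It assumes matching should be case-insensitive (RegexOptions.IgnoreCase) and uses the $0 substitution to re-insert the text exactly as it was matched. The names IterativeHighlighter and Highlight are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class IterativeHighlighter
{
    // Hypothetical helper: runs one whole-word, case-insensitive replacement per abbreviation.
    public static string Highlight(string input, IDictionary<string, string> abbreviations)
    {
        string output = input;
        foreach (KeyValuePair<string, string> entry in abbreviations)
        {
            string pattern = @"\b" + Regex.Escape(entry.Key) + @"\b";

            // Double any '$' in the title so it is not read as a substitution token.
            string title = entry.Value.Replace("$", "$$");

            // $0 stands for the matched text, so "Deb" stays "Deb" and "deb" stays "deb".
            string replacement = "[a title=\"" + title + "\"]$0[/a]";

            output = Regex.Replace(output, pattern, replacement, RegexOptions.IgnoreCase);
        }
        return output;
    }

    public static void Main()
    {
        var abbreviations = new Dictionary<string, string>
        {
            { "memb", "Member" },
            { "deb", "Debut" }
        };
        string input = "And another memb, but not amemba.\nDeb of course should also be caught here.deb!";
        Console.WriteLine(Highlight(input, abbreviations));
    }
}
```

One caveat of this loop: each pass scans the output of the previous one, so an abbreviation could in principle match inside a title inserted earlier. That does not happen with the sample data, but it is a reason to prefer a single combined pattern.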

EDIT: I asked if a simple String.Replace() wouldn't be enough, but I can see why regex is desirable: by building a pattern with word boundary anchors (\b), you can restrict replacements to whole words only.

You can go as far as building a single pattern from all your escaped input strings, like this:

\b(?:{abbr_1}|{abbr_2}|{abbr_3}|{abbr_n})\b

and then using a match evaluator to find the right replacement. This way you can avoid iterating the input string more than once.
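The single-pattern idea could be sketched roughly as follows. This is a sketch under assumptions rather than a definitive implementation: it assumes case-insensitive matching is wanted, that no two abbreviations differ only by case, and that the example markup from the question is the desired output. The names AbbreviationHighlighter and Highlight are hypothetical; the Regex.Replace overload taking a MatchEvaluator is the standard .NET one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class AbbreviationHighlighter
{
    // Hypothetical helper: one pass over the input, using one alternation of all abbreviations.
    public static string Highlight(string input, IDictionary<string, string> abbreviations)
    {
        // An empty alternation would match at every word boundary, so bail out early.
        if (abbreviations.Count == 0) return input;

        // Case-insensitive lookup so that a match on "Deb" finds the entry for "deb".
        var lookup = new Dictionary<string, string>(abbreviations, StringComparer.OrdinalIgnoreCase);

        // \b(?:abbr_1|abbr_2|...)\b, with every abbreviation escaped.
        string pattern = @"\b(?:" + string.Join("|", lookup.Keys.Select(k => Regex.Escape(k))) + @")\b";

        return Regex.Replace(
            input,
            pattern,
            m =>
            {
                string title;
                // Leave the text untouched if the lookup unexpectedly fails.
                if (!lookup.TryGetValue(m.Value, out title)) return m.Value;
                // m.Value is the text as it appears in the input, so capitalisation is preserved.
                return "[a title=\"" + title + "\"]" + m.Value + "[/a]";
            },
            RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        var abbreviations = new Dictionary<string, string>
        {
            { "memb", "Member" },
            { "deb", "Debut" }
        };
        string input = "This is just a little test of the memb.\n"
                     + "And another memb, but not amemba.\n"
                     + "Deb of course should also be caught here.deb!";
        Console.WriteLine(Highlight(input, abbreviations));
    }
}
```

Because the evaluator returns a plain string rather than a replacement pattern, a '$' in a title needs no special handling here, and inserted markup is never rescanned.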
