Find lowercase immediately followed by uppercase
Problem description
My text is as below:
<font size=+2 color=#F07500><b> [ba]</font></b>
<ul><li><font color =#0B610B> Word word wordWord word.<br></font></li></ul>
<ul><li><font color =#F07500> Word word word.<br></font></li></ul>
<ul><li><font color =#0B610B> Word word word wordWord.<br></font></li></ul>
<ul><li><font color =#0B610B> WordWord.<br></font></li></ul>
<br><font color =#E41B17><b>UPPERCASE LETTERS</b></font>
<ul><li><font color =#0B610B> Word word wordWord word.<br></font><br><font color =#E41B17><b>PhD and dataBase</b></font> </li></ul>
<font color =#0B610B> Word word word.<br></font></li></ul><dd><font color =#F07500> »» Word wordWord word.<br></font>
In each of the <font color =#0B610B>...</font> spans there is a lowercase letter immediately followed by an uppercase letter. For example:
<font color =#0B610B> Word word wordWord word.<br></font>
I want to correct this error by splitting the two words as follows (i.e. adding a colon and a space between them):
<font color =#0B610B> Word word word: Word word.<br></font>
So far, I have been using:
(<font color =#0B610B\b[^>]*>)(.*?</font>)
to select each instance of <font color =#0B610B>...</font>, and it works fine, finding the <font color =#0B610B>...</font> spans one by one.
But when I use:
(<font color =#0B610B\b[^>]*>)(.*?[a-z])([A-Z].*?</font>)
it does find <font color =#0B610B>...</font>, but it selects everything between the <font color =#0B610B> and </font> tags on one line regardless of other font-color tags, and so replaces other, unwanted instances.
I want it to find and replace the error only inside this specific pair of tags, <font color =#0B610B>...</font>, without grabbing everything that starts at <font color =#0B610B> and ends at some later </font>.
Are there any regular expressions to solve this problem? Many thanks in advance.

Answer
In general, regex is not a good idea for parsing HTML (if it's a once-off you might be OK). I think this might be the reason your regex is not working.
Can you give an example of a case in which your regex fails? One case I can think of is when there is no match for [a-z][A-Z] within a matching <font color=#0B610B></font> pair, but there is one in a neighbouring <font></font>. For example:
<font color=#0B610B>word word</font><font color=#000000>word wordWord</font>
In this case, the only valid match runs from
<font color=#0B610B>word word</font><font color=#000000>word word
through the rest of the string,
Word</font>
and so this is what the regex matches (since if it can match it will!).
I can think of a crude workaround, but I wouldn't recommend it unless this task is a once-off, because using regex for HTML is always prone to such errors. This regex is also pretty inefficient. Try (untested):
(<font color =#0B610B\b[^>]*>)(([^<]|<(?!/font))*?[a-z])([A-Z].*?</font>)
It says, "look for the <font color=xxxx> tag, followed by any run of characters in which each one is either an angle bracket < not followed by /font, OR anything other than <, and then followed by the [a-z][A-Z] pair".
So it tries to make sure that the match doesn't go over a </font> boundary.
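
To make the difference between the two patterns concrete, here is a minimal Python sketch. It is not from the original thread, which does not say which tool or replacement string was used. The names `NAIVE_SPAN`, `GREEN_SPAN` and `split_joined_words` are hypothetical, and so is the replacement `\1\2: \3`. Two details are my own assumptions: `\s*` around `=` so that both `color =#0B610B` and `color=#0B610B` match, and a non-capturing inner group so that the uppercase half stays in group 3 (in the answer's pattern as written, the extra parentheses would make it group 4).

```python
import re

# The question's second pattern: ".*?" is free to run past a </font> tag.
NAIVE_SPAN = re.compile(
    r'(<font color\s*=\s*#0B610B\b[^>]*>)(.*?[a-z])([A-Z].*?</font>)'
)

# The answer's workaround: each consumed character is either not "<",
# or a "<" that does not start "</font", so group 2 stays inside one span.
GREEN_SPAN = re.compile(
    r'(<font color\s*=\s*#0B610B\b[^>]*>)'  # 1: opening tag of the target colour
    r'((?:[^<]|<(?!/font))*?[a-z])'         # 2: text up to and including a lowercase letter
    r'([A-Z].*?</font>)'                    # 3: the uppercase letter through the closing tag
)


def split_joined_words(html: str) -> str:
    """Insert ': ' between a lowercase and an uppercase letter inside target spans.

    One substitution pass fixes at most one pair per span, because a match
    consumes the whole span, so the pass is repeated until nothing changes.
    """
    while True:
        html, count = GREEN_SPAN.subn(r'\1\2: \3', html)
        if count == 0:
            return html
```

On the answer's example, `<font color=#0B610B>word word</font><font color=#000000>word wordWord</font>`, `NAIVE_SPAN.sub(r'\1\2: \3', ...)` rewrites the neighbouring black span to `word word: Word`, whereas `split_joined_words` leaves the string unchanged. The caveat from the answer still applies: this is a once-off workaround, not an HTML parser.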