Regular expressions

Problem description

Is there some way to restrict matches to items in which a substring appears only ONCE?

For instance, if I am interested in the substring "ark" and we have these items:

shark
aaaaarkdd
darkdayshark

I should only get the first two items, because the substring "ark" appears twice in the third item. Can this be done with a regex alone, or do you have to write a looping script?

I've tried the count quantifier {x,y}, but it only seems to apply to consecutive occurrences (something most tutorials don't make very clear, by the way).

I've tried using backreferences, but they don't seem to be built for negation, or at least I haven't found any examples. Besides, with the way characters are consumed, and because the substring could occur many times anywhere, I'm not sure they would work anyway.

I've tried the non-greedy quantifiers, but they don't really help. First of all, in the terminal on an iMac, the non-greedy quantifiers don't seem to work at all. Even on simple examples, I get no difference in results whether I use greedy or non-greedy quantifiers, even if I grep with -o so I can see what was actually matched. But even if they did work, I'm still not sure they would help. On the third item, for example, it doesn't matter whether the match stops at "dark" or runs all the way to the end of "darkdayshark": it's still a match, so the item is returned, which is incorrect. Non-greedy matching would only help if I were extracting what was matched. Here I want the complete item that meets the condition, which is not the same thing.

I've looked at lookahead and lookbehind, but from what I've read, they are for looking at the immediately preceding or following item. The substring could appear multiple times at any distance from the first occurrence.

This doesn't seem like it ought to be that hard, but I haven't found anything that works. Except for the ^ inside square brackets, there's not much provided for negative evaluation.

Recommended answer

This seems to work (a regular expression of the .NET variety):
(\r\n|^)((?!.*ark.*ark.*)(.*))ark((?!.*ark.*)(.*))(\r\n|$)


That might need some tweaking (for example, the newline handling, and the second negative lookahead may not be required), but it seems to work for the most part.
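
For comparison, here is a minimal Python sketch of the same idea, not the exact .NET pattern above. It assumes each item is a single line and is tested on its own, so the newline groups are dropped, and it keeps only the first negative lookahead. The helper name exactly_once_pattern is made up for illustration. A lookahead is not limited to the immediately following characters: with .* inside it, it can scan the rest of the item, which is what lets it reject a second "ark" at any distance. A plain substring count is shown next to it as a non-regex alternative.

```python
import re


def exactly_once_pattern(sub: str) -> re.Pattern:
    """Build a pattern that matches items containing `sub` exactly once.

    Hypothetical helper for illustration; assumes single-line items.
    """
    s = re.escape(sub)
    # ^            anchor at the start of the item
    # (?!.*s.*s)   fail if the substring can be found twice anywhere ahead
    # .*s          require at least one occurrence
    return re.compile(rf"^(?!.*{s}.*{s}).*{s}")


items = ["shark", "aaaaarkdd", "darkdayshark"]
pattern = exactly_once_pattern("ark")

# Regex approach: keep the whole item when the pattern matches.
regex_hits = [item for item in items if pattern.search(item)]

# Non-regex approach: count non-overlapping occurrences of the substring.
count_hits = [item for item in items if item.count("ark") == 1]

print(regex_hits)
print(count_hits)
```

Both lists contain shark and aaaaarkdd and leave out darkdayshark. Note that both approaches count non-overlapping occurrences, so a substring that can overlap itself (such as "aa" in "aaa") would need extra care.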

