Matching As followed by the same number of Bs in egrep


Question

Suppose I want to match strings that consist of exactly n A's followed by exactly n B's, so that the two counts are equal. For example, the following strings should match:

  1. AB
  2. AABB
  3. AAABBB

On the other hand, these strings should not match:

  1. BA
  2. AAABB
  3. AABBB
  4. ABAB

To approach the problem, I thought about repetition counts, so my attempt looks like this:

egrep 'A{n}B{n}'

Of course, this does not work: the repetition count n inside the curly braces cannot be defined implicitly. An interval expression needs a literal number, not a variable shared between the two parts of the pattern.

While I know how to write a program to match it, I am testing this in the Mac terminal, so I am trying to exploit any available feature of egrep to write a one-line pattern.

Could anyone help me solve this problem? Any help will be appreciated.

Recommended answer

If you have GNU grep, you can use this recursive PCRE regex. Here (?1) recurses into capture group 1, so each level of recursion matches one A before and one B after the inner match:

grep -P '^(A(?1)?B)$' file

AB
AABB
AAABBB
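
Where grep -P is unavailable (the question mentions a Mac terminal, whose stock grep is not GNU grep), one possible workaround is to expand the pattern into a plain ERE alternation. The following is only a sketch under the assumption that n has a known upper bound; the function name build_anbn_ere and the bound 5 are hypothetical and not part of the original answer.

```shell
# Build an ERE such as 'A{1}B{1}|A{2}B{2}|A{3}B{3}' for n = 1..max.
# build_anbn_ere is a hypothetical helper name used for illustration.
build_anbn_ere() {
  max=$1
  pattern=''
  i=1
  while [ "$i" -le "$max" ]; do
    if [ -z "$pattern" ]; then
      pattern="A{$i}B{$i}"
    else
      pattern="$pattern|A{$i}B{$i}"
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$pattern"
}

# -x requires the whole line to match, so no ^...$ anchors are needed.
# Sample input is the set of strings from the question.
printf '%s\n' AB AABB AAABBB BA AAABB AABBB ABAB |
  grep -E -x "$(build_anbn_ere 5)"
# prints:
# AB
# AABB
# AAABBB
```

This only covers n up to the chosen bound, so it is an approximation of the recursive pattern rather than a replacement for it.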

Otherwise, you can use this non-regex approach with awk:

awk '(n=index($0, "B")) && length(substr($0, 1, n-1)) == length(substr($0, n))' file

AB
AABB
AAABBB

This awk command uses the index function to find the first B, then splits the record into two substrings: everything before the first B (the As) and everything from the first B onward (the Bs). It prints the record when the two substrings have the same length. Because only lengths are compared, this assumes each line consists of As followed by Bs; a line such as AABA would also be printed.
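
The same split-and-compare idea can also be sketched in plain POSIX shell with parameter expansion. This is an illustrative variant, not the answer's method: the function name is_anbn is hypothetical, and unlike a pure length comparison it additionally checks that the first part contains only As and the second part only Bs.

```shell
# Succeed only if the argument is n As followed by n Bs (n >= 1).
# is_anbn is a hypothetical helper name used for illustration.
is_anbn() {
  s=$1
  as=${s%%B*}      # everything before the first B
  bs=${s#"$as"}    # the remainder, starting at the first B
  case $as in ''|*[!A]*) return 1 ;; esac   # must be a non-empty run of As
  case $bs in ''|*[!B]*) return 1 ;; esac   # must be a non-empty run of Bs
  [ "${#as}" -eq "${#bs}" ]                 # both runs must be equally long
}

# Try the strings from the question and print the ones that qualify.
for s in AB AABB AAABBB BA AAABB AABBB ABAB; do
  if is_anbn "$s"; then
    printf '%s\n' "$s"
  fi
done
# prints:
# AB
# AABB
# AAABBB
```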
