Matching As followed by the same number of Bs in egrep

Question

Suppose I want to match a pattern that contains exactly the same number of A and B characters, such that exactly n A's are followed by exactly n B's. For example, the following strings should match:

  1. AB
  2. AABB
  3. AAABBB

On the other hand, these strings should not match:

  1. BA
  2. AAAB
  3. AABBB
  4. ABAB

To approach the problem, I thought about repetition counts, so my attempt looks like this:

egrep 'A{n}B{n}'

Of course, the repetition count n inside the curly braces cannot be defined implicitly like this.

While I know how to write a program to match it, I am testing this in the Mac terminal, so I am trying to exploit any feature of egrep that would let me write a one-line pattern.

Could anyone help me solve this problem? Any help would be appreciated.

Recommended answer

If you have GNU grep, you can use this recursive PCRE regex:

grep -P '^(A(?1)?B)$' file

AB
AABB
AAABBB
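
The pattern reads as: one A, optionally a nested copy of capture group 1 (that is what `(?1)` refers to), then one B. As a rough illustration of that nesting, and not a description of how grep evaluates it internally, the sketch below strips one leading A and one trailing B per round using plain awk, so it does not need `grep -P`. The helper name `is_anbn` is hypothetical and not part of the original answer.

```shell
# Hypothetical helper: succeeds if the argument has the form A^n B^n (n >= 1).
# Mirrors the recursive pattern: an A, optionally a nested match, then a B.
is_anbn() {
  printf '%s\n' "$1" | awk '
    {
      s = $0
      # Strip one leading A and one trailing B per round while both are present.
      while (length(s) >= 2 && substr(s, 1, 1) == "A" && substr(s, length(s), 1) == "B")
        s = substr(s, 2, length(s) - 2)
      # A full match leaves nothing behind; an empty input line is rejected.
      exit !(s == "" && length($0) > 0)
    }'
}

# Try the strings from the question.
for s in AB AABB AAABBB BA AAAB AABBB ABAB; do
  if is_anbn "$s"; then echo "$s: match"; else echo "$s: no match"; fi
done
```

Only the first three strings should be reported as matches, the same set the `grep -P` command prints.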

Alternatively, you can use this non-regex approach with awk:

awk '(n=index($0, "B")) && length(substr($0, 1, n-1)) == length(substr($0, n))' file

AB
AABB
AAABBB

This awk command uses the index function to find the first B, splits the record into two substrings (everything before the first B, and everything from the first B onward), and prints the record if the two substrings have the same length.
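
Because the one-liner above compares lengths only, it relies on every line already being a run of As followed by a run of Bs; a line such as AABA would also pass, since both halves have length 2. If the input is not guaranteed to have that shape, a possible stricter variant adds a shape check before comparing the counts. This is a sketch under that assumption, and `anbn_filter` is a hypothetical name, not part of the original answer.

```shell
# Sketch: shape check plus length comparison; reads lines from stdin.
# anbn_filter is a hypothetical name introduced for illustration.
anbn_filter() {
  # /^A+B+$/ : only As followed by only Bs
  # n        : position of the first B, so there are n-1 As
  # length($0) - n + 1 : number of characters from the first B to the end
  awk '/^A+B+$/ && (n = index($0, "B")) && (n - 1) == (length($0) - n + 1)'
}

printf '%s\n' AB AABB AAABBB BA AAAB AABBB ABAB AABA | anbn_filter
```

With this input, only AB, AABB and AAABBB are printed.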
