How do I find the text that matches a pattern?


Question

NOTE: This is not a duplicate of any existing question. It is intended to show why such an extremely common and seemingly simple question is unanswerable, and to provide guidance on how people posting such questions can modify them to make them answerable, so that we don't have to keep providing the same guidance in comments almost every day and can just refer to this instead.

Given the following input file:

foo
o.b
bar

I need to output all lines that match the pattern o.b, so my expected output is:

o.b

and I have tried awk '"o.b"' file but that outputs all lines (this part was added only to avoid complaints that no attempted solution was posted in the question).

Answer

While on the surface this seems to be a simple question with an obvious answer, it actually is not, because of 2 factors:

  1. The word pattern is ambiguous - we don't know if the OP wants to do a regexp match or a string match, and
  2. The word match is ambiguous - we don't know if the OP wants to do a full match on each line (consider line and record synonymous for simplicity of this answer), or a full match on specific substrings (e.g. "words" or fields) on a line, or a partial match on part of each line, or something else.

Any of these would produce the expected output from the posted sample input:

  1. awk '/o.b/' file
  2. awk '/^o.b$/' file
  3. awk 'index($0,"o.b")' file
  4. awk '$0 == "o.b"' file

but we don't know which is correct, if any. All we know is that they produce the expected output from the specific sample input in the question.
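As a quick check of that claim, the sketch below recreates the sample input from the question and runs the four commands against it. The temporary file and the shell variable names are assumptions made for illustration only; the awk programs themselves are the ones listed above.

```shell
# Recreate the sample input from the question in a temporary file
# (the temp file and variable names are illustrative assumptions).
sample=$(mktemp)
printf '%s\n' foo o.b bar > "$sample"

# Run each of the four candidate commands and capture its output.
partial_regexp=$(awk '/o.b/' "$sample")
full_regexp=$(awk '/^o.b$/' "$sample")
partial_string=$(awk 'index($0,"o.b")' "$sample")
full_string=$(awk '$0 == "o.b"' "$sample")

# All four print the single line o.b for this input.
printf '%s\n' "$partial_regexp" "$full_regexp" "$partial_string" "$full_string"
rm -f "$sample"
```

Identical output on one small sample says nothing about which command matches the OP's actual requirement.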

Consider how each would behave if the OP's real data contained additional strings like this, rather than just the minimal example shown in the question:

$ cat file
foo
foo.bar
foobar
o.b
orb
bar

Then here are 4 possible answers that will all produce the expected output given the sample input from the question, but will produce very different output given just slightly different input. We have no way of knowing from the question as asked which output would be correct for the OP's needs:

  1. Partial regexp match:

$ awk '/o.b/' file
foo.bar
foobar
o.b
orb

  2. Full-line regexp match:

$ awk '/^o.b$/' file
o.b
orb

  3. Partial string match:

$ awk 'index($0,"o.b")' file
foo.bar
o.b

  4. Full-line string match:

$ awk '$0 == "o.b"' file
o.b

There are various other possibilities that might also be the correct answer when you consider full-word, full-field, and other types of matching against specific substrings on each line.
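To illustrate just one of those other possibilities, here is a minimal sketch of full-field matching, once as a string comparison and once as a regexp comparison. The multi-field input is hypothetical (it is not from the question), fields are assumed to be separated by whitespace (awk's default), and the variable names are made up for the example.

```shell
# Hypothetical multi-field input, not taken from the question.
fields=$(mktemp)
printf '%s\n' 'foo o.b bar' 'foo orb bar' 'foo.bar baz' 'xo.by qux' > "$fields"

# Full-field string match: print the line if any field is exactly the string "o.b".
field_string=$(awk '{ for (i = 1; i <= NF; i++) if ($i == "o.b") { print; next } }' "$fields")

# Full-field regexp match: print the line if any whole field matches the regexp o.b.
field_regexp=$(awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^o.b$/) { print; next } }' "$fields")

printf 'full-field string match:\n%s\n' "$field_string"
printf 'full-field regexp match:\n%s\n' "$field_regexp"
rm -f "$fields"
```

With this input the string version selects only "foo o.b bar", while the regexp version also selects "foo orb bar", because the unescaped . matches any character.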

So whenever you ask a question about matching some text against other text:

  1. Never use the word pattern; instead use string or regexp, whichever it is you mean, and
  2. Always state whether you want the match to be on a full line, part of a line, a full substring (e.g. word or field), or part of a substring of a line.

Otherwise you may end up with a solution to a problem that you don't have, which could be inefficient and/or simply wrong. Even if it produces the expected output for the specific input set you run it against now, it may well come back to bite you when run against some other input set later.

Also see https://unix.stackexchange.com/a/631532/133219 for more examples of this issue.
