Grep regular expression not working as expected

Views: 81 | Published: 2020/11/20 21:13:32 | Tags: regex, grep


Problem description

I have a simple grep command that is meant to extract only the first column of a CSV file, including the comma. It goes like this:

grep -Eo '^[^,]+,' some.csv

In my head, that reads as "get me only the matching part of each line, where the line starts with at least one character that is not a comma, followed by a single comma."

So on a file, some.csv, that looks like this:

column1,column2,column3,column4
column1,column2,column3,column4
column1,column2,column3,column4

I'm expecting this output:

column1,
column1,
column1,

But I get this output instead:

column1,
column2,
column3,
column1,
column2,
column3,
column1,
column2,
column3,

Why is that? What am I missing in my grep/regex? Is my expected output incorrect?

If I remove the trailing comma from the regex, the command works as I expect.

grep -Eo '^[^,]+' some.csv

gives me:

column1
column1
column1

NOTE: I'm on macOS High Sierra with grep version: grep (BSD grep) 2.5.1-FreeBSD

Recommended answer

BSD grep is buggy in general. See the following related posts:

Why does this BSD grep result differ from GNU grep?
grep strange behaviour with single letter words
How to make BSD grep respect start-of-line anchor

The last link above covers your case: when the -o option is used, BSD grep ignores the ^ anchor for some reason. The issue is also described in a FreeBSD bug report:

I've noticed some more issues with the same version of grep. I don't know whether they're related, but I'll append them here for now.

$ printf abc | grep -o '^[a-c]'

should print just 'a', but instead gives three hits, one for each letter of the incoming text.
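The test case from the bug report can be turned into a quick check of whether the grep on your PATH is affected. This is only a sketch: the function name grep_anchor_ok and the verdict variable are made up for illustration, and the check covers only this one symptom.

```shell
# Hypothetical helper: succeeds when grep honors ^ together with -o.
# A correct grep prints only "a" for the input "abc"; the affected
# BSD grep prints every letter on its own line.
grep_anchor_ok() {
  out=$(printf 'abc\n' | grep -o '^[a-c]')
  [ "$out" = "a" ]
}

if grep_anchor_ok; then
  verdict="anchor honored"
else
  verdict="anchor ignored with -o"
fi
printf '%s\n' "$verdict"
```

If the check reports that the anchor is ignored, the workarounds that follow apply.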

As a workaround, it may be easier to just install GNU grep, which works as expected.

Or, use sed with a POSIX BRE pattern:

sed -i '' 's/^\([^,]*,\).*/\1/' file

Here, the pattern matches:

- ^ - start of a line
- \([^,]*,\) - Group 1 (later referred to with the \1 backreference from the RHS):
  - [^,]* - zero or more chars other than ,
  - , - a , char
- .* - the rest of the line
Note that -i changes the file contents in place. Use -i.bak to create a backup file if needed (in that case, drop the empty '' that follows -i).
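To preview the result before modifying the file, the same substitution can be run without -i, in which case sed writes to standard output and leaves its input untouched. A minimal sketch, assuming sample data shaped like some.csv from the question (the csv and first_col variable names are made up for illustration):

```shell
# Hypothetical sample data mirroring some.csv from the question.
csv='column1,column2,column3,column4
column1,column2,column3,column4'

# No -i here: sed reads stdin and prints the rewritten lines to stdout.
# Group 1 captures everything up to and including the first comma;
# .* consumes the rest of the line, and \1 keeps only the captured part.
first_col=$(printf '%s\n' "$csv" | sed 's/^\([^,]*,\).*/\1/')

printf '%s\n' "$first_col"
```

Because this form uses only POSIX BRE syntax and no -i, it should behave the same with BSD sed and GNU sed.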
  