Case-insensitive search and replace with sed
Question
I'm trying to use sed to extract text from a log file. I can do a search-and-replace without too much trouble:
sed 's/foo/bar/' mylog.txt
However, I want to make the search case-insensitive. From what I've googled, it looks like appending `i` to the end of the command should work:
sed 's/foo/bar/i' mylog.txt
However, this gives me an error message:
sed: 1: "s/foo/bar/i": bad flag in substitute command: 'i'
What's going wrong here, and how do I fix it?
Recommended answer
Update: Starting with macOS Big Sur (11.0), `sed` now does support the `I` flag for case-insensitive matching, so the command in the question should now work. (BSD `sed` doesn't report its version, but you can go by the date at the bottom of the `man` page, which should be March 27, 2017 or more recent.) A simple example:
# BSD sed on macOS Big Sur and above (and GNU sed, the default on Linux)
$ sed 's/ö/@/I' <<<'FÖO'
F@O # `I` matched the uppercase Ö correctly against its lowercase counterpart
Note: `I` (uppercase) is the documented form of the flag, but `i` works as well.
Similarly, starting with macOS Big Sur (11.0), `awk` is now locale-aware (`awk --version` should report 20200816 or more recent):
# BSD awk on macOS Big Sur and above (and GNU awk, the default on Linux)
$ awk '{ print tolower($0) }' <<<'FÖO'
föo # non-ASCII character Ö was properly lowercased
The following applies to macOS up to Catalina (10.15):
To be clear: On macOS, `sed` (which is the BSD implementation) does NOT support case-insensitive matching; hard to believe, but true. The formerly accepted answer, which itself shows a GNU `sed` command, gained that status because of the `perl`-based solution mentioned in the comments.
To make that Perl solution work with non-ASCII characters as well, via UTF-8, use something like:
perl -C -Mutf8 -pe 's/öœ/oo/i' <<< "FÖŒ" # -> "Foo"
`-C` turns on UTF-8 support for streams and files, assuming the current locale is UTF-8-based.
`-Mutf8` tells Perl to interpret the source code as UTF-8 (in this case, the string passed to `-pe`); this is the shorter equivalent of the more verbose `-e 'use utf8;'`.
Thanks, Mark Reed.
(Note that using `awk` is not an option either, as `awk` on macOS (i.e., BWK awk and BSD awk) appears to be completely unaware of locales: its `tolower()` and `toupper()` functions ignore non-ASCII characters, and `sub()` / `gsub()` don't have case-insensitivity flags to begin with.)
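If Perl is not available either, one workaround that is not part of the original answer is to spell out both cases of each letter with bracket expressions, which any POSIX `sed` should accept. A minimal sketch, assuming an ASCII-only pattern (the function name `replace_foo_posix` is hypothetical):

```shell
# Hypothetical helper (name invented for this example): emulates a
# case-insensitive match by listing the upper- and lowercase form of
# every letter. Uses only basic POSIX sed syntax, so it does not depend
# on the I flag. It does not help with non-ASCII letters on a sed that
# is not locale-aware.
replace_foo_posix() {
  sed 's/[Ff][Oo][Oo]/bar/'
}

printf 'Foo one\nFOO two\nfoo three\n' | replace_foo_posix
```

This gets unwieldy for long patterns, which is why the Perl route is generally the more practical one.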
A note on the relationship of `sed` and `awk` to the POSIX standard:
BSD `sed` and `awk` limit their functionality mostly to what the POSIX `sed` and POSIX `awk` specs mandate, whereas their GNU counterparts implement many more extensions.
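Because the `I` flag is an extension rather than something POSIX mandates, a script that has to run on both older macOS versions and Linux may want to probe for it first. The following is only a sketch under that assumption; both function names are invented for illustration, and the fallback assumes `perl` is installed:

```shell
# Hypothetical probe: succeeds only if the local sed accepts the I flag
# and actually matches case-insensitively. On a sed without the flag,
# the command fails, its error message is discarded, and the comparison
# is false.
sed_supports_I() {
  [ "$(printf 'FOO\n' | sed 's/foo/bar/I' 2>/dev/null)" = "bar" ]
}

# Hypothetical wrapper: case-insensitive foo -> bar on standard input,
# using sed where possible and falling back to Perl otherwise.
replace_foo_any() {
  if sed_supports_I; then
    sed 's/foo/bar/I'
  else
    perl -pe 's/foo/bar/i'
  fi
}

printf 'FOO\n' | replace_foo_any
```

Probing the behavior directly avoids having to parse version strings, which BSD `sed` does not report anyway.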