How to make grep [A-Z] independent of locale?

Views: 72. Published: 2018/5/28 19:19:08. Tags: grep, locale


Problem description

I was doing some everyday grepping and suddenly discovered that something seemingly trivial does not work:
$ echo T | grep [A-Z]
No match.

How come T is not within A-Z range?

I changed the regex a tiny bit:
$ echo T | grep [A-Y]
A match!

Whoa! How is T within A-Y but not within A-Z?

Apparently this is because my environment is set to Estonian locale where Y is at the end of the alphabet but Z is somewhere in the middle: ABCDEFGHIJKLMNOPQRSŠZŽTUVWÕÄÖÜXY
$ echo $LANG
et_EE.UTF-8
This all came as a bit of a shock to me. 99% of the time I grep computer code, not Estonian literature. Have I been using grep the wrong way all this time? What kinds of mistakes have I made because of this in the past?

After trying several things I arrived at the following solution:
$ echo T | LANG=C grep [A-Z]
Is this the recommended way to make grep locale-independent?

Furthermore, would it be safe to define an alias like this:
$ alias grep="LANG=C grep"
PS. I'm also wondering why character ranges like [A-Z] are locale-dependent in the first place, while \w seems to be unaffected by locale (the manual says \w is equivalent to [[:alnum:]], but I found that the latter depends on locale while \w does not).
Solution
POSIX regular expressions, which Linux and FreeBSD grep support naturally, and some others support on request, have a series of [:xxx:] patterns that honor locales. See the man page for details.
grep '[[:upper:]]'
As the []s are part of the pattern name you need the outer [] as well, regardless of how strange it looks.
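To make the nesting concrete, here is a minimal sketch (the variable names and sample input are only illustrative). The pattern is quoted so the shell does not try to expand it as a filename glob, and the class name [:upper:] sits inside an ordinary bracket expression:

```shell
# Three input lines: an uppercase letter, a lowercase letter, a digit.
# grep -c prints the number of matching lines.

# Outer [ ] is the bracket expression, inner [:upper:] is the class name.
upper_hits=$(printf '%s\n' T t 7 | grep -c '[[:upper:]]')
echo "lines with an uppercase letter: $upper_hits"

# Several classes can share one bracket expression.
mixed_hits=$(printf '%s\n' T t 7 | grep -c '[[:upper:][:digit:]]')
echo "lines with an uppercase letter or a digit: $mixed_hits"
```

Only the line T matches the first pattern, and T and 7 match the second, whatever the collation order of the current locale is.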

With the advent of these [:xxx:] classes, the classic \w, etc., remain strictly in the C locale. Thus your choice of pattern determines whether grep uses the current locale or not.

[A-Z] should follow the locale, but you may need to set LC_ALL rather than LANG, because LC_ALL takes precedence over both LANG and the individual LC_* variables, especially if the system sets LC_ALL to a different value for you.
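As a rough sketch of that approach, the locale can be pinned for a single command, or wrapped in a small shell function instead of an alias. The function name cgrep below is hypothetical, chosen only for this example; it is not a standard tool:

```shell
# Pin the locale for one command only. LC_ALL overrides LANG and LC_COLLATE,
# so [A-Z] is interpreted with plain ASCII ordering in the C locale.
printf '%s\n' T | LC_ALL=C grep '[A-Z]'

# Hypothetical helper: run grep in the C locale without changing the
# caller's environment. "command" bypasses any function or alias named grep.
cgrep() {
    LC_ALL=C command grep "$@"
}

# T matches [A-Z], t does not, so one line is counted.
c_hits=$(printf '%s\n' T t | cgrep -c '[A-Z]')
echo "C-locale matches for [A-Z]: $c_hits"
```

A separate name keeps the ordinary grep available for text where the locale matters. In the C locale non-ASCII characters are treated as plain bytes, so a class such as [[:alpha:]] will not match a letter like Š there.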
