Changing the case of a string with awk

Problem description

I'm an awk newbie, so please bear with me.

The goal is to change the case of a string such that the first letter of every word is uppercase and the remaining letters are lowercase. (To keep the example simple, "word" is defined here as strictly alphabetic characters; all others are considered separators.)

I learned a nice way to make the first letter of every word uppercase from another post on this website using the following awk command:

echo 'abcd efgh ijkl mnop' | awk '{for (i=1;i <= NF;i++) {sub(".",substr(toupper($i),1,1),$i)} print}' --> Abcd Efgh Ijkl Mnop

Making the remaining letters lowercase is easily accomplished by preceding the awk command with a tr command:

echo 'aBcD EfGh ijkl MNOP' | tr 'A-Z' 'a-z' | awk '{for (i=1;i <= NF;i++) {sub(".",substr(toupper($i),1,1),$i)} print}' --> Abcd Efgh Ijkl Mnop

However, in the interest of learning more about awk, I wanted to change the case of all but the first letter to lowercase with a similar awk construct. I used the regular expression \B[A-Za-z]+ to match all letters of a word but the first, and the awk command substr(tolower($i),2) to provide those same letters in lowercase, as follows:

echo 'ABCD EFGH IJKL MNOP' | awk '{for (i=1;i <= NF;i++) {sub("\B[A-Za-z]+",substr(tolower($i),2),$i)} print}' --> Abcd EFGH IJKL MNOP

Notice that the first word converted properly, but the remaining words are left unchanged. I would be very grateful for an explanation of why the remaining words did not convert properly and how to get them to do so.

Recommended answer

The issue is that \B is not acting as a zero-width non-word-boundary here. \B is a GNU regex extension rather than part of POSIX awk, and inside a string constant such as "\B[A-Za-z]+" awk implementations such as gawk drop the backslash from an unrecognized escape, so the pattern effectively becomes B[A-Za-z]+. That matches BCD in $1, so the first word converts, but $2 and the following fields contain no letter B, so they are not substituted and remain uppercase. A quick test with my awk points the same way: the only match is at position 2 of ABCD, which is exactly where the letter B sits:

echo 'ABCD EFGH IJKL MNOP' | awk '{for (i=1; i<=NF; ++i) { print match($i, /\B/); }}'
2   # match at the 2nd character of ABCD, i.e. the letter B
0   # no match for EFGH
0   # no match for IJKL
0   # no match for MNOP
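
One possible reading of these numbers is that this awk treats \B as a plain B rather than as a non-word-boundary. As a minimal sketch to test that reading (the helper name literal_b_positions is made up for illustration), the same loop with a literal B in the pattern reports the same positions:

```shell
# Hypothetical helper: print the position of the first literal "B"
# in each whitespace-separated field (0 means the field has no "B").
literal_b_positions() {
  awk '{ for (i = 1; i <= NF; ++i) print match($i, /B/) }'
}

# Prints 2, 0, 0, 0 on separate lines: only ABCD contains a "B".
echo 'ABCD EFGH IJKL MNOP' | literal_b_positions
```

If an awk prints the same numbers for /\B/ and /B/, it is a strong hint that the backslash is being ignored in that implementation.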

Anyway, to achieve your result (capitalize only the first character of the line), you can operate on $0 (the whole line) instead of using a for loop:

echo 'ABCD EFGH IJKL MNOP' | awk '{print toupper(substr($0,1,1)) tolower(substr($0,2)) }'

Or, if you still want to capitalize each word separately using only awk:

awk '{for (i=1; i<=NF; ++i) { $i=toupper(substr($i,1,1)) tolower(substr($i,2)); } print }'
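
As a rough sketch of how this could be packaged, the block below wraps the per-field command in a shell function and adds a second variant for the stricter definition from the question, where a word is a run of alphabetic characters and everything else is a separator. Both function names (capitalize_fields, capitalize_alpha_words) are invented for this example, and the second variant assumes plain ASCII letters.

```shell
# Field-based variant: words are whitespace-separated fields.
# Assigning to $i makes awk rebuild the line with single spaces between fields.
capitalize_fields() {
  awk '{
    for (i = 1; i <= NF; ++i)
      $i = toupper(substr($i, 1, 1)) tolower(substr($i, 2))
    print
  }'
}

# Character-walk variant (assumption: ASCII letters only).
# Any non-letter ends the current word and is copied through unchanged.
capitalize_alpha_words() {
  awk '{
    out = ""
    in_word = 0
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)
      if (c ~ /[A-Za-z]/) {
        # First letter of a word is uppercased, later letters lowercased.
        out = out (in_word ? tolower(c) : toupper(c))
        in_word = 1
      } else {
        out = out c
        in_word = 0
      }
    }
    print out
  }'
}

echo 'aBcD EfGh ijkl MNOP' | capitalize_fields             # prints: Abcd Efgh Ijkl Mnop
echo 'hELLO-wORLD foo_bar 42abc' | capitalize_alpha_words  # prints: Hello-World Foo_Bar 42Abc
```

The field-based version is shorter but normalizes runs of whitespace; the character walk preserves the original spacing and punctuation at the cost of a longer script.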
