How do I grep for all words that are less than 4 characters?
Problem description
I have a dictionary with words separated by line breaks.
Recommended answer
You can do this:
egrep -x '.{1,3}' myfile
This also skips blank lines, which are technically not words. Unfortunately, the regex above counts every character on the line, so apostrophes in contractions and hyphens in hyphenated compound words count toward the length. Hyphenated compound words are not a problem at such a low character count, but I am not sure whether you want to count the apostrophe in a short contraction (e.g., I'm). You can try a regex such as:
egrep -x '\w{1,3}' myfile
..., but this matches only word characters (letters, digits, and underscores), so it will not match contractions or hyphenated compound words at all.
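To see how the two patterns differ, here is a minimal sketch that runs them against a small sample word list. The sample words and the temporary file are made up for illustration, and `grep -E` is used as the equivalent of `egrep`. The third pattern is not part of the answer above; it is one possible variation, assuming you want to count only letters while still letting an apostrophe through.

```shell
# Hypothetical sample dictionary, one word per line (includes a blank line).
words=$(mktemp)
printf '%s\n' 'a' 'cat' "I'm" 'door' 'x-y' '' 'ab1' "don't" > "$words"

# Any 1 to 3 characters: apostrophes, hyphens, and digits all count.
any_char=$(grep -E -x '.{1,3}' "$words")
echo "$any_char"        # a, cat, I'm, x-y, ab1 (one per line)

# 1 to 3 word characters: letters, digits, underscore; no apostrophes or hyphens.
word_chars=$(grep -E -x '\w{1,3}' "$words")
echo "$word_chars"      # a, cat, ab1 (one per line)

# Assumed variation: 1 to 3 letters, each optionally followed by an apostrophe,
# so the apostrophe in "I'm" does not count toward the length.
letters_only=$(grep -E -x "([[:alpha:]]'?){1,3}" "$words")
echo "$letters_only"    # a, cat, I'm (one per line)

rm -f "$words"
```

Which pattern is appropriate depends on what your dictionary contains and on whether punctuation should count toward the word length.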