gawk regex to find any record having characters other than those specified by the character class in the regex pattern

Question

I have a list of email addresses in a text file, and a pattern with character classes that specifies which characters are allowed in an email address. From that input file, I want to find only the email addresses that contain characters other than the allowed ones. I am trying to write a gawk command for this, but I cannot get it to work properly. Here is what I am trying:

gawk -F "," ' $2!~/[[:alnum:]@\.]]/ { print "has invalid chars" }' emails.csv

The problem is that the above gawk command only matches records that contain NONE of the alphanumeric characters, @ and . (dot). What I am looking for is records that contain the allowed characters but also contain disallowed ones.

For example, the above command will find

"_-()& (()%"

because it contains only characters that are not in the regex pattern, but it will not find

"abc-123@xyz,com"

because it also contains characters that belong to the character classes specified in the regex pattern.

Recommended answer

How about combining several tests: the field contains an alphanumeric character, an @, a dot, and an invalid character:

$2 ~ /[[:alnum:]]/ && $2 ~ /@/ && $2 ~ /\./ && $2 ~ /[^[:alnum:]@.]/
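As a rough sketch of how the condition above might be used in a complete command, the following script runs it against a small sample file. The sample addresses, the temporary file, and the variable names are made up for illustration, and the fallback to the system `awk` assumes that it supports POSIX character classes. The key piece is the negated bracket expression `[^[:alnum:]@.]`, which matches any single character that is not alphanumeric, `@`, or a dot. Used on its own with `~`, it flags every record with at least one disallowed character. The extra tests from the answer narrow that down to records that otherwise look like an email address.

```shell
# Hypothetical sample data; column 2 holds the email address.
csv=$(mktemp)
cat > "$csv" <<'EOF'
1,abc123@xyz.com
2,abc-123@xyz.com
3,_-()& (()%
4,john.doe@example.org
EOF

# Prefer gawk; fall back to the system awk (assumed to support [[:alnum:]]).
AWK=$(command -v gawk || command -v awk)

# Variant 1: flag any record whose second field has at least one character
# outside the allowed set (alphanumerics, "@" and ".").
only_invalid=$("$AWK" -F "," '$2 ~ /[^[:alnum:]@.]/ { print $2 ": has invalid chars" }' "$csv")

# Variant 2: the combined tests from the answer. The field must also contain
# an alphanumeric character, an "@" and a dot.
email_like=$("$AWK" -F "," '$2 ~ /[[:alnum:]]/ && $2 ~ /@/ && $2 ~ /\./ && $2 ~ /[^[:alnum:]@.]/ { print $2 ": has invalid chars" }' "$csv")

printf 'Variant 1:\n%s\n' "$only_invalid"
printf 'Variant 2:\n%s\n' "$email_like"

rm -f "$csv"
```

With this sample data, variant 1 reports both `abc-123@xyz.com` and `_-()& (()%`, while variant 2 reports only `abc-123@xyz.com`. Note that with `-F ","` an address that itself contains a comma is split across fields, so `$2` would hold only the part before the comma.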
