(grep) Regex to match non-ASCII characters?

Views: 40 Posted: 2021/11/28 22:36:02 Tags: regex unicode grep ascii non-ascii-characters

Question

On Linux, I have a directory with lots of files. Some of them have non-ASCII characters, but they are all valid UTF-8. One program has a bug that prevents it working with non-ASCII filenames, and I have to find out how many are affected. I was going to do this with find and then do a grep to print the non-ASCII characters, and then do a wc -l to find the number. It doesn't have to be grep; I can use any standard Unix regular expression, like Perl, sed, AWK, etc.

However, is there a regular expression for 'any character that's not an ASCII character'?

Recommended answer

This will match a single non-ASCII character:

[^\x00-\x7F]

This is a valid PCRE (Perl-Compatible Regular Expression).
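
As a minimal sketch of the find-then-count plan from the question, the pattern can be passed to grep with PCRE enabled. This assumes GNU grep built with -P support and GNU find (for -mindepth and -printf); count_non_ascii_names is a hypothetical helper name, not something from the original answer. It prints only each entry's base name so that a non-ASCII directory name is not counted again for every file inside it, and it will miscount names that contain a newline.

```shell
#!/bin/sh
# Hypothetical helper: count entries under a directory whose own name
# contains at least one byte outside the ASCII range 0x00-0x7F.
count_non_ascii_names() {
    dir=${1:-.}
    # -printf '%f\n' (GNU find) prints the base name only, one per line.
    # LC_ALL=C makes grep compare raw bytes instead of decoded characters.
    # -P enables PCRE so the \xHH escapes are understood.
    # -c prints the number of matching lines, replacing a separate wc -l.
    find "$dir" -mindepth 1 -printf '%f\n' | LC_ALL=C grep -c -P '[^\x00-\x7F]'
}
```

Calling count_non_ascii_names /path/to/dir prints a single number. Dropping -c lists the affected names instead of counting them.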

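You can also use POSIX-style bracket classes ([:ascii:] is an extension understood by PCRE and Perl rather than one of the classes POSIX itself defines):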

[[:ascii:]] - matches a single ASCII char
[^[:ascii:]] - matches a single non-ASCII char

[^[:print:]] will probably suffice for you.
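
The [^[:print:]] form needs no PCRE support, so a rough sketch with plain grep is possible. This assumes GNU find and a grep that accepts -a, and list_non_print_names is a hypothetical helper name. In the C locale only ASCII bytes 0x20-0x7E count as printable, so this is broader than "non-ASCII": it also flags control characters such as a tab inside a name.

```shell
#!/bin/sh
# Hypothetical helper: list entries whose base name contains any byte
# that is not printable ASCII (non-ASCII bytes and control characters).
list_non_print_names() {
    dir=${1:-.}
    # LC_ALL=C restricts [:print:] to printable ASCII and matches byte-wise.
    # -a forces grep to treat its input as text even if it looks binary.
    find "$dir" -mindepth 1 -printf '%f\n' | LC_ALL=C grep -a '[^[:print:]]'
}
```

Piping the result through wc -l gives the count the question asks for.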
