Removing all special characters from a string in Bash
Question
I have a lot of lowercase text. The only problem is that it contains many special characters, and I want to remove all of them, along with the digits.
The following command is not strong enough:
tr -cd '[:alpha:]\n '
For éćščž and some other characters it returns "?", but I want to remove all of them. Is there a stronger command?
I use Linux Mint with Bash 4.3.8(1)-release.
Recommended answer
You can use tr to keep only the printable characters of a string. Run the following command on your input file:
tr -cd "[:print:]\n" < file1
The -d flag deletes from the input stream every character in the set given as its argument, and -c complements that set (it inverts what is provided). Without -c, the command would delete all printable characters; with it, the command deletes the non-printable ones instead. The newline character \n is added to the set so that the line endings of the input file are preserved. Without it, the output would be one long line.
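The effect of each flag can be checked on a small input. The following is a minimal sketch, not part of the original answer: the function names and the sample text are made up for illustration, and LC_ALL=C is set on the assumption that byte-wise classification is wanted.

```shell
# Illustrative input: two lines, each containing a tab (a non-printable control character).
sample=$(printf 'ab\tc\nde\tf')

# -d alone deletes the listed set: here, every printable character.
only_nonprintable() { LC_ALL=C tr -d '[:print:]'; }

# -c inverts the set, so everything except printable characters and \n is deleted.
only_printable() { LC_ALL=C tr -cd '[:print:]\n'; }

# Without \n in the set, the newlines are deleted too and the lines run together.
only_printable_one_line() { LC_ALL=C tr -cd '[:print:]'; }

printf '%s\n' "$sample" | only_printable
```

With this input, only_printable drops the tabs and keeps the two lines, while only_printable_one_line also drops the line break between them.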
[:print:] is a POSIX character class that combines [:alnum:], [:punct:] and the space character. In the C locale, [:alnum:] is the same as [0-9A-Za-z], and [:punct:] includes the following characters:
! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
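Because [:print:] keeps digits and punctuation, it does not by itself remove the numbers the question also mentions. One possible variation, sketched here under the assumption that only ASCII letters, spaces and newlines should survive, narrows the set to [:alpha:]. The helper name letters_only is hypothetical, and LC_ALL=C is assumed so that tr classifies input byte by byte.

```shell
# Hypothetical helper: keep only ASCII letters, spaces and newlines.
# Under LC_ALL=C tr works on single bytes, so each byte of a multi-byte
# UTF-8 character such as é falls outside [:alpha:] and is deleted.
letters_only() {
    LC_ALL=C tr -cd '[:alpha:] \n'
}

# \303\251 is the UTF-8 encoding of é, written as octal escapes.
printf 'h\303\251llo, w0rld!\n' | letters_only
```

For the sample line this leaves "hllo wrld": the accented letter, the comma, the digit and the exclamation mark are all removed. Accented letters are dropped rather than converted to their base letters, which is what the question asks for.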