In R, use gsub to remove all punctuation except periods and minus signs
Question
I am new to R, so I hope you can help me.
I want to use gsub to remove all punctuation except periods and minus signs, so that I can keep decimal points and negative signs in my data.
Example
My data frame z has the following data:
[,1] [,2]
[1,] "1" "6"
[2,] "2@" "7.235"
[3,] "3" "8"
[4,] "4" "$9"
[5,] "£5" "-10"
I want to use gsub("[[:punct:]]", "", z) to remove the punctuation.
Current output
> gsub("[[:punct:]]", "", z)
[,1] [,2]
[1,] "1" "6"
[2,] "2" "7235"
[3,] "3" "8"
[4,] "4" "9"
[5,] "5" "10"
However, I would like to keep the "-" sign and the "." sign.
Desired output
PSEUDO CODE:
> gsub("[[:punct:]]", "", z, except(".", "-") )
[,1] [,2]
[1,] "1" "6"
[2,] "2" "7.235"
[3,] "3" "8"
[4,] "4" "9"
[5,] "5" "-10"
Any ideas on how I can exempt some characters from the gsub() function?
Recommended answer
You can put back some of the matches like this:
gsub("([.-])|[[:punct:]]", "\\1", as.matrix(z))
X..1. X..2.
[1,] "1" "6"
[2,] "2" "7.235"
[3,] "3" "8"
[4,] "4" "9"
[5,] "5" "-10"
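Here I am keeping the "." and the "-": they are captured by the group ([.-]) and written back through the backreference, while any other punctuation character matches the second alternative and is replaced by nothing.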
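As a rough illustration of how the capture group behaves, here is a minimal sketch on a hypothetical character vector x (not the asker's data frame). It also shows an alternative that skips the backreference by using a negative lookahead, which assumes perl = TRUE. The sample is ASCII-only, since how non-ASCII symbols such as "£" are classified can depend on the locale and the regex engine. gsub is used so that every match in a string is processed, not just the first.

```r
# Hypothetical sample values, loosely modelled on the question's data
x <- c("2@", "7.235", "$9", "-10", "-$1,234.50")

# Capture-group approach: "." and "-" are captured and written back via \\1;
# any other punctuation matches the second alternative, so \\1 is empty
keep_group <- gsub("([.-])|[[:punct:]]", "\\1", x)

# Lookahead approach (requires perl = TRUE): match a punctuation character
# only when it is not "." or "-", and delete it
keep_lookahead <- gsub("(?![.-])[[:punct:]]", "", x, perl = TRUE)

print(keep_group)
print(keep_lookahead)
```

Both calls should give the same result on this sample; which one reads better is largely a matter of taste.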
I guess the next step is to coerce your result to a numeric matrix, so here I combine the two steps like this:
matrix(as.numeric(gsub("([.-])|[[:punct:]]", "\\1", as.matrix(z))), ncol=2)
[,1] [,2]
[1,] 1 6.000
[2,] 2 7.235
[3,] 3 8.000
[4,] 4 9.000
[5,] 5 -10.000
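If hard-coding ncol=2 is undesirable, one possible variation is to change the storage mode of the cleaned matrix in place, which keeps its dimensions. The sketch below builds a hypothetical character matrix z (ASCII-only, standing in for the asker's data) and relies on gsub returning a result with the same attributes as its input.

```r
# Hypothetical character matrix shaped like the question's data (ASCII-only)
z <- matrix(c("1", "2@", "3", "4", "5",
              "6", "7.235", "8", "$9", "-10"), ncol = 2)

# Strip punctuation except "." and "-"; the matrix dimensions are preserved
cleaned <- gsub("([.-])|[[:punct:]]", "\\1", z)

# Convert character to double without dropping the dim attribute
num <- cleaned
storage.mode(num) <- "double"

print(num)
```

Any cell that still fails to parse as a number would become NA with a warning, so it may be worth checking anyNA(num) afterwards.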