In R, use gsub to remove all punctuation except period

Problem description

I am new to R, so I hope you can help me.

I want to use gsub to remove all punctuation except for periods and minus signs, so that I can keep decimal points and negative signs in my data.

Example

My data frame z has the following data:

     [,1] [,2]
[1,] "1"  "6"
[2,] "2@" "7.235"
[3,] "3"  "8"
[4,] "4"  "$9"
[5,] "£5" "-10"

I want to use gsub("[[:punct:]]", "", z) to remove the punctuation.

Current output

> gsub("[[:punct:]]", "", z)
     [,1] [,2]  
[1,] "1"  "6"   
[2,] "2"  "7235"
[3,] "3"  "8"   
[4,] "4"  "9"   
[5,] "5"  "10" 

I would like, however, to keep the "-" sign and the "." sign.

Desired output

 PSEUDO CODE:  
> gsub("[[:punct:]]", "", z, except(".", "-") )
         [,1] [,2]  
    [1,] "1"  "6"   
    [2,] "2"  "7.235"
    [3,] "3"  "8"   
    [4,] "4"  "9"   
    [5,] "5"  "-10" 

Any ideas how I can exempt some characters from the gsub() function?

Recommended answer

You can put back some matches like this:

gsub("([.-])|[[:punct:]]", "\\1", as.matrix(z))
     X..1. X..2.  
[1,] "1"   "6"    
[2,] "2"   "7.235"
[3,] "3"   "8"    
[4,] "4"   "9"    
[5,] "5"   "-10"  

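Here I am keeping the "." and "-": the capture group ([.-]) matches them and the backreference in the replacement puts them back, while any other punctuation matches the second alternative and is replaced with nothing.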
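As a small self-contained sketch of the same idea on a plain character vector, the capture-group pattern can be compared with a negative lookahead, which needs perl = TRUE. The vector x and its values are made up for illustration (ASCII punctuation only) and are not taken from the question.

```r
# Hypothetical sample values, not the asker's data
x <- c("2@", "7.235", "$9", "-10", "-1,234.5")

# Capture "." or "-" and put it back; any other punctuation is dropped
keep_group <- gsub("([.-])|[[:punct:]]", "\\1", x)

# Alternative: only match punctuation that is not "." or "-"
keep_lookahead <- gsub("(?![.-])[[:punct:]]", "", x, perl = TRUE)

print(keep_group)
print(keep_lookahead)
```

The last value contains several punctuation characters, which is why gsub (replace every match) rather than sub (replace only the first match) is used in this sketch.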

I guess the next step is to coerce your result to a numeric matrix, so here I combine the two steps like this:

matrix(as.numeric(gsub("([.-])|[[:punct:]]", "\\1", as.matrix(z))), ncol = 2)
   [,1]    [,2]
[1,]    1   6.000
[2,]    2   7.235
[3,]    3   8.000
[4,]    4   9.000
[5,]    5 -10.000
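When the input is already a character matrix, gsub returns a result with the same dimensions, so one possible alternative is to change the mode in place instead of rebuilding the matrix with ncol. This is only a sketch: the matrix m below is a hypothetical stand-in for as.matrix(z) and uses ASCII-only values.

```r
# Hypothetical stand-in for as.matrix(z), filled column by column
m <- matrix(c("1", "2@", "3", "4", "5",
              "6", "7.235", "8", "$9", "-10"), ncol = 2)

# gsub keeps the attributes of its input, so this is still a 5 x 2 character matrix
cleaned <- gsub("([.-])|[[:punct:]]", "\\1", m)

# Convert to numeric while keeping the dim attribute
mode(cleaned) <- "numeric"

print(cleaned)
```

This avoids hard-coding the number of columns, which may be convenient if the shape of the data is not known in advance.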
