Remove all special characters from a string in R?


Question

How do I remove all special characters from a given string in R, replacing each special character with a space?

The special characters to remove are: ~!@#$%^&*(){}_+:"<>?,./;'[]-=

The regex class [:punct:] will do about half of the job.

Question 2: How do I remove those crazy characters: â í ü Â á ?

Answer 2: Try using [^a-zA-Z0-9] instead of [^[:alnum:]] as the regular expression.
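
As a rough sketch of Answer 2, the snippet below swaps in the ASCII-only class. The sample string is made up for illustration, and perl = TRUE is an assumption added here so that the a-z ranges are read as plain code point ranges rather than depending on the locale.

```r
# Hypothetical sample containing accented letters (written as \u escapes
# so the source file encoding does not matter).
x <- "na\u00efve caf\u00e9 \u00e2\u00ed\u00fc 42"

# Anything that is not an ASCII letter or digit becomes a space.
# perl = TRUE is an assumption made for this sketch, not part of the original answer.
ascii_only <- gsub("[^a-zA-Z0-9]", " ", x, perl = TRUE)

# Check that only ASCII letters, digits and spaces remain.
!grepl("[^a-zA-Z0-9 ]", ascii_only, perl = TRUE)
```

Unlike [^[:alnum:]], this treats accented letters as unwanted characters, which is what Question 2 asks for.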

Recommended answer

You need to use regular expressions to identify the unwanted characters. For the most easily readable code, you want str_replace_all from the stringr package, though gsub from base R works just as well.

The exact regular expression depends upon what you are trying to do. You could just remove those specific characters that you gave in the question, but it's much easier to remove all punctuation characters.

library(stringr)

x <- "a1~!@#$%^&*(){}_+:\"<>?,./;'[]-=" # or whatever
str_replace_all(x, "[[:punct:]]", " ")

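(The base R equivalent is gsub("[[:punct:]]", " ", x). The two engines can differ: stringr uses ICU, where [[:punct:]] may skip symbols such as $, +, <, =, >, ^, | and ~, so check the result on your own input.)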
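
To make the two options from the paragraph above concrete, here is a minimal base R sketch. It compares the [[:punct:]] class with an explicit class that lists only the characters from the question. The explicit class and the use of perl = TRUE (so that [, ] and - can be backslash-escaped inside the class) are assumptions made for this illustration.

```r
# Sample string from the answer above.
x <- "a1~!@#$%^&*(){}_+:\"<>?,./;'[]-="

# Base R: every POSIX punctuation character becomes one space.
all_punct <- gsub("[[:punct:]]", " ", x)

# Explicit class listing only the characters given in the question.
# perl = TRUE lets [, ] and - be backslash-escaped inside the class.
listed_only <- gsub("[~!@#$%^&*(){}_+:\"<>?,./;'\\[\\]\\-=]", " ", x, perl = TRUE)

nchar(all_punct) == nchar(x)       # one-for-one replacement keeps the length
identical(all_punct, listed_only)  # both approaches agree on this input
```

The explicit class leaves other punctuation, such as | or a backtick, untouched, which may or may not be what you want.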

Alternatively, replace all non-alphanumeric characters.

str_replace_all(x, "[^[:alnum:]]", " ")
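
The difference between the two patterns shows up on input that contains whitespace or control characters. A small base R sketch with a made-up input string:

```r
# Hypothetical input with a tab and an underscore.
s <- "a\tb_c"

# [[:punct:]] only touches the underscore; the tab is kept.
punct_only <- gsub("[[:punct:]]", " ", s)

# [^[:alnum:]] replaces everything that is not a letter or digit,
# so the tab becomes a space as well.
non_alnum <- gsub("[^[:alnum:]]", " ", s)

identical(punct_only, "a\tb c")
identical(non_alnum, "a b c")
```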

Note that the definition of what constitutes a letter, a number or a punctuation mark varies slightly depending upon your locale, so you may need to experiment a little to get exactly what you want.
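
One way to do that experimenting is to list which characters of a sample string a given pattern actually matches on your machine. The helper name matched_chars and the sample string below are hypothetical, introduced only for this sketch.

```r
# Hypothetical helper: return the distinct characters of `s` matched by `pattern`.
# Extra arguments (e.g. perl = TRUE) are passed on to grepl().
matched_chars <- function(s, pattern, ...) {
  chars <- strsplit(s, "", fixed = TRUE)[[1]]
  unique(chars[grepl(pattern, chars, ...)])
}

# Made-up sample with one accented letter, written as a \u escape.
s <- "caf\u00e9 100% ~ok~"

# Locale-aware class: whether the accented letter counts as alphanumeric
# can depend on the platform and locale.
matched_chars(s, "[^[:alnum:]]")

# ASCII-only class: the accented letter is always reported as unwanted.
matched_chars(s, "[^a-zA-Z0-9]", perl = TRUE)

# The locale that the first result depends on.
Sys.getlocale("LC_CTYPE")
```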
