R: replacing a special character in multiple columns of a data frame


Question

I am trying to replace the German special character "ö" in a data frame with "oe". The character occurs in multiple columns, so I would like to do this in one step without having to specify individual columns. Here is a small example of the data frame:

data <- data.frame(a=c("aö","ab","ac"),b=c("bö","bb","ab"),c=c("öc","öb","acö"))

I tried:

data[data=="ö"]<-"oe"

but this did not work, since I would need to work with regular expressions here. However, when I try:

data[grepl("ö",data)]<-"oe"

I do not get what I want. The data frame at the end should look like:

> data
    a   b    c
1 aoe boe  oec
2  ab  bb  oeb
3  ac  ab acoe

The file is a CSV that I import with read.csv. However, there seems to be no option in the import call that fixes this. How do I get the desired outcome?

Recommended answer

Here is one way to do it:

data <- apply(data,2,function(x) gsub("ö",'oe',x))

Explanation:

Your grepl call doesn't work because grepl only returns logical values (TRUE/FALSE) marking where the regex matches; called on a whole data frame, it yields one value per column rather than one per cell. The assignment then replaces not just the character you want replaced but the entire matched entry. To replace part of a string, you need sub (if you want to replace only the first occurrence in each string) or gsub (if you want all occurrences replaced). To apply that to every column, loop over the columns using apply.
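One caveat about the apply call above: it returns a character matrix rather than a data frame. The following is a minimal sketch of a variant that keeps the data frame, and of the difference between sub and gsub. The names m and fixed_data, the stringsAsFactors = FALSE argument, and the fixed = TRUE argument are additions for illustration and are not part of the original answer.

```r
# Sample data from the question. stringsAsFactors = FALSE (an addition here)
# keeps the columns as character vectors on R versions before 4.0 as well.
data <- data.frame(a = c("aö", "ab", "ac"),
                   b = c("bö", "bb", "ab"),
                   c = c("öc", "öb", "acö"),
                   stringsAsFactors = FALSE)

# The answer's approach: apply() over columns yields a character matrix.
m <- apply(data, 2, function(x) gsub("ö", "oe", x))

# Variant: assigning into fixed_data[] keeps the data.frame class and the
# column names. fixed = TRUE treats "ö" as a literal string, not a regex.
fixed_data <- data
fixed_data[] <- lapply(data, function(x) gsub("ö", "oe", x, fixed = TRUE))

# sub() replaces only the first match in each string, gsub() replaces all.
sub("ö", "oe", "öbö")   # "oebö"
gsub("ö", "oe", "öbö")  # "oeboe"
```

If the matrix result is acceptable, the original one-liner is enough; the lapply form is mainly useful when later code expects a data frame with its original columns.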
