Using purrr to iteratively replace strings in a dataframe column
Question
I want to use purrr to run several string replacements iteratively over a data frame column with gsub().
Here is an example data frame:
df <- data.frame(Year = "2019",
                 Text = c(rep("a aa", 5),
                          rep("a bb", 3),
                          rep("a cc", 2)))
> df
Year Text
1 2019 a aa
2 2019 a aa
3 2019 a aa
4 2019 a aa
5 2019 a aa
6 2019 a bb
7 2019 a bb
8 2019 a bb
9 2019 a cc
10 2019 a cc
This is how I normally run string replacements, and it gives the result I want.
df$Text <- gsub("aa", "One", df$Text, fixed = TRUE)
df$Text <- gsub("bb", "Two", df$Text, fixed = TRUE)
df$Text <- gsub("cc", "Three", df$Text, fixed = TRUE)
> df
Year Text
1 2019 a One
2 2019 a One
3 2019 a One
4 2019 a One
5 2019 a One
6 2019 a Two
7 2019 a Two
8 2019 a Two
9 2019 a Three
10 2019 a Three
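However, this approach becomes impractical as the list of replacements grows, so I tried to use purrr to iterate over vectors of patterns and replacements, but I have only managed to produce warning messages. I want the code to iterate through text_pattern and text_replacement and run gsub for each pattern/replacement pair. The attempt and the resulting messages are shown below.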
text_pattern <- c("aa", "bb", "cc")
text_replacement <- c("One", "Two", "Three")
walk2(text_pattern, text_replacement, function(...){
gsub(text_pattern, text_replacement, df$Text, fixed = F)
}
)
Warning messages:
1: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'pattern' has length > 1 and only the first element will be used
2: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'replacement' has length > 1 and only the first element will be used
3: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'pattern' has length > 1 and only the first element will be used
4: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'replacement' has length > 1 and only the first element will be used
5: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'pattern' has length > 1 and only the first element will be used
6: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'replacement' has length > 1 and only the first element will be used
Is it possible to do this with a function from purrr? Or am I using the wrong tool, and should I be using a different function?
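Recommended answer
We can use reduce2(), which starts from df$Text and applies each pattern/replacement pair in turn to the result of the previous step: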
library(purrr)
library(stringr)
df$Text <- reduce2(text_pattern, text_replacement, ~ str_replace(..1, ..2, ..3),
.init = df$Text)
df$Text
#[1] "a One" "a One" "a One" "a One" "a One" "a Two" "a Two" "a Two" "a Three" "a Three"
Or, without the anonymous function, pass str_replace directly:
reduce2(text_pattern, text_replacement, .init = df$Text, str_replace)
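The original walk2() attempt fails for two reasons: the function(...) body ignores its arguments and passes the full text_pattern and text_replacement vectors to gsub(), and walk2() is meant for side effects, so each result is discarded rather than fed into the next replacement. Note also that str_replace() treats the pattern as a regular expression and replaces only the first match in each string, whereas the question's gsub(..., fixed = TRUE) calls match literally and replace every occurrence. A minimal sketch that keeps the gsub() semantics, assuming the same df, text_pattern and text_replacement as in the question, might look like this:

```r
library(purrr)

# Same example data as in the question
df <- data.frame(Year = "2019",
                 Text = c(rep("a aa", 5),
                          rep("a bb", 3),
                          rep("a cc", 2)))

text_pattern     <- c("aa", "bb", "cc")
text_replacement <- c("One", "Two", "Three")

# reduce2() calls the function with the accumulated value first,
# followed by one element from each of the two vectors.
df$Text <- reduce2(
  text_pattern, text_replacement,
  function(text, pattern, replacement) {
    # fixed = TRUE: literal match, all occurrences replaced (as in the question)
    gsub(pattern, replacement, text, fixed = TRUE)
  },
  .init = df$Text
)

df$Text
```

If the stringr route is preferred, str_replace_all() is the variant that replaces every match rather than only the first.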