R gsub: remove words in column x from text in column y

Problem description

I'm trying to use gsub to remove from column y any words / text that appear in column x.

x = c("a","b","c")
y = c("asometext", "some, a b text", "c a text")
df = cbind(x,y)
df = data.frame(df)
df$y = gsub(df$x, "", df$y)

If I run the code above, it removes only the text from row 1 of column x, not the text from all the rows:

> df
  x             y
1 a      sometext
2 b some,  b text
3 c       c  text

I'd like the final result to be:

> df
  x             y
1 a      sometext
2 b      some,   text
3 c      text

So all the words / letters from column x should be removed from column y. Is this possible with gsub?

Recommended answer

Normally gsub takes three arguments: 1) the pattern, 2) the replacement, and 3) the vector in which to replace values.

The pattern must be a single string, and so must the replacement. The only argument that accepts multiple values is the vector, so gsub is vectorized over that argument alone. When the pattern has more than one element, gsub uses only the first one (with a warning), which is why only "a" was removed in the question.

gsub(df$x, "", df$y)  #doesn't work because 'df$x' isn't one string
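To make that behaviour concrete, here is a minimal sketch comparing the multi-element call with an explicit call on the first element only. The variable names first_only and explicit are illustrative, and the warning is suppressed purely to keep the output quiet. This assumes an R version in which a pattern of length greater than one produces a warning rather than an error.

```r
x <- c("a", "b", "c")
y <- c("asometext", "some, a b text", "c a text")

# gsub() uses only the first element of a multi-element pattern
# (normally with a warning), so this call behaves like gsub("a", "", y).
first_only <- suppressWarnings(gsub(x, "", y))

# The same call written out with the first pattern element only.
explicit <- gsub(x[1], "", y)

# Both calls should give the same result.
identical(first_only, explicit)
```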

The pattern argument is not vectorized, but we can use mapply to complete the task.

mapply and gsub (bffs)

x = c("a","b","c")
y = c("asometext", "some, a b text", "c a text")
repl = ""

# What we write
mapply(gsub, x, repl, y)

# What happens on the inside, conceptually (repl is recycled to length 3)
gsub(x[[1]], repl[[1]], y[[1]])
gsub(x[[2]], repl[[2]], y[[2]])
gsub(x[[3]], repl[[3]], y[[3]])

You may be asking: I only have one repl, so how do repl[[2]] and repl[[3]] work? mapply notices this for us and recycles repl until its length matches that of the other arguments.
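As a sketch of how this might be applied to the data frame from the question: the column names y_rowwise and y_all are hypothetical, and fixed = TRUE assumes the values in x are literal text rather than regular expressions. The second part is an alternative that is not part of the answer above. It collapses x into one alternation pattern so that every x value is removed from every row, which appears to be what the expected output in the question shows.

```r
x <- c("a", "b", "c")
y <- c("asometext", "some, a b text", "c a text")
df <- data.frame(x = x, y = y, stringsAsFactors = FALSE)

# Row-wise removal: the x value in row i is removed from the y value in row i.
# USE.NAMES = FALSE stops mapply from naming the result after x.
# fixed = TRUE treats each x value as literal text (an assumption;
# drop it if the x values are meant to be regular expressions).
df$y_rowwise <- mapply(gsub, df$x, "", df$y,
                       MoreArgs = list(fixed = TRUE),
                       USE.NAMES = FALSE)

# Alternative (not from the answer above): remove every x value from
# every y by collapsing x into a single alternation pattern, "a|b|c".
# Values containing regex metacharacters would need escaping first.
pattern <- paste(df$x, collapse = "|")
df$y_all <- trimws(gsub(pattern, "", df$y))

df
```

The row-wise version pairs each pattern with its own row, whereas the alternation version needs only a plain gsub call because the collapsed pattern is a single string.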
