Replace entire expression that contains a specific string
Question
I have a data frame with a column that contains a large number of file names, like:
d <- c("harry11_scott80_norm.avi","harry11_norm.avi","harry11_scott80_lpf.avi",
"joel51_lpf.avi","rich82_joel51_lpf.avi")
I want R to replace every expression that contains two people's names, such as harry11_scott80_norm.avi, with the expression incongruent, and every expression that contains a single person's name, such as harry11_norm.avi, with congruent. I could use gsub to do that:
dd <- gsub("harry11_scott80_norm.avi", "incongruent", d)
But I have a lot of those names, so that would be a very clunky solution. Ideally, I want to replace the ENTIRE expression that contains a string like _scott80_ with "incongruent". I thought gsub could do this, but when I run:
dd <- gsub("_scott80_", "incongruent", d)
it returns harry11incongruentnorm.avi, which is obviously because it replaces only the exact string match. I reckon there is some way to tell gsub to replace the entire expression that contains the selected string, but I can't find it.
Side bonus question: based on @GSee's answer, is there any function that lets you pass a list of strings that you want to replace? For example, gsub(c(".*_scott80_.*", ".*_harry11_.*"), "incongruent", d) won't work.
Recommended answer
Here is one way:
> gsub(".*_scott80_.*", "incongruent", d)
[1] "incongruent" "harry11_norm.avi" "incongruent"
[4] "joel51_lpf.avi" "rich82_joel51_lpf.avi"
Or, using grep:
> d[grep("_scott80_", d)] <- "incongruent"
> d
[1] "incongruent" "harry11_norm.avi" "incongruent"
[4] "joel51_lpf.avi" "rich82_joel51_lpf.avi"
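The original goal was to label every file name as congruent or incongruent, not only those containing _scott80_. A minimal sketch of one way to do that with grepl and ifelse follows. It assumes that every person token is lowercase letters followed by digits (as in harry11) and that the suffix after the names never has that shape; the variable names files, two_people and labels are hypothetical.

```r
files <- c("harry11_scott80_norm.avi", "harry11_norm.avi", "harry11_scott80_lpf.avi",
           "joel51_lpf.avi", "rich82_joel51_lpf.avi")

# Assumption: a person token is lowercase letters followed by digits, e.g. "harry11".
# TRUE when the name starts with two such tokens, each followed by an underscore.
two_people <- grepl("^[a-z]+[0-9]+_[a-z]+[0-9]+_", files)

# Label each file name according to how many person tokens it starts with.
labels <- ifelse(two_people, "incongruent", "congruent")
labels
```

If the real file names follow a different naming scheme, the pattern would need to be adjusted accordingly.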
To address your edit, I believe this will do it (using | to mean "or"):
gsub(".*(_scott80_|_harry11_).*", "incongruent", d)
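Of course, nothing in d matches "_harry11_", because harry11 only ever appears at the start of a file name, with no underscore before it.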
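When the strings to match are held in a vector, the alternation can be assembled with paste rather than typed by hand. This is only a sketch: the vector keys and its contents are hypothetical, and it assumes that none of the keys contains regex metacharacters that would need escaping.

```r
files <- c("harry11_scott80_norm.avi", "harry11_norm.avi", "harry11_scott80_lpf.avi",
           "joel51_lpf.avi", "rich82_joel51_lpf.avi")

# Hypothetical vector of substrings; any file name containing one of them is replaced.
keys <- c("_scott80_", "_joel51_")

# Join the keys with "|" and wrap them so the pattern consumes the whole string:
# ".*(_scott80_|_joel51_).*"
pattern <- paste0(".*(", paste(keys, collapse = "|"), ").*")

result <- gsub(pattern, "incongruent", files)
result
```

Here joel51_lpf.avi is left unchanged, since it contains joel51_ but not _joel51_ with a leading underscore.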