R partial match in data frame


Question

How can I handle a partial match in a data frame? Let's say this is my data frame df:

   V1  V2  V3 V4
1 ABC 1.2 4.3  A
2 CFS 2.3 1.7  A
3 dgf 1.3 4.4  A

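and I want to add a column V5 that contains the number 111 only if the value in V1 contains an "f", and the number 222 only if the value in V1 contains "gf". Will I run into problems because several values contain an "f", or will the order in which I enter the commands take care of it?

I tried something like: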

df$V5<- ifelse(df$V1 = c("*f","*gf"),c=(111,222) )

but it does not work.

The main problem is: how can I tell R to look for a partial match?

Thanks a million for your help!

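Solution

Besides the approach of setting the values in sequence for "f", "gf", ..., it is worth looking at the regular-expression support for zero-width lookahead and lookbehind assertions.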
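For reference, the sequential approach mentioned above could look roughly like the sketch below. The data frame is rebuilt from the question so the snippet runs on its own, and `fixed = TRUE` is my own choice to treat the patterns as literal substrings; neither detail comes from the original answer.

```r
# Example data frame mirroring the one in the question
df <- data.frame(
  V1 = c("ABC", "CFS", "dgf"),
  V2 = c(1.2, 2.3, 1.3),
  V3 = c(4.3, 1.7, 4.4),
  V4 = c("A", "A", "A"),
  stringsAsFactors = FALSE
)

df$V5 <- NA_real_                               # rows without a match stay NA
df$V5[grepl("f",  df$V1, fixed = TRUE)] <- 111  # broad pattern first
df$V5[grepl("gf", df$V1, fixed = TRUE)] <- 222  # narrower pattern overwrites it
df$V5
```

The order matters here: every value containing "gf" also contains "f", so the narrower pattern has to be assigned last. Note that the match is case-sensitive, so "CFS" is not matched by "f"; dropping `fixed = TRUE` and passing `ignore.case = TRUE` would be one way to change that.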

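If you want to grep all elements that contain an "f" which is not part of "gf" (that is, an "f" not immediately preceded by "g"), you can use a negative lookbehind: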

v1 <- c("abc", "f", "gf" )
grep( "(?<![g])f" , v1, perl= TRUE )
[1] 2

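and if you want only those that contain an "f" which is not part of "fg" (an "f" not immediately followed by "g"), you can use a negative lookahead: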

v2 <- c("abc", "f", "fg")
grep( "f(?![g])" , v2, perl= TRUE )
[1] 2

And of course you can combine the two:

v3 <- c("abc", "f", "fg", "gf")
grep( "(?<![g])f(?![g])" , v3, perl= TRUE )
[1] 2
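One subtlety worth spelling out: a lookaround only inspects the characters next to a particular "f", not the whole string. The sketch below uses a made-up vector `v4` (not from the original answer) to contrast the lookbehind with an explicit "contains f and does not contain gf" test.

```r
# Hypothetical test vector; "gfxf" contains "gf" and also a second, free-standing "f"
v4 <- c("f", "gf", "gfxf", "fg")

# TRUE wherever at least one "f" is not immediately preceded by "g"
lookbehind_hit <- grepl("(?<!g)f", v4, perl = TRUE)

# TRUE only where the string contains "f" and contains no "gf" anywhere
no_gf_hit <- grepl("f", v4, fixed = TRUE) & !grepl("gf", v4, fixed = TRUE)

lookbehind_hit
no_gf_hit
```

The two results differ only for "gfxf": the lookbehind accepts it because of its second "f", while the combined test rejects it. Which behaviour is wanted depends on the data.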

So for your case you can do

df[ grep( "(?<![g])f" , df$V1, perl= TRUE ), "V5" ] <- 111
df[ grep( "gf" , df$V1, perl= TRUE ), "V5" ] <- 222
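Since the question started from `ifelse`, here is a hedged sketch of how that attempt could be written with `grepl`, which returns one logical value per element. The `ignore.case = TRUE` argument is an assumption on my part, added because the sample data contains an uppercase "F" in "CFS"; leave it out if the match should stay case-sensitive.

```r
# Only the V1 column from the question is needed for this sketch
df <- data.frame(V1 = c("ABC", "CFS", "dgf"), stringsAsFactors = FALSE)

# Test the narrower pattern first; fall back to the broader one, then to NA
df$V5 <- ifelse(grepl("gf", df$V1, ignore.case = TRUE), 222,
         ifelse(grepl("f",  df$V1, ignore.case = TRUE), 111, NA))
df$V5
```

With case-insensitive matching, "CFS" receives 111, "dgf" receives 222, and "ABC" stays NA.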

