Minus operation of data frames
Question
I have 2 data frames, df1 and df2.
df1 <- data.frame(c1=c("a","b","c","d"),c2=c(1,2,3,4) )
df2 <- data.frame(c1=c("c","d","e","f"),c2=c(3,4,5,6) )
> df1
c1 c2
1 a 1
2 b 2
3 c 3
4 d 4
> df2
c1 c2
1 c 3
2 d 4
3 e 5
4 f 6
I need to perform set operations on these 2 data frames. I used merge(df1,df2,all=TRUE) and merge(df1,df2,all=FALSE) to get the union and intersection of these data frames, and got the required output. What is the function to get the minus (set difference) of these data frames, that is, all the rows that exist in one data frame but not in the other? I need the following output.
c1 c2
1 a 1
2 b 2
Recommended answer
I remember coming across this exact issue quite a few months back. I managed to sift through my Evernote one-liners and find it.
Note: This is not my solution. Credit goes to whoever wrote it (whom I can't seem to find at the moment).
If you aren't worried about rownames, then you can do:
df1[!duplicated(rbind(df2, df1))[-seq_len(nrow(df2))], ]
# c1 c2
# 1 a 1
# 2 b 2
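To make the one-liner easier to follow, here is a step-by-step sketch of what it appears to be doing. The intermediate names (stacked, dup, in_df2, result) are mine and not part of the original answer, and stringsAsFactors = FALSE is an added assumption so that c1 is plain character on any R version.

```r
# Same sample data as in the question
df1 <- data.frame(c1 = c("a", "b", "c", "d"), c2 = c(1, 2, 3, 4),
                  stringsAsFactors = FALSE)
df2 <- data.frame(c1 = c("c", "d", "e", "f"), c2 = c(3, 4, 5, 6),
                  stringsAsFactors = FALSE)

# 1. Stack df2 on top of df1, so df2's rows come first
stacked <- rbind(df2, df1)

# 2. duplicated() marks a row TRUE if an identical row appeared earlier
dup <- duplicated(stacked)

# 3. Drop the flags that belong to df2, leaving one flag per row of df1
in_df2 <- dup[-seq_len(nrow(df2))]

# 4. Keep only the rows of df1 that were not flagged
result <- df1[!in_df2, ]
result
```

Two caveats follow from how duplicated() and negative indexing work. If df1 itself contains repeated rows, the later copies are flagged and dropped as well. If df2 has zero rows, -seq_len(0) selects nothing, so the expression returns an empty data frame instead of all of df1.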
A data.table solution:
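library(data.table)
dt1 <- data.table(df1, key="c1")  # key on c1 so the join matches on it
dt2 <- data.table(df2)
dt1[!dt2]                         # anti-join: rows of dt1 with no match in dt2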
or better, a one-liner (from v1.9.6+):
setDT(df1)[!df2, on="c1"]
This returns all rows in df1 whose c1 value has no match in df2$c1. Note that only c1 is compared here, not c2.
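Matching on c1 alone and comparing whole rows can give different results. The sketch below uses base R to mimic both behaviours. The data frame df2b is a hypothetical variant of df2, invented here for illustration, in which "a" appears in c1 with a different c2 value.

```r
df1 <- data.frame(c1 = c("a", "b", "c", "d"), c2 = c(1, 2, 3, 4),
                  stringsAsFactors = FALSE)

# Hypothetical variant of df2: "a" is present in c1, but with another c2
df2b <- data.frame(c1 = c("a", "c", "d"), c2 = c(99, 3, 4),
                   stringsAsFactors = FALSE)

# Key-only difference: drop every row of df1 whose c1 occurs in df2b$c1
by_key <- df1[!(df1$c1 %in% df2b$c1), ]

# Full-row difference: drop only rows of df1 that match df2b in all columns
by_row <- df1[!duplicated(rbind(df2b, df1))[-seq_len(nrow(df2b))], ]

by_key  # only the "b" row survives
by_row  # the "a" and "b" rows survive, since (a, 1) differs from (a, 99)
```

For the data in the question both approaches agree, because every c1 value shared by df1 and df2 also has the same c2.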