Minus operation of data frames

Question

I have two data frames, df1 and df2.

df1 <- data.frame(c1=c("a","b","c","d"),c2=c(1,2,3,4) )
df2 <- data.frame(c1=c("c","d","e","f"),c2=c(3,4,5,6) )

> df1
  c1 c2
1  a  1
2  b  2
3  c  3
4  d  4

> df2
  c1 c2
1  c  3
2  d  4
3  e  5
4  f  6

I need to perform set operations on these two data frames. I used merge(df1, df2, all=TRUE) and merge(df1, df2, all=FALSE) to get their union and intersection, and got the required output. What is the function to get the "minus" (set difference) of these data frames, that is, all rows that exist in one data frame but not in the other? I need the following output.

  c1 c2
1  a  1
2  b  2

Recommended answer

I remember coming across this exact issue quite a few months back, and managed to dig the answer out of my Evernote one-liners.

Note: This is not my solution. Credit goes to whoever wrote it (whom I can't seem to find at the moment).

If you don't care about row names, you can do:

df1[!duplicated(rbind(df2, df1))[-seq_len(nrow(df2))], ]
#   c1 c2
# 1  a  1
# 2  b  2
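
To see why the one-liner works, it can help to split it into steps. The sketch below is the same expression unrolled; the intermediate names stacked, dup, dup_df1 and result are introduced here for illustration only and are not part of the original answer.

```r
df1 <- data.frame(c1 = c("a", "b", "c", "d"), c2 = c(1, 2, 3, 4))
df2 <- data.frame(c1 = c("c", "d", "e", "f"), c2 = c(3, 4, 5, 6))

# Stack df2 on top of df1, so that df2's rows come first.
stacked <- rbind(df2, df1)

# duplicated() flags a row if an identical row appeared earlier in the stack.
dup <- duplicated(stacked)

# Drop the flags that belong to df2's own rows; the rest line up with df1.
dup_df1 <- dup[-seq_len(nrow(df2))]

# Keep the df1 rows that were not flagged as duplicates.
result <- df1[!dup_df1, ]
print(result)
```

Two caveats follow from how this is built. A row that occurs twice within df1 itself is also flagged on its second occurrence, so only one copy is kept. And if df2 has zero rows, -seq_len(0) selects nothing rather than everything, so the expression returns no rows; that case would need to be handled separately.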

A data.table solution:

library(data.table)
dt1 <- data.table(df1, key="c1")
dt2 <- data.table(df2)
# not-join: rows of dt1 whose key (c1) has no match in dt2
dt1[!dt2]

or, better, a one-liner (from v1.9.6+):

setDT(df1)[!df2, on="c1"]

This returns all rows in df1 whose c1 value has no match in df2$c1.
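
Note that joining on c1 alone compares only that column, whereas the duplicated() approach compares whole rows. The two agree on the data in the question, but they can differ in general. A rough base R illustration, using a hypothetical df3 (not from the original question) whose c1 values overlap with df1 but whose c2 values differ:

```r
df1 <- data.frame(c1 = c("a", "b", "c", "d"), c2 = c(1, 2, 3, 4))
# Hypothetical data: same c1 values as rows 3 and 4 of df1, different c2 values.
df3 <- data.frame(c1 = c("c", "d"), c2 = c(30, 40))

# Key-only comparison (analogous to the on = "c1" not-join): drops c and d.
by_key <- df1[!df1$c1 %in% df3$c1, ]

# Whole-row comparison: no row of df1 is identical to a row of df3,
# so all four rows are kept.
by_row <- df1[!duplicated(rbind(df3, df1))[-seq_len(nrow(df3))], ]

print(by_key)
print(by_row)
```

With data.table, listing both columns in the join, for example on = c("c1", "c2"), should give the whole-row behaviour instead.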
