Use of match within i of data.table


Question

The %in% operator is a wrapper for the match function returning "a vector of the same length as x". For instance:

> match(c("a", "b", "c"), c("a", "a"), nomatch = 0) > 0
## [1]  TRUE FALSE FALSE
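As a minimal base R sketch of that equivalence (the names `x`, `tbl`, `pos`, `via_match` and `via_in` are illustrative and not part of the original post), the positions returned by match can be compared directly with the result of %in%:

```r
# Base R only; x and tbl are illustrative names, not from the original post.
x   <- c("a", "b", "c")
tbl <- c("a", "a")

# match() gives, for each element of x, the position of its first match
# in tbl, or the nomatch value (0 here) when there is none.
pos <- match(x, tbl, nomatch = 0L)

# Turning the positions into a logical vector yields one value per element of x.
via_match <- pos > 0L
via_in    <- x %in% tbl

# Both routes agree, and the result has the same length as x,
# no matter how many times "a" is repeated in tbl.
stopifnot(identical(via_match, via_in))
stopifnot(length(via_in) == length(x))
```

Because the result has one element per element of x, using it as a row filter should never return more rows than x has.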

When used within i of data.table, however

(dt1 <- data.table(v1 = c("a", "b", "c"), v2 = "dt1"))
   v1  v2
1:  a dt1
2:  b dt1
3:  c dt1
(dt2 <- data.table(v1 = c("a", "a"), v2 = "dt2"))
   v1  v2
1:  a dt2
2:  a dt2
dt1[v1 %in% dt2$v1]
   v1  v2
1:  a dt1
2:  a dt1

duplicates are obtained. Should the expected behaviour of %in% within i of data.table not give the same result as

dt1[dt1$v1 %in% dt2$v1]  
   v1  v2
1:  a dt1

i.e. without duplicates?

Recommended answer

This was a bug in the automatic indexing of data.table versions before 1.9.5; it is fixed in versions 1.9.5 and later.

I can think of 3 possible workarounds:


  1. Disable the auto indexing and use base R %in% as in

options(datatable.auto.index = FALSE)
dt1[v1 %in% dt2$v1]
##    v1  v2
## 1:  a dt1


  2. Use the built-in %chin% operator, which is both more efficient and free of this bug (it works only when comparing character vectors)

    dt1[v1 %chin% dt2$v1]
    ##    v1  v2
    ## 1:  a dt1
    

  3. Install the development version from GitHub (close all your R sessions first, then reopen just one)

    library(devtools)
    install_github("Rdatatable/data.table", build_vignettes = FALSE)
    library(data.table)
    dt1 <- data.table(v1 = c("a", "b", "c"), v2 = "dt1")
    dt2 <- data.table(v1 = c("a", "a"), v2 = "dt2")
    dt1[v1 %in% dt2$v1]
    ##    v1  v2
    ## 1:  a dt1
    

