unique() doesn't use keys as the default anymore


Question

I mainly use RStudio on a Mac. Recently I had to start using Windows as well. There, I found that unique() no longer returns the unique rows of a data.table based on its key. Here is an example:

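library(data.table)

a <- c(2, 3, 3, 3, 3, 5, 6, 7)
b <- c("a", "a", "f", "g", "a", "d", "t", "l")
e <- data.table(a, b)
setkey(e, a)   # key the table on column a
key(e)         # returns "a", so the key is set correctly
unique(e)      # expected one row per value of a, but only the full duplicate is dropped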

unique() only removes the row that is duplicated across all columns (row 5, which repeats row 2). The exact same code works as expected on my Mac.

Recommended answer

That's because the two machines have different data.table versions. On the Mac you have a version older than 1.9.8 (which still uses the key as the default), while on Windows you have a newer version (which doesn't).
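To see which behaviour a given machine will show, the installed version can be compared against 1.9.8. This is a minimal sketch, assuming data.table is installed; the variable names `installed` and `keys_are_default` are only illustrative.

```r
# Check whether the installed data.table predates the change to unique()
installed <- packageVersion("data.table")

# TRUE means unique() still defaults to by = key(x) on this machine
keys_are_default <- installed < "1.9.8"

print(installed)
print(keys_are_default)
```

Running this on both machines should show which side of the 1.9.8 boundary each installation falls on.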

As stated in ?unique (in data.table v1.9.8+):


By default all columns are being used. This was changed recently for consistency to data.frame methods. In version < 1.9.8 default was key(x)

This means that, from now on, you need to specify the by argument explicitly even if you already have a key set; otherwise unique() will simply use all the columns.

For your specific example, this works:

unique(e, by = "a")
#    a b
# 1: 2 a
# 2: 3 a
# 3: 5 d
# 4: 6 t
# 5: 7 l

Or, as @Frank mentioned in the comments, you can pass the key itself to the by argument, which for this example is unique(e, by = key(e)).
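The two defaults can be compared side by side. The following is a minimal sketch that rebuilds the table from the question, assuming data.table 1.9.8 or later; `unique_by_key` is a hypothetical helper, not part of data.table, that falls back to all columns when no key is set.

```r
library(data.table)

# Rebuild the example table from the question
e <- data.table(a = c(2, 3, 3, 3, 3, 5, 6, 7),
                b = c("a", "a", "f", "g", "a", "d", "t", "l"))
setkey(e, a)

# Hypothetical helper: deduplicate on the key when one is set,
# otherwise on all columns
unique_by_key <- function(dt) {
  k <- key(dt)
  if (is.null(k)) unique(dt) else unique(dt, by = k)
}

all_cols <- unique(e)          # compares whole rows (default in 1.9.8+)
key_only <- unique_by_key(e)   # compares only the key column a

nrow(all_cols)
nrow(key_only)
```

Under that version assumption, all_cols should have 7 rows (only the repeated 3/"a" row is dropped) and key_only should have 5, one per distinct value of a. Wrapping the call in a helper is just one option; writing unique(e, by = key(e)) directly does the same thing for a keyed table.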
