data.table - does setkey(...) create an index or physically reorder the rows in a data table?


Question

This (very basic) question is the result of an exchange here.

The documentation for setkey() says:


setkey() sorts a data.table and marks it as sorted. The sorted columns are the key. The key can be any columns in any order. The columns are sorted in ascending order always. The table is changed by reference... (emphasis added)

I have always interpreted this to mean that setkey() creates an index, rather than physically rearranging the rows of the data table (similar to indexing a database table). But if this were true, then removing the key (using setkey(DT,NULL)) should remove the index and restore the data table to its original, unsorted order. This is not what happens:

library(data.table)
DT <- data.table(a=3:1, b=1:3, c=5:7); DT
   a b c
1: 3 1 5
2: 2 2 6
3: 1 3 7
setkey(DT,a); DT
   a b c
1: 1 3 7
2: 2 2 6
3: 3 1 5
setkey(DT,NULL); DT
   a b c
1: 1 3 7
2: 2 2 6
3: 3 1 5

So, two questions:

1: If the rows are rearranged (sorted), then what does "changed by reference" mean?

2: What does setkey(DT,NULL) do?

Recommended answer


  1. The rows are sorted. "Changed by reference" here means there is no copying of the entire table and rows are just swapped.

  2. setkey(DT, NULL) is equivalent to setattr(DT, "sorted", NULL). It simply unsets the "sorted" attribute.
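
Both points can be checked with a small sketch built on the table from the question. The names alias, snapshot and key_after_sort are made up for this illustration. It assumes that plain assignment binds a second name to the same data.table, while copy() makes an independent one; the values in the comments follow from that assumption.

```r
library(data.table)

DT <- data.table(a = 3:1, b = 1:3, c = 5:7)

alias    <- DT        # second name for the same object, no copy is made
snapshot <- copy(DT)  # independent copy, taken before sorting

setkey(DT, a)         # sorts the rows of DT in place and sets the key

alias$a               # 1 2 3: the alias sees the new row order
snapshot$a            # 3 2 1: the copy keeps the original order

key_after_sort <- key(DT)  # "a"

setkey(DT, NULL)      # drops the key only

key(DT)               # NULL
attr(DT, "sorted")    # NULL
DT$a                  # 1 2 3: the rows stay where the sort put them
```

Since no separate index is kept, nothing records the original row order, so dropping the key cannot restore it. To keep the unsorted table around, take a copy() before calling setkey().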
