R - how to access data table data by key value on rows and columns

Question

I have a data table:

> COUNT_ID_CATEGORY
                id 706 799 1703 1726 2119 2202 3203 3504 3509 4401 4517 5122 5558 5616 5619 5824 6202 7205 9115 9909
     1:      86246   9   0   15    4   28    0   15   63   39    5    7   25   27   43   12   64    1   16    0   96
     2:      86252   3   0   17    6   21    0    6   62   24    6    7   12   25   32    6   49    1   26    0  103
     3:   12262064   3   0    1    1   12    0    0    2    1    0    0    0    2    4    0    4    0    0    0   12
     4:   12277270   2   0    0    0    1    0    3    0    3    0    0    0    0   24    0    6    2    5    0   60
     5:   12332190   2   0    2    0    4    0    1    2    0    0    0    1    0    3    0    1    3    2    0   46
    ---                                                                                                             
310661: 4837642552   0   0    0    0    0    0    1    0    0    0    0    0    0    0    0    1    0    0    1    0
310662: 4843417324   0   0    0    0    0    0    0    0    0    0    0    0    0    0    1    2    0    0    0    0
310663: 4847628950   2   0    1    1   16    0    0    2    3    0    0    2    9    5    0    3    3    2    3   14
310664: 4847787712   0   0    0    0    1    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0
310665: 4853598737   0   0    0    0    0    0    0    0    0    0    0    0    1    0    0    1    0    0    0    0
> class(COUNT_ID_CATEGORY)
[1] "data.table" "data.frame"
>

and I wish to read the data as quickly as possible as follows:

COUNT_ID_CATEGORY for (id == 86246) & (category == 706)

which should return the value 9 (the top-left cell of the table), for example.

I can get the row with:

COUNT_ID_CATEGORY[id==86246,]

but how do I get the column?

> dput(head(COUNT_ID_CATEGORY))
structure(list(id = c(86246, 86252, 12262064, 12277270, 12332190, 
12524696), `706` = c(9L, 3L, 3L, 2L, 2L, 0L), `799` = c(0L, 0L, 
0L, 0L, 0L, 0L), `1703` = c(15L, 17L, 1L, 0L, 2L, 0L), `1726` = c(4L, 
6L, 1L, 0L, 0L, 0L), `2119` = c(28L, 21L, 12L, 1L, 4L, 0L), `2202` = c(0L, 
0L, 0L, 0L, 0L, 0L), `3203` = c(15L, 6L, 0L, 3L, 1L, 0L), `3504` = c(63L, 
62L, 2L, 0L, 2L, 11L), `3509` = c(39L, 24L, 1L, 3L, 0L, 3L), 
    `4401` = c(5L, 6L, 0L, 0L, 0L, 1L), `4517` = c(7L, 7L, 0L, 
    0L, 0L, 1L), `5122` = c(25L, 12L, 0L, 0L, 1L, 0L), `5558` = c(27L, 
    25L, 2L, 0L, 0L, 1L), `5616` = c(43L, 32L, 4L, 24L, 3L, 18L
    ), `5619` = c(12L, 6L, 0L, 0L, 0L, 0L), `5824` = c(64L, 49L, 
    4L, 6L, 1L, 10L), `6202` = c(1L, 1L, 0L, 2L, 3L, 6L), `7205` = c(16L, 
    26L, 0L, 5L, 2L, 4L), `9115` = c(0L, 0L, 0L, 0L, 0L, 0L), 
    `9909` = c(96L, 103L, 12L, 60L, 46L, 1L)), .Names = c("id", 
"706", "799", "1703", "1726", "2119", "2202", "3203", "3504", 
"3509", "4401", "4517", "5122", "5558", "5616", "5619", "5824", 
"6202", "7205", "9115", "9909"), sorted = "id", class = c("data.table", 
"data.frame"), row.names = c(NA, -6L), .internal.selfref = <pointer: 0x043a24a0>)


Recommended answer

First, call setkey to enable fast lookup using data.table's binary search/subset feature:

setkey(COUNT_ID_CATEGORY, id)

Then you can do:

COUNT_ID_CATEGORY[J(86246)][, '706', with=FALSE]

The first part, COUNT_ID_CATEGORY[J(86246)], performs a fast subset using binary search. You can read more about J(.) and what it does here.

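The next part, [, '706', with=FALSE], takes the subset result, which is a data.table, and selects just the column 706. With with=FALSE, the j argument is treated as a character vector of column names rather than as an expression to evaluate.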
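
To make the two steps concrete, here is a minimal sketch on a small stand-in table. The name dt and the choice of three rows and three category columns are assumptions made for illustration; the values are copied from the dput() output in the question.

```r
library(data.table)

# Small stand-in for COUNT_ID_CATEGORY (hypothetical name: dt), built from
# the first three rows and three of the category columns in the question.
dt <- data.table(
  id     = c(86246, 86252, 12262064),
  `706`  = c(9L, 3L, 3L),
  `799`  = c(0L, 0L, 0L),
  `1703` = c(15L, 17L, 1L)
)

# Sort by id and mark it as the key so lookups on id can use binary search.
setkey(dt, id)

# Keyed row lookup, then select the column by name.
# The result is a one-row, one-column data.table.
res <- dt[J(86246)][, "706", with = FALSE]

# Extract the bare value from that one-cell table.
value <- res[["706"]]
print(value)
```

Note that the chained form returns a data.table, so a final [[ is needed if a plain value is wanted.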

Just to be complete, this post shows more ways of selecting/subsetting columns from a data.table.
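
As a rough illustration of what such alternatives can look like, the sketch below shows three other ways to reach the same cell. These are not necessarily the ones covered in the linked post; dt and col are hypothetical names, and the values again come from the question's dput() output.

```r
library(data.table)

# Same small stand-in table as an assumption; the real table is
# COUNT_ID_CATEGORY from the question.
dt <- data.table(
  id    = c(86246, 86252, 12262064),
  `706` = c(9L, 3L, 3L),
  `799` = c(0L, 0L, 0L)
)
setkey(dt, id)

# 1. Refer to the column as a backtick-quoted symbol in j.
#    This evaluates the column for the matched row and returns a vector.
v1 <- dt[J(86246), `706`]

# 2. Subset the row first, then extract the column with [[.
v2 <- dt[J(86246)][["706"]]

# 3. Column name held in a variable: look it up with get() inside j.
col <- "706"
v3 <- dt[J(86246), get(col)]

print(c(v1, v2, v3))
```

All three return a plain integer rather than a one-cell data.table, which may be more convenient when the value feeds directly into further computation.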
