Take value from nth column of a data frame, for n different for each row

Views: 101  Published: 2017/3/26 2:18:46  Tags: r dataframe vectorization


Question

How do I construct a vector of values from the nth column of a data frame, where n is a per-row value defined in some vector? Example:
> df <- data.frame(a=c(100, 110, 120, 130, 140),
                   b=c(200, 210, 220, 230, 240),
                   c=c(300, 310, 320, 330, 340))
> df
    a   b   c
1 100 200 300
2 110 210 310
3 120 220 320
4 130 230 330
5 140 240 340
> cl <- c(1, 3, 3, 2, 1)
> some.function(df, cl)
would result in:
[1] 100 310 320 230 140

Solution
You can index by a 2-column matrix -- the first column is the row number and the second is the column number.
df[cbind(seq(cl), cl)]
# [1] 100 310 320 230 140
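The matrix-indexing idea can be wrapped in a small helper, sketched here under some assumptions. The name `take_nth` is hypothetical and not part of the original answer. The sketch uses `seq_along(cl)` rather than `seq(cl)`, because `seq(cl)` returns `1:cl` when `cl` has length one. It also calls `as.matrix()` explicitly, which assumes all columns share one type; with mixed types the values would be coerced to a common type such as character.

```r
# Hypothetical helper (illustration only): return df[i, cl[i]] for every row i.
take_nth <- function(df, cl) {
  stopifnot(length(cl) == nrow(df))
  # Two-column index matrix: column 1 = row numbers, column 2 = column numbers.
  idx <- cbind(seq_along(cl), cl)
  # Indexing a matrix by a two-column numeric matrix picks one cell per index row.
  as.matrix(df)[idx]
}

df <- data.frame(a = c(100, 110, 120, 130, 140),
                 b = c(200, 210, 220, 230, 240),
                 c = c(300, 310, 320, 330, 340))
cl <- c(1, 3, 3, 2, 1)

result <- take_nth(df, cl)
print(result)
# [1] 100 310 320 230 140
```

The length check is a design choice for this sketch: it fails early when `cl` does not supply exactly one column number per row.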
This is a vectorized operation that should be quicker than looping through the rows with something like sapply and grabbing the appropriate value from that row:
# Slightly larger example, with 1000 rows
set.seed(144)
df <- matrix(rnorm(3000), nrow=1000)
cl <- sample(3, 1000, replace=TRUE)
all.equal(df[cbind(seq(cl), cl)], sapply(seq(nrow(df)), function(i) df[i, cl[i]]))
# [1] TRUE
library(microbenchmark)
microbenchmark(df[cbind(seq(cl), cl)], sapply(seq(nrow(df)), function(i) df[i, cl[i]]))
# Unit: microseconds
#                                             expr     min      lq       mean   median
#                           df[cbind(seq(cl), cl)]  23.828  26.335   34.26012  30.0350
#  sapply(seq(nrow(df)), function(i) df[i, cl[i]]) 855.481 922.449 1178.47502 996.3815
#         uq      max neval
#    38.0315  135.894   100
#  1111.3960 3414.374   100
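If the microbenchmark package is not installed, a rough comparison can be sketched with base R only. The names `m`, `by_matrix`, `by_loop` and the repetition count of 200 are assumptions chosen for illustration, and `vapply` stands in for the `sapply` loop above. Timings depend on the machine, so no expected numbers are shown.

```r
set.seed(144)
m  <- matrix(rnorm(3000), nrow = 1000)
cl <- sample(3, 1000, replace = TRUE)

# Vectorized version: one index matrix, one subsetting call.
by_matrix <- m[cbind(seq_along(cl), cl)]

# Row-by-row version: vapply() checks that each call returns a single numeric.
by_loop <- vapply(seq_len(nrow(m)), function(i) m[i, cl[i]], numeric(1))

# Both approaches should select exactly the same cells.
stopifnot(identical(by_matrix, by_loop))

# Rough elapsed times over 200 repetitions (values vary between runs).
t_matrix <- system.time(for (k in 1:200) m[cbind(seq_along(cl), cl)])[["elapsed"]]
t_loop   <- system.time(
  for (k in 1:200) vapply(seq_len(nrow(m)), function(i) m[i, cl[i]], numeric(1))
)[["elapsed"]]
print(c(matrix_index = t_matrix, loop = t_loop))
```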


                        