Choose one cell per row in a data frame

Question

I have a vector that tells me, for each row in a data frame, the index of the column whose value in that row should be updated.

> set.seed(12008); n <- 10000; d <- data.frame(c1=1:n, c2=2*(1:n), c3=3*(1:n))
> i <- sample.int(3, n, replace=TRUE)
> head(d); head(i)
  c1 c2 c3
1  1  2  3
2  2  4  6
3  3  6  9
4  4  8 12
5  5 10 15
6  6 12 18
[1] 3 2 2 3 2 1

This means that for rows 1 and 4, c3 should be updated; for rows 2, 3 and 5, c2 should be updated (and so on). What is the cleanest way to achieve this in R using vectorized operations, i.e., without apply and friends? And, if at all possible, without R loops?

I have thought about transforming d into a matrix and then addressing the matrix elements using a one-dimensional vector. But I haven't found a clean way to compute the one-dimensional address from the row and column indices.

Recommended answer

If you are willing to first convert your data.frame to a matrix, you can index the elements to be replaced using a two-column matrix. (Beginning with R-2.16.0, the development version at the time of writing, which was later released as R 3.0.0, this is possible with data.frames directly.) The indexing matrix should have row indices in its first column and column indices in its second column.

Here's an example:

## Create a subset of your data
set.seed(12008); n  <- 6 
D  <- data.frame(c1=1:n, c2=2*(1:n), c3=3*(1:n))
i <- seq_len(nrow(D))            # vector of row indices
j <- sample(3, n, replace=TRUE)  # vector of column indices 
ij <- cbind(i, j)                # a 2-column matrix to index a 2-D array 
                                 # (This extends smoothly to higher-D arrays.)  

## Convert it to a matrix    
Dmat <- as.matrix(D)

## Replace the elements indexed by 'ij'
Dmat[ij] <- NA
Dmat
#      c1 c2 c3
# [1,]  1  2 NA
# [2,]  2 NA  6
# [3,]  3 NA  9
# [4,]  4  8 NA
# [5,]  5 NA 15
# [6,] NA 12 18
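
The question also asked how to compute a one-dimensional address from the row and column indices. A minimal sketch of that alternative follows, relying on the fact that R stores matrices column by column. The column indices are hard-coded to the values shown in the output above (an assumption made for reproducibility, since `sample()` can return different values across R versions), and `k` is just an illustrative name.

```r
## Sketch: 1-D (column-major) addressing of matrix cells
n <- 6
D <- data.frame(c1 = 1:n, c2 = 2 * (1:n), c3 = 3 * (1:n))
i <- seq_len(n)              # row indices
j <- c(3, 2, 2, 3, 2, 1)     # column indices, hard-coded to match the output above

Dmat <- as.matrix(D)

## Element [i, j] sits at position (j - 1) * nrow + i of the
## underlying column-major vector.
k <- (j - 1) * nrow(Dmat) + i
k
# [1] 13  8  9 16 11  6

Dmat[k] <- NA                # addresses the same cells as Dmat[cbind(i, j)]
```

The two-column index matrix is usually easier to read, and it avoids the manual arithmetic; the linear form is shown only to answer the one-dimensional addressing part of the question.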

Beginning with R-2.16.0, you will be able to use the same syntax for data frames (i.e. without having to first convert data frames to matrices).

From the R-devel NEWS file:

Matrix indexing of dataframes by two column numeric indices is now supported for replacement as well as extraction.

Using the current R-devel snapshot, here's what that looks like:

D[ij] <- NA
D
#   c1 c2 c3
# 1  1  2 NA
# 2  2 NA  6
# 3  3 NA  9
# 4  4  8 NA
# 5  5 NA 15
# 6 NA 12 18
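
Since the question is about updating the selected cells rather than blanking them, here is a hedged sketch that uses the same index matrix for both extraction and replacement. It assumes an R version in which data frames support two-column matrix indexing for replacement (per the NEWS entry quoted above). The column indices are again hard-coded, and the factor of 10 is an arbitrary illustrative update.

```r
## Sketch: read one cell per row, then write updated values back
n <- 6
D <- data.frame(c1 = 1:n, c2 = 2 * (1:n), c3 = 3 * (1:n))
i <- seq_len(n)              # row indices
j <- c(3, 2, 2, 3, 2, 1)     # column indices (hard-coded for reproducibility)
ij <- cbind(i, j)

old <- D[ij]                 # extraction: one value per row
old
# [1]  3  4  6 12 10  6

D[ij] <- old * 10            # replacement: arbitrary example update
```

The replacement values are matched to the rows of `ij` in order, so any vectorized function of `old` (or a separate vector of length `n`) can be assigned the same way.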
